Since my other question was answered and gave me a way to do high-precision timing, I've been experimenting with it a bit. The first thing I did was write a simple benchmark using the FRAMES system variable as a time reference. While doing that, I noticed something odd. I wondered whether the REM statement had any overhead, and it does (albeit very little). However, the overhead was smaller if there was no text after the REM. The line would execute in 0.04 or 0.05 PAL frames if it was simply REM, but would take 0.06 PAL frames if it was REM ANYTHING GOES HERE. It does not seem to matter whether the text is one character or a hundred characters; it always takes slightly longer if there is any text. This was reproducible every one of the dozens of times I tried it. Note that I am currently using an emulator; however, it is a cycle-accurate one, so it should behave identically to a real Sinclair ZX Spectrum running 128 BASIC. The benchmarking code is below:
10 LET T1=PEEK 23672
20 LET T1=PEEK 23673*256+T1
30 FOR I=0 TO 100
40 GO SUB 130
50 NEXT I
60 LET T2=PEEK 23672
70 LET T2=PEEK 23673*256+T2
80 LET TD=T2-T1-67
90 IF TD<0 THEN LET TD=0
100 PRINT "SECONDS","FRAMES"
110 PRINT TD/5000,TD/100
120 STOP
130 REM LINE GOES HERE
140 RETURN
What could be causing this behavior? Why would the presence of text in a REM statement affect the time it takes to execute? I can understand why the overhead is not zero, but not this. My suspicion is that it is related to tokenization: perhaps the interpreter reads the token and the rest of the text separately, regardless of whether the text is going to be interpreted. I can't verify this.
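One way to see how the line is actually stored, and so get a handle on the tokenization question, is to dump the first bytes of the program area. This is only a minimal sketch: it assumes the REM line is the first line of the program and that the PROG system variable at 23635/23636 points to the start of the program. The variable names and the choice of ten bytes are arbitrary.

```basic
10 REM FOO
20 REM P = start of the BASIC program, read from the PROG system variable
30 LET P=PEEK 23635+256*PEEK 23636
40 REM Print the address and value of the first ten bytes of line 10
50 FOR A=P TO P+9
60 PRINT A,PEEK A
70 NEXT A
```

Each stored line starts with a two-byte line number and a two-byte length, followed by the tokenized text and a terminating ENTER (code 13); REM itself is a single token byte (234). Running this once with REM FOO and once with a bare REM on line 10 shows which extra bytes the text adds.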
Why does REM have less overhead than REM FOO in 128 BASIC?
EDIT: I tested this more with different ROMs and BASIC versions, and I've concluded that there must have been a problem with my test. I thought I had already reproduced the results with this code, but when I tried again and actually wrote the numbers down, there turned out to be no change in the number of frames that pass when interpreting REM with and without accompanying text. It seems the only reason my previous code gave the results it did has to do with the time it took GO SUB to jump to its target, or something along those lines. I used this code:
10 LET T1=PEEK 23672
20 LET T1=PEEK 23673*256+T1
30 FOR I=0 TO 1000
40 REM THIS IS A COMMENT
50 NEXT I
60 LET T2=PEEK 23672
70 LET T2=PEEK 23673*256+T2
80 PRINT T2-T1
I ran this on 48K, as well as on 128K with both its 48 BASIC and 128 BASIC interpreters, replacing line 40 with a bare REM and also removing it altogether. The results of 1000 loops of this test, in elapsed PAL frames, are recorded below. Clearly, something was wrong with my previous methodology.
| | +REM +text | +REM −text | −REM −text |
|---|---|---|---|
| 48 BASIC, ZX48 | 241 | 241 | 221 |
| 48 BASIC, ZX128 | 238 | 238 | 219 |
| 128 BASIC, ZX128 | 372 | 372 | 311 |
| 48 BASIC, Pentagon | 227 | 228 | 209 |
| 128 BASIC, Pentagon | 338–343 | 338–343 | 286–288 |
The results are telling. When not using GO SUB, there is no overhead incurred by adding any text to REM (but still some for interpreting the command itself, albeit not much). 48 BASIC is the fastest, though slightly slower on the 48K than on the 128K running 48 BASIC. 128 BASIC is by far the slowest.
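As a rough worked example, the per-iteration cost of a REM line can be estimated from the table by subtracting the −REM column from the +REM column and dividing by the loop count. The sketch below does this for the three ZX48/ZX128 rows. It assumes exactly 1000 iterations and 50 frames per second (20 ms per frame), and the variable names are illustrative.

```basic
10 REM Per-iteration cost of REM, from the table above
20 REM Assumes 1000 iterations and 20 ms per PAL frame
30 FOR K=1 TO 3
40 READ N$,A,B
50 REM A = frames with REM, B = frames without it
60 LET F=(A-B)/1000
70 PRINT N$;": ";F;" frames, ";F*20;" ms"
80 NEXT K
90 DATA "48 BASIC, ZX48",241,221
100 DATA "48 BASIC, ZX128",238,219
110 DATA "128 BASIC, ZX128",372,311
```

For the 48 BASIC, ZX48 row this is (241 − 221) / 1000 = 0.02 frames, or about 0.4 ms per REM. For 128 BASIC on the ZX128 it is (372 − 311) / 1000 = 0.061 frames, or about 1.2 ms.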
I will do more tests to try and find out what caused the previous behavior that I misattributed to REM.
… 1 GOTO 1000, 10 test program from here until 998, and 999 RETURN. This minimizes GOSUB overhead in any BASIC interpreter. After all, you want to measure the target, not the GOSUB. | This works due to the way a GOSUB searches for its target. The great part is that this structure will never have a negative impact, no matter what BASIC is used. With most BASICs it will greatly reduce search time, as only one line (#1) has to be skipped to find the subroutine start. – Raffzahn Feb 25 '18 at 11:10
… -67 part is. Testing showed 67 frames' worth of overhead for 100 iterations of an empty loop (i.e. jumping straight to the RETURN). – forest Feb 26 '18 at 01:59
… RETURN. – forest Feb 26 '18 at 04:02
… PAUSE especially, it's possible to determine at least an approximate amount of overhead. Currently I don't have any other ways to robustly test performance from the BASIC interpreter. When the math is adjusted so a hundred iterations over a single blank line reports 0 frames passed, then it's safe to say that (almost) any other lines can have their delays measured with at least a rough accuracy. It's certainly not a robust test setup, but it's all I have when working with the constraints of the BASIC interpreter. – forest Feb 26 '18 at 04:29
… GO SUB and RETURN do not affect the benchmark by putting in a number of sample lines and testing to make sure the delay increased in a linear fashion (so e.g. the RETURN statement doesn't take significantly longer if it has to jump up a couple lines farther). – forest Feb 26 '18 at 04:31
… GO SUB as a variable to the equation, the behavior changed (though I still don't know why it had that behavior in the past). See my edit. – forest Feb 27 '18 at 02:49
… GO SUB. Or, if there are other reasons (like extendable structure), minimize the influence of any noise. That was my approach. Bottom line, no one needs 5 valid digits. All you need is up to three, but solid ones. – Raffzahn Feb 27 '18 at 14:44
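The layout suggested in the first comment can be sketched roughly as follows. This is an illustration only: the line numbers from 1000 up, the loop count, and the placeholder REM on line 10 are assumptions, and the baseline subtraction from the original benchmark is left out.

```basic
1 GO TO 1000
10 REM LINE UNDER TEST GOES HERE
999 RETURN
1000 REM Timing loop; FRAMES low and middle bytes at 23672/23673
1010 LET T1=PEEK 23672+256*PEEK 23673
1020 FOR I=1 TO 100
1030 GO SUB 10
1040 NEXT I
1050 LET T2=PEEK 23672+256*PEEK 23673
1060 PRINT T2-T1
```

With the subroutine placed directly after line 1, a GO SUB that searches from the start of the program only has to skip one line to reach its target. How much this saves on the Spectrum specifically has not been measured here, so an empty-subroutine run is still needed as a baseline.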