125

A couple years ago the Toronto PET Users Group published an article (The Great Commodore/Microsoft Easter Egg War, on p. 7) about a newly discovered anti-Microsoft Easter Egg that Commodore hid in the ROM of the Commodore 64 and other machines. Some sample code for triggering the Easter Egg is provided and I have verified that it really does print an unflattering message about Bill Gates (at least in the VICE C64 emulator):

enter image description here

How exactly does this work? The article is sketchy on the details, saying only that the message is "embedded inside BASIC's random number generator". How did Commodore embed this message inside the random number generator, and how did the TPUG author reverse-engineer it?

Raymond68
  • 1,255
  • 2
  • 9
  • 9
  • 43
    This looks more like an "infinitive number of monkeys writing Shakespeare" thing than an Easter Egg. You should be able to find much more "wisdom" (actually, any wisdom) in a perfect random number generator... – tofro Aug 02 '17 at 17:48
  • 1
    I think as the C64 ROMs were developed, Bill Gates were only the leader of a quasi-nameless startup of the IBM. – peterh Aug 03 '17 at 22:13
  • 11
    @peterh, Bill Gates and Paul Allen wrote the Commodore BASIC interpreter. – dan-gph Aug 04 '17 at 07:53
  • 1
    @tofto: it's a pseudorandom number generator returning 40-bit values, it's nowhere near close to invoke the monkeys. They had to do it on purpose. – ajuc Aug 04 '17 at 12:05
  • 1
    The only easter egg I know of is WAIT 6502,x It print MICROSOFT! x times. – cup Aug 04 '17 at 12:36
  • 7
    @peterh No. Microsoft wasn't a start-up and wasn't an IBM company. At that time, Microsoft BASIC was already a very common and successful BASIC interpreter -- as has been pointed out, Commodore BASIC itself was a derivative of Microsoft BASIC. Indeed, the success of Microsoft BASIC was probably a big part of the reason that Microsoft got the operating system contract for the IBM PC. But there's no reason that Commodore would include an Easter egg insulting one of their suppliers who was just one of many successful companies and nothing like the monopolistic giant we know today. – David Richerby Aug 04 '17 at 13:51
  • 4
    Basically, the output message is just encoded in the "magic numbers" assigned to A in the code. – JimmyB Aug 04 '17 at 14:24
  • 2
    So the answer suggests it's not strictly based on an infinite number of monkeys, just on finding three very specific monkeys, each of which types one of these three words. – SF. Aug 08 '17 at 18:16

2 Answers2

240

I'm the author of the TPUG article.

The "BILL GATES SUCKS" message isn't really an Easter egg; that was just a conceit of mine to make the article a bit more interesting and to turn it into a bit of a puzzle. Here's how it works and how it was created:

In any given infinite sequence random numbers, it's a mathematical certainty that a given subsequence of numbers will appear at some point. For example, if you repeatedly roll a normal six-sided die, then at some point you are bound to roll the sequence 6, 5, 4, 3, 2, 1—though it might take you hours of rolling before this happens. Similarly, if you had a hypothetical 26-sided die labelled with the letters of the alphabet, then repeatedly rolling it you would expect at some point to form the words "BILL", "GATES", and "SUCKS". Again, this might take you hours, days, or even months of continuous rolling.

The "random" number generator in the Commodore 64 and its brethren isn't truly random, but it's close enough that you can still expect to find arbitrary subsequences in its output. The only problem (or in my case, feature) with it is that, every time you turn on the machine, it always produces the same "random" sequence of floating-point numbers, all between 0 and 1. The random sequence can be changed only by seeding the generator with a numeric value. Each seed value causes the generator to produce a different random sequence.

With this in mind, I set out to find three seed values such that the first five or six numbers of their respective sequences, multiplied by 22 and rounded down to the nearest integer, correspond to the the alphabetic indices of the letters in the words BILL, GATES, and SUCKS, followed by a zero. For example, there is some seed value that makes the C64 random number generator start off with a sequence of floating-point numbers x1, x2, x3, x4, x5 such that ⌊22x1⌋ = 2, ⌊22x2⌋ = 9, ⌊22x3⌋ = 12, ⌊22x4⌋ = 12, and ⌊22x5⌋ = 0; note that B is the 2nd letter of the alphabet, I is the 9th, and L is the 12th. (I multiply by 22 because that's one more than the index of U, the alphabetically last letter in the message.)

These three seed values were found by writing a BASIC program that iterated through all possible integer seed values until it found one that generated the correct sequence:

  0 k=1:f=22
 10 inputw$
 15 w$=w$+chr$(64)
 20 l=len(w$):dimw(l)
 30 fori=1tol
 35 printi
 40 w(i)=asc(mid$(w$,i,1))-64
 50 next
 60 i=-1
 70 a=rnd(i):j=k
 90 printi:ifj>lthenend
100 ifint(rnd(1)*f)=w(j)thenj=j+k:goto90
110 i=i-k:goto70
120 ifi>-1theninputi:ifi>-1theni=-i
130 i=rnd(i)
140 i=int(rnd(k)*f)
150 ifi=0thenprint:end
160 printchr$(i+64);:goto140

To use it, first think of a short word you want to "encode", and change the value of f in line 0 to the index of the alphabetically last letter in the word plus one. Then run the program and type in the word. The program will then test each seed value, printing them out as it goes. It stops when it finds a seed that prints the string.

The program looks a bit odd, but that's because it's optimized for speed. (For instance, I prefer using variables and gotos to constants and for loops.) I compiled it using the Blitz! compiler and ran it using the VICE emulator in warp mode. This way seeds can be found within a few minutes or hours. Runtime on a real C64 would have been tens or hundreds of times longer.

Once I had the three seed values, it was a simple matter to write a short BASIC program that used them to seed the random number generator three times and print out the generated letters.

Psychonaut
  • 7,133
  • 2
  • 24
  • 54
  • 60
    It's always delightful when a SE question gets an answer from an authority on the subject. Welcome to retrocomputing! – Wayne Conrad Aug 02 '17 at 22:51
  • Now the puzzle is to find out how lucky you were that the numbers you've got are at most in the low tens of millions rather than hundreds of millions. – Leo B. Aug 03 '17 at 00:49
  • 6
    If you have a fast computer, mist64/cbmbasic runs at native speed. – scruss Aug 03 '17 at 02:01
  • @scruss It's not actually "native speed" as it's a 6502 emulator running CBM BASIC ROMs. Not much different than VICE except how its output is displayed. –  Aug 03 '17 at 06:14
  • 4
    Actually, you're only looking for character distances in the series of numbers, as there is an offset added to the numbers to build characters anyhow - this offset can vary as well - should be even easier to find whatever words you want, then. – tofro Aug 03 '17 at 06:46
  • 17
    "In any given infinite sequence random numbers, it's a mathematical certainty that a given subsequence of numbers will appear at some point." - strictly speaking, that's not true (it's just the probability of not getting the given numbers goes to zero). It's even less true for pseudorandom sequences that are bound to repeat. – Radovan Garabík Aug 03 '17 at 07:40
  • 6
    @RossRidge it really is native speed. cbmbasic says “This source does not emulate 6502 code; all code is completely native”. Setting F=6 and W$="ABCDE", this 4 GHz i7 and cbmbasic solves for -106484 in ~1½ seconds. x64 in warp mode (~ 2300%, 105 fps) on the same machine takes about 6 minutes. – scruss Aug 03 '17 at 07:47
  • 1
    @scruss The source you linked emulates a 6502 despite what it says. While the authors tried to hide what they're doing by not providing the actual source code, but it's still obvious that cbmbasic.c a 6502 emulator. You can see global variables for the CPU registers (A, PC, S, X, Y), CPU flags (Z, N, C, V, B, D, I), as well as RAM. You can see the ROM embedded in statements like *((&llvm_cbe_ROM[((signed int )1u)])) = ((unsigned char )-29);. Note that without precisely emulating Microsoft's 40-bit floating point code like this, you wouldn't get the same result. –  Aug 03 '17 at 09:12
  • 3
    respects dear sir, despite it is a deception to call it an easter egg, it nonetheless is a very enjoyable deception. Having said that... Is that program working too on a real C64 ? I mean, the whole idea relies on the assumption that same RNG in that ROM will generate same numbers with a specific seed. While this will still be true on a C64, that specific sequence might be different on a C64 because of underlying hardware difference (emulators RNG implementation might differ). Perhaps you would have to run your seed finder again on that hardware. – Tuncay Göncüoğlu Aug 03 '17 at 13:01
  • 1
    Just as I suspected - just brute force then. A very neat idea nontheless! Welcome! – Zac67 Aug 03 '17 at 19:09
  • 1
    Ah I see. so it's just using some math to generate the message. aw that's disappointing. but amazing and interesting at the same time! – Nicholas DiPiazza Aug 03 '17 at 22:15
  • 1
    I'm curious, is TPUG meant to be an Onion style newsletter, or was the author of this just trolling people? I'm assuming the whole article is an invention? – user2851943 Aug 04 '17 at 14:22
  • @user2851943 If you think this article is so appalling, you should read its sequel, a genealogy of Commodore guru Jim Butterfield, which appeared in the subsequent issue: http://www.tpug.ca/tpug-media/nl/92-Spring2016.pdf (To answer your question seriously, though, no, the TPUG Newletter is not an Onion-style publication, though they are not averse to running the occasional April Fool's joke or other light-hearted articles.) – Psychonaut Aug 04 '17 at 18:31
  • @RossRidge I think it is quite different from VICE as VICE is an emulation that can run arbitrary 6510 code. cbmbasic does not contain any 6510 code. cbmbasic.c is the result of a compilation of 6510 code — the BASIC ROM — to C with LLVM used for optimizing. I would not call static recompilation emulation. – BlackJack Aug 05 '17 at 16:40
  • @TuncayGöncüoğlu I would not expect a difference on a real C64 since the program doesn't use hardware — the numbers solely rely on the deterministic (P)RNG in the Basic ROM. That (P)RNG code should be the same on all official ROMs. Note that RND() isn't called with zero as argument, as that would select an alternate way to produce the result. – BlackJack Aug 06 '17 at 12:40
  • So, it may be possible to find different seeds which produce BILL GATES RULES? Of course that would be dishonest and a waste of time. – Reversed Engineer Aug 07 '17 at 06:20
  • 1
    Thanks @Psychonaut So was any part of the article true? – user2851943 Aug 07 '17 at 09:55
  • Can we find a version with only one seed? – Carsten S Aug 07 '17 at 15:10
  • @user2851943 The parts describing the C128 and Plus/4 Easter Eggs, as well as Microsoft's original Easter egg and how it works, are true. – Psychonaut Aug 07 '17 at 15:23
  • @CarstenS Probably there is a version with only one seed, but I don't think it can be found using my brute-force method. The running time would be just too high. It might also be worthwhile to consider non-integer seeds, though I can't think of a convenient way of doing that. You could try all possible bit patterns to find the smallest floating-point seed, but there might be no simple way of writing that bit pattern as a constant expression in BASIC. – Psychonaut Aug 07 '17 at 15:31
  • Just to be absolutely clear. The random generator does not generate truly random numbers - it generates a stream of numbers in a completely reproducible way (this is the reason the same message is shown for everybody) which has the same properties as truly random numbers. – Thorbjørn Ravn Andersen Dec 29 '17 at 20:26
64

That's not a real easter egg. Someone just made an effort to find random seeds that produce the numbers to create the intended words. It would be an easter egg if the seed numbers were in some way related to CBM or Microsoft.

A=RND(-A) initializes the (pseudo) random generator with A, generates a random number and stores it in A.

The GOSUB20 subroutine then runs a random number series from the seed until RND(A) is too low (close to zero), using the random number as index for letters A-Z.

So, to generate any arbitrary string of reasonable length, all you need to do is to find the correct initialization value for the random generator, simply by trying all possible start values (brute force). See Psychonaut's excellent answer for details.

Zac67
  • 3,914
  • 1
  • 17
  • 23