21

The Apollo Guidance Computer was used to control the command/service module and lunar module on the missions to the moon. (Definitely a retrocomputer!) As noted in this answer, programs were written in assembly language. There are several emulators available today, including one which can be run in a web browser.

Even though the AGC was invented before the C programming language, is a C compiler possible for this architecture? If not, why?

For the purposes of this question, a satisfactory compiler would support all of the C operators (including arithmetic, boolean, structure, and pointer) and serve the original purpose of the AGC: notably, real-time signal processing and control. It does not have to be a lunar mission; the AGC was also used in a Navy rescue submarine and in the first airplane with computer fly-by-wire control.

Less important but nice to have:

  • Originally I included structure and pointer operations as a requirement. However, arrays with indices would probably suffice.
  • Ability to act as a general-purpose platform.
  • Compliance to one or more standards (including but not limited to K&R, ANSI, and Embedded C).
  • Floating point. The original software used fixed-point numbers, with subroutines for subtraction, multiplication, and division. Such numbers can be declared with Embedded C's fixed-point types (_Fract/_Accum). We'll call that good enough, even if it is possible to implement IEEE floating point.
  • Standard libraries or system calls (i.e. stdio should not be a concern).

The compiler would be hosted on another system, not on the AGC itself.

I hope these clarifications help!

(Photograph of Apollo Director of Software Engineering Margaret Hamilton, next to the source code of her team)

DrSheldon
  • 2
    "A satisfactory compiler..." - for what purpose? – Bruce Abbott Sep 03 '18 at 21:52
  • 2
    The Standard does explicitly recognize that an implementation might be freestanding, with no support for much of the standard library, rather than hosted, with full support for it. – Davislor Sep 03 '18 at 23:04
  • It will depend heavily on the level of compliance, e.g. not sure if the standard library could fit in the memory. – user3528438 Sep 03 '18 at 23:21
  • Is there any reason to assume it's not possible? – Mast Sep 04 '18 at 05:55
  • 3
    This is weird. The guidance computer is much more powerful than a lot of processors I use today and for which there are a myriad of commercial C compilers - including floats and pointers in 512 words of code and 27 bytes of RAM. – pipe Sep 04 '18 at 07:55
  • That's a lot of source code. – Omar and Lorraine Sep 04 '18 at 09:02
  • @pipe: When you say powerful, are you talking about its 55W power consumption? (joking). https://en.wikipedia.org/wiki/Apollo_Guidance_Computer#Timing says it runs at ~1MHz. The transport-triggered computation is neat (store to a special location and load it back, and you get your data shifted right by 7 to extract the high bit of the byte). Wiki says each instruction took at least 12 "timing pulses", which is presumably a clock cycle. Multiply used 8 * 12 clocks. What currently-used microcontrollers are slower than the AGC, and for what kinds of computations? – Peter Cordes Sep 04 '18 at 09:05
  • @PeterCordes Here is one that is pretty useful, and multiplication would be pretty slow because it doesn't have a multiplier and the instruction set is pretty clumsy. However, it will not beat the guidance computer on power consumption! Only 300 µW :( – pipe Sep 04 '18 at 09:29
  • @pipe: All single-cycle instructions except for program branches, which are two cycles, and it has a 4MHz clock. So it's about 48 times faster in instructions per second, assuming both are running mostly simple instructions. To beat the AGC at integer mul, you only need to implement a 15-bit integer multiply out of shift and add (PIC has that, right?) in 4 (clock ratio) * 8 (mul cost) * 12 (clocks per subgroup on AGC) = 384 clocks. Unless add/adc is complete garbage on PIC (for 16-bit integer add on an 8-bit CPU), it should be much faster. (I might be off by a factor of 4 for a pulse) – Peter Cordes Sep 04 '18 at 09:35
  • @PeterCordes Actually they lie a lot because one instruction cycle is 4 clocks which they rarely mention. According to some math lib datasheet I found, a 16x16 bit multiply takes around 300 instruction cycles (so that's at 1 MHz). But target clock speed is not really a factor for a C compiler, the problem as I see it is the instruction set, code space, and RAM, and this PIC sucks at all of them but still have compatible compilers. – pipe Sep 04 '18 at 09:50
  • 1
    @PeterCordes: While the PIC10F series doesn't include crystal oscillator circuitry, it was not uncommon to run older PICs off a 32768Hz crystal, which would result in them executing 8192 instructions/second, while consuming very little power. – supercat Sep 05 '18 at 04:29
  • 1
    Featured picture on English Wikipedia 7/19/2019, part of a series of NASA/Apollo 11/etc. pictures. – manassehkatz-Moving 2 Codidact Jul 19 '19 at 03:22
  • Cowgol is an "Ada-inspired programming language" for the 6502 and Z80 and AGC. It can compile and run (under emulation) a lunar lander game. – Kelvin Sherlock Jul 23 '19 at 03:52

4 Answers

21

A full conforming compiler would be impractical, but it would probably be possible to write a compiler for a subset of the language with a couple of features removed:

  1. While it would be possible for a compiler to emulate recursion, code that needs to support re-entrancy would likely be much less efficient than code which doesn't. Given that the Standard imposes no requirement that compilers support recursion usefully (there's no guarantee that it be possible to nest any particular function more than one deep without bombing the stack), simply refusing to support recursion would seem more practical than generating re-entrant code for functions, and more "honest" than accepting such code but behaving in goofy fashion if functions are invoked recursively.

  2. The Standard would require that an implementation support floating-point math on values with a mantissa of greater than 32 bits. Limiting floating-point computations to a 32-bit or even 16-bit mantissa would allow them to be handled with smaller and faster code than would be possible with a standard-conforming "double".

Usable C compilers exist for microprocessors whose architecture is even less "C-friendly" such as the CDP1802 (interesting chip, but the code to accomplish something like "ptr[index] = 1234;" would take 21 instructions) so the Apollo computer, which has an INDEX instruction, doesn't look too bad by comparison, at least if code doesn't need to support re-entrant functions.

supercat
  • Wasn't there a recognized subset of C without floating point, back in the day? (A quick web search didn't find it - but I can't remember exactly what it was called). – alephzero Sep 03 '18 at 19:49
  • 19
    The C standard actually doesn't set any limits to floating point precision. It just says double needs to be at least as precise as float. It's just a convention that most compilers use IEEE 754 for floating point, and that says "float is 32 bits" (mantissa and exponent combined). – tofro Sep 03 '18 at 20:10
  • 7
    There's a few C compilers for microcontrollers that implement software floating point as a library and allow you to not link the floating point library if your program only use integers. – slebetman Sep 03 '18 at 23:15
  • @supercat: Note clarification to question. Hope this helps! – DrSheldon Sep 04 '18 at 01:42
  • 3
    @slebetman: That was also the case for early PC C compilers (e.g., Microsoft QuickC and Borland Turbo C) before the x87 FPU became a standard feature. – dan04 Sep 04 '18 at 04:33
  • 1
    @dan04: Same for the Amiga / MC68851, and I guess lots of other platforms. – DevSolar Sep 04 '18 at 07:34
  • @tofro: Besides that, Annex F says that float has to match the IEC 60559 single format, and even the (only informative) Annex E requires FLT_MAX ≥ 1e+37 and FLT_EPSILON ≤ 1e-5, leading to pretty much a 32-bit float. – PlasmaHH Sep 04 '18 at 09:33
  • I'm no compiler writer, but reentrancy seems simple (basically implement a stack). What's hard about it? – Peter - Reinstate Monica Sep 04 '18 at 12:52
  • @PeterA.Schneider: Unless a platform supports base+displacement addressing, storing variables on the stack is apt to be far less efficient than storing them at fixed addresses. Taking a 2:1 or worse performance and code-size hit to support recursion is typically not worthwhile. – supercat Sep 04 '18 at 14:54
  • @tofro: The Standard specifies that DBL_DIG must be at least 10, which would require a significand of at least 35 bits. If the Standard had only required conforming implementations to make it be at least 8, that would have greatly reduced (often by 50% or more) the cost of floating-point math on FPU-less systems that perform all math using the same floating-point type. On the ARM, for example, a type that combined a 32-bit significand word with explicit leading "1" and a 32-bit sign+exponent word could offer an excellent blend of precision and performance. – supercat Sep 04 '18 at 15:57
  • @dan04: An under-appreciated feature of IEEE-754's extended-precision type is that while it's often associated with the 8087, computations using that type were faster than computations on IEEE-754's double-precision type, especially on 16-bit or 32-bit systems without an FPU. On such systems, it will generally be necessary to unpack the exponent and significand from a 64-bit value into five separate 16-bit words or three separate 32-bit words before doing anything with them, and then pack those words back into a 64-bit value afterward. – supercat Sep 04 '18 at 16:05
  • @dan04: Although ANSI C undermined the usefulness of extended-precision floating-point types by botching the semantics of how they interact with variadic or non-prototyped functions (it should have kept the guarantee that all floating-point expressions get converted to a common type [possibly truncating to double precision] while providing a type that can be used to wrap/pass full-precision values to code that explicitly expects them) the semantics and performance they offered were much better than those achieved when performing computations directly on float and double. – supercat Sep 04 '18 at 16:25
  • 1
  • @PeterA.Schneider: On the AGC, the prologue for a non-reentrant non-leaf function would need to use two XCH instructions to move Q to a location which is hard-coded to hold that function's return address, and the epilogue would use two instructions (INDEX and TC) to jump there. A statement like x=y-z; would need three instructions--one to load -z, one to add y, and one to store the result. Allowing for re-entrancy would require adding three instructions to the prologue and epilogue to adjust a register used as the stack pointer, and would require adding an... – supercat Sep 04 '18 at 22:23
  • INDEX instruction before every instruction that accesses something on the stack. The penalty for re-entrant code wouldn't be as bad on the AGC as it would be on some other platforms, but would still be quite severe when using "raw" machine code (about a 2:1 cost in space and time) but perhaps not unworkable when using interpreted code. – supercat Sep 04 '18 at 22:25
  • As far as I know there ain't no rule that the compiler can't be a full tracing compiler and only emit recursive-enabled code for functions that actually are recursive. – Joshua Feb 11 '24 at 20:28
  • @Joshua: No rule would forbid that, beyond the fact that it would undermine one of the major design purposes of C, i.e. to reduce the required complexity of a build system that could--with a programmer's assistance, generate reasonably efficient machine code. – supercat Feb 11 '24 at 21:07
11

I'm aware this is an old thread, but I thought it would be worth a brief comment given the relevance of some work I've done.

Personally I hope the answer to this question is yes, since last year I started a personal project to implement an LLVM compiler for AGC (LLVM, Clang).

I found it took a lot of effort just to manipulate the existing framework of the compiler, designed for modern processors, to be able to represent the funky aspects of the AGC assembly and programming model. For example, the blurred lines between what is a register and what is 'memory' required quite a bit of thought.

There's also the question of searching through the entire pipeline of the compiler to track down places where numbers are assumed to be two's complement. Constant folding comes to mind as something that would completely mess up your code if it made this assumption.

I only have experience with LLVM, but I would imagine even to an expert, writing a GCC backend would throw up the same class of issues. I'd expect there would be more chance of success from writing a well designed compiler from scratch.

For those who are interested I gave a talk this year at FOSDEM on this: https://fosdem.org/2019/schedule/event/llvm_apollo/

lewis-revill
8

Even though the AGC was invented before the C programming language, is a C compiler possible for this architecture?

To begin with, it depends on the meaning of "for" :))

  • If the question is whether a compiler can be written (on some other computer) to produce code for the AGC (a.k.a. a cross-compiler), the answer is a clear Yes.

  • If it asks for a compiler running on the AGC itself, it gets harder. Not so much for fitting a compiler into its 36 kWords of program ROM, but for the not-so-theoretical problem of getting a source program into the machine to be compiled. Here I would go for a theoretical yes, but, with all love for low-level interfaces, in practice the answer is No.

For the purposes of this question, a satisfactory compiler would support all of the C operators (including arithmetic, boolean, structure, and pointer). It would not need to support all of the standard libraries or system calls (i.e. stdio should not be a concern).

"Satisfactory" is a nice word - just not very clear. Is it satisfactory if, for example, the only data type available is a 15-bit word, or is floating point mandatory? Does it only need to follow basic K&R, or is C99 (or C11) the goal?

The compiler would be hosted on another system, not on the AGC itself.

Sounds good. So the answer is a clear Yes.

It is possible to write a C compiler (even on a computer contemporary with the AGC) following K&R, even including FP, whose output can be loaded (well, wired) into the AGC. For doing FP it might have to carry a considerably large runtime (library), so C without FP and only a 15-bit integer data type might be the preferred solution to keep as much ROM as possible for useful code.

And then there is the question asked in the title (emphasis by me) which somewhat got lost in the question text:

Would a C compiler for the Apollo Guidance Computer be plausible?

Here the answer is a clear No.

The result of a C compiler will always be inferior to what an Assembly programmer can squeeze out. Considering the small code space (36 kWords) and the complex job to handle, Assembly might have been the only way to go.

If at all, a language more suited to system programming than C would have been used - most likely something similar to MOL-360 or PL/I (or rather its specialized cousin PL/S).

Raffzahn
  • 5
    "a C compiler will alway be inferiour...": Note that optimizers can easily outsmart humans, but you are correct that humans can make assumptions that optimizers are not allowed to. On a general purpose computer, I'd say a C compiler will usually be superior to hand crafted assembly, and that's ignoring all the advantages of a higher level language over assembly (readability, etc.) – isanae Sep 04 '18 at 01:17
  • 4
    @isanae As I see it, if a compiler outsmarts a human Assembly programmer, then he wasn't very smart to start with. Besides not seeing many advantages of high level languages, I still haven't seen in ~40 years of programming a single compiler producing superior code to what a human programmer can do (not least due to the fact that all their tricks have been devised by human programmers in the first place). Compilers may only bring an advantage compared to less than good programmers. But then again, these are the same ones using less than fitting algorithms - something no compiler can put right again. – Raffzahn Sep 04 '18 at 01:32
  • 1
    @Raffzahn: Note clarification to question. Hope this helps! – DrSheldon Sep 04 '18 at 01:43
  • 1
  • @DrSheldon Thanks - just, I can't really see how this would change my answer - or what did I miss? BTW, dropping structures doesn't change a lot, as these are rather constructions for compile-time address calculation. Nothing that may (or may not) hurt the resulting target. Somewhat similar for pointers - more importantly, pointers are an extremely important part of basic C, so without them a lot is lost in (standard) compatibility. – Raffzahn Sep 04 '18 at 01:44
  • 7
    Ah yes, the famed Assembly Programmer. If you cannot see many advantages to high level languages compared to assembly, then we have such a fundamental disagreement and exceptionally different professional experience that I don't think we'll ever agree on who's the best optimizer ;) – isanae Sep 04 '18 at 01:49
  • 9
    @isanae: You happen to have picked a really bad architecture to try to make your point with. The statement about the compiler output beating a decent assembly programmer wasn't really true until after the rearranging processors came out. On a fixed-order fixed-execution-cycle processor the assembly programmer tends to win. – Joshua Sep 04 '18 at 01:58
  • 1
    @Joshua Huh? My first comment says "On a general purpose computer, I'd say a C compiler will usually be superior to hand crafted assembly". – isanae Sep 04 '18 at 01:59
  • 2
    @isanae Well, evading is always a good way to preserve a bias, isn't it? I wouldn't be so sure that our experience is as different - it might rather be the viewpoint. To start with, there is no 'best' optimizer, as that implies that writing Assembly works the same way as a compiler does - first creating some code and then trying to improve it. Choosing a good code sequence (and more important, structure) is part of the basic implementation process - it's picking the right algo for the right situation. Not sure if I'm famed, but doing 40 years of mainframe (and micro) Assembly leaves some scars :) – Raffzahn Sep 04 '18 at 02:02
  • @Joshua: Wasn't that rather just before heavily rearranging processors? In the period where we got instruction sets with a grazillion registers and parallel pipelines but everything might stall (or give wrong results) if you used a register too soon. – hmakholm left over Monica Sep 04 '18 at 04:27
  • 1
    I'd say it would definitely be a problem - with only 2 16-bit general purpose registers and no stack, dedicated shift and rotate registers, implementing a working C compiler is a bit of a challenge.... Even writing programs in Assembly sounds like a challenge to me. – tofro Sep 04 '18 at 05:33
  • 2
    @isanae: A clever human these days will start with compiler output (if it's not horrible) when looking for optimizations. That makes it impossible to lose to the compiler on the machine you're benchmarking on for tuning. (You may introduce some things that are slow on other CPUs you don't know about, e.g. if you don't check https://agner.org/optimize/ intruction tables for Ryzen and optimize something using pext/pdep which is fast on Intel but slow on AMD.) – Peter Cordes Sep 04 '18 at 08:46
  • 1
    @isanae: Compilers can do constant-propagation on a large scale which would be unmaintainable for hand-written asm, but on a small scale (a single hot loop) they're sometimes optimal, but more often leave some performance on the table. Sometimes you can hand-hold them into making near-optimal asm (what I recommend trying to actually do in real life), but hand-written asm can in some cases be even faster. See C++ code for testing the Collatz conjecture faster than hand-written assembly - why? for examples of both. – Peter Cordes Sep 04 '18 at 08:49
  • 3
    @isanae: Current compilers (gcc and clang) are still terrible for ARM SIMD intrinsics (which is weird because they're very good with x86 SIMD intrinsics). Writing manually-vectorized loops in asm by hand is definitely still recommended for ARM (but not x86 or PowerPC). I totally agree with you that writing whole programs in asm is not sensible these days, but knowing asm, and how your C or C++ will compile to asm, is useful to keep in mind. (I basically can't stop myself from thinking in terms of asm even if I wanted to, except by writing in bash or perl instead of C/C++.) – Peter Cordes Sep 04 '18 at 08:52
  • 1
    The question of what is "inferior" will very much depend on your viewpoint. If you haven't read The Story of Mel, I would very much recommend it. It's fascinating and educational; and edited as a free-verse poem it has a strange flowing beauty. It's important to remember that the landing computer had numerous bugs, one of which famously nearly caused Neil Armstrong to abort landing, and arguably a less impenetrable program would have been easier to review for correctness. Doing the wrong thing faster is not generally considered "better"! – Graham Sep 04 '18 at 11:20
  • 3
    @isanae: Nope. Beat the optimizing compiler on the Pentium 1 was first semester assembly. It's not that hard, but it takes an hour a function. – Joshua Sep 04 '18 at 13:57
  • 2
    @Graham: Don't run the rendezvous radar and the landing radar at the same time was in the manual. That "bug" was asking for more CPU than was available. No amount of proving correct can handle not enough CPU because a device driver that should not be running is stealing it. – Joshua Sep 04 '18 at 14:01
  • 2
    One aspect to all this which I find interesting is applying modern compilers on modern CPUs to generate code for very old CPUs — you sometimes end up with code which would have been near-impossible to devise when the old CPUs were still “relevant”. I get the impression the question here is asking about C compilers at the time code was being written for the AGC, but all the discussion about optimising compilers v. hand-written assembly changes somewhat if you cross-compile using modern tools. – Stephen Kitt Sep 04 '18 at 14:07
  • 2
    To whoever's flagging comments: moderators aren't Fallacy Man. Just post another comment pointing out the flaws. – wizzwizz4 Sep 04 '18 at 15:04
  • 1
    @Graham Regarding the program alarms triggered during the Apollo 11 descent towards the lunar surface, if anything, the guidance computer design ensured that those were a nuisance, not critical. The multitasking, real-time properties of the computer and software allowed the computer to keep performing the required tasks and dispense with the non-critical ones in a controlled fashion, thus maintaining its real-time properties. Mission Control: The Unsung Heroes of Apollo makes a point of this, referring to a simulated landing with a similar error triggered but the LM remaining controllable. – user Sep 04 '18 at 15:10
7

One of the biggest problems with C for this architecture is the fragmented address space.  You would almost want some extensions for C that direct the compiler where to locate (global) data so that the various data would be accessible in an easy and known way from the code that uses it.  Somewhat reminiscent of FORTRAN Common Blocks...

Consider for a minute the 8086 extended, 20-bit addressing.  Compilers for that architecture had to choose a memory layout model for program execution.   There are basically three options:

  • Stick with 16-bit pointers — and forgo the larger memory for the program (i.e. everything fits in 64k), leaving that additional address space for running multiple programs (rather than for running larger programs).

  • Use 32-bit pointers to store 20-bit addresses — that means that every pointer dereference or array indexing operation required multiple instructions, involving swapping of segment registers and the like.  So, a simple *p++ = *q++; becomes a dozen or more instructions, whereas it is ideally a single instruction.

  • Use 16-bit spaces for each of code, global data, stack, and heap.  Thus programs of 256k are possible, with 64k of each of the above.  This was a reasonable option for Pascal, it being a less pointer-oriented language (having reference parameters, for example), but not as much for C, which is much more pointer-happy (e.g. using pointers instead of reference parameters).

Architectures with paged memory banks using segment-specifying registers are surprisingly easy for a human to program in assembly but hard for a compiler to target.  These architectures typically use a common base page, perfect for some of the globals, but easy to overflow if you put all the globals there.  So, again, you would almost want some location directives in C to inform the compiler that certain globals should go in the coveted base page, vs. elsewhere.

Apparently the AGC has two levels of memory segments, the second due to the expansion by the Block II (via the SBank/SuperBank bit).  These things tend to wreak havoc with models of code generation and C's expectation that a universal full-address-space-sized pointer can refer to anything: code, data, stack, heap...

That's not to say it couldn't be done, but you'd want a number of language extensions, or else you'd find it extremely difficult to reach the efficiency of hand-written assembly.

Erik Eidt
  • It's been a while since I used it, but I'm convinced one of the MSDOS C compilers I worked with had a way of annotating variables to put them into a common block. Most modern C compilers provide a way of specifying the object file section to put an object in; once you've done that a linker script can be used to place the sections in appropriate locations. – Jules Sep 06 '18 at 08:57
  • 1
    There were two different ways of using 32-bit pointers: far and huge. A far pointer would be limited to accessing an object of 65,520 bytes or less anywhere in memory, but code to access far pointers could be reasonably efficient. Given int *p,*q;, *p++ += *q++; would generate something like les bx,[bp+12] / add word [bp+12],2 / mov ax,[es:bx] / les bx,[bp+8] / add word [bp+8],2 / add [es:bx],ax. Six instructions total. Compared with using near pointers, this requires using les instead of mov... – supercat Sep 06 '18 at 17:00
  • 1
    ...and requires adding an es: prefix for the operations that use the pointers. A huge pointer could handle individual objects up to the full size of memory, but almost any address computation would either require a subroutine call or bloat the code enormously. Something really simple like p++ would become something like add byte[bp+8],2 / cmp byte [bp+8],17 / jb ok / inc word [bp+10] / sub byte [bp+8],16 / ok: [which might be practical without a subroutine call] but the best code for something like p+=someLong would probably be something like... – supercat Sep 06 '18 at 17:07
  • uint24_t temp = p_ofs <<4 + p_seg<<8; temp += someLong<<4; p_ofs = (temp & 240) >> 4; p_seg = temp>>8; Too painful for words. Even p+=someInt isn't all that much better. I think most complaints about the 8086 come from people who don't understand how to use far pointers effectively, since the code for them is more efficient than for any other 16-bit processor that needs to access 65,520 byte objects at arbitrary locations in RAM. – supercat Sep 06 '18 at 17:26
  • Weren't "far" and "near" C keywords created exactly for that? – jeancallisti Sep 13 '18 at 10:59
  • This isn't a big issue for C compilers. For example, GCC can generate code for AVR processors that use separate program and data busses (Harvard architecture) and non-contiguous memory addressing. – user Jul 22 '19 at 11:08
  • 4
    'far' and 'near' were implementation-specific extensions to the language, not part of the language. – dave Jul 22 '19 at 12:19
  • @another-dave: By the time the C Standard was published, a substantial plurality of the C code that had been written was targeted to support the near/far keywords. IMHO, the Standard should have recognized such keywords as distinguishing between objects in, or pointers to, an implementation-defined subset of memory (which may or may not include all of memory) versus those that could be anywhere. Even a modern architecture like ARM could benefit from such distinction if tooling were set up to exploit them. – supercat Mar 15 '20 at 19:57
  • @another-dave: All implementations would be capable of processing code that used the keywords as well as codes that didn't if they simply allowed all objects and pointers to be located everywhere, but recognition of such keywords would allow many programs to run more efficiently on many platforms if they partitioned their objects into a few groups identified by macros that could be defined as near or far, depending upon how much near space a particular target had available. – supercat Mar 15 '20 at 20:00