16

I have personally always been of the opinion that it would make sense for the default integer type to be unsigned, though it's been a long time since that would have been a live issue for debate; C in the 1970s already defaulted to signed integers, and it was not the first language to do so.

I'm interested in exactly when, why and how the decision was first made. In assembly language, there isn't really a default; you always specify signed versus unsigned. So we should look at high-level languages (using here the classical definition of high-level as 'higher level than assembly'). The first significantly influential high-level language was Fortran. Modern Fortran standards mandate that compilers shall treat integer variables as signed unless otherwise specified.

When did Fortran decide this? Was the decision already made in the earliest Fortran compilers? Did any compilers, on any machines, treat integers as unsigned?

rwallace
  • 60,953
  • 17
  • 229
  • 552
  • 3
    I don't think there was any "deciding" to be done - they had a target computer. – dave Aug 24 '20 at 11:35
  • 18
    Re, "In assembly language, ...you always specify signed versus unsigned." I don't know what assembler you've been using, but I've never used one that allowed one to declare variables with types. All the assemblers that I've ever used allowed one to reserve space and to use any available op-code to access that space and operate on its content. Also note: With 2s complement arithmetic, many operations (including addition and subtraction) don't come in "signed" or "unsigned" variants. That's why 2s complement: It doesn't need as many op-codes to support both signed and unsigned numbers. – Solomon Slow Aug 24 '20 at 11:36
  • 6
    True enough, but if you've got the typical condition codes, you do need to make a choice between (for example) "branch on greater, signed" and "branch on greater, unsigned". The "specification" is therefore in terms of opcode selection - though as you say, it's not "always specify", more like "sometimes" or even "occasionally". – dave Aug 24 '20 at 14:02
  • 1
    Regardless of computer architecture and assembly opcodes, virtually no quantities in scientific computation are "non-negative by default". I would guess it never even occurred to the inventor(s) of Fortran to make real and integer values unsigned. – alephzero Aug 24 '20 at 14:36
  • 2
    @alephzero: Virtually no unsigned-only quantities? Distance, time, energy, mass, volume, current, ... in fact, all the SI units. That said, signed values are often nicer to compute with because you're not usually working right at the very edge of the valid range. – Greg Hewgill Aug 24 '20 at 21:53
  • 3
    Uh, voltage is signed. Consider AC for example: swings + and - relative to a reference 0. Time is signed: when some of you were born, my age was negative :-) – dave Aug 24 '20 at 23:57
  • 2
    @SolomonSlow /360 Assembler does type its labels/variables. There are more than a dozen types. For example, type 'A' is a pointer (unsigned), while 'F' defines an integer (signed) and 'E' a (single-precision) float; all of them word-sized (and aligned) in memory – Raffzahn Aug 25 '20 at 04:37
  • 5
    Many of the replies implicitly assume that the designers of FORTRAN in the mid-1950s were working on an ISA like the two's-complement machines we have today. They were not. The target architecture had a signed fixed-point type and a signed floating-point type, so that was what the original FORTRAN supported. Unsigned integer math was never an option, given their design goals. It was as simple as that. – Davislor Aug 25 '20 at 09:30
  • 2
    @GregHewgill, Some of the quantities you named (e.g., energy, mass, volume) are absolutes: Negative amounts of them would be unphysical (e.g., no physical thing has negative mass.) But, physical science would be crippled—never would have advanced beyond Aristotle—if we could not use negative numbers to represent changes and differences in those quantities. – Solomon Slow Aug 25 '20 at 13:02
  • 1
    P.S., "Distance" is just the magnitude of displacement, which is a vector quantity; Time is a coordinate (maybe you were thinking of duration which is the magnitude of a displacement in time); and current (like Voltage) is fundamentally a signed quantity. In fact, both current and Voltage really are 3D vector quantities, but we often pretend that they are one dimensional because we're mainly interested in circuits made of skinny wires. A signed real number is equivalent to a 1D vector. – Solomon Slow Aug 25 '20 at 13:12
  • 1
    If you are only going to support one of them then signed is the better choice. – Brian Aug 25 '20 at 14:35

2 Answers

22

FORTRAN was originally developed for the IBM 704 computer, which stored integers in sign-and-magnitude format. The original documentation describes fixed-point variables, which used the machine's native format; floating-point variables; and unsigned fixed-point constants, which were intended for line numbers and subscripts. These would be translated into offsets that fit into the 704's index registers. Other than indexing by its three index registers, the 704 had only a small handful of instructions for unsigned integer arithmetic, ACL and CAL (which could add a logical word to the accumulator). It was also possible to do multiplication by constants through a combination of shifts and additions, which was enough to compute the address of an array element, but not to perform other arithmetic.

John Backus wrote later, “We certainly had no idea that languages almost identical to the one we were working on would be used for more than one IBM computer, not to mention those of other manufacturers.”

There was no serious consideration of unsigned integer variables, because the architecture FORTRAN was designed to run on had no such native type. The decision was made for that technical reason; arguments about how useful unsigned variables would or would not have been for computational physics played no historical part in it.

Davislor
  • 8,686
  • 1
  • 28
  • 34
  • 10
    "Formula Translation" indicates that the language was intended for calculating things. That in itself argues in favour of being able to handle negative integers. – dave Aug 24 '20 at 14:06
  • 2
    @another-dave Backus was very explicit, in his retrospective, that compiling to efficient code was a more important goal than mathematical elegance. – Davislor Aug 24 '20 at 16:25
  • 1
    Perhaps - but I don't class negative numbers as being particularly elegant mathematics. – dave Aug 24 '20 at 23:59
  • 4
    @another-dave The 704 wasn't designed for elegant mathematics, but for practical scientific and engineering calculation. And Fortran was designed to make it practical for scientists and engineers who were not specialists in machine language to program the 704. – John Doty Aug 25 '20 at 00:22
  • Fun reading through the manual. Interesting that (at least in the bits I read) it is referred to as a calculator and as an electronic data processing machine but not as a computer. But it has all the classic elements of a mainframe - tape drives, punched cards, core memory, printer, etc. I never worked on a 704 myself (I'm not that old) but I do remember one of my professors talking about it. – manassehkatz-Moving 2 Codidact Aug 25 '20 at 00:34
  • 1
    @another-dave: I'm intrigued by your dissing of negative numbers. When you do differential calculus, do you only work with functions that increase? How do you manage things that oscillate (like, for example, simple sine waves - which I personally think are very elegant)? Fortran was designed for engineering and scientific calculation, not elegant computer science problems. – Flydog57 Aug 25 '20 at 03:57
  • 2
    @Flydog57 It didn’t historically matter to their decision, because they were writing for an ISA that did not support multiplying, dividing or subtracting unsigned fixed-point numbers. So it would not have been feasible and there is no evidence they ever considered it. – Davislor Aug 25 '20 at 08:58
  • 2
    When did I diss negative numbers? Having been told that Backus wasn't concerned with mathematical elegance, my point was intended to convey that support of negative integers was not kow-towing to purist notions of elegance, it's just necessary basic arithmetic. Perhaps I was misunderstood. – dave Aug 25 '20 at 22:55
4

Fortran was developed with scientific computing in mind. Negative values clearly occur quite frequently when doing scientific computing or, for that matter, in many other problem domains that the developers of the language might have considered.

Supporting unsigned integral types would have had some value, but the language would still have had to support signed ones. Since almost all interesting unsigned values fit comfortably within a signed integer variable, they may have felt that they could serve their target audience (scientific computing) while also providing a reasonable level of support for those who really did want to work with unsigned values. Alternatively, they may simply have decided to focus on signed arithmetic, without ever really considering what unsigned support might mean, because almost all of their target audience needed signed values.

daboulet
  • 41
  • 2
  • 3
    According to the lead designer, John Backus, they didn’t think much about that when they did the language design. – Davislor Aug 25 '20 at 04:27
  • 2
    My impression is that negative numbers tend to occur much more frequently on my bank account than "when doing scientific computing" - But your mileage may vary. – tofro Aug 25 '20 at 07:57
  • 3
    Negative real values occur all the time in scientific computing. For integer variables, it's much more common to know the range a priori, and it either being nonnegative naturally (array indices) or at least could easily be shifted into the positive by a constant offset. But yeah, from Fortran's POV there just wouldn't have been any good reason to not allow negative integers everywhere anyway. – leftaroundabout Aug 25 '20 at 10:19
  • @leftaroundabout The unsigned int might have been a bad idea in other languages as well. It's right up there with the 2-digit year, except the bugs they cause aren't all neatly packed around a single date and usage. – Therac Aug 25 '20 at 18:45
  • 1
    @ZOMVID-20 no, unsigned ints and 2-digit years are completely different pairs of shoes. There are lots of situations where negative numbers really conceptually can/should not happen, and having a dedicated type for those is theoretically speaking good for safety. Now, you may argue that unsigned ints with their overflow behaviour do a bad job at enforcing this type guarantee, but that's more about the implementation rather than idea of unsigned types. — For 2-digit years meanwhile, it was always clear that this is never conceptually right, but just a “YAGNI” compromise. – leftaroundabout Aug 25 '20 at 21:51
  • Internally, of course, signed integers were necessary, given the subtractive indexing of the 704 and later systems. – dave Aug 25 '20 at 23:00
  • 1
    Unsigned integers exactly model how most computer addresses are considered: in a 16-bit address, the address after 32767 is generally thought of as address 32768, not -32768; the top of the address space is thought of as 65535, not -1. Unsigned integer arithmetic is vital to system programmers. – dave Aug 25 '20 at 23:02