27

Traditional PC serial ports, based on members of the 8250/16550 UART families (or their clones within SuperIO chips) support some unusual serial data formats, specifically 5- and 6-bit data, and 1.5 stop bits.

Some USB-Serial adaptors (Prolific) claim to support these formats, some (FTDI) only offer 7 and 8 bit data. I don't think many microcontroller UARTs support 5 or 6 bit data either, though it's time-consuming to research that.

My own experience stretches fairly well back into the distant past, but I cannot recall ever having seen anything use a 5- or 6-bit serial data format. I would say that 7-bit data formats are old-fashioned and tend to come from the 80s or earlier, and that 5/6 bit formats are nothing more than an historical curiosity.

I'm involved in a project which interfaces to serial ports, and I'd like to recommend that we drop support (or at least test coverage) for 5- and 6-bit data and 1.5 stop bits. It would be useful to establish whether anyone knows of any application for these which is still in service.

What were the legacy applications for 5 or 6 bit serial data, and do any of them still exist?

wizzwizz4
  • a 5 bit telex card circa the 1960s: https://qph.ec.quoracdn.net/main-qimg-204ad86b20a37ecaadb35836c2b4e9d1-c –  Feb 11 '17 at 12:19
  • Most Atmel 8-bit ATmegas support 5-, 6-, 7-, 8- and 9-bit UART formats. –  Feb 11 '17 at 12:48
  • I'm still wondering how you can have 1.5 bits of anything. – cbmeeks Feb 13 '17 at 21:33
  • @cbmeeks That's simply 500 millibits more than one bit. – tofro Feb 14 '17 at 07:04
  • @tofro yeah I know the math. Just curious how you could store it without a higher level scheme. – cbmeeks Feb 14 '17 at 12:12
  • Jokes aside: The "stop bits" measure is actually a measurement of time rather than a unit of information. So 1.5 bits is 1.5 bit times, one bit time being 1/(bit rate) seconds. And 1.5 stop bits means the stop bit signal needs to be stable for 1.5 bit times, i.e. 50% longer than for a "normal" data bit. – tofro Feb 14 '17 at 12:18
  • @tofro is exactly right. "1.5 stop bits" just means that after the last data or parity bit the transmitter holds the line at "marking" condition for at least 1.5 bit times before allowing the "start bit" of the next character to be sent. No actual data is being sent or received (any more than for the start bit) so no storage is needed. It's just a delimiter for the character frame. Some teleprinters required 1.5 stop bits to give them time to complete the operation, esp for a carriage return/line feed. It works perfectly fine to send them two stop bits instead, at a slight loss of throughput. – Jamie Hanrahan Feb 15 '17 at 22:39
  • @JamieHanrahan: Generally, the "stop bits" control on a UART doesn't control what the UART "requires", but rather forces the UART to leave the line idle for some extra time between characters. If a device whose clock were 0.1% faster than ideal were to transmit data with 1 stop bit as fast as it could to a device which would immediately resend it, and the latter device had a clock that was 0.1% slower than ideal, the second device would only be able to send about 999 bytes in the time required for 1001 to arrive. If transmission was continuous, data loss would be inevitable. – supercat Feb 15 '17 at 23:05
  • @JamieHanrahan: Having the device which is sending data as fast as it can incorporate an extra fractional stop bit will ensure that the number of bytes a relay device can send per second won't exceed the number of bytes it receives. I've seen a couple UART chips which allow the stop bit length to be programmed in increments of 1/16; I'm not sure why that's not a more common feature. Another feature that might be helpful if implemented would be to have a UART configurable to ensure that the stop bit length would always be an integer number of bit times, plus 1/2. That would... – supercat Feb 15 '17 at 23:08
  • ...ensure that a receiver that watches for signals which change near the middle of a bit time would notice any framing errors and hopefully recover from them. – supercat Feb 15 '17 at 23:09
  • @JamieHanrahan, a mechanical teleprinter typically needed a lot more than just one or two bit times to return the carriage and advance to the next line. I remember, some computers would send several NUL characters after sending each CRLF. If your terminal was a Teletype, and if you didn't configure enough NUL characters, the first character of the next line would be struck while the carriage still was moving backward, resulting in a smear somewhere in the middle of the line. – Solomon Slow Sep 26 '18 at 16:26
  • Um, IME, well-maintained ones didn't. I worked with systems that used both ASR33 and model 20 for many years. And I know what characters were sent over the line. CR and LF are separate things; the CR does not have to be done when the LF happens, and the machine can be receiving the first printing character of the line while the LF is happening. So there's really two whole character times for it to get done. I did see the phenomenon you described but it usually meant that the machine needed service (often just lubrication). – Jamie Hanrahan Sep 26 '18 at 22:28
  • But that supposes also that LF can be done in one character time :-) Not intending to quibble, but as far as I recall as a programmer of a certain age, any system that supported teletypes provided a means to insert pad characters. This to me suggests some sort of necessity. – dave Feb 21 '19 at 13:59
  • @another-dave: Perhaps there were 300-baud teleprinters that required the carriage to be fully returned before the next character arrived (some 300-baud teleprinters could buffer one or more full characters before the carriage finished returning, and output them at a rate of slightly faster than 30cps once that happened, but it would seem plausible that some teleprinters might have operated at 300 baud without having buffering circuitry). – supercat Feb 21 '19 at 16:56
  • Maybe so, but 300 baud was an unimaginable data rate to me :-) – dave Feb 22 '19 at 00:29

9 Answers

42

Five-bit teletypewriter codes ("Baudot", etc.)

As far back as the early 1900s (believe it or not) there were teletypewriters. They were intended to replace Morse-style telegraphy, directly printing hardcopy rather than requiring an operator to listen to the Morse code and transcribe the messages by hand.

The first successful such equipment was invented by a man named Baudot. He also invented the five-bit character code that his machines used. (See "Baudot code".) Later, a different five-bit (or "five-level") code was developed that made it easier to build a mechanical typewriter-like keyboard that would generate the code. The most successful such code was called ITA2.

These codes provided upper case only. They were primarily used for communication between machines like the Teletype Model 15, in networks like Telex, TWX, and Western Union's "Telegram" service.

These networks have almost completely faded away. Some ham radio operators still use Radio Teletype (RTTY) and they do still use the five-bit codes, partly due to tradition, partly for efficiency (they need fewer bits to send a character than the modern 8-bit codes). But this is most often done using computers as terminals, not ancient teletypewriters like the Model 15. A few of the die-hards do keep some of the old machines running.

Most old Teletype machines with three rows of keyboard keys (instead of the four that were common on typewriters) used these five-bit codes. They only needed three rows on the keyboard because numbers and special characters were shifted from the alphabetic keys. Two keys labeled "Figs" and, I believe, "Ltrs" sent the "shift in" and "shift out" codes. If you hear somebody talking about "three row" teletype machines, this is what they're referring to.
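The shifting scheme described above can be sketched in a few lines of Python. This is only an illustration: `with_shifts`, `LETTERS`, `FIGURES` and `EITHER` are hypothetical names, the character sets are assumptions rather than the real ITA2 repertoire, and symbolic `LTRS`/`FIGS` tokens stand in for the actual five-bit codes.

```python
# Sketch of Baudot-style shifting: one five-bit code space is reused for
# letters and figures, with shift codes switching between the two cases.
# The character sets below are illustrative assumptions, not the ITA2 table.
LETTERS = set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
FIGURES = set("0123456789-?:().,'/")
EITHER = set(" \r\n")  # assumed to be valid in both cases


def with_shifts(text, start_case="LTRS"):
    """Return the symbols to transmit, inserting LTRS/FIGS where needed."""
    case = start_case
    out = []
    for ch in text.upper():  # these codes carried upper case only
        if ch in EITHER:
            out.append(ch)
            continue
        if ch in LETTERS:
            needed = "LTRS"
        elif ch in FIGURES:
            needed = "FIGS"
        else:
            raise ValueError(f"no code for {ch!r} in this sketch")
        if needed != case:
            out.append(needed)  # send the shift code once, then stay shifted
            case = needed
        out.append(ch)
    return out


print(with_shifts("Model 15"))
```

The point of the sketch is that a shift code costs one extra character time only when the case changes, so runs of letters or runs of digits carry no overhead.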

I wrote another answer about teletypewriters in general, and the 5-level ("Baudot") machines in more detail, here.

The 6-bit TTS code

As for six-bit codes, the most common use (at least in terms of async serial communications) was probably the "TeleTypesetter" (TTS) code. I say this because virtually every newspaper that subscribed to a wire service (like AP or UPI) was equipped to receive it, usually with multiple feeds.

I'm going to say quite a bit about this because, while five-bit teletypewriter info is easy to find out about, there's very little out there about TTS.

The TTS code was a clear descendant of the five-bit codes like ITA2. Despite having six bits it still used "shift in" and "shift out" codes (like the Baudot code family did), permitting TTS to carry over 100 different glyphs (printable characters) and control commands. So it included upper and lower case alphabets, digits, a large assortment of special characters ("Wingdings" - far more than what you'd find on a typical typewriter), plus typesetting-oriented commands like "flush left" (which means "end a paragraph and justify the last line to the left"), center, and flush right.

In the wire room

In the old days, news wire services like AP and UPI (and smaller local ones, like City News Service) would send stories to their member newspapers via this code, over dedicated leased telephone lines. In each newspaper's "wire room" the copy would be printed on a keyboard-less TeleType Model 20 (so the editors could read and select the stories), and also fed to a "reperforator" (paper tape punch). There was one such pair of machines for each wire service the paper subscribed to.

Between each story the wire services would send a bunch of NULs (which would punch essentially blank tape, with only the feed holes), then a bunch of characters that would print as nonsense but would punch out the next-following Story ID on the tape in block letters, followed by more NULs. This made it relatively easy to find the section of tape that corresponded to the following printed copy. It also helped in identifying the correct direction and orientation for the paper tape, as the six-level tape was symmetric about the feed holes! But close inspection of the feed holes gave another way to tell the direction: They "led" the data holes slightly, so the back edge of a feed hole corresponded to the center line of the data holes it went with.

A new hire in the wire room would have the job of tearing off the hardcopy and the tape that went with it as it came out, filing the tape, and distributing the copy to the editors.

The tape for selected stories could then be fed directly to a Linotype machine equipped with a "Teletypesetter Operating Unit", which was a paper tape reader connected to a metal box that was placed on top of the Linotype's keyboard. The box was conceptually very simple: It had a solenoid for each of the Linotype's keys and it simply "pressed the keys" as the tape was read. The result, just as when a human was typing, was cast metal type that could be put on a press, inked up, and printed.

It was possible to edit the story before typesetting by using a TTS Teletype machine, tape reader, and tape punch ("reperforator"). The tape would be duplicated until the desired edit point was reached, then the operator could type additional text. Or to skip things, they would advance the original tape without copying it to the new tape.

A radio or TV station's news operation would have the model 20 Teletype, but no reperforator or Linotype. Copy from the Teletype would be torn off and handed to editors who turned it into the (generally much shorter) stories the anchors would read. In small stations the on-air news readers also did the writing. Eventually the wire services offered feeds already edited for radio or TV, used by smaller stations that didn't want to hire news copy editors.

The Teletype machines and paper tape punches ran nearly continuously, as stories were updated and reposted throughout the day. Ear protection was a good idea in the wire room! To this day, a few "all news" radio stations use the sound of one of those machines pounding out copy as background sound to their live on-air reporters.

TTS code and computer typesetting

In later years the incoming 6-bit signal was connected directly into a computer's serial port, stored on disk, and made available for review and editing via video terminals. After editing the computer would send the copy to a phototypesetter, resulting in nicely set type on photo paper.

(I spent a few years working for a newspaper with such equipment, based on HP 2100 minicomputers. We still had the reperforators but they were turned on only during the hour or so when we ran the nightly backups. Due to repetition on the feeds, almost all stories they wanted to use would be found already in the computer, but if need be they could find the tape for a missed story and read it via a paper tape reader attached to the computer. Incidentally our phototypesetters were made by Mergenthaler, the same company that made Linotypes. The software system was called "Text II", from a company called System Development Corporation, based in Santa Monica.)

Virtually everything you ever saw printed in a newspaper with "AP", "UPI", or some other news "wire" service in the slug line came into the paper via systems like this. The local newspaper may, however, have edited the stories, sometimes significantly.

TTS phase-out

Around the late 70s, about the same time that I left the paper, AP was planning to offer a higher-speed network using eight-bit codes. So now, nearly 40 years later, I doubt that there's much if any of this six-bit TTS code left in use - any more than there are five-bit ("three-row") teletype networks.

Well-maintained operating examples of TTS equipment are virtually nonexistent, as there is almost no hobbyist interest in it. One Teletype enthusiast site claims there is one operating example of a Teletype model 20.

Computers and six-bit codes

There were also many computers that used 6-bit character codes internally. Examples were the PDP-8 (a 12-bit machine, so two 6-bit characters could fit in a machine word), IBM 1401 and 7090, CDC 3000 and 6000 series (24-, 48-, and 60-bit words respectively), etc. These were not the same six-bit codes as TTS, and "shifting" was generally not used to expand the code set. Nor did they commonly use six-bit paper tape. Printers and punched cards of the day only supported upper case alphabets, so 64 different glyphs were enough. (See, for example, the Wikipedia article on the IBM 1401.) Some of them did, however, communicate with these six-bit codes over early modems.
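To illustrate the "two 6-bit characters in a 12-bit word" point, here is a minimal Python sketch. The names `pack12` and `unpack12` are hypothetical, and putting the first character in the high half is an assumption for illustration; it is not meant to describe how any particular machine or operating system actually laid out its text.

```python
# Packing two 6-bit character codes into one 12-bit word and back.
# Which character goes in the high half is an assumption for illustration.
def pack12(first, second):
    """Combine two 6-bit codes into a single 12-bit word."""
    if not (0 <= first < 64 and 0 <= second < 64):
        raise ValueError("each code must fit in six bits")
    return (first << 6) | second


def unpack12(word):
    """Split a 12-bit word back into its two 6-bit codes."""
    if not 0 <= word < 4096:
        raise ValueError("word must fit in twelve bits")
    return (word >> 6) & 0o77, word & 0o77


word = pack12(0o01, 0o02)
print(oct(word), unpack12(word))
```

Octal notation is used because each octal digit covers three bits, so a 6-bit code is exactly two octal digits and a 12-bit word is four.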

IBM's System/360 set a de facto standard of 8-bit bytes (but using IBM's EBCDIC character code, which had very little uptake elsewhere), and several very successful mini-computers with 16-bit words (HP 2100, DG Nova, DEC PDP-11) started using 8-bit characters (generally using ASCII character codes) at around the same time. That pretty much ended the era of six-bit character codes.

The IBM 2741 and 1050 printing terminals were originally used with machines like the 7090 and the 1401, and these terminals used six-bit characters (at 134.5 bit/s, 1.5 stop bits). This was, however, a very different six-bit code from TTS, and in many cases different also from the character code used within the machines they connected to. (Of course.) Like the TTS code, though, they did have "shift in" and "shift out" commands, which corresponded literally to "shifting" the typewriter mechanism (i.e. rotating the ball 180 degrees). The 2741 and 1050 were based on the IBM Selectric mechanism; the Selectric typeballs of that time had 88 printable characters, already too many for six bits even before other controls are counted. (But not, alas, quite enough for the 94 printable glyphs of seven-bit ASCII.)

Later the same terminals were connected to System/360 and other eight-bit machines and the computers, or purpose-designed interfaces, had to do the code conversion. So if you happen to find a working 2741, yeah, you'll need that six-bit code to talk to it. :) You'll also need to be able to set your serial port to 134.5 bit/s.
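For what it's worth, the POSIX termios interface still carries a `B134` speed constant (nominally 134.5 baud) alongside the `CS5` and `CS6` character sizes. The sketch below shows how such settings might be requested from Python on a POSIX system. `selectric_attrs` is a hypothetical helper name, parity is deliberately left untouched, and termios has no 1.5-stop-bit option, so two stop bits are selected instead. Whether a given driver or USB adaptor actually honours any of this is not guaranteed.

```python
import termios


def selectric_attrs(attrs):
    """Return a copy of a tcgetattr()-style list set for 6 data bits at B134."""
    iflag, oflag, cflag, lflag, _ispeed, _ospeed, cc = attrs
    cflag &= ~termios.CSIZE   # clear the character-size field
    cflag |= termios.CS6      # six data bits
    cflag |= termios.CSTOPB   # two stop bits (termios has no 1.5 setting)
    return [iflag, oflag, cflag, lflag, termios.B134, termios.B134, list(cc)]


# Possible use, assuming fd is an already-open serial device:
#   termios.tcsetattr(fd, termios.TCSANOW,
#                     selectric_attrs(termios.tcgetattr(fd)))
```

The helper only builds the attribute list, so it can be inspected or tested without a serial port attached.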

Jamie Hanrahan
  • I wonder if it would have been hard to design a device to translate a pair of consecutive spaces on a Linotype type into a code for an end-of-sentence space? The use of Linotypes has had a deleterious effect on the quality of typography, and it would seem that should have been avoidable. – supercat Feb 15 '17 at 23:12
  • @supercat the Linotype did have keys for en, em and "thin" spaces as well as the expandable-for-justification space band. And TTS code did allow for all four. If the wire services didn't send them, or the Linotype keyboard operators didn't use them, that is hardly Mergenthaler's fault. :) – Jamie Hanrahan Feb 16 '17 at 01:07
  • From what I read, Linotype machines tended to behave badly when fed a tape containing two consecutive word-space characters. Requiring that nobody should ever send two consecutive word-space characters seems like a rather extreme solution compared with simply making the machine recognize the combination as an en quad (or whatever amount of space the typesetter deems appropriate). – supercat Feb 16 '17 at 17:55
  • Ah. Word-spaces are the "expandable" space. The "spacebands" are wedge-shaped, and higher than the letterform matrices. Justification was done by a bar that came down against the tops of the spacebands, forcing them further into the stack of matrices, causing them to take more width on the line. I can imagine several reasons why you wouldn't want two or more of these in a row. – Jamie Hanrahan Feb 16 '17 at 19:53
  • Yup. Having a larger space after a full stop than after an abbreviation had long been recognized as typographically desirable, and people with typewriters would have made it easy for the Linotype to preserve that distinction. From what I understand, though, the Linotype's inability to handle typists who used "period space space" to indicate a full stop has eliminated what used to generally be a useful typographical semantic distinction (situations where a line ends with a period could sometimes be ambiguous, but most potential ambiguities are resolved, without the reader even having... – supercat Feb 16 '17 at 20:00
  • For the machine to recognize two wordspaces as something else, when it saw the first one, it would have to wait to see what the next keystroke was. Then if that wasn't another wordspace, it would emit the spaceband and then the next character. That's quite a bit more "stateful" than anything else the machine does, afaik. Another approach would be to go ahead and drop the first spaceband but then, if it saw a second one, remove the first spaceband from the line and do something else instead. But they had no way to remove matrices from the line, not even the most recent (no "backspace" key). – Jamie Hanrahan Feb 16 '17 at 20:01
  • (my "I can imagine several reasons" referred to the machine's operation, not the typographical desirability. I'm a two-spaces-after-sentence-end typist myself. :) – Jamie Hanrahan Feb 16 '17 at 20:03
  • By my understanding, the machines were usually run off paper tape, suggesting at least 3 solutions, depending upon the coding used: (1) Replace any space character that followed another space with a non-expanding space of some sort. (2) Have the read mechanism include sense fingers which check if the character following the current one is a word space and make a substitution if so; (3) Have the punching machine include a mechanism to punch an extra hole in the previous character if it was a space and its successor is also a space. None of those seem overly hard. – supercat Feb 16 '17 at 20:09
  • Not overly hard today, but when Teletype Corp. first built that TTS-Linotype "operating mechanism", the "logic" would have been implemented mechanically, like most all of the teleprinters. They obviously thought it was easier to just tell the people who prepared the tapes "no two wordspaces in a row!" From my understanding, direct keying of the Linotype machine wasn't all that uncommon, particularly for smaller papers and locally-written stories. – Jamie Hanrahan Feb 16 '17 at 21:49
  • I understand that the logic was mechanical, but that doesn't imply that it would have had to have been particularly difficult. If each type-piece dispenser including wedges had a rod that moved vertically, the dispenser for the sentence-extra-space character was adjacent to the expanding-space dispenser, and there was a horizontal rod that cycled once per character while the dispensing rod was in the active position, all that would be needed would be to have a spring-loaded selector in line with the expanding-space rod which, when cocked, would select an expanding space and then release,... – supercat Feb 16 '17 at 22:01
  • ...when decocked would push on the extra-sentence-space and stay released, and when doing neither of the above during a cycle of the character-was-dispensed lever would get pushed back to the cocked position. Or, if hole patterns allowed, one could have a mechanism which would feel for two patterns on consecutive characters and, if detected, would punch some additional holes (by my understanding, people who were typing on the machine directly knew how to use different kinds of spaces--the problem was with tapes punched by qwerty typists). – supercat Feb 16 '17 at 22:06
16

The Baudot code uses 5 bits, and IIRC at least one of the mechanical teleprinters needed 1.5 stop bits to provide the time for the mechanism to do its thing, this at 45 baud.
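As a rough worked example of what that framing means in time, the sketch below adds up the bit times in one asynchronous character. `char_time_s` is a hypothetical helper, and the 45 baud figure is simply the one quoted above.

```python
def char_time_s(baud, data_bits, stop_bits, parity_bits=0, start_bits=1):
    """Seconds needed to send one async character, including framing."""
    bit_time = 1.0 / baud  # one bit time is the reciprocal of the bit rate
    return (start_bits + data_bits + parity_bits + stop_bits) * bit_time


t = char_time_s(45, data_bits=5, stop_bits=1.5)
print(f"{t * 1000:.1f} ms per character, {1 / t:.2f} characters/s")
```

With one start bit, five data bits and 1.5 stop bits, each character occupies 7.5 bit times, which at 45 baud works out to about six characters per second. If one assumes six characters per word (five letters plus a space), that is in line with the "60 words per minute" rating mentioned in the comments.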

RTTY radio comms is probably the one place you still see this stuff in use.

9 bit on the other hand is semi common in RS485 industrial controllers.

Dan Mills
  • It is supported when you don't insist on an additional parity bit. – Janka Feb 11 '17 at 13:14
  • 45 baud... and I thought my Atari's cassette recorder at 600 baud was slow! Any decent typist is faster than that! – SF. Feb 24 '17 at 14:50
  • In the days of RTTY and 5 bit "Baudot", we didn't really talk about bits per second. The common description was that they operated at "60 words per minute". – gbarry Jul 16 '17 at 06:50
13

Traditional Teletypes and many paper tape storage systems use these. You won't be able to read reels of punched paper tape if you drop support for 5 bit data.

Nor will you be able to interface with Colossus systems, or the Lorenz communications they were designed to decode. At least one of each is apparently still in service - as a museum piece.

Full disclosure : despite being fairly archaic myself, I haven't personally interacted with either of these systems...

user_1818839
  • Reading "reels of punched paper tape" is not really an issue. The OP is asking about a serial interface. A paper tape reader (or punch) is a parallel device, not serial. – Jamie Hanrahan Feb 12 '17 at 09:06
  • I haven't personally interacted with... I have. I burned out the last Teletype left in my University. (Re-purposed it as a serial printer... Of course that typed lines quicker than any human typist. So the stepper-motor overheated and went up in flames :-)) Man, I'm getting old... – Tonny Feb 12 '17 at 19:23
  • @JamieHanrahan Tape or punch-card is parallel internally. That's true, but most of those interfaced serially to other devices. I have also seen Shugart, SCSI and GPIB interfaces. I can't recall ever having seen a real parallel one. – Tonny Feb 12 '17 at 19:29
  • Hm, well... that's direct opposite to my experience. The paper tape readers and punches I worked with on HP, DEC, and Data General gear all used a very straightforward parallel interface (though not exactly like that of either Centronics or Data Products parallel printers). Note that Shugart, SCSI (in that era), and GPIB are all parallel. – Jamie Hanrahan Feb 12 '17 at 21:58
  • WRT the Colossus and Lorenz being "apparently" still in service, there's one rebuilt Colossus in The National Museum of Computing at Bletchley Park which is quite something to behold (the paper fairly whizzes through though the data speed is akin to what one complains about as slow if using dial-up Internet access). Lorenz machines are a bit more plentiful, in that there are a handful surviving, which is more than the single Colossus. – Jon Hanna Feb 13 '17 at 02:31
  • I've worked with paper tape (I think it was even 5-bit) for programming a CNC EDM cutter. When I left the company in the late 90s, there was talk of replacing the tape with a serial connection, but getting a computer to work in close proximity to a controlled lightning bolt is a tricky proposition. – Mark Feb 13 '17 at 10:21
9

It would be useful to establish whether anyone knows of any application for these which is still in service.

RTTY (an FSK modem over radio) still uses a 5-bit code at low bit rates (45 or 50 baud). It's often still used for automatic weather reports.

What were the legacy applications for 5 or 6 bit serial data, and do any of them still exist?

The six bit code was convenient for golf-ball style teleprinters, where it could directly drive the six solenoids that operated the print head without needing additional translation. This is why you'll often see it used with more than one stop bit, as the solenoids need time to get back to their resting position.

You can find lots of information on RTTY and its history here http://www.rtty.com/

Jamie Hanrahan
James
6

Many many years ago, I interfaced an old IBM Selectric Terminal to my CP/M machine. It used a 6-bit code that was not supported by my hardware, and a 134.5 bit per second data rate that was also not supported. I wound up bit banging the data out with a software UART. Lucky for me, I was not interested in getting data from the keyboard, as that would have been much more difficult.
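For anyone curious what "bit banging" a frame like that involves, here is a minimal Python sketch of the transmit side. It is not the original code, only an illustration: `frame_segments` is a hypothetical name, the LSB-first bit order is the usual UART convention and an assumption here, and a real implementation would drive an output pin and wait out each duration rather than return a list.

```python
def frame_segments(code, data_bits=6, stop_bits=1.5, baud=134.5):
    """Return (level, seconds) pairs for one async character frame.

    Level 1 is the idle (mark) state and 0 is space. Data bits go out
    LSB first, which is an assumption; some devices need the reverse.
    """
    if not 0 <= code < (1 << data_bits):
        raise ValueError("code does not fit in data_bits")
    bit_time = 1.0 / baud
    segments = [(0, bit_time)]  # start bit pulls the line to space
    for i in range(data_bits):
        segments.append(((code >> i) & 1, bit_time))  # one data bit per bit time
    segments.append((1, stop_bits * bit_time))  # stop: hold the line at mark
    return segments


for level, seconds in frame_segments(0b101001):
    print(level, f"{seconds * 1e3:.2f} ms")
```

Because the stop "bit" is just a minimum idle time, a fractional value such as 1.5 falls out naturally in software even when the hardware UART cannot produce it.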

Peter Camilleri
  • I still have one of these old Selectric typewriter terminals somewhere in the attic - Last used in ~2000 together with a Sinclair Spectrum (whose IF1 wasn't able to spit out the correct number of data bits and so needed to have a converter box that built 50 bps/6N2 from its 1200 8N1...) – tofro Feb 11 '17 at 14:54
  • "Last used in 2000" - the lubricant has likely turned to something close to chewing gum in consistency, ie no longer a lubricant. Don't expect it to work right! – Jamie Hanrahan Jan 13 '18 at 00:23
4

The most common one still around is probably the International Program Airline Reservation System (IPARS) 6-bit character set used in the Airline Link Control (ALC) protocol of the international airline reservation system. Yes, down in the guts of the global air travel system, blocks of 6-bit characters are still being shuffled around to this day all over the world. You can get many of the gory details by looking up 'SITA P1024B'. ALC was traditionally carried over X.25 or a serial bisync connection, but these days I suppose you'll find it encapsulated in IP most of the time (there's an RFC for that).

KJ Seefried
2

The IBM 6 bit transcode was one of the serial modes supported by the IBM 2780 amongst other things.

There used to be a (printed) handbook that compared all the then known codes from 5 bit Baudot to 8 bit ASCII (which was rather new at the time). It might be interesting to see if there is a scanned copy somewhere.

I think you will find the 6 bit transcode mentioned in the first edition of The Art of Electronics.

Peter Smith
2

Peter Camilleri -- I did exactly the same thing around 1980. I was able to get "letter quality" output while everyone else had "dot matrix". I did this by driving an IBM Selectric terminal using 6-bit serial (with the bit order reversed), using a program I had written in BASIC. This was on an 8-bit Z-80 system running MP/M, the multiuser version of the CP/M OS.

2

Stock tickers date from 1863 (a variant was Edison's first successful invention in 1869). The text-output ones only printed 32 characters, so would have been a 5-bit application... which possibly makes them the earliest of the 5-bit-compatible applications of serial data. I don't know the timeline for bit-serial operation (some tickers operated like a pulse-dial phone and took way more than 5 bit-times to print "Z").

Ticker tape went completely obsolete in the 1960s.

Whit3rd