62

Access to the DOS API was done through the INT 21h x86 instruction. This was always counter-intuitive to me, coming from 8-bit systems that accessed system services by calling subroutines through a jump table. This simple system seems to give the same benefits from indirection as using software interrupts on x86. An example of such a system is the kernel (or "Kernal", for Commodore purists) used on Commodore's 8-bit machines.

Additionally, the reliance on software interrupts might be a reason for why the transition to using Protected mode, and accessing more than 640KiB of memory, was so slow and difficult for MS-DOS users.

Why didn't MS-DOS just use the CALL instruction with a jump table instead of software interrupts? Did this choice impact MS-DOS programs being able to run in protected mode on the 80286+?

Brian H
  • 7
    DOS did also support CALL 5 in early versions, for CP/M source compatibility. Wikipedia reports on the A20 hassles that resulted from trying to keep that section of memory CP/M-esque: https://en.wikipedia.org/wiki/A20_line#CALL_5 . Bit of a digression though. – Tommy Feb 25 '20 at 17:04
  • 11
    IBM's BIOS also relied on software interrupts. I wouldn't be surprised if Microsoft was following IBM's lead. Also, software interrupts assemble to two bytes (e.g., CD 21). – Jim Nelson Feb 25 '20 at 17:29
  • 24
    ... because that was the whole point of software interrupts? They were designed to be a simple way for applications to call into the OS. I've done plenty of programming on DOS, and the interrupt-based API was certainly easier to use than having to deal with calling into a jump table (it pretty much abstracts the jump table away). – Luaan Feb 26 '20 at 09:38
  • 13
    The interrupt vector table essentially is this jump table in the x86 world. Coming from a higher-level language like Turbo-C, it was a lot easier to use the DOS.H identifiers it provided (_AH, _AL, geninterrupt()...) than to try to call functions at arbitrary addresses. – smitelli Feb 26 '20 at 12:58
  • 6
    Far more sophisticated OSes than DOS use interrupts here: https://www.tldp.org/LDP/khg/HyperNews/get/syscall/syscall86.html – rackandboneman Feb 26 '20 at 16:48
  • 2
    BTW, it wasn't getting into protected mode via INT 21h that was slow, it was getting out of it. – davidbak Feb 27 '20 at 16:31
  • 1
    It is much easier to use short INT as a single point of system calls than maintain a table. INT is, generally, shorter than CALL (e.g. compare the RST X vs CALL on the Z80 / 8080 CPU). The ultimate way is to have a special SVC instruction (SuperVisor Call), of course... – Martin Maly Mar 06 '20 at 10:14
  • fantastic question! Today, like, over 26 years from release of DOS 6.22 (not even counting earlier versions), I asked myself "why 21h and not a system call"? And voila - there's another curious mind who decided to ask this question in 2020 :D MS-DOS lives forever! – Dimitry K Aug 22 '20 at 12:51

7 Answers

76

TL;DR;

Using INT is not only natural given the way the 8086 is designed, it was also intended by Intel as the OS entry point, much like a Supervisor Call (SVC) on /360-type mainframes:

(Excerpt from the October 1979 Intel 8086 Family User's Manual page 2-28.)


Software-initiated interrupt procedures may be used as service routines ("supervisor calls") for other programs in the system. In this case, the interrupt procedure is activated when a program, rather than an external device, needs attention. The "attention" might be to search a file for a record, send a message to another program, request an allocation of free memory, etc. Software interrupt procedures can be advantageous in systems that dynamically relocate programs during execution. Since the interrupt pointer table is at a fixed storage location, procedures may "call" each other through the table by using software interrupt instructions. This provides a stable communication "exchange" that is independent of procedure addresses. The interrupt procedures may themselves be moved so long as the interrupt pointer table always is updated to provide the linkage from the "calling" program via the interrupt type code.

INT is intended to move address dependencies from physical to logical, offering an abstract interface to services, which is exactly what the BIOS and DOS are. Using INTs for either is simply how it should be.
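
For illustration, a minimal real-mode sketch of what such a logical-number call looks like from the application's side (assuming NASM syntax and a .COM program; the string and labels are made up):

            org  100h            ; .COM programs load at offset 100h of their segment
    start:  mov  dx, msg         ; DS:DX -> '$'-terminated string
            mov  ah, 09h         ; DOS function 09h: write string to standard output
            int  21h             ; one two-byte instruction, no DOS address anywhere in the program
            mov  ax, 4C00h       ; DOS function 4Ch: terminate with return code 0
            int  21h
    msg:    db   'Hello via INT 21h', 0Dh, 0Ah, '$'

The program never learns where DOS lives; it only names the interrupt type 21h and a function number in AH.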


The Long Read

Access to the DOS API was done through the INT 21h x86 instruction. This was always counter-intuitive to me, coming from 8-bit systems that accessed system services by calling subroutines through a jump table.

For one, this is just as well known on 8-bit systems, for example the 8080/85/Z80 using the RST instruction. More importantly, a software interrupt is exactly that: an indirect subroutine jump through a jump table. It has several advantages:

  • Short two-byte opcode vs. six bytes for an indirect far call (see the encoding sketch after this list)
  • Fast execution due to fixed address
  • Portability as the address is not coded within the user program
  • Use of a logical number that could be redirected in future versions
  • Executing an INT is independent from the address mode the application or the OS runs in
  • Taking the table out of user memory improves compatibility
  • It's the most upward compatible design possible, as new CPU generations can use this as hook for task switch and alike without breaking compatibility
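
To make the size argument from the first point concrete, here is a sketch of the encodings involved (assuming NASM syntax, 16-bit real mode; the addresses and the dos_entry label are illustrative only):

            int  21h                 ; CD 21            -> 2 bytes, only a logical type code
            call 0F000h:0FFF0h       ; 9A F0 FF 00 F0   -> 5 bytes, segment:offset hard-wired into the program
            call far [dos_entry]     ; FF 1E xx xx      -> 4 bytes of code, plus the 4-byte far
                                     ;                     pointer below that the program has to
                                     ;                     know about and keep current
    dos_entry: dd 0                  ; far pointer the OS (or the program) would have to fill in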

The only apparent 'cost' is that the flag register gets saved as well. In reality this simplifies the OS interface even more, as the flag word is now located at a fixed address (SP+4), so its content can easily be manipulated to return information, like setting carry for error handling. The function handler does not need to take care of producing the right flags from some artificial source just before returning (as on many 8-bit OSes), but simply sets a bit at a defined memory location; the rest is done by hardware.
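
As a sketch of what that looks like inside a handler (NASM syntax; the handler body and error condition are made up):

    handler:
            ; ... do the work; on error, set carry in the caller's flag image ...
            push bp
            mov  bp, sp              ; [bp+2]=saved IP, [bp+4]=saved CS, [bp+6]=saved FLAGS
            or   word [bp+6], 0001h  ; set CF in the flag word the caller will get back
            pop  bp
            iret                     ; IRET reloads the modified FLAGS automatically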

This simple system seems to give the same benefits from indirection as using software interrupts on x86.

Not really, as for a call the user software needs to know the address of said table, which makes it quite hard to move or virtualize it in future CPU/OS versions.

An example of such a system is the kernel (or "Kernal", for Commodore purists) used on Commodore's 8-bit machines.

Commodore is a great example of how fixed entry points complicate software development. Most 6500-based Commodore Kernals provide the same functionality, but at different entry points. Software needs to be ported for each machine. Granted, it can often be done with a few switches and a recompilation, but software meant to run on more than one or two machines needs to bring its own compatibility layer.

Additionally, the reliance on software interrupts might be a reason for why the transition to using Protected mode, and accessing more than 640KiB of memory, was so slow and difficult for MS-DOS users.

Why? Is there any proof of that? The INTs were no problem at all, as they work quite well in protected mode. The CPU handles everything necessary, which is exactly the reason why INT had to be used in the first place: it provides a simple hook for upward compatibility. Using INT instructions for every userland -> OS call in applications is the basis for a transparent move to a protected mode system.


The real issue with DOS applications wasn't the INT system, or anything about the CPU, but applications breaking two basic rules of good behaviour: making hardware assumptions about memory management, and accessing hardware directly, both without any protocol. The first in particular caused most of the DOS problems: the assumption that Segment+1 is the same as Offset+16. Within a protected mode system this is no longer true. As a result, any program trying to 'outsmart' the OS will fail.
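
A sketch of the kind of real-mode 'pointer normalization' that embodies this assumption (illustrative code, not from any particular program; NASM syntax):

    ; Normalize the far pointer DS:SI so that the offset is < 16.
    normalize:
            mov  ax, si
            mov  cl, 4
            shr  ax, cl          ; offset / 16 ...
            mov  bx, ds
            add  bx, ax          ; ... added to the segment: assumes physical = segment*16 + offset
            mov  ds, bx
            and  si, 000Fh       ; keep only offset mod 16
            ret
    ; In protected mode DS holds a selector, not a paragraph number, so the ADD produces an
    ; invalid selector and the next memory access through DS:SI faults.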


Why didn't MS-DOS just use the CALL instruction with a jump table instead of software interrupts?

See above: using a call table would give up all of those advantages, and in most cases take on the associated disadvantages.

Did this choice impact MS-DOS programs being able to run in protected mode on the 80286+?

No. It was, as said, direct hardware access, including memory management, that caused the problems. Disregarding the logical structure of the CPU is what broke code and made the move to protected mode basically impossible.


Bottom line: Adding the INT instruction was one of the best decisions made in designing the 8086, fully in line with the goal of creating a CPU for complex high-level software. Using it was the right thing to do.

Maybe Intel should have called it 'SVC', as Amdahl did 20+ years earlier for the IBM /360, where opcode X'0A' SVC worked similarly (*1) and enabled compatibility across OS versions over many decades, from real-mode single-CPU machines all the way to virtual multi-processor systems and 64-bit code.

Then again, the 8086 is a simple microprocessor, so it makes sense to combine hardware interrupts and CPU exceptions with OS/function calls into a single mechanism.

The only dark shadow cast over the use of INT was IBM's decision to use the interrupts below 20h for BIOS functions, even though Intel had reserved them for CPU exceptions.


*1 - Well, not completely, at least on early machines it used the default interrupt and the OS had to decode and jump.

Raffzahn
  • 1
    "Super Visor Call", I think? – Graham Feb 26 '20 at 10:36
  • 1
    I think you are overstating the advantage of not needing a fixed address a little bit. For example on the Amiga you have one fixed address only, address 4, which is a pointer to the main system library jump tables. And besides you have a fixed address - interrupt vector 21. – user Feb 26 '20 at 14:49
  • There are down sides too, such as software needing the privilege to execute soft interrupts which was an issue even in the 70s as multi-tasking operating systems already existed. – user Feb 26 '20 at 14:50
  • 2
    Another down side is that it makes patching OS functions more difficult. – user Feb 26 '20 at 14:52
  • 7
    @user DOS wasn’t designed in an environment where process privilege mattered, and given the quick-and-dirty development, I don’t think that was considered at all. Patching OS functions accessed using an interrupt vector table with no privilege is straightforward: point the interrupt vector to your own code, and pass any functions you don’t care about to the original vector... – Stephen Kitt Feb 26 '20 at 15:42
  • 1
    In fact, patching can even be done in a manageable and compatible way by using the DOS functions supplied to change vector table entries. Doing so makes it almost certain that this will continue to function in any future version, or at least result in a graceful (and easily debuggable) rejection. – Raffzahn Feb 26 '20 at 15:55
  • 1
    @StephenKitt You could even have multiple drivers/TSRs patch the same vector, each passing along to the next in the chain. It was a beautifully simple system. – Monty Harder Feb 26 '20 at 19:00
  • 2
    @MontyHarder But unpatching vectors is problematic. If a vector originally points to A, then program1 hooks it to B (saving off A as the original), then program2 hooks it to C (saving off B as the original), then when program1 unhooks it, A is restored to the vector, and program2's hook C is lost. I'm glad we've moved past this. – Jonathon Reinhart Feb 26 '20 at 23:18
  • 1
    Agreed. I wouldn't get too romantic about patching DOS interrupt vectors, it was a constant source of bugs and tech support nightmares. For a sample of how bad it was, check out Int 2Fh, the so-called "muxing interrupt," for how overloaded things got: http://www.ctyme.com/intr/int-2f.htm It was so overridden, a second mux interrupt was proposed. – Jim Nelson Feb 27 '20 at 01:06
  • 3
    @JonathonReinhart You can't fully unhook. Program1 has to be written such that B is the address of a 16-byte (because that's the smallest block of memory you can allocate) TSR that does a long call to Program1's service routine, followed by a long call to A. When Program1 wants to exit, it overwrites that first long call with NOPs, leaving behind the long call to A. – Monty Harder Feb 27 '20 at 16:47
  • Was the interrupt pointer table (the protected mode descendant of this is the Interrupt Descriptor Table) usually at a fixed memory location? Rephrased, LIDT was introduced with the 80286 and presumably the IDT register likewise, so how was the interrupt pointer table found in the 8086? – Single Malt Aug 27 '21 at 09:03
  • 2
    @SingleMalt At address 0:0 - like it has been since 8080 times :)) – Raffzahn Aug 27 '21 at 12:30
  • @MontyHarder: If there were a convention that any code which sets an interrupt vector to ISEG:IOFS must reserve four bytes at ISEG:IOFS-4 to hold the old address, and perform interrupt-table patching using interrupts explicitly designated for that purpose, I think it would have been possible to design a system that would allow robust manipulation of the interrupt chain. Rather hard to patch that into a system after-the-fact, however. – supercat May 14 '22 at 17:07
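
To illustrate the vector hooking discussed in the comments above, here is a sketch of the 'compatible' way, using DOS's own get/set-vector calls (assuming a .COM-style program where CS=DS; NASM syntax, handler body made up):

            mov  ax, 3521h          ; AH=35h: get interrupt vector, AL=21h -> old handler in ES:BX
            int  21h
            mov  [old_off], bx
            mov  [old_seg], es
            mov  dx, new_handler    ; DS:DX -> our replacement handler
            mov  ax, 2521h          ; AH=25h: set interrupt vector, AL=21h
            int  21h
            ; ... terminate-and-stay-resident, etc. ...

    new_handler:
            ; inspect AH here and handle the functions we care about, then ...
            jmp  far [cs:old_vec]   ; ... chain to the previous handler for everything else

    old_vec:
    old_off: dw 0
    old_seg: dw 0
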
16

As I mentioned in my comment, IBM's BIOS also used interrupts for service dispatch when it could have easily chosen a well-known address as a ROM entry point (such as how a soft reset was programmatically available by jumping to FFFF:0000). I'm speculating, but Microsoft might have been following IBM's lead.

Also, triggering an interrupt only takes two bytes in the calling code, and in a time when counting bytes was a must, that may have been part of the IBM/Microsoft design calculus.

But DOS did offer a jump table as you're asking about: When a program started, DOS would provide a PSP for program information (such as the command-line arguments). PSP offset 5h could be near-called to invoke the Int 21h dispatcher. Most programs written for DOS didn't bother using it (they could call Int 21h directly), but the intention was to aid machine-translating code from CP/M to DOS, as it operated similarly to CP/M's Zero Page offset 5h (which was in turn a call to its BDOS).
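
As a sketch, under the CP/M-derived convention (function number in CL, parameter in DX, following the 8080-to-8086 register mapping), a .COM program could have printed a string like this (illustrative only; NASM syntax):

            mov  cl, 09h         ; CP/M-style function number: print '$'-terminated string
            mov  dx, msg         ; DS:DX -> the string
            call 0005h           ; near call to PSP offset 5, which re-enters the DOS dispatcher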

This mechanism, a legacy supported by Microsoft since DOS 1.0, caused problems with the A20 line and address wrap-around. A hair-raising summary and history of PSP offset 5h and the issues it led to can be found at the OS/2 Museum site.

Note that Intel reserved interrupts 00h - 1Fh in the 8086 and later chips. Since the ROM BIOS used many interrupts in that range, that did cause grief later.

As far as how using interrupts for service dispatch caused problems with protected mode, I can only recommend reading up about DPMI and the internal history of Windows 3.x. (Raymond Chen's blog is a good source for the latter.) Microsoft wound up paying off technical debt for its early decisions, some of which were admittedly unavoidable.

Jim Nelson
  • 2
    There's also an INT21 wrapper at PSP+50h which isn't restricted to the CP/M-compatible range of functions. – john_e Feb 26 '20 at 12:49
13

Where would this 'jump table' live?

As I understand it, the use of a pure trap-based mechanism for calling system services removes the need for user programs to have knowledge of the supervisor program's layout. With a jump table, by contrast, either (a) the table is at a well-known address that will never change, or (b) you need linker technology to include a system symbol table when linking a user program.

MS-DOS uses a table at a fixed address, except that it is called an 'interrupt table'. And by using INT 21 as the 'call' instruction, the user program does not even have to know the address of the table; the only 'address' (figuratively speaking) it has to know is the value 21 in the INT instruction.

Thus using a trap instruction effectively insulates user programs from system layout details.
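
On the 8086 that table is the interrupt vector table in segment 0: the vector for interrupt type N is the doubleword at 0000:N*4, so type 21h sits at 0000:0084h. A small sketch of reading it directly (NASM syntax, real mode; a normal program never needs to do this):

            xor  ax, ax
            mov  es, ax              ; ES = segment 0, where the interrupt vector table lives
            mov  bx, 21h*4           ; entry for interrupt type 21h (= offset 0084h)
            mov  ax, [es:bx]         ; offset of the DOS dispatcher
            mov  dx, [es:bx+2]       ; segment of the DOS dispatcher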

You'll find similar arrangements in many earlier operating systems. An example is the S/360 SVC instruction, which is a trap to a service routine vectored through a fixed low-core address. In other words, MS-DOS used a pretty normal approach.

(It's a trap, not an interrupt -- sigh)

dave
  • 2
    Isn't it rather (c) - the table resides at an OS controlled location and access is done in a strict self contained (abstract) fashion? A DOS program does not need to know were the table is located at all - not even in case it wants to add/change an entry, as this is handled via a DOS call. – Raffzahn Feb 26 '20 at 02:35
  • 2
    Yeah, I guess so. I think we can consider this an example of David Wheeler's dictum that "any problem in computer science can be solved with another level of indirection". The programmer uses logical address 21 (as in INT 21) and we indirect through the interrupt table. – dave Feb 26 '20 at 02:41
  • I'm not so much a fan of 'another level' of indirection, but I'm an evangelist of separation and symbolic handling. – Raffzahn Feb 26 '20 at 02:45
  • 1
    The obvious place for the jump table would be in the PSP segment where the CP/M compatible CALL 0 and CALL 5 "jump table" lived. –  Feb 26 '20 at 03:16
  • 2
    @another-dave What about the much less often quoted, but much more consistently acted-upon, dictum that "any problem in computer science can be solved by causing three more problems for the next guy to deal with"? – Matthew Najmon Feb 26 '20 at 19:14
  • 2
    @MatthewNajmon I've most often seen that in the specialized case 'Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.' – JAB Feb 26 '20 at 23:14
11

Nobody thought MS-DOS would be a long-lived system when it was created. Microsoft were quite clear that Xenix was going to be the future. It just didn't turn out that way, because the vast sales of the IBM PC and compatibles caused the development of a lot of popular software.

Using software interrupts with an operation code is very much like using a jump table, it's just that the jump table is in the OS' memory area rather than visible to the application programmer. That gives more flexibility to extend the OS, by adding more operation codes. If the table is in application-visible memory, it probably has to have a fixed size, which will limit extensions to the OS. Limiting MS-DOS to the set of APIs it had at version 1.x would have doomed it, because that API could not handle hierarchical directory structures.
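
A sketch of why a number-based dispatcher is so easy to extend (an illustrative dispatcher, not DOS's actual code; FUNC_COUNT and the fnXX routines are made up; NASM syntax):

    FUNC_COUNT  equ 3
    int21_entry:
            push bx
            mov  bl, ah
            xor  bh, bh
            cmp  bx, FUNC_COUNT          ; unknown function number?
            jae  .bad
            shl  bx, 1
            call word [cs:func_table+bx] ; dispatch through a table that lives inside the OS
    .done:  pop  bx
            iret
    .bad:   ; reject unknown function numbers gracefully (e.g. by flagging an error to the caller)
            jmp  .done

    func_table: dw fn00, fn01, fn02      ; a new OS version just grows this table;
                                         ; applications keep issuing INT 21h unchanged
    fn00:   ret                          ; stub handlers for the sketch
    fn01:   ret
    fn02:   ret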

Software interrupts also made it more practical to change software at source level to use protected mode. Use of a software interrupt allows a clean transition between user and supervisor modes, as opposed to having to write code to do that at each entry point in an OS jump table. This didn't work out, because a far greater range of development tools was produced for MS-DOS than Microsoft ever anticipated, and many people became fond of, or dependent on, bits of software that were no longer maintained, or were built with tools that were no longer available.

Software interrupts had no impact on the 640Kb barrier, which was created by the hardware designers of the IBM PC, who had put the video memory at fixed addresses. The real problems with making protected-mode software were:

  • The absence of an OS, until OS/2 appeared, and its unsatisfactory nature at first.
  • The completely different meaning of segment registers between real and protected mode, which meant that many of the established practices for dealing with data larger than 64Kb would not work. There wasn't a really satisfactory solution to this until 386 flat mode became available.
John Dallman
  • Re, "If the table is in application-visible memory, it probably has to have a fixed size." Why? If you give me a pointer to a table today and tell me what I can do with the first N entries in it, Then tomorrow, give me a pointer to a table that's got N+M entries, where the first N still have the same meaning... Where's the problem? – Solomon Slow Feb 25 '20 at 17:45
  • 4
    @SolomonSlow: Commonplace poor programming practices of the late 1970s and early 1980s would have meant that some application developers would have failed to allow for backwards compatibility, by using higher entries without checking if they were available. Yes, it's their fault, but that's hard to prove to customers. If you share memory with an application in an unprotected OS, you have to design defensively. – John Dallman Feb 25 '20 at 18:05
  • 2
    Aside from "poor programming", you can't deal with forwards compatibility, when an old application overwrites the future extensions to the fixed size table which it never even knew about. – alephzero Feb 25 '20 at 18:20
  • 6
    A far call instruction is five bytes. An INT instruction is two. Given that DOS was intended to be usable on machines with as little as 48K of RAM, that's a pretty big difference. – supercat Feb 25 '20 at 19:50
  • @supercat - the original IBM PC started at 16K of memory. I recall the staff I worked with at Yorktown Heights had a 16K memory expansion card that made life much nicer for data acquisition and analysis. – Jon Custer Feb 25 '20 at 20:33
  • @JonCuster: I don't recall that one could use anything other than a cassette for I/O until one had at least 48K, though I suppose it might be possible to do BIOS-level sector reads and writes from a floppy by poking a little code into memory and calling it. – supercat Feb 25 '20 at 20:53
  • @supercat - nope, I clearly recall running DOS on original PCs with the 32K of memory - no cassette at all. – Jon Custer Feb 25 '20 at 20:59
  • 2
    @JonCuster Yes, DOS can boot with 32 KiB. See number in this answer. Now, while it can boot, it wouldn't leave much RAM to do anything. And already 2.0 would need 48 KiB to come up. – Raffzahn Feb 25 '20 at 22:27
  • MS-DOS started life as a Quick and Dirty clone of CP/M-80, where the "call operating system" was a "CALL 5" instruction. So my guess is the programmer picked a similar mechanism out of a hat. – Thorbjørn Ravn Andersen Feb 25 '20 at 22:38
  • 4
    Other services, including builtins like BIOS, and software packages, like NETBIOS, also used interrupts to publish their functionality (see Ralf Brown's Interrupt List for full info). So an argument that this was a temporary solution rather than the standard solution in real mode, doesn't hold water. – ivan_pozdeev Feb 26 '20 at 02:29
10

The interrupt system really is just a fancy jump table, and it already has allocated space that can't be used for anything else. Adding a regular jump table on top of it would have been a waste of memory at that time.

Back then there was no Protected mode to worry about, never mind what its requirements would be...

Brian Knoblauch
5

One advantage of using interrupts is that in a system that uses segmented memory you don't need to change the current page to the one with the jump table, or to allocate part of the 64k available to standard DOS programs to said table.

user
  • What was the problem with just using a long JMP directly to an address from the JMP table? (I think I may know, but it would be good if your answer addressed this.) – cjs Mar 27 '20 at 02:05
3

An 8-bit point of view: GEOS is an example of a bad design, as it uses a fixed jump table. As a result, the C64 and Apple II versions cannot share the same executable. A software interrupt is more flexible.

For a better solution see https://retrocomputing.stackexchange.com/a/12296

Polluks