An MS-DOS .com file is just raw code and data with no header, hence no linking information, and it is limited to loading into a single segment (64kB). That's the reason corrupted binaries would print "Program too big to fit in memory". But I remember there being .com files larger than 64kB (some demos?). How did these work? Why do they not produce an error message on loading?
1 Answer
Most large files (over 64KiB) with a .COM extension are really MZ executables; the DOS loader doesn’t care whether the extension is .EXE or .COM, it uses the MZ signature to identify the format. This is the only documented way for a .COM file larger than 64KiB to work, so it’s the only approach which can be relied upon.
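As a rough illustration of signature-based detection, here is a minimal Python sketch that classifies a file by its first two bytes rather than by its extension. The function names are hypothetical, and accepting the reversed "ZM" signature alongside "MZ" is an assumption about loader behaviour rather than something stated above.

```python
def looks_like_mz(header: bytes) -> bool:
    """Return True if the data starts with an MZ executable signature."""
    # "MZ" is the usual signature; treating "ZM" as equivalent is an
    # assumption made for this sketch.
    return header[:2] in (b"MZ", b"ZM")


def classify_dos_executable(path: str) -> str:
    """Classify a file by content only; the extension is never consulted."""
    with open(path, "rb") as f:
        header = f.read(2)
    return "MZ executable" if looks_like_mz(header) else "raw .COM image"
```

Running `classify_dos_executable` on a large file with a .COM extension should therefore report an MZ executable in most cases.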
However, it is possible to build a raw .COM file larger than 64KiB, even though the documented maximum size is 65278 bytes: 65536 bytes addressable without changing segments, minus 256 bytes for the PSP, minus two bytes for the 0-word pushed to the stack (the return address used when exiting a program in DOS 1 style, with a near RET). When loading a .COM file larger than 64KiB, the behaviour depends on the specific variant and version of DOS in use:
early versions of MS-DOS only read the first 64KiB;
by MS-DOS 5 if not earlier (and presumably corresponding versions of PC DOS), the implementation sticks to the specification, and complains that the program is “too big to fit in memory” (version 6) or simply that it “cannot execute ...” (version 5);
DR DOS and OpenDOS load the program in its entirety, if it fits into available memory, and start execution as normal.
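The size arithmetic and the per-variant behaviour listed above can be summarised in a short Python sketch. The constant names, the variant labels and the `describe_load` helper are hypothetical; the outcomes are only those described in this answer, not an exhaustive survey of DOS versions.

```python
SEGMENT_SIZE = 64 * 1024  # bytes addressable without changing segments
PSP_SIZE = 256            # Program Segment Prefix at the start of the segment
RETURN_WORD = 2           # 0-word pushed for a DOS 1 style near RET

MAX_COM_SIZE = SEGMENT_SIZE - PSP_SIZE - RETURN_WORD  # documented maximum

# Outcomes for an oversized raw .COM file, as described in the list above.
OVERSIZED_BEHAVIOUR = {
    "early MS-DOS": "only the first 64KiB is read",
    "MS-DOS 5": "refused: cannot execute",
    "MS-DOS 6": "refused: too big to fit in memory",
    "DR DOS / OpenDOS": "loaded in full if it fits in available memory",
}


def within_documented_limit(file_size: int) -> bool:
    """True if a raw .COM image of this size respects the documented limit."""
    return file_size <= MAX_COM_SIZE


def describe_load(variant: str, file_size: int) -> str:
    """Describe what the given DOS variant is expected to do with the file."""
    if within_documented_limit(file_size):
        return "loaded normally"
    return OVERSIZED_BEHAVIOUR[variant]
```

For example, `describe_load("MS-DOS 6", 70000)` returns the "too big to fit in memory" outcome, while any variant loads a 65278-byte file normally.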
Care must be taken because DOS will initialise the stack to start at the end of the 64KiB block the program is assumed to occupy, which means part of the binary’s image in memory will be overwritten as soon as the stack is used. Typically, one of the first things such a program does is move its stack somewhere safe. All the segment operations required to access anything outside the first 64KiB also have to be calculated manually, since a raw .COM file carries no relocation information.
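To make the manual segment arithmetic concrete, here is a minimal Python sketch of the real-mode address calculation. It assumes the image is loaded at offset 0x100 of the PSP segment and that the initial SP is 0xFFFE (64KiB minus the pushed 0-word), which matches the size arithmetic above; the function names are hypothetical.

```python
PSP_SIZE = 0x100     # the image starts right after the PSP
INITIAL_SP = 0xFFFE  # assumed initial stack pointer within the PSP segment


def far_pointer_for_file_offset(psp_segment: int, file_offset: int) -> tuple[int, int]:
    """Return a normalised (segment, offset) pair for a byte of the image."""
    # Real-mode linear address = segment * 16 + offset.
    linear = psp_segment * 16 + PSP_SIZE + file_offset
    if linear > 0xFFFFF:
        raise ValueError("address is beyond the 1MiB real-mode address space")
    # Normalise so that the offset part is always below 16.
    return linear >> 4, linear & 0xF


def initial_stack_file_offset() -> int:
    """File offset of the image byte sitting where the initial SP points."""
    return INITIAL_SP - PSP_SIZE
```

With the PSP at segment 0x1000, the byte at file offset 0x10005 ends up at 2010:0005, so the program has to load a segment register with 0x2010 (or an equivalent value) before touching it. Under the same assumptions the stack starts at file offset 0xFEFE, that is 65278, and grows downwards over whatever the image holds there.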
This was supposed to be documented in release 62 of Ralf Brown’s Interrupt List, and there’s a thread on the topic on alt.msdos.programmer.
[...] CLI there is a tiny, tiny chance that an interrupt will happen just before and a heap of your stack gets corrupted. – Artelius Apr 27 '20 at 00:24
[...] COMMAND.COM, and it only looked at file extensions to determine file types. This was moved into the kernel in MS-DOS 2.0, and detection was changed to signature-based. All the .COM programs in MS-DOS 2 are real .COM programs, not MZ files. The only executable which is too large for a .COM file is GWBASIC.EXE, which was new in MS-DOS 2. Allowing COM files to be MZ files certainly came in useful later (COMMAND.COM itself is an MZ in later versions). – Stephen Kitt Apr 27 '20 at 10:09
[...] FORMAT.COM never got anywhere near 64K, and EDIT.COM appeared late in the game (so backwards compatibility wasn’t an issue) and was also nowhere near 64K. Having a flexible loader did help with COMMAND.COM but that wasn’t an issue when the loader was written. See the source code. – Stephen Kitt Apr 27 '20 at 15:13
[...] .COM programs in MS-DOS 2 are real .COM programs, not MZ files.” I don’t see any evidence to support the idea that the developers were forced to implement the loader in this way. But yes, it’s possible that they were aware of other programs which would need this. – Stephen Kitt Apr 27 '20 at 15:21
[...] .COM programs in MS-DOS 6.22 are MZ binaries either; some of them are however compressed. In Toshiba’s version of MS-DOS 5, FORMAT.COM is somewhat weird, it’s a raw .COM binary which has been converted from an MZ binary, which is still contained in the .COM file (the MZ signature is 16 bytes into the binary). – Stephen Kitt Apr 27 '20 at 15:34