
According to the Oracle VirtualBox 6.0 manual, "Certain rare guest operating systems like OS/2 make use of very esoteric processor instructions that are not supported with our software virtualization. For virtual machines that are configured to contain such an operating system, hardware virtualization is enabled automatically." Any idea which "very esoteric processor instructions" would be meant by this?

TeaRex
  • I don't know the answers to your questions, but I have OS/2 2.1 and OS/2 Warp both running under VirtualBox without issue. I didn't use any special settings. – Tim Locke Sep 06 '21 at 14:44
  • @TimLocke that’s thanks to the hardware virtualisation in your system. – Stephen Kitt Sep 06 '21 at 15:32
  • OS/2 1.3, like Windows 95 (ActiveX after 8b), used some instructions that were not intended to be used in production, as Intel already had multi-CPU and multi-core in mind. I know this because Cyrix did not even include those instructions, and installing DirectX 8C required falling back to an Intel CPU. This was in the days when Windows hung after 49.7 days due to millisecond rollover. With more than one core, where did the time come from, etc. OS/2 1.3 was a Microsoft product. They left a subsystem in Windows NT for a while. – mckenzm Sep 07 '21 at 01:45
  • @mckenzm are you referring to LOADALL by any chance? (See LOADALL Strikes Again and How does the LOADALL instruction on the 80286 work?) That would have affected OS/2, at least 1.0 and 1.1, but not Windows 95 or NT. Do you have any references to Intel already having multi-CPU and multi-core in mind when they designed the 286? – Stephen Kitt Sep 07 '21 at 13:15
  • I'm thinking more of the time stamp counter, and quirks with the Cyrix MII as well as other Pentium equivalents at the time. My point being some of the opcodes were not intended to be used other than in bespoke system-programmer-type apps, and not compiled into retail releases. – mckenzm Sep 07 '21 at 14:19
  • @mckenzm ah right; the TSC wasn’t a problem for OS/2; it didn’t use it AFAIK, or at least didn’t need it. I ran OS/2 Warp 3 and 4 for a while on a TSC-less Cyrix 6x86 without trouble. As far as opcodes go, if they’re documented, they’re usable, as long as the documentation is followed (which can include privilege restrictions); TSC availability is indicated by a specific bit in CPUID’s output. As you say it causes problems on multi-core and multi-CPU systems, but that was documented too. – Stephen Kitt Sep 07 '21 at 15:43
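
(To make that last point concrete: a minimal sketch of the check, assuming GCC or Clang on x86 with the <cpuid.h> helper; per Intel's documentation the TSC flag is bit 4 of EDX in CPUID leaf 1.)

    #include <stdio.h>
    #include <cpuid.h>   /* GCC/Clang wrapper around the CPUID instruction */

    int main(void)
    {
        unsigned eax, ebx, ecx, edx;

        /* CPUID leaf 1: EDX bit 4 indicates RDTSC/TSC support. */
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (edx & (1u << 4)))
            printf("TSC available\n");
        else
            printf("no TSC (e.g. some Cyrix parts)\n");
        return 0;
    }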

1 Answer


As far as I’m aware, the difficulty in virtualising OS/2 isn’t due to esoteric processor instructions, but rather to esoteric processor features. Specifically, OS/2 uses all the protected-mode features available on 286 (OS/2 1.x) and 386 (OS/2 2.0 and later) PCs: segment limits, paging, protection rings... The last of these has commonly been given as the main difficulty in supporting OS/2: while most protected-mode PC operating systems use only rings 0 (kernel) and 3 (user mode), OS/2 also uses ring 2 for privileged code which couldn’t touch the kernel but could access some pieces of hardware (for printers and displays).
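
As an aside, the current ring is simply the low two bits of the CS selector, so it is easy to observe from code; here's a minimal sketch, assuming GCC-style inline assembly on x86 (purely illustrative, nothing OS/2-specific):

    #include <stdio.h>

    /* The CPU records the current privilege level (CPL) in the low two
       bits of CS: 0 for the kernel, 3 for user mode, with OS/2's
       I/O-privileged code sitting in ring 2. */
    static unsigned current_ring(void)
    {
        unsigned short cs;
        __asm__ volatile ("mov %%cs, %0" : "=r" (cs));
        return cs & 3;
    }

    int main(void)
    {
        /* Prints 3 under any ordinary OS: applications run in ring 3. */
        printf("running in ring %u\n", current_ring());
        return 0;
    }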

Because support for x86-style rings other than 0 and 3 isn’t needed by many operating systems, it probably isn’t a priority for developers of virtualisation tools; bear in mind that virtualisation tools are mostly developed by companies, whose goal is typically to support specific workloads rather than to provide a complete emulation of the original hardware platform. By skipping support for the unused rings, they only lost the ability to run operating systems such as OS/2, and some features of DR DOS (DPMS specifically). On top of that, at least some virtualisation tools rely on protection rings themselves to virtualise the operating system they’re hosting: if they can’t use hardware-assisted virtualisation, they run their virtual machine manager in ring 0, and run the guest operating system in ring 1 (instead of ring 0). That doesn’t leave much room for the guest operating system to do anything clever with protection rings itself.
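
To illustrate the ring-compression idea, here is a hypothetical sketch of such a mapping (it is not VirtualBox's actual scheme, just one way a software VMM might deprivilege its guest):

    /* Hypothetical: the VMM keeps ring 0 for itself and runs the guest
       kernel in ring 1; guest rings 1 and 2 then have no distinct ring
       left to occupy, which is exactly what OS/2's ring-2 code needs. */
    static int host_ring_for_guest_ring(int guest_ring)
    {
        switch (guest_ring) {
        case 0:  return 1;  /* deprivileged guest kernel */
        case 1:
        case 2:  return 3;  /* folded into user mode: their extra
                               privileges over ring 3 are lost */
        default: return 3;  /* guest user mode stays user mode */
        }
    }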

This probably isn’t the only feature which causes issues. OS/2 runs in both 16-bit and 32-bit protected mode simultaneously, with frequent switches from one to the other, even when running 32-bit applications; this requires specific support in virtualisation environments and assistance from the host operating system (at least, on 64-bit PC operating systems). OS/2 1.x relies on triple-faulting to leave protected mode, and OS/2 in general uses call gates for system calls, both of which require specific handling in the virtual machine manager and are much easier to implement with hardware assistance. (Other operating systems also use call gates, so that’s not an OS/2-specific requirement.)
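
For a sense of what the virtual machine manager has to recognise there, this is the layout of a 286-style call-gate descriptor, per Intel's documentation (the C struct itself is just an illustrative sketch):

    #include <stdint.h>

    /* A 286 call gate, of the kind OS/2 1.x used for system calls: a
       far call through the gate switches rings under CPU control, so a
       VMM running its guest deprivileged has to intercept and emulate
       these transitions itself. */
    struct call_gate_286 {
        uint16_t offset;      /* entry-point offset in the target segment */
        uint16_t selector;    /* target code-segment selector             */
        uint8_t  word_count;  /* low 5 bits: parameter words to copy      */
        uint8_t  access;      /* P (bit 7), DPL (bits 6-5), type 0x04     */
        uint16_t reserved;    /* must be zero on the 286                  */
    };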

All this is specifically a problem when virtualising on x86 without hardware virtualisation (AMD-V and VT-x). The goal when virtualising is to set the host environment up such that the guest can run directly on the CPU, with no translation, but with all privileged operations intercepted by the “hypervisor”. Plain x86 falls short of the classical virtualisation requirements here, so “software virtualisation” requires a lot of work in the hypervisor to make up for the architecture’s deficiencies. Hardware-assisted hypervisors can run OS/2 without difficulty, relying on the hardware to provide a complete protected-mode virtualisation; as can emulators (such as Bochs, and QEMU in non-KVM mode), relying on their CPU emulation to simulate the protected-mode environment. Keith Adams and Ole Agesen’s 2006 paper, A Comparison of Software and Hardware Techniques for x86 Virtualization, explains the complexity of x86 virtualisation in detail.
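
For completeness, a minimal sketch of how software can tell whether hardware virtualisation is present, assuming GCC or Clang's <cpuid.h> (feature bits per Intel and AMD documentation; this only reports CPU support, and firmware may still have it disabled):

    #include <stdio.h>
    #include <cpuid.h>

    int main(void)
    {
        unsigned eax, ebx, ecx, edx;

        /* Intel VT-x: CPUID leaf 1, ECX bit 5 (VMX). */
        if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 5)))
            printf("VT-x (VMX) supported\n");
        /* AMD-V: CPUID leaf 0x80000001, ECX bit 2 (SVM). */
        else if (__get_cpuid(0x80000001, &eax, &ebx, &ecx, &edx)
                 && (ecx & (1u << 2)))
            printf("AMD-V (SVM) supported\n");
        else
            printf("no hardware virtualisation: software techniques or emulation needed\n");
        return 0;
    }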

Stephen Kitt
  • Dev: "So, which arcane features we'll use on our new SO?" Tech Lead: "Yes." – T. Sar Sep 06 '21 at 16:04
  • @T.Sar they weren’t arcane features when OS/2 was in development ;-). – Stephen Kitt Sep 06 '21 at 16:07
  • An interesting question is why IBM chose to use the full extent of features whereas others didn't. Is it because e.g. NT was designed to be highly portable across a variety of wildly different platforms? Reportedly, Dave Cutler chose the i860 platform for the development workstations because it was as different, one might even say "weird", not only from the 386 but from most other architectures as could be. – Jörg W Mittag Sep 06 '21 at 16:27
  • In fact they were remarkable features! The protection system of Multics - more than the protection system of Multics - on a single-chip microprocessor! (The GE645/H6180 systems did not have gates - call, task, or interrupt - and did all that through the segment hardware trapping on ring changes. I think gates were an improvement/optimization over that.) – davidbak Sep 06 '21 at 16:29
  • @JörgWMittag - well NT ran on MIPS as well as x86 from the beginning, and MIPS didn't have rings much less gates, so they couldn't use it architecturally in NT (e.g., >2 rings, tasks in the ISA). There may have been ways to use them strategically in implementation details, e.g., it probably wasn't necessary that the user->kernel transitions be done the same way on both architectures, but then it turned out that these features were slower on real Intel chips than alternative (more "normal") mechanisms already in the ISA. – davidbak Sep 06 '21 at 16:34
  • @JörgWMittag this Super User Q&A gives some reasons why later OSs only used two rings. OS/2 was developed for 286s, where all four rings were equally useful (albeit still hard to use because of the limited number of segment descriptors); most later OSs were developed for 386s, where using more than rings 0 and 3 made less sense (if protection was provided by paging instead of segments). – Stephen Kitt Sep 06 '21 at 16:37
  • @StephenKitt - "if protection was provided by paging instead of segments" - that's the crux of it. Pages and page table descriptors had a very good (though different) protection model from segments and segment descriptors; it turned out that paging was far superior to segments both from the OS implementation standpoint and from the programmer's standpoint. – davidbak Sep 06 '21 at 16:43
  • That makes sense. It looks like it's a combination of the facts that the earlier OSs were either too simple to need it, or too portable to make use of it, and the later ones were designed for 386+ where it doesn't make sense. OS/2 seems to have been designed in a sweet spot where neither the 386 nor PowerPC existed, IBM didn't intend to port it, and they were also used to lots of hardware assistance through very powerful instruction sets from RS/6000 and their mainframes. – Jörg W Mittag Sep 06 '21 at 16:44
  • QEMU's emulation is not complete: it can't e.g. emulate AVX-512 on a CPU that doesn't support it, even in non-KVM mode. I found this when I tried to use QEMU's high performance emulation to develop EDB support for AVX-512 while using a Haswell CPU. Had to resort to Bochs, which worked great, albeit very slow. – Ruslan Sep 07 '21 at 08:38
  • @Ruslan I wasn’t referring to feature coverage, I’ve removed “complete” to avoid confusion. – Stephen Kitt Sep 07 '21 at 08:54
  • @davidbak: It's open to a lot of question whether paging is superior from a programmer's viewpoint. Paging is simpler, but segments have a number of extremely useful features that paging simply can't provide. Just for one trivial example, segment-based protection on the 286 already had the equivalent of the no-execute bit that wasn't added to paging until many years later (but that's only the tip of the iceberg). – Jerry Coffin Sep 07 '21 at 15:54
  • @StephenKitt - Thanks for the update. I think that the body of the answer is better without the reference. But I think I can - sort of - confirm your memory, too. Since NetWare 4.11, there were still only two rings used at the same time, but you could tell NetWare (with the RING switch to the DOMAIN command) to move the entire other-than-OS domain ("OS_PROTECTED") to run in any one of rings 1, 2, or 3 if you cared. I was wrong thinking it was hardcoded to ring 3 (like it was in version 4.0 and 4.1). There could be something else I don't know. I stopped coding NLMs by version 5. – Jirka Hanika Sep 07 '21 at 20:02
  • @JerryCoffin - I'd like to know more about the "tip of the iceberg" ways in which segmenting is superior for programmers these days - do you have a link maybe to some discussion of it somewhere? (I'd ask a question specifically about it but it doesn't seem quite on topic here and I can't think of which other board it would be on topic for.) – davidbak Sep 07 '21 at 20:20
  • @davidbak The Design of OS/2 gives a great overview of segmentation and paging on x86, and explains why various features were dropped. I’m not aware of anything which explains how great segmentation was, at least on paper. In reality it turned out to be slow and buggy on the 286 (see Digital Research’s woes with Concurrent DOS, and the wait for E2-stepping 286s). Segmentation on the 286 gives you CPU-assisted bounds checking (no need for guard pages), task switching, even the ability to virtualise DOS in protected mode (see Concurrent DOS). But it was too slow... – Stephen Kitt Sep 08 '21 at 05:35
  • Minor quibble with rings in the first section... to my recollection ring 1 was not used at all, and while ring 2 in theory was used for port IO it was rarely (?never?) used. The default config.sys (IOPL=YES) allowed port IO to all processes, however port IO was typically done with drivers in ring 0. IO was also restricted to 16bit code (driver or process). There was an interesting "experiment" (link escapes me at the moment) using a driver to grant a 32bit process IO privilege, and it worked to evade the kernel. In general great answer, as OS/2 uses "everything" the processor offered. – Doug Sep 08 '21 at 15:12
  • @Doug I didn’t mention ring 1 in the first section ;-). Even if ring 2 wasn’t used in practice for “privileged” processes, presumably the kernel would still need to support it, wouldn’t it? – Stephen Kitt Sep 08 '21 at 18:49
  • @JörgWMittag: I don't think "sweet spot" is really the right term. IBM had undertaken that OS/2 would run on a PS/2 Model 50 with 2MB RAM (that sold them a whole lot of machines shortly after the launch of PS/2) and were forced to try to make it happen. So they used everything available on the 286 and had to ignore the 386 for the initial implementation. – John Dallman Sep 12 '21 at 15:09