
It's typically better to use CPU registers to their full capacity. For portable code, that means using 64-bit arithmetic and storage on 64-bit CPUs, and only 32-bit on 32-bit CPUs (otherwise, 64-bit instructions get emulated in 32-bit mode, with devastating performance).

That means it's necessary to detect the size of CPU registers, typically at compile time (since runtime tests are expensive).

For years now, I've used the simple heuristic `sizeof(nativeRegisters) == sizeof(size_t)`.
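
In practice, the heuristic turns into a compile-time type selection, roughly like the sketch below (`arith_t` is just an illustrative name; `SIZE_MAX` is the preprocessor-visible equivalent of `sizeof(size_t)`):

```c
#include <stdint.h>

/* Sketch of the heuristic as a compile-time selection.
   SIZE_MAX > 0xFFFFFFFF means size_t is wider than 32 bits. */
#if SIZE_MAX > 0xFFFFFFFFu
typedef uint64_t arith_t;   /* assume 64-bit native registers */
#else
typedef uint32_t arith_t;   /* assume 32-bit native registers */
#endif
```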

It has worked fine for a lot of platforms, but it turns out to be the wrong heuristic for Linux x32: there, `size_t` is only 32 bits, while the registers can still handle 64 bits. The result is a lost performance opportunity (significant for my use case).

I would like to correctly detect the usable size of CPU registers even in such a situation.

I suspect I could find some compiler-specific macro to special-case x32 mode. But I was wondering if something more generic exists, to cover more situations. For example, another target would be 64-bit OpenVMS: there, the native register size is 64 bits, but `size_t` is only 32 bits.

Cyan
  • Out of interest, how can you "use the answer" in C? Assembly I get, but C? – John3136 Apr 30 '16 at 08:12
  • It's in the algorithm: use more accurate 64-bit arithmetic instead of cheaper 32-bit heuristics, and keep values in registers as "fast local storage" for longer when they are 64-bit rather than 32-bit. Selection is done at compile time, using a test of `sizeof(size_t)` (up to now). – Cyan Apr 30 '16 at 08:14
  • Another popular one is `sizeof(void*)`. – Sergey Kalinichenko Apr 30 '16 at 08:22
  • I believe you can direct the compiler to compile for 32 or 64 bits (VC). It means linking with different libraries too. That would mean the source code never knows the available register size, but the compiler does. Then it should be in the makefile as a symbol to be defined (command-line `#define`) and a switch for the compiler. In your source you can set guards so that either a 32-bit symbol or a 64-bit symbol is defined. – Paul Ogilvie Apr 30 '16 at 08:29
  • There is no reliable way of knowing register size from plain C. But from the definitions of the language, the closest guess should be `sizeof(int)`, as `int` is supposed to be the natural type for the environment. (Other types are already bound to a different purpose.) Even then, some compiler settings could be fooling you. – rpy Apr 30 '16 at 08:44
  • @rpy: In Linux, Windows and MacOS X, for 64-bit targets `sizeof(int)==4` as defined in their ABIs. So no, `int` is not a good indicator. And then on Windows `sizeof(long)==4` while on Linux `sizeof(long)==8`. – datenwolf Apr 30 '16 at 10:55
  • Would `sizeof(intmax_t)` provide that information? (It is not clear whether "maximum-width integer types, which are guaranteed to be the largest integer type in the implementation" — from [Wikipedia](https://en.wikipedia.org/wiki/C_data_types#Fixed-width_integer_types) — includes software implementations automatically provided by the compiler.) –  Apr 30 '16 at 12:40
  • `sizeof(void*)` was tempting, but it also fails to detect the x32 situation (where both `size_t` and `void*` are 4 bytes, while registers are 8 bytes). – Cyan Apr 30 '16 at 16:14
  • `uintmax_t` looks like the best bet, although it requires `stdint.h`, which implies C99, which can be a portability issue on some systems. – Cyan Apr 30 '16 at 16:18
  • Actually, after further testing, it seems that `uintmax_t` is always 64 bits, even in 32-bit mode; a quick probe (like the sketch after these comments) shows it. "Largest supported integer type" seems to reflect compiler capability rather than hardware register size (indeed, 64-bit arithmetic is available on 32-bit platforms, just at a greater software-emulation cost). – Cyan Apr 30 '16 at 16:31
  • https://sourceware.org/glibc/wiki/x32#line-37 gives you some insight into what is different for x32 in the compiler (`#define`s, assembly level). – Honza Remeš Apr 30 '16 at 22:29
  • @datenwolf: as I tried to indicate, it depends on the compiler. Without knowing the compiler's target-platform specifics there can be no reliable assumption, besides assumptions about the _smallest_ types. From those, `int` will fit the integer arithmetic registers, even if it is not the longest size that would fit. – rpy May 01 '16 at 13:09
  • @rpy: The sizes of the arithmetic types are not determined by the compiler but by the host OS ABI rules. It has to be written into the ABI, because many syscalls (in practically every OS) pass around pointers to structs and the size and alignment of the structs must be identical between compilers. Only if there's no target environment with an ABI around a compiler may do as it likes to. – datenwolf May 01 '16 at 13:54
  • @datenwolf: taken, for that level of discussion. But if you had ever started on bare hardware, building your own assembler and your own compiler before porting any OS, then you would know the decision on sizes is closer to the hardware than to any OS ABI. The ABI just documents reasonable definitions for use with a system environment (and a compiler for that environment will adhere, granted). But on the same hardware one OS could decide for int=4 while another would use int=8, so you would still need to know your target environment for any clues on optimal sizes. (x64 vs. x32 OS versions is a common case of that.) – rpy May 01 '16 at 14:04
  • @rpy: You assume that I don't know low-level stuff. The opposite is true: I'm the low level and RTOS developer at our company. Here's the thing: ABIs are strict system level contracts and compilers **must** adhere to them for their outputs to properly interface with the rest of a system. That means that the sizes and alignments of struct members and function arguments are pretty much pinned down. Where compilers have leeway (lots of it, actually) is the inside of the black boxes we call "functions" or methods. But the size of certain types is a strict requirement pinned down by ABIs. – datenwolf May 01 '16 at 15:23
  • @rpy: You wrote `But on same HW one os could decide for int=4 while another would use int=8` – that's exactly what ABIs are. They exactly define for a combination of OS and architecture the binary representation of each type used in system level interfaces. The choices in ABIs of course try to follow and match the peculiarities of the target architectures. But not all choices are just about performance. Some are also about memory footprint and register allocation. – datenwolf May 01 '16 at 15:26
  • @datenwolf: no assumptions from my side; I think we are on the same page. The ABI, however, may only be "master" when you are starting from an existing OS and bringing it to new hardware; then the compiler might follow the ABI definitions. Otherwise the ABI will incorporate what compiler builders suggest. So this becomes more of a chicken-and-egg problem (;-). With respect to the OP's question this is irrelevant, as the only item specified was that it has to be C, and that does not allow any ABI assumptions by itself. – rpy May 01 '16 at 15:41
  • @rpy: Yes, this discussion is getting off topic, but I sense that you really have no idea how ABIs are defined. You have to define an ABI before you can even start writing a compiler. And in that regard it's only x86_32 where everybody did what they liked. For practically every other platform out there, it's the CPU architects who design the ABI conventions to be used with their metal. It's ABI first, OS system-level interfaces next, and compilers last, at least for every platform with a sensible design underneath. – datenwolf May 03 '16 at 07:52
  • @datenwolf: Exactly what I try to say: it's HW first. And not all OSes in history formally followed that strict sequence of specification levels you stated. Informally, it is not a bad thing to follow those steps, however. – rpy May 03 '16 at 09:06
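
A small probe program makes the sizes discussed in these comments concrete (a sketch; the figures in the code comments are what typical x86-64 Linux and x32 builds report):

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    /* Typical x86-64 Linux prints 4, 8, 8, 8, 8.
       Under the x32 ABI, long, void* and size_t drop to 4,
       while uintmax_t stays 8. */
    printf("int:       %zu\n", sizeof(int));
    printf("long:      %zu\n", sizeof(long));
    printf("void*:     %zu\n", sizeof(void *));
    printf("size_t:    %zu\n", sizeof(size_t));
    printf("uintmax_t: %zu\n", sizeof(uintmax_t));
    return 0;
}
```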

1 Answer


There is no reliable and portable way to determine register size from C. C doesn't even have a concept of "registers" (the description of the `register` keyword doesn't mention CPU registers).

But it does define a set of integer types that are the fastest types of at least a specified width: `<stdint.h>` defines `uint_fastN_t` for N = 8, 16, 32, 64.

If you're assuming that registers are at least 32 bits, then `uint_fast32_t` is likely to be the same size as a register, either 32 or 64 bits. This isn't guaranteed. Here's what the standard says:

Each of the following types designates an integer type that is usually fastest to operate with among all integer types that have at least the specified width.

with a footnote:

The designated type is not guaranteed to be fastest for all purposes; if the implementation has no clear grounds for choosing one type over another, it will simply pick some integer type satisfying the signedness and width requirements.

In fact, I suggest that using the `[u]int_fastN_t` types expresses your intent more clearly than trying to match the CPU register size.

If that doesn't work for some target, you'll need to add some special-case `#if` or `#ifdef` directives to choose a suitable type. But `uint_fast32_t` (or `uint_fast16_t` if you want to support 16-bit systems) is probably a better starting point than `size_t` or `int`.
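
For example, a minimal sketch of that approach (the sum-of-squares loop is just a stand-in for real work):

```c
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    /* uint_fast32_t is at least 32 bits and chosen by the
       implementation to be fast: typically 64 bits on x86_64,
       32 bits on 32-bit targets. */
    uint_fast32_t acc = 0;
    for (uint_fast32_t i = 1; i <= 100; i++)
        acc += i * i;
    printf("sum of squares: %" PRIuFAST32 "\n", acc);
    return 0;
}
```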

A quick experiment shows that if I compile with `gcc -mx32`, both `uint_fast16_t` and `uint_fast32_t` are 32 bits. They're both 64 bits when compiled without `-mx32` (on my x86_64 system). Which means that, at least for gcc, the `uint_fastN_t` types don't do what you want. You'll need special-case code for x32. (Arguably gcc should be using 64-bit types for `uint_fastN_t` in x32 mode. I've just posted this question asking about that.)

This question asks how to detect an x32 environment in the preprocessor. gcc provides no direct way to determine this, but I've just posted an answer suggesting the use of the `__x86_64__` and `SIZE_MAX` macros.
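
Putting the pieces together, the selection might look like this sketch (`reg_t` is just an illustrative name):

```c
#include <stdint.h>

/* gcc's x32 mode defines __x86_64__ but gives size_t only
   32 bits, so SIZE_MAX distinguishes x32 from true x86_64. */
#if defined(__x86_64__) && SIZE_MAX == 0xFFFFFFFFu
typedef uint64_t reg_t;        /* x32: 32-bit pointers, 64-bit registers */
#else
typedef uint_fast32_t reg_t;   /* let the implementation choose */
#endif
```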

Keith Thompson
  • That's a great complete answer. I'll go for the proposed macro solution. – Cyan May 01 '16 at 05:06
  • "If you're assuming that registers are at least 32 bits" — Looking for a platform-agnostic solution should mean something that works for 8- and 16-bit processors too, for small embedded systems. – Craig McQueen Aug 06 '20 at 05:26
  • I've noticed that Linux kernel structs, such as `struct vm_area_struct`, use `unsigned long` to hold virtual memory addresses. Is there any guarantee that `sizeof(long)` is at least the size of a register? – Daniel Walker Aug 26 '21 at 17:38
  • @DanielWalker No, there's no such guarantee (in fact C has no concept of a "register", in spite of the presence of the `register` keyword). Linux kernel code isn't portable (because it doesn't have to be). It's only compiled with gcc, or perhaps something compatible. – Keith Thompson Aug 26 '21 at 17:43