9

Most assembly programs make use of the 4 general purpose registers eax, ebx, ecx, and edx, but I often find that I need more than 4 registers to accomplish my task easily without pushing to and popping from the stack too much. Since my program has no intention of using the FPU or MMX registers for floating point calculations or their "intended use", is it considered acceptable to use these extra registers in my program?

E.g., using mm0 as a loop counter, freeing up the ecx register to do other things.

Sep Roland
user99545
  • You can pretty much do whatever you want. If you find that using the xmm registers is faster than spilling to stack, go for it. – Mysticial Feb 23 '13 at 05:42
  • @Mysticial Rarely does anybody tell me "I can do whatever I want" pertaining to programming. I dig it :) – user99545 Feb 23 '13 at 05:50
  • x86_64 has 8 extra registers for general use. – Dietrich Epp Feb 23 '13 at 05:58
  • Aversion to using memory will become quite a hindrance once you graduate to algorithms that are more complicated than "Hello world". Better learn to use memory (hint: PUSH/POP is **not** how you do it). And yes, ESI, EDI, EBP. ESP if you're crazy. – Seva Alekseyev Feb 25 '13 at 02:28
  • Louder hint: set up a stack frame when you enter a function, and set EBP to point to the frame. Then you can have essentially as many private memory locations as you want by allocating them in the stack frame. My personal experience is that if I have a few data structures accessed by pointers in registers (as is usual), I can pretty much make do with the 6 registers (I burn EBP for a stack frame pointer) with only an occasional push and pop. Yes, I rewrite my code a lot when I change it, to take advantage of/live with new constraints of the revised code. – Ira Baxter Feb 25 '13 at 04:14
  • Why don't you write a simple C version first to see how the compiler spills registers? But if a lot of registers must be used, switching to x86_64 is recommended. – phuclv Aug 10 '13 at 15:03

3 Answers

4

Why four? You can use all of these: eax, ebx, ecx, edx, esi, edi and ebp. That's seven. Or is that not enough either?

FPU and MMX registers are somewhat awkward to work with. x87 FPU registers can only be loaded from and stored to memory or other FPU registers. MMX registers can exchange a 32-bit value with a general purpose register via `movd`, but there are no arithmetic or branch instructions capable of operating on both kinds of registers at the same time.

If seven general purpose registers aren't enough, use local/on-stack variables. For example, you can decrement a counter variable in memory directly and you can also directly compare it with a constant or another register. Chances are, this is going to be no slower (likely, faster) than using FPU or MMX registers in strange ways.

Alexey Frunze
  • No, seven registers are not as fast as using all of them. MMX registers can be quite useful. – Ben Voigt Feb 23 '13 at 06:44
  • @BenVoigt: To be fair, they're not useful *as a loop counter instead of `ecx`*. Branching on an MMX register becoming zero is way less efficient than `dec ecx / jnz`, and requires a spare integer register at that point anyway. But sure, if you want to fill an array with an increasing sequence, use an XMM register and `paddd` / `movd` store. Or better, actually vectorize and fill 4 elements at once with `paddd` / `movdqu` store. But then you're just vectorizing the loop the normal way with vector regs. – Peter Cordes Oct 03 '20 at 02:16
  • But yes, especially in 32-bit code (where register pressure is worse) there can be cases where using the low element of a vector reg as a scalar makes sense, if you unconditionally do some integer stuff that's independent of values in integer registers. Especially just copying memory. Keep in mind that MMX specifically will need an `emms` at some point, costing cycles vs. SSE2 for XMM registers. – Peter Cordes Oct 03 '20 at 02:18
1

How often do you need the full 32 bits of a register? For things like small counters, feel free to use the byte-sized quarters of general purpose registers: AH/AL, BH/BL, CH/CL, DH/DL. With some bitwise trickery, you can also use the upper 16 bits of general purpose registers as intermediate storage for word-sized variables.
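
A rough model of the "bitwise trickery" mentioned above, written in C rather than assembly so the arithmetic can be checked on any machine. The helper names (`swap_halves`, `stash_upper`, `fetch_upper`, `lower_half`) are made up for illustration, and a `uint32_t` stands in for a register such as EAX. In actual assembly, one common way to reach the upper half is `rol eax, 16`, which swaps the two halves so the stashed word shows up in AX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: a uint32_t models a 32-bit register such as EAX. */

/* Swap the upper and lower 16-bit halves (same effect as `rol eax, 16`). */
uint32_t swap_halves(uint32_t reg)
{
    return (uint32_t)((reg << 16) | (reg >> 16));
}

/* Stash a 16-bit value in the upper half, leaving the lower half (AX) intact. */
uint32_t stash_upper(uint32_t reg, uint16_t value)
{
    return (reg & 0x0000FFFFu) | ((uint32_t)value << 16);
}

/* Read the stashed value back out of the upper half. */
uint16_t fetch_upper(uint32_t reg)
{
    return (uint16_t)(reg >> 16);
}

/* Read the lower half, i.e. what AX would hold. */
uint16_t lower_half(uint32_t reg)
{
    return (uint16_t)(reg & 0xFFFFu);
}

/* Small self-check: two word-sized variables share one register-sized value. */
void demo_two_words_in_one_register(void)
{
    uint32_t reg = 0x0000BEEFu;          /* working value lives in the low half */
    reg = stash_upper(reg, 0x1234);      /* park a second value in the high half */
    assert(lower_half(reg) == 0xBEEF);   /* working value is untouched */
    assert(fetch_upper(reg) == 0x1234);  /* parked value can be recovered */
}
```

Calling `demo_two_words_in_one_register()` exercises the round trip. The cost in real code is the extra rotate or shift on every access to the upper half, which is why this only pays off when the parked value is touched rarely.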

In real mode (read: under DOS), you can also use the segment registers ES, FS, and GS for intermediate value storage. Under a protected-mode OS (Windows, Linux, *nix) the code will crash, though, because loading a segment register with a value that is not a valid selector raises a fault.

Seva Alekseyev
  • That's only a good idea on Intel CPUs (which rename AH separately from the rest of the register). On AMD, `inc ah` and `inc al` have false dependencies on each other. See [Why doesn't GCC use partial registers?](https://stackoverflow.com/q/41573502). Even on Intel ([since Haswell or so](https://stackoverflow.com/questions/45660139/)), writing AL is actually a merge into RAX, meaning that `mov al, 6` is *not* dependency-breaking while `mov eax, 6` is. (It doesn't force merging of a separately-renamed AH, so at least doesn't couple those dep chains.) – Peter Cordes Oct 03 '20 at 02:09
-2

Well, there are SI and DI as well, of course, and on x64 you have additional registers, but you can use the FP registers for whatever you want.

phuclv