0

How can I do a move like MOV EAX, EBX without using the MOV instruction?

Yuri Aps
  • 1
    `lea eax, [ebx]`? – Nate Eldredge Oct 16 '21 at 21:44
  • 2
    `mov` isn't an "operator", it's an "instruction". Most assemblers allow assemble-time constant expressions that include math operators, like `add eax, 8 + 4*3` as a way to write numbers in a meaningful way. Operators in assembly are things that are part of assemble-time expressions, not instruction mnemonics. – Peter Cordes Oct 16 '21 at 22:15
  • 5
    Am I the only one wondering why you would want to? – Rob Oct 16 '21 at 22:23
  • If you don't want to use `mov`, just use a different instruction; x86 has tons of others. And if the ISA (or here, external rules) doesn't allow doing something you want in one instruction, then use two or more! – Erik Eidt Oct 17 '21 at 00:50
  • 3
    `MOV EAX, EBX` actually doesn't **move** data from one place to another, it does **copy**, because `EBX` keeps its original contents. If you don't care what will be left in `EBX` after the operation, you can also use `XCHG EAX,EBX` (which often has a shorter encoding on `x86`). – vitsoft Oct 17 '21 at 12:57
  • 1
    For 16-bit registers, you can use `shld ax, bx, 16` or `shrd ax, bx, 16`. However, this does not work for 32-bit registers, since the necessary shift count of 32 falls outside the allowed range [0,31]. – Sep Roland Oct 17 '21 at 17:49
  • @vitsoft: Good suggestion, that's worth mentioning, same for Sep's. I think I'd considered those but decided against SHLD because it only worked on 16-bit regs, but probably interesting to mention even if only to rule it out for 32-bit regs. – Peter Cordes Oct 17 '21 at 20:01

2 Answers

6

You can use lea with a simple register addressing mode as a slower mov (no mov-elimination and runs on fewer ports on Intel before Ice Lake), although it's still one instruction.

# nasm -f elf64 foo.asm && objdump -drwC -Mintel foo.o
0000000000000000 <.text>:
   0:   89 c8                   mov    eax,ecx
   2:   8d 01                   lea    eax,[rcx]  # in 64-bit code,
   4:   67 8d 01                lea    eax,[ecx]  # don't use 32-bit address size

As well as being slower, it takes extra code size for some base registers: RSP/R12 need a SIB byte, and RBP/R13 need an explicit displacement byte. (I used 64-bit operand-size so they'd all need REX prefixes; that way, for example, the RSP and R12 machine code lines up, because they both need a SIB byte.)

  10:   48 89 fe                mov    rsi,rdi
  13:   48 8d 37                lea    rsi,[rdi]
  16:   48 8d 34 24             lea    rsi,[rsp]
  1a:   49 8d 34 24             lea    rsi,[r12]
  1e:   48 8d 75 00             lea    rsi,[rbp+0x0]  # source had just [rbp]
  22:   49 8d 75 00             lea    rsi,[r13+0x0]

The same thing is of course possible in other modes, using lea eax, [ecx] as a 2-byte instruction. (Or 4 bytes in 16-bit mode for 32-bit registers. Even if you only want 16-bit registers, 16-bit mode needs 32-bit address size for source registers other than BX, BP, SI and DI, because 16-bit addressing modes can only encode [bx|bp] + [si|di] combinations.)


push/pop is also an option, making it possible to copy a 64-bit register in only 2 bytes of machine code, instead of the usual 3.

Or if you don't care about the source register (actually move rather than copy), you can xchg, which has a shorter encoding when EAX is one of the two registers:

  30:   52                      push   rdx
  31:   58                      pop    rax
  32:   48 89 d0                mov    rax,rdx
  35:   48 8d 02                lea    rax,[rdx]
  38:   87 f1                   xchg   ecx,esi    # opcode + ModRM form
  3a:   91                      xchg   ecx,eax    # EAX special case

This is useful for code-golf.


Other silly computer tricks include imul and a double-shift suggestion from comments:

  • immediate imul by 1.

  • for 16-bit registers, SHLD or SHRD with count = 16 is possible. (x86 scalar-integer shifts mask the count with & 31 for any operand-size other than 64-bit, so this can't shift all the bits of a 32 or 64-bit full register, only a 16-bit partial register. And shld/shrd were new in 386, so 16-bit registers are always "partial".)

  40:   6b f1 01                imul   esi,ecx,0x1
  43:   66 0f a4 ce 10          shld   si,cx,0x10    # SI = CX.  CX unchanged.
  48:   66 0f ac ce 10          shrd   si,cx,0x10
  4d:   0f a4 ce 20             shld   esi,ecx,0x20  # nope, equivalent to a shift by 0
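To make the count-masking argument concrete, here is a minimal C sketch that models the double shift rather than running it. It is not part of the original answer: the function names are hypothetical, the sample values are arbitrary, and the 16-bit model is only meant for counts 0..16 (larger counts give undefined results on real hardware for 16-bit operands).

```c
#include <assert.h>
#include <stdint.h>

/* Model of `shld r16, r16, imm8` for counts 0..16 (hypothetical helper).
   The count is masked with & 31, then DEST:SRC is shifted left as one
   32-bit value and the high 16 bits are kept. */
uint16_t model_shld16(uint16_t dest, uint16_t src, unsigned count)
{
    count &= 31;
    uint32_t pair = ((uint32_t)dest << 16) | src;
    return (uint16_t)((pair << count) >> 16);
}

/* Model of `shld r32, r32, imm8`: same & 31 mask, so a count of 32 becomes 0. */
uint32_t model_shld32(uint32_t dest, uint32_t src, unsigned count)
{
    count &= 31;
    if (count == 0)
        return dest;                 /* shift by 0 leaves DEST unchanged */
    return (dest << count) | (src >> (32 - count));
}

/* Checks mirroring the disassembly comments above (sample values are arbitrary). */
void check_shld_models(void)
{
    assert(model_shld16(0x1234, 0xBEEF, 16) == 0xBEEF);               /* SI = CX */
    assert(model_shld32(0x11111111u, 0xCAFEF00Du, 32) == 0x11111111u); /* no copy */
}
```

Calling check_shld_models() from a main function exercises both cases: the 16-bit shift by 16 replaces the destination with the source, while the 32-bit "shift by 32" leaves the destination untouched.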

Or if you want to consider multiple instructions, you could xor-zero the destination and add, or, or xor into it like Yuri suggests. Zero is the identity element for +, | and ^.

Even less efficient would be to AND into all-ones, the identity element for &. (Setting the register with or eax, -1 has a false dependency on the old value of EAX, and because it isn't xor-zeroing it can't be eliminated (handled without an execution unit) the way Sandybridge-family CPUs eliminate xor-zeroing. It also has larger code size.)

  or   eax, -1
  and  eax, ecx
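The identity-element reasoning behind these two-instruction sequences can be checked with a small C model. This is only a sketch under stated assumptions: the helper names are hypothetical, registers are modelled as uint32_t values, and FLAGS are ignored (these instructions write FLAGS, which mov does not).

```c
#include <assert.h>
#include <stdint.h>

/* xor dst,dst ; add dst,src   -> 0 + src == src */
uint32_t copy_via_zero_add(uint32_t old_dst, uint32_t src)
{
    uint32_t dst = old_dst;
    dst ^= dst;          /* xor-zeroing: result is 0 whatever the old value was */
    dst += src;
    return dst;
}

/* xor dst,dst ; or dst,src    -> 0 | src == src */
uint32_t copy_via_zero_or(uint32_t old_dst, uint32_t src)
{
    uint32_t dst = old_dst;
    dst ^= dst;
    dst |= src;
    return dst;
}

/* xor dst,dst ; xor dst,src   -> 0 ^ src == src */
uint32_t copy_via_zero_xor(uint32_t old_dst, uint32_t src)
{
    uint32_t dst = old_dst;
    dst ^= dst;
    dst ^= src;
    return dst;
}

/* or dst,-1 ; and dst,src     -> 0xFFFFFFFF & src == src */
uint32_t copy_via_ones_and(uint32_t old_dst, uint32_t src)
{
    uint32_t dst = old_dst;
    dst |= 0xFFFFFFFFu;  /* all-ones, the identity element for & */
    dst &= src;
    return dst;
}

/* Every variant should return src regardless of the old destination value. */
void check_identity_copies(void)
{
    const uint32_t samples[] = {0u, 1u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu};
    const unsigned n = sizeof samples / sizeof samples[0];
    for (unsigned i = 0; i < n; i++) {
        for (unsigned j = 0; j < n; j++) {
            assert(copy_via_zero_add(samples[i], samples[j]) == samples[j]);
            assert(copy_via_zero_or(samples[i], samples[j]) == samples[j]);
            assert(copy_via_zero_xor(samples[i], samples[j]) == samples[j]);
            assert(copy_via_ones_and(samples[i], samples[j]) == samples[j]);
        }
    }
}
```

The model only shows that the results match a plain copy; it says nothing about the performance differences described above.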

(Godbolt compiler explorer NASM source and disassembly, for the disassembly in this answer)

Peter Cordes
1

First, zero the destination register with XOR EAX, EAX, then perform OR EAX, EBX.

Yuri Aps