I want to copy the value at a certain address in memory to a register using AT&T style assembly. I know this shouldn't be hard, and I think in Intel style it's something like:

mov rdi, [0xdeadbeef]

But I don't know much about the AT&T style (or assembly in general). I searched for it, but none of the mov examples I found covered this case.

So can anyone tell me what that instruction looks like?

Also, where can I find a complete list of x86_64 assembly instructions in AT&T style?

phuclv
Gnijuohz
  • Hexadecimals start with a zero `0`, not `o`, and there should be a comma between parameters – phuclv Oct 21 '13 at 04:53
  • @LưuVĩnhPhúc thanks, I corrected the mistake. – Gnijuohz Oct 21 '13 at 04:57
  • Asking for a list is off-topic here, but here it is: [Is there a complete x86 assembly language reference that uses AT&T syntax?](https://stackoverflow.com/q/1776570/995714) – phuclv Jul 04 '18 at 03:43
  • AT&T just means put the operands in backward order. There is more to assembly languages than that: the assembler, the tool that reads the source, dictates the language syntax, and the x86 assemblers have a lot of incompatible syntaxes, with Intel vs AT&T being a tiny subset of that. – old_timer Apr 24 '19 at 17:05

2 Answers

To copy the value at a certain address in memory to a register in 32-bit mode, we use:

mov edi, [0xdeadbeef]  ; Intel
movl 0xdeadbeef, %edi  # AT&T

In AT&T syntax, any literal that is not prefixed by $ is a memory address; with the $ prefix it is an immediate value.

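But in x86-64, an ordinary memory operand cannot carry a 64-bit absolute address: its displacement is only 32 bits and is sign-extended, so 0xdeadbeef (which is above the low 2 GiB) cannot be encoded that way and you can't use movq 0xdeadbeef, %rdi like above. The only instruction that takes a 64-bit immediate or a 64-bit absolute address is mov (movabs in GAS), which can load a 64-bit constant into any register, or move the value at a 64-bit absolute address into the accumulator (AL/AX/EAX/RAX, called Areg below):

mov rax, [qword 0xdeadbeef]  ; Intel (NASM needs the qword to select the 64-bit address form)
movabs 0xdeadbeef, %rax      # AT&T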

If you really need to move the value from a 64-bit absolute address to a register other than Areg, you must use indirect addressing instead:

mov rdi, 0xdeadbeef     ; Intel
mov rdi, [rdi]

movq $0xdeadbeef, %rdi  # AT&T
movq (%rdi), %rdi

Or, if you want the value to be copied to both rax and rdi:

mov rax, [qword 0xdeadbeef]  ; Intel (NASM)
mov rdi, rax

movabs 0xdeadbeef, %rax      # AT&T
movq %rax, %rdi

Here the q suffix means a quadword (64-bit) operand. From the GAS manual:

In AT&T syntax the size of memory operands is determined from the last character of the instruction mnemonic. Mnemonic suffixes of b, w, l and q specify byte (8-bit), word (16-bit), long (32-bit) and quadruple word (64-bit) memory references. Intel syntax accomplishes this by prefixing memory operands (not the instruction mnemonics) with byte ptr, word ptr, dword ptr and qword ptr. Thus, Intel mov al, byte ptr foo is movb foo, %al in AT&T syntax.

In 64-bit code, movabs can be used to encode the mov instruction with the 64-bit displacement or immediate operand.

https://sourceware.org/binutils/docs/as/i386_002dVariations.html
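
The suffix rule quoted above can be written down as a small lookup table. This is only an illustrative sketch in Python: the names SUFFIXES and size_for_suffix are made up for the example and are not part of GAS or any other tool.

```python
# AT&T mnemonic suffix -> (operand size in bits, Intel-syntax size keyword),
# taken from the GAS manual passage quoted above.
SUFFIXES = {
    "b": (8, "byte ptr"),
    "w": (16, "word ptr"),
    "l": (32, "dword ptr"),
    "q": (64, "qword ptr"),
}


def size_for_suffix(suffix: str) -> tuple[int, str]:
    """Return (bits, Intel keyword) for one AT&T size suffix (hypothetical helper)."""
    if suffix not in SUFFIXES:
        raise ValueError(f"not an AT&T size suffix: {suffix!r}")
    return SUFFIXES[suffix]


if __name__ == "__main__":
    for suffix in "bwlq":
        bits, intel = size_for_suffix(suffix)
        print(f"mov{suffix}: {bits}-bit operand, Intel '{intel}'")
```

The helper takes the suffix letter on its own rather than a whole mnemonic, because guessing the suffix from the last character would misread unsuffixed mnemonics such as call or shl.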

More information about the 64-bit mov forms is in Difference between movq and movabsq in x86-64. Note that the ordinary 32-bit absolute form sign-extends the address, so it cannot reach an address like 0xdeadbeef, which fits in 32 bits but has its top bit set. For such an address you need movabs Areg, moffs64, the indirect form shown above, or a 32-bit address-size (addr32) prefix.

phuclv
  • Should it be mov 0xdeadbeef, %rdi? Because in AT&T the destination should be the second one. So $0xdeadbeef copies the value and 0xdeadbeef copies the address? – Gnijuohz Oct 21 '13 at 04:44
  • no, Intel style `mov [0xdeadbeef], rdi` would be reversed in AT&T syntax, which is like the one I wrote above – phuclv Oct 21 '13 at 04:46
  • ok, my intel style instruction was wrong... Based on what I asked for, which is 'move value from memory to register' it should be the way I commented. – Gnijuohz Oct 21 '13 at 04:56
  • Ahhh, your Intel style example is wrong. To copy the value from an address to a register the instruction should be `mov rdi, [0xdeadbeef]`, which is `movq 0xdeadbeef, %rdi` in AT&T – phuclv Oct 21 '13 at 04:57
  • Now I'm confused... Could you explain the following three instructions? 1. `mov $5, %rdi` 2. `mov $0xdeadbeef, %rdi` 3. `mov 0xdeadbeef, %rdi`. The first one moves the value 5 to rdi. Of the second and the third, which one moves the value at the address, and which one moves the address 0xdeadbeef itself? I thought the second one was moving the value because of the `$` sign... – Gnijuohz Oct 21 '13 at 05:04
  • I've said that any number that is not prefixed by $ is an address. So the first 2 instructions will move the value (5 and 0xdeadbeef) to rdi. Only the last one moves data from the address 0xdeadbeef to rdi. Look at the answer [here](http://stackoverflow.com/a/18998302/995714) – phuclv Oct 21 '13 at 05:06
  • Sorry but actually your answer gives an error `suffix or operands invalid for 'mov'` so I have to unchoose it as the answer... – Gnijuohz Oct 21 '13 at 22:26
  • That's because you're compiling in the wrong mode. You must specify "-m64" or "-mx32" to use 64-bit registers and "q" suffix. Look at [Differences between NASM, MASM, and GAS](http://cs.lmu.edu/~ray/notes/x86assembly/) here, it also states that to move contents of address 10 into register ecx, use `mov ecx, [10]` or `movl 10, %ecx` – phuclv Oct 22 '13 at 01:44
  • I understand why. [32-bit absolute address is not supported on x86_64](http://stackoverflow.com/questions/6577482/assembler-error-mach-o-64-bit-does-not-support-absolute-32-bit-addresses). Edited my answer – phuclv Oct 22 '13 at 04:27
  • I think you read the question title backwards. It *is* asking for mem -> reg, like its Intel-syntax instruction is doing. – Peter Cordes Jul 04 '18 at 05:57
  • @PeterCordes originally the OP had written `mov [0xdeadbeef], rdi` which was reg -> mem – phuclv Jul 04 '18 at 05:58
  • Ok, but your AT&T syntax `mov %rdi, 0xdeadbeef` is a store. – Peter Cordes Jul 04 '18 at 06:15
  • @PeterCordes that was the answer to the old question. I've updated to make it clearer – phuclv Jul 04 '18 at 06:19
  • The title was always mem to reg. Preserving that historical detail as the very first thing in your current answer to the fixed question really brings it down, IMO. At least move it to a PS section or something. – Peter Cordes Jul 04 '18 at 06:53
  • BTW, 32-bit absolute addressing is allowed for any instruction: NASM `a32 mov rdi, [abs 0xdeadbeef]` is encodeable because the address-size prefix truncates to 32-bit with *zero*-extension back to 64, unlike normal ModRM `[abs disp32]` encoding which *sign*-extends. That's the problem here with `0xdeadbeef`, which is outside the low 2G but still in the low 4G. Objdump disassembly is `67 48 8b 3c 25 ef be ad de mov 0xdeadbeef(,%eiz,1),%rdi` (because 32-bit absolute uses a SIB with no index I guess) – Peter Cordes Apr 04 '22 at 07:25
  • In AT&T syntax, `addr32 mov 0xdeadbeef, %rdi`. – Peter Cordes Apr 04 '22 at 07:27
  • `mov rax, [0xdeadbeef]` isn't correct NASM syntax for movabs. Unfortunately you need `mov rax, [qword 0xdeadbeef]`, otherwise it just warns about `dword data exceeds bounds` and encodes a disp32 load from `0xffffffffdeadbeef` – Peter Cordes Apr 04 '22 at 07:50

Normally mov rdi, [0x123456] is fine; in AT&T syntax that is mov 0x123456, %rdi.

In this special case, your address 0xdeadbeef is outside the low 2GiB, so you can't use a normal 32-bit absolute address (the disp32 would be sign-extended). But it's within the low 4GiB, so you can use a 32-bit address-size override to get a 32-bit zero-extended address, instead of needing movabs with a full 64-bit absolute address (moffs) or moving an imm64 into a register to set up for mov (%rdi), %rdi.
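
To make the 2GiB vs. 4GiB distinction concrete, here is a minimal Python sketch of the two address calculations. It models only the arithmetic, not an assembler or a CPU, and the function names are made up for illustration.

```python
def sign_extend_disp32(disp32: int) -> int:
    """Address reached by a plain [disp32] operand in 64-bit mode (sign-extended)."""
    disp32 &= 0xFFFFFFFF
    if disp32 & 0x80000000:  # bit 31 set: the upper 32 bits become all ones
        return disp32 | 0xFFFFFFFF00000000
    return disp32


def zero_extend_addr32(disp32: int) -> int:
    """Address reached when an address-size prefix truncates the address to 32 bits."""
    return disp32 & 0xFFFFFFFF


if __name__ == "__main__":
    for addr in (0x123456, 0xDEADBEEF):
        print(f"{addr:#x}: disp32 -> {sign_extend_disp32(addr):#018x}, "
              f"addr32 -> {zero_extend_addr32(addr):#018x}")
```

For 0x123456 both calculations agree, which is why the normal form works there. For 0xdeadbeef the sign-extended result is 0xffffffffdeadbeef, the same address that shows up in the NASM disassembly at the end of this answer when the qword is omitted.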

NASM syntax:

a32 mov rdi, [a32 abs 0xdeadbeef]

GAS AT&T syntax:

addr32 mov 0xdeadbeef, %rdi

Both assemble to the same machine code, which objdump disassembles as:

67 48 8b 3c 25 ef be ad de      mov    0xdeadbeef(,%eiz,1),%rdi

32-bit absolute [disp32] uses a SIB with no index (the longer of the two redundant encodings in 32-bit machine code for a [disp32] absolute addressing mode), so that's probably why it disassembles that way. The shorter of the two encodings was repurposed for x86-64 to be [RIP+rel32].
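
If you want to see where the SIB byte comes from, the nine bytes above can be split into their fields by hand. The following Python sketch does that for this one instruction layout only (prefix, REX, opcode, ModRM, SIB, disp32); decode_addr32_load is a made-up helper, not a general disassembler.

```python
def decode_addr32_load(code: bytes) -> dict:
    """Split the 9-byte addr32 load from the listing into its encoding fields."""
    prefix, rex, opcode, modrm, sib = code[:5]
    return {
        "addr_size_prefix": prefix == 0x67,  # 0x67 = address-size override
        "rex_w": bool(rex & 0x08),           # REX.W selects 64-bit operand size
        "opcode": opcode,                    # 0x8b = mov reg, r/m
        "mod": modrm >> 6,
        "reg": (modrm >> 3) & 7,
        "rm": modrm & 7,
        "scale": sib >> 6,
        "index": (sib >> 3) & 7,
        "base": sib & 7,
        "disp32": int.from_bytes(code[5:9], "little"),
    }


if __name__ == "__main__":
    fields = decode_addr32_load(bytes.fromhex("67 48 8b 3c 25 ef be ad de"))
    for name, value in fields.items():
        print(f"{name:>16}: {value:#x}" if isinstance(value, int) and not isinstance(value, bool)
              else f"{name:>16}: {value}")
```

Reading the result: reg = 7 is rdi (REX.R is clear), rm = 4 means a SIB byte follows, index = 4 means no index register, and base = 5 with mod = 0 means a bare disp32 with no base register. The disp32 field is stored little-endian, which is why the bytes read ef be ad de.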

An address-size prefix costs 1 extra byte, but does execute efficiently on existing CPUs. It does not cause an LCP stall on Intel CPUs unless you use it on movabs, because the length of the rest of the instruction is the same with or without it. (Unlike in 32-bit mode where it overrides the interpretation of disp32 to be disp16, and ModRM to be 16-bit style with no optional SIB).


The other option is mov $imm32, %r32 (5 bytes) to get the address zero-extended that way. This is 2 separate instructions but actually smaller machine code size: 8 total bytes vs. 9 for mov with an absolute 32-bit address. It will still decode to 2 uops, so it's less efficient than the single-instruction load.

  401009:       bf ef be ad de          mov    $0xdeadbeef,%edi
  40100e:       48 8b 3f                mov    (%rdi),%rdi

Alternatives in NASM syntax for full 64-bit addresses, as in "Load from a 64-bit address into other register than rax":

  mov rsi, 0x000000efdeadbeef          ; address into register
  mov rsi, [rsi]

  mov rax, [qword 0x00000000deadbeef]  ; moffs64 load into RAX, then copy
  mov rdi, rax

AT&T Disassembly:

  401011:       48 be ef be ad de ef 00 00 00   movabs $0xefdeadbeef,%rsi
  40101b:       48 8b 36                mov    (%rsi),%rsi

  40101e:       48 a1 ef be ad de 00 00 00 00   movabs 0xdeadbeef,%rax
  401028:       48 89 c7                mov    %rax,%rdi
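
As a quick check on the code-size comparison, the byte counts can be tallied straight from the hex dumps shown in this answer. This Python sketch only counts bytes from those listings; the dictionary labels are informal descriptions, and nothing here measures performance.

```python
# Hex bytes copied from the objdump listings above, one string per instruction.
LISTINGS = {
    "addr32 mov 0xdeadbeef, %rdi": ["67 48 8b 3c 25 ef be ad de"],
    "mov $imm32, %edi ; mov (%rdi), %rdi": ["bf ef be ad de", "48 8b 3f"],
    "movabs $imm64, %rsi ; mov (%rsi), %rsi": ["48 be ef be ad de ef 00 00 00", "48 8b 36"],
    "movabs moffs64, %rax ; mov %rax, %rdi": ["48 a1 ef be ad de 00 00 00 00", "48 89 c7"],
}


def total_size(hex_lines: list[str]) -> int:
    """Total machine-code size in bytes of a sequence of instructions."""
    return sum(len(bytes.fromhex(line)) for line in hex_lines)


SIZES = {name: total_size(lines) for name, lines in LISTINGS.items()}

if __name__ == "__main__":
    for name, size in SIZES.items():
        print(f"{size:2d} bytes  {name}")
```

The tally gives 9 bytes for the single addr32 load, 8 for the mov imm32 pair, and 13 for each of the two movabs sequences, so the full 64-bit forms are only worth it when the address really does not fit in 32 bits.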

If you omit the qword in [qword 0xdeadbeef], NASM prints "warning: dword data exceeds bounds" and emits:

# without forcing qword address encoding for NASM, it truncates to a disp32
48 8b 04 25 ef be ad de         mov    rax,QWORD PTR ds:0xffffffffdeadbeef
Peter Cordes