
I was reading this article, "assembly-challenge-jump-to-a-non-relative-address-without-using-registers".

I need to do exactly what he suggests there (jump to a non-relative address without using registers), only I need to do it in Intel syntax instead of AT&T.

The solution he found for AT&T syntax was:

jmp *0f(%eip)
0: .int 0x12345678

What would this look like in Intel syntax?

Peter Cordes
ChrisMan
  • That's using a register: `%eip`. – Erik Eidt May 06 '22 at 03:07
  • Yeah, I'm just repeating how he labeled it. I'm mainly concerned with how to convert his code into Intel syntax. – ChrisMan May 06 '22 at 03:11
  • @ErikEidt: x86-64 RIP-relative addressing (with a strange address-size override to 32-bit) isn't really using a register, just a relative displacement from the instruction. The problem here is that the syntax for RIP-relative depends on which Intel-syntax assembler you use. e.g. NASM `jmp dword [rel foo]` I think. (PC-relative addressing isn't available in 32-bit mode) – Peter Cordes May 06 '22 at 04:00
  • Am I the only one who broke down and wrote `db 0x.., 0x..` to express some instructions because I could not get the assembler to do it? – Joshua May 06 '22 at 04:09
  • I guess my main concern is that I want to allocate some space in memory and overwrite the code at some address with an instruction that jumps to my allocated memory. I'd perform some instructions there, and then jump back to right after the overwritten location so the program continues where it left off. – ChrisMan May 06 '22 at 14:34
  • So I checked the code at the link. The author is proposing your `jmp *0f(%eip)` for use in **32-bit mode**. But it won't work as there is no EIP-relative addressing available in 32-bit mode. The disassembly shows that what the assembler actually gave them is just `jmp *0x0`, i.e. with the jump target to be loaded from absolute address 0, which will of course crash as a null dereference. (There might be a relocation asking the linker to fill in the absolute address of the `0f` label, but that won't help if you need position-independent code.) – Nate Eldredge May 07 '22 at 12:58
  • So as far as 32-bit mode is concerned, the author is just confused. For 64-bit mode, their proposed `jmp *0f(%rip) ; 0: .quad 0x1234567890` is fine, but note that it references `rip` instead of `eip` (as it should, a 32-bit address size is possible but almost certainly not what you want) and uses `.quad` to assemble a full 64-bit address. – Nate Eldredge May 07 '22 at 13:00
  • @NateEldredge: Weird, they show some version of clang actually assembling `jmp *0f(%eip)`. Must be an old buggy version. I wondered if it was just rewriting the addressing mode into absolute, and using a placeholder (since disassembly didn't use `objdump -d -r` to show a comment), but probably it's fully buggy and just used 32-bit absolute with an address of zero. Changing it to `jmp *0f` so it assembles gives us `ff 25 06 00 00 00 jmp *0x6 2: R_386_32` - a placeholder absolute address of `6` not `0` even in the un-linked `.o`, so unless older clang also differed on that... – Peter Cordes May 07 '22 at 13:09
  • @PeterCordes: Yes, looks like it was a bug in clang 6 and earlier: https://godbolt.org/z/xW13hEsT8. And the readelf output doesn't show any reloc for that instruction, so I guess it really did just try to load from absolute address 0. Actually, if you put some other junk between the jump and the label, it looks like it took the displacement of the label relative to the start of the next instruction, and used it as an absolute address. So yeah, I guess the author of the blog just saw that it assembled and called it good, without actually trying to run it. – Nate Eldredge May 07 '22 at 17:58
  • @NateEldredge: Ok, so exactly what we'd expect from an assembler that thought 32-bit machine code worked the same as 64-bit, using the no-SIB no-register form as if it was E/RIP-relative (to the end of this instruction), but actually decodes as 32-bit absolute. – Peter Cordes May 07 '22 at 18:41

2 Answers


The blog actually suggests jmp *0f(%eip) for use in 32-bit mode. But that is wrong; there is no EIP-relative addressing available in 32-bit mode, so this is not valid 32-bit assembly. It looks like clang 6.0 and earlier had a bug where it would accept jmp *0f(%eip) anyway, but the output shows that the instruction it actually assembled was jmp *0, i.e. trying to load the jump target from absolute address 0 (not *0f, the address of the local label where you put some data). This won't work and will simply crash, assuming page 0 is unmapped as would be the case under a normal OS.

(More generally, it appears the bug would cause jmp label(%eip) to take the displacement of label from the next instruction and use it as an absolute address, which would never be useful. That is, it encodes the instruction as if EIP-relative addressing worked in 32-bit mode. In 64-bit mode the same machine code would use those 4 bytes as a rel32 relative displacement instead of a disp32 absolute address, but x86-64 couldn't change how 32-bit machine code worked while maintaining backwards compatibility.) So the author is mistaken about this, and must not have actually tested their proposed code.
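
To make the difference concrete, here is a small Python sketch (an illustration only, not a real decoder) of how the same six bytes `ff 25 disp32` pick the memory location to load the jump target from in each mode. The helper name `load_address` and the instruction address `0x401000` are made up for the example.

```python
import struct

def load_address(code: bytes, insn_addr: int, mode_bits: int) -> int:
    """Return the address that `ff 25 disp32` (jmp [mem]) loads its target from.

    Hypothetical helper: it only understands this single encoding.
    """
    assert code[:2] == b"\xff\x25", "expects FF /4 with ModRM byte 0x25"
    (disp32,) = struct.unpack_from("<i", code, 2)  # signed little-endian disp32
    if mode_bits == 64:
        next_insn = insn_addr + 6                  # RIP = end of this 6-byte instruction
        return (next_insn + disp32) & 0xFFFFFFFFFFFFFFFF
    # 32-bit mode: identical bytes, but disp32 is an absolute address
    return disp32 & 0xFFFFFFFF

code = bytes.fromhex("ff2500000000")               # displacement of 0, as in the buggy output
print(hex(load_address(code, 0x401000, 64)))       # 0x401006: the data right after the jmp
print(hex(load_address(code, 0x401000, 32)))       # 0x0: absolute address 0
```

In 64-bit mode the load lands on the bytes immediately after the instruction, which is what the label is for; in 32-bit mode it lands on address 0.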


You tagged this so I assume you are actually interested in 64-bit mode. In that case the blog's suggestion of

    jmp *0f(%rip)
0:
    .quad 0x1234567890

is valid. Note the use of the 64-bit program counter rip and the use of .quad to get a 64-bit address. (Using eip here actually is a valid instruction, corresponding to a 0x67 address size override, but it would cause the load address to be truncated to 32 bits, which is unlikely to be desired.)
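
If the goal is to write this sequence into memory as raw bytes, a rough Python sketch of the 14-byte encoding might look like the following. The function name `jmp_abs64` and its `addr32_prefix` flag are hypothetical, chosen only for this illustration; the bytes themselves are the standard `FF /4` encoding with a zero displacement, followed by the 8-byte target.

```python
import struct

def jmp_abs64(target: int, addr32_prefix: bool = False) -> bytes:
    """Encode `jmp [rip+0]` followed by the 8-byte absolute target it loads."""
    insn = b"\xff\x25" + struct.pack("<i", 0)  # FF /4, ModRM 0x25, disp32 = 0
    if addr32_prefix:
        insn = b"\x67" + insn                  # address-size override: the [eip+0] form
    return insn + struct.pack("<Q", target)    # same role as `.quad target`

stub = jmp_abs64(0x1234567890)
print(stub.hex())       # ff25000000009078563412000000
print(len(stub))        # 14
```

Passing `addr32_prefix=True` only prepends the 0x67 byte; as noted above, that form truncates the load address to 32 bits and is rarely what you want.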

The Intel syntax for RIP-relative addressing varies between assemblers. In NASM you would write:

    jmp [rel label]
label:
    dq 0x1234567890

Other assemblers might want some variation, or might assemble jmp [label] as RIP-relative by default. Check the manual of the assembler you want to use.


If you really did want to accomplish this in 32-bit mode, it would be harder. If you know the correct selector for the code segment, which on many operating systems is a fixed value, you could do a direct far jump and encode the 6-byte segment/offset directly into the instruction. Otherwise, I don't immediately see a way to do this without using either registers or the stack.
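
As a sketch of what that direct far jump looks like in bytes (opcode `EA`, then the 4-byte offset, then the 2-byte selector), assuming 32-bit mode. The helper name `far_jmp32` is hypothetical, and the selector `0x23` is only a placeholder: the real code-segment selector depends on the OS and must be checked there.

```python
import struct

def far_jmp32(offset: int, selector: int) -> bytes:
    """Encode a 32-bit-mode direct far jump: EA, offset32, selector16."""
    return b"\xea" + struct.pack("<IH", offset, selector)

# 0x23 is a placeholder selector for illustration, not a value to rely on.
code = far_jmp32(0x12345678, 0x23)
print(code.hex())   # ea785634122300
```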

Of course, it is easy using the stack, temporarily modifying ESP:

push $0xdeadbeef   # push the constant
ret                # pop it into EIP

The blog you linked got that one wrong, too, writing push 0xdeadbeef which is a memory source operand, loading 4 bytes from that absolute address.
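
For reference, the machine code of the correct `push $0xdeadbeef` / `ret` pair in 32-bit mode can be sketched in Python as below. The helper name `push_ret` is made up for this example; the opcodes are `68` (push imm32) and `C3` (near ret).

```python
import struct

def push_ret(target: int) -> bytes:
    """Encode `push imm32; ret` for 32-bit mode: 68 imm32, then C3."""
    return b"\x68" + struct.pack("<I", target) + b"\xc3"

code = push_ret(0xDEADBEEF)
print(code.hex())   # 68efbeaddec3
```

Note that in 64-bit mode `push imm32` sign-extends its operand to 64 bits, so this exact sequence would not reach 0xdeadbeef there.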

The next example is also broken, using mov %eax,0xdeadbeef (store EAX to an absolute address), then jmp %eax (which GAS assembles as jmp *%eax, warning you about the missing * for an indirect jump).

Seems they're used to Intel syntax; .intel_syntax noprefix would avoid having to translate to AT&T. The blog cites an SO question they asked where the same examples appear. @fuz's answer there does correct the AT&T syntax.

Peter Cordes
Nate Eldredge
  • The GAS `.intel_syntax noprefix` version would be `jmp [RIP + 0f]` if you still wanted to use `0:` as your label name. [How do RIP-relative variable references like "\[RIP + \_a\]" in x86-64 GAS Intel-syntax work?](https://stackoverflow.com/q/54745872) – Peter Cordes May 07 '22 at 18:47

OK, I'll show the general approach for answering this kind of question yourself.

Created a file with the following contents:

.text
        jmp *0f(%eip)
0:      .int 0x12345678

Assembled it and checked the disassembly, which shows the same instruction (note that your .int data is decoded as instructions):

$ gcc -c so_72135694.S 
$ objdump -d so_72135694.o

so_72135694.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   67 ff 25 00 00 00 00    jmpq   *0x0(%eip)        # 0x7
   7:   78 56                   js     0x5f
   9:   34 12                   xor    $0x12,%al

(Why gcc and not as directly? Well, I was too lazy to recall the as options.)

Then I disassembled it again in Intel syntax:

$ objdump -d -Mintel-syntax so_72135694.o

so_72135694.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   67 ff 25 00 00 00 00    jmp    QWORD PTR [eip+0x0]        # 0x7
   7:   78 56                   js     0x5f
   9:   34 12                   xor    al,0x12

Now let's assemble that Intel-syntax form and compare the result:

$ cat so_72135694.intel.S
.intel_syntax noprefix
.text
        jmp QWORD PTR [eip+0x0]
0:      .int 0x12345678
$ gcc -c so_72135694.intel.S 
$ objdump -d so_72135694.intel.o

so_72135694.intel.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   67 ff 25 00 00 00 00    jmpq   *0x0(%eip)        # 0x7
   7:   78 56                   js     0x5f
   9:   34 12                   xor    $0x12,%al
$ objdump -d -Mintel-syntax so_72135694.intel.o

so_72135694.intel.o:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <.text>:
   0:   67 ff 25 00 00 00 00    jmp    QWORD PTR [eip+0x0]        # 0x7
   7:   78 56                   js     0x5f
   9:   34 12                   xor    al,0x12

One can easily see they are identical, and you can follow this method for all similar questions.

NB1: It is crucial to note that the GNU Binutils interpretation of "Intel syntax" differs in subtle details from what Intel itself uses (even in syntax basics, like 0x1234 vs. 1234h), and from widely used tools like NASM or FASM. Since you mention AT&T syntax, I assume you are using the most typical toolchain, GNU Binutils (my system here is Ubuntu 20.04/x86-64, one of the most popular setups). If I'm wrong, feel free to explore the specifics of other tools.

NB2: The really confusing thing in your code was the EIP-relative addressing. Instruction-pointer-relative addressing is available only in 64-bit mode, and there using EIP instead of RIP is weird. An attempt to assemble this in 32-bit mode (using e.g. .code32) naturally fails.

Netch
  • Note that the address `0x12345678` ought to be 64 bits. – Nate Eldredge May 07 '22 at 12:46
  • An attempt to actually do a jmp with a 32-bit memory-indirect operand failed: `jmp dword ptr [RIP + 0f]` assembles to `66 ff 2d 00 00 00 00 ljmpw *0x0(%rip)` - a 16-bit operand-size `far` jump, loading a new CS:IP! NASM refuses to assemble `jmp dword [rel foo]`. Which makes sense, as https://www.felixcloutier.com/x86/jmp shows, `jmp r/m32` is not supported in 64-bit mode. – Peter Cordes May 07 '22 at 12:47
  • Also, the use of `%eip` leading to an address-size override is probably a mistake by the OP and should have just been `%rip`. – Nate Eldredge May 07 '22 at 12:49
  • @NateEldredge: Yes, it ought to be a 64-bit address, but the code in the question seems to be an attempt to keep it 32-bit (which isn't possible); I wonder if that's why they used EIP, thinking that address-size would imply memory operand-size? Obviously `pushq imm32` / `ret` would be more compact, but even lower performance by throwing off return address prediction. – Peter Cordes May 07 '22 at 12:49
  • @PeterCordes: I looked again at the link and added some comments on the question. The author suggested `jmp 0f(%eip)` for use in **32-bit mode** which is simply wrong. For 64-bit mode they use `jmp 0f(%rip)` with a 64-bit address and everything makes sense. – Nate Eldredge May 07 '22 at 13:02