I'm debugging a crash on Linux and am going through the disassembly of the function `__cxa_finalize`.

The crash happens on an instruction that appears unreachable:

cmp    %edx,%esi             // f >= &funcs->fns[0]
jae    0xb6e17b88            // enter for loop
jmp    0xb6e17c08            // exit for loop
lea    0x0(%esi,%eiz,1),%esi // crashes here - how do we even get here since there is a 
                             // jmp above us, and nothing jumps to here
cmp    %edi,0xc(%esi)        // d == f->func.cxa.dso_handle (jumped to from below)

Later in this function, there is another `lea 0x0(%esi,%eiz,1),%esi` instruction that comes after a `jmp` and a `nop` and also appears unreachable. In that case, the `jmp` is also exiting a for loop.

Is there some convention at work here that inserts unreachable instructions?

Edit: It turns out the crash was not on the `lea` instruction but on the `cmp` instruction after it, when it accesses memory through `%esi`.

  • It may just be garbage your debugger is disassembling to readable instructions. This can happen for a wide variety of reasons. – Captain Obvlious May 25 '16 at 22:47
  • It would help to know the addresses of this code. Do you know, for example, that the instruction `jae 0xb6e17b88` isn't just a jump over the `jmp` instruction to the `lea` instruction? – davidbak May 25 '16 at 22:47
  • It may be the target of some other jump too. The `lea` is just padding for alignment, by the way, so that may indeed not be reached. – Jester May 25 '16 at 22:48
  • My personal favourite is a smashed stack that results in a return to the wrong place in the code. Hilarity results. It's not quite Bill Murray funny, but it's way up there. – user4581301 May 25 '16 at 22:52
  • @davidbak, yes, I know that. I'm showing simplified output in my question, but in my full output each line has an absolute and a relative address next to it, and all the jumps show the relative address as well, so it's really easy to see where all the jumps in the function are going. – default May 25 '16 at 22:55

1 Answer

I found the answer here:

Sometimes GCC inserts NOP instructions into the code stream to ensure proper alignment and stuff like that. The NOP instruction takes one byte, so you would think that you could just add as many as needed. But according to Ian Lance Taylor, it’s faster for the chip to execute one long instruction than many short instructions. So rather than inserting seven NOP instructions, they instead use one bizarro LEA, which uses up seven bytes and is semantically equivalent to a NOP.
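
To make the "seven bytes, semantically a NOP" claim concrete, here is a minimal Python sketch that decodes the byte sequence `8d b4 26 00 00 00 00`. That sequence is an assumption on my part: it is the encoding the GNU assembler commonly uses for this 7-byte padding in 32-bit code, but the question does not show the raw bytes. The names `PAD`, `decode_lea`, and `is_nop` are made up for illustration, and the decoder handles only this one addressing form (opcode, ModRM, SIB, 32-bit displacement).

```python
# Assumed encoding of `lea 0x0(%esi,%eiz,1),%esi` in 32-bit mode (not taken from the question).
PAD = bytes([0x8D, 0xB4, 0x26, 0x00, 0x00, 0x00, 0x00])

# 32-bit general-purpose registers in x86 encoding order.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]


def decode_lea(code):
    """Decode a 7-byte LEA laid out as opcode, ModRM, SIB, disp32."""
    if len(code) != 7 or code[0] != 0x8D:
        raise ValueError("not a 7-byte LEA")
    modrm, sib = code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    # mod == 0b10 means a 32-bit displacement; rm == 0b100 means a SIB byte follows.
    if mod != 0b10 or rm != 0b100:
        raise ValueError("expected disp32 + SIB addressing")
    scale, index, base = 1 << (sib >> 6), (sib >> 3) & 7, sib & 7
    disp = int.from_bytes(code[3:7], "little", signed=True)
    return {
        "dest": REG32[reg],
        "base": REG32[base],
        # Index field 0b100 means "no index"; disassemblers print it as %eiz.
        "index": None if index == 0b100 else REG32[index],
        "scale": scale,
        "disp": disp,
    }


def is_nop(fields):
    """True when the LEA computes dest = dest + 0, i.e. changes nothing."""
    return (
        fields["dest"] == fields["base"]
        and fields["index"] is None
        and fields["disp"] == 0
    )


if __name__ == "__main__":
    fields = decode_lea(PAD)
    print(fields, "nop" if is_nop(fields) else "not a nop")
```

The decoded fields come out as destination `esi`, base `esi`, no index, displacement 0, so the instruction computes `esi = esi + 0`. Since `lea` neither touches memory nor modifies flags, it has no architectural effect other than occupying seven bytes, which is also why it could not have been the faulting instruction.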

  • Geez, if the word "bizarro" wasn't in the quote, I would have made it this comment. – davidbak May 25 '16 at 22:57
  • If the alignment filler code is supposed to be unreachable anyway, is there any advantage in filling with one bizarro LEA instruction rather than a series of NOP instructions? – Ian Abbott May 25 '16 at 23:42
  • @IanAbbott: It *probably* doesn't matter. It might still get decoded in the same cycle as the jmp that makes it unreachable. IDK if Intel SnB-family CPUs would allocate a new uop cache-line to hold those decoded uops, in case they're needed later. (It doesn't know until after it decodes them that they're `nop`s, and in the non-`nop` case it's probable that something will jump there soon.) So it might possibly save a cycle decoding, or some space in the uop cache. – Peter Cordes May 26 '16 at 02:57
  • @IanAbbott: Fun fact: the static prediction for an unconditional indirect jump is the next address. If that's not one of the possible branch targets, it does actually help to follow it with a `ud2` to stop speculative execution when a better branch-target prediction isn't available. Or, if you're making a jump table, try to put the common case as the address right after the `jmp rax`. – Peter Cordes May 26 '16 at 02:59