0

I would like to write a small thunk which will call an underlying function and then compare the value of rsp before and after the call. Critically, this code shouldn't clobber any registers.

The obvious implementation is to simply push rsp before the call, and then compare after. The problem is that push rsp itself subtracts 8 from the stack, so you should actually compare the saved value of rsp with the value after the call plus 8.

Here's one way to do it:

foo_thunk:
push rsp              ; save the value of rsp
call foo              ; call the underlying function
add rsp, 8            ; adjust rsp by 8, popping the saved value
cmp rsp, [rsp - 8]    
jne bad_rsp           ; bad foo!
ret

The problem is this accesses a value [rsp - 8] which is above rsp - i.e., not on the stack but the nebulous region above the stack. This should be where you have a red-zone, but it isn't when you don't.

What are the alternatives? Performance and code-size is important.

BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
  • Does it have to be re-entrant? – harold Oct 24 '17 at 22:25
  • 2
    Also that breaks alignment requirement. You are of course not required to adjust `rsp` before the comparison so you don't need to access under the stack pointer. Worse problem is, there may be arguments on the stack that you need to handle. – Jester Oct 24 '17 at 22:26
  • @harold - yes it does. – BeeOnRope Oct 24 '17 at 22:28
  • 4
    PS: if `rsp` is destroyed chances are the `foo` will not return to you so I don't see much point in checking. XY problem? – Jester Oct 24 '17 at 22:28
  • @Jester - good point, fixed (I think). – BeeOnRope Oct 24 '17 at 22:28
  • @Jester - yes, most of the time you are right, and this check won't fire. Still I would like to know how to write it. – BeeOnRope Oct 24 '17 at 22:29
  • 1
    Jester is also correct about this throwing functions with enough parameters out of whack if they are spilled to the stack. – Michael Petch Oct 24 '17 at 22:31
  • 2
    You can allocate a separate stack for the values. – Jester Oct 24 '17 at 22:32
  • 1
    @Jester Separate space for the values would be my method. – Michael Petch Oct 24 '17 at 22:34
  • 1
    Another alternative: `sub rsp,8` `push verify_rsp_is_valid` `call foo_with_no_args_in_stack` `ret` `verify_rsp_is_valid: add rsp,8` `ret`. Will continue correctly only when `rsp` was valid, or the `foo` understands your trickery and jumps directly or leaves `verify_rsp_is_valid` address in stack on proper place. In case of bad `rsp` you will end randomly elsewhere. But that's the same thing as ordinary `foo` will do either way, so the practical usability looks to be close to zero, I would rather understand some canary system to check for stack overwriting, then checking the `rsp` itself. – Ped7g Oct 24 '17 at 22:34
  • 1
    Right @Ped7g - but this is separate from stack overwriting: it's checking that `rsp` is properly handled (but it's not going to work). Canary is checking for overwriting (and like this check also is best in the `callee` since it's not going to work in a thunk). – BeeOnRope Oct 24 '17 at 22:37
  • @Ped7g : That still would screw up any function call where there are enough parameters where they get spilled to the stack. – Michael Petch Oct 24 '17 at 22:37
  • @MichaelPetch sure.. that's why I called it `foo_with_no_args_in_stack` ... but overall it's simply very stupid alternative, thinking about it. Also even if it is transparent no-op for ordinary code, it's huge "welcome here" attack vector for any malicious code, so I would definitely not suggest to use something like that. As long as I am concerned, if the `call foo` will later reach the next instruction, it means the `rsp` is by 99.99% correct one, or you are under heavy attack. Another alternative is probably self-mod to store `rsp` as immediate for `cmp`, but `cmp rsp,imm64` is not valid?? – Ped7g Oct 24 '17 at 22:41
  • And self-mod in such near vicinity may have performance implications, so from performance perspective it's probably better to store/restore other registers, and spill the test code mechanism over those with separate stack for the rsp check values. (I'm more like thinking loudly, because I believe BeeOnRope is experienced enough to pick up only signal and ignore the noise) .. just as catch prog. errors even `cmp esp,imm32` may be enough! – Ped7g Oct 24 '17 at 22:43
  • 1
    Just to provide some motivation: this was a sub-question of [this one](https://stackoverflow.com/questions/46905229/writing-a-thunk-to-verify-sysv-abi-compliance) which is intended to be a fail-fast check for the C-asm interface (or asm-asm interface) in "debug" mode (i.e., it can be disabled with a compile-time variable). I've debugged too many heisenbugs where some callee-preserved register was clobbered (or worse: _sometimes_ clobbered) which broke (or not) the calling code in wonderful ways depending on compiler flags, etc. This idea of checking `rsp` in the thunk seems DOA though. – BeeOnRope Oct 24 '17 at 22:47
  • 2
    @Jester - actually I think the alignment of the original code was correct. As I understand it, the caller must have a 16B aligned stack at the moment of the call, which means it is misaligned again by 8B in the callee since the `call` pushes the return address. So the `sub rsp, 8` aligns it again. – BeeOnRope Oct 24 '17 at 23:07
  • @BeeOnRope yeah my bad, sorry! – Jester Oct 24 '17 at 23:08
  • 1
    @Jester - well your genius separate stacks idea is the special sauce I needed all along so in the end this apparently pointless question was well-worth it. – BeeOnRope Oct 24 '17 at 23:12

3 Answers3

0

I guess a simple solution is just to use sub and mov to adjust the stack, rather than push:

foo_thunk:
sub rsp, 8            ; 16 to maintain alignment
mov [rsp], rsp        ; save the value of rsp
call foo              ; call the underlying function
cmp rsp, [rsp]    
jne bad_rsp           ; bad foo!
add rsp, 8            ; pop the stack
ret 

Of course, if the callee foo messed up the stack pointer, it probably won't return to the calling thunk at all, so this check has limited usefulness. It would be better off in the callee as a check prior to ret.

The the original caller has put any arguments on the stack you are really screwed. Putting the check in the callee solves that too.

BeeOnRope
  • 60,350
  • 16
  • 207
  • 386
  • 1
    It *could* change `rsp`, still return, and also pass that test though – harold Oct 24 '17 at 22:34
  • @harold - you mean by also overwriting the saved value of `rsp` on the stack? Sure. To be clear my motivation here is to catch programming errors quickly, not as a mechanism to defend against malicious code. Of course, it still fails in that approach because a function that fails to maintain `rsp` properly is almost never going to return to the thunk as pointed out in the comments. – BeeOnRope Oct 24 '17 at 22:38
  • You'll want to change the sub and add to use 8 instead of 16 to match the conclusion reached in the comments to the question. – prl Oct 24 '17 at 23:42
  • 1
    @BeeOnRope: It's possible for `foo` to pop the return address, make a stack frame somewhere else, and `ret` there. But passing this test doesn't involve *overwriting* anything, it involves making a value at the new location point to itself. Your check is using the new `rsp`. This kind of mistake is very unlikely to happen by accident; I wouldn't worry about it. (Fun fact: gcc will sometimes copy a stack frame including return address, although it doesn't use it to return. Just to have a stack frame for backtraces or something I think. I forget why, maybe GNU C nested functions.) – Peter Cordes Oct 25 '17 at 02:28
0

I assumed from the original question that you wanted to adjust rsp by 8 before branching to bad_rsp when there is an error, which you can do with

foo_thunk:
    sub rsp, 8
    mov [rsp], rsp
    call foo
    cmp rsp, [rsp]
    lea rsp, [rsp+8]
    jne bad_rsp
    ret

(Of course, once you've determined rsp has not been preserved, there's probably no need to adjust it, but I think this approach is useful enough to be worth mentioning.)

prl
  • 11,716
  • 2
  • 13
  • 31
  • Also possible: `pop rsp` instead of `lea`, except that if RSP isn't pointing to your saved RSP anymore, you're screwed a different way. (And `pop rsp` is 3 uops on IvyBridge and later, slower than `lea`). – Peter Cordes Oct 25 '17 at 02:21
-2

Self mod alternative, checking only low 32b part of rsp:

foo_thunk:
mov [cs:check_esp_part+2],esp
call foo              ; call the underlying function
check_esp_part:
cmp esp,0x12345678
jne bad_rsp           ; bad foo!
ret 
Ped7g
  • 16,236
  • 3
  • 26
  • 63
  • 1
    In 64 bit mode you can just write the value into nearby memory and use `cmp rsp, [rip+x]`. Logically this is just a global variable (and due to non-writable code page probably a non working one ;)) and definitely not reentrant. – Jester Oct 24 '17 at 23:00
  • @Jester forgot about re-entrant constraint. The addressing + access are more like implementation detail ;), but reentrant makes this plain wrong and useless. – Ped7g Oct 24 '17 at 23:07
  • If you're doing this for hardening purposes, this requires a memory page with write+exec permissions (for self-modifying code). Normally you wouldn't need that. – Peter Cordes Oct 25 '17 at 02:11