
In terms of x86 assembly code anyways. I've been reading about function calls, but still can't fully grasp the need for a base / frame pointer (EBP) along with a stack pointer (ESP).

When we call a function, the current value of EBP will be placed on the stack and then EBP gets the current ESP value.

Placeholders for the return value, function arguments and local variables of the function will then be placed on the stack, and the stack pointer ESP will decrease (or increase) to point just past the last placeholder placed on the stack.

Now we have the EBP pointing to the beginning of the current stack frame, and ESP pointing to the end of the stack frame.

EBP will be used to access the arguments and local variables of the function via constant offsets from EBP. That is fine. What I don't understand is why ESP can't be used to access these variables as well, using its own offsets. EBP points to the beginning of the stack frame, and ESP points to the end of the stack frame. What's the difference?

The ESP shouldn't change from once there has been a placeholder for all the local variables etc. or should it?

Engineer999
  • 3
    generally you don't NEED it. For many architectures you can just live with a stack pointer and no frame pointer. There are various debug or other reasons that some folks might want one even though it burns a register (on some architectures the frame pointer is not a GPR) and costs extra instructions. – old_timer Oct 03 '19 at 20:16
  • @old_timer Is there an example where the ESP can change in an unpredictable way as the function is executing further? I'm guessing it should be pre-defined which registers the callee function should be pushing on the stack etc. , so it should be easy to use the ESP offsets to access local variables and arguments right? – Engineer999 Oct 03 '19 at 20:24
  • 2
    "The ESP shouldn't change from once there has been a placeholder for all the local variables etc." - why not? The compiler is free to use PUSH/POP instructions if there is a need for this, so I'm not quite getting this statement of yours... But as @old_timer says - it is possible to do without the frame pointer, just less convenient for compiler writers, debuggers etc. – tum_ Oct 03 '19 at 20:28
  • @tum_ Well this is what i'm trying to understand. What else could the compiler put there apart from the local variables, args etc? I guess i'm missing this part – Engineer999 Oct 03 '19 at 20:31
  • x86 is not _my architecture_, so I can't give real life examples and can even say something wrong (but I worked a lot on Z80 which shared a similar philosophy). As an ad-hoc example, PUSH/POP can be used simply to move a temporary value from one register to another. Basically, on assembly level there are no *local variables* - this is a high level language concept but that would be too long to elaborate on this in a comment... – tum_ Oct 03 '19 at 20:59
  • 4
    Note that many modern compilers allow using only the stack pointer and freeing EBP for other uses, e.g. gcc's `-fomit-frame-pointer`. – ninjalj Oct 03 '19 at 21:47
  • changing the stack pointer during the function is a design decision, not a hard and fast rule. like burning a register and instructions popular compilers also burn stack space. again there are reasons why, memory is cheap, etc.. – old_timer Oct 03 '19 at 23:13
  • 1
    @tum_: usually you'll only see `push` in compiler-generated code to pass function args. But yes, clang with `-Oz` (optimize for code-size even at the expense of speed) will use stuff like `push 2` / `pop rcx` (3 bytes total) instead of 5-byte `mov ecx, 2`. push/pop instead of `mov reg,reg` is insane, though. It only saves 1 byte for 64-bit regs that aren't r8..r15, otherwise it's break even or a loss. (3-byte `mov r8, r9` vs. 2-byte `push r9` + 2-byte `pop r8`.) In 32-bit code, 2-byte `mov` is always better than `push`/`pop` for speed and break-even for size. – Peter Cordes Oct 04 '19 at 10:39
  • @PeterCordes Got it. Yeah, on Z80 push/pop was frequent in compiler's output as you couldn't 'mov' 16-bit registers... – tum_ Oct 04 '19 at 10:45

2 Answers


Technically, it is possible (but sometimes hard) to track how many local and temporary variables are stored on the stack, so that function arguments and local variables can be accessed without EBP.

Consider the following C code:

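int func(int arg) {
   int result = 0;      // initialized so the return value is well-defined
   double x[arg+5];     // variable-length array: size known only at run time
   // Do something with x, calculate result
   return result;
}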

The number of items stored on the stack is now variable (arg+5 items of type double). Calculating the location of 'arg' relative to ESP requires a run-time calculation, which can have a significant negative impact on performance.
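
As a minimal sketch of why the frame size cannot be fixed at compile time, the hypothetical helper below (`vla_bytes` is not part of the original example) reports how many bytes an array declared like `x` occupies. It assumes a compiler that supports C99 variable-length arrays, such as GCC or clang.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: returns the number of bytes occupied by a
   variable-length array declared the same way as x in func().
   For a VLA, sizeof is evaluated at run time, so the amount of stack
   this function needs is not known when the code is generated. */
size_t vla_bytes(int arg) {
    assert(arg > -5);        /* a VLA must have a positive length */
    double x[arg + 5];
    return sizeof x;
}
```

Calling `vla_bytes(0)` and `vla_bytes(3)` gives different sizes, which is exactly the quantity ESP moves by when the array is allocated.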

With an extra register (EBP), 'arg' is always at a fixed location ([EBP+8] in the usual 32-bit cdecl layout, just above the saved EBP and the return address). Executing a 'return' is always simple: copy EBP to ESP, pop the saved EBP, and return.
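
The difference between the two registers can be sketched with a toy model. This is not real x86 code: `ToyStack` and its helper functions are hypothetical names, the stack is an array of words, and `sp` and `bp` are array indices standing in for ESP and EBP. It assumes the 32-bit cdecl convention, where the first argument sits two words above the saved frame pointer.

```c
#include <assert.h>

enum { STACK_WORDS = 64 };

/* Toy model of a downward-growing stack of 32-bit words. */
typedef struct {
    int mem[STACK_WORDS];
    int sp;   /* index of the most recently pushed word (models ESP) */
    int bp;   /* frame pointer of the current function (models EBP) */
} ToyStack;

void toy_init(ToyStack *s) {
    s->sp = STACK_WORDS;
    s->bp = STACK_WORDS;
}

void toy_push(ToyStack *s, int v) {
    assert(s->sp > 0);       /* guard against overflowing the toy stack */
    s->mem[--s->sp] = v;
}

/* Caller pushes the argument and a return address; the callee
   prologue then does the equivalent of: push ebp / mov ebp, esp. */
void toy_enter(ToyStack *s, int arg) {
    toy_push(s, arg);        /* argument */
    toy_push(s, 0);          /* stand-in for the return address */
    toy_push(s, s->bp);      /* push ebp */
    s->bp = s->sp;           /* mov ebp, esp */
}

/* Distance in words from sp up to the argument: grows with every push. */
int arg_offset_from_sp(const ToyStack *s) {
    return (s->bp + 2) - s->sp;
}

/* The argument is always two words above bp, whatever sp is doing. */
int read_arg_via_bp(const ToyStack *s) {
    return s->mem[s->bp + 2];
}
```

After `toy_enter`, every extra `toy_push` increases `arg_offset_from_sp` by one, while `read_arg_via_bp` keeps using the same offset. A compiler that omits the frame pointer has to track that moving offset itself at each instruction.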

Bottom line: the decision to dedicate the EBP register to a single purpose (instead of using it as a general-purpose register) is a trade-off between performance, simplicity, code size and other factors. Practical experience has shown that the benefits outweigh the cost.

dash-o
  • 1
    Yes, functions with VLAs will set up a frame pointer even with optimization enabled. Otherwise it won't. `-fomit-frame-pointer` is on by default at `-O1` and higher in GCC/clang, even for 32-bit code. Are you claiming that's a mistake, and that `-fno-omit-frame-pointer` should be the default even at `-O2` and `-O3`? That seems highly unlikely in 32-bit code, where going from 6 to 7 registers is significant. Maybe in 64-bit code the code-size advantage vs. the performance disadvantage would be a better tradeoff, but OTOH many x86-64 functions can keep more things in regs, less stack access. – Peter Cordes Oct 04 '19 at 10:35
  • The description for '-fomit-frame-pointer' indicates 'Don't keep the frame pointer in a register for functions that don't need one', and that '-O also turns on -fomit-frame-pointer on machines where doing so does not interfere with debugging'. Trying to step thru programs with '-O' without frame-pointer is very unpleasant experience - so at my company we used to disable this optimization in 32 bit. Luckily, for Intel 64 bit (-O -g) the compiler will use EBP even. – dash-o Oct 04 '19 at 11:17
  • 1
    Yes, functions with VLAs or alloca are "functions that *do* need one". Modern debug formats don't depend on EBP / RBP as a frame pointer; they use `.eh_frame` metadata for stack unwinding so you can still backtrace in optimized builds, and single-stepping is no worse than you'd expect for optimized code. One use-case for `-fno-omit-frame-pointer` is with `perf`, when taking stack snapshots on events. EBP backtracing is faster and apparently works better. – Peter Cordes Oct 04 '19 at 11:54
  • @PeterCordes I think there is a strong parallel w/ modern EH code based on tables and instr emitted by the compiler for each region of code. – curiousguy Nov 13 '19 at 05:59
  • @curiousguy: yes, unwind metadata is what enabled `-fomit-frame-pointer` to be the default in C++, even with exceptions enabled. The Linux ABI even requires unwind metadata for C, which is handy for debugging backtraces at least. – Peter Cordes Nov 13 '19 at 06:05
  • I'm confused why it would even be needed for VLAs. The caller knows which argument will be used for the VLA, and the caller is adding/subtracting from the stack-pointer so... couldn't the compiler still avoid burning a register by just using a non-changing variable for that operation? – vitiral Jun 19 '21 at 05:34

Side note about debugger/runtime tools:

Using EBP makes it easier for debuggers (and other run-time tools) to 'walk the stack'. Tools can examine the stack at run time and, without knowing anything about the current program's stack (e.g., how many items have been pushed into each frame), follow the chain of saved EBP values all the way to "main".

Without EBP pointing to the 'next' frame, run-time tools (including debuggers) face the very hard (impossible?) task of knowing how to move from ESP to specific local variables.
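
Walking the stack through EBP can be sketched as following a linked list. The code below is a simplified model, not a real unwinder: `struct Frame` and `walk_frames` are hypothetical names, and each frame is assumed to hold only the caller's saved frame pointer and a return address, as in the conventional 32-bit layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of what a tool sees at [ebp] and [ebp+4] when
   every function keeps a frame pointer. */
struct Frame {
    const struct Frame *saved_bp;  /* caller's frame, NULL at the outermost */
    unsigned return_addr;          /* stand-in for the return address */
};

/* Follow the saved-bp links, storing each return address in out[].
   Returns the number of frames visited (at most max). */
size_t walk_frames(const struct Frame *bp, unsigned *out, size_t max) {
    size_t n = 0;
    assert(out != NULL || max == 0);
    while (bp != NULL && n < max) {
        out[n++] = bp->return_addr;
        bp = bp->saved_bp;
    }
    return n;
}
```

The walker needs no knowledge of how many locals or temporaries each function pushed; it only needs the link stored at a fixed position in every frame.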

dash-o
  • 2
    If the binary has debug info, that debug info can include information about stack frames to help debuggers. Otherwise, with no debug info and also _no base pointer_, you must resort to heuristics, which often give wrong results: e.g. suppose everything on the stack which looks like an address in the code segment is a return address. – ninjalj Oct 03 '19 at 21:52
  • If you have no debug info how much of the stack frame can you really inspect though? It will be very difficult to know which variables are what/etc right? – vitiral Jun 19 '21 at 05:36