3

I'm learning x86 assembly and using the code below for testing. In the GDB console I see that the rsp register, which points to the top of the stack, starts at 0x7FFFFFFFDFD0. If I understand correctly, my code doesn't use push or pop (which modify rsp), so 0x7FFFFFFFDFD0 is a default value. That would imply the stack holds that many bytes, but I'm on Linux, where the stack size is 8 MB.

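section .text
global _start
_start:

mov rcx, 2          ; rcx = 2
add rcx, 8          ; rcx = 10

mov rax, 0x1        ; syscall number 1 = exit in the 32-bit int 0x80 ABI
mov rbx, 0xff       ; exit status 255
int 0x80            ; invoke the 32-bit system-call interface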
Peter Cordes
Isaí
  • There's a bunch of stuff on the stack when your program starts (ELF aux vectors, command line arguments, environment variables). That takes space. The actual top of stack is likely `0x7FFFFFFFFFFF`. – fuz Nov 06 '22 at 22:22
  • See [ASLR](https://en.wikipedia.org/wiki/Address_space_layout_randomization) – dimich Nov 06 '22 at 22:24
  • @dimich: GDB disables ASLR by default; that's why the OP sees the same value every time, and why it's so near the top page of the low canonical half of virtual address-space. ASLR would change more bits than that for where the top of the 8M mapping is, as well as offsetting a random amount within a page. (Actually doesn't start as an 8M mapping, but is growable up to that.) – Peter Cordes Nov 06 '22 at 22:41
  • Related: [Each program allocates a fixed stack size? Who defines the amount of stack memory for each application running?](https://stackoverflow.com/q/69623703) mentions some about why it's at the top of canonical address-space. Also [Why does Linux favor 0x7f mappings?](https://stackoverflow.com/q/61561331) – Peter Cordes Nov 06 '22 at 23:07
  • Please capitalize normally. – thb Nov 06 '22 at 23:20

1 Answer

8

For 64-bit 80x86, typically (see note 1) only 48 bits of a virtual address can be used. To make it easier to increase the number of usable bits in future processors without breaking older software, AMD decided that the unused highest 16 bits of a 64-bit virtual address must all match bit 47, the highest usable bit. Addresses that comply with this are called "canonical addresses", and addresses that don't are called "non-canonical addresses". Normally (see note 2) any attempt to access anything at a non-canonical address causes an exception (a general protection fault).
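The canonical-address rule can be sketched as a small check. This is only an illustration of the rule described above, not how the CPU implements it; the function name `is_canonical` and the `va_bits` parameter are made up for this example.

```python
def is_canonical(addr, va_bits=48):
    """Return True if a 64-bit address is canonical for the given width.

    An address is canonical when bits 63 down to (va_bits - 1) are all
    equal, i.e. the upper bits are copies of the highest usable bit.
    """
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    upper = addr >> (va_bits - 1)               # bits 63..47 when va_bits == 48
    all_ones = (1 << (64 - va_bits + 1)) - 1    # the same bits, all set
    return upper == 0 or upper == all_ones


print(is_canonical(0x7FFFFFFFDFD0))        # the rsp value from the question
print(is_canonical(0x0000800000000000))    # first address past the low half
```

The first call reports a canonical address and the second a non-canonical one, because bit 47 is set while bits 48 to 63 are clear.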

This gives a virtual address space like:

0x0000000000000000 to 0x00007FFFFFFFFFFF = canonical (often "user space")
0x0000800000000000 to 0xFFFF7FFFFFFFFFFF = non-canonical hole
0xFFFF800000000000 to 0xFFFFFFFFFFFFFFFF = canonical (often "kernel space")
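As a rough sketch, the three ranges above can be derived from the number of usable address bits alone. The helper name `canonical_layout` is hypothetical and exists only for this illustration.

```python
def canonical_layout(va_bits=48):
    """Return (low_half, hole, high_half) as inclusive (start, end) pairs."""
    half = 1 << (va_bits - 1)       # size of each canonical half
    top = (1 << 64) - 1             # highest 64-bit address
    low_half = (0, half - 1)
    hole = (half, top - half)
    high_half = (top - half + 1, top)
    return low_half, hole, high_half


for name, (start, end) in zip(("low", "hole", "high"), canonical_layout()):
    print(f"{name:5} 0x{start:016X} to 0x{end:016X}")
```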

This makes it reasonably obvious that, without Address Space Layout Randomization, a process' initial thread's stack (see note 3) begins at an address slightly lower than the highest address that a process can use.

The difference between the highest address a process can use and the address you're seeing (0x7FFFFFFFDFD0) is only 0x2030 bytes (8240 in decimal), which (as mentioned in fuz's comment) is used by things like ELF aux vectors, command line arguments and environment variables that consume part of the stack before your code is started.
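The arithmetic can be checked directly. Counting from the rsp value in the question up to and including the last low-half canonical address gives 0x2030 bytes, which is 8240 in decimal. The constant names below are just labels for this sketch.

```python
USER_TOP = 0x00007FFFFFFFFFFF     # highest low-half canonical address (48-bit)
INITIAL_RSP = 0x7FFFFFFFDFD0      # value reported by GDB in the question
STACK_LIMIT = 8 * 1024 * 1024     # the 8 MB stack limit the question mentions

# Bytes from the initial rsp up to and including USER_TOP.
gap = USER_TOP - INITIAL_RSP + 1
print(hex(gap), gap)              # 0x2030 8240
print(gap < STACK_LIMIT)          # True: a tiny fraction of the limit
```

So the value in rsp is an address near the top of user space, not a count of bytes available on the stack.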

Note 1: Intel recently (about 2 years ago?) created an extension (5-level paging) that, if supported by the CPU and OS, makes 57 bits of a virtual address usable. In this case the "non-canonical hole" shrinks, and the highest virtual address a process can use would be increased to 0x00FFFFFFFFFFFFFF.
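A minimal sketch of how the numbers in Note 1 follow from the address width. The function names are invented for this example.

```python
def user_space_top(va_bits):
    """Highest address in the low canonical half for a given address width."""
    return (1 << (va_bits - 1)) - 1


def hole_size(va_bits):
    """Number of non-canonical addresses for a given address width."""
    return (1 << 64) - (1 << va_bits)


print(f"48-bit: 0x{user_space_top(48):016X}")
print(f"57-bit: 0x{user_space_top(57):016X}")
print(hole_size(57) < hole_size(48))   # the hole shrinks with 57 bits
```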

Note 2: More recently (about 6 months ago?) Intel created an extension (Linear Address Masking) that, if supported by the CPU and OS and enabled for a process, makes the CPU ignore the unused higher bits of an address, so that software can pack other information into those bits (e.g. maybe a "data type") without doing explicit masking before use.
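To illustrate what "explicit masking before use" means, here is a sketch of packing a small tag into the unused upper bits of a pointer-sized value. The layout (tag in bits 48 to 62) and the names `pack` and `unpack` are assumptions chosen for this example; they are not taken from Intel's specification.

```python
ADDR_BITS = 48                          # assumed number of real address bits
ADDR_MASK = (1 << ADDR_BITS) - 1
TAG_BITS = 63 - ADDR_BITS               # hypothetical tag field, bit 63 left clear


def pack(addr, tag):
    """Store a small tag in the bits above the address (hypothetical scheme)."""
    if not 0 <= addr <= (ADDR_MASK >> 1):
        raise ValueError("this sketch only handles low-half addresses")
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag does not fit in the unused bits")
    return (tag << ADDR_BITS) | addr


def unpack(word):
    """Explicit masking: recover (addr, tag) before the address can be used."""
    return word & ADDR_MASK, word >> ADDR_BITS


word = pack(0x7FFFFFFFDFD0, 5)
print(hex(word))
print(unpack(word) == (0x7FFFFFFFDFD0, 5))
```

Without the extension, the `unpack` step has to happen in software before every dereference, because the packed value is a non-canonical address. With the extension enabled, the CPU does the equivalent of the masking itself.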

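Note 3: Threads in a process share one virtual address space, and operating systems typically provide no isolation between them (e.g. any thread can corrupt any other thread's stack or any other thread's "thread local data"). Because of this, if you create more threads they can't use the same "top of stack" address; each thread's stack has to be placed somewhere else in the address space.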

Brendan