I'm wondering how the Linux kernel knows which registers to look in for function arguments when performing a system call. For example: I call the write system call from assembly. The arguments are stored in rdi, rsi and rdx. The kernel then calls the write function, which looks something like write(fd, buf, len).

But how does the kernel know that fd is stored in rdi, buf in rsi and len in rdx? How is this implemented? Is there some kind of mapping in the kernel that initializes the arguments from these registers?

I guess I'm missing something. Maybe it doesn't even have anything to do with these registers? I would appreciate some help :)

Peter Cordes
K. Sean
  • It's part of the architecture's calling convention. https://en.wikipedia.org/wiki/X86_calling_conventions – perivesta Oct 05 '20 at 11:28
  • The arguments are always in the same registers. The kernel knows, because it has been programmed to take the file descriptor from `edi`, the buffer from `rsi`, and so on. – fuz Oct 05 '20 at 11:38
  • @fuz I could guess that :) But since the kernel is written in C, is there actually a way to express in C that argument xy is stored in register Z? Or is something like inline assembly used? – K. Sean Oct 05 '20 at 11:41
  • I think there's something that I'm completely missing ... – K. Sean Oct 05 '20 at 11:42
  • @K.Sean The kernel entry point, i.e. the code that is executed when a system call occurs, is generally written in assembly. It copies the contents of these registers into a structure for use by the kernel. Inline assembly could be used, but it's easier to just write these parts in plain old assembly. – fuz Oct 05 '20 at 11:42
  • Specifically, the Linux kernel entry code is [here](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/entry/entry_64.S#L136). It pushes all relevant registers on the stack and then calls a system call from the system call table, passing a pointer to this register structure. – fuz Oct 05 '20 at 11:45
  • @fuz Alright, I think I kind of get it. But the structure it copies those values into must then be accessible by the C part of the implementation, right? How do you get access to this structure when you carry on in C after that? – K. Sean Oct 05 '20 at 11:47
  • @fuz thanks for this specific resource. This really helps. But I kind of miss the link between the assembly part and the C part – K. Sean Oct 05 '20 at 11:49
  • @K.Sean Reading the code again, I must take that back. The registers are saved on the stack, yes. But that's only so we can restore them when returning to user space. The system call itself is just a C function called in [line 203](https://github.com/torvalds/linux/blob/16f73eb02d7e1765ccab3d2018e0bd98eb93d973/arch/x86/entry/entry_64.S#L203). Since the system call arguments are placed in the same registers as arguments to a normal C function (except using r10 for rcx, but that is addressed in line 196), no special processing is needed and the system call code is written in plain C. – fuz Oct 05 '20 at 11:54
  • @fuz Okay, I see. So the reason this works is that the C compiler follows the same calling convention and therefore "automatically" looks for the arguments in the respective registers, so we don't need to explicitly tell the C compiler where the arguments are stored? – K. Sean Oct 05 '20 at 11:59
  • @K.Sean Yeah. It was purposefully designed to be like this. – fuz Oct 05 '20 at 12:00
  • Also read [this article](https://0xax.gitbooks.io/linux-insides/content/SysCall/linux-syscall-2.html) perhaps. Now lastly, it would be great if you could summarise what you learned and add it as an answer to your question. This way, future people with the same question may benefit from the wisdom you received. – fuz Oct 05 '20 at 12:01
  • I actually came across this article 2 minutes ago. Sure, I'll do that – K. Sean Oct 05 '20 at 12:03
  • “See comments on original question.” is not what I meant when I said “write up an answer.” – fuz Oct 05 '20 at 12:05
  • [What happens if you use the 32-bit int 0x80 Linux ABI in 64-bit code?](https://stackoverflow.com/q/46087730) looks at the kernel side of a system call. But it sounds like you need to understand the concept of a **calling convention**, because that's the same for calling user-space functions. The caller and callee agree on where the caller should put args, and the callee looks there. – Peter Cordes Oct 05 '20 at 12:05
  • @fuz Yeah, I know you meant a more specific answer. That's why I deleted that post seconds after reading your comment. I have posted an updated answer. Hope it's better now – K. Sean Oct 05 '20 at 18:10
  • @K.Sean For the future, Stack Overflow allows you to edit your answers as often as you like. There's no need to make a new answer. – fuz Oct 05 '20 at 19:20

1 Answer

When a system call is executed, its arguments are passed in registers as defined by the x86_64 calling convention. On *nix systems, this is most likely System V. The arguments to an ordinary C function are passed in those same registers, because the compiler follows the same calling convention. Because of this, no further processing is needed to map the argument locations onto the parameters of the C function that implements the system call.
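
A minimal sketch of the user-space side, assuming x86-64 Linux and a compiler that supports GCC-style inline assembly. `raw_write` is a hypothetical helper name made up for this illustration, not a kernel or libc function. It places the three `write` arguments in rdi, rsi and rdx, the same registers a C function with three integer or pointer parameters would receive them in, and puts the system call number in rax. On other targets it falls back to libc's `syscall()` wrapper.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for the syscall() declaration in <unistd.h> */
#endif
#include <stddef.h>
#include <sys/syscall.h>       /* SYS_write */
#include <unistd.h>            /* syscall() */

/* Hypothetical helper: issue write(2) without going through libc's write(). */
static long raw_write(int fd, const void *buf, size_t len)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    /* rax = system call number; rdi, rsi, rdx = first three arguments.
     * The `syscall` instruction overwrites rcx and r11, so both are listed
     * as clobbers. The result (or a negative errno value) comes back in rax. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_write),
                        "D"((long)fd),   /* rdi */
                        "S"(buf),        /* rsi */
                        "d"(len)         /* rdx */
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: on any other target, let libc choose the registers.
     * Here an error is reported as -1 with errno set. */
    return syscall(SYS_write, fd, buf, len);
#endif
}
```

Calling `raw_write(1, "hi\n", 3)` should write three bytes to standard output and return 3; on failure the return value is negative on either path.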

K. Sean
  • Not quite accurate; x86-64 System V's function-calling convention uses RCX for the 4th arg, but [the kernel syscall convention uses R10](https://stackoverflow.com/q/21322100) (because the `syscall` instruction destroys RCX). But yes, mostly true on x86-64 System V (but not i386), at least until recent kernels always used a C wrapper for dispatch that reloads saved user-space registers from the struct saved on entry to the kernel. This has the minor benefit of sanitizing registers to make it harder for user-space to inject anything usable for Spectre gadgets, I guess. – Peter Cordes Oct 06 '20 at 03:08
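
The RCX versus R10 difference only shows up for system calls with four or more arguments. The following is a sketch under the same assumptions (x86-64 Linux, GCC-style inline assembly), using `pread64` because it takes four arguments. `raw_pread` is a hypothetical helper name, not an existing API. The fourth argument is pinned to r10 with a register variable, where a normal function call would have used rcx.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for pread() in <unistd.h> */
#endif
#include <stddef.h>
#include <sys/syscall.h>       /* SYS_pread64 */
#include <sys/types.h>         /* off_t */
#include <unistd.h>            /* pread() */

/* Hypothetical helper: read `count` bytes at `offset` via a raw system call. */
static long raw_pread(int fd, void *buf, size_t count, off_t offset)
{
#if defined(__x86_64__) && defined(__linux__)
    long ret;
    /* The 4th system call argument goes in r10, not rcx, because the
     * `syscall` instruction itself overwrites rcx (and r11). */
    register long arg4 __asm__("r10") = (long)offset;
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"((long)SYS_pread64),
                        "D"((long)fd),   /* rdi: 1st argument */
                        "S"(buf),        /* rsi: 2nd argument */
                        "d"(count),      /* rdx: 3rd argument */
                        "r"(arg4)        /* r10: 4th argument */
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: on any other target, use the libc wrapper instead. */
    return (long)pread(fd, buf, count, offset);
#endif
}
```

Given a file descriptor for a regular file containing `hello world`, `raw_pread(fd, buf, 5, 6)` should store `world` in `buf` and return 5.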