Why is it recommended to only use registers a0 and a1 to pass return values in RISC-V?

Question

The RISC-V calling convention states that only registers a0 and a1 are used for return values, rather than all eight registers a0~a7. When more than two values need to be returned, they go through the stack instead. Why? Are there any advantages to doing so?

I'm learning RISC-V assembly as part of my study of computer architecture. I've noticed that we can pass arguments to a function using all eight of the a0~a7 registers, but only two of them, a0 and a1, can be used for return values, according to some descriptions of the RISC-V calling convention, like Understanding RISC-V Calling Convention and RISC-V Calling Conventions. I'm confused about why the convention includes the rule that only a0 and a1 should be used to return values. I skimmed the two articles mentioned above, but I failed to find anything that explains this. In my opinion, the fact that the a0~a7 registers are not preserved across function calls means we can use them freely in a function. Therefore, we can and should use any of them to pass return values if needed, for ease and efficiency. In short, is there any reason to limit return values to the a0 and a1 registers?

P.S. I just noticed the question Why two return registers (in many procedure calling conventions/ABIs), which tells me that consecutive registers can be used for big numbers. However, my point is: why do we refrain from putting more return values into a2~a7, even though there seem to be no apparent disadvantages? Or is it awful if I use a2~a7 for return values, violating the convention?

Yes, you do quickly end up in awful territory. The point of a calling convention is to let a call succeed even though the caller and callee are written in completely different languages and stored in different modules. Strict rules are needed to guarantee this; they balance competing interests and are based on what is commonly used in practical programs. It does happen: the .NET just-in-time compiler can do this, optimizing the return of small structs, which it can do because it controls both the caller and callee. – Hans Passant, Mar 28 '23 at 13:55

score 4 · Answer 1 · answered Mar 28 '23 at 13:02

Disclaimer: Following is complete speculation.

Functions rarely return more than one value, and calling conventions tend to be tailored for C anyway, where you can't even. The two return registers on RISC-V and other architectures are used for types that don't fit into a single register, usually double-width integers or small structs. Larger structs are normally dealt with in memory; you don't typically need all the members in registers at the same time. That applies to the callee as well: it may very well be creating the return value in memory, so it would take extra instructions to load it into registers only for the caller to immediately copy it back. Passing a pointer to the expected return location allows the result to be created in the right place directly.

That said, for your own functions you can use whatever you want. Even compilers use ad-hoc conventions for private functions.
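To make the small-versus-large struct distinction concrete, here is a minimal C sketch. The type and function names (`DivResult`, `Stats`, `divide`, `summarize`) are hypothetical and chosen only for illustration. Assuming the standard RISC-V integer calling convention, a struct no larger than two registers comes back in a0 and a1, while a larger one is written through a hidden pointer that the caller supplies.

```c
#include <stddef.h>

/* Hypothetical types for illustration; they are not from the question. */
typedef struct { long quot; long rem; } DivResult;    /* two register-sized fields  */
typedef struct { long min, max, sum, count; } Stats;  /* four register-sized fields */

/* Fits in two registers, so under the standard ABI (an assumption here)
   the result is expected to come back in a0 and a1. */
DivResult divide(long n, long d) {
    DivResult r = { n / d, n % d };
    return r;
}

/* Too large for two registers: the caller is expected to pass a hidden
   pointer to a result buffer, and the fields are stored to memory.
   Assumes n >= 1 so that v[0] exists. */
Stats summarize(const long *v, size_t n) {
    Stats s = { v[0], v[0], 0, (long)n };
    for (size_t i = 0; i < n; i++) {
        if (v[i] < s.min) s.min = v[i];
        if (v[i] > s.max) s.max = v[i];
        s.sum += v[i];
    }
    return s;
}
```

Compiling this for a RISC-V target and reading the generated assembly should show the difference: `divide` ends with its two values in a0 and a1, whereas `summarize` stores its fields through the pointer it received.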

score 2 · Answer 2 · answered Mar 28 '23 at 13:21

As with the other answers that have been provided so far, this is speculation on the rationale, not an authoritative source.

There are diminishing returns when using more registers. Consider the callee (the called procedure): as you put more and more return values into their return registers, fewer and fewer registers remain for computing the remaining return values. What do you do? You spill more values onto the stack. In the worst case, you push return values onto the stack only to pop them again to satisfy your calling convention.

The same goes for the caller. What do you do when all non-preserved registers are filled with return values? You can't really do much except push some of them onto the stack and reload them later. At this point, in the worst case, you push and pop twice each across caller and callee, instead of once as in the current calling convention.

You might argue that four registers would be a good compromise and I agree that this is a thought worth entertaining. However, there are some non-technical aspects to consider:

Using two registers is a long-standing convention on other platforms. ARM does it, x86 does it. This means adapting existing code and compilers is easier. Plus, programmers who come from these platforms expect to have roughly two "fast" return values. Compiling existing code for RISC-V is therefore unlikely to benefit from a larger number of return registers, since most people don't write code that way in the first place, and optimizing your code for RISC-V with more return values will make it slower on other platforms.

answered Mar 28 '23 at 13:21

Homer512

As a learner, I haven't had much experience with registers being scarce. Your answer enlightens me a lot. Your mention of the potential inefficiency is also valuable information to me. – adong660 Mar 28 '23 at 15:26

@adong660: RISC-V has 32 registers, and in the standard ABI there are plenty of [call-clobbered](https://stackoverflow.com/a/56178078/224132) registers (t0-t6) separate from a0-a7 which you can use without having to save/restore them on the stack. And for the caller, it could have pointers in s0-s11 that the callee preserves. So this argument doesn't hold water for RISC-V, especially not the part from the caller's perspective. It's normal for a non-leaf function to save some `s` regs for its own use. – Peter Cordes Mar 28 '23 at 15:33
@adong660: But yes, on platforms with 16 registers like x86-64, they can start to get scarce in complex functions. 32-bit x86 is especially bad, with only 8 general-purpose registers (one of them a stack pointer), a dinosaur from an earlier age of computing. That's where the tradition of at most 2 return-value registers dates back to, but it's hard to say whether it was cause or effect (of what Jester pointed out, that C only allows one return value, which might be a struct, not e.g. returning separate scalars from `strcmp` of pointer and - / 0 / + compare result like you'd do in asm.) – Peter Cordes Mar 28 '23 at 15:36
@adong660: Anyway, if you were porting hand-written asm to an ISA with fewer registers, you'd probably find it was more efficient not to return as many values in registers, if you'd been using a lot. That's a weak argument for not taking advantage of what's possible in RISC-V. It's pretty unlikely you'd want anywhere near 8 return values in registers, or one large struct that the caller would probably just store, but maybe there's a case where different callers would each only want to use part of a struct, so having its members in separate registers lets different callers work efficiently. – Peter Cordes Mar 28 '23 at 15:40

@adong660: I think the major reasons that were relevant for RISC-V choosing 2 return-value registers were what the last paragraph in this answer points out: historical inertia and following typical conventions from ABIs for other ISAs. And what Jester points out, that forcing larger structs to always be returned by register could be worse for a C ABI, unlike hand-written asm where it's your choice for each function how you return stuff, if the caller is also hand-written. – Peter Cordes Mar 28 '23 at 16:02
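
The `strcmp` idea raised in the comments above (returning both a pointer and a negative/zero/positive compare result) can be approximated in C by wrapping the two scalars in a small struct. This is only a sketch: `CmpResult` and `compare_at` are hypothetical names, and it assumes the standard RISC-V ABI, under which a struct of this size is expected to come back in a0 and a1.

```c
/* Hypothetical result type: where the strings first differ, plus the sign. */
typedef struct { const char *pos; int cmp; } CmpResult;

/* Walk both strings until they differ or the first one ends, then report
   the position reached in `a` and -1, 0, or +1 for the comparison. */
CmpResult compare_at(const char *a, const char *b) {
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    unsigned char ca = (unsigned char)*a;
    unsigned char cb = (unsigned char)*b;
    CmpResult r = { a, (ca > cb) - (ca < cb) };
    return r;
}
```

A caller that only needs the sign can ignore `pos`, and one that only needs the position can ignore `cmp`; with both values in registers, neither caller has to load anything from memory.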
