I have a pair of 32-bit floats stored in eax and ecx. Can I directly load these into the FPU to operate on them, without first storing to memory? This would simplify some compiler code significantly, but fld seems to only be able to operate on memory.
vgel
No, you can't. As far as compiler code goes, I doubt it makes a significant difference if you have to go through memory. You can simulate `fld r32` easily enough (`push r32; fld [esp]; pop r32`). Anyway, consider using SSE if available. – Jester Nov 24 '14 at 03:13
@Jester I'm open to using SSE, what would the equivalent SSE code be for that push-fld-pop example? – vgel Nov 24 '14 at 03:14
You can move to an SSE register directly using `MOVD` instruction. You could do `movd xmm0, eax; movd xmm1, ecx; addss xmm0, xmm1;` then move back as necessary. Of course you could use xmm registers for your floats in general :) – Jester Nov 24 '14 at 03:16
Awesome, thank you :) If you post this as an answer I'll accept it. And yeah I guess I could use the xmm registers for everything, but I'm trying to keep the code simple even if it does generate inefficient assembly. – vgel Nov 24 '14 at 03:20
@Jester … but `movd` is an integer instruction, and the subsequent floating-point instruction may be penalized, according to http://stackoverflow.com/questions/4996384/do-i-get-a-performance-penalty-when-mixing-sse-integer-float-simd-instructions – Pascal Cuoq Nov 24 '14 at 03:20
@PascalCuoq that is true. But `eax` and `ecx` are integer registers, so this "reinterpretation" must be done eventually. – Jester Nov 24 '14 at 03:23
1 Answer
No, you can't do that. As far as generating code goes, you can simulate `fld r32` easily enough with a sequence such as the following (optimized for size ;)), where `r32` stands for any 32-bit general-purpose register:
    push r32
    fld dword [esp]
    pop r32
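If the result also has to end up back in a general-purpose register, the same stack slot can be reused for the whole x87 round trip. The following is only a sketch in NASM syntax for 32-bit code, assuming the two float bit patterns are in `eax` and `ecx`, that the x87 register stack has a free slot, and that the sum should be returned in `eax`:

```nasm
; Sketch: eax = eax + ecx, treating both as IEEE-754 single-precision bit patterns.
; Assumes 32-bit mode and at least one free x87 stack register.
    push eax                ; reserve a 4-byte stack slot holding the first operand
    fld  dword [esp]        ; st0 = first operand
    mov  [esp], ecx         ; overwrite the slot with the second operand
    fadd dword [esp]        ; st0 = st0 + second operand
    fstp dword [esp]        ; store the sum as a 32-bit float and pop the x87 stack
    pop  eax                ; eax = bit pattern of the sum, esp restored
```

The `dword` size specifier is needed because `fld`, `fadd` and `fstp` accept memory operands of several widths, so the assembler cannot infer the size from `[esp]` alone.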
Consider using SSE if available, which does offer direct GPR-to-XMM moves using the `movd` instruction (the XMM form of `movd` requires SSE2). Adding the two registers could then look something like:
    movd xmm0, eax
    movd xmm1, ecx
    addss xmm0, xmm1
If you need the result in a GPR, you can move it back using another `movd`.
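Put together, the full round trip could be wrapped in a small routine. This is a sketch rather than a definitive implementation: the label `add_f32_bits` and its register-based calling convention are made up for illustration, and it assumes NASM syntax, 32-bit code, and a CPU with SSE2:

```nasm
; Hypothetical helper (name and calling convention are assumptions):
;   in:  eax, ecx = IEEE-754 single-precision bit patterns
;   out: eax      = bit pattern of their sum
;   clobbers xmm0 and xmm1
section .text
global add_f32_bits
add_f32_bits:
    movd  xmm0, eax         ; copy the raw 32 bits of eax into the low lane of xmm0
    movd  xmm1, ecx         ; same for ecx into xmm1
    addss xmm0, xmm1        ; scalar single-precision add on the low lanes
    movd  eax, xmm0         ; copy the low 32 bits of the result back to eax
    ret
```

No stack traffic is involved, which is the main advantage over the `push`/`fld`/`pop` sequence when the values have to live in integer registers.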
Jester