How can I convert an XMM register of single-precision floats to integers?

Question

I have a bunch of packed floats inside an XMM register (using SSE intrinsics):

__m128 xmm = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f);

I'd like to convert all of these to integers in one go. I found an intrinsic that almost does what I want (_mm_cvtps_pi16()), but it yields four 16-bit shorts instead of full-blown ints. An intrinsic called _mm_cvtps_pi32() yields ints, but only for the two lower values in xmm. I can use it, extract the values, move things around and use it again, but is there a simpler way? Why wouldn't there be a straightforward packed 32-bit float -> packed 32-bit integer instruction? Surely both fit in the same space of an XMM register?

EDIT: Okay, I see now that _mm_cvtps_pi32() returns __m64 instead of __m128, which means it operates on an MMX-style MM... register. That would explain why it returns just two ints, but now I'm wondering:

1. Will I have trouble when compiling for x64? Reportedly, __m64 isn't supported there...
2. Why didn't they extend this instruction when SSE rolled out?

Thanks!

score 4 · Accepted Answer · answered Sep 17 '13 at 23:36

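According to this documentation, `_mm_cvtps_epi32` generates a `cvtps2dq` instruction, which does what you want. The page lists the signature as `__m128d _mm_cvtps_epi32(__m128d a)`, but as pointed out in the comments, the correct signature is `__m128i _mm_cvtps_epi32(__m128 a)`.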
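As a rough illustration of how this might be used, here is a minimal sketch assuming an SSE2-capable target (SSE2 is part of the x86-64 baseline, so no `__m64` is involved). The helper names `floats_to_ints_rounded` and `floats_to_ints_truncated` are made up for this example. Note that `_mm_cvtps_epi32` rounds according to the current MXCSR rounding mode (round-to-nearest-even by default), while the related `_mm_cvttps_epi32` (`cvttps2dq`) truncates toward zero like a C cast.

```c
#include <emmintrin.h>  /* SSE2: _mm_cvtps_epi32, _mm_cvttps_epi32, _mm_storeu_si128 */

/* Hypothetical helper: converts four packed floats to four 32-bit ints,
   rounding per the current MXCSR mode (round-to-nearest-even by default). */
static void floats_to_ints_rounded(__m128 v, int out[4])
{
    __m128i iv = _mm_cvtps_epi32(v);        /* emits cvtps2dq */
    _mm_storeu_si128((__m128i *)out, iv);   /* unaligned store of all four ints */
}

/* Hypothetical helper: same conversion, but truncating toward zero. */
static void floats_to_ints_truncated(__m128 v, int out[4])
{
    __m128i iv = _mm_cvttps_epi32(v);       /* emits cvttps2dq */
    _mm_storeu_si128((__m128i *)out, iv);
}
```

Passing the question's `_mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f)` to either helper should fill `out` with 1, 2, 3, 4, since `_mm_set_ps` takes the highest element first.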

Mats Petersson

It's worth taking a moment to understand the suffixes. In this case the question's `pi32` leads directly to this answer's `epi32` -- the `e` for extended. Extended, parallel, integer of 32 bits. – Ben Jackson Sep 17 '13 at 23:43
I used to think `__m128d` was used for storing two 64-bit floats, that's why I didn't look at this intrinsic more carefully. Any idea why this return type? – neuviemeporte Sep 17 '13 at 23:44
Okay, it looks as if we were both wrong, the return type is actually `__m128i` and all is right now. The intrinsic is documented in the `__m128d` section of the SSE2 docs on MSDN, though, for a reason I don't understand. – neuviemeporte Sep 17 '13 at 23:47
I can't vouch for the documentation (I didn't write it, I just searched for the instruction I wanted), but it seems like the other answer also suggests `_mm_cvtps_epi32`, so it may be worth trying that one. – Mats Petersson Sep 17 '13 at 23:48
Thank you very much. It's just that I find these docs very confusing. Accepting now. – neuviemeporte Sep 17 '13 at 23:50
Yeah, I look up the instruction in my books first, then search for it with google... – Mats Petersson Sep 17 '13 at 23:54
It looks like there are mistakes on the page you link to - the intrinsic should be: `__m128i _mm_cvtps_epi32 (__m128 a)` - I recommend using the [Intel Intrinsics Guide](http://software.intel.com/en-us/articles/intel-intrinsics-guide) for looking this stuff up, rather than Google or Microsoft. – Paul R Sep 18 '13 at 07:26

score 1 · Answer 2 · answered Sep 17 '13 at 23:41

Use the documentation for `_mm_cvtps_epi32`:

Magic documentation.

Jakub Świerk

I guess it's my bad for sticking with the MSDN docs. I figured it was the way to go, since I'm writing in Visual C++ on Windows. – neuviemeporte Sep 18 '13 at 00:03
Sometimes you need to search deeper: [MSDN documentation](http://msdn.microsoft.com/en-us/library/xdc42k5e(v=vs.90).aspx) – Jakub Świerk Sep 18 '13 at 00:24
A more useful reference is the [Intel Intrinsics Guide](http://software.intel.com/en-us/articles/intel-intrinsics-guide) - it's a documentation tool for Linux/Windows/OS X and it's much more comprehensive/accurate and quicker/easier to use than MSDN. – Paul R Sep 18 '13 at 07:31
