Convert float to bigint (aka portable way to get binary exponent & mantissa)

Question

In C++, I have a bigint class that can hold an integer of arbitrary size.

I'd like to convert large float or double numbers to bigint. I have a working method, but it's a bit of a hack: I used the IEEE 754 format to extract the binary sign, mantissa and exponent of the input number.

Here is the code (the sign is ignored here; it's not important):

 float input = 77e12;
 bigint result;

 // extract sign, exponent and mantissa, 
 // according to IEEE 754 single precision number format
 unsigned int *raw = reinterpret_cast<unsigned int *>(&input); 
 unsigned int sign = *raw >> 31;
 unsigned int exponent = (*raw >> 23) & 0xFF;
 unsigned int mantissa = *raw & 0x7FFFFF;

 // the 24th bit is implicit and always 1 for normalized values.
 result = mantissa + 0x800000;

 // use the binary exponent to shift the result left or right
 int shift = (23 - exponent + 127);
 if (shift > 0) result >>= shift; else result <<= -shift;

 cout << input << " " << result << endl;

It works, but it's rather ugly, and I don't know how portable it is. Is there a better way to do this? Is there a less ugly, portable way to extract the binary mantissa and exponent from a float or double?
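On the portability point, here is a minimal sketch of one common variation, offered as an illustration and not as the asker's code. It copies the bytes with `std::memcpy` instead of dereferencing a `reinterpret_cast` pointer, and it uses `static_assert` to refuse to compile where `float` is not a 32-bit IEEE 754 type. The names `FloatFields` and `split_float` are made up for this example.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper type (not from the original post): the raw IEEE 754 fields.
struct FloatFields {
    std::uint32_t sign;      // 1 bit
    std::uint32_t exponent;  // 8 bits, biased by 127
    std::uint32_t mantissa;  // 23 bits, without the implicit leading 1
};

inline FloatFields split_float(float value) {
    // Fail at compile time instead of silently misreading another format.
    static_assert(std::numeric_limits<float>::is_iec559, "requires IEEE 754 float");
    static_assert(sizeof(float) == sizeof(std::uint32_t), "requires 32-bit float");

    std::uint32_t raw;
    std::memcpy(&raw, &value, sizeof raw);  // copy the bytes; no pointer aliasing

    return FloatFields{raw >> 31, (raw >> 23) & 0xFFu, raw & 0x7FFFFFu};
}
```

With C++20, `std::bit_cast<std::uint32_t>(value)` can replace the `memcpy` call. Either way the code still depends on the IEEE 754 layout; the assertions only make that dependency explicit.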

Thanks for the answers. For posterity, here is a solution using frexp. It's less efficient because of the loop, but it works for float and double alike, and it neither uses reinterpret_cast nor depends on any knowledge of floating-point number representations.

float input = 77e12;
bigint result;

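int exponent;
// frexp returns a fraction in [0.5, 1) with input == fraction * 2^exponent
double fraction = frexp (input, &exponent);
result = 0;
// produce one bit per iteration, most significant bit first
for (; exponent > 0; --exponent)
{
    result <<= 1;   // make room before adding the next bit
    fraction *= 2;
    if (fraction >= 1)
    {
        result += 1;
        fraction -= 1;
    }
}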

By the way, if you make it `unsigned int& raw = *reinterpret_cast<unsigned int*>(&input);`, you get rid of all the other dereferences. – GManNickG, Jan 25 '10 at 16:58
The result of this program is 7.699999752192e13, not 7.7e13. As I said in my answer below, the single line of code -- unsigned long long float_to_int = (unsigned long long) input; -- gives the same answer as your program does. — Rick Regan, Jan 26 '10 at 01:54

score 8 · Accepted Answer · edited Nov 09 '20 at 12:32

Can't you normally extract the values using frexp(), frexpf(), frexpl()?
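As a rough sketch of how that could look, the function below combines `std::frexp` with `std::ldexp` and `std::numeric_limits<F>::digits` to obtain the significand as an integer instead of a fraction. The names `Decomposed` and `decompose` are invented for this example, and the code assumes a finite input whose significand fits in 64 bits.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical result type: |value| == significand * 2^exponent.
struct Decomposed {
    bool negative;
    std::uint64_t significand;
    int exponent;
};

// Assumes a finite input; infinities and NaN are not handled.
template <typename F>
Decomposed decompose(F value) {
    static_assert(std::numeric_limits<F>::radix == 2, "binary floating point only");
    static_assert(std::numeric_limits<F>::digits <= 64, "significand must fit in 64 bits");

    int exp = 0;
    // The fraction is in [0.5, 1) for non-zero input, and 0 for zero.
    F fraction = std::frexp(std::fabs(value), &exp);

    // Scaling by 2^digits is exact and turns the fraction into a whole number.
    const int digits = std::numeric_limits<F>::digits;
    F scaled = std::ldexp(fraction, digits);

    return Decomposed{std::signbit(value),
                      static_cast<std::uint64_t>(scaled),
                      exp - digits};
}
```

A bigint could then be built from `significand` and shifted left by `exponent` when it is positive, or right by `-exponent` when it is negative, much like the shift at the end of the code in the question.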

edited Nov 09 '20 at 12:32 by DevSolar
answered Jan 25 '10 at 16:52 by Kornel Kisielewicz

You certainly can, though in C++ it's better to use `std::frexp()` which is overloaded for `float`, `double` and `long double` arguments. – Mike Seymour Jan 25 '10 at 17:01
frexp() returns the significand (mantissa) as a float, but the OP uses it as an integer. – Rick Regan Jan 25 '10 at 19:00
Thanks, I didn't know about frexp. I added a solution using frexp to the end of the question, in case you're interested. – amarillion Jan 25 '10 at 20:32

Chris · Answer 2 · 2011-03-21T06:48:14.657

I like your solution! It got me on the right track.

I'd recommend one thing though - why not get a bunch of bits all at once and almost always eliminate any looping? I implemented a float-to-bigint function like this:

template<typename F>
explicit inline bigint(F f, typename std::enable_if<(std::is_floating_point<F>::value)>::type* enable = nullptr) {
    int exp;
    F fraction = frexp(fabs(f),&exp);
    F chunk = floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
    *this = ulong(chunk); // will never overflow; frexp() is guaranteed < 1
    exp -= ulong_bit_count;
    while (sizeof(F) > sizeof(ulong) && (fraction -= chunk)) // this is very unlikely
    {
        chunk = floor(fraction *= float_pow_2<F,ulong_bit_count>::value);
        *this <<= ulong_bit_count;
        (*this).data[0] = ulong(chunk);
        exp -= ulong_bit_count;
    }
    *this <<= exp;
    sign = f < 0;
}

(By the way, I don't know of an easy way to put in floating point power-of-two constants, so I defined float_pow_2 as follows):

template<typename F, unsigned Exp, bool Overflow = (Exp >= sizeof(unsigned))>
struct float_pow_2 {
    static constexpr F value = 1u << Exp;
};
template<typename F, unsigned Exp>
struct float_pow_2<F,Exp,true> {
    static constexpr F half = float_pow_2<F,Exp/2>::value;
    static constexpr F value = half * half * (Exp & 1 ? 2 : 1);
};
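If a compile-time constant is not required, one possible alternative is `std::ldexp`, which multiplies by a power of two at run time. The helper name `pow2` below is made up for this sketch; it is not part of the answer's code.

```cpp
#include <cmath>

// Hypothetical run-time stand-in for float_pow_2<F, Exp>::value.
template <typename F>
F pow2(int exp) {
    // 1 * 2^exp, exact as long as the result is representable in F.
    return std::ldexp(F(1), exp);
}
```

The trade-off is that older standards do not guarantee `std::ldexp` is usable in constant expressions, so the template version is still the one to keep where a `constexpr` value is needed.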

score -1 · Answer 3 · answered Jan 25 '10 at 19:31

If the float always contains an integral value, just cast it to int: float_to_int = (unsigned long) input.

BTW, 77e12 overflows a float. A double will hold it, but then you'll need this cast: (unsigned long long) input.
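A small sketch of the cast, to make the numbers concrete. The helper name `to_ull` is hypothetical. The cast is only well defined when the truncated value fits in the target type, so this assumes a non-negative input below 2^64.

```cpp
// Hypothetical helper: truncating conversion of a float to a 64-bit unsigned integer.
// Assumes 0 <= input < 2^64; outside that range the behaviour is undefined.
inline unsigned long long to_ull(float input) {
    return static_cast<unsigned long long>(input);
}
```

A float has a 24-bit significand, so 77e12 is stored as the nearest representable value, 76999997521920, and `to_ull(77e12f)` returns exactly that. A double holds 77000000000000 exactly.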

answered Jan 25 '10 at 19:31 by Rick Regan

Ehm no... 77e12 does not overflow a float. The exponent can go from -126 to 127. Casting to int is exactly what I want to avoid, that's why I'm using a bigint class. – amarillion Jan 25 '10 at 20:10
"Overflow" was the wrong word -- sorry (did that deserve a down vote?). 77e12 needs 47 bits to represent. That can't fit in a float -- not unless you want it truncated. Doesn't your compiler give you a warning? Mine does. – Rick Regan Jan 25 '10 at 22:03
Just to be clear: you're assigning 77000000000000 to a float, and that float is taking on the value 76999997521920. Is that what you want? – Rick Regan Jan 25 '10 at 22:15
Ok, where is this function float_to_int defined? I can't find it anywhere. And yes, I see your point about 77e12 being truncated. But 76999997521920 is what I want here, it's the best possible conversion. – amarillion Jan 26 '10 at 10:02
float_to_int is just a variable name I picked. Your example, 77e12, is small enough to work with casting -- that's why I suggested it. (Just out of curiosity -- how are you using the "converted to bigint" values? They are as inaccurate as the floats they came from.) – Rick Regan Jan 26 '10 at 14:33
