4

There's something I just can't understand about IEEE-754.

The specific questions are:

Which range of numbers can be represented by IEEE-754 standard using base 2 in single (double) precision?

Which range of numbers can be represented by IEEE-754 standard using base 10 in single (double) precision?

Which range of numbers can be represented by IEEE-754 standard using base 16 in single (double) precision?

(the textbook is not in English so I might not have translated this well but I hope you get the point).

The only information given in the textbook is the ranges themselves, without any explanation of how they were calculated. For example:

binary32:

The largest normalized number: $(1-2^{-24})\times 2^{128}$

The smallest normalized number: $1.0\times 2^{-126}$

The smallest subnormal number: $1.0\times 2^{-149}$

I have a test coming up where these kinds of questions will appear, and I really don't feel like learning all of this by heart. On the other hand, there must be a method to calculate these values, but they seem so random, and that's what confuses me.

Koy
  • You would not learn those by heart. You would need to understand how floating point numbers are encoded in 32 bits: https://en.wikipedia.org/wiki/Single-precision_floating-point_format can serve as a good guideline. Once you understand that, you can easily see why those particular numbers are the limits. For example, if you know that the exponent goes up to 127, and that the mantissa can go from $1_2$ to $1.11111111111111111111111_2=2-2^{-23}$ (23 'ones' after the binary point), the maximum is obvious ($(2-2^{-23})2^{127}=(1-2^{-24})2^{128}$). –  Jan 16 '18 at 12:58
  • (Cont'd) Depending on the type of the test you will be taking, it may be worth trying to understand how IEEE-754 standard works, or it may be worth remembering those exact figures. If you are curious and mathematically minded, I guess you would prefer the first approach - but I cannot promise it is a simple standard to deal with, if you have no previous experience with floating point encodings. As I said, try Wikipedia first. –  Jan 16 '18 at 13:01

2 Answers

3

The exponent for the IEEE-754 standard for single precision is in the range $-126$ ... $127$. The mantissa is of the form $1.xxxxxxxxxxxxxxxxxxxxxxx_2$ (23 binary digits ($x$'s), every $x$ is $0$ or $1$) for normalised numbers, and of the form $0.xxxxxxxxxxxxxxxxxxxxxxx_2$ for the subnormal numbers (which always assumes the exponent to be $-126$). Thus:

  • The biggest number takes the biggest mantissa and the biggest exponent: $1.11111111111111111111111_2\times 2^{127}=(2-2^{-23})\times 2^{127}=(1-2^{-24})\times 2^{128}$
  • The smallest normalised number takes the smallest normalised mantissa and the smallest exponent: $1.00000000000000000000000_2\times 2^{-126}=1.0\times 2^{-126}$
  • The smallest subnormal number takes the smallest subnormal mantissa and the (smallest) exponent $-126$: $0.00000000000000000000001_2\times 2^{-126}=2^{-23}\times2^{-126}=1.0\times 2^{-149}$

I've used the index $_2$ to denote a number written in binary (base $2$); all the other numbers are written in base $10$.
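As a rough check of the three values above, here is a minimal Python sketch. The helper names `binary_limits` and `from_bits32` are made up for this illustration; the first computes the limits from the field widths (8 exponent bits, 23 fraction bits), and the second decodes a raw 32-bit pattern with the standard `struct` module so the formulas can be compared against the actual encodings.

```python
import struct

def binary_limits(exp_bits, frac_bits):
    """Limits of an IEEE-754 binary format with the given field widths."""
    bias = 2 ** (exp_bits - 1) - 1      # 127 for binary32
    e_max = bias                        # largest exponent of a finite number
    e_min = 1 - bias                    # smallest normal exponent (-126)
    largest = (2 - 2.0 ** -frac_bits) * 2.0 ** e_max     # 1.11...1_2 * 2^e_max
    smallest_normal = 2.0 ** e_min                       # 1.00...0_2 * 2^e_min
    smallest_subnormal = 2.0 ** (e_min - frac_bits)      # 0.00...1_2 * 2^e_min
    return largest, smallest_normal, smallest_subnormal

def from_bits32(bits):
    """Interpret a 32-bit integer as a binary32 value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

largest, smallest_normal, smallest_subnormal = binary_limits(8, 23)

print(largest == from_bits32(0x7F7FFFFF))             # all-ones mantissa, exponent 127
print(smallest_normal == from_bits32(0x00800000))     # zero mantissa, exponent -126
print(smallest_subnormal == from_bits32(0x00000001))  # lowest mantissa bit only
```

The computation is done in Python floats (binary64), which represent every binary32 value exactly, so the equality comparisons are meaningful.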

0

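The existing answer is correct for base 2 single precision (i.e. binary32), and is easily extended to binary64 by adding more bits: 11 exponent bits give an exponent range of $-1022$ ... $1023$, and the mantissa has 52 binary digits after the point. The largest number is then $(2-2^{-52})\times 2^{1023}=(1-2^{-53})\times 2^{1024}$, the smallest normalised number is $1.0\times 2^{-1022}$, and the smallest subnormal number is $2^{-52}\times 2^{-1022}=1.0\times 2^{-1074}$.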
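For binary64 the same reasoning applies with 11 exponent bits and 52 fraction bits. Python's own `float` is a binary64 on essentially all current platforms (an assumption of this sketch), so the resulting limits can be compared with `sys.float_info`. The variable names are illustrative only.

```python
import math
import sys

# binary64: exponent range -1022 ... 1023, 52 fraction bits
largest64 = (2 - 2.0 ** -52) * 2.0 ** 1023       # 1.11...1_2 * 2^1023
smallest_normal64 = math.ldexp(1.0, -1022)       # 1.0 * 2^-1022
smallest_subnormal64 = math.ldexp(1.0, -1074)    # 2^-52 * 2^-1022

print(largest64 == sys.float_info.max)           # largest finite double
print(smallest_normal64 == sys.float_info.min)   # smallest positive normal double
print(smallest_subnormal64 > 0.0)                # still representable
print(smallest_subnormal64 / 2 == 0.0)           # nothing positive below it
```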

However, IEEE 754 also defines formats for decimal floating point, which encode numbers somewhat differently and use either Binary Integer Decimal (BID) or Densely Packed Decimal (DPD) for the binary encoding of decimal numbers.

Regardless of the encoding, decimal32 can store 7 decimal digits in the coefficient and values in [-95, 96] in the exponent, if the coefficient is interpreted as $d_1.d_2d_3d_4d_5d_6d_7$. This means the smallest positive representable number is $0.000001\times 10^{-95} = 1\times 10^{-101}$, and the largest is $9.999999\times 10^{96} = 9999999\times 10^{90}$. This includes subnormal numbers: because numbers aren't normalized in their decimal representation, exponent 0 is treated the same as any other.

decimal64 expands the space to 16 digits for coefficient and exponent in range [-383, 384], and decimal128 34 digits in coefficient and exponent in range [-6143, 6144].
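The decimal32 limits can be reproduced with Python's `decimal` module, which is an arbitrary-precision model of decimal arithmetic rather than an implementation of the IEEE 754 interchange encodings. The sketch below only assumes that a context with 7 digits and exponent range [-95, 96] mirrors decimal32's value range; `decimal32` is just a variable name chosen here.

```python
from decimal import Context, Decimal

# 7 coefficient digits, exponent range [-95, 96] for a d.dddddd coefficient
decimal32 = Context(prec=7, Emin=-95, Emax=96)

# Etiny() = Emin - prec + 1: exponent of the smallest subnormal, as an integer coefficient
smallest = Decimal(1).scaleb(decimal32.Etiny())
# Etop() = Emax - prec + 1: exponent of the largest value, as an integer coefficient
largest = Decimal(9999999).scaleb(decimal32.Etop())

print(decimal32.Etiny(), decimal32.Etop())   # -101 90
print(smallest == Decimal("1E-101"))
print(largest == Decimal("9.999999E+96"))
```

Changing the context to `prec=16, Emin=-383, Emax=384` or `prec=34, Emin=-6143, Emax=6144` gives the corresponding decimal64 and decimal128 figures.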

IEEE754-2008 doesn't define any format for hexadecimal storage of floating point numbers.

Luke