3.4.2 Single precision data type for IEEE 754 arithmetic

A float value is 32 bits wide.

The structure is:

Figure 3-1 IEEE 754 single-precision floating-point format

The S field gives the sign of the number. It is 0 for positive, or 1 for negative.

The Exp field gives the exponent of the number, as a power of two. It is biased by 0x7F (127), so that very small numbers have exponents near zero and very large numbers have exponents near 0xFF (255).

For example:

  • If Exp = 0x7D (125), the number is between 0.25 and 0.5 (not including 0.5).
  • If Exp = 0x7E (126), the number is between 0.5 and 1.0 (not including 1.0).
  • If Exp = 0x7F (127), the number is between 1.0 and 2.0 (not including 2.0).
  • If Exp = 0x80 (128), the number is between 2.0 and 4.0 (not including 4.0).
  • If Exp = 0x81 (129), the number is between 4.0 and 8.0 (not including 8.0).
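The ranges listed above can be checked by reading the Exp field out of a float. The following is a minimal sketch, not part of the original text. It assumes that the host float type is IEEE 754 single precision and that Exp occupies bits 30 to 23 of the 32-bit pattern. The helper names exp_field and unbiased_exponent are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the 8-bit biased Exp field of a float.
   Assumption: float is IEEE 754 single precision, Exp in bits 30..23. */
static unsigned exp_field(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* copy the bit pattern unchanged */
    return (bits >> 23) & 0xFFu;
}

/* Remove the 0x7F bias. For a normalized x, the magnitude of x
   lies between 2^e (inclusive) and 2^(e+1) (exclusive). */
static int unbiased_exponent(float x)
{
    return (int)exp_field(x) - 0x7F;
}
```

For instance, exp_field(0.75f) yields 0x7E, matching the range 0.5 to 1.0 in the list.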

The Frac field gives the fractional part of the number. It usually has an implicit 1 bit on the front. To save space, this bit is not stored.

For example, if Exp is 0x7F:

  • If Frac = 00000000000000000000000 (binary), the number is 1.0.
  • If Frac = 10000000000000000000000 (binary), the number is 1.5.
  • If Frac = 01000000000000000000000 (binary), the number is 1.25.
  • If Frac = 11000000000000000000000 (binary), the number is 1.75.
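The Frac examples above can be reproduced by assembling a bit pattern from its three fields. This is an illustrative sketch only. It assumes the host float is IEEE 754 single precision with S in bit 31, Exp in bits 30 to 23, and Frac in bits 22 to 0. The function name make_float is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Pack S, Exp and Frac into a 32-bit pattern and reinterpret it as a float.
   Assumption: S is bit 31, Exp is bits 30..23, Frac is bits 22..0. */
static float make_float(unsigned s, unsigned exp, uint32_t frac)
{
    uint32_t bits = ((uint32_t)(s & 1u) << 31)
                  | ((uint32_t)(exp & 0xFFu) << 23)
                  | (frac & 0x7FFFFFu);
    float x;
    memcpy(&x, &bits, sizeof x);      /* copy the bit pattern unchanged */
    return x;
}
```

The 23-bit pattern 10000000000000000000000 is 0x400000 in hexadecimal, so make_float(0, 0x7F, 0x400000) gives 1.5.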

In general, the numeric value of a bit pattern in this format is given by the formula:

$$(-1)^{S} \times 2^{\mathrm{Exp} - \mathrm{0x7F}} \times \left(1 + \mathrm{Frac} \times 2^{-23}\right)$$

Numbers stored in this form are called normalized numbers.
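As a rough illustration, the formula for normalized numbers can be evaluated directly on a 32-bit pattern. The sketch below assumes that S is bit 31, Exp is bits 30 to 23, and Frac is bits 22 to 0, and that Exp is between 1 and 254. The names pow2 and decode_normalized are hypothetical. The result is computed in double, which can hold every single-precision value exactly.

```c
#include <stdint.h>

/* Compute 2^n by repeated multiplication (exact in double for this range). */
static double pow2(int n)
{
    double r = 1.0;
    for (; n > 0; n--) r *= 2.0;
    for (; n < 0; n++) r *= 0.5;
    return r;
}

/* Evaluate (-1)^S * 2^(Exp - 0x7F) * (1 + Frac * 2^-23).
   Only meaningful for normalized patterns, that is 1 <= Exp <= 254. */
static double decode_normalized(uint32_t bits)
{
    unsigned s    = bits >> 31;
    int      exp  = (int)((bits >> 23) & 0xFFu);
    uint32_t frac = bits & 0x7FFFFFu;
    double   sign = s ? -1.0 : 1.0;

    return sign * pow2(exp - 0x7F) * (1.0 + (double)frac * pow2(-23));
}
```

For example, the pattern 0xC0200000 has S = 1, Exp = 0x80, and Frac = 0x200000, so it decodes to -2.5.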

The minimum and maximum exponent values, 0 and 255, are special cases. Exponent 255 represents infinity and Not a Number (NaN) values. Infinity can occur as a result of dividing a nonzero number by zero, or as a result of computing a value that is too large to store in this format. NaN values represent the results of invalid operations, such as dividing zero by zero or taking the square root of a negative number. Infinity is stored by setting Exp to 255 and Frac to all zeros, with the S field giving its sign. If Exp is 255 and Frac is nonzero, the bit pattern represents a NaN.
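The Exp = 255 rules can be expressed as a small classifier over the raw bit pattern. This is a sketch under the assumption that S is bit 31, Exp is bits 30 to 23, and Frac is bits 22 to 0. The enum special_kind and the function classify_exp255 are hypothetical names introduced for illustration.

```c
#include <stdint.h>

enum special_kind { NOT_SPECIAL, POS_INFINITY, NEG_INFINITY, NAN_VALUE };

/* Classify a 32-bit pattern according to the Exp = 255 rules. */
static enum special_kind classify_exp255(uint32_t bits)
{
    unsigned exp  = (bits >> 23) & 0xFFu;
    uint32_t frac = bits & 0x7FFFFFu;

    if (exp != 0xFFu)
        return NOT_SPECIAL;           /* Exp is not 255 */
    if (frac != 0)
        return NAN_VALUE;             /* Exp = 255, Frac nonzero */
    return (bits >> 31) ? NEG_INFINITY : POS_INFINITY;  /* Frac all zeros */
}
```

Working on the bit pattern avoids comparing a NaN with itself, which is always false for floating-point values.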

Exponent 0 represents very small numbers in a special way. If Exp is zero, the Frac field has no implicit 1 on the front, and the number is scaled by the same power of two as when Exp is 1. This means that the format can store 0.0, by setting both Exp and Frac to all 0 bits. It also means that numbers that are too small to store using Exp >= 1 are stored with less precision than the ordinary 23 bits. These are called denormals, or subnormal numbers.
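The Exp = 0 case can be sketched in the same way. With no implicit 1, and with the same scaling as Exp = 1, the value is Frac multiplied by 2 to the power -149 (that is, -126 - 23). The code below assumes the caller has already checked that Exp is zero, and that S is bit 31 and Frac is bits 22 to 0. The name decode_exp0 is hypothetical.

```c
#include <stdint.h>

/* Decode a pattern whose Exp field is zero: either zero or a denormal.
   Value = (-1)^S * Frac * 2^-149, with no implicit leading 1.
   Assumption: the caller has verified that Exp (bits 30..23) is 0. */
static double decode_exp0(uint32_t bits)
{
    uint32_t frac  = bits & 0x7FFFFFu;
    double   sign  = (bits >> 31) ? -1.0 : 1.0;
    double   scale = 1.0;
    int      i;

    for (i = 0; i < 149; i++)
        scale *= 0.5;                 /* scale = 2^-149, exact in double */

    return sign * (double)frac * scale;
}
```

A pattern with only the lowest Frac bit set decodes to the smallest positive denormal, which carries a single significant bit.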

Non-Confidential ARM 100073_0608_00_en
Copyright © 2014–2017 ARM Limited or its affiliates. All rights reserved.