5.3 Half-precision floating-point number format

ARM® Compiler supports the half-precision floating-point __fp16 type.

Half-precision is a floating-point format that occupies 16 bits. Architectures that support half-precision floating-point numbers include:
  • The ARMv8 architecture.
  • The ARMv7 FPv5 architecture.
  • The ARMv7 VFPv4 architecture.
  • The ARMv7 VFPv3 architecture (as an optional extension).
If the target hardware does not support half-precision floating-point numbers, the compiler uses the floating-point library fplib to provide software support for half-precision.
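
As a rough sketch, code can check at compile time whether the target provides half-precision support in hardware or relies on the library fallback. The example assumes that the toolchain defines the ACLE feature-test macros __ARM_FP and __ARM_FP16_FORMAT_IEEE. The function names are illustrative only and are not part of any ARM library.

```c
/* Sketch only: relies on ACLE feature-test macros being defined by the
 * toolchain. On non-ARM targets both functions simply return 0. */

/* Returns 1 if the target reports hardware support for the 16-bit
 * floating-point type (assumed here to be bit 1 of __ARM_FP), else 0. */
int target_has_hw_half_precision(void)
{
#if defined(__ARM_FP) && (__ARM_FP & 0x2)
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if the compiler reports that __fp16 uses the IEEE format. */
int fp16_format_is_ieee(void)
{
#if defined(__ARM_FP16_FORMAT_IEEE)
    return 1;
#else
    return 0;
#endif
}
```

When target_has_hw_half_precision() returns 0 on an ARM target, the library path described above is expected to be in use.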

Note

The __fp16 type is a storage format only. For purposes of arithmetic and other operations, __fp16 values in C or C++ expressions are automatically promoted to float.
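
The following minimal sketch illustrates the storage-only behavior: values are kept in 16-bit storage, widened to float for the multiplication, and narrowed again on the store. The names half_t, scale_half, and promoted_expr_size are hypothetical. The float stand-in is an assumption made only so that the sketch still compiles with toolchains that do not define __ARM_FP16_FORMAT_IEEE.

```c
#include <stddef.h>

#if defined(__ARM_FP16_FORMAT_IEEE)
typedef __fp16 half_t;   /* real 16-bit storage type */
#else
typedef float half_t;    /* stand-in for toolchains without __fp16 */
#endif

/* Scale an array held in half-precision storage. */
void scale_half(half_t *data, size_t n, float factor)
{
    for (size_t i = 0; i < n; ++i) {
        float wide = data[i];               /* widened to float on load */
        data[i] = (half_t)(wide * factor);  /* float arithmetic, narrowed on store */
    }
}

/* The type of an arithmetic expression on half_t operands is float,
 * so this returns sizeof(float). */
size_t promoted_expr_size(void)
{
    half_t a = 1.0f;
    half_t b = 2.0f;
    return sizeof(a + b);
}
```

Because every operation is carried out in float, intermediate results are not rounded to half precision until they are stored back into a half_t object.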

Half-precision floating-point format

ARM Compiler uses the half-precision binary floating-point format defined by IEEE 754-2008, the revision of the IEEE 754 standard that was developed under the name IEEE 754r:
Figure 5-1 IEEE half-precision floating-point format

Where:
   S (bit[15]):      Sign bit
   E (bits[14:10]):  Biased exponent
   T (bits[9:0]):    Mantissa
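
The field layout can be sketched with plain shifts and masks on a 16-bit integer. The half_fields structure and the half_unpack and half_pack helpers are hypothetical names introduced for illustration.

```c
#include <stdint.h>

/* The three fields of a half-precision value. */
typedef struct {
    unsigned sign;      /* S, bit[15]     */
    unsigned exponent;  /* E, bits[14:10] */
    unsigned mantissa;  /* T, bits[9:0]   */
} half_fields;

/* Split a raw 16-bit pattern into S, E and T. */
half_fields half_unpack(uint16_t bits)
{
    half_fields f;
    f.sign     = (bits >> 15) & 0x1u;
    f.exponent = (bits >> 10) & 0x1Fu;
    f.mantissa = bits & 0x3FFu;
    return f;
}

/* Reassemble a raw 16-bit pattern from S, E and T. */
uint16_t half_pack(half_fields f)
{
    return (uint16_t)(((f.sign & 0x1u) << 15) |
                      ((f.exponent & 0x1Fu) << 10) |
                      (f.mantissa & 0x3FFu));
}
```

For example, the pattern 0xC400 unpacks to S = 1, E = 17, T = 0.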
The meanings of these fields are as follows:
IF E==31:
   IF T==0: Value = Signed infinity
   IF T!=0: Value = NaN
             T[9] determines Quiet or Signalling:
                  0: Signalling NaN
                  1: Quiet NaN
IF 0<E<31:
   Value = (-1)^S x 2^(E-15) x (1 + (2^(-10) x T))
IF E==0:
   IF T==0: Value = Signed zero
   IF T!=0: Value = (-1)^S x 2^(-14) x (0 + (2^(-10) x T))
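
The rules above can be sketched as a software decoder that turns a raw 16-bit pattern into a float. This is an illustrative sketch, not the fplib implementation. The names half_bits_to_float and pow2_int are hypothetical, and the NaN payload and the quiet or signalling distinction are not preserved.

```c
#include <math.h>    /* INFINITY, NAN */
#include <stdint.h>

/* Exact power of two for the small exponents needed here. */
static float pow2_int(int n)
{
    float r = 1.0f;
    for (; n > 0; --n) r *= 2.0f;
    for (; n < 0; ++n) r *= 0.5f;
    return r;
}

/* Decode a half-precision bit pattern by following the field rules. */
float half_bits_to_float(uint16_t h)
{
    unsigned s = (h >> 15) & 0x1u;    /* S */
    unsigned e = (h >> 10) & 0x1Fu;   /* E */
    unsigned t = h & 0x3FFu;          /* T */
    float sign = s ? -1.0f : 1.0f;
    float frac = (float)t / 1024.0f;  /* 2^(-10) x T */

    if (e == 31) {
        if (t == 0) {
            return sign * INFINITY;   /* signed infinity */
        }
        return NAN;                   /* NaN, payload dropped */
    }
    if (e == 0) {
        /* Signed zero (T == 0) or subnormal (T != 0). */
        return sign * pow2_int(-14) * frac;
    }
    /* Normal number: (-1)^S x 2^(E-15) x (1 + 2^(-10) x T). */
    return sign * pow2_int((int)e - 15) * (1.0f + frac);
}
```

Under this sketch, the pattern 0x3C00 decodes to 1.0, 0xC000 decodes to -2.0, and 0x7BFF decodes to 65504, the largest finite half-precision value.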

Note

See the ARM C Language Extensions for more information.
Related information
ARM C Language Extensions
Non-Confidential, ARM DUI0774E
Copyright © 2014-2016 ARM. All rights reserved.