Limits for floating-point numbers
The following tables give the characteristics, ranges, and limits for floating-point numbers as implemented in ARM C and C++. Note also:
- When a floating-point number is converted to a shorter floating-point number, it is rounded to the nearest representable number.
- The properties of floating-point arithmetic accord with IEEE 754.
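The rounding behavior on narrowing can be observed with a short C sketch. It assumes the default round-to-nearest mode and IEEE 754 single and double formats; `narrow_to_float` and `show_narrowing` are hypothetical names chosen for illustration, not part of any ARM library.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper: the double-to-float narrowing conversion itself. */
float narrow_to_float(double d)
{
    return (float)d;
}

/* Prints two narrowing results and checks them against round-to-nearest. */
void show_narrowing(void)
{
    /* FLT_EPSILON is the gap between 1.0f and the next float above it. */
    const double gap = (double)FLT_EPSILON;
    const double low = 1.0 + 0.25 * gap;   /* a quarter of the way up: nearer 1.0f */
    const double high = 1.0 + 0.75 * gap;  /* three quarters up: nearer the next float */

    float f_low = narrow_to_float(low);
    float f_high = narrow_to_float(high);

    printf("%.17g -> %.9g\n", low, (double)f_low);
    printf("%.17g -> %.9g\n", high, (double)f_high);

    /* Each input should land on whichever neighboring float is closer. */
    assert(f_low == 1.0f);
    assert(f_high == 1.0f + FLT_EPSILON);
}
```

Calling `show_narrowing()` from `main` prints both conversions. Both inputs are exactly representable as double, so the only rounding that occurs is the narrowing step.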
Table 3.13. Floating-point limits
| Constant | Meaning | Value |
|---|---|---|
| FLT_MAX | Maximum value of float | 3.40282347e+38F |
| FLT_MIN | Minimum normalized positive value of float | 1.17549435e-38F |
| DBL_MAX | Maximum value of double | 1.79769313486231571e+308 |
| DBL_MIN | Minimum normalized positive value of double | 2.22507385850720138e-308 |
| LDBL_MAX | Maximum value of long double | 1.79769313486231571e+308 |
| LDBL_MIN | Minimum normalized positive value of long double | 2.22507385850720138e-308 |
| FLT_MAX_EXP | Maximum value of base 2 exponent for type float | 128 |
| FLT_MIN_EXP | Minimum value of base 2 exponent for type float | -125 |
| DBL_MAX_EXP | Maximum value of base 2 exponent for type double | 1024 |
| DBL_MIN_EXP | Minimum value of base 2 exponent for type double | -1021 |
| LDBL_MAX_EXP | Maximum value of base 2 exponent for type long double | 1024 |
| LDBL_MIN_EXP | Minimum value of base 2 exponent for type long double | -1021 |
| FLT_MAX_10_EXP | Maximum value of base 10 exponent for type float | 38 |
| FLT_MIN_10_EXP | Minimum value of base 10 exponent for type float | -37 |
| DBL_MAX_10_EXP | Maximum value of base 10 exponent for type double | 308 |
| DBL_MIN_10_EXP | Minimum value of base 10 exponent for type double | -307 |
| LDBL_MAX_10_EXP | Maximum value of base 10 exponent for type long double | 308 |
| LDBL_MIN_10_EXP | Minimum value of base 10 exponent for type long double | -307 |
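These constants are defined in the standard header `<float.h>`, so a program can compare its toolchain against the table. The following is a minimal sketch, not an official ARM test: `float_limits_match_table` and `print_float_limits` are hypothetical names. The check covers only the `float` and `double` rows, because the `long double` rows apply only where `long double` has the same format as `double`, which is not the case on every host.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Returns 1 if <float.h> agrees with the float and double rows of the table. */
int float_limits_match_table(void)
{
    return FLT_MAX == 3.40282347e+38F
        && FLT_MIN == 1.17549435e-38F
        && DBL_MAX == 1.79769313486231571e+308
        && DBL_MIN == 2.22507385850720138e-308
        && FLT_MAX_EXP == 128 && FLT_MIN_EXP == -125
        && DBL_MAX_EXP == 1024 && DBL_MIN_EXP == -1021
        && FLT_MAX_10_EXP == 38 && FLT_MIN_10_EXP == -37
        && DBL_MAX_10_EXP == 308 && DBL_MIN_10_EXP == -307;
}

/* Prints the limits that this toolchain actually provides. */
void print_float_limits(void)
{
    /* The *_EXP rows are base 2 exponents, so a binary radix is assumed. */
    assert(FLT_RADIX == 2);

    printf("FLT_MAX  = %.9g, FLT_MIN  = %.9g\n", (double)FLT_MAX, (double)FLT_MIN);
    printf("DBL_MAX  = %.17g, DBL_MIN  = %.17g\n", DBL_MAX, DBL_MIN);
    printf("LDBL_MAX = %Lg, LDBL_MIN = %Lg\n", LDBL_MAX, LDBL_MIN);
}
```

On a toolchain that uses IEEE 754 single and double formats, `float_limits_match_table()` is expected to return 1. The `LDBL_` values printed by `print_float_limits()` may differ from the table on hosts with a wider `long double`.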
Table 3.14. Other floating-point characteristics
| Constant | Meaning | Value |
|---|---|---|
| FLT_RADIX | Base (radix) of the ARM floating-point number representation | 2 |
| FLT_ROUNDS | Rounding mode for floating-point numbers | 1 (nearest) |
| FLT_DIG | Decimal digits of precision for float | 6 |
| DBL_DIG | Decimal digits of precision for double | 15 |
| LDBL_DIG | Decimal digits of precision for long double | 15 |
| FLT_MANT_DIG | Binary digits of precision for type float | 24 |
| DBL_MANT_DIG | Binary digits of precision for type double | 53 |
| LDBL_MANT_DIG | Binary digits of precision for type long double | 53 |
| FLT_EPSILON | Difference between 1.0 and the next representable value greater than 1.0 for type float | 1.19209290e-7F |
| DBL_EPSILON | Difference between 1.0 and the next representable value greater than 1.0 for type double | 2.2204460492503131e-16 |
| LDBL_EPSILON | Difference between 1.0 and the next representable value greater than 1.0 for type long double | 2.2204460492503131e-16L |
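The relationship between `FLT_EPSILON` and `FLT_MANT_DIG` can be checked empirically. The sketch below assumes round-to-nearest with ties to even, as in IEEE 754 default rounding; `measure_float_epsilon` and `show_epsilon` are hypothetical names. It halves a candidate gap until adding half of it to 1.0f no longer changes the sum, counting one significand bit per halving.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/*
 * Returns the gap between 1.0f and the next float above it.
 * If mant_bits is non-null, it receives the number of significand bits found.
 */
float measure_float_epsilon(int *mant_bits)
{
    float eps = 1.0f;
    int bits = 1;  /* the leading 1 bit of the significand */

    for (;;) {
        /* volatile forces the sum to be stored as a float,
           dropping any excess precision held in registers */
        volatile float sum = 1.0f + eps / 2.0f;
        if (sum == 1.0f)
            break;      /* half the gap is no longer visible next to 1.0f */
        eps /= 2.0f;
        ++bits;
    }
    if (mant_bits)
        *mant_bits = bits;
    return eps;
}

/* Prints the measured values and checks them against <float.h>. */
void show_epsilon(void)
{
    int bits = 0;
    float eps = measure_float_epsilon(&bits);

    printf("measured epsilon = %.9g, significand bits = %d\n", (double)eps, bits);
    assert(eps == FLT_EPSILON);
    assert(bits == FLT_MANT_DIG);
}
```

The same loop written with `double` and `DBL_EPSILON` would be expected to stop after 52 halvings, matching `DBL_MANT_DIG`.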