

The contents of the NEON registers are *vectors* of *elements* of
the same data type. A vector is divided into *lanes* and
each lane contains a data value called an *element*.

The number of lanes in a NEON vector depends on the size of the vector and the size of the data elements in the vector.
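As a minimal sketch of that relationship, the lane count is the vector size divided by the element size. The helper name `lane_count` is hypothetical and exists only for this illustration; the sizes used are the 128-bit and 64-bit register views and the integer element sizes listed in this section.

```python
def lane_count(vector_bits, element_bits):
    """Number of lanes when a vector is split into equal-sized elements."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector size")
    return vector_bits // element_bits


# Tabulate the lane counts for both register widths.
for vector_bits in (128, 64):
    for element_bits in (8, 16, 32, 64):
        lanes = lane_count(vector_bits, element_bits)
        print(f"{vector_bits}-bit vector, {element_bits}-bit elements: {lanes} lanes")
```

For example, a 128-bit vector of 8-bit elements has 16 lanes, while a 64-bit vector of 32-bit elements has 2.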

Usually, each NEON instruction results in `n` operations occurring in parallel, where `n` is the number of lanes that the input vectors are divided into. There cannot be a carry or overflow from one lane to another. Ordering of elements in the vector is from the least significant bit. This means that element 0 uses the least significant bits of the register.
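The lane behavior described above can be modeled in plain Python by treating a register as a 128-bit integer. This is an illustrative sketch, not a model of any particular instruction: the function names are hypothetical, and the addition is assumed to wrap within each lane (modular arithmetic).

```python
def split_lanes(reg, element_bits, vector_bits=128):
    """Return the elements of reg, element 0 first (least significant bits)."""
    mask = (1 << element_bits) - 1
    count = vector_bits // element_bits
    return [(reg >> (i * element_bits)) & mask for i in range(count)]


def join_lanes(lanes, element_bits):
    """Pack elements back into one integer, element 0 in the lowest bits."""
    mask = (1 << element_bits) - 1
    reg = 0
    for i, value in enumerate(lanes):
        reg |= (value & mask) << (i * element_bits)
    return reg


def lanewise_add(a, b, element_bits, vector_bits=128):
    """Add lane by lane; each sum wraps inside its own lane, so nothing carries out."""
    mask = (1 << element_bits) - 1
    xs = split_lanes(a, element_bits, vector_bits)
    ys = split_lanes(b, element_bits, vector_bits)
    return join_lanes([(x + y) & mask for x, y in zip(xs, ys)], element_bits)


a = 0x00FF  # lane 0 = 0xFF, lane 1 = 0x00 (8-bit lanes)
b = 0x0001  # lane 0 = 0x01, lane 1 = 0x00
print(hex(lanewise_add(a, b, 8)))  # lane 0 wraps to 0x00 and lane 1 stays 0x00: prints 0x0
print(hex(a + b))                  # an ordinary integer add carries into bit 8: prints 0x100
```

The two results differ only because the ordinary add lets the carry out of bit 7 reach what would be lane 1.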

NEON and floating-point instructions operate on elements of the following types:

32-bit single precision and 64-bit double precision floating-point.

### Note

16-bit floating-point is supported, but only as a format to be converted from or to. It is not supported for data processing operations.

8-bit, 16-bit, 32-bit, or 64-bit unsigned and signed integers.

8-bit and 16-bit polynomials.

The polynomial type is for code, such as error correction, that uses power-of-two finite fields or simple polynomials over {0,1}. Normal ARM integer code typically uses a lookup table for finite field arithmetic. AArch64 NEON provides instructions to use large lookup tables.

Polynomial operations are hard to synthesize out of other operations, so it is useful to have a basic multiply operation from which other, larger operations can be synthesized.
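To make "polynomials over {0,1}" concrete, the sketch below multiplies two polynomials whose coefficients are the bits of an integer. Addition of coefficients is XOR, so no carries propagate between bit positions. The function name `poly_mul` is hypothetical, and the result is returned at full width; this is not meant to reproduce the exact result width of any specific instruction.

```python
def poly_mul(a, b):
    """Multiply two polynomials over {0,1}, each encoded as the bits of an integer."""
    result = 0
    while b:
        if b & 1:
            result ^= a  # add (XOR) the current shifted copy of a
        a <<= 1          # multiply a by x
        b >>= 1          # move to the next coefficient of b
    return result


# (x + 1) * (x + 1) = x^2 + 1 over {0,1}, because the two x terms cancel.
print(bin(poly_mul(0b11, 0b11)))  # prints 0b101
print(bin(0b11 * 0b11))           # ordinary integer multiply: prints 0b1001
```

Emulating this with ordinary integer instructions takes a shift, a test, and an XOR per bit, which is why a dedicated multiply is a useful building block.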

The NEON unit views the register file as:

32 × 128-bit quadword registers, `V0-V31`, each of which can be viewed as in Figure 7.1.

32 × 64-bit D, or doubleword, registers, `D0-D31`, each of which can be viewed as in Figure 7.2.

All of these registers are accessible at any time. Software does not have to explicitly switch between them because the instruction used determines the appropriate view.
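The idea that the same register bits can be read through different views can be sketched by reinterpreting one 128-bit value at several element sizes. The function name `views` is hypothetical, and the sketch assumes the AArch64 convention that a D register corresponds to the low 64 bits of the same-numbered V register, which this section does not state explicitly.

```python
import struct


def views(v):
    """Reinterpret one 128-bit value as lanes of different sizes."""
    raw = v.to_bytes(16, "little")  # byte 0 holds the least significant bits
    return {
        "16 x 8-bit": list(raw),
        "8 x 16-bit": list(struct.unpack("<8H", raw)),
        "4 x 32-bit": list(struct.unpack("<4I", raw)),
        "2 x 64-bit": list(struct.unpack("<2Q", raw)),
        # Assumed D view: the low 64 bits of the 128-bit value.
        "low 64 bits": v & ((1 << 64) - 1),
    }


v = 0x0F0E0D0C0B0A09080706050403020100
for name, lanes in views(v).items():
    print(name, lanes)
```

No data is moved or converted between the views; only the interpretation of the same bits changes, which mirrors how the instruction used determines the view.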