The intrinsics described in this section map closely to NEON instructions. Each section begins with a list of function prototypes, with a comment specifying an equivalent assembler instruction. The compiler selects an instruction that has the required semantics, but there is no guarantee that the compiler produces the listed instruction.
The intrinsics use a naming scheme that is similar to the NEON unified assembler syntax. That is, each intrinsic has the form:
<opname><flags>_<type>
An additional q flag is provided to specify
that the intrinsic operates on 128-bit vectors.
For example:
vmul_s16 multiplies two vectors of signed 16-bit values.
This compiles to VMUL.I16 d2, d0, d1
vaddl_u8 is a long add of two 64-bit vectors containing unsigned 8-bit values, resulting in a 128-bit vector of unsigned 16-bit values.
This compiles to VADDL.U8 q1, d0, d1
Registers other than those specified in these examples might be used. In addition, the compiler might perform optimization that in some way changes the instruction that the source code compiles to.
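As a rough illustration of the naming scheme, the following C sketch wraps vmul_s16, its q-flag counterpart vmulq_s16, and vaddl_u8 in small helper functions. The helper names (mul_s16x4, mul_s16x8, addl_u8x8), the EXAMPLE_HAVE_NEON macro, and the scalar fallback used when NEON is not available are assumptions made for this example and are not part of the intrinsics themselves.

```c
#include <stdint.h>

/* Assumption: the compiler defines one of these macros when NEON is enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1
#else
#define EXAMPLE_HAVE_NEON 0
#endif

/* Hypothetical helper: element-wise multiply of four signed 16-bit lanes. */
void mul_s16x4(const int16_t a[4], const int16_t b[4], int16_t out[4])
{
#if EXAMPLE_HAVE_NEON
    int16x4_t va = vld1_s16(a);            /* 64-bit vector, 4 x s16 */
    int16x4_t vb = vld1_s16(b);
    vst1_s16(out, vmul_s16(va, vb));       /* typically a VMUL.I16 on D registers */
#else
    int i;
    for (i = 0; i < 4; ++i)
        out[i] = (int16_t)(a[i] * b[i]);   /* scalar model of the same operation */
#endif
}

/* Hypothetical helper: the q flag selects the 128-bit form, 8 x s16. */
void mul_s16x8(const int16_t a[8], const int16_t b[8], int16_t out[8])
{
#if EXAMPLE_HAVE_NEON
    int16x8_t va = vld1q_s16(a);           /* 128-bit vector */
    int16x8_t vb = vld1q_s16(b);
    vst1q_s16(out, vmulq_s16(va, vb));     /* typically a VMUL.I16 on Q registers */
#else
    int i;
    for (i = 0; i < 8; ++i)
        out[i] = (int16_t)(a[i] * b[i]);
#endif
}

/* Hypothetical helper: long add, 8 x u8 + 8 x u8 -> 8 x u16. */
void addl_u8x8(const uint8_t a[8], const uint8_t b[8], uint16_t out[8])
{
#if EXAMPLE_HAVE_NEON
    uint8x8_t va = vld1_u8(a);             /* two 64-bit inputs */
    uint8x8_t vb = vld1_u8(b);
    vst1q_u16(out, vaddl_u8(va, vb));      /* 128-bit result, so a q-form store */
#else
    int i;
    for (i = 0; i < 8; ++i)
        out[i] = (uint16_t)(a[i] + b[i]);  /* sum is widened to 16 bits */
#endif
}
```

The long add shows why the result vector is twice as wide as the inputs: adding the unsigned 8-bit values 200 and 100 gives 300, which does not fit in 8 bits but is held exactly in a 16-bit lane.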
The intrinsic function prototypes in this section use the following type annotations:
__const(n)
the argument n must be a compile-time constant
__constrange(min, max)
the argument must be a compile-time constant in the range min to max
__transfersize(n)
the intrinsic loads n lanes from this pointer.
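To make the annotations more concrete, the sketch below uses two intrinsics whose prototypes are understood to carry them: vld1_s16, whose pointer argument is annotated __transfersize(4), and vget_lane_s16, whose lane argument is annotated __constrange(0,3). The helper name load4_get_lane3, the EXAMPLE_HAVE_NEON macro, and the scalar fallback are assumptions made for this example.

```c
#include <stdint.h>

/* Assumption: the compiler defines one of these macros when NEON is enabled. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define EXAMPLE_HAVE_NEON 1
#else
#define EXAMPLE_HAVE_NEON 0
#endif

/* Hypothetical helper: load four s16 lanes from ptr and return the last one. */
int16_t load4_get_lane3(const int16_t *ptr)
{
#if EXAMPLE_HAVE_NEON
    /* __transfersize(4): the load reads four lanes, so ptr must point to
       at least four valid int16_t elements. */
    int16x4_t v = vld1_s16(ptr);

    /* __constrange(0,3): the lane index must be a compile-time constant
       from 0 to 3; a run-time variable is not accepted here. */
    return vget_lane_s16(v, 3);
#else
    return ptr[3];                 /* scalar model of the same access */
#endif
}
```

Because the lane index is a compile-time constant, selecting a lane at run time would need a different approach, for example storing the vector to memory and indexing the resulting array.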
The NEON intrinsic function prototypes that use __fp16 are
only available for targets that have the NEON half-precision VFP
extension. To enable use of __fp16, use the --fp16_format command-line
option. See ‑‑fp16_format=format.
The intrinsics are grouped into: