12.23 __smlald intrinsic
This intrinsic inserts an SMLALD instruction into the instruction stream
generated by the compiler. It enables you to perform two signed 16-bit
multiplications, adding both results to a 64-bit accumulate operand.
Overflow is only possible as a result of the 64-bit addition. This
overflow is not detected if it occurs. Instead, the result wraps around
modulo 2^64.
unsigned long long __smlald(unsigned int val1, unsigned int val2, unsigned long long val3);
Where:
val1 holds the first halfword operands for each multiplication
val2 holds the second halfword operands for each multiplication
val3 holds the accumulate value.
The __smlald intrinsic returns the sum of the two products and the
accumulate value as a 64-bit result.
unsigned long long dual_multiply_accumulate(unsigned int val1, unsigned int val2, unsigned long long val3)
{
    unsigned long long res;
    res = __smlald(val1, val2, val3); /* p1 = val1[15:0] × val2[15:0]
                                         p2 = val1[31:16] × val2[31:16]
                                         sum = p1 + p2 + val3[63:0]
                                         res[63:32] = sum[63:32]
                                         res[31:0] = sum[31:0]
                                      */
    return res;
}
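
To check the arithmetic on a host that has no SMLALD instruction, the behavior described in this section can be modeled in portable C. The following is a minimal sketch and not part of the compiler: the names smlald_model and sign_extend16 are hypothetical, and the model assumes only what is stated above (signed 16-bit halfwords, a 64-bit accumulate, and wraparound modulo 2^64).

```c
#include <stdint.h>

/* Hypothetical helper: interpret the low 16 bits of h as a signed
   halfword, without relying on implementation-defined conversions. */
static int32_t sign_extend16(uint32_t h)
{
    return (int32_t)((h & 0xFFFFu) ^ 0x8000u) - 0x8000;
}

/* Hypothetical portable model of the __smlald intrinsic. */
static uint64_t smlald_model(uint32_t val1, uint32_t val2, uint64_t val3)
{
    /* p1 = val1[15:0] x val2[15:0], both treated as signed. */
    int64_t p1 = (int64_t)sign_extend16(val1) * sign_extend16(val2);

    /* p2 = val1[31:16] x val2[31:16], both treated as signed. */
    int64_t p2 = (int64_t)sign_extend16(val1 >> 16) * sign_extend16(val2 >> 16);

    /* Unsigned 64-bit addition wraps modulo 2^64, so overflow of the
       accumulate is silent, as described above. */
    return val3 + (uint64_t)p1 + (uint64_t)p2;
}
```

For example, smlald_model(0x00020003, 0x00040005, 10) evaluates to 33, that is 3 × 5 + 2 × 4 + 10. With an accumulate value of 0xFFFFFFFFFFFFFFFF and a product sum of 1, the model returns 0, which illustrates the undetected wraparound.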