12.24 __smlaldx intrinsic
This intrinsic inserts an
SMLALDX instruction into the instruction stream generated by the compiler. It enables
you to exchange the halfwords of the second operand, and perform
two signed 16-bit multiplications, adding both results to a 64-bit
accumulate operand. Overflow is only possible as a result of the
64-bit addition. This overflow is not detected if it occurs. Instead, the
result wraps around modulo 2^64.
unsigned long long __smlaldx(unsigned int val1, unsigned int val2, unsigned long long val3)
Where:
val1 holds the first halfword operands for each multiplication
val2 holds the second halfword operands for each multiplication
val3 holds the accumulate value.
The __smlaldx intrinsic returns the product
of each multiplication added to the accumulate value.
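The arithmetic described above can be sketched in portable C. This is a minimal illustration only, intended to show the halfword exchange and the modulo 2^64 wraparound. The names smlaldx_model and sext16 are hypothetical and are not part of the compiler, and the sketch makes no claim about the code the compiler generates for the intrinsic.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend a 16-bit halfword held in 0..0xFFFF. */
static int32_t sext16(uint32_t h)
{
    return (h & 0x8000u) ? (int32_t)h - 0x10000 : (int32_t)h;
}

/* Hypothetical reference model of the SMLALDX arithmetic.
   It is an illustration, not the intrinsic itself. */
static uint64_t smlaldx_model(uint32_t val1, uint32_t val2, uint64_t val3)
{
    /* Split each 32-bit operand into signed bottom and top halfwords. */
    int32_t v1_lo = sext16(val1 & 0xFFFFu);
    int32_t v1_hi = sext16(val1 >> 16);
    int32_t v2_lo = sext16(val2 & 0xFFFFu);
    int32_t v2_hi = sext16(val2 >> 16);

    /* The halfwords of the second operand are exchanged, so the bottom
       halfword of val1 meets the top halfword of val2 and vice versa. */
    int32_t p1 = v1_lo * v2_hi;
    int32_t p2 = v1_hi * v2_lo;

    /* Accumulate in 64 bits. Unsigned addition wraps modulo 2^64,
       so an overflow here goes undetected. */
    return val3 + (uint64_t)(int64_t)p1 + (uint64_t)(int64_t)p2;
}
```

For example, smlaldx_model(0x00020003, 0x00040005, 10) evaluates to 3 × 4 + 2 × 5 + 10 = 32.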
unsigned long long dual_multiply_accumulate(unsigned int val1, unsigned int val2, unsigned long long val3)
{
    unsigned long long res;
    res = __smlaldx(val1,val2,val3); /* p1 = val1[15:0] × val2[31:16]
                                        p2 = val1[31:16] × val2[15:0]
                                        sum = p1 + p2 + val3[63:0]
                                        res[63:32] = sum[63:32]
                                        res[31:0] = sum[31:0]
                                     */
    return res;
}