2.3.2. Instruction throughput and latency

Table 2.1 shows the throughput and latency, in cycles, of the FPU instructions.

Table 2.1. FPU instruction throughput and latency cycles

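Latency is listed as Fwd / Wbck. A single number is an entry that the table gives without that split.

| Old ARM assembler mnemonic | UAL | Single precision throughput | Single precision latency (Fwd / Wbck) | Double precision throughput | Double precision latency (Fwd / Wbck) |
|---|---|---|---|---|---|
| FADD, FSUB, FCVT, FSHTOD, FSHTOS, FSITOD, FSITOS, FTOSHD, FTOSHS, FTOSID, FTOSIS, FTOSL, FTOUH, FTOUI{Z}D, FTOUI{Z}S, FTOULD, FTOULS, FUHTOD, FUHTOS, FUITOD, FUITOS, FULTOD, FULTOS | VADD, VSUB, VCVT | 1 | 4 | 1 | 4 |
| FMUL, FNMUL | VMUL, VNMUL | 1 | 5 | 2 | 6 |
| FMAC, FNMAC, FMSC, FNMSC | VMLA, VMLS, VNMLS, VNMLA | 1 | 8 | 2 | 9 |
| FCPY, FABS, FNEG, FCONST | VMOV, VABS, VNEG, VMOV | 1 | 1 / 2 | 1 | 1 / 2 |
| FMRS, FMRR(S/D), FMRD(L/H) | VMOV[a] | 1 | - / 3[b] | 1 | - / 3[b] |
| FMSR, FM(S/D)RR, FMD(L/H)R | VMOV[c] | 1 | 1 / 2 | 1 | 1 / 2 |
| FMSTAT | VMRS | 1 | - / 3[b] | 1 | - / 3[b] |
| FDIV | VDIV | 10 | 15 | 20 | 25 |
| FSQRT | VSQRT | 13 | 17 | 28 | 32 |
| FCMP, FCMPE, FCMPZ, FCMPEZ | VCMP, VCMP{E}, VCMP{E}, VCMP{E} | 1 | 1 / 4 | 1 | 1 / 4 |
| - | FCVT(T/B).F16.F32 | 1 | 2 / 2 | - | - / - |
| - | FCVT(T/B).F32.F16 | 1 | - / 4 | - | - / - |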

[a] FPU to ARM.

[b] The writeback number for these instructions is given from an ARM core writeback point of view. It reflects the penalty of moving data from the FPU into the ARM core register file before the following ARM instruction can use the moved data.

[c] ARM to FPU.
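
As a rough illustration of how the throughput and latency columns interact, the following Python sketch estimates cycle counts for a run of identical single-precision instructions. It is a simplified model, not a description of the actual pipeline: it assumes that throughput is the number of cycles between issuing successive instructions of the same type, that latency is the number of cycles until a result can be consumed, and that nothing else stalls the FPU. The names `SP_TIMING`, `dependent_chain_cycles`, and `independent_stream_cycles` are hypothetical and exist only for this example. Only rows with a single latency value are used.

```python
# Single-precision (throughput, latency) pairs, in cycles, copied from Table 2.1.
# Only rows with a single latency value are included in this sketch.
SP_TIMING = {
    "VADD": (1, 4),
    "VMUL": (1, 5),
    "VMLA": (1, 8),
    "VDIV": (10, 15),
    "VSQRT": (13, 17),
}


def dependent_chain_cycles(op, count):
    """Estimate cycles when each instruction consumes the previous result.

    Assumption: every instruction waits the full latency of its predecessor.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    _, latency = SP_TIMING[op]
    return count * latency


def independent_stream_cycles(op, count):
    """Estimate cycles when the instructions have no data dependencies.

    Assumption: a new instruction issues every `throughput` cycles, and the
    last result becomes available `latency` cycles after its issue.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    throughput, latency = SP_TIMING[op]
    return (count - 1) * throughput + latency


if __name__ == "__main__":
    for op in ("VADD", "VMLA", "VDIV"):
        print(op, dependent_chain_cycles(op, 8), independent_stream_cycles(op, 8))
```

Under these assumptions, the gap between the two estimates suggests why interleaving independent operations tends to hide latency for pipelined instructions such as VMLA, while VDIV and VSQRT remain limited by their throughput.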


Copyright © 2008-2012 ARM. All rights reserved. ARM DDI 0408I
Non-Confidential ID091612