Because the CPU is pipelined, instructions overlap considerably. In a typical cycle, one instruction can be using the data path while the next is being decoded and the one after that is being fetched. For this reason, Table 6.23 lists the incremental number of cycles each instruction requires, rather than the total number of cycles for which the instruction uses part of the processor. The elapsed time, in cycles, for a routine can be calculated from the figures in Table 6.23. These figures assume that the instruction is actually executed.
An instruction whose condition is not met is not executed and takes one S-cycle, whatever its type. The cycle types N, S, I, and C are described in Bus cycle types.
In Table 6.23:
- b is the number of cycles spent in the coprocessor busy-wait loop
- m is:
  - 1 if bits [31:8] of the multiplier operand are all zero or all one
  - 2 if bits [31:16] of the multiplier operand are all zero or all one
  - 3 if bits [31:24] of the multiplier operand are all zero or all one
  - 4 otherwise
- n is the number of words transferred.
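The rule for m can be sketched in a few lines of Python. This is an illustrative helper, not part of the processor documentation: the name `multiplier_m` is hypothetical, the operand is assumed to be a 32-bit word with bits numbered 31 down to 0, the conditions are assumed to be tested in the order listed, and m is assumed to be 4 when none of them holds.

```python
def multiplier_m(multiplier: int) -> int:
    """Return m for a 32-bit multiplier operand (hypothetical helper)."""
    value = multiplier & 0xFFFFFFFF  # keep only bits [31:0]
    # Test bits [31:8], then [31:16], then [31:24]; the first match wins.
    for m, low_bit in ((1, 8), (2, 16), (3, 24)):
        top = value >> low_bit                # bits [31:low_bit]
        all_ones = (1 << (32 - low_bit)) - 1  # same field with every bit set
        if top == 0 or top == all_ones:
            return m
    return 4  # assumption: no early termination is possible
```

Under these assumptions, a small positive or small negative multiplier gives the lowest m, so placing the smaller operand in the multiplier position keeps the multiply short.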
Table 6.23. ARM instruction speed summary
| Instruction | Cycle count | Additional |
|---|---|---|
| Data Processing | S | +I for SHIFT(Rs); +S+N if R15 written |
| MSR, MRS | S | - |
| LDR | S+N+I | +S+N if R15 loaded |
| STR | 2N | - |
| LDM | nS+N+I | +S+N if R15 loaded |
| STM | (n-1)S+2N | - |
| SWP | S+2N+I | - |
| B, BL | 2S+N | - |
| SWI, trap | 2S+N | - |
| MUL | S+mI | - |
| MLA | S+(m+1)I | - |
| MULL | S+(m+1)I | - |
| MLAL | S+(m+2)I | - |
| CDP | S+bI | - |
| LDC, STC | (n-1)S+2N+bI | - |
| MCR | N+bI+C | - |
| MRC | S+(b+1)I+C | - |
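As a rough illustration of how the table can be used, the sketch below adds up the incremental cycle counts for a short routine. The function names, the dictionary layout, and the sample routine are all hypothetical. Only a subset of the instructions is covered, and the entries in the Additional column (for example, R15 being written or loaded) are deliberately not modeled.

```python
from collections import Counter

def cycles(op, n=1, m=1, b=0, executed=True):
    """Incremental cycles for one instruction, split by cycle type."""
    if not executed:
        return Counter(S=1)  # a skipped instruction costs one S-cycle
    table = {
        "DATA": Counter(S=1),
        "LDR": Counter(S=1, N=1, I=1),
        "STR": Counter(N=2),
        "LDM": Counter(S=n, N=1, I=1),
        "STM": Counter(S=n - 1, N=2),
        "B": Counter(S=2, N=1),
        "MUL": Counter(S=1, I=m),
        "MLA": Counter(S=1, I=m + 1),
        "CDP": Counter(S=1, I=b),
    }
    return table[op]

def routine_cycles(routine):
    """Sum the per-instruction counts for a list of (op, params) pairs."""
    total = Counter()
    for op, params in routine:
        total.update(cycles(op, **params))
    return total

# Hypothetical routine: load, multiply (m = 2), store, branch.
routine = [("LDR", {}), ("MUL", {"m": 2}), ("STR", {}), ("B", {})]
total = routine_cycles(routine)
```

For this sample routine the sketch yields 4 S-cycles, 4 N-cycles, and 3 I-cycles. Converting those counts into elapsed time still depends on how long each cycle type takes in the memory system being used.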