B.3. Load and store instructions

Load and store instructions are classed as:

For load multiple and store multiple instructions, the number of registers in the register list usually determines the number of cycles required to execute a load or store instruction.

The Cortex-A9 processor has special paths that immediately forward data from a load instruction to a subsequent data processing instruction in the execution units.

These paths are used when the following conditions are met:

Table B.2 shows cycle timing for single load and store operations. The result latency is the latency of the first loaded register.

Table B.2. Single load and store operation cycle timings

Instruction               AGU cycles  Result latency
                                      Fast forward cases  Other cases
------------------------  ----------  ------------------  -----------
LDR ,[reg]                1           2                   3
LDR ,[reg imm]            1           2                   3
LDR ,[reg reg]            1           2                   3
LDR ,[reg reg LSL #2]     1           2                   3

LDR ,[reg reg LSL reg]    1           3                   4
LDR ,[reg reg LSR reg]    1           3                   4
LDR ,[reg reg ASR reg]    1           3                   4
LDR ,[reg reg ROR reg]    1           3                   4
LDR ,[reg reg, RRX]       1           3                   4

LDRB ,[reg]               2           3                   4
LDRB ,[reg imm]           2           3                   4
LDRB ,[reg reg]           2           3                   4
LDRB ,[reg reg LSL #2]    2           3                   4
LDRH ,[reg]               2           3                   4
LDRH ,[reg imm]           2           3                   4
LDRH ,[reg reg]           2           3                   4
LDRH ,[reg reg LSL #2]    2           3                   4

LDRB ,[reg reg LSL reg]   2           4                   5
LDRB ,[reg reg ASR reg]   2           4                   5
LDRB ,[reg reg LSL reg]   2           4                   5
LDRB ,[reg reg ASR reg]   2           4                   5
LDRH ,[reg reg LSL reg]   2           4                   5
LDRH ,[reg reg ASR reg]   2           4                   5
LDRH ,[reg reg LSL reg]   2           4                   5
LDRH ,[reg reg ASR reg]   2           4                   5
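
The four timing groups in Table B.2 can be captured in a small lookup, which may be convenient when estimating load latencies in a script. The following is a minimal Python sketch, not part of the ARM documentation. The names LoadTiming, single_load_timing, and the complex_offset flag are hypothetical. Here complex_offset is assumed to mean a register offset shifted by a register amount, or RRX; the simple forms are [reg], [reg imm], [reg reg], and [reg reg LSL #2]. Addressing modes that Table B.2 does not list are outside the scope of the sketch.

```python
from typing import NamedTuple


class LoadTiming(NamedTuple):
    """One row group of Table B.2 (hypothetical container)."""
    agu_cycles: int
    fast_forward_latency: int
    other_latency: int


# Values transcribed from Table B.2. The dictionary keys are illustrative
# labels introduced for this sketch, not ARM terminology.
_TABLE_B2 = {
    ("word", False): LoadTiming(1, 2, 3),    # LDR, simple addressing
    ("word", True): LoadTiming(1, 3, 4),     # LDR, register-shifted or RRX
    ("narrow", False): LoadTiming(2, 3, 4),  # LDRB/LDRH, simple addressing
    ("narrow", True): LoadTiming(2, 4, 5),   # LDRB/LDRH, register-shifted
}


def single_load_timing(mnemonic: str, complex_offset: bool) -> LoadTiming:
    """Return the Table B.2 timings for a single load.

    mnemonic       -- "LDR", "LDRB", or "LDRH" (case-insensitive)
    complex_offset -- True for the register-shifted or RRX offset forms
    """
    name = mnemonic.upper()
    if name not in ("LDR", "LDRB", "LDRH"):
        raise ValueError(f"mnemonic not covered by Table B.2: {mnemonic}")
    width = "word" if name == "LDR" else "narrow"
    return _TABLE_B2[(width, complex_offset)]
```

For example, single_load_timing("LDRH", True).other_latency gives the other-cases result latency of a halfword load with a register-shifted offset.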

The Cortex-A9 processor can load or store two 32-bit registers in each cycle. However, to access 64 bits, the address must be 64-bit aligned.

This scheduling is done in the Address Generation Unit (AGU). The number of cycles required by the AGU to process the load multiple or store multiple operations depends on the length of the register list and the 64-bit alignment of the address. The resulting latency is the latency of the first loaded register. Table B.3 shows the cycle timings for load multiple operations.

Table B.3. Load multiple operations cycle timings

Instruction          AGU cycles to process the instruction  Resulting latency
                     Address aligned on a 64-bit boundary
                     Yes                No                  Fast forward case  Other cases
-------------------  -----------------  ------------------  -----------------  -----------
LDM ,{1 register}    1                  1                   2                  3
LDM ,{2 registers}   1                  2                   2                  3
LDRD                 1                  2                   2                  3
RFE                  1                  2                   2                  3
LDM ,{3 registers}   2                  2                   2                  3
LDM ,{4 registers}   2                  3                   2                  3
LDM ,{5 registers}   3                  3                   2                  3
LDM ,{6 registers}   3                  4                   2                  3
LDM ,{7 registers}   4                  4                   2                  3
LDM ,{8 registers}   4                  5                   2                  3
LDM ,{9 registers}   5                  5                   2                  3
LDM ,{10 registers}  5                  6                   2                  3
LDM ,{11 registers}  6                  6                   2                  3
LDM ,{12 registers}  6                  7                   2                  3
LDM ,{13 registers}  7                  7                   2                  3
LDM ,{14 registers}  7                  8                   2                  3
LDM ,{15 registers}  8                  8                   2                  3
LDM ,{16 registers}  8                  9                   2                  3
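
The AGU cycle counts in Table B.3 follow a regular pattern, which the Python sketch below reproduces. The closed-form expressions are inferred from the table rows and are not stated in the source. One plausible reading, offered as an assumption, is that an aligned transfer moves registers in pairs, while an unaligned transfer moves the first register alone and then proceeds in pairs. The function names ldm_agu_cycles and is_64bit_aligned are hypothetical.

```python
def is_64bit_aligned(address: int) -> bool:
    """Return True if the address lies on a 64-bit (8-byte) boundary."""
    return address % 8 == 0


def ldm_agu_cycles(num_registers: int, address: int) -> int:
    """Estimate AGU cycles for a load multiple, following Table B.3.

    The formulas are fitted to the table rows (an assumption of this
    sketch, not an ARM-documented rule):
      aligned:     ceil(n / 2)       two registers per cycle
      not aligned: floor(n / 2) + 1  one register first, then pairs
    """
    if not 1 <= num_registers <= 16:
        raise ValueError("register list must hold 1 to 16 registers")
    if is_64bit_aligned(address):
        return (num_registers + 1) // 2
    return num_registers // 2 + 1
```

Under this reading, a misaligned base address costs one extra AGU cycle only when the register count is even. With an odd count, the aligned and unaligned columns of the table agree.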

Table B.4 shows the cycle timings of store multiple operations.

Table B.4. Store multiple operations cycle timings

Instruction          AGU cycles
                     Aligned on a 64-bit boundary
                     Yes  No
-------------------  ---  --
STM ,{1 register}    1    1
STM ,{2 registers}   1    2
STRD                 1    2
SRS                  1    2
STM ,{3 registers}   2    2
STM ,{4 registers}   2    3
STM ,{5 registers}   3    3
STM ,{6 registers}   3    4
STM ,{7 registers}   4    4
STM ,{8 registers}   4    5
STM ,{9 registers}   5    5
STM ,{10 registers}  5    6
STM ,{11 registers}  6    6
STM ,{12 registers}  6    7
STM ,{13 registers}  7    7
STM ,{14 registers}  7    8
STM ,{15 registers}  8    8
STM ,{16 registers}  8    9

Copyright © 2008-2009 ARM. All rights reserved. ARM DDI 0388E
Non-Confidential