This section describes the cycle timing behavior for LDR, LDRT, LDRB, LDRBT, LDRSB, LDRH, LDRSH, STR, STRT, STRB, STRBT, STRH, and PLD instructions.
Table 16.13 shows the cycle timing behavior for stores and loads, other than loads to the PC.
You can replace LDR with any of the above single load or store instructions. The following rules apply:
They are single-cycle issue if a constant offset is used, or if a positive register offset with no shift or a shift of LSL #2 is used. Both the base and any offset register are Early Regs.
They are two-cycle issue if either a negative register offset or a shift other than LSL #2 is used. Only the offset register is an Early Reg.
If ARMv6 unaligned support is enabled, an access to an address that is not aligned to the access size generates two accesses to memory, and so consumes the load/store unit for an additional cycle. This extra cycle is required whenever the base or the offset is not aligned to the access size, because the final address is then potentially unaligned, even if it turns out to be aligned.
If ARMv6 unaligned support is enabled and the final access address is unaligned, there is an extra cycle of result latency.
PLD (data preload hint) instructions have the same cycle timing behavior as load instructions. Because they have no destination register, result latency is not applicable to them. Because all levels of cache treat a PLD instruction like any other load instruction, standard data-dependency rules and eviction procedures are followed. The PLD instruction is ignored in case of an address translation fault, a cache hit, or an abort during any stage of PLD execution. Only use the PLD instruction to preload from cacheable Normal memory.
The updated base register has a result latency of one. For back-to-back load/store instructions with base write back, the updated base is available to the following load/store instruction with a result latency of 0.
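The two issue rules above can be sketched as a small classifier. This is only an illustrative model, not part of the processor documentation: the `AddrMode` type, its field names, and the `issue_timing` function are hypothetical, and the sketch assumes that the sign of a constant offset does not affect issue timing, because the rules mention only negative register offsets.

```python
from typing import NamedTuple, Optional, Tuple


class AddrMode(NamedTuple):
    """Hypothetical description of a single load/store addressing mode."""
    register_offset: bool                     # False means a constant offset
    negative: bool = False                    # True for a subtracted register offset (-Rm)
    shift: Optional[Tuple[str, int]] = None   # e.g. ("LSL", 2); None means no shift


def issue_timing(mode: AddrMode) -> Tuple[int, Tuple[str, ...]]:
    """Return (issue cycles, Early Regs) according to the two issue rules."""
    if not mode.register_offset:
        # Constant offset: single-cycle issue, and only the base is an Early Reg.
        return 1, ("Rn",)
    if not mode.negative and mode.shift in (None, ("LSL", 2)):
        # Positive register offset, unshifted or LSL #2: single-cycle issue.
        return 1, ("Rn", "Rm")
    # Negative register offset, or any other shift: two-cycle issue,
    # and only the offset register is an Early Reg.
    return 2, ("Rm",)
```

For example, `issue_timing(AddrMode(register_offset=True, shift=("LSL", 3)))` falls through to the two-cycle case, because the shift is not LSL #2.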
Table 16.13. Cycle timing behavior for stores and loads, other than loads to the PC
| Example instruction | Cycles | Memory cycles | Result latency | Comments |
|---|---|---|---|---|
| LDR <Rd>, <addr_md_1cycle>[1] | 1 | 1 | 3 | Legacy access / ARMv6 aligned access |
| LDR <Rd>, <addr_md_2cycle>[1] | 2 | 2 | 4 | Legacy access / ARMv6 aligned access |
| LDR <Rd>, <addr_md_1cycle>[1] | 1 | 2 | 3 | Potentially ARMv6 unaligned access |
| LDR <Rd>, <addr_md_2cycle>[1] | 2 | 3 | 4 | Potentially ARMv6 unaligned access |
| LDR <Rd>, <addr_md_1cycle>[1] | 1 | 2 | 4 | ARMv6 unaligned access |
| LDR <Rd>, <addr_md_2cycle>[1] | 1 | 2 | 4 | ARMv6 unaligned access |
[1] See Table 16.15 for an explanation of <addr_md_1cycle> and <addr_md_2cycle>.
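As a rough illustration of how the alignment rules map onto the <addr_md_1cycle> rows of Table 16.13, the following sketch computes memory cycles and result latency for a one-cycle-issue load. The function name and parameters are hypothetical. The sketch assumes offset or pre-indexed addressing, so that the access address is base plus offset, and it deliberately does not model the two-cycle addressing modes.

```python
from typing import Tuple


def one_cycle_load_timing(base: int, offset: int, size: int,
                          unaligned_support: bool = True) -> Tuple[int, int]:
    """Return (memory cycles, result latency) for a one-cycle-issue load.

    base and offset are the values that form the address; size is the
    access size in bytes. Assumes the access address is base + offset.
    """
    memory_cycles, result_latency = 1, 3  # aligned row of Table 16.13
    if unaligned_support:
        # The extra memory cycle depends on the inputs to the address
        # calculation, not on the final address.
        if base % size != 0 or offset % size != 0:
            memory_cycles += 1
        # The extra cycle of result latency applies only when the final
        # address really is unaligned.
        if (base + offset) % size != 0:
            result_latency += 1
    return memory_cycles, result_latency
```

A word load from base 0x1002 with offset 2 lands on the aligned address 0x1004, yet still costs the extra memory cycle because the base itself is not word-aligned. This corresponds to the "Potentially ARMv6 unaligned access" row.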
Table 16.14 shows the cycle timing behavior for loads to the PC.
Table 16.14. Cycle timing behavior for loads to the PC
| Example instruction | Cycles | Memory cycles | Result latency | Comments |
|---|---|---|---|---|
| LDR pc, [sp, #cns] (!) | 4 | 1 | - | Correctly return stack predicted |
| LDR pc, [sp], #cns | 4 | 1 | - | Correctly return stack predicted |
| LDR pc, [sp, #cns] (!) | 9 | 1 | - | Return stack mispredicted |
| LDR pc, [sp], #cns | 9 | 1 | - | Return stack mispredicted |
| LDR <cond> pc, [sp, #cns] (!) | 8 | 1 | - | Conditional return, or empty return stack |
| LDR <cond> pc, [sp], #cns | 8 | 1 | - | Conditional return, or empty return stack |
| LDR pc, <addr_md_1cycle>[1] | 8 | 1 | - | - |
| LDR pc, <addr_md_2cycle>[1] | 9 | 2 | - | - |
[1] See Table 16.15 for an explanation of <addr_md_1cycle> and <addr_md_2cycle>.
Only cycle times for aligned accesses are given because unaligned accesses to the PC are not supported.
The ARM1136JF-S processor includes a three-entry return stack that can predict procedure returns. Any load to the PC that uses an immediate offset and the stack pointer, r13, as the base register is considered a procedure return.
For cycle counts when the condition code fails, you must use the cycles for the non-PC destination variants.
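The cycle counts in Table 16.14 can be summarized as a small decision function. This is a sketch for reasoning about the table only: the function and parameter names are hypothetical, and it covers only loads to the PC whose condition code passes.

```python
def pc_load_cycles(base_is_sp: bool, immediate_offset: bool,
                   conditional: bool = False,
                   return_stack_empty: bool = False,
                   predicted_correctly: bool = True,
                   two_cycle_mode: bool = False) -> int:
    """Return the cycle count for a load to the PC, following Table 16.14."""
    if base_is_sp and immediate_offset:
        # Recognized as a procedure return, so the return stack is involved.
        if conditional or return_stack_empty:
            return 8
        return 4 if predicted_correctly else 9
    # Any other load to the PC: the count depends only on the addressing mode.
    return 9 if two_cycle_mode else 8
```

The gap between 4 and 9 cycles in this model is the difference between a correct return stack prediction and a misprediction for an unconditional return.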
Table 16.15 explains the <addr_md_1cycle> and <addr_md_2cycle> addressing modes used in Table 16.13 and Table 16.14.
Table 16.15. <addr_md_1cycle> and <addr_md_2cycle> LDR example instruction
| Example instruction | Early Reg | Comment |
|---|---|---|
| <addr_md_1cycle> | | |
| LDR <Rd>, [<Rn>, #cns] (!) | <Rn> | If an immediate offset, or a positive register offset with no shift or shift LSL #2, then one issue cycle. |
| LDR <Rd>, [<Rn>, <Rm>] (!) | <Rn>, <Rm> | |
| LDR <Rd>, [<Rn>, <Rm>, LSL #2] (!) | <Rn>, <Rm> | |
| LDR <Rd>, [<Rn>], #cns | <Rn> | |
| LDR <Rd>, [<Rn>], <Rm> | <Rn>, <Rm> | |
| LDR <Rd>, [<Rn>], <Rm>, LSL #2 | <Rn>, <Rm> | |
| <addr_md_2cycle> | | |
| LDR <Rd>, [<Rn>, -<Rm>] (!) | <Rm> | If a negative register offset, or a shift other than LSL #2, then two issue cycles. |
| LDR <Rd>, [<Rn>, -<Rm> <shf> <cns>] (!) | <Rm> | |
| LDR <Rd>, [<Rn>], -<Rm> | <Rm> | |
| LDR <Rd>, [<Rn>], -<Rm> <shf> <cns> | <Rm> | |