### 9.11.1. Interlocks

Unaligned word loads, load byte (`LDRB`), and load halfword (`LDRH`) instructions use the byte rotate unit in the Write stage of the pipeline. This introduces a single-cycle load-use interlock that can affect either of the two instructions immediately following the load instruction.

The following example incurs a single-cycle interlock:

```
LDRB    r0, [r1, #1]
ADD     r2, r0, r3
ORR     r4, r4, r5
```

The following example also incurs a single-cycle interlock, because the `ADD` that uses `r0` is the second instruction after the load:

```
LDRB    r0, [r1, #1]
ORR     r4, r4, r5
ADD     r2, r0, r3
```

When an interlock has been incurred for one instruction, it does not have to be incurred for a later instruction. For example, the following sequence incurs a single-cycle interlock on the first `ADD` instruction, but the second `ADD` does not incur any interlock:

```
LDRB    r0, [r1, #1]
ADD     r2, r0, r3
ADD     r4, r0, r5
```

A single-cycle interlock refers to the number of unwaited clock cycles to which the interlock applies. If a multi-cycle instruction separates a load instruction and the instruction using the result of the load, then no interlock can apply. The following example does not incur an interlock:

```
LDRB    r0, [r1]
MUL     r6, r7, r8
ADD     r4, r0, r5
```
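The rules above can be captured in a small model. The following Python sketch is illustrative only and is not part of the processor documentation: the tuple encoding of instructions, the `interlock_cycles` helper, and the `MULTI_CYCLE` set are assumptions made for this example. It covers only `LDRB` and `LDRH`, because detecting an unaligned word load needs the run-time address, and it ignores the case where the loaded register is overwritten before it is read.

```python
# Hypothetical model of the byte-rotate load-use interlock described above.
# Each instruction is (mnemonic, destination register, source registers).
ROTATED_LOADS = {"LDRB", "LDRH"}  # unaligned word loads are not modelled
MULTI_CYCLE = {"MUL"}             # assumption: MUL is multi-cycle, as in the example


def interlock_cycles(program):
    """Count the single-cycle interlocks a sequence incurs under this model."""
    stalls = 0
    for i, (op, dest, _srcs) in enumerate(program):
        if op not in ROTATED_LOADS:
            continue
        # Only the two instructions immediately after the load can be affected.
        for later_op, _later_dest, later_srcs in program[i + 1:i + 3]:
            if dest in later_srcs:
                stalls += 1  # incurred once, not repeated for later users
                break
            if later_op in MULTI_CYCLE:
                break        # the extra cycles separate the load from its user
    return stalls


LOAD = ("LDRB", "r0", ("r1",))
adjacent_use = [LOAD, ("ADD", "r2", ("r0", "r3")), ("ORR", "r4", ("r4", "r5"))]
second_use = [LOAD, ("ORR", "r4", ("r4", "r5")), ("ADD", "r2", ("r0", "r3"))]
repeated_use = [LOAD, ("ADD", "r2", ("r0", "r3")), ("ADD", "r4", ("r0", "r5"))]
mul_between = [LOAD, ("MUL", "r6", ("r7", "r8")), ("ADD", "r4", ("r0", "r5"))]

for name, seq in [("adjacent use", adjacent_use), ("second use", second_use),
                  ("repeated use", repeated_use), ("MUL between", mul_between)]:
    print(name, interlock_cycles(seq))
```

Under these assumptions the first three sequences each report one interlock cycle and the sequence with the intervening `MUL` reports none, matching the four examples above.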

Table 9.17 shows the cycle timing for basic load register operations.

Table 9.17. Cycle timings for basic load register operations

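| Case | Cycle | Address | Data | Cycle type |
|---|---|---|---|---|
| Normal case | 1 | da | (pc+2i) | N cycle |
| | 2 | pc+3i | (da) | N cycle |
| | | | (pc+3i) | |
| Scaled offset | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | da | - | N cycle |
| | 3 | pc+3i | (da) | N cycle |
| | | | (pc+3i) | |
| dest=pc | 1 | da | (pc+2i) | N cycle |
| | 2 | pc+3i | (da) | I cycle |
| | 3 | pc' | - | N cycle |
| | 4 | pc'+i | (pc') | S cycle |
| | 5 | pc'+2i | (pc'+i) | S cycle |
| | | | (pc'+2i) | |
| Scaled offset dest=pc | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | da | - | N cycle |
| | 3 | pc+3i | (da) | I cycle |
| | 4 | pc' | - | N cycle |
| | 5 | pc'+i | (pc') | S cycle |
| | 6 | pc'+2i | (pc'+i) | S cycle |
| | | | (pc'+2i) | |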

Table 9.18 shows the cycle timing for load operations resulting in simple interlocks.

Table 9.18. Cycle timings for load operations resulting in simple interlocks

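| Case | Cycle | Address | Data | Cycle type |
|---|---|---|---|---|
| Single-cycle interlock | 1 | da | (pc+2i) | N cycle |
| | 2 | pc+3i | (da) | I cycle |
| | 3 | pc+3i | - | N cycle |
| | | | (pc+3i) | |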
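The numbered cycles in Tables 9.17 and 9.18 follow a simple additive pattern, which the following Python sketch summarizes. The `load_cycles` function and its parameter names are hypothetical and exist only for this illustration. The tables list the interlock only for the normal case, so combining `interlock=True` with the other options is an extrapolation rather than a documented timing.

```python
def load_cycles(scaled_offset=False, dest_is_pc=False, interlock=False):
    """Numbered cycles for a load register operation, per Tables 9.17 and 9.18."""
    cycles = 2       # normal case: cycles 1 and 2
    if scaled_offset:
        cycles += 1  # scaled offset: one extra I cycle before the access
    if dest_is_pc:
        cycles += 3  # dest=pc: three extra cycles to refill from pc'
    if interlock:
        cycles += 1  # single-cycle interlock: one extra I cycle (Table 9.18)
    return cycles


print(load_cycles())                                     # normal case
print(load_cycles(scaled_offset=True))                   # scaled offset
print(load_cycles(dest_is_pc=True))                      # dest=pc
print(load_cycles(scaled_offset=True, dest_is_pc=True))  # scaled offset, dest=pc
print(load_cycles(interlock=True))                       # single-cycle interlock
```

The five calls correspond, in order, to the four cases in Table 9.17 and the single case in Table 9.18.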

With more complicated interlock cases you cannot consider the load instruction in isolation. This is because in these cases the load instruction has vacated the Execute stage of the pipeline and a later instruction has occupied it.

Table 9.19 shows the one-cycle interlock incurred for the following sequence of instructions:

```
LDRB    r0, [r1]
ADD     r6, r6, r3
ADD     r2, r0, r1
```

Table 9.19. Cycle timings for an example LDRB, ADD and ADD sequence

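| Instruction | Cycle | Address | Data | Cycle type |
|---|---|---|---|---|
| `LDRB r0, [r1]` | 1 | da | (pc+2i) | N cycle |
| | 2 | pc+3i | (da) | N cycle |
| `ADD r6, r6, r3` | 3 | pc+4i | (pc+3i) | I cycle |
| | 4 | pc+4i | - | S cycle |
| `ADD r2, r0, r1` | 5 | pc+5i | (pc+4i) | S cycle |
| | | | (pc+5i) | |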

Table 9.20 shows the cycle timing for the following code sequence:

```
LDRB    r0, [r2]
STMIA   r3, {r0-r1}
```

Table 9.20. Cycle timings for an example LDRB and STMIA sequence

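| Instruction | Cycle | Address | Data | Cycle type |
|---|---|---|---|---|
| `LDRB r0, [r2]` | 1 | da | (pc+2i) | N cycle |
| `STMIA r3, {r0-r1}` | 3 | pc+4i | (pc+3i) | I cycle |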