
The multiply unit in the ARM7EJ-S processor operates in both the Execute and Memory stages of the pipeline. For this reason, the multiplier result is not available until the end of the Memory stage of the pipeline. If the following instruction requires the use of the multiplier result, then it must be interlocked so that the correct value is available. This applies to all instructions that require the multiply result for the first Execute cycle or first Memory cycle of the instruction, except for multiply accumulate instructions using the previous multiply result as the accumulator operand.

For example, the following sequence incurs a single-cycle interlock:

    MUL r0, r1, r2
    SUB r4, r0, r3

The following sequence also incurs a single-cycle interlock:

    MLA r0, r1, r2, r3
    STR r0, [r8]

The following example does not incur an interlock:

    MLA r0, r1, r2, r0
    MLA r0, r3, r4, r0
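The interlock rule illustrated by these three sequences can be sketched as a small Python check. This is a simplified model, not a description of the processor's implementation: the `Instr` class, its field names, and the `incurs_interlock` function are hypothetical, and the sketch assumes that every source operand of the following instruction is needed in its first Execute or first Memory cycle. The accumulator operand of a multiply accumulate is kept in a separate field so that the documented exception applies.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Multiply mnemonics covered by this sketch (assumption: only MUL and MLA).
MULTIPLIES = {"MUL", "MLA"}


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record used only for this illustration."""
    op: str
    dest: Optional[str]               # destination register, None for stores
    sources: Tuple[str, ...] = ()     # operands assumed to be needed early
    accumulator: Optional[str] = None # accumulator operand of MLA, if any


def incurs_interlock(prev: Instr, nxt: Instr) -> bool:
    """Return True if nxt must wait one cycle for the result of prev."""
    if prev.op not in MULTIPLIES or prev.dest is None:
        return False
    # The accumulator operand is deliberately not checked: a multiply
    # accumulate may use the previous multiply result there without stalling.
    return prev.dest in nxt.sources


mul = Instr("MUL", "r0", ("r1", "r2"))
sub = Instr("SUB", "r4", ("r0", "r3"))
mla_a = Instr("MLA", "r0", ("r1", "r2"), accumulator="r3")
store = Instr("STR", None, ("r0", "r8"))
mla_b = Instr("MLA", "r0", ("r1", "r2"), accumulator="r0")
mla_c = Instr("MLA", "r0", ("r3", "r4"), accumulator="r0")

print(incurs_interlock(mul, sub))      # True
print(incurs_interlock(mla_a, store))  # True
print(incurs_interlock(mla_b, mla_c))  # False
```

The three calls correspond to the three example sequences, in order. A real scheduler would also need to know which operands each instruction reads in which pipeline stage, which this sketch does not model.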

Table 9.10 shows the cycle timing for `MUL` and `MLA` instructions with and without interlocks.

**Table 9.10. Cycle timing for MUL and MLA**

| | Cycle | ADDR | RDATA | TRANS |
|---|---|---|---|---|
| Normal | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | pc+3i | - | S cycle |
| | | | (pc+3i) | |
| Interlock | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | pc+3i | - | I cycle |
| | 3 | pc+3i | - | S cycle |
| | | | (pc+3i) | |

The `MULS` and `MLAS` instructions always take four cycles to execute, and cannot generate interlocks in following instructions.

Table 9.11 shows the cycle timing for `MULS` and `MLAS` instructions.

**Table 9.11. Cycle timings for MULS and MLAS**

| Cycle | ADDR | RDATA | TRANS |
|---|---|---|---|
| 1 | pc+3i | (pc+2i) | I cycle |
| 2 | pc+3i | - | I cycle |
| 3 | pc+3i | - | I cycle |
| 4 | pc+3i | - | S cycle |
| | | (pc+3i) | |

Table 9.12 shows the cycle timing for `SMULL`, `UMULL`, `SMLAL`, and `UMLAL` instructions with and without interlocks.

**Table 9.12. Cycle timing for SMULL, UMULL, SMLAL, and UMLAL**

| | Cycle | ADDR | RDATA | TRANS |
|---|---|---|---|---|
| Normal | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | pc+3i | - | I cycle |
| | 3 | pc+3i | - | S cycle |
| | | | (pc+3i) | |
| Interlock | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | pc+3i | - | I cycle |
| | 3 | pc+3i | - | I cycle |
| | 4 | pc+3i | - | S cycle |
| | | | (pc+3i) | |

The `SMULLS`, `UMULLS`, `SMLALS`, and `UMLALS` instructions always take five cycles to execute, and cannot generate interlocks in following instructions.

Table 9.13 shows the cycle timing for the `SMULLS`, `UMULLS`, `SMLALS`, and `UMLALS` instructions.

**Table 9.13. Cycle timings for SMULLS, UMULLS, SMLALS, and UMLALS**

| Cycle | ADDR | RDATA | TRANS |
|---|---|---|---|
| 1 | pc+3i | (pc+2i) | I cycle |
| 2 | pc+3i | - | I cycle |
| 3 | pc+3i | - | I cycle |
| 4 | pc+3i | - | I cycle |
| 5 | pc+3i | - | S cycle |
| | | (pc+3i) | |

Table 9.14 shows the cycle timings for `SMULxy`, `SMLAxy`, `SMULWy`, and `SMLAWy` instructions with and without interlocks.

**Table 9.14. Cycle timings for SMULxy, SMLAxy, SMULWy, and SMLAWy**

| | Cycle | ADDR | RDATA | TRANS |
|---|---|---|---|---|
| Normal | 1 | pc+3i | (pc+2i) | S cycle |
| | | | (pc+3i) | |
| Interlock | 1 | pc+3i | (pc+2i) | I cycle |
| | 2 | pc+3i | - | S cycle |
| | | | (pc+3i) | |
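The cycle counts in Tables 9.10 to 9.14 can be collected into a lookup, which may be convenient when estimating the cost of a multiply-heavy loop by hand. The following Python sketch is illustrative only: the `CYCLES` table and the `multiply_cycles` function are hypothetical names, the counts are simply the number of numbered cycle rows in each table, and the keys `SMULxy`, `SMLAxy`, `SMULWy`, and `SMLAWy` stand for the whole instruction families rather than literal mnemonics.

```python
from typing import Dict, Tuple

# (cycles without interlock, cycles with interlock), counted from the
# numbered rows of Tables 9.10 to 9.14. The flag-setting forms cannot
# generate interlocks, so both entries are the same for them.
CYCLES: Dict[str, Tuple[int, int]] = {
    # Table 9.10
    "MUL": (2, 3), "MLA": (2, 3),
    # Table 9.11
    "MULS": (4, 4), "MLAS": (4, 4),
    # Table 9.12
    "SMULL": (3, 4), "UMULL": (3, 4), "SMLAL": (3, 4), "UMLAL": (3, 4),
    # Table 9.13
    "SMULLS": (5, 5), "UMULLS": (5, 5), "SMLALS": (5, 5), "UMLALS": (5, 5),
    # Table 9.14 (family names, not literal mnemonics)
    "SMULxy": (1, 2), "SMLAxy": (1, 2), "SMULWy": (1, 2), "SMLAWy": (1, 2),
}


def multiply_cycles(op: str, interlocked: bool = False) -> int:
    """Return the cycle count for a multiply, per the tables above."""
    normal, with_interlock = CYCLES[op]
    return with_interlock if interlocked else normal


print(multiply_cycles("MUL"))                       # 2
print(multiply_cycles("MUL", interlocked=True))     # 3
print(multiply_cycles("UMLALS", interlocked=True))  # 5
```

`SMLALxy` is left out of the lookup because its timing is given separately in Table 9.15.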

Table 9.15 shows the cycle timing for `SMLALxy` instructions with and without interlocks.