16.4.4. ThumbEE instructions

The majority of the ThumbEE instruction set is identical in both encoding and behavior to the Thumb-2 instruction set, so the cycle timings are also identical to the Thumb-2 instruction timings. The behavior of some instructions is different when executed in ThumbEE state instead of Thumb state. However, these behavior changes do not result in any changes to their cycle timing. The only additional cycle timing information for ThumbEE is for the new instructions.

Table 16.17 shows the timing operation of the new ThumbEE instructions.

Table 16.17. ThumbEE instructions

Instruction type            Cycles  Source1  Source2  Source3  Source4  Result1  Result2
ENTERX/LEAVEX[1]            16      -        -        -        -        -        -
CHKA[2]                     1       E2       E2       -        -        -        -
HB[3]                       1       -        -        -        -        -        -
HBL[4]                      1       -        -        -        -        R14:E3   -
HBP[3]                      2       -        -        -        -        R8:E2    -
HBLP[4]                     2       -        -        -        -        R8:E2    R14:E3
LDR [R9][5]                 1       R9:E1    -        -        -        Rd:E3    -
LDR [R10][5]                1       R10:E1   -        -        -        Rd:E3    -
LDR [negative offset][5]    1       Rn:E1    -        -        -        Rd:E3    -
STR [R9][6]                 1       Rn:E1    Rd:E3    -        -        -        -

[1] This instruction waits for all outstanding instructions to complete and then issues.

[2] If CHKA fails the array bounds check, then an exception is taken. Otherwise, this is a single cycle instruction.

[3] This instruction is predicted and behaves as a direct branch, B instruction.

[4] This instruction is predicted and behaves as a direct branch and link, BL instruction.

[5] Timing is identical to similar load instructions.

[6] Timing is identical to similar store instructions.
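
As a rough illustration of how the Cycles column of Table 16.17 might be used, the following sketch tallies the listed cycle counts for a short instruction sequence. The dictionary name and the helper function are hypothetical, and the tally is deliberately naive: it assumes single issue, no interlocks, correct branch prediction, and a CHKA that passes its bounds check.

```python
# Cycles column transcribed from Table 16.17 for the new ThumbEE instructions.
# ENTERX/LEAVEX is omitted because it waits for outstanding instructions.
THUMBEE_CYCLES = {
    "CHKA": 1,   # assumes the array bounds check passes
    "HB": 1,
    "HBL": 1,
    "HBP": 2,
    "HBLP": 2,
    "LDR": 1,    # [R9], [R10], and negative offset forms
    "STR": 1,    # [R9] form
}


def naive_cycle_tally(mnemonics):
    """Sum the table's Cycles values for a sequence of mnemonics.

    Hypothetical helper: ignores dual issue, interlocks, and mispredictions,
    so the result is only a first-order estimate.
    """
    return sum(THUMBEE_CYCLES[m] for m in mnemonics)


if __name__ == "__main__":
    print(naive_cycle_tally(["CHKA", "LDR", "HBLP"]))  # 1 + 1 + 2 = 4
```

A real estimate would also have to account for the Source and Result pipeline stages in the table, which this sketch does not model.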


ThumbEE memory check exceptions

All loads and stores in ThumbEE state have the additional functionality of checking the base register for a zero value. If the base register is zero, then the processor performs a branch to the address [HandlerBase – 4]. See the ARM Architecture Reference Manual for more information.

The processor handles this scenario in the same way as an exception such as a data abort, because it does not occur in the common case. If the base register is zero, the processor flushes the pipeline and branches to the correct address. The additional cycle time penalty for this is variable in length, but is at least 13 cycles. The CHKA instruction uses the same mechanism when the array bounds check fails. This is also a rare occurrence and is therefore not optimized for performance.
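
The base register check described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the function name and return convention are hypothetical, addresses are treated as 32-bit values, and the penalty returned is the 13-cycle minimum quoted above, not an exact figure.

```python
MIN_CHECK_PENALTY = 13  # minimum additional cycles quoted in the text


def thumbee_null_check(base_value, handler_base):
    """Model the ThumbEE base register check for one load or store.

    Hypothetical helper. Returns (branch_target, min_extra_cycles):
    - base register is zero: the target is HandlerBase - 4 and the
      penalty is at least MIN_CHECK_PENALTY cycles;
    - otherwise: no branch is taken and there is no extra penalty.
    """
    if base_value == 0:
        target = (handler_base - 4) & 0xFFFFFFFF  # wrap to 32 bits
        return target, MIN_CHECK_PENALTY
    return None, 0


if __name__ == "__main__":
    print(thumbee_null_check(0, 0x8000))       # zero base: branch to the handler
    print(thumbee_null_check(0x1000, 0x8000))  # non-zero base: no branch
```

Because the actual penalty is variable, the second value should be read as a lower bound only.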

Predicting ThumbEE branch type instructions

All ThumbEE branch type instructions are predicted in ThumbEE state in the same manner as in ARM or Thumb state. In addition, the handler base branch instructions, HB[L][P], are predicted using the same branch prediction hardware that is used for the direct branch and branch and link instructions, B and BL, respectively. Because the ThumbEE instruction set uses R9 as the base register, rather than R13, as a stack pointer, LDR and STR instructions that read or write the PC are written onto the return stack to aid in the prediction of these indirect branches. The usage model of the return stack in ThumbEE state, using R9 as the stack pointer, is identical to the usage model in ARM and Thumb state, using R13 as the stack pointer.

Copyright © 2006-2009 ARM Limited. All rights reserved. ARM DDI 0344I
Non-Confidential