4.6.7. Data hazards in Full-compliance mode

Source registers must be protected in the event of an exceptional condition in an instruction or in an iteration of a short vector instruction.

Source registers are cleared in the first Execute 1 cycle of an operation. To enable forwarding to a subsequent instruction, destination registers are cleared in the next-to-last cycle.

The sections that follow give examples of data hazards in Full-compliance mode:

Status register RAW hazard example

In Example 4.5, the FMSTAT is stalled for four cycles in the Fetch stage until the FCMPS updates the condition codes in the FPSCR register. Two cycles later, the FMSTAT updates the ARM CPSR register with the condition codes.

Example 4.5. Status register RAW hazard

FCMPS S1, S2
FMSTAT

Table 4.6 shows the VFP9-S pipeline stages for Example 4.5.

Table 4.6. Pipeline stages for Example 4.5

             Instruction cycle number
Instruction  1  2  3   4   5   6   7  8  9  10
FCMPS        F  D  E1  E2  E3  E4  -  -  -  -
FMSTAT       -  F  F   F   F   F   D  E  M  W

Load multiple/CDP RAW hazard example

In Example 4.6, the FADDS is stalled in the Fetch stage for nine cycles until the FLDM makes its last transfer to the VFP9-S coprocessor.

Example 4.6. Load multiple/CDP RAW hazard

FLDM [Rx], {S8-S15}
FADDS S1, S2, S15

Table 4.7 shows the VFP9-S pipeline stages for Example 4.6.

Table 4.7. Pipeline stages for Example 4.6

             Instruction cycle number
Instruction  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16
FLDM         F  D  E  M  W  W  W  W  W  W   W   W   -   -   -   -
FADDS        -  F  F  F  F  F  F  F  F  F   F   D   E1  E2  E3  E4

CDP/CDP RAW hazard example

In Example 4.7, the FADDS is stalled for three cycles in the Fetch stage until the FMULS data is written and forwarded in cycle 6 to the Decode stage of the FADDS.

Example 4.7. CDP/CDP RAW hazard

FMULS S4, S1, S0
FADDS S5, S4, S3

Table 4.8 shows the VFP9-S pipeline stages for Example 4.7.

Table 4.8. Pipeline stages for Example 4.7

             Instruction cycle number
Instruction  1  2  3   4   5   6   7   8   9   10
FMULS        F  D  E1  E2  E3  E4  -   -   -   -
FADDS        -  F  F   F   F   D   E1  E2  E3  E4
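
The stall counts quoted in these examples can be read directly from the table rows. The following Python sketch is an illustration only, not part of the VFP9-S documentation. It assumes that the first F in a row is the normal Fetch cycle and that each further F is a stall cycle. The helper names stall_cycles and cycle_of are hypothetical.

```python
def stall_cycles(stages):
    """Count Fetch-stage stall cycles in one pipeline table row.

    `stages` lists the stage occupied in each cycle, starting at cycle 1,
    with "-" for cycles in which the instruction is not in the pipeline.
    Assumption: the first "F" is the normal fetch, every further "F" is a stall.
    """
    fetches = stages.count("F")
    return max(fetches - 1, 0)


def cycle_of(stages, stage):
    """Return the 1-based cycle in which `stage` first appears in the row."""
    return stages.index(stage) + 1


# Rows transcribed from Table 4.8.
fmuls = ["F", "D", "E1", "E2", "E3", "E4", "-", "-", "-", "-"]
fadds = ["-", "F", "F", "F", "F", "D", "E1", "E2", "E3", "E4"]

print(stall_cycles(fadds))    # 3
print(cycle_of(fadds, "D"))   # 6
print(cycle_of(fmuls, "E4"))  # 6
```

For Table 4.8 this gives three stall cycles for the FADDS and places its Decode stage in cycle 6, the same cycle in which the FMULS is in Execute 4.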

Load multiple/short vector CDP RAW hazard example

In Example 4.8, the short vector FADDS is stalled in the Fetch stage until the FLDM loads all source registers required by the FADDS. In this case, the FADDS is stalled for two cycles. It does not have to wait for completion of the FLDM, because it depends on the FLDM only for one register, S7. The S7 data is forwarded in cycle 5. The vector length is four iterations (LEN = 3), and the stride is one (STRIDE = 0). Notice that the first source vector uses registers S7, S0, S1, and S2, and the only FADDS source register loaded by the FLDM is S7. This example is based on the assumption that the remaining source and destination registers are available to the FADDS in cycle 5.

Example 4.8. Load multiple/short vector CDP RAW hazard

FLDM [R2], {S7-S14}
FADDS S16, S7, S25

Table 4.9 shows the VFP9-S pipeline stages for Example 4.8.

Table 4.9. Pipeline stages for first iteration of Example 4.8

             Instruction cycle number
Instruction  1  2  3  4  5  6   7   8   9
FLDM         F  D  E  M  W  W   W   W   -
FADDS        -  F  F  F  D  E1  E2  E3  E4
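
The register sequence S7, S0, S1, S2 in Example 4.8 follows from short vector addressing wrapping around within a bank of eight single-precision registers. The following Python sketch illustrates that calculation. It is a minimal model under that assumption, and the names vector_registers and BANK_SIZE are hypothetical.

```python
BANK_SIZE = 8  # single-precision registers per bank (S0-S7, S8-S15, ...)


def vector_registers(first, length, stride=1):
    """Return the register numbers used by one short vector operand.

    Registers advance by `stride` per iteration and wrap around inside
    the bank that contains `first`.
    """
    bank_base = (first // BANK_SIZE) * BANK_SIZE
    offset = first % BANK_SIZE
    return [bank_base + (offset + i * stride) % BANK_SIZE for i in range(length)]


# Example 4.8: FADDS S16, S7, S25 with four iterations (LEN = 3), stride one.
length = 3 + 1
first_source = vector_registers(7, length)
second_source = vector_registers(25, length)
loaded = set(range(7, 15))  # FLDM [R2], {S7-S14}

print(first_source)   # [7, 0, 1, 2]
print(second_source)  # [25, 26, 27, 28]
print(sorted(loaded & set(first_source + second_source)))  # [7]
```

Only S7 appears in both the FLDM register list and the FADDS source vectors, which is why the FADDS does not have to wait for the FLDM to complete.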

Short vector CDP/store RAR hazard example

In Example 4.9, S25 is a source for the second iteration of the FMULS and a source for the FSTS. The FMULS locks S25, and the FSTS must wait until the FMULS releases it. After the FMULS releases S25, the FSTS can store S25 while the FMULS continues with its third and fourth iterations. The vector length is four iterations (LEN = 3), and the stride is one (STRIDE = 0).

Example 4.9. Short vector CDP/store RAR hazard

FMULS S8, S16, S24
FSTS S25, [R2]

Table 4.10. Pipeline stages for Example 4.9

             Instruction cycle number
Instruction  1  2  3   4   5   6   7   8   9
FMULS        F  D  E1  E1  E1  E1  E2  E3  E4
FSTS         -  F  F   F   D   E   M   W   -
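
The timing in Table 4.10 can be reproduced with a simple model. The following Python sketch assumes that each iteration of a short vector operation occupies Execute 1 for one cycle, that an iteration releases its source registers in that cycle, and that a waiting instruction enters Decode in the following cycle. These are assumptions chosen to match the table, not a statement of the VFP9-S implementation, and the function names are hypothetical.

```python
BANK_SIZE = 8  # single-precision registers per bank


def vector_registers(first, length, stride=1):
    """Registers of one short vector operand, wrapping within its bank."""
    bank_base = (first // BANK_SIZE) * BANK_SIZE
    offset = first % BANK_SIZE
    return [bank_base + (offset + i * stride) % BANK_SIZE for i in range(length)]


def source_release_cycle(first_e1_cycle, source_bases, length, register):
    """Return the cycle in which `register` is last read as a source.

    Assumed model: iteration i (0-based) occupies Execute 1 in cycle
    first_e1_cycle + i and releases its source registers in that cycle.
    Returns None if the operation does not read `register`.
    """
    last = None
    for base in source_bases:
        for i, reg in enumerate(vector_registers(base, length)):
            if reg == register:
                cycle = first_e1_cycle + i
                last = cycle if last is None else max(last, cycle)
    return last


# Example 4.9: FMULS S8, S16, S24 with four iterations. Table 4.10 shows
# its first Execute 1 cycle as cycle 3.
released = source_release_cycle(3, [16, 24], 4, 25)
print(released)      # 4
print(released + 1)  # 5, the Decode cycle of the FSTS in Table 4.10
```

Under these assumptions S25 is released in cycle 4, during the second Execute 1 cycle of the FMULS, and the FSTS enters Decode in cycle 5.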

Short vector CDP/load multiple WAR hazard example

In Example 4.10, the load multiple FLDMS creates a WAR hazard on the source registers of the FMULS. The vector length is four iterations (LEN = 3), and the stride is one (STRIDE = 0). The VFP9-S coprocessor stalls the FLDMS until the FMULS clears all the source registers, S16-S19 and S24-S27.

Example 4.10. Short vector CDP/load multiple WAR hazard

FMULS S8, S16, S24
FLDMS [R2], {S16-S27}

Table 4.11. Pipeline stages for first iteration of Example 4.10

             Instruction cycle number
Instruction  1  2  3   4   5   6   7   8   9   10  11  12  13  14  15  16  17
FMULS        F  D  E1  E1  E1  E1  E2  E3  E4  -   -   -   -   -   -   -   -
FLDMS        -  F  F   F   F   F   D   E   M   W   W   W   W   W   W   W   W
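
The set of registers involved in the WAR hazard of Example 4.10 is the overlap between the FMULS source vectors and the FLDMS register list. The following Python sketch computes that overlap. It is an illustration that assumes short vector operands wrap within banks of eight single-precision registers, and the function names are hypothetical.

```python
BANK_SIZE = 8  # single-precision registers per bank


def vector_registers(first, length, stride=1):
    """Registers of one short vector operand, wrapping within its bank."""
    bank_base = (first // BANK_SIZE) * BANK_SIZE
    offset = first % BANK_SIZE
    return [bank_base + (offset + i * stride) % BANK_SIZE for i in range(length)]


def war_hazard_registers(source_bases, length, written):
    """Return the sorted registers that an earlier short vector operation
    reads and a later instruction writes."""
    read = set()
    for base in source_bases:
        read.update(vector_registers(base, length))
    return sorted(read & set(written))


# Example 4.10: FMULS S8, S16, S24 (four iterations), then
# FLDMS [R2], {S16-S27}.
hazard = war_hazard_registers([16, 24], 4, range(16, 28))
print(["S%d" % r for r in hazard])
# ['S16', 'S17', 'S18', 'S19', 'S24', 'S25', 'S26', 'S27']
```

The result matches the registers named in the text, S16-S19 and S24-S27. Registers S20-S23 are written by the FLDMS but are not read by the FMULS, so they do not contribute to the hazard.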

Copyright © 2002, 2003, 2008, 2010 ARM Limited. All rights reserved.
ARM DDI 0238C
Non-Confidential