ARM Technical Support Knowledge Articles

How does the coprocessor interface of the ARM7TDMI work?

Applies to: ARM7TDMI


The following text briefly describes the coprocessor interface of the ARM7TDMI, and how a coprocessor should work.

Note: If no coprocessor is connected to the ARM7TDMI, both CPA and CPB have to be tied HIGH.

The coprocessor has to follow the pipeline of the ARM7TDMI, so it must have three stages (fetch, decode and execute), each holding one ARM instruction. The pipeline advances each time the ARM performs an instruction fetch, so the coprocessor pipeline stages are clocked by (ECLK and NOT(nOPC)). At the decode stage, the coprocessor should examine the opcode it has fetched. If it is a coprocessor instruction that it recognises, it must check whether the nCPI output of the ARM goes low in the execute stage. If so, the coprocessor instruction should be executed.
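The decode-stage check can be sketched in software. The following is a minimal illustration, not a hardware description. It assumes the standard ARM encodings, in which bits [27:25] = 110 mark LDC/STC, bits [27:24] = 1110 mark CDP/MCR/MRC, and bits [11:8] hold the coprocessor number. The function names are hypothetical.

```python
def is_coprocessor_instruction(opcode: int) -> bool:
    """Return True for the ARM coprocessor instruction classes."""
    bits_27_25 = (opcode >> 25) & 0b111
    bits_27_24 = (opcode >> 24) & 0b1111
    # 110x  -> LDC/STC (coprocessor data transfer)
    # 1110  -> CDP, MCR, MRC (data operation / register transfer)
    return bits_27_25 == 0b110 or bits_27_24 == 0b1110


def coprocessor_number(opcode: int) -> int:
    """Extract the coprocessor number field, bits [11:8]."""
    return (opcode >> 8) & 0xF


def recognises(opcode: int, my_cp_num: int) -> bool:
    """True if this opcode is addressed to coprocessor my_cp_num."""
    return (is_coprocessor_instruction(opcode)
            and coprocessor_number(opcode) == my_cp_num)
```

With these helpers, a coprocessor numbered 4 would recognise opcode EE012410 (an MCR to cp4) and ignore E3A02002 (a MOV).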

If the coprocessor just follows D[31:0], sees a relevant coprocessor instruction and then simply waits for nCPI, there may be problems. For example, if the next instruction executed is an LDM of all 16 registers, it would be necessary to wait 20 clock cycles before nCPI goes low. One could let the coprocessor wait for those 20 clock cycles, but this is not safe. If, for example, a branch occurs before this coprocessor instruction is executed and the program then runs an instruction for a different coprocessor, both coprocessors may try to execute it simultaneously. It is also necessary to consider the effect of interrupts/aborts occurring just after the coprocessor instruction appears on the D[31:0] bus.

So, upon recognising a relevant instruction, one needs to count pulses of (ECLK and NOT (nOPC)) to count instruction pipeline advances. Only if nCPI goes low 2 pipeline advances after the coprocessor instruction was fetched should this instruction be executed.
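The counting rule can be illustrated with a small behavioural model. This is a sketch only, not cycle-accurate, and the class and method names are hypothetical. It assumes advance() is called once per pulse of (ECLK and NOT(nOPC)) with the value seen on D[31:0], and that a recognised instruction is a coprocessor opcode whose number field (bits [11:8]) matches.

```python
class PipelineFollower:
    """Behavioural sketch of a three-stage pipeline follower."""

    def __init__(self, cp_num):
        self.cp_num = cp_num
        self.fetch = None    # most recently fetched opcode
        self.decode = None   # opcode fetched one advance earlier
        self.execute = None  # opcode fetched two advances earlier

    def _is_mine(self, opcode):
        if opcode is None:
            return False
        is_cp = (((opcode >> 25) & 0b111) == 0b110
                 or ((opcode >> 24) & 0b1111) == 0b1110)
        return is_cp and ((opcode >> 8) & 0xF) == self.cp_num

    def advance(self, opcode):
        """Shift the pipeline by one stage on an opcode fetch."""
        self.execute, self.decode, self.fetch = self.decode, self.fetch, opcode

    def should_execute(self, ncpi):
        """True only if nCPI is low while one of our instructions
        sits in the execute stage (two advances after its fetch)."""
        return ncpi == 0 and self._is_mine(self.execute)


follower = PipelineFollower(cp_num=4)
for opcode in (0xE3A02002, 0xEE012410, 0xE1A00000, 0xE1A00000):
    follower.advance(opcode)
print(follower.should_execute(ncpi=0))
```

Because the decision is tied to the stage the instruction occupies rather than to the next nCPI pulse, an instruction that is flushed by a branch simply drops out of the follower after further advances.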

Besides looking at nOPC, it may be useful to consider TBIT. Coprocessor instructions are not possible in Thumb state, so in order to save power the pipeline follower in the coprocessor could be switched off during fetching of Thumb instructions.

Once the coprocessor has recognised an instruction, it must drive CPA & CPB. When the ARM has a coprocessor instruction in its execution stage, it looks for CPA to go low. (If CPA is high, the undefined instruction trap is taken). If CPA is low, CPB is also checked. If CPB is high, the ARM will busy-wait until the coprocessor is ready to execute the instruction. During the busy-wait stage, the ARM will take an IRQ or FIQ if one occurs and the coprocessor instruction will be abandoned. This will be signaled to the coprocessor by nCPI going high. If CPB is low, the ARM will continue to fetch/execute subsequent instructions.
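The handshake described above amounts to a small decision table, summarised in the sketch below. The names are hypothetical, and interrupt masking is ignored for simplicity: interrupt_pending stands for an IRQ or FIQ that the ARM would actually take.

```python
from enum import Enum


class ArmAction(Enum):
    UNDEFINED_TRAP = "take the undefined instruction trap"
    BUSY_WAIT = "busy-wait for the coprocessor"
    ABANDON = "abandon the instruction (nCPI goes high) and take the interrupt"
    CONTINUE = "continue with subsequent instructions"


def arm_action(cpa: int, cpb: int, interrupt_pending: bool = False) -> ArmAction:
    """Response of the ARM to CPA/CPB while a coprocessor
    instruction is in its execute stage."""
    if cpa == 1:
        # No coprocessor has claimed the instruction.
        return ArmAction.UNDEFINED_TRAP
    if cpb == 1:
        # Coprocessor present but not yet ready.
        return ArmAction.ABANDON if interrupt_pending else ArmAction.BUSY_WAIT
    return ArmAction.CONTINUE


for cpa, cpb in ((1, 1), (0, 1), (0, 0)):
    print(cpa, cpb, arm_action(cpa, cpb).value)
```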

Other things to consider are:

  • Other coprocessors (e.g. internal coprocessors like CP0 for EmbeddedICE, CP14 for the Debug Comms Channel, and CP15 for the MMU/PU).
  • If more than one coprocessor is used, CPA and CPB from all coprocessors can be ANDed together. If required, open drain schemes can be used with pull up resistors. These may be useful if coprocessors from other ASICs are added on board level.
  • Reset. Asserting nRESET must take CPA & CPB high.
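The ANDing and reset points can be illustrated as follows. This is a sketch with hypothetical function names. It assumes each coprocessor drives CPA and CPB high whenever it is in reset or has not recognised the instruction, so that a low level from any one coprocessor pulls the combined line low.

```python
def coprocessor_outputs(nreset: int, recognised: bool, busy: bool):
    """(CPA, CPB) driven by a single coprocessor."""
    if nreset == 0 or not recognised:
        return (1, 1)              # idle or in reset: both lines high
    return (0, 1 if busy else 0)   # present; CPB high while busy


def combine(lines) -> int:
    """AND together one handshake line from every coprocessor.
    With no coprocessors at all, the result is high (tied HIGH)."""
    return int(all(lines))


outputs = [
    coprocessor_outputs(nreset=1, recognised=False, busy=False),
    coprocessor_outputs(nreset=1, recognised=True, busy=True),
]
cpa = combine(o[0] for o in outputs)
cpb = combine(o[1] for o in outputs)
print(cpa, cpb)
```

In this example the second coprocessor has claimed the instruction but is busy, so the ARM sees CPA low and CPB high and busy-waits.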

Timing of the coprocessor signals:

The timing of nOPC is dependent upon how one controls APE/ALE (i.e. same timing as the A[31:0] bus). If APE = ALE = 1, then nOPC will change during the clock high phase of the cycle before the actual data transfer takes place. D[31:0] is valid on the falling edge of MCLK.

nCPI changes off the falling edge of MCLK - the old value stays valid for time Tcpih and the propagation delay for the new value is given by the timing parameter Tcpi. The MCLK input to ECLK output propagation delay (Tcdel) also has to be taken into account.

CPA & CPB are sampled on the MCLK rising edge. Of course, it may be possible to generate these signals during the previous MCLK high phase if the pipeline is being followed. They will be sampled on every MCLK rising edge - the setup and hold times (Tcps and Tcph) have to be met.


The above picture shows an ARM7TDMI executing a coprocessor MCR instruction:

Cycle 1: Fetch the instruction MOV R2,#2 (opcode E3A02002).
Cycle 2: Fetch the instruction MCR cp4,0,r2,c1,c0 (opcode EE012410). Decode the MOV.
Cycle 3: Execute the MOV, decode the MCR and fetch the next instruction.
Cycle 4: nCPI goes low off the falling edge of MCLK. Notice that the coprocessor here has driven CPA & CPB without waiting for nCPI low. This is not mandatory. The propagation delay between the MCLK falling edge and nCPI valid is given by Tcpi. Also note that nMREQ & SEQ are now indicating that a coprocessor data transfer cycle will follow. nOPC goes high during the clock high phase (but this timing can be changed if APE or ALE are used to modify the A[31:0] timing). The propagation delay from MCLK rising to nOPC valid is given by Topcd.
Cycle 5: The ARM now writes the value to be transferred to the coprocessor onto the D[31:0] bus. CPA & CPB have been driven high by the coprocessor to indicate the transfer has completed.
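As a cross-check on the opcode quoted in Cycle 2, the sketch below splits an MCR/MRC opcode into its fields using the standard ARM coprocessor register transfer encoding. The function name and the dictionary layout are hypothetical.

```python
def decode_register_transfer(opcode: int) -> dict:
    """Decode an MCR/MRC opcode into its fields."""
    # Bits [27:24] must be 1110 and bit 4 must be 1 for MCR/MRC.
    if ((opcode >> 24) & 0xF) != 0b1110 or ((opcode >> 4) & 1) != 1:
        raise ValueError("not an MCR/MRC encoding")
    return {
        "cond": (opcode >> 28) & 0xF,
        "opc1": (opcode >> 21) & 0x7,
        # Bit 20 (L): 0 = MCR (ARM to coprocessor), 1 = MRC.
        "mnemonic": "MRC" if (opcode >> 20) & 1 else "MCR",
        "CRn": (opcode >> 16) & 0xF,
        "Rd": (opcode >> 12) & 0xF,
        "cp_num": (opcode >> 8) & 0xF,
        "opc2": (opcode >> 5) & 0x7,
        "CRm": opcode & 0xF,
    }


fields = decode_register_transfer(0xEE012410)
print(fields)
```

For EE012410 the fields come out as MCR with cp_num 4, opc1 0, Rd 2, CRn 1 and CRm 0, matching MCR cp4,0,r2,c1,c0 in the trace.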

Attachments: img1957.gif

Article last edited on: 2008-09-09 15:47:36

Copyright © 2011 ARM Limited. All rights reserved. External (Open), Non-Confidential