4.10. Large blocks of instructions

When a trace decompressor processes a branch address packet or atom packet, it must decode instructions from the last waypoint target until it encounters another waypoint. In exceptional situations, such as if the program has branched into an uninitialized memory region or the trace data stream is corrupted, the process of decoding to the next instruction could take a very long time. To prevent this, the PFT architecture implements a mechanism that enables a decompressor to bound the amount of instruction data it processes after identifying a waypoint.

If a PTM reaches a waypoint that is more than 4096 bytes of instruction data from the previous waypoint target then it outputs a waypoint update packet. The address in this packet is (new waypoint address - n), where n is 2 or 4. After outputting this waypoint update packet the PTM generates the atom or branch address packet for the new waypoint.

If there has not been a waypoint since the most recent I-sync packet, the waypoint update packet is output if the new waypoint is more than 4096 byes from the address in the I-sync packet.

If an exception means that the PTM must output a waypoint update packet before it processes the new waypoint, then no additional waypoint update packet is generated.

For example, if a program branches into a region of uninitialized memory where all of the data is zero, the processor interprets this data as a series of NOP instructions, and might execute a very large number of these instructions before it encounters the next waypoint. If this happens, when the processor encounters the next waypoint the PTM outputs a waypoint update packet, giving the address of the instruction before the waypoint. The trace decompressor recognizes that this address is a long way in the future, and might determine that the address is in uninitialized memory, and therefore that the program has gone wrong. Alternatively, the decompressor can process the code between the two waypoints. Knowing that this is a large block of code it can break the processing down into smaller blocks.

In addition, if the trace decompressor does not receive a waypoint update packet immediately before an atom packet or branch address packet, it knows it does not have to process more that 4096 consecutive bytes of instruction data beyond the previous waypoint.

If program execution rolls over the top of memory, where the instruction at 0xFFFFFFFC or 0xFFFFFFFE is followed by the instruction at 0x00000000 or 0x00000002, this causes unpredictable processor operation. In this situation, it is implementation specific whether the waypoint update packet is output if more than 4096 consecutive instruction bytes have been executed.

Copyright © 1999-2002, 2004-2008, 2011 ARM. All rights reserved.ARM IHI 0035B