2.3.2.  Bus traffic

PVBus can simulate the behavior of individual bus transactions passing through a hierarchy of bus fabric, but it employs several techniques to optimize this process:

  1. PVBus generally decodes the path between a bus master and the bus slave the first time a transaction is issued. All subsequent transactions to the same address are automatically sent to the same slave, without passing through the intervening fabric.

  2. For accesses to normal memory, the master can cache a pointer to the (host) storage that holds the data contents of the memory. The master can then read and write this memory directly, without generating bus transactions.

  3. For instruction fetch, and for operations such as repeated DMA from framebuffer memory, PVBus provides an optimization called “snooping” that informs the master if anyone else could have modified the contents of memory. If no changes have occurred, the master does not have to re-read the memory contents.
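
The three optimizations above can be illustrated with a small toy model. This is only a sketch of the ideas, not the PVBus API: the class names (Memory, Fabric, Master), the methods, and the counters are all hypothetical and exist only for this example. As a simplification, writes made through the direct storage pointer do not trigger the snoop notification in this toy.

```python
class Memory:
    """Toy slave device backed by host storage (hypothetical, not PVBus)."""

    def __init__(self, size):
        self.storage = bytearray(size)
        self.snoopers = set()  # callbacks invoked when a bus write changes memory

    def read(self, offset):
        return self.storage[offset]

    def write(self, offset, value):
        self.storage[offset] = value
        for notify in self.snoopers:
            notify()


class Fabric:
    """Toy bus fabric that maps address ranges to slaves."""

    def __init__(self):
        self.regions = []      # list of (base, limit, slave)
        self.decode_count = 0  # number of full path decodes performed

    def add(self, base, size, slave):
        self.regions.append((base, base + size, slave))

    def decode(self, addr):
        self.decode_count += 1
        for base, limit, slave in self.regions:
            if base <= addr < limit:
                return base, slave
        raise LookupError(f"no slave at {addr:#x}")


class Master:
    """Toy bus master that applies the three optimizations."""

    def __init__(self, fabric):
        self.fabric = fabric
        self.routes = {}      # addr -> (base, slave), filled on first access
        self.fetched = {}     # addr -> value kept from an earlier fetch
        self.fetch_reads = 0  # fetches that really had to read memory

    def _invalidate(self):
        # Called by the slave when someone may have modified memory.
        self.fetched.clear()

    def _route(self, addr):
        # Optimization 1: decode the path only on the first access.
        if addr not in self.routes:
            base, slave = self.fabric.decode(addr)
            slave.snoopers.add(self._invalidate)
            self.routes[addr] = (base, slave)
        return self.routes[addr]

    def read(self, addr):
        base, slave = self._route(addr)
        return slave.read(addr - base)

    def write(self, addr, value):
        base, slave = self._route(addr)
        slave.write(addr - base, value)

    def direct_storage(self, addr):
        # Optimization 2: return the host storage itself plus an offset,
        # so later accesses need no bus transaction at all.
        base, slave = self._route(addr)
        return slave.storage, addr - base

    def fetch(self, addr):
        # Optimization 3: reuse the earlier result unless a write was snooped.
        if addr not in self.fetched:
            self.fetch_reads += 1
            self.fetched[addr] = self.read(addr)
        return self.fetched[addr]
```

In this sketch, repeated reads by one master to the same address increment decode_count only once, and repeated calls to fetch() re-read memory only after another master has written through the bus.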

If a piece of bus fabric wants to intercept and log all bus transactions, it can defeat these optimizations by claiming to be a slave device. It can then log all transactions and can reissue identical transactions on its own master port. However, doing this slows all bus transactions and significantly impacts simulation performance.

Note

If direct accesses to memory by the Code Translation (CT) engine are intercepted by the fabric, the processor is forced to single step. Execution is then much slower than normal operation with translated code.
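
The interception technique described above can be sketched as a component that presents itself as a slave to the upstream master and reissues each transaction downstream. The following is a minimal illustration only: Ram, LoggingBridge, and the access() signature are hypothetical names chosen for this example and are not part of PVBus.

```python
class Ram:
    """Toy downstream slave (hypothetical, not a PVBus component)."""

    def __init__(self, size):
        self.storage = bytearray(size)

    def access(self, kind, addr, value=None):
        if kind == "write":
            self.storage[addr] = value
            return None
        return self.storage[addr]


class LoggingBridge:
    """Acts as a slave upstream and as a master downstream."""

    def __init__(self, downstream):
        self.downstream = downstream
        self.log = []  # one (kind, addr, value) entry per transaction

    def access(self, kind, addr, value=None):
        # Every transaction terminates here, so none can be bypassed.
        self.log.append((kind, addr, value))
        # Reissue an identical transaction on the downstream side.
        return self.downstream.access(kind, addr, value)
```

Because every access passes through LoggingBridge.access(), the master never obtains a cached route or a direct pointer to the storage behind the bridge, which is the source of the performance cost.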

The bus traffic generated by a processor is not representative of real traffic for the following reasons:

Timing Differences

Timing differences can be caused by re-ordering and buffering of memory accesses, out-of-order execution, speculative prefetch, and drain buffers. These effects are not modeled, because they are not visible to the programmer except where a multiprocessor program contains race conditions that violate serial-consistency expectations.

Bus Contention

Fast Models do not model the time taken for a bus transaction, so they cannot model the effects of multiple transactions contending for bus availability.

Size of Access

Fast Models do not attempt to generate the same types of burst transaction that the hardware processor generates for accesses to multiple consecutive locations.

Instruction Fetch

The behavior of the instruction prefetch unit of a processor is not modeled to match the hardware implementation. See Instruction prefetch.

Behavioral Differences

In some software, the trace of instruction execution depends on timing effects. For example, if a loop polls a device while waiting for a 10 ms time-out, the number of iterations of the polling loop depends on the rate of instruction execution.
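
The dependence on execution rate can be shown with simple arithmetic. The sketch below assumes a constant instruction rate and a fixed number of instructions per loop iteration. Both are simplifications, and the function and parameter names are hypothetical.

```python
def polling_iterations(timeout_us, instructions_per_second, instructions_per_iteration):
    """Iterations a polling loop completes before the time-out expires.

    Integer arithmetic only: timeout_us is in microseconds.
    """
    instructions_executed = timeout_us * instructions_per_second // 1_000_000
    return instructions_executed // instructions_per_iteration


# Same 10 ms time-out and same 4-instruction loop, two execution rates.
fast = polling_iterations(10_000, 100_000_000, 4)
slow = polling_iterations(10_000, 50_000_000, 4)
```

With these assumed figures, halving the instruction rate halves the number of iterations, so an instruction trace of the same software differs between the two runs.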

Copyright © 2008-2013 ARM. All rights reserved. ARM DUI 0423O
Non-Confidential ID060613