1.3.3 Bus traffic in Fast Models
PVBus can simulate the behavior of individual bus transactions passing through a hierarchy of bus fabric, but it employs several techniques to optimize this process:
- PVBus generally decodes the path between a bus master and the bus slave
the first time a transaction is issued. All subsequent transactions
to the same address are automatically sent to the same slave, without
passing through the intervening fabric.
- For accesses to normal memory, the master can cache
a pointer to the (host) storage that holds the data contents of
the memory. The master can read and write directly to this memory
without generating bus transactions.
- For instruction fetches, and for operations such as
repeated DMA from framebuffer memory, PVBus provides an optimization
called “snooping” that informs the master if any other component could
have modified the contents of memory. If no changes have occurred,
the master can avoid re-reading the memory contents.
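
The route caching and snooping ideas in the list above can be illustrated with a toy model. This is only a sketch: the `Memory`, `Fabric`, and `Master` classes, the `version` counter, and the addresses are all hypothetical and are not part of the PVBus API. The fabric counts how often it is asked to decode an address, and the master counts how often it really has to re-read memory.

```python
# Toy model only: these classes are hypothetical and are not the PVBus API.

class Memory:
    """A slave backed by host storage (a bytearray)."""
    def __init__(self, size):
        self.storage = bytearray(size)
        self.version = 0              # bumped on every write, stands in for snooping

    def write(self, offset, value):
        self.storage[offset] = value
        self.version += 1

    def read(self, offset):
        return self.storage[offset]


class Fabric:
    """Address decoder that counts how often it is asked to decode."""
    def __init__(self):
        self.regions = []             # list of (base, limit, slave)
        self.decodes = 0

    def map(self, base, size, slave):
        self.regions.append((base, base + size, slave))

    def decode(self, addr):
        self.decodes += 1
        for base, limit, slave in self.regions:
            if base <= addr < limit:
                return base, slave
        raise LookupError(hex(addr))


class Master:
    def __init__(self, fabric):
        self.fabric = fabric
        self.routes = {}              # addr -> (base, slave), filled on first access
        self.fetched = {}             # addr -> (version, value) seen at last fetch
        self.refetches = 0

    def _route(self, addr):
        # Decode through the fabric only the first time an address is used.
        if addr not in self.routes:
            self.routes[addr] = self.fabric.decode(addr)
        return self.routes[addr]

    def read(self, addr):
        base, slave = self._route(addr)
        return slave.read(addr - base)

    def write(self, addr, value):
        base, slave = self._route(addr)
        slave.write(addr - base, value)

    def fetch(self, addr):
        """Re-read memory only if it could have changed since the last fetch."""
        base, slave = self._route(addr)
        cached = self.fetched.get(addr)
        if cached is None or cached[0] != slave.version:
            self.refetches += 1
            cached = (slave.version, slave.read(addr - base))
            self.fetched[addr] = cached
        return cached[1]
```

In this sketch, repeated accesses by one master to the same address cause a single decode, and repeated calls to `fetch` re-read memory only after some master has written to it.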
If a piece of bus fabric wants to intercept and log all bus
transactions, it can defeat these optimizations by claiming to be
a slave device. It can then log all transactions and can reissue identical
transactions on its own master port. However, doing this slows all
bus transactions and significantly impacts simulation performance.
If direct accesses to memory by the code translation (CT) engine are
intercepted by the fabric, the processor is forced to single step, and
execution is much slower than normal operation with translated code.
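
The cost of interception can be sketched with another toy model. The `Ram` and `LoggingBridge` classes and the `run` helper are hypothetical names chosen for illustration, not PVBus components. The bridge presents itself as the slave, logs each transaction, reissues it unchanged, and refuses to hand out direct access to host storage, so every access has to arrive as a transaction.

```python
# Toy model only: hypothetical classes, not the PVBus API.

class Ram:
    def __init__(self, size):
        self.storage = bytearray(size)

    def access(self, addr, value=None):
        """Handle one transaction: a write if value is given, otherwise a read."""
        if value is not None:
            self.storage[addr] = value
        return self.storage[addr]

    def direct_storage(self):
        # A memory slave can hand out its host storage for direct access.
        return self.storage


class LoggingBridge:
    """Sits between a master and a slave and claims to be the slave itself."""
    def __init__(self, downstream):
        self.downstream = downstream  # stands in for the bridge's own master port
        self.log = []

    def access(self, addr, value=None):
        kind = "write" if value is not None else "read"
        self.log.append((kind, addr, value))
        return self.downstream.access(addr, value)  # reissue the same transaction

    def direct_storage(self):
        # Refusing direct access forces every access to arrive as a transaction.
        return None


def run(slave, addr, count):
    """Read addr `count` times; return how many reads became bus transactions."""
    transactions = 0
    fast = slave.direct_storage()
    for _ in range(count):
        if fast is not None:
            _ = fast[addr]            # direct host-memory read, no transaction
        else:
            slave.access(addr)
            transactions += 1
    return transactions
```

Calling `run` against a `Ram` directly issues no transactions, while calling it against a `LoggingBridge` wrapped around the same `Ram` issues one transaction per read. That is the slowdown the text describes, in miniature.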
The bus traffic generated by a processor model is not representative of traffic on real hardware, in the following respects:
- Timing differences: Re-ordering and buffering of memory accesses, out-of-order execution, speculative prefetch and
drain-buffers can cause timing differences. They are not modeled, since they are not
visible to the programmer except in situations where a cluster program contains race
conditions that violate serial-consistency expectations.
- Bus contention: Fast Models do not model the time taken for a bus
transaction, so they cannot model the effects of multiple transactions
contending for bus availability.
- Size of access: Fast Models do not attempt to generate the same
types of burst transaction from the processor for accesses to multiple
- Instruction fetch: The behavior of the instruction prefetch unit of a processor is not modeled to match the
- Behavioral differences: In some software, the trace of instruction execution
is dependent on timing effects. For example, if a loop polls a device
waiting for a 10 ms time-out, the number of iterations of the polling
loop depends on the rate of instruction execution.
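
The polling-loop example can be made concrete with a little arithmetic. The following is a minimal sketch. The function name, the instruction rates, and the figure of five instructions per iteration are all assumed values for illustration, not measurements of any model or processor.

```python
def polling_iterations(timeout_us, instructions_per_second, instructions_per_iteration):
    """Complete polling-loop iterations that fit before the time-out expires.

    Uses integer arithmetic throughout to avoid floating-point rounding.
    """
    executed = timeout_us * instructions_per_second // 1_000_000
    return executed // instructions_per_iteration


# Hypothetical numbers: a 10 ms time-out and a 5-instruction polling loop.
slow = polling_iterations(10_000, 100_000_000, 5)    # 100 million instructions/s
fast = polling_iterations(10_000, 1_000_000_000, 5)  # 1 billion instructions/s
print(slow, fast)
```

With these assumed values, the same loop runs ten times as many iterations at the higher instruction rate, so an instruction trace taken from one platform does not match the other.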