2.1.4. Data buffers

Each AHB memory interface uses a single Dword write buffer and read cache to merge data to and from external memory to improve memory bandwidth. When reading word (32-bit), halfword, or byte wide data, external memory accesses are reduced to a single 64-bit wide external memory access. If consecutive address hits occur to data cached in the 64-bit wide read buffer, no external memory accesses are necessary. Similarly, when writing word, halfword, or byte data, external memory accesses are reduced to a single 64-bit wide access if consecutive address hits occur to the 64-bit wide write buffer. The merge write or read buffers for each port can be disabled using the register interface by writing to the MPMCAHBControlx Register for the AHB port.
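
The merging behavior can be illustrated with a small model. This is a minimal sketch, not a description of the MPMC implementation: the function name and the byte-address representation are assumptions made for illustration, and the model covers only accesses within a single burst.

```python
def count_external_reads(addresses, buffer_bytes=8):
    """Count external 64-bit reads needed for a sequence of narrow reads.

    addresses: byte addresses of consecutive byte, halfword, or word reads.
    buffer_bytes: size of the read buffer (8 bytes = one 64-bit Dword).
    """
    external_reads = 0
    cached_dword = None  # index of the Dword currently held in the buffer
    for addr in addresses:
        dword = addr // buffer_bytes
        if dword != cached_dword:
            # Miss: one 64-bit external access refills the buffer.
            external_reads += 1
            cached_dword = dword
        # Hit: data comes from the buffer, no external access needed.
    return external_reads
```

For example, eight consecutive byte reads starting at an aligned address need one external access in this model, and four consecutive word reads need two.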

Buffered write and read transactions on a single AHB port to the same memory location are always coherent.

Note

For 32 and 64-bit masters which use separate read and write ports, the associated AHB port buffers must be disabled to ensure that data coherency is maintained.

The internal merge buffers of both 32-bit and 64-bit ports are fixed at 64 bits. When the buffers are enabled and HPROT indicates that the transfer can be cached or buffered, an external read access always performs enough transfers to fill the 64-bit buffer.

The AHB port data buffers are only used for dynamic memory transfers. The static memory controller uses a 32-bit buffer to optimize reads and writes.

The buffers are automatically disabled during ARM11 Exclusive Access transactions.

Read transaction buffer operation

If an 8-bit, 16-bit, or 32-bit wide AHB read transfer is performed with the buffers enabled, the read transaction submitted to the memory is at the maximum width of the memory chip-select, with enough transfers to fill the 64-bit buffer. Therefore a chip-select with a 64-bit data bus returns 64 bits of data. This data is then placed into the buffer. Data is provided from the buffer to the AHB bus as necessary. This reduces the number of read transactions to external memory for 8-bit, 16-bit, or 32-bit wide AHB transactions. The MPMC can then re-arbitrate to a different AHB port and perform memory transactions while the data is being transferred from the data buffer.

Note

  • Enabling the buffers has no impact on 64-bit wide read transactions.

  • The MPMC re-arbitrates to an open page transfer during a buffered transaction.

Enabling read buffer

For read transactions the respective AHB port buffer is only used if:

  • The respective AHB port buffer is enabled.

  • One of the 8-bit, 16-bit, or 32-bit wide AHB read transactions is performed. 64-bit wide transactions do not use the buffer, because this does not improve performance.

  • A multi-word AHB burst is performed. AHB burst SINGLE transfers do not use the buffer.

  • AHB HPROT protection information indicates that the transfer is cacheable, indicating to the MPMC that the particular read transaction can be buffered.

  • The transaction is not a locked transfer (HMASTLOCK is LOW). This ensures that atomic AHB transactions are performed straight to memory and are not re-arbitrated during the transaction.

  • The transaction is not an ARM11 Exclusive Access read.

Note

If an AHB master does not provide AHB HPROT protection information, the relevant HPROT bits can be tied off as required.
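
The conditions listed above can be collected into a single predicate. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the burst type is passed as a string, and the cacheable indication is taken from HPROT[3] as in the AMBA AHB protection encoding.

```python
HPROT_CACHEABLE = 1 << 3  # HPROT[3] in the AMBA AHB protection encoding


def read_uses_buffer(buffer_enabled, size_bits, hburst, hprot,
                     hmastlock, exclusive):
    """Return True if a read would use the AHB port buffer in this model."""
    return bool(
        buffer_enabled                      # port buffer is enabled
        and size_bits in (8, 16, 32)        # 64-bit transfers bypass the buffer
        and hburst != "SINGLE"              # SINGLE transfers bypass the buffer
        and (hprot & HPROT_CACHEABLE)       # transfer is marked cacheable
        and not hmastlock                   # locked transfers go straight to memory
        and not exclusive                   # ARM11 Exclusive Access reads bypass it
    )
```

All six conditions must hold at once, so a single failing condition, for example a 64-bit transfer or a locked transfer, causes the read to bypass the buffer.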

Read data re-use

Only the AHB burst that fetched the data from the memory into the buffer can use the data in the buffer. Subsequent AHB bursts do not make use of the read data in the buffer even if the transaction is to the same memory area.

Advantages of read buffering

Enabling read buffering:

  • reduces power consumption because fewer commands are issued to memory.

  • increases memory bandwidth for 8-bit, 16-bit, and 32-bit wide memory transactions, because the MPMC can re-arbitrate to service a different AHB port while the data is being read from the buffer.

Note

Only AHB ports that have a memory request to open pages of SDRAM memory are serviced.

Read buffer example

The example described is for the case where the AHB port buffers are enabled and cacheable transfers are performed, with AHB port 0 and AHB port 1 performing INCR4, 32-bit wide read transactions. Accesses are to a 64-bit wide dynamic memory chip-select.

Read buffer enabled

Table 2.3 shows that with the read buffer enabled the memory is operating at maximum efficiency and providing 64 bits of data on each clock cycle.

Table 2.3. Read buffer enabled

Clock cycle | AHB 0 | AHB 1
1 | 64-bit read from memory and placed in buffer. Data 0 returned on AHB. | AHB port waiting for data.
2 | Data 1 returned on AHB. | 64-bit read from memory and placed in buffer. Data 0 returned on AHB.
3 | 64-bit read from memory and placed in buffer. Data 2 returned on AHB. | Data 1 returned on AHB.
4 | Data 3 returned on AHB. | 64-bit read from memory and placed in buffer. Data 2 returned on AHB.
5 | AHB port available. | Data 3 returned on AHB.

Read buffer disabled

Table 2.4 shows that with the read buffer disabled the memory provides 32 bits of data on each clock cycle. The MPMC therefore only provides half the amount of bandwidth compared to the case with the buffers enabled.

Table 2.4. Read buffer disabled

Clock cycle | AHB 0 | AHB 1
1 | 32-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data.
2 | 32-bit read from memory. Data 1 returned on AHB. | AHB port waiting for data.
3 | 32-bit read from memory. Data 2 returned on AHB. | AHB port waiting for data.
4 | 32-bit read from memory. Data 3 returned on AHB. | AHB port waiting for data.
5 | AHB port available. | 32-bit read from memory. Data 0 returned on AHB.
6 | AHB port available. | 32-bit read from memory. Data 1 returned on AHB.
7 | AHB port available. | 32-bit read from memory. Data 2 returned on AHB.
8 | AHB port available. | 32-bit read from memory. Data 3 returned on AHB.
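
The cycle counts in the read buffer example can be reproduced with a simplified model. This is an illustrative sketch only: the function names and the arbitration rule (memory serves the lowest-numbered port whose buffer is empty) are assumptions, and effects such as page opening and arbitration overhead are ignored.

```python
def simulate_buffered_reads(ports=2, beats=4, beat_bits=32, buffer_bits=64):
    """Cycles until every port has received all beats, buffers enabled."""
    beats_per_fill = buffer_bits // beat_bits
    remaining = [beats] * ports  # beats each port still has to receive
    buffered = [0] * ports       # beats currently held in each port buffer
    cycle = 0
    while any(remaining):
        cycle += 1
        # One 64-bit memory read per cycle, granted to the first port
        # that has an empty buffer and beats outstanding.
        for p in range(ports):
            if remaining[p] and buffered[p] == 0:
                buffered[p] = min(beats_per_fill, remaining[p])
                break
        # Every port holding buffered data returns one beat on its AHB.
        for p in range(ports):
            if buffered[p]:
                buffered[p] -= 1
                remaining[p] -= 1
    return cycle


def unbuffered_read_cycles(ports=2, beats=4):
    """Cycles when each beat needs its own memory access, ports served in turn."""
    return ports * beats
```

With the default arguments the buffered model completes both bursts in 5 cycles and the unbuffered model in 8 cycles, matching the last rows of the two tables.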

Disadvantages of read buffering

The disadvantage with enabling read buffering is that an 8-bit, 16-bit, or 32-bit wide read transfer can take longer to complete, because the AHB port is re-arbitrated to maximize bandwidth while the data is being read from the buffer. If buffering is not required, read buffering can be disabled by either:

  • setting the Buffer enable (E) field of the MPMCAHBControl Register inactive

  • setting the AHB HPROT protection information to indicate a non-cacheable access.
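
The first option amounts to a read-modify-write of the control register value. The sketch below shows only the bit manipulation. The mask used for the E field is an assumption made for illustration and must be checked against the MPMCAHBControl Register description. The helper name is hypothetical.

```python
# ASSUMPTION: mask of the Buffer enable (E) field. The real position must be
# taken from the MPMCAHBControl Register description.
BUFFER_ENABLE_MASK = 0x1


def set_buffer_enable(reg_value, enable):
    """Return reg_value with the assumed E field set or cleared."""
    if enable:
        return reg_value | BUFFER_ENABLE_MASK
    return reg_value & ~BUFFER_ENABLE_MASK
```

Other fields in the register value are left unchanged, which is why the helper masks rather than overwrites the value.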

Worst case additional latency

The worst case additional latency is for the case where one AHB port, in this example AHB port 0, performs INCR16 32-bit wide read transactions. Other AHB ports, in this case AHB port 1 and 2, perform continuous, in page, INCR16 64-bit read transactions. Accesses are to a 64-bit wide dynamic memory chip-select.

Read buffer enabled

With the read buffer enabled the memory provides 64 bits of data on every clock cycle. The MPMC re-arbitrates to a different AHB port when data is being read from the buffer. The latency of a buffer transfer might therefore be longer than a nonbuffered transfer.

The worst case additional latency for a buffered transfer is when a buffered INCR16 32-bit wide burst is being performed and the other AHB ports are performing INCR16 64-bit wide transfers. Table 2.5 shows that the INCR16 32-bit wide read can take up to 128 cycles to complete.

Table 2.5. Read buffer enabled 

Clock cycle | AHB 0 | AHB 1 | AHB 2
1 | 64-bit read from memory and placed in buffer. 32-bit data 0 returned on AHB. | AHB port waiting for data | AHB port waiting for data
2 | 32-bit data 1 returned on AHB. | AHB port waiting for data | AHB port waiting for data
3 | AHB port waiting for data. | 64-bit read 0 from memory | AHB port waiting for data
4-17 | AHB port waiting for data. | 64-bit read 1-14 from memory | AHB port waiting for data
18 | AHB port waiting for data. | 64-bit read 15 from memory | AHB port waiting for data
19 | 64-bit read from memory and placed in buffer. 32-bit data 2 returned on AHB. | Next INCR16 transfer | AHB port waiting for data
20 | 32-bit data 3 returned on AHB. | AHB port waiting for data | AHB port waiting for data
21 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 0 from memory
22-37 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 1-14 from memory
38 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 15 from memory
39 | AHB port waiting for data. | AHB port waiting for data | Next INCR16 transfer
40 | 64-bit read from memory and placed in buffer. 32-bit data 4 returned on AHB. | AHB port waiting for data | AHB port waiting for data
41 | 32-bit data 5 returned on AHB. | AHB port waiting for data | AHB port waiting for data

Note

Table 2.5 shows the first 41 cycles of the transaction. Cycles 42 to 136 are a continuation of the sequence shown.
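
The order of magnitude of the buffered latency can be estimated with simple arithmetic. This is a rough sketch under a simplifying assumption that is not taken from the table: each time the buffer empties, exactly one 16-cycle INCR16 burst from another port is serviced before the buffer is refilled. The function and parameter names are hypothetical.

```python
def buffered_burst_cycles(beats=16, beat_bits=32, buffer_bits=64,
                          gap_cycles=16):
    """Estimate cycles for a buffered narrow burst that is re-arbitrated
    away from after every buffer fill.

    gap_cycles is the assumed number of cycles spent servicing another
    port between two buffer fills.
    """
    fills = (beats * beat_bits) // buffer_bits   # 64-bit reads needed
    cycles_per_fill = buffer_bits // beat_bits   # AHB beats served per fill
    return fills * cycles_per_fill + (fills - 1) * gap_cycles
```

With the defaults the estimate is 8 fills of 2 cycles plus 7 gaps of 16 cycles, that is 128 cycles. Any additional arbitration cycles between bursts lengthen the gaps and therefore the total.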

Read buffer disabled

Table 2.6 shows that if read buffers are not enabled, an INCR16 32-bit wide read burst takes 16 cycles to complete when the read data starts being returned from the memory.

Table 2.6. Read buffer disabled

Clock cycle | AHB 0 | AHB 1 | AHB 2
1 | 32-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data. | AHB port waiting for data
2-15 | 32-bit read from memory. Data 1-14 returned on AHB. | AHB port waiting for data. | AHB port waiting for data
16 | 32-bit read from memory. Data 15 returned on AHB. | AHB port waiting for data. | AHB port waiting for data
17 | AHB port available. | 64-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data
18-31 | AHB port available. | 64-bit read from memory. Data 1-14 returned on AHB. | AHB port waiting for data
32 | AHB port available. | 64-bit read from memory. Data 15 returned on AHB. | AHB port waiting for data

Write transaction buffer operation

If an 8-bit, 16-bit, or 32-bit wide AHB write transfer is performed with the buffers enabled the write data is merged into the buffer, so that a write of the maximum width of the memory chip-select is performed. Therefore, for a chip-select with a 64-bit data bus, 64 bits of data are written at a time. This reduces the number of write transactions to external memory for 8-bit, 16-bit, or 32-bit wide AHB transactions. The MPMC can then re-arbitrate to a different AHB port and perform memory transactions while the data is being written into the data buffer.

Note

  • Enabling the buffers has no impact on 64-bit wide write transactions.

  • The MPMC re-arbitrates to an open page transfer during a buffered transaction.

Enabling write buffer

For write transactions the respective AHB port buffer is only used if:

  • The respective AHB port buffer is enabled.

  • An 8-bit, 16-bit, or 32-bit write transaction is performed. 64-bit wide transactions do not make use of the buffer because this does not improve performance.

  • A multi-word AHB burst is performed. AHB burst SINGLE transfers do not use the buffer.

  • AHB HPROT protection information indicates that the transfer is bufferable. This indicates to the MPMC that the particular write transaction can be buffered.

  • The transaction is not a locked transfer (HMASTLOCK is LOW). This ensures that atomic AHB transactions are performed straight to memory and are not re-arbitrated during the transaction.

  • The transaction is not an ARM11 Exclusive Access write.

Note

If an AHB master does not provide AHB HPROT protection information the relevant HPROT bits can be tied off as required.
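
The write-side conditions can be expressed as a predicate in the same way as the read-side conditions. This is a sketch under stated assumptions: the names are hypothetical, and the bufferable indication is taken from HPROT[2] as in the AMBA AHB protection encoding.

```python
HPROT_BUFFERABLE = 1 << 2  # HPROT[2] in the AMBA AHB protection encoding


def write_uses_buffer(buffer_enabled, size_bits, hburst, hprot,
                      hmastlock, exclusive):
    """Return True if a write would be merged in the port buffer in this model."""
    return bool(
        buffer_enabled                      # port buffer is enabled
        and size_bits in (8, 16, 32)        # 64-bit writes bypass the buffer
        and hburst != "SINGLE"              # SINGLE transfers bypass the buffer
        and (hprot & HPROT_BUFFERABLE)      # transfer is marked bufferable
        and not hmastlock                   # locked transfers go straight to memory
        and not exclusive                   # ARM11 Exclusive Access writes bypass it
    )
```

The only difference from the read case is the HPROT bit examined: writes depend on the bufferable indication, reads on the cacheable indication.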

Write data re-use

If an AHB port performs a burst write into a buffer subsequent AHB burst writes do not merge data into the same buffer, even if the subsequent burst is to the same area of memory. Instead the data in the buffer from the first write is submitted to memory and only then can the second burst start. This ensures that the memory is updated at the end of each burst write so that if another AHB port reads from the same memory location the memory is updated.
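
The effect of flushing at the end of every burst can be seen in a small model that counts external writes. This is an illustrative sketch, not the MPMC implementation: the function name and the representation of a burst as a list of byte addresses are assumptions.

```python
def count_external_writes(bursts, buffer_bytes=8):
    """Count 64-bit external writes for a list of buffered write bursts.

    bursts: list of bursts, each a list of byte addresses of narrow writes.
    Data is merged only within a burst. The buffer is submitted to memory
    when the burst moves to a new Dword and again when the burst ends.
    """
    external_writes = 0
    for burst in bursts:
        open_dword = None  # Dword currently being merged in the buffer
        for addr in burst:
            dword = addr // buffer_bytes
            if open_dword is not None and dword != open_dword:
                external_writes += 1  # submit the completed Dword
            open_dword = dword
        if open_dword is not None:
            external_writes += 1      # flush at the end of the burst
    return external_writes
```

In this model one INCR4 burst of 32-bit writes produces two external writes, and two separate bursts to the same Dword produce two external writes because data is never merged across bursts.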

Advantages of write buffering

Enabling write buffering:

  • reduces power consumption as fewer commands are issued to memory

  • increases memory bandwidth for 8-bit, 16-bit, and 32-bit wide memory transactions, because the MPMC can re-arbitrate to service a different AHB port while the data is being written to the buffer.

Note

Only AHB ports that have a memory request to open pages of SDRAM memory are serviced.

Write buffer example

The write buffer example describes the case where the AHB port buffers are enabled and bufferable transfers are performed, with AHB port 0 and AHB port 1 performing INCR4, 32-bit wide write transactions. Accesses are to a 64-bit wide dynamic memory chip-select.

Write buffer enabled

Table 2.7 shows that with the write buffer enabled the memory is operating at maximum efficiency and providing 64 bits of data on each clock cycle.

Table 2.7. Write buffer enabled

Clock cycle | AHB 0 | AHB 1
1 | 32-bit write data 0 written into buffer. | AHB port waiting for data.
2 | 32-bit write data 1 written into buffer. 64-bit write data written to memory. | 32-bit write data 0 written into buffer.
3 | 32-bit write data 2 written into buffer. | 32-bit write data 1 written into buffer. 64-bit write data written to memory.
4 | 32-bit write data 3 written into buffer. | 32-bit write data 2 written into buffer.
5 | AHB port available. | 32-bit write data 3 written into buffer. 64-bit write data written to memory.

Write buffer disabled

Table 2.8 shows that with the write buffer disabled, 32 bits of data are written to memory on each clock cycle. The MPMC therefore provides only half the bandwidth available with the buffers enabled.

Table 2.8. Write buffer disabled

Clock cycle | AHB 0 | AHB 1
1 | 32-bit write data 0 written into memory | AHB port waiting for data
2 | 32-bit write data 1 written into memory | AHB port waiting for data
3 | 32-bit write data 2 written into memory | AHB port waiting for data
4 | 32-bit write data 3 written into memory | AHB port waiting for data
5 | AHB port available | 32-bit write data 0 written into memory
6 | AHB port available | 32-bit write data 1 written into memory
7 | AHB port available | 32-bit write data 2 written into memory
8 | AHB port available | 32-bit write data 3 written into memory

Disadvantages of write buffering

The disadvantage with enabling write buffering is that an 8-bit, 16-bit, or 32-bit wide write transfer can take longer to complete, because the AHB port is re-arbitrated to maximize bandwidth while the data is being written into the buffer. If buffering is not required, write buffering can be disabled by:

  • setting the Buffer enable (E) field of the MPMCAHBControl Register inactive

  • setting the AHB HPROT protection information to indicate a nonbufferable access.

Worst case additional latency

The worst case additional latency is for the case where one AHB port, in this case AHB port 0, performs INCR16 32-bit wide write transactions. Other AHB ports, in this case AHB port 1 and 2, perform continuous, in page, INCR16 64-bit write transactions. Accesses are to a 64-bit wide dynamic memory chip-select.

Write buffer enabled

With the write buffer enabled the memory writes 64 bits of data on every clock cycle.

The MPMC re-arbitrates to a different AHB port when data is being written to the buffer. The latency of a buffer transfer might therefore be longer than a nonbuffered transfer.

The worst case additional latency for a buffered transfer is when a buffered INCR16 32-bit wide burst is being performed and the other AHB ports are performing INCR16 64-bit wide transfers. See Table 2.9 for an example of the worst case scenario.

Table 2.9. Write buffer enabled

Clock cycle | AHB 0 | AHB 1 | AHB 2
1 | 32-bit write data 0 written into buffer. | AHB port waiting for data | AHB port waiting for data
2 | 32-bit write data 1 written into buffer. 64-bit write to memory. | AHB port waiting for data | AHB port waiting for data
3 | 32-bit write data 2 written into buffer. | 64-bit write 0 to memory | AHB port waiting for data
4 | 32-bit write data 3 written into buffer. | 64-bit write 1 to memory | AHB port waiting for data
5-17 | AHB port waiting for data. | 64-bit write 2-14 to memory | AHB port waiting for data
18 | AHB port waiting for data. | 64-bit write 15 to memory | AHB port waiting for data
19 | 64-bit write to memory. | Next INCR16 transfer | AHB port waiting for data
20 | 32-bit write data 4 written into buffer. | AHB port waiting for data | AHB port waiting for data
21 | 32-bit write data 5 written into buffer. | AHB port waiting for data | 64-bit write 0 to memory
22-37 | AHB port waiting for data. | AHB port waiting for data | 64-bit write 1-14 to memory
38 | AHB port waiting for data. | AHB port waiting for data | 64-bit write 15 to memory
39 | AHB port waiting for data. | AHB port waiting for data | Next INCR16 transfer
40 | 64-bit write to memory. | AHB port waiting for data | AHB port waiting for data

Note

Table 2.9 shows the first 40 cycles of the transaction. Cycles 41 to 122 are a continuation of the sequence shown.

Write buffer disabled

Table 2.10 shows that if write buffers are not enabled an INCR16 32-bit wide write burst takes 16 cycles to complete when the write data starts being written to memory.

Table 2.10. Write buffer disabled

Clock cycle | AHB 0 | AHB 1 | AHB 2
1 | 32-bit write 0 to memory | AHB port waiting for data | AHB port waiting for data
2-15 | 32-bit write 1-14 to memory | AHB port waiting for data | AHB port waiting for data
16 | 32-bit write 15 to memory | AHB port waiting for data | AHB port waiting for data
17 | AHB port available | 64-bit write 0 to memory | AHB port waiting for data
18-31 | AHB port available | 64-bit write 1 to memory | AHB port waiting for data
32 | AHB port available | 64-bit write 2-14 to memory | AHB port waiting for data
33 | AHB port available | 64-bit write 16 to memory | 64-bit write 0 to memory
34-47 | AHB port available | AHB port waiting for data | 64-bit write 1 to memory
48 | AHB port available | AHB port waiting for data | 64-bit write 2-14 to memory
49 | AHB port available | AHB port available | 64-bit write 16 to memory

Zero wait state write transfers

When the buffers are disabled, all write transfers receive HREADY LOW wait states until the transfer has started to be performed by the MPMC. HREADY then goes HIGH. This functionality is important for multi-port masters to ensure data coherency over multiple ports, or when data is being passed between masters. For example, the ARM11 read and write data ports require the write transfers to receive wait states until the write is being performed to ensure coherency between the two ports. This is the default operation of the MPMC after reset.

With the buffers enabled and empty, a write transfer receives no wait states and completes immediately, enabling reduced write latency if there are no data coherency issues. Subsequent writes are waited until the first write has completed.

To enable zero wait state writes to be performed without using the data merge buffers, the buffers must be enabled and the HPROT[3:2] lines driven correctly to indicate that reads are not cacheable and writes are not bufferable.

Note

Exclusive write transfers always have a minimum of two wait states inserted, even when the merge buffer is used.
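
The wait-state rules in this section can be summarized as a small decision function. This is a sketch only: the function names and the returned descriptions are hypothetical, and the model does not give exact wait-state counts other than the minimum stated for exclusive writes.

```python
def write_wait_behavior(buffers_enabled, buffer_empty, exclusive):
    """Describe the HREADY behavior of a write according to the rules above."""
    if exclusive:
        # Exclusive writes always see a minimum of two wait states.
        return "at least two wait states"
    if not buffers_enabled:
        # Default after reset: waited until the MPMC starts the transfer.
        return "waited until the transfer starts"
    if buffer_empty:
        return "zero wait states"
    # Buffers enabled but busy: waited until the earlier write completes.
    return "waited until the previous write completes"


def hprot_for_unmerged_zero_wait(hprot):
    """Clear HPROT[3:2] so that reads are not cacheable and writes are not
    bufferable, as required for zero wait state writes without merging."""
    return hprot & ~0b1100
```

The second helper only shows which HPROT bits are cleared. The buffers must still be enabled for zero wait state writes to occur.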

Copyright © 2003 ARM Limited. All rights reserved. ARM DDI 0269A
Non-Confidential