Each AHB memory interface uses a single Dword write buffer and read cache to merge data to and from external memory to improve memory bandwidth. When reading word (32-bit), halfword, or byte wide data, external memory accesses are reduced to a single 64-bit wide external memory access. If consecutive address hits occur to data cached in the 64-bit wide read buffer, no external memory accesses are necessary. Similarly, when writing word, halfword, or byte data, external memory accesses are reduced to a single 64-bit wide access if consecutive address hits occur to the 64-bit wide write buffer. The merge write or read buffers for each port can be disabled using the register interface by writing to the MPMCAHBControlx Register for the AHB port.
Buffered write and read transactions on a single AHB port to the same memory location are always coherent.
For 32-bit and 64-bit masters that use separate read and write ports, the associated AHB port buffers must be disabled to ensure that data coherency is maintained.
The internal merge buffers of both 32-bit and 64-bit ports are fixed at 64 bits. Provided the buffers are enabled and HPROT indicates that the transfer can be cached or buffered, any external read access always performs enough transfers to fill the 64-bit buffer.
The AHB port data buffers are only used for dynamic memory transfers. The static memory controller uses a 32-bit buffer to optimize reads and writes.
The buffers are automatically disabled during ARM11 Exclusive Access transactions.
If an 8-bit, 16-bit, or 32-bit wide AHB read transfer is performed with the buffers enabled, the read transaction submitted to the memory is at the maximum width of the memory chip-select, with enough transfers to fill the 64-bit buffer. Therefore a chip-select with a 64-bit data bus returns 64 bits of data. This data is then placed into the buffer. Data is provided from the buffer to the AHB bus as necessary. This reduces the number of read transactions to external memory for 8-bit, 16-bit, or 32-bit wide AHB transactions. The MPMC can then re-arbitrate to a different AHB port and perform memory transactions while the data is being transferred from the data buffer.
Enabling the buffers has no impact on 64-bit wide read transactions.
The MPMC re-arbitrates to an open page transfer during a buffered transaction.
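The saving in memory commands can be estimated with a short calculation. The following is a minimal sketch, assuming a 64-bit wide chip-select and an incrementing burst; the function name and parameters are illustrative and are not part of the MPMC programmer's model.

```python
DWORD_BYTES = 8  # the merge buffer is 64 bits wide


def external_reads(start_addr, beat_bytes, beats, buffered):
    """Count memory read commands for one incrementing AHB burst.

    Assumes a 64-bit wide chip-select. With the buffer in use, every
    aligned Dword touched by the burst is fetched once. Without it,
    each AHB beat is modelled as its own memory read.
    """
    if not buffered or beat_bytes >= DWORD_BYTES:
        # 64-bit wide beats gain nothing from the buffer.
        return beats
    first = start_addr // DWORD_BYTES
    last = (start_addr + beat_bytes * beats - 1) // DWORD_BYTES
    return last - first + 1


incr4_word_buffered = external_reads(0x1000, 4, 4, buffered=True)
incr4_word_unbuffered = external_reads(0x1000, 4, 4, buffered=False)
```

For an aligned INCR4 burst of 32-bit beats this model gives two memory reads with the buffer in use and four without it.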
For read transactions the respective AHB port buffer is only used if:
The respective AHB port buffer is enabled.
One of the 8-bit, 16-bit, or 32-bit wide AHB read transactions is performed. 64-bit wide transactions do not use the buffer, because this does not improve performance.
A multi-word AHB burst is performed. AHB burst SINGLE transfers do not use the buffer.
AHB HPROT protection information indicates that the transfer is cacheable, indicating to the MPMC that the particular read transaction can be buffered.
The transaction is not a locked transfer (HMASTLOCK is LOW). This ensures that atomic AHB transactions are performed straight to memory and are not re-arbitrated during the transaction.
The transaction is not an ARM11 Exclusive Access read.
If an AHB master does not provide AHB HPROT protection information, the relevant HPROT bits can be tied off as required.
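The read conditions listed above can be collected into a single predicate. This is a sketch for reasoning about the rules, not a description of the MPMC logic. The `ReadTransfer` type and its field names are hypothetical; the only encoding taken from the AMBA AHB specification is that HPROT[3] is the cacheable bit.

```python
from dataclasses import dataclass

HPROT_CACHEABLE = 1 << 3  # HPROT[3] in the AMBA AHB protection encoding


@dataclass
class ReadTransfer:
    """Hypothetical summary of one AHB read burst."""
    size_bits: int   # 8, 16, 32 or 64
    burst: str       # "SINGLE", "INCR4", "INCR16", ...
    hprot: int       # HPROT[3:0] value driven by the master
    locked: bool     # True when HMASTLOCK is HIGH
    exclusive: bool  # True for an ARM11 Exclusive Access read


def read_uses_buffer(buffer_enabled, t):
    """Return True only when every listed read condition holds."""
    return (buffer_enabled
            and t.size_bits in (8, 16, 32)
            and t.burst != "SINGLE"
            and bool(t.hprot & HPROT_CACHEABLE)
            and not t.locked
            and not t.exclusive)
```

A master that ties HPROT[3] LOW therefore never uses the read buffer in this model, regardless of the enable setting.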
Only the AHB burst that fetched the data from the memory into the buffer can use the data in the buffer. Subsequent AHB bursts do not make use of the read data in the buffer even if the transaction is to the same memory area.
Enabling read buffering:
reduces power consumption because fewer commands are issued to memory.
increases memory bandwidth for 8-bit, 16-bit, and 32-bit wide memory transactions, because the MPMC can re-arbitrate to service a different AHB port while the data is being read from the buffer.
Only AHB ports that have a memory request to open pages of SDRAM memory are serviced.
The example described is for the case where the AHB port buffers are enabled and cacheable transfers are performed, with AHB port 0 and AHB port 1 performing INCR4, 32-bit wide read transactions. Accesses are to a 64-bit wide dynamic memory chip-select.
Table 2.3 shows that with the read buffer enabled the memory is operating at maximum efficiency and providing 64 bits of data on each clock cycle.
Table 2.3. Read buffer enabled
| Clock cycle | AHB 0 | AHB 1 |
|---|---|---|
| 1 | 64-bit read from memory and placed in buffer. Data 0 returned on AHB. | AHB port waiting for data. |
| 2 | Data 1 returned on AHB. | 64-bit read from memory and placed in buffer. Data 0 returned on AHB. |
| 3 | 64-bit read from memory and placed in buffer. Data 2 returned on AHB. | Data 1 returned on AHB. |
| 4 | Data 3 returned on AHB. | 64-bit read from memory and placed in buffer. Data 2 returned on AHB. |
| 5 | AHB port available. | Data 3 returned on AHB. |
Table 2.4 shows that with the read buffer disabled the memory provides 32 bits of data on each clock cycle. The MPMC therefore only provides half the amount of bandwidth compared to the case with the buffers enabled.
Table 2.4. Read buffer disabled
| Clock cycle | AHB 0 | AHB 1 |
|---|---|---|
| 1 | 32-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data. |
| 2 | 32-bit read from memory. Data 1 returned on AHB. | AHB port waiting for data. |
| 3 | 32-bit read from memory. Data 2 returned on AHB. | AHB port waiting for data. |
| 4 | 32-bit read from memory. Data 3 returned on AHB. | AHB port waiting for data. |
| 5 | AHB port available. | 32-bit read from memory. Data 0 returned on AHB. |
| 6 | AHB port available. | 32-bit read from memory. Data 1 returned on AHB. |
| 7 | AHB port available. | 32-bit read from memory. Data 2 returned on AHB. |
| 8 | AHB port available. | 32-bit read from memory. Data 3 returned on AHB. |
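The completion cycles in Table 2.3 and Table 2.4 can be reproduced with a small cycle-by-cycle model. It is a simplified sketch: it assumes one memory access per clock cycle, two AHB beats per 64-bit buffer fill, and a fixed lowest-port-first choice between ports that need the memory. The real arbitration scheme is not modelled.

```python
def finish_cycles(num_ports, beats, buffered, beats_per_fill=2):
    """Cycle in which each port returns the last beat of its burst."""
    remaining = [beats] * num_ports  # beats not yet returned on AHB
    held = [0] * num_ports           # beats waiting in each port buffer
    finish = [0] * num_ports
    owner = None                     # unbuffered: port that owns the memory
    cycle = 0
    while any(remaining):
        cycle += 1
        if buffered:
            # A port needs the memory only when its buffer is empty.
            needy = [p for p in range(num_ports)
                     if remaining[p] and held[p] == 0]
            if needy:
                p = needy[0]
                held[p] = min(beats_per_fill, remaining[p])
            # Every port holding data returns one beat this cycle.
            for p in range(num_ports):
                if held[p]:
                    held[p] -= 1
                    remaining[p] -= 1
                    if remaining[p] == 0:
                        finish[p] = cycle
        else:
            # No re-arbitration inside a burst: one beat per memory read.
            if owner is None or remaining[owner] == 0:
                owner = next(p for p in range(num_ports) if remaining[p])
            remaining[owner] -= 1
            if remaining[owner] == 0:
                finish[owner] = cycle
    return finish


buffered_finish = finish_cycles(2, 4, buffered=True)     # [4, 5]
unbuffered_finish = finish_cycles(2, 4, buffered=False)  # [4, 8]
```

Under these assumptions both INCR4 bursts complete by cycle 5 with buffering and by cycle 8 without it, which matches the two tables.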
The disadvantage with enabling read buffering is that an 8-bit, 16-bit, or 32-bit wide read transfer can take longer to complete, because the AHB port is re-arbitrated to maximize bandwidth while the data is being read from the buffer. If buffering is not required, read buffering can be disabled by either:
setting the Buffer enable (E) field of the MPMCAHBControl Register inactive
setting the AHB HPROT protection information to indicate a non-cacheable access.
The worst case additional latency is for the case where one AHB port, in this example AHB port 0, performs INCR16 32-bit wide read transactions. Other AHB ports, in this case AHB port 1 and 2, perform continuous, in page, INCR16 64-bit read transactions. Accesses are to a 64-bit wide dynamic memory chip-select.
With the read buffer enabled the memory provides 64 bits of data on every clock cycle. The MPMC re-arbitrates to a different AHB port when data is being read from the buffer. The latency of a buffer transfer might therefore be longer than a nonbuffered transfer.
The worst case additional latency for a buffered transfer is when a buffered INCR16 32-bit wide burst is being performed and the other AHB ports are performing INCR16 64-bit wide transfers. Table 2.5 shows that the INCR16 32-bit wide read can take up to 136 cycles to complete.
Table 2.5. Read buffer enabled
| Clock cycle | AHB 0 | AHB 1 | AHB 2 |
|---|---|---|---|
| 1 | 64-bit read from memory and placed in buffer. 32-bit data 0 returned on AHB. | AHB port waiting for data | AHB port waiting for data |
| 2 | 32-bit data 1 returned on AHB. | AHB port waiting for data | AHB port waiting for data |
| 3 | AHB port waiting for data. | 64-bit read 0 from memory | AHB port waiting for data |
| 4-17 | AHB port waiting for data. | 64-bit read 1-14 from memory | AHB port waiting for data |
| 18 | AHB port waiting for data. | 64-bit read 15 from memory | AHB port waiting for data |
| 19 | 64-bit read from memory and placed in buffer. 32-bit data 2 returned on AHB. | Next INCR16 transfer | AHB port waiting for data |
| 20 | 32-bit data 3 returned on AHB. | AHB port waiting for data | AHB port waiting for data |
| 21 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 0 from memory |
| 22-37 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 1-14 from memory |
| 38 | AHB port waiting for data. | AHB port waiting for data | 64-bit read 15 from memory |
| 39 | AHB port waiting for data. | AHB port waiting for data | Next INCR16 transfer |
| 40 | 64-bit read from memory and placed in buffer. 32-bit data 4 returned on AHB. | AHB port waiting for data | AHB port waiting for data |
| 41 | 32-bit data 5 returned on AHB. | AHB port waiting for data | AHB port waiting for data |
Table 2.5 shows the first 41 cycles of the transaction. Cycles 42 to 136 are a continuation of the sequence shown.
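The continuation can be sketched by extrapolating the buffer fills of AHB port 0. The two spacings used below are read off Table 2.5 (cycle 1 to cycle 19 behind AHB port 1, cycle 19 to cycle 40 behind AHB port 2). That they keep alternating for the rest of the burst is an assumption of this sketch, not a statement from the table.

```python
def buffer_fill_cycles(fills, first_cycle=1, gaps=(18, 21)):
    """Cycles at which AHB port 0 refills its read buffer.

    gaps holds the refill spacings taken from Table 2.5, assumed
    here to alternate until the burst completes.
    """
    cycles = [first_cycle]
    for i in range(fills - 1):
        cycles.append(cycles[-1] + gaps[i % len(gaps)])
    return cycles


# An INCR16 burst of 32-bit beats is 512 bits, or eight 64-bit fills.
fills = (16 * 32) // 64
schedule = buffer_fill_cycles(fills)
```

With these spacings the eighth and last fill falls on cycle 136, which agrees with the cycle range quoted for the continuation.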
Table 2.6 shows that if read buffers are not enabled, an INCR16 32-bit wide read burst takes 16 cycles to complete when the read data starts being returned from the memory.
Table 2.6. Read buffer disabled
| Clock cycle | AHB 0 | AHB 1 | AHB 2 |
|---|---|---|---|
| 1 | 32-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data. | AHB port waiting for data |
| 2-15 | 32-bit read from memory. Data 1-14 returned on AHB. | AHB port waiting for data. | AHB port waiting for data |
| 16 | 32-bit read from memory. Data 15 returned on AHB. | AHB port waiting for data. | AHB port waiting for data |
| 17 | AHB port available. | 64-bit read from memory. Data 0 returned on AHB. | AHB port waiting for data |
| 18-31 | AHB port available. | 64-bit read from memory. Data 1-14 returned on AHB. | AHB port waiting for data |
| 32 | AHB port available. | 64-bit read from memory. Data 15 returned on AHB. | AHB port waiting for data |
If an 8-bit, 16-bit, or 32-bit wide AHB write transfer is performed with the buffers enabled the write data is merged into the buffer, so that a write of the maximum width of the memory chip-select is performed. Therefore, for a chip-select with a 64-bit data bus, 64 bits of data are written at a time. This reduces the number of write transactions to external memory for 8-bit, 16-bit, or 32-bit wide AHB transactions. The MPMC can then re-arbitrate to a different AHB port and perform memory transactions while the data is being written into the data buffer.
Enabling the buffers has no impact on 64-bit wide write transactions.
The MPMC re-arbitrates to an open page transfer during a buffered transaction.
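Merging narrow writes into one 64-bit word can be pictured as updating byte lanes and tracking which lanes hold new data. The sketch below assumes little-endian byte lanes and naturally aligned beats; the function and the `valid` mask are illustrative and do not describe the internal buffer format.

```python
def merge_write(buffer, valid, addr, size_bytes, data):
    """Merge one AHB write beat into a 64-bit buffer image.

    buffer: current 64-bit contents as an int
    valid:  8-bit mask, bit n set when byte lane n holds new data
    Returns the updated (buffer, valid) pair.
    """
    assert size_bytes in (1, 2, 4) and addr % size_bytes == 0
    lane = addr % 8                       # byte lane of the first byte
    field = (1 << (8 * size_bytes)) - 1   # mask covering the beat's data
    shift = 8 * lane
    buffer = (buffer & ~(field << shift)) | ((data & field) << shift)
    valid |= ((1 << size_bytes) - 1) << lane
    return buffer, valid


# Four 16-bit beats of one burst land in the same 64-bit word.
buf, valid = 0, 0
for i, halfword in enumerate([0x1111, 0x2222, 0x3333, 0x4444]):
    buf, valid = merge_write(buf, valid, 0x100 + 2 * i, 2, halfword)
```

After the four beats all eight lanes are valid, so a single 64-bit write carries the data to memory.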
For write transactions the respective AHB port buffer is only used if:
The respective AHB port buffer is enabled.
An 8-bit, 16-bit, or 32-bit write transaction is performed. 64-bit wide transactions do not make use of the buffer because this does not improve performance.
A multi-word AHB burst is performed. AHB burst SINGLE transfers do not use the buffer.
AHB HPROT protection information indicates that the transfer is bufferable. This indicates to the MPMC that the particular write transaction can be buffered.
The transaction is not a locked transfer (HMASTLOCK is LOW). This ensures that atomic AHB transactions are performed straight to memory and are not re-arbitrated during the transaction.
The transaction is not an ARM11 Exclusive Access write.
If an AHB master does not provide AHB HPROT protection information the relevant HPROT bits can be tied off as required.
If an AHB port performs a burst write into a buffer, subsequent AHB burst writes do not merge data into the same buffer, even if the subsequent burst is to the same area of memory. Instead, the data in the buffer from the first write is submitted to memory and only then can the second burst start. This ensures that the memory is updated at the end of each burst write, so that another AHB port reading from the same memory location obtains the updated data.
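The effect of this rule on the number of memory writes can be shown with a small stateful model. `WriteBufferModel` is a hypothetical name, and the model writes the buffer out lazily when the next burst arrives rather than at the end of the previous one; the order and count of memory writes are the same either way.

```python
class WriteBufferModel:
    """Tracks which 64-bit word one port's write buffer currently holds."""

    def __init__(self):
        self.tag = None    # (burst_id, dword_index) held in the buffer
        self.flushed = []  # Dword addresses written to memory, in order

    def write(self, burst_id, addr):
        tag = (burst_id, addr // 8)
        # A new burst, or a new Dword, forces the held data out first.
        if self.tag is not None and tag != self.tag:
            self.flush()
        self.tag = tag

    def flush(self):
        if self.tag is not None:
            self.flushed.append(self.tag[1] * 8)
            self.tag = None


# Two short bursts of 16-bit beats to the same 64-bit word.
wb = WriteBufferModel()
for burst_id, addrs in enumerate([(0x2000, 0x2002), (0x2004, 0x2006)]):
    for addr in addrs:
        wb.write(burst_id, addr)
wb.flush()
```

The two bursts produce two separate memory writes to address 0x2000, whereas the same four beats issued as one burst would produce one.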
Enabling write buffering:
reduces power consumption as fewer commands are issued to memory
increases memory bandwidth for 8-bit, 16-bit, and 32-bit wide memory transactions, because the MPMC can re-arbitrate to service a different AHB port while the data is being written to the buffer.
Only AHB ports that have a memory request to open pages of SDRAM memory are serviced.
The write buffer example describes the case where the AHB port buffers are enabled and bufferable transfers are performed, with AHB port 0 and AHB port 1 performing INCR4, 32-bit wide write transactions. Accesses are to a 64-bit wide dynamic memory chip-select.
Table 2.7 shows that with the write buffer enabled the memory is operating at maximum efficiency and providing 64 bits of data on each clock cycle.
Table 2.7. Write buffer enabled
| Clock cycle | AHB 0 | AHB 1 |
|---|---|---|
| 1 | 32-bit write data 0 written into buffer. | AHB port waiting for data. |
| 2 | 32-bit write data 1 written into buffer. 64-bit write data written to memory. | 32-bit write data 0 written into buffer. |
| 3 | 32-bit write data 2 written into buffer. | 32-bit write data 1 written into buffer. 64-bit write data written to memory. |
| 4 | 32-bit write data 3 written into buffer. | 32-bit write data 2 written into buffer. |
| 5 | AHB port available. | 32-bit write data 3 written into buffer. 64-bit write data written to memory. |
Table 2.8 shows that with the write buffer disabled, 32 bits of data are written to memory on each clock cycle. The MPMC therefore provides only half the bandwidth available with the buffers enabled.
Table 2.8. Write buffer disabled
| Clock cycle | AHB 0 | AHB 1 |
|---|---|---|
| 1 | 32-bit write data 0 written into memory | AHB port waiting for data |
| 2 | 32-bit write data 1 written into memory | AHB port waiting for data |
| 3 | 32-bit write data 2 written into memory | AHB port waiting for data |
| 4 | 32-bit write data 3 written into memory | AHB port waiting for data |
| 5 | AHB port available | 32-bit write data 0 written into memory |
| 6 | AHB port available | 32-bit write data 1 written into memory |
| 7 | AHB port available | 32-bit write data 2 written into memory |
| 8 | AHB port available | 32-bit write data 3 written into memory |
The disadvantage with enabling write buffering is that an 8-bit, 16-bit, or 32-bit wide write transfer can take longer to complete, because the AHB port is re-arbitrated to maximize bandwidth while the data is being written into the buffer. If buffering is not required, write buffering can be disabled by either:
setting the Buffer enable (E) field of the MPMCAHBControl Register inactive
setting the AHB HPROT protection information to indicate a nonbufferable access.
The worst case additional latency is for the case where one AHB port, in this case AHB port 0, performs INCR16 32-bit wide write transactions. Other AHB ports, in this case AHB port 1 and 2, perform continuous, in page, INCR16 64-bit write transactions. Accesses are to a 64-bit wide dynamic memory chip-select.
With the write buffer enabled the memory writes 64 bits of data on every clock cycle.
The MPMC re-arbitrates to a different AHB port when data is being written to the buffer. The latency of a buffer transfer might therefore be longer than a nonbuffered transfer.
The worst case additional latency for a buffered transfer is when a buffered INCR16 32-bit wide burst is being performed and the other AHB ports are performing INCR16 64-bit wide transfers. See Table 2.9 for an example of the worst case scenario.
Table 2.9. Write buffer enabled
| Clock cycle | AHB 0 | AHB 1 | AHB 2 |
|---|---|---|---|
| 1 | 32-bit write data 0 written into buffer. | AHB port waiting for data | AHB port waiting for data |
| 2 | 32-bit write data 1 written into buffer. 64-bit write to memory. | AHB port waiting for data | AHB port waiting for data |
| 3 | 32-bit write data 2 written into buffer. | 64-bit write 0 to memory | AHB port waiting for data |
| 4 | 32-bit write data 3 written into buffer. | 64-bit write 1 to memory | AHB port waiting for data |
| 5-17 | AHB port waiting for data. | 64-bit write 2-14 to memory | AHB port waiting for data |
| 18 | AHB port waiting for data. | 64-bit write 15 to memory | AHB port waiting for data |
| 19 | 64-bit write to memory. | Next INCR16 transfer | AHB port waiting for data |
| 20 | 32-bit write data 4 written into buffer. | AHB port waiting for data | AHB port waiting for data |
| 21 | 32-bit write data 5 written into buffer. | AHB port waiting for data | 64-bit write 0 to memory |
| 22-37 | AHB port waiting for data. | AHB port waiting for data | 64-bit write 1-14 to memory |
| 38 | AHB port waiting for data. | AHB port waiting for data | 64-bit write 15 to memory |
| 39 | AHB port waiting for data. | AHB port waiting for data | Next INCR16 transfer |
| 40 | 64-bit write to memory. | AHB port waiting for data | AHB port waiting for data |
Table 2.9 shows the first 40 cycles of the transaction. Cycles 41 to 122 are a continuation of the sequence shown.
Table 2.10 shows that if write buffers are not enabled an INCR16 32-bit wide write burst takes 16 cycles to complete when the write data starts being written to memory.
Table 2.10. Write buffer disabled
| Clock cycle | AHB 0 | AHB 1 | AHB 2 |
|---|---|---|---|
| 1 | 32-bit write 0 to memory | AHB port waiting for data | AHB port waiting for data |
| 2-15 | 32-bit write 1-14 to memory | AHB port waiting for data | AHB port waiting for data |
| 16 | 32-bit write 15 to memory | AHB port waiting for data | AHB port waiting for data |
| 17 | AHB port available | 64-bit write 0 to memory | AHB port waiting for data |
| 18-31 | AHB port available | 64-bit write 1-14 to memory | AHB port waiting for data |
| 32 | AHB port available | 64-bit write 15 to memory | AHB port waiting for data |
| 33 | AHB port available | AHB port waiting for data | 64-bit write 0 to memory |
| 34-47 | AHB port available | AHB port waiting for data | 64-bit write 1-14 to memory |
| 48 | AHB port available | AHB port waiting for data | 64-bit write 15 to memory |
When the buffers are disabled, all write transfers receive HREADY LOW wait states until the transfer has started to be performed by the MPMC. HREADY then goes HIGH. This functionality is important for multi-port masters to ensure data coherency over multiple ports, or when data is being passed between masters. For example, the ARM11 read and write data ports require the write transfers to receive wait states until the write is being performed to ensure coherency between the two ports. This is the default operation of the MPMC after reset.
With the buffers enabled and empty, a write transfer receives no wait states and completes immediately, enabling reduced write latency if there are no data coherency issues. Subsequent writes are waited until the first write has completed.
To enable zero wait state writes to be performed without using the data merge buffers, the buffers must be enabled and the HPROT[3:2] lines driven correctly to indicate that reads are not cacheable and writes are not bufferable.
Exclusive write transfers always have a minimum of two wait states inserted, even when the merge buffer is used.