The ARM940T uses:
a 4KB Instruction cache
a 4KB Data cache
an 8-word write buffer.
Each cache comprises four fully associative 1KB segments, which support single-cycle reads and either one-cycle or two-cycle writes, depending on whether the access is sequential.
Each cache segment consists of 64 CAM rows, each of which selects one of 64 RAM lines that are four words long. On an I Cache or D Cache access, a segment is selected and the access address is compared with the 64 TAGs in the CAM. If a match occurs, the cache has ‘hit’, and the row line corresponding to the match is enabled so that the data can be accessed. If none of the row TAGs match, the access has missed, and external memory must be accessed unless the access is a buffered write, in which case the write buffer is used.
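As a rough illustration of the lookup described above, the following Python sketch models one segment as 64 CAM rows, each paired with a four-word RAM line. The names `Segment`, `fill` and `lookup` are hypothetical, the model ignores timing, and a 32-bit (4-byte) word is assumed. It only shows the compare-against-every-row behaviour and checks that four such segments amount to 4KB.

```python
from typing import List, Optional

WORD_BYTES = 4          # an ARM word is 32 bits
WORDS_PER_LINE = 4      # each RAM line holds four words
ROWS_PER_SEGMENT = 64   # 64 CAM rows select one of 64 RAM lines
SEGMENTS = 4            # four segments per cache

SEGMENT_BYTES = ROWS_PER_SEGMENT * WORDS_PER_LINE * WORD_BYTES
CACHE_BYTES = SEGMENTS * SEGMENT_BYTES


class Segment:
    """Illustrative model of one cache segment (hypothetical, not from the manual)."""

    def __init__(self) -> None:
        # None stands for a row that does not hold a TAG yet.
        self.tags: List[Optional[int]] = [None] * ROWS_PER_SEGMENT
        self.lines: List[List[int]] = [
            [0] * WORDS_PER_LINE for _ in range(ROWS_PER_SEGMENT)
        ]

    def fill(self, row: int, tag: int, words: List[int]) -> None:
        """Load a four-word line into the given row and record its TAG."""
        if len(words) != WORDS_PER_LINE:
            raise ValueError("a cache line holds exactly four words")
        self.tags[row] = tag
        self.lines[row] = list(words)

    def lookup(self, tag: int) -> Optional[List[int]]:
        """Compare the tag with every CAM row; return the line on a hit, None on a miss."""
        for row, stored in enumerate(self.tags):
            if stored is not None and stored == tag:
                return self.lines[row]
        return None
```

The loop stands in for what the CAM does in parallel in hardware: all 64 stored TAGs are compared with the access TAG at once, and at most one row line is enabled.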
If a read access from a cacheable memory region misses in the cache, one of the 64 row lines in the segment is selected as the target into which new data is loaded (an allocate-on-read-miss replacement policy). This selection is performed by a randomly clocked target row counter. Critical or frequently accessed instructions and data can be locked down in the I Cache and D Cache respectively by restricting the range of the target counter. Locked-down lines are immune to replacement and remain in the cache until they are unlocked or flushed.
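The effect of restricting the target counter can be sketched as follows. This is a minimal model, assuming that lockdown is expressed as a base row below which lines are never chosen as targets. The `lock_base` parameter, the function name, and the use of Python's `random` module in place of the randomly clocked hardware counter are all assumptions made for illustration.

```python
import random
from typing import Optional

ROWS_PER_SEGMENT = 64  # rows in one cache segment


def pick_target_row(lock_base: int = 0, rng: Optional[random.Random] = None) -> int:
    """Choose a replacement target among the rows that are not locked down.

    Rows 0 .. lock_base-1 are treated as locked (an assumption of this sketch);
    the target is drawn from lock_base .. 63.
    """
    if not 0 <= lock_base < ROWS_PER_SEGMENT:
        raise ValueError("at least one row must remain available for replacement")
    rng = rng or random.Random()
    return rng.randrange(lock_base, ROWS_PER_SEGMENT)
```

With `lock_base=8`, for instance, rows 0 to 7 are never selected, so whatever was loaded into them stays resident, while the remaining 56 rows continue to be replaced at random.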
Figure 4.2 shows the 4KB Instruction Cache or Data Cache architecture:
Address bits 5 to 4 select one of the four cache segments.
Bits 3 to 2 select a word in the cache line.
The CAM allows 64 address TAGs to be stored for addresses that select a given segment (64-way associativity). This reduces the chance that an address sequence that repeatedly selects the same segment, for example in a program loop, replaces data that is required again in a later iteration of the loop. The cost of this high associativity is the need to store a larger TAG, in this case 26 bits per line. Figure 4.1 shows how the address space accesses the 4KB I Cache and 4KB D Cache.
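The address split described above can be written out as a short sketch. It assumes a 32-bit address whose bits 1 to 0 select a byte within a word, which is what leaves 26 bits (31 to 6) for the TAG. The function and field names are hypothetical.

```python
from typing import NamedTuple


class CacheAddress(NamedTuple):
    tag: int      # bits 31..6, compared against the CAM
    segment: int  # bits 5..4, selects one of four segments
    word: int     # bits 3..2, selects a word in the line
    byte: int     # bits 1..0, byte within the word (assumed)


def split_address(addr: int) -> CacheAddress:
    """Break a 32-bit address into the fields used for a cache access."""
    if not 0 <= addr <= 0xFFFFFFFF:
        raise ValueError("expected a 32-bit address")
    return CacheAddress(
        tag=addr >> 6,
        segment=(addr >> 4) & 0x3,
        word=(addr >> 2) & 0x3,
        byte=addr & 0x3,
    )
```

For example, `split_address(0x8034)` gives TAG 0x200, segment 3, word 1 and byte 0. Addresses 0x8000 and 0x8040 both select segment 0 but carry different TAGs (0x200 and 0x201), so they can occupy different rows of the same segment at the same time.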
Two additional bits are used on each segment row line:
The valid bit is set once the cache line has been written with valid data. Only a valid line can return a hit during a CAM lookup. On reset, all the valid bits are cleared.
The dirty bit is associated with write operations in the D Cache and is used to indicate that a cache line contains data that differs from data stored at the address in external memory (data can only be marked dirty if it resides in a writeback protection region).
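The role of the two status bits can be illustrated with a small model. It is a sketch only: the `Line` class, its methods and the `writeback_region` flag are hypothetical names, and the model covers just the behaviour stated above (valid set on a line fill, hits only from valid lines, dirty set only for writes in a write-back region).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Line:
    """Illustrative cache line with its valid and dirty bits (hypothetical model)."""

    valid: bool = False  # cleared on reset
    dirty: bool = False  # only meaningful in the D Cache
    tag: int = 0
    words: List[int] = field(default_factory=lambda: [0, 0, 0, 0])

    def load(self, tag: int, words: List[int]) -> None:
        """Line fill: the line now holds valid data that matches external memory."""
        self.tag, self.words = tag, list(words)
        self.valid, self.dirty = True, False

    def hits(self, tag: int) -> bool:
        """Only a valid line can return a hit."""
        return self.valid and self.tag == tag

    def write(self, word: int, value: int, writeback_region: bool) -> None:
        """Update one word; mark the line dirty only in a write-back region."""
        self.words[word] = value
        if writeback_region:
            self.dirty = True
```

A freshly reset `Line()` never hits, even if its stored TAG happens to equal the access TAG, because its valid bit is clear.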