The ICache and DCache contain copies of information normally held in main memory. If these copies of memory information get out of step with each other because one is updated and the other is not updated, they are said to have become incoherent. If the DCache contains a line that has been modified by a store or swap instruction, and the main memory has not been updated, the cache line is said to be dirty. Clean operations force the cache to write dirty lines back to main memory. The ICache then has to be made coherent with a changed area of memory after any changes to the instructions that appear at an MVA, and before the new instructions are executed.
On the ARM922T, software is responsible for maintaining coherence between main memory, the ICache, and the DCache.
Register 7, cache operations register, describes the facilities for invalidating the entire ICache or individual ICache lines, for cleaning and/or invalidating DCache lines, and for invalidating the entire DCache.
To clean the entire DCache efficiently, software must loop through each cache entry using the clean D single entry (using index) operation or the clean and invalidate D entry (using index) operation. You must perform this using a two-level nested loop going through each index value for each segment. See DCache organization.
Example 4.4 shows an example loop for two alternative DCache cleaning operations.
Example 4.4. DCache cleaning loop
for seg = 0 to 3
    for index = 0 to 63
        Rd = {seg,index}
        MCR p15,0,Rd,c7,c10,2    ; Clean DCache single entry (using index)
        or
        MCR p15,0,Rd,c7,c14,2    ; Clean and Invalidate DCache single entry (using index)
    next index
next seg
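The nested loop above can also be sketched in C. This is a minimal illustration, not the reference implementation. It assumes that the index occupies bits [31:26] and the segment bits [6:5] of Rd, which should be checked against the Register 7 index format. The names dcache_index_word, dcache_for_each_entry, and record_rd are hypothetical. On the target, the hook passed in would wrap one of the two MCR instructions from Example 4.4. The host-side stand-in here only records the value it receives.

```c
#include <stdint.h>

/* Geometry taken from Example 4.4: 4 segments, 64 index values each. */
#define DCACHE_SEGMENTS 4u
#define DCACHE_INDEXES  64u

/* Hypothetical hook type. On the target this would wrap
 * MCR p15,0,Rd,c7,c10,2 or MCR p15,0,Rd,c7,c14,2. */
typedef void (*dcache_index_op)(uint32_t rd);

/* Build Rd = {seg,index}.
 * ASSUMPTION: index in bits [31:26], segment in bits [6:5]. */
static uint32_t dcache_index_word(uint32_t seg, uint32_t index)
{
    return (index << 26) | (seg << 5);
}

/* Visit every DCache entry once and return the number of entries visited. */
static unsigned dcache_for_each_entry(dcache_index_op op)
{
    unsigned count = 0;
    for (uint32_t seg = 0; seg < DCACHE_SEGMENTS; seg++) {
        for (uint32_t index = 0; index < DCACHE_INDEXES; index++) {
            op(dcache_index_word(seg, index));
            count++;
        }
    }
    return count;
}

/* Host-side stand-in for the MCR wrapper: records the last Rd value only. */
static uint32_t last_rd;
static void record_rd(uint32_t rd)
{
    last_rd = rd;
}
```

Calling dcache_for_each_entry(record_rd) visits 4 x 64 = 256 entries. Passing the operation as a function pointer lets the same loop serve both the clean and the clean and invalidate variants.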
DCache, ICache, and memory coherence is generally achieved by:
cleaning the DCache to ensure memory is up to date with all changes
invalidating the ICache to ensure that the ICache is forced to re-fetch instructions from memory.
Software can minimize the performance penalties of cleaning and invalidating caches by:
Cleaning only small portions of the DCache when only a small area of memory has to be made coherent, for example, when updating an exception vector entry. Use Clean DCache single entry (using MVA) or Clean and Invalidate DCache single entry (using MVA).
Invalidating only small portions of the ICache when only a small number of instructions are modified, for example, when updating an exception vector entry. Use Invalidate ICache single entry (using MVA).
Not invalidating the ICache in situations where it is known that the modified area of memory cannot be in the cache, for example, when mapping a new page into the currently running process.
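Restricting a clean or invalidate to a small region, as suggested above, amounts to applying a single-entry (using MVA) operation once per cache line that the region touches. The following is a minimal sketch that assumes a line length of 8 words (32 bytes) and a region that does not wrap past the top of the address space. The names cache_for_each_line and record_mva are hypothetical. On the target, the hook would wrap the relevant MCR instruction, for example Clean DCache single entry (using MVA) followed by Invalidate ICache single entry (using MVA).

```c
#include <stdint.h>

/* ASSUMPTION: 8-word (32-byte) cache lines. */
#define CACHE_LINE_BYTES 32u

/* Hypothetical hook type. On the target this would wrap a
 * single-entry (using MVA) cache operation. */
typedef void (*cache_mva_op)(uint32_t mva);

/* Apply op to each cache line overlapping [start, start + size).
 * Returns the number of lines visited. The region must not wrap
 * around the top of the 32-bit address space. */
static unsigned cache_for_each_line(uint32_t start, uint32_t size,
                                    cache_mva_op op)
{
    unsigned count = 0;
    if (size == 0) {
        return 0;
    }
    /* Round both ends down to a line boundary. */
    uint32_t line = start & ~(CACHE_LINE_BYTES - 1u);
    uint32_t last = (start + size - 1u) & ~(CACHE_LINE_BYTES - 1u);
    for (;;) {
        op(line);
        count++;
        if (line == last) {
            break;
        }
        line += CACHE_LINE_BYTES;
    }
    return count;
}

/* Host-side stand-in for the MCR wrapper: records the last MVA only. */
static uint32_t last_mva;
static void record_mva(uint32_t mva)
{
    last_mva = mva;
}
```

Updating a single 4-byte exception vector entry therefore costs one line operation per cache, compared with 256 entry operations for a full DCache clean.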
Situations that necessitate cache cleaning and invalidating include:
Writing instructions to a cachable area of memory using STR or STM instructions, for example:
self-modifying code
JIT compilation
copying code from another location
downloading code using the EmbeddedICE JTAG debug features
updating an exception vector entry.
Another bus master, such as a DMA controller, modifying a cachable area of main memory.
Turning the MMU on or off.
Changing the virtual-to-physical mappings, or Ctt, or Btt, or protection information, in the MMU page tables. The DCache must be cleaned, and both caches invalidated, before the cache and write buffer configuration of an area of memory is changed by modifying Ctt or Btt in the MMU translation table descriptor. This is not necessary if it is known that the caches cannot contain any entries from the area of memory whose translation table descriptor is being modified.
Turning the ICache or DCache on, if its contents are no longer coherent.
Changing the FCSE PID in CP15 register 13 does not change the contents of the cache or memory, and does not affect the mapping between cache entries and physical memory locations. It only changes the mapping between ARM9TDMI addresses and cache entries. This means that changing the FCSE PID does not lead to any coherency issues. No cache cleaning or cache invalidation is required when the FCSE PID is changed.
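The effect of the FCSE PID on addressing can be illustrated with a small C model. This is a sketch under the assumption that the FCSE behaves as generally described for ARM9 family cores: a virtual address in the bottom 32MB (bits [31:25] all zero) has those bits replaced by the 7-bit FCSE PID to form the MVA, and any other address passes through unchanged. The function name fcse_va_to_mva is hypothetical.

```c
#include <stdint.h>

/* Model of the FCSE translation from a core-issued virtual address
 * to a modified virtual address (MVA).
 * ASSUMPTION: pid is the 7-bit FCSE PID held in bits [31:25] of
 * CP15 register 13; only its low 7 bits are used here. */
static uint32_t fcse_va_to_mva(uint32_t va, uint32_t pid)
{
    if ((va >> 25) == 0u) {
        /* Address in the bottom 32MB: relocate by the FCSE PID. */
        return va | ((pid & 0x7Fu) << 25);
    }
    /* Addresses at or above 32MB are not relocated. */
    return va;
}
```

With this model, the address 0x00001000 becomes MVA 0x02001000 under PID 1 and MVA 0x04001000 under PID 2. The two processes select different cache entries, and the entries already in the cache keep their association with physical memory, which is consistent with no cleaning or invalidation being required on a PID change.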
The software design must also consider that the pipelined design of the ARM9TDMI core means that it fetches three instructions ahead of the current execution point. So, for example, the three instructions following an MCR that invalidates the ICache have already been read from the ICache before it is invalidated.