3.3.2 Processor L1 cache programming

This section describes the recommended practice for programming the processor L1 cache. It is not, however, intended as an exhaustive guide to writing driver software.

Initialization

After power and reset have been applied, the cache starts up in a disabled state and begins its invalidation process. Because the cache is disabled, all accesses arriving at it bypass the cache and are not cached.

You can enable the cache at any point during the invalidation process by setting the CACHEEN control bit to 1. However, all accesses through the cache are still treated as uncached and bypass the cache until the cache invalidation process completes.

At the end of the cache invalidation process, the IC_STATUS interrupt status is asserted, and if that interrupt is already enabled or is enabled at a later stage, an interrupt is raised. If caching of code fetches is important, you can poll this status register, or wait for this interrupt to be raised, before continuing code execution.
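The following C sketch illustrates this start-up sequence. The register layout it uses (the base address, the CTRL and IRQ_STAT offsets, and the bit positions) is not defined in this section and is assumed purely for illustration; substitute the values from the register description of your device.

    #include <stdint.h>

    /* Hypothetical register map -- the base address, offsets, and bit
     * positions below are assumptions, not taken from this document. */
    #define CACHE_BASE      0x40001000u
    #define CACHE_CTRL      (*(volatile uint32_t *)(CACHE_BASE + 0x00u))
    #define CACHE_IRQ_STAT  (*(volatile uint32_t *)(CACHE_BASE + 0x04u))

    #define CTRL_CACHEEN    (1u << 0)   /* assumed CACHEEN bit position   */
    #define IRQ_IC_STATUS   (1u << 0)   /* assumed IC_STATUS bit position */

    /* Enable the cache after reset and, if cached code fetches matter,
     * wait for the invalidation that started at reset to complete. */
    static void cache_enable_after_reset(void)
    {
        /* The cache can be enabled while invalidation is still running;
         * accesses simply bypass the cache until invalidation completes. */
        CACHE_CTRL |= CTRL_CACHEEN;

        /* Poll IC_STATUS rather than waiting for the interrupt. */
        while ((CACHE_IRQ_STAT & IRQ_IC_STATUS) == 0u) {
            /* accesses are still uncached while invalidation runs */
        }
    }

An interrupt-driven alternative is to enable the IC_STATUS interrupt and continue with other work until the interrupt handler signals that invalidation has completed.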

Cache disable

You can disable the cache at any time by clearing the CACHEEN control bit.

Any outstanding accesses are completed before the cache is disabled. Therefore, to determine when the cache is finally disabled, after clearing the CACHEEN bit the software can either poll the CDC_STATUS register or enable the CDC interrupt and wait for the interrupt to arrive.
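A minimal polling sketch of this sequence follows, reusing the hypothetical register map from the initialization sketch above; the CDC_STATUS offset and the bit tested are likewise assumptions for illustration only.

    #include <stdint.h>

    /* Hypothetical register map, as in the initialization sketch --
     * addresses and bit positions are assumptions, not from this document. */
    #define CACHE_BASE        0x40001000u
    #define CACHE_CTRL        (*(volatile uint32_t *)(CACHE_BASE + 0x00u))
    #define CACHE_CDC_STATUS  (*(volatile uint32_t *)(CACHE_BASE + 0x08u))

    #define CTRL_CACHEEN      (1u << 0)  /* assumed CACHEEN bit position    */
    #define CDC_DONE          (1u << 0)  /* assumed "cache disabled" status */

    /* Disable the cache and wait until it is actually disabled. */
    static void cache_disable(void)
    {
        /* Clear CACHEEN. Outstanding accesses complete before the cache
         * is finally disabled. */
        CACHE_CTRL &= ~CTRL_CACHEEN;

        /* Poll CDC_STATUS; alternatively, enable the CDC interrupt and
         * wait for it to arrive. */
        while ((CACHE_CDC_STATUS & CDC_DONE) == 0u) {
            /* cache not yet disabled */
        }
    }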

Cache invalidation

You can invalidate the cache at any time by setting FINV. Setting this bit triggers a full cache invalidation.

During cache invalidation, all accesses through the cache are still treated as uncached and bypass the cache until the cache invalidation process completes. At the end of the cache invalidation process, the IC_STATUS interrupt status is asserted. If that interrupt is already enabled or is enabled at a later stage, an interrupt is raised.
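The sketch below shows a software-triggered full invalidation under the same assumptions as the earlier sketches; the FINV and IC_STATUS bit positions shown are illustrative, not taken from this document.

    #include <stdint.h>

    /* Hypothetical register map, as in the earlier sketches -- addresses
     * and bit positions are assumptions, not from this document. */
    #define CACHE_BASE       0x40001000u
    #define CACHE_CTRL       (*(volatile uint32_t *)(CACHE_BASE + 0x00u))
    #define CACHE_IRQ_STAT   (*(volatile uint32_t *)(CACHE_BASE + 0x04u))

    #define CTRL_FINV        (1u << 1)   /* assumed FINV bit position      */
    #define IRQ_IC_STATUS    (1u << 0)   /* assumed IC_STATUS bit position */

    /* Trigger a full cache invalidation and wait for it to complete. */
    static void cache_invalidate_all(void)
    {
        /* Setting FINV starts a full invalidation of the cache. */
        CACHE_CTRL |= CTRL_FINV;

        /* Accesses bypass the cache until invalidation completes. Poll
         * IC_STATUS, or take the interrupt if it is enabled. */
        while ((CACHE_IRQ_STAT & IRQ_IC_STATUS) == 0u) {
            /* invalidation in progress */
        }
    }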

Performance targets

The cache aims to improve the average performance of the connected processor by holding a local copy of previously accessed or specified memory locations.

With zero-cycle access time on address hits, the design improves performance significantly, especially when code memory access is slowed down by slow memory, a large clock ratio difference, or intensive parallel usage.

The cache can actively reduce processor performance in the following scenarios:

Writes
If the INVMAT feature is used and a write access has to be compared, there is an extra cycle of bus latency.
Cache misses

A cache miss triggers a fetch and adds an extra cycle of bus latency for the initial data.

Subsequent transactions are also stalled while the rest of the fetch process occurs. The extra latency is determined by the time it takes for the memory subsystem to return the rest of the WRAP4 transaction.

The cache supports the comparison of stalled and ongoing WRAP4 transactions, and can respond on the fly with matching transactions.
