This section describes the implementation of the L1 memory system.
The Cortex®-A55 core supports a single limited order range that includes the entire memory space.
The Cortex-A55 core supports the atomic instructions added in the Arm®v8.1‑A architecture.
Atomic instructions to cacheable memory can be performed as either near atomics or far atomics, depending on where the cache line containing the data resides. If the instruction hits in the L1 data cache in a unique state, it is performed as a near atomic in the L1 memory system. If the atomic operation misses in the L1 data cache, or the line is shared with another core, the atomic is sent as a far atomic out to the L3 cache. If the operation misses everywhere within the cluster, the master interface is configured as CHI, and the interconnect supports far atomics, the atomic is passed on to the interconnect to perform the operation. If the operation hits anywhere inside the cluster, or the interconnect does not support atomics, the L3 memory system performs the atomic operation and allocates the line into the L3 cache if it is not already there.
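The near/far decision described above can be summarized as a small decision function. This is only an illustrative software model of the prose, not a description of the hardware logic: the type names, field names, and the function `atomic_site` are hypothetical, and treating every remaining case (including a non-CHI master interface) as handled by the L3 memory system is an assumption drawn from the last sentence of the paragraph.

```c
#include <stdbool.h>

/* Where an atomic to cacheable memory is performed (illustrative model). */
typedef enum {
    ATOMIC_NEAR_L1,          /* near atomic in the L1 memory system */
    ATOMIC_FAR_L3,           /* far atomic performed by the L3 memory system */
    ATOMIC_FAR_INTERCONNECT  /* far atomic passed on to the interconnect */
} atomic_site_t;

/* Hypothetical inputs describing the state seen by one atomic operation. */
typedef struct {
    bool l1_hit_unique;            /* hit in the L1 data cache in a unique state */
    bool hit_in_cluster;           /* line is present somewhere inside the cluster */
    bool master_is_chi;            /* master interface is configured as CHI */
    bool interconnect_far_atomics; /* interconnect supports far atomics */
} atomic_ctx_t;

static atomic_site_t atomic_site(const atomic_ctx_t *c)
{
    /* A unique L1 hit is handled locally as a near atomic. */
    if (c->l1_hit_unique)
        return ATOMIC_NEAR_L1;

    /* A miss everywhere in the cluster goes to the interconnect,
     * but only over CHI and only if the interconnect supports far atomics. */
    if (!c->hit_in_cluster && c->master_is_chi && c->interconnect_far_atomics)
        return ATOMIC_FAR_INTERCONNECT;

    /* Assumption: every other case is performed by the L3 memory system. */
    return ATOMIC_FAR_L3;
}
```

For example, a context with `l1_hit_unique` false and `hit_in_cluster` true maps to `ATOMIC_FAR_L3` regardless of the interconnect capabilities.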
The Cortex-A55 core supports atomics to device or non-cacheable memory. However, this relies on the interconnect also supporting atomics. If such an atomic instruction is executed when the interconnect does not support atomics, it results in a synchronous Data Abort (for load atomics) or an asynchronous Data Abort (for store atomics). The behavior of the atomic instructions can be modified by the CPUECTLR register settings.
For more information on the CPUECTLR register, see B2.30 CPUECTLR_EL1, CPU Extended Control Register, EL1.
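From software, these atomic instructions are usually reached through the C11 atomics library rather than written by hand. The sketch below assumes a GCC or Clang toolchain; when built for a target with the Armv8.1‑A atomics enabled (for example with `-march=armv8.1-a`), such compilers typically emit single atomic instructions such as LDADD or LDSET for these calls, and fall back to exclusive load/store loops otherwise. The helper names `counter_add` and `flags_set` are hypothetical, and whether an operation executes as a near or far atomic is decided by the hardware, not by this code.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically add n to a shared counter and return the previous value.
 * Relaxed ordering: only the atomicity of the update is required. */
static uint64_t counter_add(_Atomic uint64_t *counter, uint64_t n)
{
    return atomic_fetch_add_explicit(counter, n, memory_order_relaxed);
}

/* Atomically set the bits in mask and return the previous flags value.
 * Acquire-release ordering: the update also orders surrounding accesses. */
static uint64_t flags_set(_Atomic uint64_t *flags, uint64_t mask)
{
    return atomic_fetch_or_explicit(flags, mask, memory_order_acq_rel);
}
```

Both helpers return the value held before the update, which matches the load-and-modify form of the atomic instructions.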
The core supports Load-Acquire instructions adhering to the RCpc consistency semantic introduced in the Armv8.3‑A extensions. This is reflected in register ID_AA64ISAR1_EL1, where bits[23:20] are set to 0b0001 to indicate that the core supports the LDAPRB, LDAPRH, and LDAPR instructions implemented in AArch64.
For more information on the ID_AA64ISAR1_EL1 register, see B2.58 ID_AA64ISAR0_EL1, AArch64 Instruction Set Attribute Register 0, EL1.
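As a rough illustration, software that already holds a copy of the ID_AA64ISAR1_EL1 value can test bits[23:20] as follows. The function names are hypothetical, reading the register itself (an MRS at a suitable Exception level) is deliberately left out, and treating any nonzero field value as "supported" is an assumption based on the usual Arm ID register scheme.

```c
#include <stdbool.h>
#include <stdint.h>

/* Extract bits[23:20] from a previously read ID_AA64ISAR1_EL1 value. */
static unsigned isar1_bits_23_20(uint64_t isar1)
{
    return (unsigned)((isar1 >> 20) & 0xFu);
}

/* Assumption: a nonzero field means the RCpc Load-Acquire
 * instructions are implemented in AArch64. */
static bool supports_rcpc_load_acquire(uint64_t isar1)
{
    return isar1_bits_23_20(isar1) != 0;
}
```

Passing a value with only bit 20 set, `0x00100000`, yields a field value of 1 and reports support.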
The core has a specific behavior for memory regions that are marked as Write-Back cacheable and transient, as defined in the Armv8‑A architecture.
For any load that is targeted at a memory region that is marked as transient, the following occurs:
For stores that are targeted at a memory region that is marked as transient, if the store misses in the L1 data cache, the line is allocated into the L2 cache.
Non-temporal loads indicate to the caches that the data is likely to be used for only short periods, for example, when streaming single-use read data that is then discarded. In addition to non-temporal loads, there are also prefetch-memory (PRFM) instructions with the STRM qualifier.
Non-temporal loads cause allocation into the L1 data cache, with the same performance as normal loads. However, when a later linefill is allocated into the cache, the cache line marked as non-temporal has higher priority to be replaced. To prevent pollution of the L2 cache, a non-temporal line that is evicted from L1 is not allocated to L2, as would happen for a normal line.
Non-temporal stores are treated the same as stores to a memory region that is marked as transient.
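One way software can express a streaming hint for single-use read data is through a compiler prefetch builtin. The sketch below assumes GCC or Clang, where `__builtin_prefetch(addr, 0, 0)` requests a read prefetch with no temporal locality; on AArch64 these compilers typically lower it to a PRFM instruction with a streaming (STRM) qualifier, but that mapping is a toolchain detail and should be checked in the generated assembly. The function name `sum_streaming` and the prefetch distance are illustrative and untuned.

```c
#include <stddef.h>
#include <stdint.h>

/* How many elements ahead to prefetch; an illustrative, untuned value. */
#define PREFETCH_AHEAD 64

/* Sum a buffer that is read once and then discarded. */
static uint64_t sum_streaming(const uint32_t *data, size_t n)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* Hint that data further ahead will be read once:
         * second argument 0 = read, third argument 0 = no temporal locality. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 0);

        sum += data[i];
    }
    return sum;
}
```

The prefetch is only a hint, so the function returns the same result whether or not the hardware acts on it.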