A6.2.6 Write streaming mode

A cache line is allocated to the L1 on either a read miss or a write miss.

However, allocating on writes is not always required, for example when the C standard library memset() function clears a large block of memory to a known value. Writes of large blocks of data can pollute the cache with unnecessary data. A linefill can also waste power and performance if its data is then discarded because the memset() subsequently writes the entire line.

To counter this, the L1 memory system includes logic to detect when the core has stores pending to a full cache line while it is waiting for a linefill to complete, or when it detects a DC ZVA (a full cache line write of zeros). If either situation is detected, the L1 memory system switches into write streaming mode.
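
As an illustration of the second trigger, the following minimal sketch zeroes a buffer with DC ZVA where the instruction is available. The helper names dc_zva_block_size() and zero_region() are hypothetical and do not come from this manual. The sketch assumes a GCC or Clang compatible compiler, assumes the buffer is Normal memory, and falls back to memset() on non-AArch64 builds or when DCZID_EL0.DZP reports that DC ZVA is prohibited. Whether a given sequence actually triggers write streaming mode depends on the core and is not guaranteed by this code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the DC ZVA block size in bytes, or 0 when
 * DC ZVA cannot be used (non-AArch64 build, or DCZID_EL0.DZP is set). */
static size_t dc_zva_block_size(void)
{
#if defined(__aarch64__)
    uint64_t dczid;
    __asm__ volatile("mrs %0, dczid_el0" : "=r"(dczid));
    if (dczid & (1u << 4))              /* DZP: DC ZVA is prohibited */
        return 0;
    return (size_t)4 << (dczid & 0xf);  /* BS: log2 of block size in words */
#else
    return 0;
#endif
}

/* Hypothetical helper: zero len bytes starting at buf. Whole, aligned
 * blocks are zeroed with DC ZVA; the unaligned head and tail use memset(). */
static void zero_region(void *buf, size_t len)
{
    unsigned char *p = buf;
    size_t bs = dc_zva_block_size();

    if (bs == 0 || len < bs) {
        memset(p, 0, len);
        return;
    }

    /* Head: bytes before the first block-aligned address (always < bs). */
    size_t head = (size_t)(-(uintptr_t)p & (bs - 1));
    memset(p, 0, head);
    p += head;
    len -= head;

#if defined(__aarch64__)
    /* Body: one DC ZVA per naturally aligned block. */
    while (len >= bs) {
        __asm__ volatile("dc zva, %0" : : "r"(p) : "memory");
        p += bs;
        len -= bs;
    }
#endif

    /* Tail: whatever is left after the last whole block. */
    memset(p, 0, len);
}
```

Calling zero_region(buf, size) on a large buffer issues a run of full-block zeroing writes, which is the access pattern described above.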

In write streaming mode, loads behave as normal and can still cause linefills. Writes still look up in the cache, but if they miss, they write out to L2 (or possibly L3, system cache, or DRAM) rather than starting a linefill.

The L1 memory system continues in write streaming mode until it can no longer create a full cache line of stores (for example, because of a lack of resources in the L1 memory system) or it detects a high proportion of stores hitting in the cache.
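
The entry and exit conditions can be pictured with a small software model. The following is a toy sketch only and does not describe the actual hardware logic. The type ws_model, the function ws_store(), the 16-store observation window, and the 50 percent hit-rate exit threshold are all assumptions made for illustration. For simplicity, the model treats the store that triggers the mode switch as still paying for the linefill that was already in flight.

```c
#include <stdbool.h>

/* Illustrative assumptions, not documented values. */
static const unsigned WS_WINDOW = 16;           /* stores per observation window */
static const unsigned WS_EXIT_HIT_PERCENT = 50; /* hit rate that ends streaming  */

typedef enum { STORE_HIT, STORE_LINEFILL, STORE_STREAMED } store_result;

typedef struct {
    bool streaming;   /* true while in write streaming mode        */
    unsigned stores;  /* stores seen in the current window         */
    unsigned hits;    /* stores in the window that hit in the cache */
} ws_model;

/* Leave streaming mode if the window shows a high proportion of hits. */
static void ws_window_check(ws_model *m)
{
    if (m->stores < WS_WINDOW)
        return;
    if (m->hits * 100 >= WS_EXIT_HIT_PERCENT * m->stores)
        m->streaming = false;
    m->stores = 0;
    m->hits = 0;
}

/* Model one store. hit: the store hits in L1. full_line: the pending
 * stores cover a whole cache line (or the store is a DC ZVA). */
static store_result ws_store(ws_model *m, bool hit, bool full_line)
{
    if (hit) {
        if (m->streaming) {
            m->stores++;
            m->hits++;
            ws_window_check(m);
        }
        return STORE_HIT;
    }

    if (!m->streaming) {
        if (full_line) {          /* entry condition detected */
            m->streaming = true;
            m->stores = 0;
            m->hits = 0;
        }
        return STORE_LINEFILL;    /* miss allocates as usual */
    }

    if (!full_line) {             /* cannot form a full line: exit */
        m->streaming = false;
        return STORE_LINEFILL;
    }

    m->stores++;
    ws_window_check(m);
    return STORE_STREAMED;        /* written out without a linefill */
}
```

Loads are not modeled because they behave as normal in either mode.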


The L1 memory system monitors transaction traffic through L1 and, depending on different thresholds, can set a stream to go out to L2, L3, system cache, or DRAM.

The following registers control the different thresholds:

AArch64 state
CPUECTLR_EL1 configures the L2, L3, and system cache write streaming mode thresholds. See B2.32 CPUECTLR_EL1, CPU Extended Control Register, EL1.
Copyright © 2016–2019 Arm Limited or its affiliates. All rights reserved.