3.12.3. Cache Coherent Interface

Extending hardware coherency to a multi-cluster system requires a coherent bus protocol. In 2011, ARM released the AMBA® 4 ACE specification, which introduces the AXI Coherency Extensions (ACE) on top of the AXI protocol. The full ACE interface provides hardware coherency between clusters, so that an SMP operating system can extend to more cores.

The ACE protocol adds three coherency channels in addition to the normal five channels of AXI. If you have two clusters, a shared memory access from one cluster can snoop into the cache of the other cluster to see if the data is already on chip. If it is not, the data is fetched from external memory.
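As an illustration of the channel count, the following minimal sketch lists the five AXI channels alongside the three snoop channels that ACE adds. The channel abbreviations follow the usual AMBA naming. The `Channel` enumeration and the `channels_for` helper are hypothetical names introduced here for illustration only and are not part of any ARM software.

```python
from enum import Enum


class Channel(Enum):
    # The five AXI channels
    AR = "read address"
    R = "read data"
    AW = "write address"
    W = "write data"
    B = "write response"
    # The three coherency (snoop) channels added by ACE
    AC = "snoop address"
    CR = "snoop response"
    CD = "snoop data"


AXI_CHANNELS = (Channel.AR, Channel.R, Channel.AW, Channel.W, Channel.B)
ACE_SNOOP_CHANNELS = (Channel.AC, Channel.CR, Channel.CD)


def channels_for(interface: str):
    """Return the channels of an interface (hypothetical helper)."""
    if interface == "AXI":
        return AXI_CHANNELS
    if interface == "ACE":
        # Full ACE keeps every AXI channel and adds the snoop channels.
        return AXI_CHANNELS + ACE_SNOOP_CHANNELS
    raise ValueError(f"unknown interface: {interface}")


if __name__ == "__main__":
    for ch in channels_for("ACE"):
        print(f"{ch.name:>2}: {ch.value}")
```

The interconnect uses the snoop address channel to send a snoop to a cluster. The cluster answers on the snoop response channel and, when it holds the line, can return the data on the snoop data channel.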

The AMBA 4 ACE-Lite interface is designed for I/O coherent system masters such as DMA engines, network interfaces, and GPUs. These devices might not have any caches of their own, but they can read shared data from the ACE processor data caches.

The CoreLink™ CCI-400 is one of the first implementations of AMBA 4 ACE. It supports up to two ACE processor clusters, enabling up to eight cores to see the same view of memory and run an SMP OS.

Figure 3.24 shows the steps in a coherent data read from the Cortex-A7 cluster to the Cortex-A15 cluster. This starts with the Cortex-A7 cluster issuing a Coherent Read Request. The CCI-400 forwards the request to the Cortex-A15 cluster, which snoops its cache.

Figure 3.24. Cache coherency in a multi-cluster system


When the Cortex-A15 cluster receives the request from the CCI-400, it checks whether the requested data is available in its cache and reports the result back to the CCI-400. If the requested data is in the cache, the CCI-400 moves the data from the Cortex-A15 cluster to the Cortex-A7 cluster, resulting in a cache linefill in the Cortex-A7 cluster. The CCI-400 and the ACE protocol enable full coherency between the Cortex-A15 and Cortex-A7 clusters, so data sharing can take place without external memory transactions.
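The read sequence described above can be sketched as a small behavioral model. This is a toy illustration only, not a description of how the CCI-400 is implemented: the `Cluster` and `Interconnect` classes, the 64-byte `LINE_SIZE`, and the addresses are all assumptions made for the example, and ACE cache line states are not modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

LINE_SIZE = 64  # assumed cache line size in bytes, for illustration only


@dataclass
class Cluster:
    """Hypothetical model of a processor cluster and its cache."""
    name: str
    cache: Dict[int, bytes] = field(default_factory=dict)

    def snoop(self, line_addr: int) -> Optional[bytes]:
        # Report whether the line is present; return its data on a hit.
        return self.cache.get(line_addr)


@dataclass
class Interconnect:
    """Hypothetical model of a coherent interconnect between clusters."""
    clusters: List[Cluster]
    memory: Dict[int, bytes]
    external_reads: int = 0

    def coherent_read(self, requester: Cluster, addr: int) -> Tuple[bytes, str]:
        # The requester is assumed to have already missed in its own cache.
        line_addr = addr - addr % LINE_SIZE
        # Snoop every other cluster before going off chip.
        for other in self.clusters:
            if other is requester:
                continue
            data = other.snoop(line_addr)
            if data is not None:
                requester.cache[line_addr] = data  # linefill from the peer
                return data, other.name
        # No cluster holds the line: fetch it from external memory.
        self.external_reads += 1
        data = self.memory[line_addr]
        requester.cache[line_addr] = data
        return data, "external memory"


if __name__ == "__main__":
    a7 = Cluster("Cortex-A7")
    a15 = Cluster("Cortex-A15", cache={0x1000: b"shared"})
    cci = Interconnect([a7, a15], memory={0x1000: b"stale", 0x2000: b"dram"})

    print(cci.coherent_read(a7, 0x1010))  # served by the Cortex-A15 cache
    print(cci.coherent_read(a7, 0x2000))  # falls back to external memory
    print("external reads:", cci.external_reads)
```

In this model the first read is satisfied by the snoop, so the external read counter stays at zero until the second read, which misses in both clusters.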

Copyright © 2014 ARM. All rights reserved. ARM DAI0425