|Home > Functional description > Technical overview > Components|
The cluster consists of:
For more information, see the Arm® DynamIQ™ Shared Unit Technical Reference Manual.
The following figure includes a top-level functional diagram of a core.
The IFU fetches instructions from the instruction cache or from external memory and predicts the outcome of branches in the instruction stream. It passes the instructions to the Data Processing Unit (DPU) for processing.
The DPU decodes and executes instructions. It executes instructions that require data transfer to or from the memory system by interfacing to the Data Cache Unit (DCU). The DPU includes the PMU, the Advanced SIMD and floating-point support, and the Cryptographic Extension.
The PMU provides six performance monitors that can be configured to gather statistics on the operation of each core and the memory system. The information can be used for debug and code profiling.
Advanced SIMD is a media and signal processing architecture that adds instructions primarily for audio, video, 3D graphics, image and speech processing. The floating-point architecture provides support for single-precision and double-precision floating-point operations.
All scalar floating-point instructions are available in the A64 instruction set.
All VFP instructions are available in the A32 and T32 instruction sets.
The A64 instruction set offers additional Advanced SIMD instructions, including double-precision floating-point vector operations.
The MMU provides fine-grained memory system control through a set of virtual-to-physical address mappings and memory attributes that are held in translation tables. These are saved into the Translation Lookaside Buffer (TLB) when an address is translated. The TLB entries include global and Address Space Identifiers (ASIDs) to prevent context switch TLB flushes. They also include Virtual Machine Identifiers (VMIDs) to prevent TLB flushes on virtual machine switches by the hypervisor.
A unified L2 TLB handles the misses from the L1 TLBs.
In implementations with core cache protection, parity bits protect the TLB RAMs by enabling the detection of any single-bit error. If an error is detected, the entry is invalidated and fetched again.
The L1 memory system includes the DCU, the Store Buffer (STB), and the Bus Interface Unit (BIU).
The DCU manages all load and store operations.
The L1 data cache RAMs are protected using Error Correction Codes (ECC). The ECC scheme is Single Error Correct Double Error Detect (SECDED). The DCU includes a combined local and global exclusive monitor that is used by Load-Exclusive and Store-Exclusive instructions.
The STB holds store operations when they have left the load/store pipeline in the DCU and have been committed by the DPU. The STB can request access to the L1 data cache, initiate linefills, or write to L2 and L3 memory systems.
The STB is also used to queue maintenance operations before they are broadcast to other cores in the cluster.
The L2 memory system contains the L2 cache. The L2 cache is optional and private to each core. The L2 cache is 4-way set associative, supports 64-byte cache lines, and has a configurable cache RAM size between 64KB and 256KB. The L2 memory system is connected to the DynamIQ Shared Unit through an optional asynchronous bridge.
The GIC CPU interface, when integrated with an external distributor component, is a resource for supporting and managing interrupts in a cluster system.
The DynamIQ Shared Unit (DSU) contains the L3 cache and logic required to maintain coherence between the cores in the cluster. For more information, see the Arm® DynamIQ™ Shared Unit Technical Reference Manual.
The Cortex-A55 core supports a range of debug, test, and trace options including:
Details of the core-specific debug elements can be found in this document. For information on the cluster debug and trace components supported by the Cortex-A55 core, see the Arm® DynamIQ™ Shared Unit Technical Reference Manual.