2.1.1 Components of the processor
The main components of the processor are:
- Instruction fetch.
- Instruction decode.
- Instruction dispatch.
- Integer execute.
- Load/Store unit.
- L2 memory system.
- Advanced SIMD and Floating-point unit.
- GIC CPU interface.
- Generic Timer.
- Debug and trace.
Instruction fetch
The instruction fetch unit fetches instructions from the L1 instruction cache and delivers up to three instructions per cycle to the instruction decode unit. It supports dynamic and static branch prediction.
The instruction fetch unit includes:
- L1 instruction cache that is a 48KB 3-way set-associative cache with a 64-byte cache line and optional dual-bit parity protection per 32 bits in the Data RAM and 36 bits in the Tag RAM.
- 48-entry fully-associative L1 instruction Translation Lookaside Buffer (TLB) with native support for 4KB, 64KB, and 1MB page sizes.
- 2-level dynamic predictor with Branch Target Buffer (BTB) for fast target generation.
- Static branch predictor.
- Indirect predictor.
- Return stack.
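The cache parameters in the list above determine how an address splits into tag, index, and offset fields. The following is a minimal sketch of that arithmetic for the 48KB, 3-way, 64-byte-line L1 instruction cache. The function name `cache_geometry` is hypothetical, and the calculation assumes a conventional set-associative organization; it illustrates the numbers and does not describe the actual RAM layout.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder or sets < 1 or sets & (sets - 1) or line_bytes & (line_bytes - 1):
        raise ValueError("expected a power-of-two set count and line size")
    offset_bits = line_bytes.bit_length() - 1  # log2 of the line size
    index_bits = sets.bit_length() - 1         # log2 of the number of sets
    return sets, offset_bits, index_bits


# L1 instruction cache figures from the list above: 48KB, 3-way, 64-byte line.
sets, offset_bits, index_bits = cache_geometry(48 * 1024, ways=3, line_bytes=64)
print(sets, offset_bits, index_bits)  # 256 6 8
```

Because the cache has three ways, the 48KB capacity still yields a power-of-two number of sets (256), so the index can be taken directly from address bits.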
Instruction decode
The instruction decode unit decodes the A32, T32, and A64 instruction sets, including their Advanced SIMD and Floating-point instructions. It also performs register renaming to facilitate out-of-order execution by removing Write-After-Write (WAW) and Write-After-Read (WAR) hazards.
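To show why renaming removes WAW and WAR hazards, here is a minimal sketch of the general technique. It is not the Cortex-A72 rename mechanism: the function `rename`, the three-operand instruction tuples, and the unlimited supply of physical registers (no free list or reclamation) are all simplifying assumptions made for illustration.

```python
from itertools import count


def rename(instructions):
    """Map architectural registers to fresh physical registers.

    instructions: list of (dest, src1, src2) architectural register names.
    Returns the same instructions using physical register names.
    """
    fresh = count()   # assumption: unlimited physical registers
    mapping = {}      # architectural register -> current physical register

    def phys(reg):
        # First use of a register that was never written gets its own mapping.
        if reg not in mapping:
            mapping[reg] = f"p{next(fresh)}"
        return mapping[reg]

    renamed = []
    for dest, *srcs in instructions:
        # Sources read the latest mapping, preserving true (RAW) dependencies.
        phys_srcs = [phys(s) for s in srcs]
        # Every write allocates a new physical register, so a later write
        # cannot overwrite a value an earlier instruction still uses.
        mapping[dest] = f"p{next(fresh)}"
        renamed.append((mapping[dest], *phys_srcs))
    return renamed


program = [
    ("r1", "r2", "r3"),  # writes r1
    ("r4", "r1", "r5"),  # reads r1
    ("r1", "r6", "r7"),  # writes r1 again: WAW with the first, WAR with the second
]
for insn in rename(program):
    print(insn)
```

After renaming, the first and third instructions write different physical registers, so the third no longer has to wait for the first to write or the second to read.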
Instruction dispatch
The instruction dispatch unit controls when the decoded instructions are dispatched to the execution pipelines and when the returned results are retired. It includes:
- The ARM core general-purpose registers.
- The Advanced SIMD and Floating-point register set.
- The AArch32 CP15 and AArch64 System registers.
Integer execute
The integer execute unit includes:
- Two Arithmetic Logical Unit (ALU) pipelines.
- Integer multiply-accumulate and ALU pipelines.
- Iterative integer divide hardware.
- Branch and instruction condition codes resolution logic.
- Result forwarding and comparator logic.
Load/Store unit
The Load/Store (LS) execution unit executes load and store instructions and encompasses the L1 data side memory system. It also services memory coherency requests from the L2 memory system.
The load/store unit includes:
- L1 data cache that is a 32KB 2-way set-associative cache with a 64-byte cache line and optional Error Correction Code (ECC) protection per 32 bits.
- 32-entry fully-associative L1 data TLB with native support for 4KB, 64KB, and 1MB page sizes.
- Automatic hardware prefetcher that generates prefetches targeting the L1D cache and the L2 cache.
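The entry count and page sizes of the L1 data TLB bound how much memory it can map without a miss. The sketch below works through that arithmetic. The helper `tlb_reach` is hypothetical, and the result is an upper bound that assumes every entry holds a page of the same size.

```python
KB = 1024
MB = 1024 * KB

# Page sizes with native support, as listed above.
PAGE_SIZES = {"4KB": 4 * KB, "64KB": 64 * KB, "1MB": 1 * MB}


def tlb_reach(entries, page_bytes):
    """Upper bound on the memory mapped when every entry holds one page."""
    return entries * page_bytes


# 32-entry L1 data TLB from the list above.
l1d_reach = {name: tlb_reach(32, size) for name, size in PAGE_SIZES.items()}
for name, reach in l1d_reach.items():
    print(f"{name} pages: {reach // KB} KB")
```

The same calculation applies to the 48-entry L1 instruction TLB by changing the entry count.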
L2 memory system
The L2 memory system services L1 instruction and data cache misses from each processor. It manages requests on the AMBA 4 AXI Coherency Extensions (ACE) or CHI master interface and the optional Accelerator Coherency Port (ACP) slave interface.
The L2 memory system includes:
- L2 cache that is:
  - 512KB, 1MB, 2MB, or 4MB in size.
  - 16-way set-associative, with optional data protection per 64 bits.
- Duplicate copy of the L1 data cache Tag RAMs from each processor for handling snoop requests.
- 1024-entry, 4-way set-associative L2 TLB in each processor.
- Automatic hardware prefetcher with programmable instruction fetch
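The associativity figures above imply the number of sets in the L2 TLB and in each L2 cache configuration. The sketch below computes both. The helper `set_count` is hypothetical, and the 64-byte L2 line size is an assumption: this section states a 64-byte line only for the L1 caches.

```python
KB = 1024
MB = 1024 * KB

L2_WAYS = 16             # from the list above
ASSUMED_LINE_BYTES = 64  # assumption: only the L1 line size is stated in this section


def set_count(total, ways, unit=1):
    """Number of sets for `total` bytes (or entries) split across `ways`."""
    sets, remainder = divmod(total, ways * unit)
    if remainder:
        raise ValueError("total must divide evenly into ways")
    return sets


# One result per configurable L2 cache size.
l2_cache_sets = {
    size: set_count(size, L2_WAYS, ASSUMED_LINE_BYTES)
    for size in (512 * KB, 1 * MB, 2 * MB, 4 * MB)
}

# The L2 TLB is sized in entries rather than bytes, so unit stays at 1.
l2_tlb_sets = set_count(1024, ways=4)

print(l2_cache_sets)
print(l2_tlb_sets)
```

Under these assumptions each doubling of L2 capacity doubles the set count while the associativity stays fixed at 16 ways.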
Advanced SIMD and Floating-point unit
The Advanced SIMD and Floating-point unit provides support for the ARMv8 Advanced SIMD and Floating-point execution. In addition, the Advanced SIMD and Floating-point unit provides support for the optional Cryptography engine.
Note: The optional Cryptography engine is not included in the base product of the Cortex-A72 processor. ARM requires licensees to have contractual rights to obtain the Cortex-A72 processor Cryptography engine.
GIC CPU interface
The Generic Interrupt Controller (GIC) CPU interface delivers interrupts to the processor.
Generic Timer
The Generic Timer provides the ability to schedule events and trigger interrupts.
Debug and trace
The debug and trace unit includes:
- Support for ARMv8 Debug architecture with an AMBA Advanced Peripheral Bus (APB) slave interface for access to the debug registers.
- Performance Monitor Unit (PMU) based on the PMUv3 architecture.
- Embedded Trace Macrocell (ETM) based on the ETMv4 architecture and an AMBA Advanced Trace Bus (ATB) interface for each processor.
- Cross trigger interfaces for