2.1.1 Components of the processor

The main components of the processor are:

  • Instruction fetch.
  • Instruction decode.
  • Instruction dispatch.
  • Integer execute.
  • Load/Store unit.
  • L2 memory system.
  • Advanced SIMD and Floating-point unit.
  • GIC CPU interface.
  • Generic Timer.
  • Debug and trace.

Instruction fetch

The instruction fetch unit fetches instructions from the L1 instruction cache and delivers up to three instructions per cycle to the instruction decode unit. It supports dynamic and static branch prediction.

The instruction fetch unit includes:
  • L1 instruction cache that is a 48KB 3-way set-associative cache with a 64-byte cache line and optional dual-bit parity protection per 32 bits in the Data RAM and 36 bits in the Tag RAM.
  • 48-entry fully-associative L1 instruction Translation Lookaside Buffer (TLB) with native support for 4KB, 64KB, and 1MB page sizes.
  • 2-level dynamic predictor with Branch Target Buffer (BTB) for fast target generation.
  • Static branch predictor.
  • Indirect predictor.
  • Return stack.
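
As a rough illustration of the L1 instruction cache geometry listed above, the following sketch derives the number of sets and the address bit widths from the stated size, associativity, and line length. The function name `cache_geometry` is hypothetical and used only for illustration; the sketch makes no claim about which address bits the hardware actually uses for indexing.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return (sets, offset_bits, index_bits) for a set-associative cache.

    Hypothetical helper: assumes the line length is a power of two and
    requires the number of sets to be a power of two as well.
    """
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder or sets & (sets - 1):
        raise ValueError("geometry does not give a power-of-two number of sets")
    offset_bits = line_bytes.bit_length() - 1  # bits selecting a byte in the line
    index_bits = sets.bit_length() - 1         # bits selecting a set
    return sets, offset_bits, index_bits


# 48KB, 3-way, 64-byte line, as stated for the L1 instruction cache.
l1i = cache_geometry(48 * 1024, ways=3, line_bytes=64)
print(l1i)  # (256, 6, 8)
```

The 3-way organization is what lets a 48KB capacity still divide into a power-of-two number of sets (256).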

Instruction decode

The instruction decode unit decodes the following instruction sets:

  • A32.
  • T32.
  • A64.

The instruction decode unit supports the A32, T32, and A64 Advanced SIMD and Floating-point instruction sets. The instruction decode unit also performs register renaming to facilitate out-of-order execution by removing Write-After-Write (WAW) and Write-After-Read (WAR) hazards.
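
To illustrate why renaming removes WAW and WAR hazards, here is a toy model, not a description of the Cortex-A72 implementation. It assumes four architectural registers and an unbounded supply of physical registers; the function name `rename` and the `(dest, src1, src2)` instruction format are hypothetical. Each destination gets a fresh physical register, so only true Read-After-Write dependencies remain.

```python
from itertools import count


def rename(instructions, num_arch_regs=4):
    """Rename (dest, src1, src2) triples of architectural register numbers.

    Toy model: physical registers are never reclaimed, unlike real hardware.
    """
    fresh = count(num_arch_regs)                     # next unused physical register
    mapping = {r: r for r in range(num_arch_regs)}   # architectural -> physical
    renamed = []
    for dest, *srcs in instructions:
        phys_srcs = tuple(mapping[s] for s in srcs)  # read sources before remapping dest
        mapping[dest] = next(fresh)                  # every write gets a new register
        renamed.append((mapping[dest], *phys_srcs))
    return renamed


program = [
    (1, 2, 3),  # r1 = r2 op r3
    (2, 0, 0),  # r2 = r0 op r0  (WAR on r2 against the first instruction)
    (1, 2, 0),  # r1 = r2 op r0  (WAW on r1 against the first instruction)
]
renamed = rename(program)
print(renamed)  # [(4, 2, 3), (5, 0, 0), (6, 5, 0)]
```

After renaming, no two instructions write the same physical register, and the second instruction no longer overwrites the register that the first one reads. The third instruction still reads the result of the second (physical register 5), which is the true dependency that must be preserved.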

Instruction dispatch

The instruction dispatch unit controls when the decoded instructions are dispatched to the execution pipelines and when the returned results are retired. It includes:

  • The ARM core general-purpose registers.
  • The Advanced SIMD and Floating-point register set.
  • The AArch32 CP15 and AArch64 System registers.

Integer execute

The integer execute unit includes:

  • Two Arithmetic Logical Unit (ALU) pipelines.
  • Integer multiply-accumulate and ALU pipelines.
  • Iterative integer divide hardware.
  • Branch and instruction condition codes resolution logic.
  • Result forwarding and comparator logic.

Load/Store unit

The Load/Store (LS) execution unit executes load and store instructions and encompasses the L1 data side memory system. It also services memory coherency requests from the L2 memory system.

The load/store unit includes:
  • L1 data cache that is a 32KB 2-way set-associative cache with a 64-byte cache line and optional Error Correction Code (ECC) protection per 32 bits.
  • 32-entry fully-associative L1 data TLB with native support for 4KB, 64KB, and 1MB page sizes.
  • Automatic hardware prefetcher that generates prefetches targeting the L1 data cache and the L2 cache.
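
The effect of the supported page sizes on the L1 data TLB can be sketched with a simple reach calculation: the maximum amount of memory the 32 entries can map at once. This is an upper bound that assumes every entry holds a page of the same size; the names `tlb_reach` and `PAGE_SIZES` are hypothetical.

```python
KB = 1024
MB = 1024 * KB


def tlb_reach(entries, page_bytes):
    """Upper bound on bytes mapped when every entry holds a page of this size."""
    return entries * page_bytes


# Page sizes stated for the L1 data TLB.
PAGE_SIZES = {"4KB": 4 * KB, "64KB": 64 * KB, "1MB": 1 * MB}

reach = {name: tlb_reach(32, size) for name, size in PAGE_SIZES.items()}
for name, nbytes in reach.items():
    print(f"{name} pages: up to {nbytes // KB} KB mapped")
```

With 4KB pages the 32 entries cover at most 128KB, while 1MB pages raise the bound to 32MB.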

L2 memory system

The L2 memory system services L1 instruction and data cache misses from each processor. It manages requests on the AMBA 4 AXI Coherency Extensions (ACE) or CHI master interface and the optional Accelerator Coherency Port (ACP) slave interface.

The L2 memory system includes:
  • L2 cache that is:
    • 512KB, 1MB, 2MB, or 4MB configurable size.
    • 16-way set-associative cache with optional data ECC protection per 64 bits.
  • Duplicate copy of L1 data cache Tag RAMs from each processor for handling snoop requests.
  • 1024-entry, 4-way set-associative L2 TLB in each processor.
  • Automatic hardware prefetcher with programmable instruction fetch distance.
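
The following sketch shows how the configurable L2 cache size changes the number of sets while the associativity stays at 16 ways, and how the L2 TLB divides into sets. The 64-byte L2 line length is an assumption carried over from the L1 caches, since this section does not state it; `l2_sets` and the constant names are hypothetical.

```python
LINE_BYTES = 64  # assumption: L2 line length is not stated in this section
L2_WAYS = 16     # stated: 16-way set-associative


def l2_sets(size_kb):
    """Number of sets for a given L2 size in KB, under the assumptions above."""
    return size_kb * 1024 // (L2_WAYS * LINE_BYTES)


# Configurable sizes: 512KB, 1MB, 2MB, 4MB.
l2_configs = {size_kb: l2_sets(size_kb) for size_kb in (512, 1024, 2048, 4096)}

# L2 TLB: 1024 entries organized as 4 ways.
l2_tlb_sets = 1024 // 4

print(l2_configs)
print(l2_tlb_sets)
```

Under these assumptions each doubling of the L2 size doubles the set count, which adds one bit to the set index.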

Advanced SIMD and Floating-point unit

The Advanced SIMD and Floating-point unit supports ARMv8 Advanced SIMD and Floating-point execution. It also supports the optional Cryptography engine.

Note

The optional Cryptography engine is not included in the base product of the Cortex-A72 processor. ARM requires licensees to have contractual rights to obtain the Cortex-A72 processor Cryptography engine.

GIC CPU interface

The Generic Interrupt Controller (GIC) CPU interface delivers interrupts to the processor.

Generic Timer

The Generic Timer provides the ability to schedule events and trigger interrupts.

Debug and trace

The debug and trace unit includes:

  • Support for ARMv8 Debug architecture with an AMBA Advanced Peripheral Bus (APB) slave interface for access to the debug registers.
  • Performance Monitor Unit (PMU) based on the PMUv3 architecture.
  • Embedded Trace Macrocell (ETM) based on the ETMv4 architecture and an AMBA Advanced Trace Bus (ATB) interface for each processor.
  • Cross trigger interfaces for core debugging.
Non-Confidential ARM 100095_0002_04_en
Copyright © 2014-2016 ARM. All rights reserved.