12.1. The Translation Lookaside Buffer

The Translation Lookaside Buffer (TLB) is a cache of recently accessed page translations in the MMU. For each memory access performed by the processor, the MMU checks whether the translation is cached in the TLB. If the requested address translation causes a hit within the TLB, the translation of the address is immediately available.

Each TLB entry typically contains not just the physical and Virtual Addresses, but also attributes such as memory type, cache policies, access permissions, the Address Space ID (ASID), and the Virtual Machine ID (VMID). If the TLB does not contain a valid translation for the Virtual Address issued by the processor, known as a TLB miss, an external translation table walk or lookup is performed. Dedicated hardware within the MMU enables it to read the translation tables in memory. If the translation table walk does not result in a page fault, the newly loaded translation can then be cached in the TLB for possible reuse. The exact structure of the TLB differs between implementations of ARM processors.
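The hit, miss, and fill behavior described above can be sketched as a small software model. This is only an illustration, not how any particular ARM implementation is built: the entry layout, the four-entry size, the round-robin replacement, the 4KB page size, and the toy_table_walk stand-in are all assumptions made for the example.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_TLB_ENTRIES 4      /* hypothetical size; real TLBs differ */
#define TOY_PAGE_SHIFT  12     /* assumes a 4KB granule */

typedef struct {
    bool     valid;
    uint64_t vpn;              /* virtual page number (VA >> 12) */
    uint64_t pfn;              /* physical frame number */
    uint16_t asid;             /* Address Space ID tag */
    uint16_t vmid;             /* Virtual Machine ID tag */
} toy_tlb_entry;

toy_tlb_entry toy_tlb[TOY_TLB_ENTRIES];
unsigned toy_next_victim;      /* round-robin replacement (assumption) */
unsigned toy_walks;            /* counts simulated table walks */

/* Stand-in for the hardware table walk. Page 0 is treated as unmapped
 * so that the fault path can be exercised. */
bool toy_table_walk(uint64_t vpn, uint64_t *pfn)
{
    toy_walks++;
    if (vpn == 0)
        return false;          /* simulated page fault */
    *pfn = vpn + 0x100;        /* arbitrary made-up mapping */
    return true;
}

/* Returns true and sets *pa on success, false on a simulated fault. */
bool toy_translate(uint64_t va, uint16_t asid, uint16_t vmid, uint64_t *pa)
{
    uint64_t vpn = va >> TOY_PAGE_SHIFT;
    uint64_t off = va & ((1ULL << TOY_PAGE_SHIFT) - 1);

    for (unsigned i = 0; i < TOY_TLB_ENTRIES; i++) {
        if (toy_tlb[i].valid && toy_tlb[i].vpn == vpn &&
            toy_tlb[i].asid == asid && toy_tlb[i].vmid == vmid) {
            *pa = (toy_tlb[i].pfn << TOY_PAGE_SHIFT) | off;   /* TLB hit */
            return true;
        }
    }

    uint64_t pfn;
    if (!toy_table_walk(vpn, &pfn))
        return false;          /* fault: nothing is cached */

    /* TLB miss that walked successfully: cache the new translation. */
    toy_tlb[toy_next_victim] = (toy_tlb_entry){ true, vpn, pfn, asid, vmid };
    toy_next_victim = (toy_next_victim + 1) % TOY_TLB_ENTRIES;
    *pa = (pfn << TOY_PAGE_SHIFT) | off;
    return true;
}
```

In this model, a second access to the same page with the same ASID and VMID is served without another walk, while the same Virtual Address under a different ASID misses and walks again.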

If the OS modifies translation entries that may have been cached in the TLB, it is then the responsibility of the OS to invalidate these stale TLB entries.

A64 code uses the TLBI instruction to invalidate TLB entries. It has the following form:

  TLBI <type><level>{IS} {, <Xt>}

The following list gives some of the more common selections for the type field. A complete list is given in Table 12.1.

ALL

All TLB entries.

VMALL

All stage 1 TLB entries for the current guest OS.

VMALLS12

All stage 1 and stage 2 TLB entries for the current guest OS.

ASID

Entries that match ASID in Xt.

VA

Entry for Virtual Address and ASID specified in Xt.

VAA

Entries for Virtual Address specified in Xt, with any ASID.
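The difference between the ALL, ASID, VA, and VAA selections can be illustrated with a matching predicate over a modelled TLB. This is a simplified sketch: the model_entry layout and the function names are hypothetical, global (non-ASID-tagged) entries are ignored, and the VMALL variants are left out because they would need VMID and stage modelling.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one cached translation. */
typedef struct {
    bool     valid;
    uint64_t vpn;    /* virtual page number */
    uint16_t asid;   /* Address Space ID tag */
} model_entry;

typedef enum { INV_ALL, INV_ASID, INV_VA, INV_VAA } inv_type;

/* Does an invalidate of the given type select this entry? */
bool inv_selects(inv_type t, const model_entry *e, uint64_t vpn, uint16_t asid)
{
    switch (t) {
    case INV_ALL:  return true;                              /* every entry   */
    case INV_ASID: return e->asid == asid;                   /* ASID only     */
    case INV_VA:   return e->vpn == vpn && e->asid == asid;  /* VA and ASID   */
    case INV_VAA:  return e->vpn == vpn;                     /* VA, any ASID  */
    }
    return false;
}

/* Clears every selected valid entry; returns how many were invalidated. */
size_t inv_apply(inv_type t, model_entry *tlb, size_t n,
                 uint64_t vpn, uint16_t asid)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (tlb[i].valid && inv_selects(t, &tlb[i], vpn, asid)) {
            tlb[i].valid = false;
            count++;
        }
    }
    return count;
}
```

With two entries for the same page under different ASIDs, INV_VA removes only the one whose ASID matches, whereas INV_VAA removes both.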

Each Exception level, that is EL3, EL2, or EL1, has its own Virtual Address space that the operation applies to.

Note

See Context switching for information about ASIDs and Translation table configuration for more about the concept of shareability.

The <level> field simply specifies the Exception level Virtual Address space (can be 3, 2 or 1) that the operation should apply to.

The optional IS field specifies that the operation applies to the TLBs of all cores in the Inner Shareable domain, not only to the core that executes the instruction.
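To show how the type, level, and IS fields combine into a mnemonic, the following sketch formats the instruction text from its parts. The tlbi_format helper is hypothetical and purely illustrative: it does not check that the requested combination is a variant that actually exists in Table 12.1.

```c
#include <stdbool.h>
#include <stdio.h>

/* Formats "TLBI <type>E<level>{IS}{, <Xt>}" into buf.
 * xt may be NULL for operations that take no register operand.
 * Returns the snprintf result (the length that would be written). */
int tlbi_format(char *buf, size_t len, const char *type, int level,
                bool inner_shareable, const char *xt)
{
    return snprintf(buf, len, "TLBI %sE%d%s%s%s",
                    type, level,
                    inner_shareable ? "IS" : "",
                    xt ? ", " : "",
                    xt ? xt : "");
}
```

For example, the type VA at level 1 with IS set and the register X0 gives TLBI VAE1IS, X0, and the type ALL at level 3 with no register gives TLBI ALLE3.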

Table 12.1. TLB configuration instructions

TLB invalidate  Variant        Description
TLBI            ALLEn          TLB invalidate All, ELn.
                ALLEnIS        TLB invalidate All, ELn, Inner Shareable.
                ASIDE1         TLB invalidate by ASID, EL1.
                ASIDE1IS       TLB invalidate by ASID, EL1, Inner Shareable.
                IPAS2E1        TLB invalidate by IPA, Stage 2, EL1.
                IPAS2E1IS      TLB invalidate by IPA, Stage 2, EL1, Inner Shareable.
                IPAS2LE1IS     TLB invalidate by IPA, Stage 2, Last level, EL1, Inner Shareable.
                VAAE1          TLB invalidate by VA, All ASID, EL1.
                VAAE1IS        TLB invalidate by VA, All ASID, EL1, Inner Shareable.
                VAALE1IS       TLB invalidate for the Last level, by VA, All ASID, EL1, Inner Shareable.
                VAEn           TLB invalidate by VA, ELn.
                VAEnIS         TLB invalidate by VA, ELn, Inner Shareable.
                VALEn          TLB invalidate by VA, Last level, ELn.
                VALEnIS        TLB invalidate by VA, Last level, ELn, Inner Shareable.
                VMALLE1        TLB invalidate by VMID, All at stage 1, EL1.
                VMALLE1IS      TLB invalidate by VMID, EL1, Inner Shareable.
                VMALLS12E1     TLB invalidate by VMID, All at Stage 1 and 2, EL1.
                VMALLS12E1IS   TLB invalidate by VMID, All at Stage 1 and 2, EL1, Inner Shareable.

The following code example shows a sequence for writes to translation tables backed by inner shareable memory:

  << Writes to Translation Tables >>
  DSB ISHST            // ensure write has completed
  TLBI ALLE1           // invalidate all TLB entries
  DSB ISH              // ensure completion of TLB invalidation
  ISB                  // synchronize context and ensure that no instructions are
                       // fetched using the old translation 

See Barriers for more information about the DSB and ISB barrier instructions shown in the example.

For a change to a single entry, for example, use the instruction:

  TLBI VAE1, X0

which invalidates the entry associated with the Virtual Address and ASID specified in the register X0.
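The register operand for a by-VA invalidate is not the raw address. As a sketch, assuming the ARMv8-A register layout for these operations (ASID in bits [63:48], VA[55:12] in bits [43:0]), the value to place in X0 could be computed as follows. The helper name tlbi_va_operand is hypothetical.

```c
#include <stdint.h>

/* Builds the register operand for a by-VA, by-ASID invalidate such as
 * TLBI VAE1: ASID in bits [63:48], VA[55:12] in bits [43:0].
 * The remaining bits are left as zero. */
uint64_t tlbi_va_operand(uint64_t va, uint16_t asid)
{
    uint64_t page = (va >> 12) & ((1ULL << 44) - 1);   /* VA[55:12] */
    return ((uint64_t)asid << 48) | page;
}
```

The low 12 bits of the address are discarded because the invalidate works on a page of translation, not on an individual byte address.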

The TLB can hold a fixed number of entries. You can achieve best performance by minimizing the number of external memory accesses caused by translation table traversal and obtaining a high TLB hit rate. The ARMv8-A architecture provides a feature known as contiguous block entries to efficiently use TLB space. Translation table block entries each contain a contiguous bit. When set, this bit signals to the TLB that it can cache a single entry covering translations for multiple blocks. A lookup can index anywhere into an address range covered by a contiguous block. The TLB can therefore cache one entry for a defined range of addresses, making it possible to store a larger range of Virtual Addresses within the TLB than is otherwise possible.

To use a contiguous bit, the contiguous blocks must be adjacent, that is, they must correspond to a contiguous range of Virtual Addresses. They must start on an aligned boundary, have consistent attributes, and point to a contiguous output address range at the same level of translation. The required alignment is that VA[20:16] for a 4KB granule or VA[28:21] for a 64KB granule are the same for all addresses. The following numbers of contiguous blocks are required:

If these conditions are not met, a programming error occurs, which can cause TLB aborts or corrupted lookups. Possible examples of such an error include:

With the ARMv8 architecture, incorrect use of the contiguous bit cannot be exploited to escape permission checks outside the valid EL0 and EL1 address space, or to erroneously gain access to EL3 space.
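The conditions for setting the contiguous bit can be expressed as a check over a run of page entries. The sketch below is illustrative only. The page_desc type and function names are hypothetical, and the group sizes are inferred from the alignment bits quoted above: if VA[20:16] is shared with a 4KB granule, then VA[15:12] varies, giving 16 pages, and if VA[28:21] is shared with a 64KB granule, then VA[20:16] varies, giving 32 pages. Other granules and levels are not modelled.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t va;      /* input (virtual) address of the page */
    uint64_t pa;      /* output address of the page */
    uint64_t attrs;   /* opaque attribute bits; must match across the run */
} page_desc;

/* granule_shift: 12 for a 4KB granule, 16 for a 64KB granule.
 * Returns the contiguous group size in pages, or 0 if not modelled. */
size_t contig_pages(unsigned granule_shift)
{
    switch (granule_shift) {
    case 12: return 16;   /* 16 x 4KB  = 64KB, so VA[20:16] is shared */
    case 16: return 32;   /* 32 x 64KB = 2MB,  so VA[28:21] is shared */
    default: return 0;
    }
}

/* True if the n entries in p form one complete, valid contiguous group. */
bool can_set_contiguous(const page_desc *p, size_t n, unsigned granule_shift)
{
    size_t need = contig_pages(granule_shift);
    if (need == 0 || n != need)
        return false;

    uint64_t page = 1ULL << granule_shift;
    uint64_t span = page * need;
    if (p[0].va & (span - 1))
        return false;                        /* must start on an aligned boundary */

    for (size_t i = 1; i < n; i++) {
        if (p[i].va != p[0].va + i * page)   /* adjacent Virtual Addresses */
            return false;
        if (p[i].pa != p[0].pa + i * page)   /* contiguous output range */
            return false;
        if (p[i].attrs != p[0].attrs)        /* consistent attributes */
            return false;
    }
    return true;
}
```

An OS could run a check of this kind before setting the contiguous bit on a group of entries, and leave the bit clear whenever the check fails.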

Copyright © 2015 ARM. All rights reserved. ARM DEN0024A
Non-Confidential ID050815