C2.4 PMU events

The following table shows the events that are generated and the numbers that the PMU uses to reference the events. The table also shows the bit position of each event on the event bus. Event reference numbers that are not listed are reserved.
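As an illustration of how the event number and the event bus bit position relate, the following minimal sketch holds a few rows of the table in a lookup structure. The `PmuEvent` type, the `PMU_EVENTS` dictionary, and the `describe` helper are hypothetical names chosen for this example and are not part of any Arm-provided interface. Only the numeric values are taken from the table.

```python
from typing import NamedTuple, Optional


class PmuEvent(NamedTuple):
    mnemonic: str
    bus_bit: Optional[int]  # None where the table shows "-" for the event bus


# Excerpt of Table C2-3 only. Hypothetical container, not an Arm-defined structure.
PMU_EVENTS = {
    0x00: PmuEvent("SW_INCR", None),
    0x01: PmuEvent("L1I_CACHE_REFILL", 0),
    0x03: PmuEvent("L1D_CACHE_REFILL", 2),
    0x04: PmuEvent("L1D_CACHE", 3),
    0x11: PmuEvent("CPU_CYCLES", None),
    0x12: PmuEvent("BR_PRED", 16),
    0x34: PmuEvent("DTLB_WALK", 39),
    0x38: PmuEvent("REMOTE_ACCESS_RD", 38),
}


def describe(event_number: int) -> str:
    """Format one event from the excerpt for display."""
    event = PMU_EVENTS.get(event_number)
    if event is None:
        # The excerpt is incomplete, so absence here does not imply "reserved".
        return f"0x{event_number:02X}: not in this excerpt"
    if event.bus_bit is None:
        return f"0x{event_number:02X} {event.mnemonic}: no event bus bit"
    return f"0x{event_number:02X} {event.mnemonic}: event bus bit {event.bus_bit}"


print(describe(0x12))  # 0x12 BR_PRED: event bus bit 16
print(describe(0x11))  # 0x11 CPU_CYCLES: no event bus bit
```

The bus bit position is not a simple function of the event number (for example, 0x12 maps to bit 16 and 0x38 maps to bit 38), so a table lookup is used rather than arithmetic.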

Table C2-3 PMU events

Event number  PMU event bus (to trace)  Event mnemonic  Event name
0x00 - SW_INCR Instruction architecturally executed, condition code check pass, software increment.
0x01 [0] L1I_CACHE_REFILL

Level 1 instruction cache refill.

This event counts any instruction fetch which misses in the cache.

The following instructions are not counted:

  • Cache maintenance instructions.
  • Non-cacheable accesses.
0x02 [1] L1I_TLB_REFILL

Level 1 instruction TLB refill.

This event counts any refill of the instruction L1 TLB from the L2 TLB. This includes refills which result in a translation fault.

The following instructions are not counted:

  • TLB maintenance instructions.

This event counts regardless of whether the MMU is enabled.

0x03 [2] L1D_CACHE_REFILL

Level 1 data cache refill.

This event counts any load or store operation or pagewalk access which causes data to be read from outside the L1, including accesses which do not allocate into L1.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Stores of an entire cache line, even if they make a coherency request outside the L1.
  • Partial cache line writes which do not allocate into the L1 cache.
  • Non-cacheable accesses.

This event counts the sum of L1D_CACHE_REFILL_RD and L1D_CACHE_REFILL_WR.

0x04 [3] L1D_CACHE

Level 1 data cache access.

This event counts any load or store operation or pagewalk access which looks up in the L1 data cache. In particular, any access which could count the L1D_CACHE_REFILL event causes this event to count.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Non-cacheable accesses.

This event counts the sum of L1D_CACHE_RD and L1D_CACHE_WR.

0x05 [4] L1D_TLB_REFILL

Level 1 data TLB refill.

This event counts any refill of the data L1 TLB from the L2 TLB. This includes refills which result in a translation fault.

The following instructions are not counted:

  • TLB maintenance instructions.

This event counts regardless of whether the MMU is enabled.

0x06 [5] LD_RETIRED

Instruction architecturally executed, condition code check pass, load.

This event counts all load and prefetch instructions.

This includes the Arm®v8.1‑A atomic instructions, other than the ST* variants.

0x07 [6] ST_RETIRED

Instruction architecturally executed, condition code check pass, store.

This event counts all store instructions and DC ZVA.

This includes all the Armv8.1‑A atomic instructions.

The following instructions are not counted:

  • Store-Exclusive instructions which fail.
0x08 [7] INST_RETIRED

Instruction architecturally executed.

This event counts all retired instructions, including those that fail their condition check.

0x09 [8] EXC_TAKEN

Exception taken.

0x0A [9] EXC_RETURN

Instruction architecturally executed, condition code check pass, exception return.

0x0B [10] CID_WRITE_RETIRED

Instruction architecturally executed, condition code check pass, write to CONTEXTIDR.

This event only counts writes to CONTEXTIDR in AArch32, and via the CONTEXTIDR_EL1 mnemonic in AArch64.

The following instructions are not counted:

  • Writes to CONTEXTIDR_EL12 and CONTEXTIDR_EL2.
0x0C [11] PC_WRITE_RETIRED

Instruction architecturally executed, condition code check pass, software change of the PC.

This event counts all branches taken and popped from the branch monitor. This excludes exception entries, debug entries, and CCFAIL branches.

0x0D [12] BR_IMMED_RETIRED

Instruction architecturally executed, immediate branch.

This event counts all branches decoded as immediate branches, taken or not, and popped from the branch monitor. This excludes exception entries, debug entries, and CCFAIL branches.

0x0E [13] BR_RETURN_RETIRED

Instruction architecturally executed, condition code check pass, procedure return.

0x0F [14] UNALIGNED_LDST_RETIRED

Instruction architecturally executed, condition code check pass, unaligned load or store.

0x10 [15] BR_MIS_PRED Mispredicted or not predicted branch speculatively executed.

This event counts any predictable branch instruction which is mispredicted either due to dynamic misprediction or because the MMU is off and the branches are statically predicted not taken.

0x11 - CPU_CYCLES Cycle.
0x12 [16] BR_PRED

Predictable branch speculatively executed.

This event counts all predictable branches.

0x13 [17] MEM_ACCESS Data memory access.

This event counts memory accesses due to load or store instructions.

The following instructions are not counted:

  • Instruction fetches.
  • Cache maintenance instructions.
  • Translation table walks or prefetches.

This event counts the sum of MEM_ACCESS_RD and MEM_ACCESS_WR.

0x14 [18] L1I_CACHE

Level 1 instruction cache access.

This event counts any instruction fetch which accesses the L1 instruction cache.

The following instructions are not counted:

  • Cache maintenance instructions.
  • Non-cacheable accesses.
0x15 [19] L1D_CACHE_WB

Level 1 data cache Write-Back.

This event counts any write back of data from the L1 data cache to L2 or L3. This counts both victim line evictions and snoops, including cache maintenance operations.

The following instructions are not counted:

  • Invalidations which do not result in data being transferred out of the L1.
  • Full-line writes which write to L2 without writing L1, such as write-streaming mode.
0x16 [20] L2D_CACHE

Level 2 data cache access.

  • If the core is configured with a per-core L2 cache:

    This event counts any transaction from L1 which looks up in the L2 cache, and any write-back from the L1 to the L2. Snoops from outside the core and cache maintenance operations are not counted.

  • If the core is not configured with a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE.

  • If there is neither a per-core cache nor a cluster cache configured, then this event is not implemented.
0x17 [21] L2D_CACHE_REFILL

Level 2 data cache refill.

  • If the core is configured with a per-core L2 cache:

    This event counts any cacheable transaction from L1 which causes data to be read from outside the core. L2 refills caused by stashes into L2 should not be counted.

  • If the core is not configured with a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_REFILL.

  • If there is neither a per-core cache nor a cluster cache configured, then this event is not implemented.
0x18 [22] L2D_CACHE_WB

Level 2 data cache Write-Back.

  • If the core is configured with a per-core L2 cache:

    This event counts any write back of data from the L2 cache to outside the core. This includes snoops to the L2 which return data, regardless of whether they cause an invalidation. Invalidations from the L2 which do not write data outside of the core and snoops which return data from the L1 are not counted.

  • If the core is not configured with a per-core L2 cache, this event is not implemented.

0x19 [23] BUS_ACCESS

Bus access.

This event counts for every beat of data transferred over the data channels between the core and the SCU. If both read and write data beats are transferred on a given cycle, this event is counted twice on that cycle.

This event counts the sum of BUS_ACCESS_RD and BUS_ACCESS_WR.

0x1A [24] MEMORY_ERROR

Local memory error.

This event counts any correctable or uncorrectable memory error (ECC or parity) in the protected core RAMs.

0x1B - INT_SPEC

Operation speculatively executed.

This event duplicates INST_RETIRED.

0x1C [25] TTBR_WRITE_RETIRED Instruction architecturally executed, condition code check pass, write to TTBR.

This event only counts writes to TTBR0/TTBR1 in AArch32 and TTBR0_EL1/TTBR1_EL1 in AArch64.

The following instructions are not counted:

  • Accesses to TTBR0_EL12/TTBR1_EL12 or TTBR0_EL2/TTBR1_EL2.
0x1D - BUS_CYCLES Bus cycles.

This event duplicates CPU_CYCLES.

0x1E - CHAIN

Odd performance counter chain mode.

0x20 [26] L2D_CACHE_ALLOCATE

Level 2 data cache allocation without refill.

  • If the core is configured with a per-core L2 cache:

    This event counts any full cache line write into the L2 cache which does not cause a linefill, including write-backs from L1 to L2 and full-line writes which do not allocate into L1.

  • If the core is not configured with a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_ALLOCATE.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0x21 [27] BR_RETIRED

Instruction architecturally executed, branch.

This event counts all branches, taken or not, popped from the branch monitor. This excludes exception entries, debug entries, and CCFAIL branches. In the Cortex®-A55 core, an ISB is a branch, and even microarchitectural ISBs are counted.

0x22 [28] BR_MIS_PRED_RETIRED

Instruction architecturally executed, mispredicted branch.

This event counts any branch counted by BR_RETIRED which is not correctly predicted and causes a pipeline flush.

0x23 [29] STALL_FRONTEND

No operation issued because of the frontend.

The counter counts on any cycle when no operations are issued due to the instruction queue being empty.

0x24 [30] STALL_BACKEND

No operation issued because of the backend.

The counter counts on any cycle when no operations are issued due to a pipeline stall.

0x25 [31] L1D_TLB

Level 1 data TLB access.

This event counts any load or store operation which accesses the data L1 TLB. If both a load and a store are executed on a cycle, this event counts twice.

This event counts regardless of whether the MMU is enabled.
0x26 [32] L1I_TLB

Level 1 instruction TLB access.

This event counts any instruction fetch which accesses the instruction L1 TLB.

This event counts regardless of whether the MMU is enabled.
0x29 [33] L3D_CACHE_ALLOCATE

Attributable Level 3 unified cache allocation without refill.

  • If the core is configured with a per-core L2 cache and the cluster is configured with an L3 cache:

    This event counts any full cache line write into the L3 cache which does not cause a linefill, including write-backs from L2 to L3 and full-line writes which do not allocate into L2.

  • If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0x2A [34] L3D_CACHE_REFILL

Attributable Level 3 unified cache refill.

  • If the core is configured with a per-core L2 cache and the cluster is configured with an L3 cache:

    This event counts for any cacheable read transaction returning data from the SCU for which the data source was outside the cluster. Transactions such as ReadUnique are counted here as “read” transactions, even though they can be generated by store instructions.

  • If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0x2B [35] L3D_CACHE

Attributable Level 3 unified cache access.

  • If the core is configured with a per-core L2 cache and the cluster is configured with an L3 cache:

    This event counts for any cacheable read transaction returning data from the SCU, or for any cacheable write to the SCU.

  • If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0x2D [36] L2D_TLB_REFILL

Attributable Level 2 unified TLB refill.

This event counts on any refill of the L2 TLB, caused by either an instruction or data access.

This event does not count if the MMU is disabled.

0x2F [37] L2D_TLB

Attributable Level 2 unified TLB access.

This event counts on any access to the L2 TLB (caused by a refill of any of the L1 TLBs).

This event does not count if the MMU is disabled.

0x34 [39] DTLB_WALK

Access to data TLB that caused a page table walk.

This event counts on any data access which causes L2D_TLB_REFILL to count.

0x35 [40] ITLB_WALK

Access to instruction TLB that caused a page table walk.

This event counts on any instruction access which causes L2D_TLB_REFILL to count.

0x36 [41] LL_CACHE_RD

Last level cache access, read.

  • If CPUECTLR.EXTLLC is set:

    This event counts any cacheable read transaction which returns a data source of "interconnect cache".

  • If CPUECTLR.EXTLLC is not set:

    This event is a duplicate of the L*D_CACHE_RD event corresponding to the last level of cache implemented – L3D_CACHE_RD if both per-core L2 and cluster L3 are implemented, L2D_CACHE_RD if only one is implemented, or L1D_CACHE_RD if neither is implemented.

0x37 [42] LL_CACHE_MISS_RD

Last level cache miss, read.

  • If CPUECTLR.EXTLLC is set:

    This event counts any cacheable read transaction which returns a data source of "DRAM", "remote" or "inter-cluster peer".

  • If CPUECTLR.EXTLLC is not set:

    This event is a duplicate of the L*D_CACHE_REFILL_RD event corresponding to the last level of cache implemented – L3D_CACHE_REFILL_RD if both per-core L2 and cluster L3 are implemented, L2D_CACHE_REFILL_RD if only one is implemented, or L1D_CACHE_REFILL_RD if neither is implemented.

0x38 [38] REMOTE_ACCESS_RD Access to another socket in a multi-socket system, read.

This event counts any read transaction which returns a data source of "remote".

0x40 - L1D_CACHE_RD

Level 1 data cache access, read.

This event counts any load operation or pagewalk access which looks up in the L1 data cache. In particular, any access which could count the L1D_CACHE_REFILL_RD event causes this event to count.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Non-cacheable accesses.
0x41 - L1D_CACHE_WR

Level 1 data cache access, write.

This event counts any store operation which looks up in the L1 data cache. In particular, any access which could count the L1D_CACHE_REFILL event causes this event to count.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Non-cacheable accesses.
0x42 - L1D_CACHE_REFILL_RD

Level 1 data cache refill, read.

This event counts any load operation or pagewalk access which causes data to be read from outside the L1, including accesses which do not allocate into L1.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Non-cacheable accesses.
0x43 - L1D_CACHE_REFILL_WR

Level 1 data cache refill, write.

This event counts any store operation which causes data to be read from outside the L1, including accesses which do not allocate into L1.

The following instructions are not counted:

  • Cache maintenance instructions and prefetches.
  • Stores of an entire cache line, even if they make a coherency request outside the L1.
  • Partial cache line writes which do not allocate into the L1 cache.
  • Non-cacheable accesses.
0x44 - L1D_CACHE_REFILL_INNER Level 1 data cache refill, inner.

This event counts any L1 D-cache linefill (as counted by L1D_CACHE_REFILL) which hits in the L2 cache, L3 cache or another core in the cluster.

0x45 - L1D_CACHE_REFILL_OUTER Level 1 data cache refill, outer.

This event counts any L1 D-cache linefill (as counted by L1D_CACHE_REFILL) which does not hit in the L2 cache, L3 cache or another core in the cluster, and instead obtains data from outside the cluster.

0x50 - L2D_CACHE_RD

Level 2 cache access, read.

  • If the core is configured with a per-core L2 cache:

    This event counts any read transaction from L1 which looks up in the L2 cache. Snoops from outside the core are not counted.

  • If the core is configured without a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_RD.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0x51 - L2D_CACHE_WR

Level 2 cache access, write.

  • If the core is configured with a per-core L2 cache:

    This event counts any write transaction from L1 which looks up in the L2 cache or any write-back from L1 which allocates into the L2 cache. Snoops from outside the core are not counted.

  • If the core is configured without a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_WR.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0x52 - L2D_CACHE_REFILL_RD

Level 2 cache refill, read.

  • If the core is configured with a per-core L2 cache:

    This event counts any cacheable read transaction from L1 which causes data to be read from outside the core. L2 refills caused by stashes into L2 should not be counted. Transactions such as ReadUnique are counted here as “read” transactions, even though they can be generated by store instructions.

  • If the core is configured without a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_REFILL_RD.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0x53 - L2D_CACHE_REFILL_WR

Level 2 cache refill, write.

  • If the core is configured with a per-core L2 cache:

    This event counts any write transaction from L1 which causes data to be read from outside the core. L2 refills caused by stashes into L2 should not be counted. Transactions such as ReadUnique are not counted as write transactions.

  • If the core is configured without a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_REFILL_WR.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0x60 - BUS_ACCESS_RD

Bus access, read.

This event counts for every beat of data transferred over the read data channel between the core and the SCU.

0x61 - BUS_ACCESS_WR

Bus access, write.

This event counts for every beat of data transferred over the write data channel between the core and the SCU.

0x66 - MEM_ACCESS_RD Data memory access, read.

This event counts memory accesses due to load instructions.

The following instructions are not counted:

  • Instruction fetches.
  • Cache maintenance instructions.
  • Translation table walks.
  • Prefetches.
0x67 - MEM_ACCESS_WR Data memory access, write.

This event counts memory accesses due to store instructions.

The following instructions are not counted:

  • Instruction fetches.
  • Cache maintenance instructions.
  • Translation table walks.
  • Prefetches.
0x70 - LD_SPEC

Operation speculatively executed, load.

This event duplicates LD_RETIRED.

0x71 - ST_SPEC

Operation speculatively executed, store.

This event duplicates ST_RETIRED.

0x72 - LDST_SPEC

Operation speculatively executed, load or store.

This event counts the sum of LD_SPEC and ST_SPEC.

0x73 - DP_SPEC

Operation speculatively executed, integer data processing.

This event counts retired integer data-processing instructions.

0x74 - ASE_SPEC

Operation speculatively executed, Advanced SIMD instruction.

This event counts retired Advanced SIMD instructions.

0x75 - VFP_SPEC

Operation speculatively executed, floating-point instruction.

This event counts retired floating-point instructions.

0x76 - PC_WRITE_SPEC

Operation speculatively executed, software change of the PC.

This event counts retired branch instructions.

0x77 - CRYPTO_SPEC

Operation speculatively executed, Cryptographic instruction.

This event counts retired Cryptographic instructions.

0x78 - BR_IMMED_SPEC

Branch speculatively executed, immediate branch.

This event duplicates BR_IMMED_RETIRED.

0x79 - BR_RETURN_SPEC

Branch speculatively executed, procedure return.

This event duplicates BR_RETURN_RETIRED.

0x7A - BR_INDIRECT_SPEC Branch speculatively executed, indirect branch.

This event counts retired indirect branch instructions.

0x86 - EXC_IRQ

Exception taken, IRQ.

0x87 - EXC_FIQ Exception taken, FIQ.
0xA0 - L3D_CACHE_RD

Attributable Level 3 unified cache access, read.

This event counts for any cacheable read transaction returning data from the SCU.

If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0xA2 - L3D_CACHE_REFILL_RD Attributable Level 3 unified cache refill, read.

This event duplicates L3D_CACHE_REFILL.

If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0xC0 - L3D_CACHE_REFILL_PREFETCH

Level 3 cache refill due to prefetch.

This event counts any linefills from the hardware prefetcher which cause an allocation into the L3 cache.

Note:

It might not be possible to distinguish both hardware prefetches from software prefetches and which prefetches cause an allocation. If so, only hardware prefetches should be counted, regardless of whether they allocate.

If either the core is configured without a per-core L2 or the cluster is configured without an L3 cache, this event is not implemented.

0xC1 - L2D_CACHE_REFILL_PREFETCH

Level 2 cache refill due to prefetch.

  • If the core is configured with a per-core L2 cache:

    This event does not count.

  • If the core is configured without a per-core L2 cache:

    This event counts the cluster cache event, as defined by L3D_CACHE_REFILL_PREFETCH.

  • If there is neither a per-core cache nor a cluster cache configured, this event is not implemented.
0xC2 - L1D_CACHE_REFILL_PREFETCH

Level 1 data cache refill due to prefetch.

This event counts any linefills from the prefetcher which cause an allocation into the L1 D-cache.

0xC3 - L2D_WS_MODE

Level 2 cache write streaming mode.

This event counts for each cycle where the core is in write-streaming mode and not allocating writes into the L2 cache.

0xC4 - L1D_WS_MODE_ENTRY

Level 1 data cache entering write streaming mode.

This event counts for each entry into write-streaming mode.
0xC5 - L1D_WS_MODE

Level 1 data cache write streaming mode.

This event counts for each cycle where the core is in write-streaming mode and not allocating writes into the L1 D-cache.
0xC6 - PREDECODE_ERROR

Predecode error.

0xC7 - L3D_WS_MODE

Level 3 cache write streaming mode.

This event counts for each cycle where the core is in write-streaming mode and not allocating writes into the L3 cache.
0xC9 - BR_COND_PRED

Predicted conditional branch executed.

This event counts when any branch which can be predicted by the conditional predictor is retired. This event still counts when branch prediction is disabled due to the MMU being off.
0xCA - BR_INDIRECT_MIS_PRED

Indirect branch mis-predicted.

This event counts when any indirect branch which can be predicted by the BTAC is retired, and has mis-predicted for either the condition or the address. This event still counts when branch prediction is disabled due to the MMU being off.
0xCB - BR_INDIRECT_ADDR_MIS_PRED

Indirect branch mis-predicted due to address mis-compare.

This event counts when any indirect branch which can be predicted by the BTAC is retired, was taken and correctly predicted the condition, and has mis-predicted the address. This event still counts when branch prediction is disabled due to the MMU being off.
0xCC - BR_COND_MIS_PRED

Conditional branch mis-predicted.

This event counts when any branch which can be predicted by the conditional predictor is retired, and has mis-predicted the condition. This event still counts when branch prediction is disabled due to the MMU being off. Conditional indirect branches which correctly predicted the condition but mis-predicted on the address do not count this event.
0xCD - BR_INDIRECT_ADDR_PRED

Indirect branch with predicted address executed.

This event counts when any indirect branch which can be predicted by the BTAC is retired, was taken and correctly predicted the condition. This event still counts when branch prediction is disabled due to the MMU being off.
0xCE - BR_RETURN_ADDR_PRED

Procedure return with predicted address executed.

This event counts when any procedure return which can be predicted by the CRS is retired, was taken and correctly predicted the condition. This event still counts when branch prediction is disabled due to the MMU being off.
0xCF - BR_RETURN_ADDR_MIS_PRED

Procedure return mis-predicted due to address mis-compare.

This event counts when any procedure return which can be predicted by the CRS is retired, was taken and correctly predicted the condition, and has mis-predicted the address. This event still counts when branch prediction is disabled due to the MMU being off.
0xD0 - L2D_LLWALK_TLB

Level 2 TLB last-level walk cache access.

This event does not count if the MMU is disabled.
0xD1 - L2D_LLWALK_TLB_REFILL

Level 2 TLB last-level walk cache refill.

This event does not count if the MMU is disabled.
0xD2 - L2D_L2WALK_TLB

Level 2 TLB level-2 walk cache access.

This event counts accesses to the level-2 walk cache where the last-level walk cache has missed. The event only counts when the translation regime of the pagewalk uses level 2 descriptors. This event does not count if the MMU is disabled.
0xD3 - L2D_L2WALK_TLB_REFILL

Level 2 TLB level-2 walk cache refill.

This event does not count if the MMU is disabled.
0xD4 - L2D_S2_TLB Level 2 TLB IPA cache access.

This event counts on each access to the IPA cache.

  • If a single pagewalk needs to make multiple accesses to the IPA cache, each access is counted.
  • If stage 2 translation is disabled, this event does not count.
0xD5 - L2D_S2_TLB_REFILL Level 2 TLB IPA cache refill.

This event counts on each refill of the IPA cache.

  • If a single pagewalk needs to make multiple accesses to the IPA cache, each access which causes a refill is counted.
  • If stage 2 translation is disabled, this event does not count.
0xD6 - L2D_CACHE_STASH_DROPPED Level 2 cache stash dropped.

This event counts each stash request received from the interconnect or ACP that targets L2 and is dropped because of a lack of buffer space to hold the request.

0xE1 - STALL_FRONTEND_CACHE No operation issued due to the frontend, cache miss.

This event counts every cycle the DPU IQ is empty and there is an instruction cache miss being processed.

0xE2 - STALL_FRONTEND_TLB No operation issued due to the frontend, TLB miss.

This event counts every cycle the DPU IQ is empty and there is an instruction L1 TLB miss being processed.

0xE3 - STALL_FRONTEND_PDERR No operation issued due to the frontend, pre-decode error.

This event counts every cycle the DPU IQ is empty and there is a pre-decode error being processed.

0xE4 - STALL_BACKEND_ILOCK No operation issued due to the backend interlock.

This event counts every cycle that issue is stalled and there is an interlock. Stall cycles due to a stall in the Wr stage (typically awaiting load data) are excluded.

0xE5 - STALL_BACKEND_ILOCK_AGU No operation issued due to the backend, interlock, AGU.

This event counts every cycle that issue is stalled and there is an interlock that is due to a load/store instruction waiting for data to calculate the address in the AGU. Stall cycles due to a stall in the Wr stage (typically awaiting load data) are excluded.

0xE6 - STALL_BACKEND_ILOCK_FPU No operation issued due to the backend, interlock, FPU.

This event counts every cycle that issue is stalled and there is an interlock that is due to an FPU/NEON instruction. Stall cycles due to a stall in the Wr stage (typically awaiting load data) are excluded.

0xE7 - STALL_BACKEND_LD No operation issued due to the backend, load.

This event counts every cycle there is a stall in the Wr stage due to a load.

0xE8 - STALL_BACKEND_ST No operation issued due to the backend, store.

This event counts every cycle there is a stall in the Wr stage due to a store.

0xE9 - STALL_BACKEND_LD_CACHE No operation issued due to the backend, load, cache miss.

This event counts every cycle there is a stall in the Wr stage due to a load which is waiting on data (due to missing the cache or being non-cacheable).

0xEA - STALL_BACKEND_LD_TLB No operation issued due to the backend, load, TLB miss.

This event counts every cycle there is a stall in the Wr stage due to a load which has missed in the L1 TLB.

0xEB - STALL_BACKEND_ST_STB No operation issued due to the backend, store, STB full.

This event counts every cycle there is a stall in the Wr stage due to a store which is waiting due to the STB being full.

0xEC - STALL_BACKEND_ST_TLB No operation issued due to the backend, store, TLB miss.

This event counts every cycle there is a stall in the Wr stage due to a store which has missed in the L1 TLB.
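Several rows in the table define an event as the sum of two other events (for example, L1D_CACHE_REFILL is the sum of L1D_CACHE_REFILL_RD and L1D_CACHE_REFILL_WR). The following sketch shows one way such relationships could be used as a sanity check on a set of counter readings. The `SUM_IDENTITIES` table and the `mismatched_sums` helper are hypothetical names, and the sample counts are made-up values for illustration, not measurements. It also assumes all counters were captured over exactly the same interval, which real sampling might not guarantee.

```python
# Sum relationships stated in the event table: total -> (read part, write part).
SUM_IDENTITIES = {
    "L1D_CACHE_REFILL": ("L1D_CACHE_REFILL_RD", "L1D_CACHE_REFILL_WR"),
    "L1D_CACHE": ("L1D_CACHE_RD", "L1D_CACHE_WR"),
    "MEM_ACCESS": ("MEM_ACCESS_RD", "MEM_ACCESS_WR"),
    "BUS_ACCESS": ("BUS_ACCESS_RD", "BUS_ACCESS_WR"),
    "LDST_SPEC": ("LD_SPEC", "ST_SPEC"),
}


def mismatched_sums(counts):
    """Return the totals whose value differs from the sum of their parts.

    Identities with any missing counter are skipped rather than reported.
    """
    bad = []
    for total, parts in SUM_IDENTITIES.items():
        needed = (total,) + parts
        if not all(name in counts for name in needed):
            continue
        if counts[total] != sum(counts[part] for part in parts):
            bad.append(total)
    return bad


# Illustrative values only.
sample = {
    "L1D_CACHE": 1000, "L1D_CACHE_RD": 700, "L1D_CACHE_WR": 300,
    "MEM_ACCESS": 50, "MEM_ACCESS_RD": 30, "MEM_ACCESS_WR": 10,
}
print(mismatched_sums(sample))  # ['MEM_ACCESS']
```

A mismatch in practice is more likely to indicate that the counters were not sampled together than a hardware fault, so a real tool might allow a tolerance instead of requiring exact equality.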

L2 and L3 cache events (L2D_CACHE*, L3D_CACHE*)

The behavior of these events depends on the configuration of the core.

If the private L2 cache is present, the L2D_CACHE* events count the activity in the private L2 cache, and the L3D_CACHE* events count the activity in the DSU L3 cache (if present).

If the private L2 cache is not present but the DSU L3 cache is present, the L2D_CACHE* events count activity in the DSU L3 cache and the L3D_CACHE* events do not count. The L2D_CACHE_WB, L2D_CACHE_WR and L2D_CACHE_REFILL_WR events do not count in this configuration.

If neither the private L2 cache nor the DSU L3 cache is present, neither the L2D_CACHE* nor the L3D_CACHE* events count.
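The three configurations above can be sketched as a small decision function. This is only an illustration of the summary paragraphs in this section: the function name `cache_counted_by` and its parameters are hypothetical, and the individual rows of the event table take precedence where they give more specific behavior (for example, the row for L2D_CACHE_REFILL_PREFETCH), which this sketch does not model.

```python
from typing import Optional

# Events that the summary above says do not count when only the DSU L3 is present.
NOT_COUNTED_WITHOUT_PRIVATE_L2 = {
    "L2D_CACHE_WB",
    "L2D_CACHE_WR",
    "L2D_CACHE_REFILL_WR",
}


def cache_counted_by(mnemonic: str, has_private_l2: bool,
                     has_dsu_l3: bool) -> Optional[str]:
    """Return the cache whose activity the event observes, or None if it does not count."""
    if mnemonic.startswith("L2D_CACHE"):
        if has_private_l2:
            return "private L2"
        if has_dsu_l3 and mnemonic not in NOT_COUNTED_WITHOUT_PRIVATE_L2:
            return "DSU L3"
        return None
    if mnemonic.startswith("L3D_CACHE"):
        # L3D_CACHE* events only count when both cache levels are present.
        if has_private_l2 and has_dsu_l3:
            return "DSU L3"
        return None
    raise ValueError(f"{mnemonic} is not an L2D_CACHE* or L3D_CACHE* event")


print(cache_counted_by("L2D_CACHE_REFILL", False, True))  # DSU L3
print(cache_counted_by("L3D_CACHE_RD", False, True))      # None
```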

Last Level cache events (LL_CACHE_*)

The behavior of these events depends on the configuration of the core and the value of the CPUECTLR.EXTLLC/CPUECTLR_EL1.EXTLLC bit.

  • If the EXTLLC bit is 0:

    These events count activity in the last level of data cache implemented in the core. This is the DSU L3 cache if it is present, otherwise the private L2 cache if it is present, otherwise the L1 data cache.

  • If the EXTLLC bit is 1:

    These events count activity in a last level cache outside the core (if present). These events may not count in all implementations.
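The selection described above can be written as a short priority chain. The following is a minimal sketch under the assumption that the EXTLLC bit value and the cache configuration are already known to the caller. The function name `ll_cache_target` and its parameters are hypothetical and do not correspond to any Arm-provided interface.

```python
def ll_cache_target(extllc: int, has_private_l2: bool, has_dsu_l3: bool) -> str:
    """Return which cache the LL_CACHE_* events observe for a given configuration."""
    if extllc == 1:
        # Counting depends on the system; the events may not count at all.
        return "last level cache outside the core (if present)"
    if has_dsu_l3:
        return "DSU L3 cache"
    if has_private_l2:
        return "private L2 cache"
    return "L1 data cache"


print(ll_cache_target(0, has_private_l2=True, has_dsu_l3=False))  # private L2 cache
```

When EXTLLC is 0, the DSU L3 cache is checked first because it is the outermost level inside the cluster. The private L2 cache and the L1 data cache are only selected when the levels beyond them are absent.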
Non-Confidential 100442_0200_00_en
Copyright © 2016–2018 Arm Limited or its affiliates. All rights reserved.