2.4.4 Performance Monitoring Unit

The PMU events and counters indicate the runtime performance of the CCI-550.

The CCI-550 includes logic to gather statistics on the operation of the interconnect at runtime, using events and counters. These events provide information about the behavior of the interconnect that you can use when debugging or profiling traffic.
The PMU provides eight counters. Each counter can count any of the events available in the CCI-550. To keep the PMU logic overhead to a minimum, the absolute count and timing of events might vary slightly. This variation has a negligible effect except when the counters are enabled for a very short time.
The PMU consists of:
The PMU obeys the following rules:
This section describes:

PMU event list

The CCI-550 can generate a wide range of events, attributed to a specific interface or globally where they apply to central functions. A list of these events and interface identifiers enables you to identify and then program the events and source locations you want to monitor.

To program the CCI-550, use the code column in the relevant table to identify the value to program into each register field. If you monitor events using the EVNTBUS, use the EVNTBUS offset column to identify the bit position of each event.
Each event has a 9-bit configuration identifier comprising a source identifier and an event code concatenated {identifier,code}. The source identifier is a 4-bit code that indicates the interface that generated the 5-bit event code.
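As a minimal sketch of the {identifier,code} concatenation described above, the following Python helpers pack a 4-bit source identifier and a 5-bit event code into the 9-bit configuration identifier, and split it again. The function names `event_id` and `split_event_id` are hypothetical and are not part of any ARM software interface.

```python
from typing import Tuple


def event_id(source: int, code: int) -> int:
    """Pack a 4-bit source identifier and a 5-bit event code into 9 bits."""
    if not 0 <= source <= 0xF:
        raise ValueError("source identifier must fit in 4 bits")
    if not 0 <= code <= 0x1F:
        raise ValueError("event code must fit in 5 bits")
    # The source identifier occupies bits[8:5]; the event code occupies bits[4:0].
    return (source << 5) | code


def split_event_id(value: int) -> Tuple[int, int]:
    """Recover (source, code) from a 9-bit configuration identifier."""
    if not 0 <= value <= 0x1FF:
        raise ValueError("configuration identifier must fit in 9 bits")
    return value >> 5, value & 0x1F
```

For example, `event_id(0x3, 0x09)` evaluates to 0x69, which selects event code 0x09 on slave interface 3.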
The following table shows the possible 4-bit source identifiers.

Table 2-2 Event source identifiers

Code[8:5]  Source
0x0        Slave interface 0, SI0
0x1        Slave interface 1, SI1
0x2        Slave interface 2, SI2
0x3        Slave interface 3, SI3
0x4        Slave interface 4, SI4
0x5        Slave interface 5, SI5
0x6        Slave interface 6, SI6
0x7        Reserved
0x8        Master interface 0, MI0
0x9        Master interface 1, MI1
0xA        Master interface 2, MI2
0xB        Master interface 3, MI3
0xC        Master interface 4, MI4
0xD        Master interface 5, MI5
0xE        Master interface 6, MI6
0xF        Global

Note

Because the CCI-550 is a configurable product, not all interfaces might be present, but the source encodings remain the same. If you select an interface that is not present in the specific implementation, then no events are generated.
The following tables show the 5-bit event codes for slave interfaces, master interfaces, and global events.

Table 2-3 Slave interface event codes

Code[4:0]  EVNTBUS offset  Secure exempt  ACE only  Slave event
0x00       0               -              -         Read request handshake, where both ARVALID and ARREADY are HIGH.
0x01       1               -              -         Read request handshake: Device.
0x02       2               -              -         Read request handshake: Normal, Non-shareable.
0x03       3               -              -         Read request handshake: Normal, Shareable, non-allocating. This applies to ReadOnce transactions.
0x04       4               -              Y         Read request handshake: Normal, Shareable allocating. This applies to ReadClean, ReadShared, ReadNotSharedDirty, and ReadUnique transactions.
0x05       5               -              Y         Read request handshake: invalidation. This applies to MakeUnique and CleanUnique transactions.
0x06       6               -              -         Read request handshake: cache maintenance operation. This applies to CleanInvalid, MakeInvalid, and CleanShared transactions.
0x07       7               -              -         Read request handshake: DVM. This applies to DVM Message and DVM Complete transactions.
0x08       8               Y              -         Read data handshake, where both RVALID and RREADY are HIGH.
0x09       9               Y              -         Read data handshake with RLAST HIGH, for a snoop hit.
0x0A       10              -              -         Write request handshake, where both AWVALID and AWREADY are HIGH.
0x0B       11              -              -         Write request handshake: Device.
0x0C       12              -              -         Write request handshake: Non-shareable.
0x0D       13              -              Y         Write request handshake: Shareable. This applies to WriteBack and WriteClean transactions.
0x0E       14              -              -         Write request handshake: Shareable. This applies to WriteLineUnique transactions.
0x0F       15              -              -         Write request handshake: Shareable. This applies to WriteUnique transactions.
0x10       16              -              Y         Write request handshake. This applies to Evict transactions.
0x11       17              -              Y         Write request handshake. Note: This applies to WriteEvict transactions. However, because WriteEvict is not supported in the CCI-550, this event does not fire.
0x12       18              Y              -         Write data handshake, where both WVALID and WREADY are HIGH.
0x13       19              -              -         Snoop request handshake, where both ACVALID and ACREADY are HIGH.
0x14       20              -              Y         Snoop request handshake: read. This applies to ReadOnce, ReadClean, ReadNotSharedDirty, ReadShared, and ReadUnique transactions.
0x15       21              -              Y         Snoop request handshake: clean or invalidate. This applies to MakeInvalid, CleanInvalid, and CleanShared transactions.
0x16       22              Y              -         Snoop response handshake: Data Transfer bit, indicated by CRRESP[0] LOW.
0x17       23              -              -         Read request stall, where ARVALID is HIGH and ARREADY is LOW.
0x18       24              Y              -         Read data stall, where RVALID is HIGH and RREADY is LOW.
0x19       25              -              -         Write request stall, where AWVALID is HIGH and AWREADY is LOW.
0x1A       26              Y              -         Write data stall, where WVALID is HIGH and WREADY is LOW.
0x1B       27              Y              -         Write response stall, where BVALID is HIGH and BREADY is LOW.
0x1C       28              -              -         Snoop request stall, where ACVALID is HIGH and ACREADY is LOW.
0x1D       29              Y              Y         Snoop data stall, where CDVALID is HIGH and CDREADY is LOW.
0x1E       30              -              -         Request stall cycle because of OT transaction limit.
0x1F       31              -              -         Read stall because of arbitration.
The following table shows the event codes for master interfaces.

Table 2-4 Master interface event codes

Code[4:0]  EVNTBUS offset  Secure exempt  Master event
0x00       0               Y              Read data handshake.
0x01       1               Y              Write data handshake.
0x02       2               -              Read request stall, where ARVALID is HIGH and ARREADY is LOW.
0x03       3               Y              Read data stall, where RVALID is HIGH and RREADY is LOW.
0x04       4               -              Write request stall, where AWVALID is HIGH and AWREADY is LOW.
0x05       5               Y              Write data stall, where WVALID is HIGH and WREADY is LOW.
0x06       6               Y              Write response stall, where BVALID is HIGH and BREADY is LOW.
The following table shows the event codes for global events.

Table 2-5 Event codes for global events

Code[4:0]  EVNTBUS offset  Secure exempt  Global event
0x00       0               -              Access to snoop filter bank 0 or 1, any response.
0x01       1               -              Access to snoop filter bank 2 or 3, any response.
0x02       2               -              Access to snoop filter bank 4 or 5, any response.
0x03       3               -              Access to snoop filter bank 6 or 7, any response.
0x04       4               -              Access to snoop filter bank 0 or 1, miss response.
0x05       5               -              Access to snoop filter bank 2 or 3, miss response.
0x06       6               -              Access to snoop filter bank 4 or 5, miss response.
0x07       7               -              Access to snoop filter bank 6 or 7, miss response.
0x08       8               -              Back-invalidation from snoop filter.
0x09       9               Y              Requests that allocate into a snoop filter bank might be stalled because all ways are used. The snoop filter RAM might be too small.
0x0A       10              -              Stall because TT full. Increase TT_DEPTH parameter to avoid performance degradation.
0x0B       11              -              CCI-generated write request.
0x0C       12              Y              CD handshake in snoop network. Use this to measure snoop data bandwidth. Each event corresponds to 16 bytes of snoop data if the CCI-550 is configured with a single-layer snoop data network, or to either 16 bytes or 32 bytes of snoop data if it is configured with a dual-layer snoop data network.
0x0D       13              -              Request stall because of address hazard.
0x0E       14              Y              Snoop request stall because of snoop TT being full.
0x0F       15              Y              Snoop request type override for TZMP1 protection.
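
The CD handshake event, global event code 0x0C, can be turned into a snoop data bandwidth estimate. The following Python sketch shows the arithmetic under stated assumptions: the function name `snoop_data_bandwidth` is hypothetical, and the elapsed time must come from a timer outside the CCI-550. Because a dual-layer configuration transfers either 16 or 32 bytes per event, the sketch returns a lower and an upper bound rather than a single figure.

```python
from typing import Tuple


def snoop_data_bandwidth(cd_handshakes: int, elapsed_s: float,
                         dual_layer: bool = False) -> Tuple[float, float]:
    """Return (minimum, maximum) snoop data bandwidth in bytes per second."""
    if cd_handshakes < 0:
        raise ValueError("event count cannot be negative")
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    # Every CD handshake event carries at least 16 bytes of snoop data.
    min_bytes = cd_handshakes * 16
    # With a dual-layer snoop data network an event can carry 16 or 32 bytes,
    # so 32 bytes per event is the upper bound.
    max_bytes = cd_handshakes * (32 if dual_layer else 16)
    return min_bytes / elapsed_s, max_bytes / elapsed_s
```

For a single-layer configuration both bounds are equal. For a dual-layer configuration the true value lies somewhere between them.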

Event bus

The CCI-550 exports a vector of event signals providing information from the Performance Monitor Unit (PMU) using the EVNTBUS signal. The width of this bus varies depending on the number of master and slave interfaces in your CCI-550 implementation.

The EVNTBUS output is a concatenation of all events, that is, the global events and the events on each Master Interface (MI) and Slave Interface (SI). The global events always occupy the least significant bits, [15:0], irrespective of the number of interfaces. Table 2-5 Event codes for global events lists the bit offsets in the EVNTBUS output.
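
As an illustration of the fixed position of the global events, the following Python sketch extracts the global event bits from one captured EVNTBUS value. It assumes that the sample is available as an integer in which bit 0 corresponds to EVNTBUS[0]. How the bus is captured is outside the scope of this sketch, and the name `active_global_events` is hypothetical. The bit positions of the master and slave interface events depend on the implementation, so the sketch masks them out.

```python
from typing import List

GLOBAL_EVENT_BITS = 16  # global events occupy EVNTBUS[15:0]


def active_global_events(evntbus_sample: int) -> List[int]:
    """Return the EVNTBUS offsets of the global events set in the sample."""
    if evntbus_sample < 0:
        raise ValueError("sample must be a non-negative integer")
    # Keep only bits [15:0]; higher bits belong to interface events.
    global_bits = evntbus_sample & ((1 << GLOBAL_EVENT_BITS) - 1)
    return [offset for offset in range(GLOBAL_EVENT_BITS)
            if (global_bits >> offset) & 1]
```

For the global events, the EVNTBUS offset equals the event code, so a returned offset of 12 corresponds to event code 0x0C.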

Note

By default, only events for Non-secure transactions are recorded. However, if the SPNIDEN input signal is HIGH, or if both DBGEN and SPIDEN inputs are HIGH, then the CCI-550 counts and exports both Secure and Non-secure events. Events marked in the tables as Secure exempt do not have a security classification, so they are counted and exported in either case.
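
The rule in this note can be written as a small predicate. The following Python sketch is only a restatement of the note, not a model of the hardware, and the function name `event_is_counted` and its parameter names are hypothetical.

```python
def event_is_counted(secure_transaction: bool, secure_exempt: bool,
                     spniden: bool, dbgen: bool = False,
                     spiden: bool = False) -> bool:
    """Return True if the note above says the event is counted and exported."""
    if secure_exempt:
        # Secure exempt events have no security classification.
        return True
    if not secure_transaction:
        # Events for Non-secure transactions are recorded by default.
        return True
    # Events for Secure transactions need SPNIDEN HIGH,
    # or both DBGEN and SPIDEN HIGH.
    return spniden or (dbgen and spiden)
```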

PMU registers

The CCI-550 contains the following performance-related registers:

Using the PMU

You can run performance monitoring tests to check the performance of the CCI-550.

For each performance monitoring test that you run, you can:
  • Select a maximum of eight events to monitor during the test.
  • Read the value of each event counter at the end of the test.
  • Detect counter overflows.

    Note

    The CCI-550 PMU does not include a clock counter because the clock can be disabled to save power. To make time-related measurements, you must use another system timer, for example, the clock counter in the processor PMU.
Use the following registers to set up your test, and to monitor each event:
  • Event Select Register to select the event.
  • Event Counter Control Register to enable or disable the event counter.
  • Event Count Register to indicate how many events occur.

    Note

    The event counters are clock gated when not enabled. You must enable the event counters before writing values to an Event Count Register.
  • Event Overflow Flag Status Register to detect the event counter overflow.

Example of how to use the PMU

The following example runs a test scenario that shows how to use the PMU to measure the snoop hit rate for shareable read requests from one ACE master and one ACE-Lite master.

In this example, it is assumed that the ACE master is connected to slave interface 3 and the ACE-Lite master is connected to slave interface 2.

Procedure

  1. Set up the performance counters as follows:
    1. Program the Event Select Registers as follows:
      • Program the counter 0 register to count shareable, non-allocating read requests through slave interface 3:
        • Program bits[8:5] to 0x3 to select slave interface 3.
        • Program bits[4:0] to 0x03 to select the event for Read request handshake: normal, shareable, non-allocating.
      • Program the counter 1 register to count shareable, allocating read requests through slave interface 3:
        • Program bits[8:5] to 0x3 to select slave interface 3.
        • Program bits[4:0] to 0x04 to select the event for Read request handshake: normal, shareable, allocating.
      • Program the counter 2 register to count slave interface 3 snoop hits:
        • Program bits[8:5] to 0x3.
        • Program bits[4:0] to 0x09.
      • Program the counter 3 register to count shareable non-allocating read requests through slave interface 2:
        • Program bits[8:5] to 0x2.
        • Program bits[4:0] to 0x03.
      • Program the counter 4 register to count slave interface 2 snoop hits:
        • Program bits[8:5] to 0x2.
        • Program bits[4:0] to 0x09.
    2. Enable the event counters by programming the Event Counter Control Registers as follows:
      • Set counter 0 register bit[0] to 1.
      • Set counter 1 register bit[0] to 1.
      • Set counter 2 register bit[0] to 1.
      • Set counter 3 register bit[0] to 1.
      • Set counter 4 register bit[0] to 1.
  2. Ensure that the NIDEN and SPNIDEN inputs are HIGH.
  3. Program the following bits in the Performance Monitor Control Register (PMCR):
    • Program bit[1] to 1 to reset event counters.
    • Program bit[0] to 1 to enable all counters.
  4. Permit the test to run for an appropriate amount of time.
  5. Program the PMCR bit[0] to 0 to disable all counters and stop the test.
  6. Read the results of the test from the event counters:
    • Counter 0 and 1 hold the number of shareable reads for slave interface 3.
    • Counter 2 holds the number of snoop hits for slave interface 3.
    • Counter 3 holds the number of shareable, non-allocating reads for slave interface 2.
    • Counter 4 holds the number of snoop hits for slave interface 2.
  7. Check the overflow bits of all counters and adjust your results accordingly.
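
The procedure above can be sketched in Python against an in-memory stand-in for the PMU. Everything named here is an assumption made for illustration: `MockPmu` and its attributes are hypothetical and do not correspond to real register offsets or to any ARM driver. Defining the hit rate as snoop hits divided by shareable reads is one reasonable reading of the example, not a definition taken from this manual. The sketch does not model the overflow check in step 7.

```python
from typing import List, Optional, Tuple


class MockPmu:
    """In-memory stand-in for the PMU registers; not a real driver."""

    def __init__(self, num_counters: int = 8) -> None:
        self.event_select = [0] * num_counters    # Event Select Registers
        self.counter_enable = [0] * num_counters  # Event Counter Control bit[0]
        self.pmcr = 0                             # Performance Monitor Control


def configure(pmu: MockPmu) -> None:
    """Apply steps 1 and 3 of the procedure to the mock."""
    selections = [
        (0x3, 0x03),  # counter 0: SI3 shareable, non-allocating reads
        (0x3, 0x04),  # counter 1: SI3 shareable, allocating reads
        (0x3, 0x09),  # counter 2: SI3 snoop hits
        (0x2, 0x03),  # counter 3: SI2 shareable, non-allocating reads
        (0x2, 0x09),  # counter 4: SI2 snoop hits
    ]
    for index, (source, code) in enumerate(selections):
        # Source identifier in bits[8:5], event code in bits[4:0].
        pmu.event_select[index] = (source << 5) | code
        pmu.counter_enable[index] = 1
    pmu.pmcr = 0b11  # bit[1] resets the counters, bit[0] enables them


def snoop_hit_rates(counts: List[int]) -> Tuple[Optional[float], Optional[float]]:
    """Return (SI3 hit rate, SI2 hit rate); None where no reads were seen."""
    si3_reads = counts[0] + counts[1]
    si2_reads = counts[3]
    si3_rate = counts[2] / si3_reads if si3_reads else None
    si2_rate = counts[4] / si2_reads if si2_reads else None
    return si3_rate, si2_rate
```

With counts of 600, 400, 250, 200, and 50 in counters 0 to 4, both ratios evaluate to 0.25.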
Non-Confidential ARM 100282_0100_00_en
Copyright © 2015, 2016 ARM. All rights reserved.