3.10.33 MMU_600

This model is written in C++.

MMU_600 contains the following CADI targets:

  • MMU_600

MMU_600 contains the following MTI components:

MMU_600 - about

MMU_600 is an SMMUv3-compliant device and is used for I/O virtualization of devices.

The hardware is a distributed SMMU. It consists of the following:

  • A single Translation Control Unit (TCU), which has a port for the programming interface of the SMMU, receives DVM messages, and performs operations such as page table walks and queue manipulation.
  • One or more Translation Bus Units (TBUs), which translate transactions from upstream client devices into downstream transactions.
  • Zero or more connections to PCIe Root Complexes (PCIe-RCs). A maximum of 62 TBUs and PCIe-RCs can be attached.
  • An interconnect that connects the TBUs and PCIe-RCs to the TCU.

A PCIe-RC has one connection to the TCU to make ATS requests but the PCIe-RC uses one or more TBUs to transform the transactions and pass them to the memory system. In the model, the TBUs used are listed in the parameter list_of_pcie_mode. The SMMU does not know which TBUs a particular PCIe-RC is attached to.

The TBU ORs a value into each StreamID that it receives. In the model, this is configured by the following parameters:

  • list_of_s_sid_high_at_bitpos0
  • list_of_ns_sid_high_at_bitpos0
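The OR step can be sketched as follows. This is a minimal illustration, not the model's implementation. The helper names parse_sid_high_list and final_stream_id are hypothetical, and the sketch assumes that each list entry is an integer literal indexed by port number and that a port without an entry contributes no extra bits.

```python
def parse_sid_high_list(text):
    """Parse a comma-separated list of integer literals (assumed syntax)."""
    return [int(item.strip(), 0) for item in text.split(",") if item.strip()]


def final_stream_id(stream_id, port_index, sid_high_list):
    """OR the per-port value into the incoming StreamID, bit 0 aligned with bit 0."""
    # Assumption: a port with no entry in the list contributes no extra bits.
    high = sid_high_list[port_index] if port_index < len(sid_high_list) else 0
    # StreamIDs are 32 bits wide.
    return (stream_id | high) & 0xFFFFFFFF


ns_high = parse_sid_high_list("0x0000, 0x1000, 0x2000")
print(hex(final_stream_id(0x12, 1, ns_high)))
```

In this sketch, the device on port 1 presents StreamID 0x12 and the value 0x1000 is ORed in for that port.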

The TCU, TBU, and the interconnect are all represented by this single model component.

In the model, the tbs_pvbus_s[i] and tbm_pvbus_m[i] port pair represents TBU i, or tbs_pvbus_s[i] represents an incoming connection for a PCIe-RC. The corresponding reverse connection from the TCU to the PCIe-RC is a special bus called pvbus_id_routed_m, which is used to transport ATC Invalidates to the PCIe-RC.

To reduce system construction complexity, the tbs_pvbus_s[i] and tbm_pvbus_m[i] pair also acts as a TBU so that the PCIe-RC does not need to separate its normal transactions and its ATS requests.

However, ATC Invalidates are only sent to a port that appears in list_of_pcie_rc. ATC Invalidates must be routed to the correct PCIe subsystem to invalidate the cache of ATS Responses in that subsystem, so the StreamID must decode uniquely to a single port based on list_of_ns_sid_high_at_bitpos0. Therefore, all TBUs that a PCIe-RC uses must have a unique reverse mapping from StreamID to port.
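The reverse mapping from StreamID to port can be illustrated with a small sketch. It is an assumption-laden example rather than the model's decode logic. The function route_atc_invalidate and the low_bits argument (the number of StreamID bits that the upstream device supplies itself) are hypothetical. The source only states that the mapping must be unique.

```python
def route_atc_invalidate(stream_id, pcie_rc_ports, ns_sid_high, low_bits):
    """Return the single port in pcie_rc_ports whose high bits match stream_id."""
    top = stream_id >> low_bits
    matches = [p for p in pcie_rc_ports if (ns_sid_high[p] >> low_bits) == top]
    if len(matches) != 1:
        # Zero or several candidates means the reverse mapping is not unique.
        raise ValueError(f"StreamID {stream_id:#x} decodes to ports {matches}")
    return matches[0]


ns_high = [0x0000, 0x1000, 0x2000]  # hypothetical per-port OR values
print(route_atc_invalidate(0x1ABC, [1, 2], ns_high, low_bits=12))
```

A configuration in which two PCIe ports share the same high bits would raise an error in this sketch, which mirrors the requirement for a unique decode.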

Note:

A bad configuration renders the model inactive.

Some configuration can be adjusted by configuration pins. These are only sampled at the negative edge of the reset pin. If you want to use these pins, then you must drive them before sending a negative edge on the reset pin. During simulation_reset, the component driving them must also drive this transition again.

The pin sup_oas is not supported. Instead, it is a parameter, because it is assumed to be tied to a fixed value in any specific platform.

Debug reads to the registers do not disturb the state.

Writes to registers with Update flags, including debug writes, are ignored if the Update flag is already set to one.

Debug and real accesses to the registers must be 32 bits or 64 bits.

MSIs are issued on the qtw_pvbus_m port using attributes that are determined by the parameter msi_attribute_transform, while Event queue writes are always issued with ExtendedID=0, UserFlags=0, MasterID=0xFFFFffff. In the hardware, there is no way of distinguishing Event queue writes from MSI writes. However, this provides a mechanism for the model system to distinguish them.

The hardware only has a single cacheability attribute for input transactions, but PVBus transports have both inner and outer cacheability.

For non-PCIe-mode TBUs, whose index does not appear in list_of_pcie_mode or list_of_pcie_rc, for non-cache maintenance operations:

  • If the input attribute is any type of device then it is well-defined as being outer-shared and Device-nGnRnE or Device-nGnRE. There is no support for Gathering or Reordering.
  • If the outer cacheability input attribute is Normal Write-back, it is converted to Inner Write-back Outer Write-back (iWB-oWB) with the desired shareability. The Transient hint is not supported, so the transaction is always treated as non-transient.
  • All other Normal memory types are mapped to Inner Normal Non-cacheable, Outer Normal Non-cacheable, outer shared (iNC-oNC-osh).

Therefore, the upstream devices must present the cacheability in the outer cacheability attribute on PVBus if it is cacheable. If it is a device type then both the inner and outer attributes must be set to the same value. If it is iNC-oNC-osh, then it must be presented as such.

For PCIe-mode TBUs, that is, whose index appears in list_of_pcie_mode or list_of_pcie_rc:

  • Input transactions are from PCIe and the only indication of the memory type is in the NoSnoop bit of the transaction. No shareability is transported. NoSnoop is interpreted as iNC-oNC-osh. ! NoSnoop is interpreted as iWB-oWB-ish (note Inner Shareable).
  • If a NoSnoop transaction has an attribute transform applied to it and the result of the transform is weaker than iNC-oNC-osh, it is forced to iNC-oNC-osh. For example, if a NoSnoop transaction uses a page table and is transformed to iWB-oWB-nsh then it is forced to iNC-oNC-osh. However, if the page tables transformed it to a device type, as all device types are stronger than iNC-oNC-osh, it exits the SMMU as the device type.
  • In the model, transactions are classified by their incoming memory attributes as to whether they are NoSnoop or not and then are normalized appropriately:
    • iWB-oWB-any-shareability are interpreted as ! NoSnoop, therefore are normalized to iWB-oWB-ish.
    • Anything else is considered NoSnoop, therefore is normalized to iNC-oNC-osh.
  • Translated accesses use the same interpretation to determine NoSnoop and how they are normalized. Therefore, they could enter the system with different attributes than if they had entered the SMMU as Untranslated Accesses and been translated by the normal translation process before exiting downstream of the SMMU.
  • It is expected that there is a component downstream of the SMMU that is aware of the system address map and will override the memory type to device for any transaction that accesses a peripheral.
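The PCIe-mode classification and the NoSnoop forcing rule can be sketched as below. This is a hedged illustration only. The string encoding of attributes and the function names are hypothetical. The sketch also assumes that every Normal result of the transform is either iNC-oNC-osh itself or weaker than it, so only device types pass through unchanged for a NoSnoop transaction.

```python
def classify_pcie_input(inner, outer):
    """Return (no_snoop, normalized_attribute) for an input on a PCIe-mode TBU."""
    if inner == "WB" and outer == "WB":
        # iWB-oWB with any shareability is treated as not NoSnoop.
        return False, "iWB-oWB-ish"
    # Anything else is considered NoSnoop.
    return True, "iNC-oNC-osh"


def force_nosnoop_output(no_snoop, transformed):
    """Apply the NoSnoop rule to the attribute produced by the transform."""
    if not no_snoop:
        return transformed
    if transformed.startswith("Device"):
        # Device types are stronger than iNC-oNC-osh and are left alone.
        return transformed
    # Assumption: any Normal type is forced to iNC-oNC-osh.
    return "iNC-oNC-osh"


no_snoop, attr = classify_pcie_input("NC", "NC")
print(no_snoop, attr, force_nosnoop_output(no_snoop, "iWB-oWB-nsh"))
```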

The hardware has a single cacheability on input, and, for transactions that are neither cache-maintenance operations nor PCIe transactions, it normalizes the input to an architectural form before performing the SMMUv3 architectural transform:

  • Any device type is left untouched. The input can only represent Device-nGnRE and Device-nGnRnE.
  • If the input is Write-back (WB), it is normalized to iWB-oWB with the incoming shareability.
  • If the input is anything else, it is normalized to iNC-oNC-osh.

The model accepts full architectural attributes of two levels of cacheability and so has to decide how to interpret this in terms of the hardware. For transactions that are not cache maintenance operations, the model replicates the outer attribute into the inner attribute and then performs the normalization that the hardware does.
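A minimal sketch of this input normalization for non-PCIe, non-cache-maintenance transactions follows. The attribute strings and the function normalize_input are hypothetical, and the sketch assumes that a device type is presented identically on both the inner and outer attributes.

```python
DEVICE_TYPES = {"Device-nGnRnE", "Device-nGnRE"}


def normalize_input(inner, outer, shareability):
    """Normalize a non-PCIe input to the architectural form described above."""
    if outer in DEVICE_TYPES:
        # Any device type is left untouched.
        return outer
    # Replicate the outer attribute into the inner attribute; the incoming
    # inner value is deliberately discarded.
    inner = outer
    if inner == "WB":
        return f"iWB-oWB-{shareability}"
    # Every other Normal type becomes iNC-oNC-osh.
    return "iNC-oNC-osh"


print(normalize_input("NC", "WB", "ish"))
```

Because only the outer attribute is inspected, an upstream device that marks a transaction as inner Write-back but outer Non-cacheable ends up as iNC-oNC-osh in this sketch.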

The hardware normalizes the architectural output attributes and outputs a single level of cacheability and a user flag (OC) specifying if the architectural attributes were cacheable in the outer cacheable domain. If the transaction is classified as a PCIe transaction, the NoSnoop transform previously described is applied. That is, if the original transaction was NoSnoop then any weaker memory type is strengthened to iNC-oNC-osh, and the following transform is applied:

if      iWB-oWB-nsh/ish/osh     then output WB-nsh/ish/osh, OC = 1
else if i(NC/WT/WB)-o(WB/WT)    then output NC-Sys,         OC = 1
else if i(NC/WT/WB)-oNC         then output NC-Sys,         OC = 0
else if Device-(GRE/nGRE/nGnRE) then output DV-Sys,         OC = 0
else                                 output SO-Sys,         OC = 0
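The same decision chain can be written as a small runnable sketch. It only illustrates the hardware transform listed above, because the model itself leaves the architectural attributes intact on output. The tuple encoding and the name hardware_output are hypothetical, and the final branch is assumed to correspond to Device-nGnRnE because it is the only case the earlier branches do not cover.

```python
def hardware_output(attr):
    """Map an architectural attribute to (output_type, OC).

    attr is ('normal', inner, outer, shareability) or ('device', kind).
    """
    if attr[0] == "normal":
        _, inner, outer, shareability = attr
        if inner == "WB" and outer == "WB":
            return f"WB-{shareability}", 1
        if outer in ("WB", "WT"):
            return "NC-Sys", 1
        return "NC-Sys", 0  # outer is NC
    _, kind = attr
    if kind in ("GRE", "nGRE", "nGnRE"):
        return "DV-Sys", 0
    return "SO-Sys", 0  # remaining case, assumed to be nGnRnE


print(hardware_output(("normal", "WB", "WB", "ish")))
print(hardware_output(("normal", "NC", "WT", "osh")))
```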

The model only normalizes according to PCIe but otherwise leaves the architectural attributes intact on the output bus.

Limitations

The model has the following limitations:

  • The PMU has limited functionality. It is intended for demonstration purposes only and does not implement all architecturally mandated events. It does not implement the pmusnapshot_ack or pmusnapshot_req interface.
  • The model does not implement:
    • RAS.
    • Power control.
    • sup_oas, which controls the OAS of the SMMU. This is expected to be constant for a system.
    • The SYSCO interface.
    • The Low-Power Interface.
  • Cache maintenance operations cannot be inserted into the TBU ports of the SMMU.
  • PVBus has no representation of the cache stash operations, so they are not supported.
  • TCU_CFG.XLATE_SLOTS is fixed at 512.
  • TCU_STATUS.GNT_XLATE_SLOTS always reads as 512.

Important:

The model deals with groups of transactions with the same attributes and a similar range of addresses. The mapping that is used is stored by the bus infrastructure and is reused for subsequent, sufficiently similar transactions without the intervention of the SMMU model. Those subsequent transactions are therefore not traced.

Table 3-346 Ports

Name Protocol Type Description
clk_in ClockSignal Slave Clock signal (in RTL aclk). This is a clock time-base used by the TCU to spread some of its processing over time, if enabled by the wait_* parameters. The clock must always be connected.
cmd_sync_irpt_ns Signal Master Pulsed interrupt output signal for non-secure CMD_SYNC having a completion signal of SIG_IRQ.
cmd_sync_irpt_s Signal Master Pulsed interrupt output signal for secure CMD_SYNC having a completion signal of SIG_IRQ.
event_q_irpt_ns Signal Master Pulsed interrupt output signal for the non-secure event queue becoming non-empty.
event_q_irpt_s Signal Master Pulsed interrupt output signal for the secure event queue becoming non-empty.
evento Signal Master Event signal
global_irpt_ns Signal Master Pulsed interrupt output signal for non-secure SMMU_GERROR(N) signalling an error.
global_irpt_s Signal Master Pulsed interrupt output signal for secure SMMU_S_GERROR(N) signalling an error.
identify SMMUv3AEMIdentifyProtocol Master Map the transaction to the tuple (StreamID, SubStreamID, SubStreamIDValid, SSD). The StreamID that is produced by the implementation of this protocol is not the final StreamID. The final StreamID is produced by using the list_of_ns_sid_high_at_bitpos0/list_of_s_sid_high_at_bitpos0 parameter to map the StreamID based on the upstream port index. Also see the parameter howto_identify, which can replace the functionality of this port under certain circumstances.
prog_pvbus_s PVBus Slave Register slave port (in RTL PROG)
pvbus_id_routed_m[62] PVBus Master This is a special "id-routed" port for transmitting ATC invalidates upstream into the PCIe EndPoints. It is not a normal bus. The FastModels ATC Invalidate and PRI Response protocol specifies how to route and deal with this port.
qtw_pvbus_m PVBus Master Master port used for Table Walks, MSIs and Queue access when separate_tw_msi_qs_port==true (in RTL QTW)
sec_override Signal Slave Allow certain registers to be accessible to non-secure accesses from reset, as described in the TCU_SCR register.
sup_btm Signal Slave System supports BTM and will be reflected in the IDR registers. This signal can override the value set by the parameters configuring the IDR registers. If BTM (Broadcast Table Maintenance) is not supported then DVM messages will be ignored.
sup_cohacc Signal Slave System supports COHACC and will be reflected in the IDR registers.
sup_sev Signal Slave System supports SEV and will be reflected in the IDR registers. This signal can override the value set by the parameters configuring the IDR registers.
tbm_pvbus_m[62] PVBus Master The TBU master ports that carry transactions that have been translated from the correspondingly numbered tbs_pvbus_s[] port.
tbs_pvbus_s[62] PVBus Slave The TBU slave ports that receive transactions to be translated. They will exit the SMMU through the same numbered tbm_pvbus_m[] port.
tbu_pmu_irpt[62] Signal Master TBU Performance Monitoring Unit interrupt, one per TBU.
tbu_reset_in[62] Signal Slave Reset signals. The TBUs can have independent reset signals. Each signal tbu_reset_in[n] corresponds to the TBU using the tbs_pvbus_s[n]/tbm_pvbus_m[n] pair. If the SMMU receives a transaction whilst the TBU is expected to be in reset then it will complain using the ArchMsg.Warning.warning trace source. Those tbu_reset_in that correspond to a PCIe-RC connection can be connected to monitor the PCIe-RC's reset signal. If it receives an ATS request when in reset then it will complain in a similar way. You must connect these pins if you wish the TCU_NODE_STATUS for the nodes to be accurate (including any connected to the PCIe-RC).
tcu_pmu_irpt Signal Master TCU Performance Monitoring Unit interrupt
tcu_reset_in Signal Slave The reset signal to the TCU interface.

Table 3-347 Parameters for MMU_600

Name Type Default value Description
all_error_messages_through_trace bool 0x0 Some conditions in the SMMU are so strange that the software programming the SMMU has done something wrong. At this point messages are output to either ArchMsg.Error.* or ArchMsg.Warning.* or to the error stream of the simulator. Outputting to the error stream of the simulator may cause it to return with a non-zero exit status. If you set this option to true then instead of using the error stream of the simulator it will always use a trace stream allowing the simulation to exit with a zero exit status.
behaviour_of_sampled_at_reset_signals int 0x0 Some configuration signals into the SMMU are sampled on negedge of reset. However, it can sometimes be hard to arrange to drive a configuration pin before the negedge of reset. The configuration pins are sampled: 0 -- at negedge reset. 1 -- at negedge reset, but if a later change occurs at the same simulated time, and no transactions have occurred, then they will be resampled and the SMMU reset again.
cmdq_max_number_of_commands_to_buffer int 0xa The command queues can buffer fetched commands before issuing them. This parameter is roughly the maximum number of commands to do this for. The programmer visible effects are that just because the CONS pointer shows a command has been _consumed_ does not necessarily mean that it has been issued (and completed). Higher values will accentuate this effect.
enable_device_id_checks bool 0x1 If this parameter is true then the DeviceIDs seen by the GIC are: * for client devices: DeviceID = StreamID + translated_device_id_base * for SMMU-generated MSIs: smmu_msi_device_id This parameter enables two checks: * If the DeviceID is used in the output_attribute_transform then if it overflows 32 bits then the model will warn. If the DeviceID is not used then it is assumed that the external agent that forms the DeviceID will warn if it overflows. * If the SMMU supports MSIs, then the model will check that the GIC will be able to distinguish an MSI generated by the SMMU from one generated by a client device. As the exact mechanism to determine the DeviceID is in the system and not necessarily under control of the SMMU then you can disable these warnings using this parameter. See also the parameters: output_attribute_transform and msi_attribute_transform.
howto_identify string "use-identify" If 'use-identify', the model uses the 'identify' port to determine the SSD, StreamID, SubStreamID of a transaction. Otherwise, this string extracts them from the transaction's attributes. Examples: "SEC_SID=ExtendedID[63], SSV=ExtendedID[62], SubstreamID=ExtendedID[51:32], StreamID=ExtendedID[31:0]" and "nSEC_SID=ExtendedID[63], StreamID=ExtendedID[55:24], nSSV=ExtendedID[20], SubstreamID=ExtendedID[19:0]". SEC_SID is one bit wide, true if StreamID is secure. SSV is one bit wide, true if SubstreamID is valid. The alternative symbols nSEC_SID and nSSV have the negative logic to SEC_SID and SSV. You must not use the negative and positive logic for the same symbol at the same time. However, just because you use negative logic for one symbol does not force you to use negative logic for the other. SubstreamID is 20 bits wide and StreamID is 32 bits. More complex examples: StreamID[31:24]=0, StreamID[23:0]=ExtendedID[23:0], SSV=1[0], ... Available attributes: ExtendedID, MasterID, UserFlags
list_of_ns_sid_high_at_bitpos0 string "" A comma-separated list of values to bitwise OR into each Non-secure StreamID for each TBU/Node. Bit 0 of the value corresponds to bit 0 of the StreamID. Each TBU that is connected to a PCIe subsystem must serve a unique contiguous subset of StreamIDs as determined by their top bits. This is used in order to know which port to route ATC Invalidates to the PCIe subsystems.
list_of_pcie_mode string "" A comma-separated list of ranges of ports that represent TBUs that are attached to PCIe Root Complexes (PCIe-RC). A single PCIe-RC might use several TBUs and stripe accesses across them. The attribute handling for these TBUs is slightly different in that if the PCIe transaction is NoSnoop and the output attributes of the translation would be weaker than iNC-oNC-osh then the output is forced to iNC-oNC-osh. iNC-oNC-osh == "inner normal non-cacheable, outer normal non-cacheable, outer shared"
list_of_pcie_rc string "" This is a list of ports that are connected to a PCIe Root Complex (PCIe-RC) by a protocol called DTI-ATS. This port is used to transport ATS and PRI Requests to the SMMU from the PCIe-RC. In the real hardware, the PCIe-RC uses this port for ATS/PRI, and the actual transactions go through separate TBUs. In the model, this port can accept actual transactions as well. However, in the model, the ATC Invalidates and the PRI Responses need to be transferred over the corresponding pvbus_id_routed_m port because DTI-ATS is bidirectional, but PVBus is not.
list_of_s_sid_high_at_bitpos0 string "" A comma-separated list of values to bitwise OR into each Secure StreamID for each TBU/Node. Bit 0 of the value corresponds to bit 0 of the StreamID.
msi_attribute_transform string "ExtendedID[31:0]=smmu_msi_device_id, MasterID=0xFFFFffff" The SMMU will use this parameter to determine the MSI output attributes of MSIs it generates itself: * MasterID * UserFlags * ExtendedID of the transaction to use. This is a set of comma separated transforms on the output attributes. The right hand side of the transform is: * a numeric literal (or a slice of a literal) * the parameter smmu_msi_device_id * the symbol 'interrupt_kind' * 0/1 -- EVENTQ Secure/Non-secure * 2 -- PRIQ * 3/4 -- CMD_SYNC Secure/Non-secure * 5/6 -- GERROR Secure/Non-secure Example: UserFlags[15:0]=smmu_msi_device_id[31:16], MasterID[15:0]=smmu_msi_device_id[15:0], ExtendedID=0 This transform might be used as part of a system-specific way of determining the DeviceID that is passed to the GIC to distinguish MSIs generated by the SMMU and those generated by client devices of the SMMU. See also: output_attribute_transform and enable_device_id_checks.
number_of_ports int 0x1 The number of port pairs that the SMMU has.
output_attribute_transform string "ExtendedID[31:0]=DeviceID" Transform the downstream attributes of a translated transaction. * "" or "none" -- the input and output attributes are identical. * How to alter the output attributes, e.g. "ExtendedID[15:0]=DeviceID[15:0], UserFlags[31]=nSSV, UserFlags[19:0]=SubstreamID" The attributes that can appear on the left hand side of the transform are ExtendedID, MasterID and UserFlags. The source attributes that can be used are: * ExtendedID/MasterID/UserFlags -- the incoming attributes. * DeviceID -- StreamID + translated_device_id_base * StreamID/SubstreamID/SSV/SEC_SID * nSEC_SID/nSSV -- the negative logic versions. * St1PBHA/St2PBHA -- the Page Based Hardware Attributes from any used leaf descriptors (or zero if not used). * STE_IMPDEF1 -- STE[127:116] The right hand side may also contain numeric literals. Any bits of the attributes that have no transform specified are retained from the input. The StreamID has had ns_sid_high/s_sid_high ORred into it for the appropriate TBU.
prefetch_only_requests int 0x0 The simulator supports 'prefetch-only' DMI requests, which can occur at any time for any reason and are intended to be invisible to the end execution of the model and to the user. 0 -- deny all prefetch-only requests. 1 -- use debug requests for any page table walks; form and use debug TLB/cache entries; any faults are not recorded, but deny the prefetch request. 2 -- treat prefetch-only requests like normal transactions; use normal page table walk transactions; use and form normal TLB/cache entries; faults alter the programmer-visible state of the SMMU. 0 is the safest. 1 treats the access like a debug request and requires that debug page table walks are treated correctly downstream. Any descriptors that need HTTU to allow the transaction to proceed will fail the request. 2 is dangerous: it uses real transactions and reports faults that are unphysical. Real transactions can be wait()ed and this disobeys the SystemC spec for get_direct_mem_ptr().
sec_override bool 0x0 The IMP DEF port sec_override controls whether some of the registers are accessible to secure or non-secure transactions. This parameter is the default value assumed for that port if the port is not driven by a signal.
seed int 0x12345678 Used to seed the pseudo-random number generator that the SMMU model uses.
size_of_cd_cache int 0x0 The number of entries in the cache holding CD structures. If this is zero then it is treated as a large number ('infinite') but it is bounded so that the host memory usage of the cache is also bounded.
size_of_l1cd_cache int 0x0 The number of entries in the cache holding L1CD descriptors. If this is zero then it is treated as a large number ('infinite') but it is bounded so that the host memory usage of the cache is also bounded.
size_of_l1ste_cache int 0x0 The number of entries in the cache holding L1STE descriptors. If this is zero then it is treated as a large number ('infinite') but it is bounded so that the host memory usage of the cache is also bounded.
size_of_ste_cache int 0x0 The number of entries in the cache holding STE structures. If this is zero then it is treated as a large number ('infinite') but it is bounded so that the host memory usage of the cache is also bounded.
size_of_tlb int 0x0 The number of entries in the TLB. If this is zero then it is treated as a large number ('infinite') but it is bounded so that the host memory usage of the cache is also bounded.
smmu_msi_device_id int 0x0 When appropriately enabled, assume that MSIs that are generated by the SMMU are presented to the GIC with this DeviceID. See parameter msi_attribute_transform and enable_device_id_checks.
sup_cohacc bool 0x1 The value to put in SMMU_IDR0.COHACC
sup_oas int 0x5 The hardware has an input port sup_oas[2:0] that indicates what output address size (OAS) the _system_ has. This is sampled at reset. The model does not have this port as it is expected to be a constant for the system and not to change. Instead it is just a parameter. The allowed values are: 0 -- 32 bits 1 -- 36 bits 2 -- 40 bits 3 -- 42 bits 4 -- 44 bits 5 -- 48 bits
tlb_when_do_f_tlb_conflict_on_overlap int 0x0 If a TLB entry is created by a walk and it overlaps an existing entry, there are some architectural situations where the result is known. For all others, an implementation is allowed to use an UNPREDICTABLE combination of the two entries, or it can generate F_TLB_CONFLICT: 0 -- never generate, 1 -- sometimes generate, 2 -- always generate. Conflicts between global and non-global entries are not detected by the model.
translated_device_id_base int 0x0 When appropriately enabled, assume that client device accesses are translated to a DeviceID as seen by the GIC of: StreamID + translated_device_id_base See parameter output_attribute_transform and enable_device_id_checks.
version string "r0p0" The version of this product
wait_atos_ticks int 0x0 This is the time to wait before doing an ATOS operation. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
wait_cmdq_ticks int 0x0 This is the time to wait before doing something on the command queue. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
wait_eventq_ticks int 0x0 This is the time to wait before doing something on the event queue. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
wait_msi_ticks int 0x0 This is the time to wait before sending an MSI. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
wait_pri_req_ticks int 0x0 This is the time to wait before processing a PRI Request. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
wait_pri_resp_ticks int 0x1 This is the time to wait before sending a PRI Response back to the PCIe subsystem. When a PRI Response is an auto-response then the ATC might immediately make a new ATS request, that immediately fails, that immediately makes a PRI Request, that auto-responds, etc. To break this loop, a minimum time is introduced on all PRI Responses to give other components in the system a chance to run. If bit 32 is set (0x1_0000_0000) then the time waited for is a uniform randomly distributed time [0,max(2,(t & 0xFFFFffff))).
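
The encoding shared by the wait_*_ticks parameters can be sketched as follows. This is an illustration under assumptions, not the model's code. The function wait_ticks is hypothetical, the sketch assumes that a value without bit 32 set is used directly as a fixed delay, and it uses Python's random module rather than the model's own pseudo-random number generator.

```python
import random

RANDOM_FLAG = 1 << 32  # bit 32 (0x1_0000_0000) selects a randomized delay


def wait_ticks(param, rng):
    """Return the number of ticks to wait for one wait_*_ticks parameter value."""
    t = param & 0xFFFFFFFF
    if param & RANDOM_FLAG:
        # Uniform over the half-open interval [0, max(2, t)).
        return rng.randrange(max(2, t))
    # Assumption: without bit 32, the low 32 bits are a fixed delay.
    return t


rng = random.Random(0x12345678)  # any seed; the model's generator differs
print(wait_ticks(0x1, rng), wait_ticks(RANDOM_FLAG | 8, rng))
```

The max(2, ...) term means that a randomized parameter with a low value of 0 or 1 still produces a delay of either 0 or 1 tick in this sketch.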
Non-Confidential 100964_1180_00_en
Copyright © 2014–2019 Arm Limited or its affiliates. All rights reserved.