ARM Technical Support Knowledge Articles

Quick-start Guide for Benchmarking

Applies to: ARM Developer Suite (ADS)


The ARMulator is a very useful benchmarking tool, but it can be difficult for new users to set up, because not all settings can be made from inside the debugger interface (GUI). This article is a simple guide to the basic ARMulator settings. For information on the cycle types that are output, and for more detail than is given here, please see Application Note 93.

Non-GUI settings:

1) Core Clock:Bus Clock setting

This is set by MCCFG in the file "default.ami", located in the Bin directory of your tools installation. This number must be an integer (asynchronous modes are not supported). The default setting is 3, so a 150MHz core clock will result in a 50MHz bus clock. The Core Clock value itself is set in the debugger (see GUI settings, later). Note that this setting has no relevance to an ARM7TDMI user, because this core supports only a single clock domain.
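
The ratio can be checked with a few lines of arithmetic. The Python sketch below is illustrative only and is not part of ADS. The function name bus_clock_hz is hypothetical, and the sketch assumes MCCFG acts as a positive integer divisor on the core clock, as described above.

```python
def bus_clock_hz(core_clock_hz: int, mccfg: int) -> float:
    """Return the bus clock implied by a core clock and an integer MCCFG ratio.

    Assumption: bus clock = core clock / MCCFG, with MCCFG a positive integer.
    """
    if not isinstance(mccfg, int) or mccfg < 1:
        raise ValueError("MCCFG must be a positive integer")
    return core_clock_hz / mccfg


# Default ratio of 3 with a 150MHz core clock
print(bus_clock_hz(150_000_000, 3) / 1e6)  # 50.0
```

The same calculation with a ratio of 5 and a 200MHz core clock gives the 40MHz bus clock used in the banner example later in this article.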

2) Cache and TCM size settings

These are set in the core definition inside "armulate.dsc" for relevant cores. First search for the appropriate definition (for example ARM926EJ-S=Processors_Common_ARMULATE). Inside this definition you will see some or all of the following, depending on what is supported by the core of interest.

TCM sizes are set by the values of the parameters:


Cache sizes are set in terms of number of cache lines present, with the parameters:


Please consult the Technical Reference Manual of the core you are using for suitable values of the above settings.

3) Memory system

You can then describe the memory system to ARMulator, so that wait-states and other factors are added to the model. This is done with a map file: a user-created text file that defines each region in terms of start address, length (in hex, without a leading 0x), width, and access time. The format is described in detail in the 'Map Files' section of Application Note 93.

Access times are defined in the map file in nanoseconds (10^-9 s), and only integer values are accepted. The ARMulator will then apply the appropriate number of wait-states for the bus system. The resulting wait-states are displayed in the ARMulator banner.
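
As an illustration of the fields just described, the following Python sketch splits a map-file line into its parts. It is not an ARM tool: the Region class and the function names are hypothetical, and the field order (start, length, name, width in bytes, access, read times, write times) is inferred from the example map file shown under "Checking your set-up" below. Application Note 93 remains the authoritative description of the format.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int          # start address
    length: int         # region length in bytes
    name: str           # label, e.g. "RAM" or "ROM"
    width_bytes: int    # bus width in bytes (assumed: 1 = 8-bit, 4 = 32-bit)
    access: str         # e.g. "R" or "RW"
    read_ns: tuple      # two read access times in ns, as written "a/b"
    write_ns: tuple     # two write access times in ns, as written "a/b"

    @property
    def end(self) -> int:
        return self.start + self.length - 1


def parse_times(field: str) -> tuple:
    """Split an 'a/b' access-time field into two integers (nanoseconds)."""
    first, second = field.split("/")
    return int(first), int(second)


def parse_map_line(line: str) -> Region:
    """Parse one map-file line; addresses are hex without a leading 0x."""
    start, length, name, width, access, read_t, write_t = line.split()
    return Region(int(start, 16), int(length, 16), name, int(width),
                  access, parse_times(read_t), parse_times(write_t))


def banner_range(region: Region) -> str:
    """Format the address range and width in the style of the banner."""
    return (f"{region.start:08x}..{region.end:08x}, "
            f"{region.width_bytes * 8:02d}-Bit")


rom = parse_map_line("00040000 00040000 ROM 1 R 100/70 100/70")
print(banner_range(rom))  # 00040000..0007ffff, 08-Bit
```

Parsing a candidate map file this way before loading it into ARMulator is one way to catch a mistyped length or a non-integer access time early.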

GUI settings:

Launch the AXD debugger and select Options -> Configure Target -> armulate.dll -> Configure to open the configuration window. This is where all GUI settings are made.

1) Processor
Select the appropriate ARM processor from the pull-down list.

2) Clock
Select 'Emulated', and then enter an appropriate ARM core clock speed. The default unit is Hz, so an entry of 120 defines a clock speed of 120 Hz. To define a clock speed of 120 MHz, enter either "120 MHz" or "120000000".

3) Floating Point Emulation
This simulates the obsolete FPE coprocessor. It is not normally selected.

4) Debug-Endian
Defines whether the system is little-endian or big-endian under normal execution.

5) Start target Endian
This defines the endianness at start-up (if different).

6) Memory Map File
Select Map File, and browse to your map file defined above, if appropriate.

7) Floating Point Coprocessor
This option selects the presence of a VFP coprocessor.

8) MMU/PU Initialization
This selects whether the memory management systems are enabled at start-up (of the simulation).
To enable them at start-up, select DEFAULT_PAGETABLES.
Otherwise, select NO_PAGETABLES.

The DEFAULT_PAGETABLES option is useful for simple benchmarking, but if your code enables the MMU itself, as it would in a real system, please select NO_PAGETABLES.

When completed, press OK to save changes.

Checking your set-up:

All information concerning your set-up will be displayed in the ARMulator banner upon connection to the debugger.

ARM926EJ-S, 16Kb I-cache, 8Kb D-cache, 4Kb I-Ram, 8KB D-Ram, Memory Management Unit, I-uTLB, D-uTLB, TLB, 200.00MHz core clock, BIU, Little endian, Debug Comms Channel, 40.0MHz

This part of the banner shows the core, the cache and TCM sizes, and other parameters. In this example, MCCFG=5 and the Core Clock is set to 200MHz, hence the Bus Clock is 40MHz.

Memory map:

00000000..0003ffff, 32-Bit, wr, wait states: RN=1/0 WN=1/0 RS=0 WS=0 RIS=1/0 WIS=1/0
00040000..0007ffff, 08-Bit, -r, wait states: RN=3/2 WN=Abt RS=2 WS=Abt RIS=3/2 WIS=3/2

This part of the banner shows the memory map of the system. The map file used was:

00040000 00040000 ROM 1 R 100/70 100/70
00000000 00040000 RAM 4 RW 40/20 40/20

For this example, a single bus cycle (at 40MHz) takes 25ns, so wait-states are added wherever an access time in the map file is longer than one bus cycle.
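
The arithmetic behind this can be sketched in Python. This is an illustration and not ARMulator's actual implementation: the function wait_states is hypothetical, and it assumes that an access is rounded up to a whole number of bus cycles, with every cycle after the first counted as a wait-state. Under that assumption it reproduces the leading RN and RS figures in the banner above; it does not attempt to model the second figure of each pair or the Abt entries.

```python
def wait_states(access_ns: int, bus_clock_hz: int) -> int:
    """Estimate wait-states for an access time given in nanoseconds.

    Assumption: cycles = ceil(access time / bus period), and
    wait-states = cycles - 1. Integer arithmetic avoids rounding error.
    """
    # ceil(access_ns * bus_clock_hz / 1e9) using integers only
    cycles = -(-access_ns * bus_clock_hz // 1_000_000_000)
    return max(cycles - 1, 0)


BUS_HZ = 40_000_000  # 25ns bus cycle, as in the example

print(wait_states(100, BUS_HZ))  # 3  (ROM, first read time)
print(wait_states(70, BUS_HZ))   # 2  (ROM, second read time)
print(wait_states(40, BUS_HZ))   # 1  (RAM, first read time)
print(wait_states(20, BUS_HZ))   # 0  (RAM, second read time)
```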

Article last edited on: 2008-09-09 15:47:41

Copyright © 2011 ARM Limited. All rights reserved. External (Open), Non-Confidential