11.3.2. Cycle counting example: Dhrystone

This example determines the number of instructions executed by one iteration of the main loop of the Dhrystone application, and the number of cycles that iteration consumes. A suitable place to break within the loop is the call to function Proc_5.

If you are using the command-line tools:

  1. Load the executable produced in "Code and data sizes example: Dhrystone" into the debugger:


    armsd -nofpe dhry

  2. Set a breakpoint on the first instruction of Proc_5:


    break @Proc_5

  3. Type go at the armsd prompt to begin execution. When prompted, request at least two runs through Dhrystone.

  4. When the breakpoint at the start of Proc_5 is reached, display the system variable $statistics (which gives the total number of instructions and cycles taken so far) and restart execution:


    print $statistics
    go

  5. When the breakpoint is reached again, display $statistics_inc, which gives the number of instructions and cycles consumed since the statistics were last displayed, that is, by one iteration of the loop:


    print $statistics_inc
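
The relationship between the two variables can be pictured as a difference of snapshots. The following Python sketch is illustrative only: the Stats type, the increment helper, and the cumulative totals are hypothetical and are not part of armsd. It assumes that the incremental figures are simply the later totals minus the earlier ones.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Stats:
    """One snapshot of the debugger counters (hypothetical model)."""
    instructions: int
    s_cycles: int
    n_cycles: int
    i_cycles: int
    c_cycles: int
    f_cycles: int


def increment(earlier: Stats, later: Stats) -> Stats:
    """Return the counts accumulated between two snapshots."""
    return Stats(*(getattr(later, f.name) - getattr(earlier, f.name)
                   for f in fields(Stats)))


# Made-up cumulative totals at two successive Proc_5 breakpoints.
first_hit = Stats(10_000, 12_000, 5_000, 2_000, 0, 0)
second_hit = Stats(10_358, 12_427, 5_188, 2_064, 0, 0)

# Counts attributable to the single loop iteration between the two hits.
per_iteration = increment(first_hit, second_hit)
print(per_iteration)
```

Because both snapshots are taken at the same point in the loop, the difference covers exactly one iteration, wherever in the loop the breakpoint is placed.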

If you are using the Windows toolkit:

  1. If you have not already done so, build the Dhrystone project as described in "Code and data sizes example: Dhrystone".

  2. If you use ADW and are running APM, click the Debug button to start ADW and load the Dhrystone project. If you use ADU, start ADU and select Load Image... from the File menu to load the Dhrystone project.

  3. Disable floating point emulation. Select Options → Configure Debugger... → Target → ARMulate and switch the FPE check box off.

  4. Locate function Proc_5 by selecting Low Level Symbols from the View menu.

  5. Double click on Proc_5 to open the Disassembly Window.

  6. Toggle the breakpoint on Proc_5 in the Disassembly Window by selecting the instruction, then clicking the Toggle breakpoint button on the toolbar.

  7. Click the Go button to begin execution.

  8. When prompted, request at least two runs through Dhrystone.

  9. When the breakpoint set at main is reached, click Go again to begin execution of the main application.

  10. When the breakpoint at Proc_5 is reached, choose Debugger Internals from the View menu.

  11. Double click on the statistics_inc field to display the detail for this variable.

  12. Click the Go button. When the breakpoint at Proc_5 is reached again, the contents of the statistics_inc_w field are updated to reflect the number of instructions and cycles consumed by one iteration of the loop.

Results

The results are shown in the following table:

Table 11.2. Cycle counting results

Instructions  S-cycles  N-cycles  I-cycles  C-cycles  F-cycles
358           427       188       64        0         0

S-cycles

Sequential cycles. The CPU requests a transfer to or from the same address, or from an address that is a word or halfword after the preceding address.

N-cycles

Non-sequential cycles. The CPU requests a transfer to or from an address that is unrelated to the address used in the preceding cycle.

I-cycles

Internal cycles. The CPU does not require a transfer because it is performing an internal function (or running from cache).

C-cycles

Coprocessor cycles.

F-cycles

Fast clock cycles for cached processors (FCLK).
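
The distinction between S-cycles and N-cycles can be illustrated with a toy model. The Python sketch below is a simplification written for this purpose: the classify function and the address trace are hypothetical, and on real hardware the processor signals the cycle type itself rather than having it derived from addresses. It applies only the rule stated above, that a transfer is sequential when its address equals the preceding one or lies a halfword (2 bytes) or word (4 bytes) after it.

```python
def classify(previous: int, current: int) -> str:
    """Label a transfer 'S' or 'N' from two consecutive byte addresses."""
    # Same address, or one halfword (2 bytes) or one word (4 bytes) further on.
    return "S" if current - previous in (0, 2, 4) else "N"


# Hypothetical trace: three word fetches, a branch, then one more fetch.
trace = [0x8000, 0x8004, 0x8008, 0x9000, 0x9004]

# The first transfer has no predecessor, so labels start at the second one.
labels = [classify(a, b) for a, b in zip(trace, trace[1:])]
print(labels)
```

In this model the branch to 0x9000 is the only non-sequential transfer. Straight-line code produces mostly S-cycles, which matches S-cycles being the largest count in the table.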

Note

You may obtain slightly different figures, depending on the version of the compiler, linker, or library in use, and the processor for which the ARMulator is configured.
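
As a worked example, the figures can be combined into a total cycle count and an average number of cycles per instruction. This Python sketch reads the table as 358 instructions, 427 S-cycles, 188 N-cycles, 64 I-cycles, 0 C-cycles, and 0 F-cycles. It also assumes an uncached processor on which each S-, N-, I-, and C-cycle takes one clock period, so the total is their sum; neither assumption is stated in the text above, and the variable names are chosen for illustration.

```python
# Per-iteration figures, as read from Table 11.2 (see assumptions above).
instructions = 358
s_cycles, n_cycles, i_cycles, c_cycles = 427, 188, 64, 0

# Assumed: every counted cycle occupies one clock period on an uncached core.
total_cycles = s_cycles + n_cycles + i_cycles + c_cycles

# Average number of cycles needed per executed instruction.
cycles_per_instruction = total_cycles / instructions

print(total_cycles)
print(round(cycles_per_instruction, 2))
```

Under these assumptions one iteration takes 679 cycles, a little under two cycles per instruction. If memory wait states make N-cycles slower than S-cycles, each count would need to be weighted by its own duration.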

Copyright © 1997, 1998 ARM Limited. All rights reserved. ARM DUI 0040D
Non-Confidential