3.5.3. Setting the sample rate

When profiling your application, the ARM Profiler records every executed instruction, enabling it to accurately reconstruct and report the call chain sequence. The ARM Profiler also records timing information for executed instructions. This timing information is collected in samples from the trace stream captured by the RealView Trace 2 unit. The sample rate defines, in cycles, how frequently these samples are taken. Sampling therefore shows how much time is spent on each instruction, which in turn lets you gauge the performance of your application as a whole.

Because the sample rate is expressed as an interval in cycles, a lower sample rate means that samples are taken more frequently. This gives you a more accurate performance measurement but increases the volume of information sent over the trace port. It also increases the amount of data that the ARM Profiler has to parse, making it more difficult for slower host machines to keep up with faster targets.

A sample rate that is too low can cause trace overflows. A higher sample rate reduces the amount of data that is transmitted over the trace port, but fewer samples are taken, so the performance statistics reported by the ARM Profiler are less accurate. The default value, 1021, tells the RealView Trace 2 unit to report the executing instruction every 1021 cycles.
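The trade-off between sample rate and data volume can be illustrated with a little arithmetic. The following is a minimal sketch, not part of the ARM Profiler: the function name samples_taken and the run length of 10,000,000 cycles are assumptions chosen for illustration. It assumes one timing sample is recorded every sample_rate cycles.

```python
def samples_taken(total_cycles: int, sample_rate: int) -> int:
    """Approximate number of timing samples in a run.

    Assumes one sample is recorded every `sample_rate` cycles
    (hypothetical helper, not an ARM Profiler API).
    """
    if sample_rate < 1:
        raise ValueError("sample_rate must be at least 1 cycle")
    return total_cycles // sample_rate


# Hypothetical run length, chosen only for illustration.
run_cycles = 10_000_000

for rate in (16, 64, 1021):
    print(f"sample rate {rate:>4}: {samples_taken(run_cycles, rate)} samples")
```

Under these assumptions, moving from a sample rate of 16 to the default of 1021 cuts the number of timing samples from 625000 to 9794, which is why the higher value places less load on the trace port at the cost of coarser timing data.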

Note

The ARM Profiler records every instruction executed, no matter what you have set as the sample rate. The sample rate only changes how often instruction timing information is recorded.

Use the Sample Rate drop-down menu to set the sample rate for the profiling run to one of its preset values.

Note

  • To reduce the risk of trace port overflows, the default sample rate is set to 1021.

  • The drop-down menu is populated with mostly prime numbers to ensure a more random sampling of executed instructions. This reduces the chance that the sample interval is an exact multiple of the length of an execution loop, which would cause the same instructions to be sampled on every pass. Some targets support only a limited range of sample rates.
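The effect of a prime sample rate on loops can be shown with a small model. This is a sketch under stated assumptions, not a description of how the trace hardware works: it assumes a hypothetical loop that takes exactly 8 cycles per iteration with no stalls, and the function name sampled_positions is invented for illustration.

```python
def sampled_positions(loop_cycles: int, sample_rate: int, num_samples: int) -> set:
    """Return the distinct cycle offsets within the loop that get sampled.

    Models a loop that repeats every `loop_cycles` cycles, with one sample
    taken every `sample_rate` cycles (hypothetical, simplified model).
    """
    return {(k * sample_rate) % loop_cycles for k in range(1, num_samples + 1)}


loop_cycles = 8  # assumed loop length in cycles, for illustration only

# 1024 is a multiple of 8, so every sample lands on the same offset.
print(sorted(sampled_positions(loop_cycles, 1024, 1000)))

# 1021 is prime and shares no factor with 8, so all offsets are visited.
print(sorted(sampled_positions(loop_cycles, 1021, 1000)))
```

In this model a sample rate of 1024 only ever observes one point in the loop, while 1021 spreads its samples across all eight cycle offsets.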

Cycle Accurate provides the highest level of accuracy, because it records the cycle count for every instruction. If you select Cycle Accurate for a target that does not support it, the ARM Profiler uses the lowest supported value instead and displays a message similar to the following: Target does not support sample rate 1 - using 16.

Setting the sample rate to Estimated Cycles provides maximum performance for smaller trace port widths, but turns sampling off. As when profiling with an RTSM, the ARM Profiler estimates the time for each instruction based on the instruction type and reports timing data based on these estimates, but provides no visibility into actual hardware stall behavior. This is not as accurate as sampling and does not show which instructions are performing more slowly than expected.

Note

Estimated Cycles does not use the cycle accurate mode of the Embedded Trace Macrocell (ETM) port, so it can be used with narrower ETM ports to reduce ETM buffer overflows.

Note

The Cortex-M3 allows a sample rate of either 64 or 1024. The ARM Profiler uses one of these two values based on the number chosen from this drop-down menu. Any values below 64 are converted to 64 and values above 64 are converted to 1024. The Cortex-M3 does not support the Estimated Cycles sampling setting.
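The Cortex-M3 conversion described in this note can be sketched as a small function. This is an illustration only, not ARM Profiler code: the function name cortex_m3_sample_rate is hypothetical, and treating a request of exactly 64 as 64 is an assumption, because the text above only covers values below and above 64.

```python
def cortex_m3_sample_rate(requested: int) -> int:
    """Map a requested sample rate to one the Cortex-M3 allows (64 or 1024).

    Hypothetical helper. Assumption: a request of exactly 64 stays at 64,
    since 64 is itself one of the two supported values.
    """
    if requested < 1:
        raise ValueError("sample rate must be a positive number of cycles")
    return 64 if requested <= 64 else 1024


for requested in (16, 64, 1021):
    print(f"requested {requested:>4} -> using {cortex_m3_sample_rate(requested)}")
```

Under this sketch, the default value of 1021 is converted to 1024 on a Cortex-M3 target.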

Copyright © 2007-2009 ARM Limited. All rights reserved.
ARM DUI 0414D
Non-Confidential