| |||
| Home > Tracing with RealView Debugger > Examples of using trace in RealView Debugger > Capturing profiling information | |||
This example demonstrates how you can use RealView Debugger to capture profiling information for your application.
The dhrystone.axf image used in this
example performs a benchmarking sample that is executed n number
of times, where you are prompted to indicate n when
running the image. By performing multiple runs, you can sum the
profiling data. This example assumes that you want to analyze the
execution times of all functions that are executed in the main for loop
that is run repeatedly. It also assumes that you are using an ETM-based
target.
This example is in two parts:
To capture profiling information using the dhrystone example:
Load the example image dhrystone.axf into
the debugger. This file is located in the directory.
The tab for install_directory\RVDK\Examples\...\dhrystone\DebugReldhry_1.c is displayed in the File
Editor pane.
Select Edit → Advanced → Show Line Numbers from
the Code window main menu to display the line numbers in dhry_1.c.
Select View → Analysis Window from the Code window main menu to view the Analysis window, and position the window so that it does not cover the Code window.
Configure the ETM for profiling:
Select Edit → Configure Analyzer Properties... from the Analysis window main menu to display the Configure ETM dialog box (see Configuring the ETM for more details on this dialog box).
To generate profiling information you must enable one of the settings that causes this to be captured. For RealView Trace, you can select either or both of the settings Enable Timestamping and Cycle accurate tracing.
For this example, select Enable Timestamping.
Be aware that selecting Cycle accurate tracing might give less accurate results than timestamping.
Click OK to close the Configure ETM dialog box.
Set a trace start point at the start of the program loop as follows:
In the dhry_1.c source
tab, scroll down the source file and right-click in the gray area
to the left of the code listing in line 146 to display the context menu.
This line represents the start of a for loop.
Select Set/Toggle Tracepoint... from the context menu. A List Selection dialog box is displayed.
Select Trace Start Point from
the List Selection dialog box. An arrow
is placed
next to line 146 to indicate the start point you have set, and details of
this control point are displayed in the Break/Tracepoints pane of
the Code window, as shown in Figure 2.47.
Set a trace end point after the end of the program loop as follows:
Scroll down the source file and right-click in the gray area to the left of the code listing at line 204 to display the context menu.
By placing the end point after the end of the loop, you ensure that RealView Debugger captures all the iterations of the loop, rather than a single iteration.
In addition, the area bounded by the trace start and end points does not include any code that activates semihosting. The semihosting mechanism can affect the profiling results.
Select Set/Toggle Tracepoint... from the context menu. A List Selection dialog box is displayed.
Select Trace End Point from
the List Selection dialog box. An arrow
is placed
next to line 204 to indicate the end point you have set, and details
of this control point are displayed in the Break/Tracepoints pane
of the Code window.
Because you have set trace start and end points, and not a trace range, you are instructing RealView Debugger to capture and display all trace information between the start and end points, including any instructions that might be branched to between the points. For more details on these types of tracepoints, see Setting unconditional tracepoints.
Select Debug → Run from the Code window main menu. The image is executed and a prompt is displayed in the Output pane at the bottom of the Code window.
In the Output pane, enter the number of runs through
the benchmark you want RealView Debugger to perform, in this case 5000,
and press Enter. Trace capture begins for the area of execution
you have specified using tracepoints, that is, for the for loop
area of code. The Analysis window is updated with the results of
the trace capture.
In the Analysis window, click the Profile tab to display the profile data, as shown in Figure 2.48.
In the Profile tab, as shown in Figure 2.48, the execution
times for all functions accessed during the for loop
are displayed, and a graphical representation of their respective
execution times is shown in the Histogram column.
For details on the types of information displayed in each column,
see Columns.
To view call-graph data for these results:
Select Profiling Data → Parents of Function from the Analysis window main menu.
Select Profiling Data → Children of Function from the Analysis window main menu.
The Profile tab displays parents and children of each function, as shown in Figure 2.49.
In addition to displaying the execution times for each function
accessed during the for loop, the execution times
of the parents and children of these functions are also displayed. Figure 2.50 shows the data
for a single function, Func_2.
The Exec% column shows that:
6.92% of the total execution time was spent in code
of the function Func_2 when called from the
parent main.
6.92% of the total execution time was spent in code
of the function Func_2.
1.41% of the total execution time was spent in code
of the function Func_1 when called as a child
from Func_2.
14.22% of the total execution time was spent in
code of the function strcmp when called as
a child from Func_2.
The B=>E% column shows that:
22.56% of the total execution time was spent in
calls to the function Func_2 and its children when called from the
parent main.
22.56% of the total execution time was spent in calls to the function Func_2 and its children.
1.41% of the total execution time was spent in calls
to the function Func_1 called as a child from Func_2.
14.22% of the total execution time was spent in
calls to the function strcmp called as a child from Func_2.
The Count column shows that:
There were 5000 calls from the function main to
the function Func_2.
There were 5000 calls to the function Func_2.
There were 5000 calls to the function Func_1 from
the function Func_2.
There were 5000 calls to the function strcmp from
the function Func_2.
You can sum profiling data over additional runs of dhrystone as described in Procedure for summing profiling data over multiple runs.
To sum profiling data over multiple runs, do the following:
If you have not yet done so, perform the procedure described in Procedure for capturing profiling data.
Select Profiling Data → Sum Profiling Data from the Analysis window main menu.
Select Debug → Set PC to Entry Point from the Code window main menu.
If you reload the image, or load a new image, the trace details are cleared and Sum Profiling Data is disabled.
Select Debug → Run from the Code window main menu.
Enter 500 for the number of runs
through the benchmark. The profiling data is updated as shown in Figure 2.51.
If you no longer want to sum profiling data, select Profiling Data → Sum Profiling Data from the Analysis window main menu to deselect it.
Compare Figure 2.51 with Figure 2.48. The summing of profiling data reveals the following information:
The Count column reveals the following:
After the first run:
there were 15000 calls to functions Proc_7, Func_1
there were 5000 calls to all other functions.
After the second run:
there
were 16500 calls to functions Proc_7, Func_1
there were 5500 calls to all other functions.
The B=>E% column reveals the following:
The total execution time after two runs is equal to the total execution time of the first run, plus the total execution time of the second run.
The time taken to execute a particular function after two runs is equal to the time taken to execute it after the first run.
The percentage of the total execution time that is spent in the code of a particular function is the same as that after the first run. However, this might not always be the case.
The Exec% column reveals the following:
The total execution time after two runs is equal to the total execution time of the first run, plus the total execution time of the second run.
The time spent from entry to exit of a particular function after two runs is equal to the time spent from entry to exit of that function after the first run.
The percentage of the total execution time that is spent in calls to a particular function is the same as that after the first run. However, this might not always be the case.