7.4 Stack use in C and C++
C and C++ both use the stack intensively.
For example, the stack holds:
- The return address of functions.
- Registers that must be preserved, as determined by the Arm® Architecture Procedure Call Standard for the Arm® 64-bit
Architecture (AAPCS64), for instance, when register contents are
saved on entry into subroutines.
- Local variables, including local arrays, structures, unions, and in C++, classes.
Some stack usage is not obvious, such as:
- Local integer or floating point variables are allocated stack memory if
they are spilled (that is, not allocated to a register).
- Structures are normally allocated to the stack. A space equivalent to
sizeof(struct) padded to a multiple of 16
bytes is reserved on the stack. The compiler tries to allocate structures to
registers instead.
- If the size of an array is known at compile time, the compiler allocates memory
on the stack. Again, a space equivalent to
sizeof(array) padded to a multiple of 16 bytes is reserved on
the stack.
Note: Memory for variable length arrays is allocated at runtime, on the
- Several optimizations can introduce new temporary variables to hold
intermediate results. The optimizations include common subexpression
elimination (CSE), live range splitting, and structure splitting. The compiler
tries to allocate these temporary variables to registers. If it cannot, it
spills them to the stack.
- Generally, code compiled for processors that support only 16-bit encoded
T32 instructions makes more use of the stack than A64 code, A32 code, and code
compiled for processors that support 32-bit encoded T32 instructions. This is
because 16-bit encoded T32 instructions have only eight registers available for
allocation, compared to fourteen for A32 code and 32-bit encoded T32
instructions.
- The AAPCS64 requires that some function arguments are passed through the
stack instead of the registers, depending on their type, size, and order.
Methods of estimating stack usage
Stack use is difficult to estimate because it is code dependent, and can
vary between runs depending on the code path that the program takes on execution. However,
it is possible to manually estimate the extent of stack utilization using the following
methods:
- Link with
--callgraph to produce a static callgraph. This shows information on all
functions, including stack use.
This uses DWARF frame information from the
.debug_frame section. Compile with the
-g option to generate the necessary DWARF information.
- Link with
--info=summarystack to list the stack usage of all
global symbols.
- Use the debugger to set a watchpoint on the last available location in
the stack and see if the watchpoint is ever hit. Compile with the
-g option to generate the necessary DWARF
information.
- Use the debugger, and:
- Allocate space in memory for the stack that is much larger than you
expect to require.
- Fill the stack space with copies of a known value, for example,
- Run your application, or a fixed portion of it. Aim to use as much
of the stack space as possible in the test run. For example, try to
execute the most deeply nested function calls and the worst case path
found by the static analysis. Try to generate interrupts where
appropriate, so that they are included in the stack trace.
- After your application has finished executing, examine the stack
space of memory to see how many of the known values have been
overwritten. The space has garbage in the used part and the known values
in the remainder.
- Count the number of garbage values and multiply by
sizeof(value), to give their size, in
bytes.
The result of the calculation shows how the size of the stack has
grown, in bytes.
- Use Fixed Virtual Platforms (FVP), and define a region of memory where
access is not allowed directly below your stack in memory, with a map file. If
the stack overflows into the forbidden region, a data abort occurs, which can be
trapped by the debugger.
Methods of reducing stack usage
In general, you can lower the stack requirements of your program by:
- Writing small functions that only require a small number of
variables.
- Avoiding the use of large local structures or arrays.
- Avoiding recursion, for example, by using an alternative algorithm.
- Minimizing the number of variables that are in use at any given time at
each point in a function.
- Using C block scope and declaring variables only where they are
required, so that the memory used by distinct scopes can overlap.