ARM Technical Support Knowledge Articles

Inlining C/C++ functions

Applies to: ARM Developer Suite (ADS), Software Development Toolkit (SDT)

Inlining is a trade-off between code size and performance. For embedded systems, code size is a major concern, so by default the ARM compilers aim to produce minimal code size.

In standard C++ you can use the inline keyword to hint to the compiler that it should inline the function. Although inline is not part of the original C standard, most vendors' tools allow you to use it in C code as well.

For C, the ARM compilers use __inline to provide equivalent functionality and our implementation is otherwise exactly the same as the standard C++ keyword. The reason for the "__" at the beginning is to maintain strict compatibility with standard C. Consider:

int inline;  /* declares an integer called inline in C, but is an error in C++, where inline is a keyword */
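As a minimal sketch of how the two spellings might be reconciled in source shared between C and C++, the macro below selects the keyword by language. The macro name INLINE_HINT and the function clamp_u8 are hypothetical names chosen for illustration; they are not part of any ARM header. The function is also marked static so that the sketch is valid in a single file.

```c
/* Hypothetical portability macro (an assumption for illustration,
 * not something supplied with the ARM tools). */
#if defined(__cplusplus)
#  define INLINE_HINT inline      /* standard C++ keyword */
#else
#  define INLINE_HINT __inline    /* C spelling used by the ARM compilers */
#endif

/* Clamps a value to the range 0..255. Small functions like this are
 * the usual candidates for inlining, but the keyword is only a hint. */
static INLINE_HINT int clamp_u8(int v)
{
    if (v < 0) {
        return 0;
    }
    if (v > 255) {
        return 255;
    }
    return v;
}
```

A call such as clamp_u8(300) returns 255 whether or not the compiler chooses to inline it; only code size and speed differ.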

By default our compilers decide for themselves whether to inline code or not. The following points should be taken into account when using inlining:

1) __inline/inline is a "hint" not an "order"

__inline (for C) and inline (for C++) are "hints" to the compiler. Our compilers decide for themselves whether the "hint" is sensible, and therefore whether to inline the function, depending on a number of conditions including the size of the function, the current optimization level, and whether you are optimizing for speed (-Otime) or size (-Ospace). The compiler may therefore ultimately ignore the "hint".

In ADS, smaller functions stand a better chance of being inlined. Compiling with -Otime will increase the likelihood that a function will get inlined. Large functions are not normally inlined because this could adversely affect code density and performance. There is an (undocumented) compiler option that will force functions marked as __inline to always be inlined: "--inlinemax=0" (or equivalently "--no_inlinemax"). Note that using this option may significantly increase code size (especially for C++) and may also adversely affect performance.

In SDT, there was no restriction on the maximum length of an inline function - all functions marked __inline were inlined if they could be inlined. Other vendors' tools may, or may not, place limits on the maximum length.

In ADS 1.1 and later the compiler can automatically inline functions when it judges this appropriate, even if the user does not explicitly give the "hint". By default this occurs only at the highest optimization level (-O2). It can be turned on or off using -Oautoinline and -Ono_autoinline. See note 5) below.

2) __inline/inline functions cannot be in other object files

Inlining can only be done on code within a single compilation unit. extern functions will not be inlined. Marking a function __inline means that it cannot be called from another compilation unit. The result is that "extern __inline" may not behave as you expect, due to __inline having the C++ semantics. The C++ standard requires an inline function to be identically defined in each "translation unit" in which it is used, so it is not possible to link to an inline function in another file.

If you want a function to be available for inlining in multiple files, the solution is to place the function in a header (".h") file, marked as "extern __inline", then #include it in each file where it is needed. If the compiler decides not to inline the function in one or more cases, the function is compiled so that only one copy of it will remain after linking.
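The header pattern described above might look like the following sketch. The file name scale.h, the macro SHARED_INLINE and the function adc_to_millivolts are hypothetical. The test on __ARMCC_VERSION is an assumption about how to detect an ARM compiler; on any other compiler the sketch falls back to "static __inline", because the meaning of "extern inline" differs between C dialects.

```c
/* scale.h - hypothetical header shared by several source files */
#ifndef SCALE_H
#define SCALE_H

/* Assumption: __ARMCC_VERSION identifies an ARM compiler that gives
 * "extern __inline" the C++-style semantics described in this note.
 * Elsewhere, fall back to "static __inline", which behaves the same
 * way regardless of the C dialect in use. */
#if defined(__ARMCC_VERSION)
#  define SHARED_INLINE extern __inline
#else
#  define SHARED_INLINE static __inline
#endif

/* Converts a raw 10-bit ADC reading (0..1023) to millivolts,
 * assuming a 3300 mV reference (values chosen for illustration). */
SHARED_INLINE int adc_to_millivolts(int raw)
{
    return (raw * 3300) / 1023;
}

#endif /* SCALE_H */
```

Each source file that needs the function would #include this header, so the full definition is visible in every compilation unit where a call might be inlined.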

Functions that are only called locally ("static") should never be placed in a header file - these "static inline" functions cannot be shared, so multiple copies may exist after linking. Marking a function local to a compilation unit "static inline" instead of just "inline" is not necessary, but is good coding style because the compiler can apply additional optimizations.

3) __inline/inline functions may need to be defined before they are "called" (ADS 1.0.1 and earlier)

The compiler needs to place compiled code for the inline function into other functions as it compiles them. Older versions of our compilers (SDT 2.5x and ADS 1.0.1) perform a single-pass compilation. This means that if the compiler has not yet seen the actual code for the __inline function, it cannot inline it.

ADS 1.1 and later use a two-pass compilation process. This allows the function to be inlined even if it appears later in the (same) source file than where it is to be used.

Bjarne Stroustrup (who 'invented' C++) noted: "To make inlining possible in the absence of unusually clever compilation and linking facilities, the definition - and not just the declaration - of an inline function must be in scope".
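As an illustrative sketch, the ordering below keeps the definition of the inline function ahead of its first call, so that it can be inlined by both single-pass and two-pass compilers. The function names rotl1_u8 and checksum are hypothetical.

```c
/* Definition first: a single-pass compiler has already seen the body
 * of the inline function by the time it reaches the call below.
 * Rotates an 8-bit value (0..255) left by one bit. */
static __inline unsigned rotl1_u8(unsigned v)
{
    return ((v << 1) | (v >> 7)) & 0xFFu;
}

/* The caller is placed after the definition. With ADS 1.1 and later
 * the order would not matter, but this layout also suits the older
 * single-pass compilers. */
unsigned checksum(const unsigned char *data, int len)
{
    unsigned sum = 0;
    int i;

    for (i = 0; i < len; i++) {
        sum = rotl1_u8(sum) ^ data[i];
    }
    return sum;
}
```

Reversing the order, so that checksum comes before rotl1_u8 with only a prototype in scope, would still be valid C, but on the older compilers the call could not be inlined.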

4) Debug data

It is quite complex to provide full debug information for code which appears in multiple places, as is the case with inlined functions. Consequently there is a trade-off to be made between being able to debug functions that are declared "inline" and actually placing them "inline".

In SDT 2.5x, debug data is not generated for inline functions, unless you use #pragma debug_inlines. See SDT 2.51 Reference Guide, page 3-4, and the SDT FAQ entry "armcc/tcc Source-level debugging __inline functions".

In ADS, there are the -Oinline and -Ono_inline command line options that offer similar functionality. Use of -Ono_inline can help when debugging inline functions at interleaved C/assembler level.

5) Mark functions not called from other modules as 'static'

As described in note 1) above, in ADS 1.1 and later the compiler can automatically inline functions when it judges this appropriate, even if the function is not marked as __inline/inline. By default this occurs only at the highest optimization level (-O2). It can be turned on or off using -Oautoinline and -Ono_autoinline.

If a function is automatically inlined and is not declared as 'static', be aware that both the in-line and out-of-line versions of the function may end up in the final image, which can increase code size.

Unless a function is declared as 'static' (or __inline), the compiler has to retain the out-of-line version of it in the object file in case it might be called from some other module. The linker is unable to remove unused out-of-line functions from an object, unless code is compiled with -zo (one function per area). To avoid this duplication, functions which you know are never called from outside a module should be marked as 'static'.

Having both in-line and out-of-line copies of a function in code can also result in a more complex debug view, which can be confusing when setting breakpoints or single-stepping. The debugger has to display both in-line and out-of-line versions in its interleaved source view, so that users can see what is happening when stepping through either version.

In general, if you are sure that a function will never be called from another module, it should be declared as 'static'. This has two effects:

a) smaller code size (no out-of-line version retained in image), and
b) a simpler debug view (no out-of-line version to display)
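The recommendation above can be sketched as follows. The module name filter.c and the functions average2 and smooth3 are hypothetical, chosen only to show one internal helper and one externally visible function.

```c
/* filter.c - hypothetical module */

/* Internal helper: 'static' tells the compiler that no other module
 * can call this function, so no out-of-line copy has to be retained
 * if every call to it is inlined. */
static int average2(int a, int b)
{
    return (a + b) / 2;
}

/* Public entry point: it has external linkage, so the compiler must
 * keep an out-of-line version in case another module calls it. */
int smooth3(int prev, int cur, int next)
{
    return average2(average2(prev, cur), average2(cur, next));
}
```

If average2 were not declared static and the compiler inlined it automatically into smooth3, an unused out-of-line copy of average2 could also remain in the object file.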

6) Setting breakpoints on inlined functions in ROM images

When you set a breakpoint on an inline function, the ARM debuggers attempt to set a breakpoint on each inlined instance of that function. If you are using Multi-ICE or other hardware to debug an image in ROM, and the number of inline instances is greater than the number of available hardware breakpoints, the debugger may not be able to set the additional breakpoints and will then report an error.

Article last edited on: 2008-09-09 15:47:27

Copyright © 2008-9 ARM Limited. All rights reserved. External (Open), Non-Confidential