2.3.9. Controlling code generation

Use the options described in this section to control aspects of the code generated by the compiler such as optimization. See Pragmas for information on additional code generation options that are controlled using pragmas.

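This section describes:

  • Defining optimization criteria

  • Setting the default type of unqualified floating-point constants

  • Controlling code and data sections

  • Setting byte order

  • Setting alignment options

  • Controlling implementation details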

Defining optimization criteria

The following options control aspects of how the compilers optimize generated code.

-Onumber

This option specifies the level of optimization to be used. The optimization levels are:

-O0

Turns off all optimization, except some simple source transformations. This is the default optimization level if debug tables are generated with -g. It gives the best possible debug view and the lowest level of optimization.

-O1

Turns off optimizations that seriously degrade the debug view. If used with -g, this option gives a satisfactory debug view with good code density.

-O2

Generates fully optimized code. If used with -g, the debug view might be less satisfactory because the mapping of object code to source code is not always clear. This is the default optimization level if debug tables are not generated.

See Pragmas for information on controlling optimization with pragmas.

-Ospace

This option optimizes to reduce image size at the expense of a possible increase in execution time. For example, large structure copies are done by out-of-line function calls instead of inline code. Use this option if code size is more critical than performance. This is the default.

-Otime

This option optimizes to reduce execution time at the possible expense of a larger image. Use this option if execution time is more critical than code size. For example, it compiles:

while (expression) body;

as:

if (expression) {
	do body;
	while (expression);
}
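The same transformation can be seen on a concrete loop. The sketch below is illustrative only: the function names sum_while and sum_rotated are hypothetical, and the second function is a hand-written equivalent of the form described above, not actual compiler output.

```c
#include <stddef.h>

/* Plain while loop: the condition is tested at the top of every iteration. */
unsigned sum_while(const unsigned *a, size_t n)
{
    unsigned total = 0;
    size_t i = 0;

    while (i < n) {
        total += a[i];
        i++;
    }
    return total;
}

/* Hand-written equivalent of the rotated form: one test guards entry,
 * then the condition is tested at the bottom of the loop body. */
unsigned sum_rotated(const unsigned *a, size_t n)
{
    unsigned total = 0;
    size_t i = 0;

    if (i < n) {
        do {
            total += a[i];
            i++;
        } while (i < n);
    }
    return total;
}
```

Both functions return the same result for any input, including an empty array. The rotated form duplicates the condition test, which is why it can cost code size.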

If you specify neither -Otime nor -Ospace, the compiler uses -Ospace. You can compile time-critical parts of your code with -Otime, and the rest with -Ospace. You must not specify both -Otime and -Ospace in the same compiler invocation.

-Ono_inline

This option disables inlining of functions. Calls to inline functions are not expanded inline. You can use this option to help debug inline functions.

-Oinline

This option enables the compiler to inline functions. This is the default.

The compiler inlines functions when it is sensible to do so:

  • Automatically, for optimization level O2 unless the -Ono_autoinline option is specified.

  • When the function is qualified as an inline function, for example with the __inline keyword in C or the inline keyword in C++. This applies for all optimization levels. Functions qualified as inline functions are more likely to be inlined, but the qualifier is only a hint to the compiler. See Function keywords.

The compiler changes its criteria for inlining functions depending on whether you select -Ospace or -Otime. Selecting -Otime increases the number of functions that are inlined.
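As an illustration of a function qualified as inline, the sketch below marks a small helper as an inlining candidate. It assumes a compiler that accepts the C99 inline keyword; with the C compiler described here, the text above names __inline as the equivalent qualifier. The function names clamp_to_byte and brighten are hypothetical.

```c
/* Small helper that is a natural inlining candidate. The qualifier is
 * only a hint: the compiler may still emit an ordinary call. */
static inline int clamp_to_byte(int value)
{
    if (value < 0) {
        return 0;
    }
    if (value > 255) {
        return 255;
    }
    return value;
}

/* Caller: the result is the same whether or not the call to
 * clamp_to_byte() is expanded inline. */
int brighten(int pixel, int delta)
{
    return clamp_to_byte(pixel + delta);
}
```

Whether the call is expanded inline does not change the result of brighten(). It affects only code size, speed, and the debug view.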

Setting breakpoints in ROM images

When you set a breakpoint on an inline function, the ARM debuggers attempt to set a breakpoint on each inlined instance of that function. If you are using Multi-ICE® or other hardware to debug an image in ROM, and the number of inline instances is greater than the number of available hardware breakpoints, the debugger cannot set the additional breakpoints and reports an error.

-Ono_autoinline

This option disables automatic inlining. This is the default for optimization levels -O1 and -O0 if -Oinline is enabled.

-Oautoinline

This option enables automatic inlining. It is off by default for optimization levels -O0 and -O1, and on by default for optimization level -O2. The compiler automatically inlines functions where it is sensible to do so. The -Ospace and -Otime options influence how the compiler automatically inlines functions.

-Ono_ldrd

This option disables optimizations specific to ARM Architecture v5TE processors. This is the default.

-Ono_data_reorder

This option disables automatic reordering of top-level data items (globals, for example). The C/C++ compilers save memory by eliminating wasted space between data items. However, this optimization can break legacy code, if the code (incorrectly) makes assumptions about ordering of data by the compiler. The C standard does not guarantee data order, so you must avoid writing code that depends on any assumed ordering. If you require data ordering, place the data items into a structure.
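The following sketch shows the structure-based approach recommended above. The structure name device_regs and its members are hypothetical. The C standard guarantees that structure members are laid out in declaration order, which is not guaranteed for separate top-level variables.

```c
#include <stddef.h>

/* Items that must keep a fixed relative order are grouped in a structure
 * instead of being declared as separate globals. */
struct device_regs {
    unsigned int control;
    unsigned int status;
    unsigned int data;
};

struct device_regs regs;

/* Returns 1 if the members appear in memory in declaration order. */
int members_in_declaration_order(void)
{
    return offsetof(struct device_regs, control) < offsetof(struct device_regs, status)
        && offsetof(struct device_regs, status) < offsetof(struct device_regs, data);
}
```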

-Oldrd

This option enables optimizations specific to ARM Architecture v5TE processors. If you select this option, and select an Architecture v5TE -cpu option such as -cpu xscale, the compiler:

  • Generates LDRD and STRD instructions where appropriate.

  • Sets the natural alignment of double and long long variables to eight. This is equivalent to specifying __align(8) for each variable.

    Note

    If you select this option, the output object is marked as requiring 8-byte alignment. This means that it is unlikely to link with objects built with versions of ADS earlier than 1.1.

-split_ldm

This option instructs the compiler to split LDM and STM instructions into two or more LDM or STM instructions, where required, to reduce the maximum number of registers transferred to:

  • five, for all STMs, and for LDMs that do not load the PC

  • four, for LDMs that load the PC.

This option can reduce interrupt latency on ARM systems that:

  • do not have a cache or a write buffer (for example, a cacheless ARM7TDMI)

  • use zero-wait-state, 32-bit memory.

Note

Using this option increases code size and decreases performance slightly.

This option does not split ARM inline assembly LDM or STM instructions, or VFP FLDM or FSTM instructions, but does split Thumb LDM and STM inline assembly instructions where possible. Using inline Thumb assembly routines, however, is deprecated and generates a warning message.

This option has no significant benefit for cached systems, or for processors with a write buffer.

This option also has no benefit for systems with non-zero-wait-state memory, or for systems with slow peripheral devices. Interrupt latency in such systems is determined by the number of cycles required for the slowest memory or peripheral access. This is typically much greater than the latency introduced by multiple register transfers.

Setting the default type of unqualified floating-point constants

-auto_float_constants

This option changes the type of unsuffixed floating-point constants from double (as specified by the ANSI/ISO C and C++ standards) to unspecified. In this context, unspecified means that uncast double constants and double constant expressions are treated as float when used in expressions with values other than double. This can sometimes improve the execution speed of a program that uses float variables.

Compile-time evaluation of constant expressions that contain such constants is unchanged. The compiler uses double-precision calculations, but the unspecified type is preserved. For example:

(1.0 + 1.0) // evaluates to the floating-point 
            // constant 2.0 of double precision and
            // unspecified type.

In a binary expression that must be evaluated at runtime (including expressions that use the ?: operator), a constant of unspecified type is converted to float, instead of double. The compiler issues the following warning:


C2621W: double constant automatically converted to float

You can avoid this warning by explicitly suffixing floating-point constants that you want to be treated as float with an f as shown in Example 2.1. You can turn this warning off with the -Wk compiler option.

Note

This behavior is not in accordance with the ANSI C standard.

If the other operand in the expression has type double, a constant of unspecified type is converted to double. A cast of a constant of unspecified type to type T produces a constant of type T (Example 2.1).

Example 2.1. Double and float

float f1(float x) { return x + 1.0; }  // Uses float add and is treated the same
                                       // as f2() below. A warning is issued.
float f2(float x) { return x + 1.0f;}  // Uses float add with no warning, with
                                       // or without -auto_float_constants.
float f3(double x) { return x + 1.0;}  // Uses double add, 
                                       // no special treatment.
float f4(float x) { return x + (double)1.0;}  // Uses double add,
                                              // no special treatment.
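For comparison, the sketch below shows the standard behavior that this option departs from. It assumes a standard-conforming compiler used without -auto_float_constants, and the function names are hypothetical. It uses sizeof to show the type of each expression without evaluating it.

```c
#include <stddef.h>

/* float + unsuffixed constant: under standard C the constant is a double,
 * so the expression has type double. */
size_t size_of_sum_with_unsuffixed(void)
{
    float x = 1.5f;
    return sizeof(x + 1.0);
}

/* float + f-suffixed constant: both operands are float, so the
 * expression has type float. */
size_t size_of_sum_with_suffixed(void)
{
    float x = 1.5f;
    return sizeof(x + 1.0f);
}
```

Under these assumptions the first function returns sizeof(double) and the second returns sizeof(float). Suffixing the constant with f, as in f2() above, gives float arithmetic regardless of the option.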

Controlling code and data sections

-zo

This option generates one ELF section for each function in the source file. Output sections are named with the same name as the function that generates the section. For example:

int f(int x) { return x+1; }

compiled with -zo gives:

        AREA ||i.f||, CODE, READONLY
f PROC
        ADD      r0,r0,#1
        MOV      pc,lr

This option enables the linker to remove unused functions when the default -remove linker option is active. This option increases code size slightly (typically by a few percent) for some functions because it reduces the potential for sharing addresses, data, and string literals between functions. However, when creating code for a library, it can prevent unused functions being included at the link stage. This can reduce the final image size.

The option can be used with a linker scatter-loading description file to place some functions in fast memory and others in slow memory (see the section on scatter-loading files in the ADS Linker and Utilities Guide). You can also use a scatter-loading file to place a function at a particular address in memory.

If you are using third-party code, you do not have to change the source, but you must recompile (unless the code was already compiled with the -zo option).

#pragma arm section

This pragma specifies the code or data section name used for subsequent functions or objects. This includes definitions of anonymous objects the compiler creates for initializations.

Use a scatter-loading description file with the linker to control placing a named section at a particular address in memory (see Pragmas controlling code generation and the ADS Linker and Utilities Guide).

Setting byte order

-littleend

This option generates code for an ARM processor using little-endian memory. With little-endian memory, the least significant byte of a word has lowest address. This is the default.

-bigend

This option generates code for an ARM processor using big-endian memory. With big-endian memory, the most significant byte of a word has lowest address.
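The difference between the two byte orders can be observed from C by reading the byte at the lowest address of a word. The sketch below is illustrative, the function names are hypothetical, and it assumes that unsigned int is 32 bits wide.

```c
/* Returns the byte stored at the lowest address of a word. */
unsigned char byte_at_lowest_address(unsigned int word)
{
    const unsigned char *bytes = (const unsigned char *)&word;
    return bytes[0];
}

/* Returns 1 on a little-endian target, where the least significant
 * byte of a word has the lowest address, and 0 otherwise. */
int target_is_little_endian(void)
{
    return byte_at_lowest_address(1u) == 1u;
}
```

For the word 0x11223344, byte_at_lowest_address() returns 0x44 in code built for little-endian memory and 0x11 in code built for big-endian memory.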

Setting alignment options

-zasNumber

This option specifies the minimum byte alignment for structures. Valid values for Number are:


1, 2, 4, 8

The default is 1. This option is deprecated and will not be supported in future versions of the product.
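To check the effect of a structure alignment setting, you can inspect the alignment and size the compiler actually uses. The sketch below is one portable way to do this in standard C. The structure names sample and probe are hypothetical, and the technique relies on the compiler inserting only the minimum padding before a member, which is the usual behavior but is not guaranteed by the C standard.

```c
#include <stddef.h>

/* Structure whose alignment is to be inspected. */
struct sample {
    char tag;
    short count;
};

/* Wrapper structure: the padding inserted after 'pad' places 'member'
 * on the alignment boundary required by struct sample. */
struct probe {
    char pad;
    struct sample member;
};

/* Alignment of struct sample in bytes, as chosen by the compiler. */
size_t sample_alignment(void)
{
    return offsetof(struct probe, member);
}

/* Size of struct sample, including any trailing padding. */
size_t sample_size(void)
{
    return sizeof(struct sample);
}
```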

-memaccess option

This option indicates to the compiler that the memory in the target system has slightly restricted or expanded capabilities. By default, ARM compilers assume that the memory system can load and store words at 4-byte alignment, halfwords at 2-byte alignment, and bytes. You indicate the load and store capability by specifying one of the following values for option:

+L41

The memory can return the aligned word containing the addressed byte. This is useful only with ARM architecture v3 processors that lack load halfword.

-S22

The memory cannot store halfwords. You can use this to suppress the generation of STRH instructions when generating ARM code for architecture v4 (and later) processors.

-L22

The memory cannot load halfwords. You can use this to suppress the generation of LDRH instructions when generating ARM code for architecture v4 (and later) processors.

Note

Do not use -L22 or -S22 when compiling Thumb code.

It is possible that the processor has memory access modes available that the physical memory lacks (load aligned halfword, for example).

It is also possible that the physical memory has access modes that the processor cannot use (architecture v3 load aligned halfword, for example).

Controlling implementation details

-fy

This option forces all enumerations to be stored in integers. This option is switched off by default, in which case the compiler uses the smallest data type that can hold the values of all enumerators.

Note

This option is not recommended for general use and is not required for ANSI-compatible source.
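The storage size the compiler chooses for an enumeration can be checked with sizeof. In the sketch below, the enumeration color and the function name are hypothetical. The result depends on the compiler and its options: according to the description above, it is the size of an int when -fy is used, and can be smaller otherwise.

```c
#include <stddef.h>

/* All enumerators fit in a single byte, so a compiler that picks the
 * smallest suitable type can store this enumeration in less than an int. */
enum color { RED, GREEN, BLUE };

/* Number of bytes the compiler uses to store an object of type enum color. */
size_t color_storage_size(void)
{
    return sizeof(enum color);
}
```

Code that exchanges enumerations with other modules, for example inside shared structures, must be compiled with the same setting so that both sides agree on this size.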

-zc

This option makes the char type signed. It is normally unsigned.

Note

This option is not recommended for general use and is not required for ANSI-compatible source. If used incorrectly, this option can cause errors in the resulting image.

The sign of char is set by the last option specified that would normally affect it. For example, if you specify both -ansi and -zc options, and you want to make char signed, you must specify the -zc option after the -ansi option.
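The sketch below shows why the signedness of char matters, and how to test for it portably with CHAR_MIN from limits.h. The function names are hypothetical, and the example assumes 8-bit char and two's complement representation.

```c
#include <limits.h>

/* Returns 1 if plain char is signed in this compilation, 0 if unsigned. */
int char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Stores the bit pattern 0xFF in a plain char and widens it to int.
 * The result is -1 when char is signed and 255 when char is unsigned. */
int high_byte_as_int(void)
{
    char c = (char)0xFF;
    return (int)c;
}
```

Code that compares plain char values against constants above 127, or uses them as array indexes, can behave differently when the signedness of char changes. Declaring such variables explicitly as unsigned char or signed char avoids the dependency.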

Copyright © 1999-2001 ARM Limited. All rights reserved. ARM DUI 0067D
Non-Confidential