Use the options described in this section to control aspects of the code generated by the compiler such as optimization. See Pragmas for information on other code generation options that are controlled using pragmas.
These options control the target instruction set:
--arm
Configures the compiler to target the ARM instruction set. This is the default.
--thumb
Configures the compiler to target the Thumb instruction set. This predefines __thumb and __thumb__.
Also, see the descriptions of #pragma arm and #pragma
thumb in Pragmas controlling code generation.
These pragmas enable you to compile specific functions for ARM or
Thumb.
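As a rough illustration of these pragmas, the following sketch places one ARM function and one Thumb function in the same source file. The function names are hypothetical, and the pragmas are guarded with __CC_ARM on the assumption that the file might also be built with a compiler other than armcc.

```c
/* Sketch only: add_arm, add_thumb, and built_for_thumb are hypothetical names. */

/* Reports whether the compilation targets Thumb on the command line;
   --thumb predefines __thumb__. */
int built_for_thumb(void)
{
#if defined(__thumb__)
    return 1;
#else
    return 0;
#endif
}

#if defined(__CC_ARM)
#pragma arm    /* functions that follow are compiled for ARM */
#endif
int add_arm(int a, int b)
{
    return a + b;
}

#if defined(__CC_ARM)
#pragma thumb  /* functions that follow are compiled for Thumb */
#endif
int add_thumb(int a, int b)
{
    return a + b;
}
```

Both functions behave identically at the C level; only the instruction set selected for each one differs.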
If you are compiling code that is intended for mixed ARM/Thumb
systems for processors that support ARMv4T or ARMv5, then you must
specify the interworking option --apcs /interwork. This
is enabled by default for processors that support ARMv5 or above. See Interworking qualifiers for more details.
Interworking is described in detail in RealView Compilation
Tools v3.0 Developer Guide.
If you enter armcc --thumb --fpu vfp on
the command line, the compiler compiles as much of the code using
the Thumb instruction set as possible. However, the compiler might
generate ARM code for some parts of the compilation.
If you enter armcc --thumb on the command
line, the compiler compiles as much of the code using the Thumb
instruction set as possible. However, the compiler might generate ARM
code for some parts of the compilation. In particular, if you are
compiling code for a pre-Thumb-2 processor and using VFP, any function
containing floating-point operations is compiled for ARM.
See details on the --fpu name argument in Specifying the target processor or architecture.
These options control endianness:
--littleend
Generates code for an ARM processor using little-endian memory. With little-endian memory, the least significant byte of a word has the lowest address. This is the default.
--bigend
Generates code for an ARM processor using big-endian memory. With big-endian memory, the most significant byte of a word has the lowest address.
Choose between Byte Invariant Addressing mode and Word Invariant
Addressing mode at link time with the armlink command-line
options --be8 and --be32 (see RealView Compilation
Tools v3.0 Linker and Utilities Guide for details).
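The byte-ordering rule described above can be checked at runtime with a small test. This is a minimal sketch with hypothetical function names, not part of the toolchain:

```c
#include <stdint.h>

/* Returns the byte stored at the lowest address of a 32-bit word. */
unsigned char lowest_address_byte(uint32_t word)
{
    const unsigned char *bytes = (const unsigned char *)&word;
    return bytes[0];
}

/* Returns 1 when the least significant byte has the lowest address
   (little-endian memory), and 0 otherwise. */
int is_little_endian(void)
{
    return lowest_address_byte(0x01020304u) == 0x04;
}
```

Code built with --littleend is expected to see 0x04 at the lowest address, and code built with --bigend is expected to see 0x01.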
The optimization options can be grouped into:
This section describes how to control multiple optimizations with a single option.
You can also apply the -Onum, -Ospace, and -Otime optimizations to individual functions using pragmas. See Pragmas controlling multiple optimizations for more information.
The optimization options prefixed by -O are
specified using lowercase. However, the -O prefix
must be uppercase.
The multi-optimization options are:
-Onum
Specifies the level of optimization to be used:
-O0
Minimum optimization. Turns off most optimizations. It gives the best possible debug view and the lowest level of optimization.
-O1
Restricted optimization. Removes unused inline functions and unused static functions. Turns off optimizations that seriously degrade the debug view. If used with --debug (see Debug table generation options), this option gives a satisfactory debug view with good code density.
-O2
High optimization. If used with --debug (see Debug table generation options), the debug view might be less satisfactory because the mapping of object code to source code is not always clear.
This is the default optimization level.
-O3
Maximum optimization. -O3 performs the same optimizations as -O2; however, the balance between space and time optimizations in the generated code is more heavily weighted towards space or time compared with -O2.
That is:
-O3 -Otime aims
to produce faster code than -O2 -Otime, at the
risk of increasing your image size
-O3 -Ospace aims to produce smaller
code than -O2 -Ospace, but performance might be
degraded.
In addition, -O3 performs extra optimizations
that are more aggressive, such as:
High-level scalar optimizations, including loop unrolling. This can give significant performance benefits at a small code size cost, but at the risk of a slower build time.
More aggressive inlining and automatic inlining
for -O3 -Otime.
Multifile compilation by default (see Multifile compilation).
For floating-point code, -O3 is not necessarily
ISO C and C++ standard-compliant. Use -O3 --fpmode=std to
ensure ISO compliance. See the description of --fpmode for
more information.
Do not rely on the implementation details of these optimizations, because they might change in future releases.
-Ospace
Instructs the compiler to perform optimizations to reduce image size at the expense of a possible increase in execution time. For example, large structure copies are done by out-of-line function calls instead of inline code. Use this option if code size is more critical than performance. This is the default.
-Otime
Instructs the compiler to perform optimizations to reduce execution time at the possible expense of a larger image. Use this option if execution time is more critical than code size. For example, it compiles:
while (expression) body;
as:
if (expression) { do body; while (expression); }
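To make the transformation concrete, the following sketch writes the same loop in both forms by hand. The function names are hypothetical. The second form tests the condition once before entering the loop and then at the bottom of each iteration, which typically avoids a branch back to the top of the loop.

```c
/* Original form: the condition is tested at the top of every iteration. */
int sum_while(const int *values, int count)
{
    int i = 0;
    int total = 0;
    while (i < count) {
        total += values[i];
        i++;
    }
    return total;
}

/* Transformed form: one guard test, then a bottom-tested loop. */
int sum_do_while(const int *values, int count)
{
    int i = 0;
    int total = 0;
    if (i < count) {
        do {
            total += values[i];
            i++;
        } while (i < count);
    }
    return total;
}
```

Both functions return the same result for any count, including zero, because the guard test preserves the behavior of the original loop when the condition is initially false.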
If you specify neither -Otime nor -Ospace,
the compiler uses -Ospace. You can compile time-critical
parts of your code with -Otime, and the rest with -Ospace.
If you specify both -Otime and -Ospace in
the same compiler invocation, the last one wins (see Ordering command-line options).
--feedback filename
Specifies the feedback file created by a previous execution of the ARM linker. The file contains a list of functions that the linker identifies as being unused in your code. The contents of this file are optimization hints only. These hints might be ignored by the compiler. Therefore, this is a safe optimization.
See Linker feedback for more details.
It is recommended that you use linker feedback in preference to the --split_sections option (formerly -zo) for removing unused functions. This is because linker feedback produces smaller code, by avoiding the overhead of splitting all sections.
--fpmode model
Specifies the floating-point conformance, and sets library attributes and floating-point optimizations. model can be one of:
ieee_full
All facilities, operations, and representations guaranteed by the IEEE standard are available in single and double-precision. Modes of operation can be selected dynamically at runtime.
This defines the symbols:
__FP_IEEE
__FP_FENV_EXCEPTIONS
__FP_FENV_ROUNDING
__FP_INEXACT_EXCEPTION
ieee_fixed
IEEE standard with round-to-nearest and no inexact exceptions.
This defines the symbols:
__FP_IEEE
__FP_FENV_EXCEPTIONS
ieee_no_fenv
IEEE standard with round-to-nearest and no exceptions. This mode is stateless and is compatible with the Java floating-point arithmetic model.
This defines the symbol __FP_IEEE.
std
IEEE finite values with denormals flushed to zero, round-to-nearest, and no exceptions. This is compatible with standard C and C++ and is the default option.
Normal finite values are as predicted by the IEEE standard. However:
NaNs and infinities might not be produced in all circumstances defined by the IEEE model. Also, when they are produced, they might not have the same sign.
The sign of zero might not be that predicted by the IEEE model.
fast
Performs more aggressive floating-point optimizations that might cause a small loss of accuracy to provide a significant performance increase. This option defines the symbol __FP_FAST.
This option results in behavior that is not fully compliant with the ISO C or C++ standard. However, numerically robust floating-point programs will behave correctly.
A number of transformations might be performed, including:
Double-precision math functions might be converted to single precision equivalents if all floating-point arguments can be exactly represented as single precision values, and the result is immediately converted to a single-precision value.
This transformation is only performed when the selected library
contains the single-precision equivalent functions, for example,
when the selected library is rvct or aeabi_glibc (for
more information, see the description of --library_interface in Single-optimization options).
For example:
float f(float a) { return sqrt(a); }
is transformed to
float f(float a) { return sqrtf(a); }.
Double-precision floating-point expressions that
are narrowed to single-precision are evaluated in single-precision
when it is beneficial to do so. For example, float y
= (float)(x + 1.0) is evaluated as float y
= (float)x + 1.0f.
Division by a floating-point constant is replaced
by multiplication with the inverse. For example, x / 3.0 is evaluated
as x * (1.0 / 3.0).
It is not guaranteed that the value of errno is
compliant with the ISO C or C++ standard after math functions have
been called. This enables the compiler to inline the VFP square
root instructions in place of calls to sqrt() or sqrtf().
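The following sketch writes out two of the transformations listed above by hand, so that the original and transformed forms can be compared. The function names are hypothetical. Whether the compiler actually applies either transformation to a given expression is not something this example can show.

```c
/* Division by a constant, as written in the source. */
double third_by_division(double x)
{
    return x / 3.0;
}

/* Hand-written equivalent of the transformed form: multiply by the inverse. */
double third_by_reciprocal(double x)
{
    return x * (1.0 / 3.0);
}

/* Double-precision addition, narrowed to single precision afterwards. */
float increment_in_double(double x)
{
    return (float)(x + 1.0);
}

/* Hand-written equivalent evaluated in single precision. */
float increment_in_single(double x)
{
    return (float)x + 1.0f;
}

/* Helper: relative difference between a and b, taken against b. */
double relative_difference(double a, double b)
{
    double diff = a - b;
    if (diff < 0.0) {
        diff = -diff;
    }
    if (b < 0.0) {
        b = -b;
    }
    return (b == 0.0) ? diff : diff / b;
}
```

The two division variants can differ in the last bits of the result, because 1.0 / 3.0 is itself rounded before the multiplication. This is the kind of small accuracy loss that the description of fast refers to.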
--multifile
Enables
the compiler to perform optimization across all specified files, instead
of on each individual file. The specified files are compiled into one
single object file. Using --multifile requires
large amounts of memory while compiling. Although there is no limit
to the number of files you can specify on the command line, a practical
limit is 10 source files.
--multifile is on by default for optimization
level -O3.
For more details on multifile compilation, see Multifile compilation.
--vfe --no_vfe
Enables or disables unused virtual function elimination
(VFE) in C++ mode. --vfe is the default, except
for the case where legacy object files compiled with a pre-RVCT
v2.1 compiler do not contain VFE information.
When VFE is enabled, the compiler places the information in
special sections with the prefix .arm_vfe_. These
sections are harmless to a linker that is not VFE-aware, because
they are not referenced by the rest of the code. Therefore, they
do not increase the size of the executable. However, they increase
the size of the object files. If this is a problem, then specify --no_vfe.
For more details on VFE, and the associated linker options, see RealView Compilation Tools v3.0 Linker and Utilities Guide. Also, see Calling a pure virtual function for more information on pure virtual functions.
This section describes how to have individual control of the compiler optimizations:
--autoinline
--no_autoinline
Enables or disables
automatic inlining. --no_autoinline is the default
for optimization levels -O0 and -O1,
and --autoinline is the default for optimization
levels -O2 and -O3 (see Multi-optimization options).
The compiler automatically inlines functions where it is sensible
to do so. The -Ospace and -Otime options
influence how the compiler automatically inlines functions. Selecting -Otime increases
the likelihood that functions are inlined.
--data_reorder --no_data_reorder
Enables or disables automatic reordering of top-level
data items (globals, for example). The compiler can save memory
by eliminating wasted space between data items. However, --data_reorder can
break legacy code, if the code makes invalid assumptions about ordering
of data by the compiler.
The ISO C Standard does not guarantee data order, so you must avoid writing code that depends on any assumed ordering. If you require data ordering, place the data items into a structure.
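As a minimal sketch of this advice, the following code contrasts two separate globals, whose relative placement is not guaranteed, with two members of a structure, which are laid out in declaration order. All names are hypothetical.

```c
#include <stddef.h>

/* Separate top-level items: the compiler is free to reorder these. */
int first_value;
int second_value;

/* Members of a structure keep their declaration order. */
struct ordered_pair {
    int first;
    int second;
};

struct ordered_pair pair;

/* Returns 1 when 'second' is placed after 'first' inside the structure. */
int members_are_ordered(void)
{
    return offsetof(struct ordered_pair, first)
         < offsetof(struct ordered_pair, second);
}
```

Code that needs second to follow first in memory can rely on pair, but must not rely on the addresses of first_value and second_value.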
--forceinline
If used, the compiler always attempts to inline
those functions marked as __inline, if possible.
The compiler attempts to inline the function, regardless of the
characteristics of the function. However, it does not inline a function
if doing so causes problems, for example, a recursive function is
inlined only once.
If you want to force specific functions to be inlined, use
the __forceinline function storage class modifier
(see Function storage class qualifiers).
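A possible way to use __forceinline on a single function while keeping the source buildable with other compilers is sketched below. The ALWAYS_INLINE macro and the function names are hypothetical, and the example assumes that __CC_ARM identifies armcc.

```c
/* Hypothetical helper macro: __forceinline under armcc,
   plain static inline elsewhere. */
#if defined(__CC_ARM)
#define ALWAYS_INLINE __forceinline static
#else
#define ALWAYS_INLINE static inline
#endif

/* Small helper that is a natural candidate for inlining. */
ALWAYS_INLINE int square(int x)
{
    return x * x;
}

/* Caller: each call to square() is expected to be expanded in place. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

Unlike the --forceinline option, which affects every function marked __inline, this approach limits the request to the functions you mark explicitly.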
--no_inline
Disables inlining of functions (see --inline).
Calls to inline functions are not expanded inline. You can use this
option to help debug inline functions.
If a function is declared inline, then it is compiled out-of-line
into a common code section. Functions marked as __forceinline are
still expanded inline (see Function storage class qualifiers).
--inline
Enables the compiler to inline functions. This is the default.
The compiler inlines functions as follows:
Automatically, for optimization levels -O2 and -O3 (see Multi-optimization options), unless you
use the option --no_autoinline.
When the function is qualified as an inline function, that is, with the __inline keyword in C, the __forceinline keyword in C and C++, or the inline keyword in C++. This applies for all optimization levels. Functions that are explicitly qualified as inline functions are more likely to be inlined. However, using the inline qualifier does not guarantee that functions are inlined. See Function keywords. Also, see the description of --forceinline.
The compiler changes the criteria for inlining functions depending
on whether you select -Ospace or -Otime.
Select -Otime to increase the likelihood that a
function is inlined. See Multi-optimization options for more details.
Sometimes, an out-of-line copy of an inlined function might remain in an object or image, even though that code is no longer used. Linker feedback enables you to detect and remove any unused code fragments. See Linker feedback.
When you set a breakpoint on an inline function, an ARM debugger attempts to set a breakpoint on each inlined instance of that function. If you are using Multi-ICE®, RealView ICE, or other hardware to debug an image in ROM, and the number of inline instances is greater than the number of available hardware breakpoints, the debugger cannot set the additional breakpoints and reports an error.
--lower_ropi --no_lower_ropi
Enables or disables less restrictive C in ROPI mode.
See Position independence
qualifiers for details
of the /ropi option.
If you compile with --lower_ropi, then the
static initialization is done at runtime by the C++ constructor
mechanism, even for C. This enables these static initializations
to work with ROPI code.
--lower_rwpi --no_lower_rwpi
Enables or disables less restrictive C and C++ in
RWPI mode. --lower_rwpi is the default. See Position independence
qualifiers for details of
the /rwpi option.
If you compile with --lower_rwpi, then the
static initialization is done at runtime by the C++ constructor
mechanism, even for C. This enables these static initializations
to work with RWPI code.
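The following sketch shows the kind of static initialization that these options are concerned with. It rests on an assumption that the text above does not state: in a position-independent build, the address of a data item is not a link-time constant, so an initializer that takes such an address has to be evaluated at runtime. All names are hypothetical.

```c
/* Read-write data: under RWPI its address is only known at runtime
   (assumption stated in the lead-in). */
int counter;

/* Static initializer that takes the address of read-write data. */
int *counter_ptr = &counter;

/* Read-only data: the analogous case for ROPI. */
const int limit = 10;

/* Static initializer that takes the address of read-only data. */
const int *limit_ptr = &limit;

/* Increments the counter through the pointer, up to the limit. */
int bump_counter(void)
{
    if (*counter_ptr < *limit_ptr) {
        (*counter_ptr)++;
    }
    return counter;
}
```

In a build that is not position-independent, both initializers are ordinary address constants and need no runtime support.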
--split_ldm
By default, the maximum number of registers that the compiler transfers in a single LDM or STM instruction is:
16, for ARM instructions
15, for 32-bit Thumb-2 instructions
eight, for 16-bit Thumb and Thumb-2 instructions.
The --split_ldm option instructs the compiler
to split LDM and STM instructions into
two or more LDM or STM instructions, where
required, to reduce the maximum number of registers transferred
to:
five, for all STMs,
and for LDMs that do not load the PC
four, for LDMs that load the PC.
Inline assembler LDM and STM instructions
are split by default. However, the compiler might subsequently recombine
the separate instructions into an LDM or STM (see Instruction expansion for more details).
The --split_ldm option has the following
effects:
It can reduce interrupt latency on ARM systems that:
do not have a cache or a write buffer (for example, a cacheless ARM7TDMI)
use zero-wait-state, 32-bit memory.
Using --split_ldm increases code size and
decreases performance slightly.
It does not split VFP FLDM or FSTM instructions.
There are some systems that do not benefit from being built
with --split_ldm:
It has no significant benefit for cached systems, or for processors with a write buffer.
It has no benefit for systems with nonzero-wait-state memory, or for systems with slow peripheral devices. Interrupt latency in such systems is determined by the number of cycles required for the slowest memory or peripheral access. Typically, this is much greater than the latency introduced by multiple register transfers.
--library_interface=lib
Specifies that the compiler output works with the
RVCT libraries or with any AEABI-compliant library. lib can
be one of:
rvct
Specifies that the compiler output works with the RVCT runtime libraries. Use this option to exploit the full range of compiler optimizations when linking. This is the default.
aeabi_clib
Specifies that the compiler output works with any AEABI-compliant C library.
aeabi_glibc
Specifies that the compiler output works with an AEABI-compliant version of the GNU C library.
Use this option when linking with any ABI-compliant third-party libraries, or where your code includes replacement functions, for example, when using an embedded operating system. In this case, use this option to disable the compiler variants, for example, if you are re-implementing functions such as printf or scanf. This option ensures that the compiler does not generate calls to any optimized functions. See ABI for the ARM Architecture compliance for more details.
--split_sections
Generates one ELF section for each function in the
source file. Output sections are named with the same name as the
function that generates the section, but with an i. prefix.
For example:
int f(int x) { return x+1; }
compiled with --split_sections gives:
AREA ||i.f||, CODE, READONLY
f PROC
ADD r0,r0,#1
MOV pc,lr
This option increases code size slightly (typically by a few percent) for some functions because it reduces the potential for sharing addresses, data, and string literals between functions.
If you want to remove unused functions, it is recommended that you use the linker feedback optimization in preference to this option. This is because linker feedback produces smaller code, by avoiding the overhead of splitting all sections. See Linker feedback for more details.
This section describes how to control symbol visibility:
--export_defs_implicitly
Enables you to control how dynamic symbols are exported.
Use this option to export definitions where the prototype was marked __declspec(dllimport).
See Storage class modifiers for
details on __declspec(dllimport).
--dllexport_all
Enables you to control symbol visibility when building
DLLs. Use this option to mark all extern definitions
as __declspec(dllexport).
See Storage class modifiers for
details on __declspec(dllexport).
--no_hide_all
Enables you to control symbol visibility when building
SVr4 shared objects. Use this option to mark all extern definitions
as __declspec(dllexport), and to import all undefined
references.
See Storage class modifiers for
details on __declspec(dllexport).
This option enables you to control pointer alignment:
--pointer_alignment=num
Specifies the unaligned pointer support required, where num is one of the following:
1
Treats accesses through pointers as having an alignment of one, that is, byte-aligned or unaligned.
2
Treats accesses through pointers as having an alignment of at most two, that is, at most halfword aligned.
4
Treats accesses through pointers as having an alignment of at most four, that is, at most word aligned.
8
Accesses through pointers have normal alignment, that is, at most doubleword aligned.
De-aligning pointers might increase the code size, even on CPUs with unaligned access support. This is because only a subset of the load and store instructions benefit from unaligned access support. The compiler is unable to use multiple-word transfers or coprocessor-memory transfers, including hardware floating-point loads and stores, directly on unaligned memory objects.
Code size might increase significantly when compiling for CPUs without hardware support for unaligned access.
Unaligned pointer mode does not affect the placement of objects in memory, nor the layout and padding of structures.
This option assists the porting of source code that has been
written for architectures without alignment requirements. You can
achieve finer control of access to unaligned data, with less impact
on the quality of generated code, using the __packed qualifier.
For more details on the __packed qualifier, see Type qualifiers.
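As an illustration of the finer-grained approach, the sketch below confines unaligned access to two small helper functions instead of de-aligning every pointer in the program. The helper names are hypothetical. The armcc branch assumes that __packed can be applied to the pointed-to type in a cast, and the fallback branch uses memcpy so that the example also builds with other compilers.

```c
#include <stdint.h>
#include <string.h>

/* Loads a 32-bit value from an address that might not be word-aligned. */
uint32_t load_u32_unaligned(const void *address)
{
#if defined(__CC_ARM)
    /* armcc: __packed marks the pointee as possibly unaligned. */
    return *(const __packed uint32_t *)address;
#else
    uint32_t value;
    memcpy(&value, address, sizeof value);
    return value;
#endif
}

/* Stores a 32-bit value to an address that might not be word-aligned. */
void store_u32_unaligned(void *address, uint32_t value)
{
#if defined(__CC_ARM)
    *(__packed uint32_t *)address = value;
#else
    memcpy(address, &value, sizeof value);
#endif
}
```

All other pointers in the program keep their normal alignment assumptions, so the compiler remains free to use multiple-word transfers on them.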
These options enable you to control memory alignment:
--unaligned_access
--no_unaligned_access
If you specify
a processor that supports ARMv6 (for example, --cpu ARM1136J-S)
or the ARMv6 architecture (that is, --cpu 6), the
compiler assumes the U bit is set and utilizes unaligned access
support to speed up accesses to packed structures by enabling an LDR instruction
to load from, or an STR instruction to store to, a
non-word aligned address. This means that the compiler might generate
unaligned word and halfword accesses, and might select a library
that supports unaligned accesses. Structures remain unpacked, unless
you explicitly qualify them with __packed (see Type qualifiers).
Therefore, code compiled for ARMv6 can run correctly only
if you enable unaligned support. To do this, you must set the U
bit (bit 22) of CP15 register 1 in your initialization code. This
can also be achieved in hardware, by tying the UBITINIT input
to the core HIGH.
Use --no_unaligned_access to disable the
generation of unaligned accesses on ARMv6 processors.
The --no_unaligned_access option replaces
the (now deprecated) --memaccess -UL41. The --memaccess option
is deprecated and will be removed in a future release.
--min_array_alignment=option
Specifies the minimum alignment of arrays, where option is one of the following:
1
Byte alignment, or unaligned.
2
Two-byte (halfword) alignment.
4
Four-byte (word) alignment.
8
Eight-byte (doubleword) alignment.
For example, compiling the following code with --min_array_alignment=8,
gives the alignment described in the comments:
char arr_c1[1]; // alignment == 8
char c1; // alignment == 1
char arr_c2[3]; // alignment == 8
char arr_c3[10]; // alignment == 8
struct st {
int i1;
} c; // alignment == 4
char c2; // alignment == 1
Also, see Storage class modifiers for a description of the __align(n) storage class modifier.
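Where only a few arrays need stricter alignment, the __align(n) modifier can be applied to them individually instead of raising the minimum for every array. The following is a minimal sketch. The ALIGN8 macro and the other names are hypothetical, and the fallback branch assumes a C11 compiler that provides _Alignas.

```c
#include <stdint.h>

/* Hypothetical helper macro: __align(8) under armcc, C11 _Alignas elsewhere. */
#if defined(__CC_ARM)
#define ALIGN8 __align(8)
#else
#define ALIGN8 _Alignas(8)
#endif

/* Only this array is forced to doubleword alignment. */
ALIGN8 char packet_buffer[10];

/* Returns 1 when the address is a multiple of the given alignment. */
int is_aligned_to(const void *address, uintptr_t alignment)
{
    return ((uintptr_t)address % alignment) == 0;
}
```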
These options enable you to specify implementation details:
--enum_is_int
Forces the size of all enumeration types to be at least 4 bytes. This option is switched off by default and the smallest data type is used that can hold the values of all enumerators.
The --enum_is_int option is not recommended
for general use and is not required for ISO-compatible source. Code
compiled with this option is not compliant with the ABI
for the ARM Architecture (base standard) [BSABI], and
incorrect use might result in a failure at runtime. This option
is not supported by the C++ libraries.
--dollar --no_dollar
Accepts dollar signs, $, in identifiers.
The default is --dollar, except in --strict mode.
--alternative_tokens
--no_alternative_tokens
Enables or
disables the recognition of alternative tokens. This controls recognition
of the digraphs in C and C++, and controls recognition of the operator
keywords, such as and and bitand, in C++.
For more details on digraphs, see The Design and Evolution
of C++, or any other book describing the C++ programming
language. The default behavior is --alternative_tokens.
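For reference, the sketch below writes the same function twice, once with the usual tokens and once with digraphs, where <: and :> stand for the square brackets and <% and %> stand for the braces. The function names are hypothetical.

```c
/* Written with the usual tokens. */
int sum_standard(const int values[], int count)
{
    int total = 0;
    int i;
    for (i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* The same function written with digraphs. */
int sum_digraphs(const int values<::>, int count)
<%
    int total = 0;
    int i;
    for (i = 0; i < count; i++) <%
        total += values<:i:>;
    %>
    return total;
%>
```

With --no_alternative_tokens, the second function is expected to be rejected, because the digraphs are no longer recognized.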
--multibyte_chars --no_multibyte_chars
Enables or disables processing for multibyte character
sequences in comments, string literals, and character constants.
Multibyte encodings are used for character sets such as the Japanese Shift-Japanese
Industrial Standard (Shift-JIS). The default behavior
is --no_multibyte_chars.
--locale lang_country
Use this option in combination with --multibyte_chars to switch the default locale for source files to the one you specify in lang_country.
For example, to compile Japanese source files on an English-based Windows workstation, use:
--multibyte_chars --locale japanese
and on a UNIX workstation use:
--multibyte_chars --locale ja_JP
The locale name might be case-sensitive, depending on the host platform.
The permitted settings of locale are determined by the host platform.
Ensure that you have installed the appropriate locale support for the host platform.
--message_locale lang_country
--message_locale lang_country.codepage
Use this option to switch the default language for the display of error and warning messages to the one you specify in lang_country or lang_country.codepage.
For example, to display messages in Japanese, use:
--message_locale ja_JP
The locale name might be case-sensitive, depending on the host platform.
Ensure that you have installed the appropriate locale support for the host platform.
The permitted languages are independent of the host platform. The following settings are supported in this release of RVCT:
en_US (the default)
zh_CN
ko_KR
ja_JP.
The ability to specify a codepage, and its meaning, depends on the host platform.
If you specify a setting that is not supported, the compiler silently ignores this and uses the default for your environment.
--loose_implicit_cast
Makes illegal implicit casts legal, such as implicit casts of a nonzero int to pointer, for example:
int *p = 0x8000;
Without this option, the compiler reports:
Error: #144: a value of type “int” cannot be used to initialize an entity of type “int *”
With this option, the compiler generates the following warning message, which you can suppress (see Suppressing diagnostic messages):
Warning: #152-D: conversion of nonzero integer to pointer
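An alternative that does not need this option is to make the conversion explicit in the source. The sketch below reuses the address 0x8000 from the example above. The macro and function names are hypothetical, and the pointer is never dereferenced here because the address is only illustrative.

```c
#include <stdint.h>

/* Illustrative address only, taken from the example in the text. */
#define EXAMPLE_ADDRESS 0x8000

/* An explicit cast states the intent, so no implicit conversion
   from integer to pointer takes place. */
int *example_pointer(void)
{
    return (int *)(uintptr_t)EXAMPLE_ADDRESS;
}
```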
--restrict --no_restrict
Enables or disables the use of the C99 restrict keyword.
The default is --no_restrict.
See restrict for more details on the restrict keyword.
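As a brief sketch of what the keyword expresses, the function below promises that its three pointers do not alias, which can give the compiler more freedom when scheduling the loads and stores. The function name is hypothetical, and the example assumes a build with --restrict or any other C99 compiler.

```c
/* Adds two arrays element by element. The restrict qualifiers state
   that dest, a, and b refer to non-overlapping objects. */
void add_arrays(int *restrict dest,
                const int *restrict a,
                const int *restrict b,
                int count)
{
    int i;
    for (i = 0; i < count; i++) {
        dest[i] = a[i] + b[i];
    }
}
```

Calling the function with overlapping arrays breaks the promise made by restrict and results in undefined behavior.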
--signed_bitfields --unsigned_bitfields
Makes bitfields signed or unsigned. The default is --unsigned_bitfields.
The AAPCS requirement for bitfields to default to unsigned on ARM has been overturned.
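Code that must behave the same under either setting can state the signedness of each bitfield explicitly. The following is a minimal sketch with hypothetical names:

```c
/* Each bitfield states its signedness, so the result does not depend on
   --signed_bitfields or --unsigned_bitfields. */
struct flags {
    signed int   delta : 3;   /* holds -4 to 3 */
    unsigned int level : 3;   /* holds 0 to 7 */
};

/* Stores a value in the signed bitfield and reads it back. */
int read_delta(int value)
{
    struct flags f;
    f.delta = value;
    return f.delta;
}

/* Stores a value in the unsigned bitfield and reads it back. */
unsigned int read_level(unsigned int value)
{
    struct flags f;
    f.level = value;
    return f.level;
}
```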
--signed_chars --unsigned_chars
Makes the char type signed or unsigned. The default is --unsigned_chars.
When char is signed, the macro __FEATURE_SIGNED_CHAR is
defined by the compiler.
For --unsigned_chars, any char that
is assigned a negative number causes the following warning to be
generated:
Warning: #68-D: integer conversion resulted in a change of sign
The --signed_chars option is not recommended
for general use and is not required for ISO-compatible source. Code
compiled with this option is not compliant with the ABI
for the ARM Architecture (base standard) [BSABI], and
incorrect use might result in a failure at runtime. This option
is not supported by the C++ libraries.
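Source that needs a particular signedness can avoid depending on these options by using signed char or unsigned char explicitly. The sketch below uses hypothetical function names and the standard CHAR_MIN macro to report how plain char is treated in the current build.

```c
#include <limits.h>

/* Returns 1 when plain char is signed in this build, 0 when unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Widens a character as an unsigned value, whatever plain char is. */
int widen_as_unsigned(char c)
{
    return (unsigned char)c;
}

/* Widens a character as a signed value, whatever plain char is. */
int widen_as_signed(char c)
{
    return (signed char)c;
}
```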