8.1.2. ARM and Thumb instruction sets

The ARM and Thumb instruction sets are described in the ARM Architecture Reference Manual. All instruction opcodes and register specifiers may be written in either lowercase or uppercase.

Operand expressions

Any register or constant operand may be an arbitrary C or C++ expression, so that variables can be read or written. The expression must be integer assignable, that is, of type char, short, or int. No sign extension is performed on char and short types. You must perform sign extension explicitly for these types. The compiler may add code to evaluate these expressions and allocate them to registers.
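Because char and short operands are not sign-extended automatically, a shift pair is one common way to do it by hand. The following is a minimal sketch, not a definitive implementation. It assumes the __asm { } block syntax of the ARM C compilers, and USE_ARMCC_INLINE_ASM is a hypothetical build macro introduced here so that the same source also compiles on a host compiler. The parameter raw stands for a value whose low halfword holds a short.

```c
/* USE_ARMCC_INLINE_ASM is a hypothetical build macro, not a compiler
 * predefine. Define it only when building with an ARM compiler that
 * accepts __asm { } blocks. */
int sign_extend_halfword(int raw)
{
#ifdef USE_ARMCC_INLINE_ASM
    int result;
    __asm
    {
        MOV result, raw, LSL #16     /* move bit 15 up to bit 31 */
        MOV result, result, ASR #16  /* shift back, replicating the sign bit */
    }
    return result;
#else
    /* Portable fallback with the same effect, for host-side checking. */
    int low = raw & 0xFFFF;
    return (low ^ 0x8000) - 0x8000;
#endif
}
```

For instance, sign_extend_halfword(0xFFFF) yields -1, while a positive halfword such as 0x1234 is returned unchanged.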

When an operand is used as a destination, the expression must be assignable (an lvalue). When writing code that uses both physical registers and expressions, you must take care not to use complex expressions that require too many registers to evaluate. The compiler issues an error message if it detects conflicts during register allocation.

Physical registers

The inline assemblers allow restricted access to the physical registers. It is illegal to write to pc; only branches using B or BL are allowed. In addition, it is inadvisable to intermix inline assembler instructions that use physical registers and complex C or C++ expressions.

The compiler uses r12 (ip) for intermediate results, and r0-r3, r12 (ip), r14 (lr) for function calls while evaluating C expressions, so these cannot be used as physical registers at the same time.

Physical registers, like variables, must be set before they can be read. When physical registers are used, the compiler saves and restores C/C++ variables that may be allocated to the same physical register. However, the compiler cannot restore sp, sl, fp, or sb in calling standards where these registers have a defined role.


Constants

The constant expression specifier (#) is optional. If it is used, the expression following it must be constant.

Instruction expansion

The constant in instructions with a constant operand is not limited to the values allowed by the instruction. Instead, such an instruction is translated into a sequence of instructions with the same effect. For example:

	ADD r0, r0, #1023

may be translated into:

	ADD r0, r0, #1024
	SUB r0, r0, #1

With the exception of coprocessor instructions, all ARM and Thumb instructions with a constant operand support instruction expansion. In addition, the MUL instruction can be expanded into a sequence of adds and shifts when the third operand is a constant.
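To see why #1023 needs expansion while #1024 and #1 do not, recall that an ARM-state data processing immediate is an 8-bit value rotated right by an even number of bits. The host-side sketch below checks that rule. The function names are illustrative only, the check does not cover Thumb immediates, and it is not a description of how the compiler itself chooses an expansion.

```c
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (0 <= n < 32). */
static uint32_t rotate_left(uint32_t value, unsigned n)
{
    return n == 0 ? value : (value << n) | (value >> (32u - n));
}

/* Return 1 if value can be encoded as an ARM data processing
 * immediate (8-bit constant rotated right by an even amount),
 * otherwise 0. */
int is_arm_immediate(uint32_t value)
{
    unsigned rot;
    for (rot = 0; rot < 32; rot += 2) {
        /* Undo a rotate-right by rot; an encodable value fits in 8 bits. */
        if (rotate_left(value, rot) <= 0xFFu)
            return 1;
    }
    return 0;
}
```

Under this rule 1023 (0x3FF) spans ten bits and is rejected, whereas 1024 and 1 are each accepted, which matches the two-instruction sequence shown above.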

The effect of updating the CPSR by an expanded instruction is:

  • Arithmetic instructions set the NZCV flags correctly.

  • Logical instructions:

    • set the NZ flags correctly

    • do not change the V flag

    • corrupt the C flag.

  • MRS sets the NZCV flags correctly.


Labels

C and C++ labels can be used in inline assembler statements. They can be branched to by branch instructions only in the form:

B{cond} label

You cannot branch to labels using BL.
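As an illustration of the B{cond} label form, the sketch below branches back to a C label from an inline assembler block. It assumes the __asm { } block syntax of the ARM C compilers and that n is at least 1. USE_ARMCC_INLINE_ASM is a hypothetical build macro introduced here so that the source also compiles on a host compiler.

```c
/* Count how many times the loop body runs for n >= 1.
 * USE_ARMCC_INLINE_ASM is a hypothetical build macro; define it only
 * when building with an ARM compiler that accepts __asm { } blocks. */
int count_down(int n)
{
    int steps = 0;
#ifdef USE_ARMCC_INLINE_ASM
loop:                           /* ordinary C label */
    __asm
    {
        ADD  steps, steps, #1   /* one more pass through the loop */
        SUBS n, n, #1           /* decrement and set the flags */
        BNE  loop               /* conditional branch to the C label */
    }
#else
    /* Portable fallback with the same effect, for host-side checking. */
    do {
        steps++;
        n--;
    } while (n != 0);
#endif
    return steps;
}
```

A call such as count_down(5) runs the body five times and returns 5.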

Storage declarations

All storage can be declared in C or C++ and passed to the inline assembler using variables. Therefore, the storage declarations that are supported by armasm are not implemented.

SWI and BL instructions

SWIs and branch link instructions must specify exactly which calling standard is used. Three optional register lists follow the normal instruction fields. The register lists specify:

  • the registers that are the input parameters

  • the registers that are output parameters after return

  • the registers that are corrupted by the called function.

For example:

SWI{cond} swi_num, {input_regs}, {output_regs}, {corrupted_regs}
BL{cond} function, {input_regs}, {output_regs}, {corrupted_regs}

An omitted list is assumed to be empty, except for BL, which always corrupts r0-r3, ip, and lr.

The register lists have the same syntax as LDM and STM register lists. If the NZCV flags are modified you must specify PSR in the corrupted register list.
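The register lists can be illustrated with a sketch of a SWI call. Everything specific here is an assumption made for the example: SWI number 0x42, its convention of taking an argument in r0 and returning a result in r0, and its corruption of r1 and the flags are all hypothetical. USE_ARMCC_INLINE_ASM is likewise a hypothetical build macro, and fake_service is a host-side stand-in for the handler.

```c
/* USE_ARMCC_INLINE_ASM is a hypothetical build macro; define it only
 * when building with an ARM compiler that accepts __asm { } blocks. */
#ifndef USE_ARMCC_INLINE_ASM
/* Host-side stand-in for the hypothetical SWI handler: returns arg + 1. */
static int fake_service(int arg)
{
    return arg + 1;
}
#endif

int call_service(int arg)
{
#ifdef USE_ARMCC_INLINE_ASM
    int result;
    __asm
    {
        MOV r0, arg                      /* input parameter in r0 */
        SWI 0x42, {r0}, {r0}, {r1, PSR}  /* in: r0, out: r0, corrupts r1 and flags */
        MOV result, r0                   /* copy the returned value out */
    }
    return result;
#else
    return fake_service(arg);
#endif
}
```

Listing PSR in the third list tells the compiler that the flags do not survive the call, as required when the NZCV flags are modified.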

Copyright © 1997, 1998 ARM Limited. All rights reserved. ARM DUI 0040D