RealView® Compilation Tools Assembler Guide

Version 4.0


Table of Contents

Preface
About this book
Intended audience
Using this book
Typographical conventions
Further reading
Feedback
Feedback on RealView Compilation Tools
Feedback on this book
1. Introduction
1.1. About the RealView Compilation Tools assemblers
1.1.1. ARM assembly language
1.1.2. Wireless MMX Technology instructions
1.1.3. NEON technology
1.1.4. Using the examples
2. Writing ARM Assembly Language
2.1. Introduction
2.1.1. Code examples
2.2. Overview of the ARM architecture
2.2.1. Architecture versions
2.2.2. ARM, Thumb, Thumb‑2, and Thumb‑2EE instruction sets
2.2.3. ARM, Thumb, and ThumbEE state
2.2.4. Processor mode
2.2.5. Registers
2.2.6. Instruction set overview
2.2.7. Instruction capabilities
2.3. Structure of assembly language modules
2.3.1. Layout of assembly language source files
2.3.2. An example ARM assembly language module
2.3.3. Calling subroutines
2.4. Conditional execution
2.4.1. The ALU status flags
2.4.2. Conditional execution
2.4.3. Using conditional execution
2.4.4. Example of the use of conditional execution
2.4.5. The Q flag
2.5. Loading constants into registers
2.5.1. Direct loading with MOV and MVN
2.5.2. Loading with MOV32
2.5.3. Loading with LDR Rd, =const
2.5.4. Loading floating‑point constants
2.6. Loading addresses into registers
2.6.1. Direct loading with ADR and ADRL
2.6.2. Loading addresses with LDR Rd, =label
2.7. Load and store multiple register instructions
2.7.1. Load and store multiple instructions available in ARM and Thumb
2.7.2. Implementing stacks with LDM and STM
2.7.3. Block copy with LDM and STM
2.8. Using macros
2.8.1. Test‑and‑branch macro example
2.8.2. Unsigned integer division macro example
2.9. Adding symbol versions
2.10. Using frame directives
2.11. Assembly language changes
3. Assembler Reference
3.1. Command syntax
3.1.1. Obtaining a list of available options
3.1.2. Specifying command‑line options with an environment variable
3.1.3. AAPCS
3.1.4. Floating‑point model
3.1.5. CPU names
3.1.6. FPU names
3.1.7. Memory access attributes
3.1.8. Pre‑executing a SET directive
3.1.9. Splitting long LDMs and STMs
3.1.10. Listing output to a file
3.1.11. Project template options
3.1.12. Controlling the output of diagnostic messages
3.1.13. Controlling exception table generation
3.2. Format of source lines
3.3. Predefined register and coprocessor names
3.3.1. Predeclared register names
3.3.2. Predeclared extension register names
3.3.3. Predeclared XScale register names
3.3.4. Predeclared coprocessor names
3.4. Built‑in variables and constants
3.4.1. Detecting versions of armasm
3.5. Symbols
3.5.1. Symbol naming rules
3.5.2. Variables
3.5.3. Numeric constants
3.5.4. Assembly time substitution of variables
3.5.5. Labels
3.5.6. Local labels
3.6. Expressions, literals, and operators
3.6.1. String expressions
3.6.2. String literals
3.6.3. Numeric expressions
3.6.4. Numeric literals
3.6.5. Floating‑point literals
3.6.6. Register‑relative and program‑relative expressions
3.6.7. Logical expressions
3.6.8. Logical literals
3.6.9. Operator precedence
3.6.10. Unary operators
3.6.11. Binary operators
3.7. Diagnostic messages
3.7.1. Interlocks
3.7.2. IT block generation
3.7.3. Thumb branch target alignment
3.8. Using the C preprocessor
4. ARM and Thumb Instructions
4.1. Instruction summary
4.2. Memory access instructions
4.2.1. Address alignment
4.2.2. LDR and STR (immediate offset)
4.2.3. LDR and STR (register offset)
4.2.4. LDR and STR (User mode)
4.2.5. LDR (pc‑relative)
4.2.6. ADR
4.2.7. PLD, PLDW, and PLI
4.2.8. LDM and STM
4.2.9. PUSH and POP
4.2.10. RFE
4.2.11. SRS
4.2.12. LDREX and STREX
4.2.13. CLREX
4.2.14. SWP and SWPB
4.3. General data processing instructions
4.3.1. Flexible second operand
4.3.2. ADD, SUB, RSB, ADC, SBC, and RSC
4.3.3. SUBS pc, lr
4.3.4. AND, ORR, EOR, BIC, and ORN
4.3.5. CLZ
4.3.6. CMP and CMN
4.3.7. MOV and MVN
4.3.8. MOVT
4.3.9. TST and TEQ
4.3.10. SEL
4.3.11. REV, REV16, REVSH, and RBIT
4.3.12. ASR, LSL, LSR, ROR, and RRX
4.3.13. SDIV and UDIV
4.4. Multiply instructions
4.4.1. MUL, MLA, and MLS
4.4.2. UMULL, UMLAL, SMULL, and SMLAL
4.4.3. SMULxy and SMLAxy
4.4.4. SMULWy and SMLAWy
4.4.5. SMLALxy
4.4.6. SMUAD{X} and SMUSD{X}
4.4.7. SMMUL, SMMLA, and SMMLS
4.4.8. SMLAD and SMLSD
4.4.9. SMLALD and SMLSLD
4.4.10. UMAAL
4.4.11. MIA, MIAPH, and MIAxy
4.5. Saturating instructions
4.5.1. Saturating arithmetic
4.5.2. QADD, QSUB, QDADD, and QDSUB
4.5.3. SSAT and USAT
4.6. Parallel instructions
4.6.1. Parallel add and subtract
4.6.2. USAD8 and USADA8
4.6.3. SSAT16 and USAT16
4.7. Packing and unpacking instructions
4.7.1. BFC and BFI
4.7.2. SBFX and UBFX
4.7.3. SXT, SXTA, UXT, and UXTA
4.7.4. PKHBT and PKHTB
4.8. Branch and control instructions
4.8.1. B, BL, BX, BLX, and BXJ
4.8.2. IT
4.8.3. CBZ and CBNZ
4.8.4. TBB and TBH
4.9. Coprocessor instructions
4.9.1. CDP and CDP2
4.9.2. MCR, MCR2, MCRR, and MCRR2
4.9.3. MRC, MRC2, MRRC and MRRC2
4.9.4. LDC, LDC2, STC, and STC2
4.10. Miscellaneous instructions
4.10.1. BKPT
4.10.2. SVC
4.10.3. MRS
4.10.4. MSR
4.10.5. CPS
4.10.6. SMC
4.10.7. SETEND
4.10.8. NOP, SEV, WFE, WFI, and YIELD
4.10.9. DBG, DMB, DSB, and ISB
4.10.10. MAR and MRA
4.11. Instruction width selection in Thumb
4.11.1. Instruction width specifiers, .W and .N
4.11.2. Different behavior for some instructions
4.11.3. Diagnostic warning
4.12. ThumbEE instructions
4.12.1. ENTERX and LEAVEX
4.12.2. CHKA
4.12.3. HB, HBL, HBLP, and HBP
4.13. Pseudo‑instructions
4.13.1. ADRL pseudo‑instruction
4.13.2. MOV32 pseudo‑instruction
4.13.3. LDR pseudo‑instruction
4.13.4. UND pseudo‑instruction
5. NEON and VFP Programming
5.1. Instruction summary
5.1.1. NEON instructions
5.1.2. Shared NEON and VFP instructions
5.1.3. VFP instructions
5.2. Architecture support for NEON and VFP
5.2.1. Half-precision extension
5.3. The extension register bank
5.3.1. NEON views of the register bank
5.3.2. VFP views of the extension register bank
5.4. Condition codes
5.5. General information
5.5.1. Floating-point exceptions
5.5.2. NEON and VFP data types
5.5.3. Normal, Long, Wide, Narrow, and saturating instructions in NEON
5.5.4. NEON Scalars
5.5.5. Extended notation
5.5.6. Polynomial arithmetic over {0,1}
5.5.7. The VFP coprocessor
5.6. Instructions shared by NEON and VFP
5.6.1. VLDR and VSTR
5.6.2. VLDM, VSTM, VPOP, and VPUSH
5.6.3. VMOV (between two ARM registers and an extension register)
5.6.4. VMOV (between an ARM register and a NEON scalar)
5.6.5. VMOV (between one ARM register and single precision VFP)
5.6.6. VMRS and VMSR
5.7. NEON logical and compare operations
5.7.1. VAND, VBIC, VEOR, VORN, and VORR (register)
5.7.2. VBIC and VORR (immediate)
5.7.3. VBIF, VBIT, and VBSL
5.7.4. VMOV, VMVN (register)
5.7.5. VACGE and VACGT
5.7.6. VCEQ, VCGE, VCGT, VCLE, and VCLT
5.7.7. VTST
5.8. NEON general data processing instructions
5.8.1. VCVT (between fixed-point or integer, and floating-point)
5.8.2. VCVT (between half-precision and single-precision floating-point)
5.8.3. VDUP
5.8.4. VEXT
5.8.5. VMOV, VMVN (immediate)
5.8.6. VMOVL, V{Q}MOVN, VQMOVUN
5.8.7. VREV
5.8.8. VSWP
5.8.9. VTBL, VTBX
5.8.10. VTRN
5.8.11. VUZP, VZIP
5.9. NEON shift instructions
5.9.1. VSHL, VQSHL, VQSHLU, and VSHLL (by immediate)
5.9.2. V{Q}{R}SHL (by signed variable)
5.9.3. V{R}SHR{N}, V{R}SRA (by immediate)
5.9.4. VQ{R}SHR{U}N (by immediate)
5.9.5. VSLI and VSRI
5.10. NEON general arithmetic instructions
5.10.1. VABA{L} and VABD{L}
5.10.2. V{Q}ABS and V{Q}NEG
5.10.3. V{Q}ADD, VADDL, VADDW, V{Q}SUB, VSUBL, and VSUBW
5.10.4. V{R}ADDHN and V{R}SUBHN
5.10.5. V{R}HADD and VHSUB
5.10.6. VPADD{L}, VPADAL
5.10.7. VMAX, VMIN, VPMAX, and VPMIN
5.10.8. VCLS, VCLZ, and VCNT
5.10.9. VRECPE and VRSQRTE
5.10.10. VRECPS and VRSQRTS
5.11. NEON multiply instructions
5.11.1. VMUL{L}, VMLA{L}, and VMLS{L}
5.11.2. VMUL{L}, VMLA{L}, and VMLS{L} (by scalar)
5.11.3. VQDMULL, VQDMLAL, and VQDMLSL (by vector or by scalar)
5.11.4. VQ{R}DMULH (by vector or by scalar)
5.12. NEON load / store element and structure instructions
5.12.1. Interleaving
5.12.2. Alignment restrictions in load and store, element and structure instructions
5.12.3. VLDn and VSTn (single n‑element structure to one lane)
5.12.4. VLDn (single n‑element structure to all lanes)
5.12.5. VLDn and VSTn (multiple n‑element structures)
5.13. NEON and VFP pseudo‑instructions
5.13.1. VLDR pseudo‑instruction
5.13.2. VLDR and VSTR (post-increment and pre-decrement)
5.13.3. VMOV2
5.13.4. VAND and VORN (immediate)
5.13.5. VACLE and VACLT
5.13.6. VCLE and VCLT
5.14. NEON and VFP system registers
5.14.1. FPSCR, the floating‑point status and control register
5.14.2. FPEXC, the floating‑point exception register
5.14.3. FPSID, the floating‑point system ID register
5.14.4. Modifying individual bits of a NEON and VFP system register
5.15. Flush‑to‑zero mode
5.15.1. When to use flush‑to‑zero mode
5.15.2. The effects of using flush‑to‑zero mode
5.15.3. Operations not affected by flush‑to‑zero mode
5.16. VFP instructions
5.16.1. VABS, VNEG, and VSQRT
5.16.2. VADD, VSUB, and VDIV
5.16.3. VMUL, VMLA, VMLS, VNMUL, VNMLA, and VNMLS
5.16.4. VCMP
5.16.5. VCVT (between single-precision and double-precision)
5.16.6. VCVT (between floating-point and integer)
5.16.7. VCVT (between floating-point and fixed-point)
5.16.8. VCVTB, VCVTT (half-precision extension)
5.16.9. VMOV
5.17. VFP vector mode
5.17.1. Register banks
5.17.2. Vectors
5.17.3. VFP vector and scalar operations
5.17.4. VFP directives and vector notation
6. Wireless MMX Technology Instructions
6.1. Introduction
6.2. ARM support for Wireless MMX Technology
6.2.1. Registers
6.2.2. Directives, WRN and WCN
6.2.3. Frame directives
6.2.4. Wireless MMX load and store instructions
6.2.5. Wireless MMX Technology and XScale instructions
6.3. Wireless MMX instructions
6.3.1. Pseudo‑instructions
7. Directives Reference
7.1. Alphabetical list of directives
7.2. Symbol definition directives
7.2.1. GBLA, GBLL, and GBLS
7.2.2. LCLA, LCLL, and LCLS
7.2.3. SETA, SETL, and SETS
7.2.4. RELOC
7.2.5. RN
7.2.6. RLIST
7.2.7. CN
7.2.8. CP
7.2.9. QN, DN, and SN
7.3. Data definition directives
7.3.1. LTORG
7.3.2. MAP
7.3.3. FIELD
7.3.4. SPACE or FILL
7.3.5. DCB
7.3.6. DCD and DCDU
7.3.7. DCDO
7.3.8. DCFD and DCFDU
7.3.9. DCFS and DCFSU
7.3.10. DCI
7.3.11. DCQ and DCQU
7.3.12. DCW and DCWU
7.3.13. COMMON
7.3.14. DATA
7.4. Assembly control directives
7.4.1. Nesting directives
7.4.2. MACRO and MEND
7.4.3. MEXIT
7.4.4. IF, ELSE, ENDIF, and ELIF
7.4.5. WHILE and WEND
7.5. Frame directives
7.5.1. FRAME ADDRESS
7.5.2. FRAME POP
7.5.3. FRAME PUSH
7.5.4. FRAME REGISTER
7.5.5. FRAME RESTORE
7.5.6. FRAME RETURN ADDRESS
7.5.7. FRAME SAVE
7.5.8. FRAME STATE REMEMBER
7.5.9. FRAME STATE RESTORE
7.5.10. FRAME UNWIND ON
7.5.11. FRAME UNWIND OFF
7.5.12. FUNCTION or PROC
7.5.13. ENDFUNC or ENDP
7.6. Reporting directives
7.6.1. ASSERT
7.6.2. INFO
7.6.3. OPT
7.6.4. TTL and SUBT
7.7. Instruction set and syntax selection directives
7.7.1. ARM, THUMB, THUMBX, CODE16 and CODE32
7.8. Miscellaneous directives
7.8.1. ALIAS
7.8.2. ALIGN
7.8.3. AREA
7.8.4. ATTR
7.8.5. END
7.8.6. ENTRY
7.8.7. EQU
7.8.8. EXPORT or GLOBAL
7.8.9. EXPORTAS
7.8.10. GET or INCLUDE
7.8.11. IMPORT and EXTERN
7.8.12. INCBIN
7.8.13. KEEP
7.8.14. NOFP
7.8.15. REQUIRE
7.8.16. REQUIRE8 and PRESERVE8
7.8.17. ROUT

List of Tables

2.1. ARM processor modes
2.2. Condition code suffixes
2.3. Conditional branches only
2.4. All instructions conditional
2.5. ARM state immediate constants (8‑bit)
2.6. ARM state immediate constants in MOV instructions
2.7. Thumb-2 immediate constants
2.8. Thumb-2 immediate constants in MOV instructions
2.9. Stack-oriented suffixes and equivalent addressing mode suffixes
2.10. Suffixes for load and store multiple instructions
2.11. Changes from earlier ARM assembly language
2.12. Relaxation of requirements
2.13. Differences between pre-UAL Thumb syntax and UAL syntax
3.1. Specifying a command line option and an AREA directive for GNU-stack sections
3.2. Compatible processor or architecture combinations
3.3. Severity of diagnostic messages
3.4. Built‑in variables
3.5. Built‑in Boolean constants
3.6. Operator precedence in armasm
3.7. Operator precedence in C
3.8. Unary operators that return strings
3.9. Unary operators that return numeric or logical values
3.10. Multiplicative operators
3.11. String manipulation operators
3.12. Shift operators
3.13. Addition, subtraction, and logical operators
3.14. Relational operators
3.15. Boolean operators
3.16. Command-line options
3.17. armcc equivalent command-line options
4.1. Location of instructions
4.2. Offsets and architectures, LDR/STR, word, halfword, and byte
4.3. Options and architectures, LDR/STR (register offsets)
4.4. Offsets and architectures, LDR/STR (User mode)
4.5. pc-relative offsets
4.6. pc-relative offsets
4.7. Branch instruction availability and range
4.8. Range and encoding of expr
5.1. Location of NEON instructions
5.2. Location of shared NEON and VFP instructions
5.3. Location of VFP instructions
5.4. Condition codes
5.5. NEON data types
5.6. VFP data types
5.7. NEON saturation ranges
5.8. Patterns for immediate constant
5.9. Available constants
5.10. Results for out‑of‑range inputs
5.11. Results for out‑of‑range inputs
5.12. Permitted combinations of parameters
5.13. Permitted combinations of parameters
5.14. Permitted combinations of parameters
5.15. Pre-UAL VFP mnemonics
5.16. Floating-point constant values
6.1. Status and Control registers
6.2. Wireless MMX Technology instructions
6.3. Wireless MMX Technology pseudo‑instructions
7.1. Location of directives
7.2. OPT directive settings

Proprietary Notice

Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM® in the EU and other countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be the trademarks of their respective owners.

Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted or reproduced in any material form except with the prior written permission of the copyright holder.

The product described in this document is subject to continuous developments and improvements. All particulars of the product and its use contained in this document are given by ARM in good faith. However, all warranties implied or expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.

This document is intended only to assist the reader in the use of the product. ARM shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.

Where the term ARM is used it means “ARM or any of its subsidiaries as appropriate”.

Confidentiality Status

This document is Non-Confidential. The right to use, copy and disclose this document may be subject to license restrictions in accordance with the terms of the agreement entered into by ARM and the party that ARM delivered this document to.

Unrestricted Access is an ARM internal classification.

Product Status

The information in this document is final, that is, for a developed product.

Revision History
Revision A    August 2002         Release 1.2
Revision B    January 2003        Release 2.0
Revision C    September 2003      Release 2.0.1 for RealView Development Suite v2.0
Revision D    January 2004        Release 2.1 for RealView Development Suite v2.1
Revision E    December 2004       Release 2.2 for RealView Development Suite v2.2
Revision F    May 2005            Release 2.2 for RealView Development Suite v2.2 SP1
Revision G    March 2006          Release 3.0 for RealView Development Suite v3.0
Revision H    March 2007          Release 3.1 for RealView Development Suite v3.1
Revision I    September 2008      Release 4.0 for RealView Development Suite v4.0
Revision I    23 January 2009     Update 1 for RealView Development Suite v4.0
Revision J    10 December 2010    Update 2 for RealView Development Suite v4.0
Copyright © 2002-2010 ARM. All rights reserved.
ARM DUI 0204J
Non-Confidential
ID101213