VFP11™ Vector Floating-point Coprocessor Technical Reference Manual

for ARM1136JF-S processor r1p5


Table of Contents

Preface
About this document
Product revision status
Intended audience
Using this manual
Conventions
Further reading
Feedback
Feedback on the VFP11 coprocessor
Feedback on this manual
1. Introduction
1.1. About the VFP11 coprocessor
1.2. Applications
1.3. Coprocessor interface
1.4. VFP11 coprocessor pipelines
1.4.1. FMAC pipeline
1.4.2. DS pipeline
1.4.3. LS pipeline
1.5. Modes of operation
1.5.1. Full-compliance mode
1.5.2. Flush-to-zero mode
1.5.3. Default NaN mode
1.5.4. RunFast mode
1.6. Short vector instructions
1.7. Parallel execution of instructions
1.8. VFP11 treatment of branch instructions
1.9. Writing optimal VFP11 code
1.10. Product revisions
2. Register File
2.1. About the register file
2.2. Register file internal formats
2.2.1. Integer data format
2.2.2. Single-precision data format
2.2.3. Double-precision data format
2.3. Decoding the register file
2.4. Loading operands from ARM1136JF-S registers
2.5. Maintaining consistency in register precision
2.6. Data transfer between memory and VFP11 registers
2.7. Access to register banks in CDP operations
2.7.1. About register banks
2.7.2. Operations using register banks
3. Programmer’s Model
3.1. About the programmer’s model
3.2. Compliance with the IEEE 754 standard
3.2.1. An IEEE 754 standard-compliant implementation
3.2.2. Complete implementation of the IEEE 754 standard
3.2.3. IEEE 754 standard implementation choices
3.3. ARMv5TE coprocessor extensions
3.3.1. FMDRR
3.3.2. FMRRD
3.3.3. FMSRR
3.3.4. FMRRS
3.4. VFP11 system registers
3.4.1. Floating-Point System ID Register, FPSID
3.4.2. Floating-Point Status and Control Register, FPSCR
3.4.3. Floating-Point Exception Register, FPEXC
3.4.4. Floating Point Instruction Registers, FPINST and FPINST2
3.4.5. Media and VFP Feature Registers
4. Instruction Execution
4.1. About instruction execution
4.2. Serializing instructions
4.3. Interrupting the VFP11 coprocessor
4.4. Forwarding
4.5. Hazards
4.6. Operation of the scoreboards
4.6.1. Scoreboard operation when an instruction bounces
4.6.2. Single-precision source register locking
4.6.3. Single-precision source register clearing
4.6.4. Double-precision source register locking
4.6.5. Double-precision source register clearing
4.7. Data hazards in full-compliance mode
4.7.1. Status register RAW hazard example
4.7.2. Load multiple-CDP RAW hazard example
4.7.3. Load multiple-short vector CDP RAW hazard example
4.7.4. CDP-CDP RAW hazard example
4.7.5. Short vector CDP-load multiple WAR hazard example
4.8. Data hazards in RunFast mode
4.8.1. Short vector CDP-load multiple WAR hazard example
4.9. Resource hazards
4.9.1. Load multiple-load-CDP resource hazard example
4.9.2. Load multiple-short vector CDP resource hazard example
4.9.3. Short vector CDP-CDP resource hazard example
4.10. Parallel execution
4.11. Execution timing
5. Exception Handling
5.1. About exception processing
5.2. Bounced instructions
5.2.1. Potential or actual exception that the VFP11 coprocessor cannot handle
5.2.2. Potential or actual exception with the exception enable bit set
5.3. Support code
5.3.1. Illegal instructions
5.4. Exception processing
5.4.1. Determination of the trigger instruction
5.4.2. Exception processing for CDP scalar instructions
5.4.3. Exception processing for CDP short vector instructions
5.4.4. Examples of exception detection for vector instructions
5.5. Input Subnormal exception
5.5.1. Exception enabled
5.5.2. Exception disabled
5.6. Invalid Operation exception
5.6.1. Exception enabled
5.6.2. Exception disabled
5.7. Division by Zero exception
5.7.1. Exception enabled
5.7.2. Exception disabled
5.8. Overflow exception
5.8.1. Exception enabled
5.8.2. Exception disabled
5.9. Underflow exception
5.9.1. Exception enabled
5.9.2. Exception disabled
5.10. Inexact exception
5.10.1. Exception enabled
5.10.2. Exception disabled
5.11. Input exceptions
5.12. Arithmetic exceptions
5.12.1. FADD and FSUB
5.12.2. FCMP, FCMPZ, FCMPE, and FCMPEZ
5.12.3. FMUL and FNMUL
5.12.4. FMAC, FMSC, FNMAC, and FNMSC
5.12.5. FDIV
5.12.6. FSQRT
5.12.7. FCPY, FABS, and FNEG
5.12.8. FCVTDS and FCVTSD
5.12.9. FUITO and FSITO
5.12.10. FTOUI, FTOUIZ, FTOSI, and FTOSIZ
Glossary

List of Tables

2.1. VFP11 MCR instructions
2.2. VFP11 MRC instructions
2.3. VFP11 MCRR instructions
2.4. VFP11 MRRC instructions
2.5. Single-precision data memory images and byte addresses
2.6. Double-precision data memory images and byte addresses
2.7. Single-precision three-operand register usage
2.8. Single-precision two-operand register usage
2.9. Double-precision three-operand register usage
2.10. Double-precision two-operand register usage
3.1. Default NaN values
3.2. QNaN and SNaN handling
3.3. VFP11 system registers
3.4. Accessing VFP11 system registers
3.5. FPSID Register bit fields
3.6. FPSCR Register bit fields
3.7. Vector length and stride combinations
3.8. FPEXC Register bit fields
3.9. Media and VFP Feature Register 0 bit fields
3.10. Media and VFP Feature Register 1 bit fields
4.1. Single-precision source register locking
4.2. Single-precision source register clearing
4.3. Double-precision source register locking
4.4. Double-precision source register clearing for one-cycle instructions
4.5. Double-precision source register clearing for two-cycle instructions
4.6. FCMPS-FMSTAT RAW hazard
4.7. FLDM-FADDS RAW hazard
4.8. FLDM-short vector FADDS RAW hazard
4.9. FMULS-FADDS RAW hazard
4.10. Short vector FMULS-FLDMS WAR hazard
4.11. Short vector FMULS-FLDMS WAR hazard in RunFast mode
4.12. FLDM-FLDS-FADDS resource hazard
4.13. FLDM-short vector FMULS resource hazard
4.14. Short vector FDIVS-FADDS resource hazard, cycles 1 to 22
4.15. Short vector FDIVS-FADDS resource hazard, cycles 23 to 36
4.16. Parallel execution in all three pipelines
4.17. Throughput and latency cycle counts for VFP11 instructions
5.1. Exceptional short vector FMULD followed by load/store instructions
5.2. Exceptional short vector FADDS with a FADDS in the pretrigger slot
5.3. Exceptional short vector FADDD with an FMACS trigger instruction
5.4. Possible Invalid Operation exceptions
5.5. Default results for invalid conversion inputs
5.6. Rounding mode overflow results
5.7. LSA and USA determination
5.8. FADD family bounce thresholds
5.9. FMUL family bounce thresholds
5.10. FDIV bounce thresholds
5.11. FCVTSD bounce thresholds
5.12. Single-precision float-to-integer bounce thresholds and stored results
5.13. Double-precision float-to-integer bounce thresholds and stored results

Proprietary Notice

Words and logos marked with ® or ™ are registered trademarks or trademarks of ARM Limited in the EU and other countries, except as otherwise stated below in this proprietary notice. Other brands and names mentioned herein may be the trademarks of their respective owners.

Neither the whole nor any part of the information contained in, or the product described in, this document may be adapted or reproduced in any material form except with the prior written permission of the copyright holder.

The product described in this document is subject to continuous developments and improvements. All particulars of the product and its use contained in this document are given by ARM Limited in good faith. However, all warranties implied or expressed, including but not limited to implied warranties of merchantability, or fitness for purpose, are excluded.

This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable for any loss or damage arising from the use of any information in this document, or any error or omission in such information, or any incorrect use of the product.

Some material in this document is based on IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985. The IEEE disclaims any responsibility or liability resulting from the placement and use in the described manner.

Confidentiality Status

This document is Non-Confidential. The right to use, copy and disclose this document may be subject to license restrictions in accordance with the terms of the agreement entered into by ARM and the party that ARM delivered this document to.

Product Status

The information in this document is final, that is, for a developed product.

Revision History
Revision A 19 December 2002 First release
Revision B 10 February 2003 First release for VFP11 r0p1 coprocessor
Revision C 9 July 2003 First release for VFP11 r0p2 coprocessor
Revision D 2 December 2003 FPINST2 reset state changed to Unpredictable
Revision E 11 March 2005 First release for ARM1136JF-S r1p0 processor.
Revision F 20 July 2005 First release for ARM1136JF-S r1p1 processor. Table 5-8 corrected.
Revision G 06 December 2006 First release for r1p3. No change to functionality.
Revision H 06 July 2007 First release for r1p5. No change to functionality.
Copyright © 2002, 2003, 2005-2007 ARM Limited. All rights reserved. ARM DDI 0274H
Non-Confidential