8.3.4. Usage models

This section describes some ways in which errors can be handled in a system. Exactly how you program the processor to handle errors depends on the configuration of your processor and system, and what you are trying to achieve.

If an abort exception is taken, the abort handler reads the link register, the SPSR, and the fault status registers to determine the type of abort. Some types of abort are fatal to the system. Others can be fixed, after which program execution resumes. For example, an MPU background fault might indicate a stack overflow, which can be rectified by allocating more stack and reprogramming the MPU to reflect this. Alternatively, an asynchronous external abort might indicate that, because of a software error, a store instruction accessed an unmapped memory address. Such an abort is fatal to the system or process because no information is recorded about the address at which the error occurred, or about the instruction that caused it.
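The first step of such a handler, classifying the abort from a fault status register value, can be sketched in C as follows. This is a minimal illustration and not a definitive implementation. The status encodings in the switch statement, and the assumption that the status field is formed from bit [10] and bits [3:0], must be checked against the fault status register descriptions for your processor. The names abort_class_t, fsr_status(), classify_abort(), and abort_location_recorded() are hypothetical. The register value is passed in as a parameter so that the logic can be exercised off-target. On the processor, the handler would obtain it with a CP15 read.

```c
#include <stdbool.h>
#include <stdint.h>

/* Broad abort classes. Hypothetical names, not defined by the architecture. */
typedef enum {
    ABORT_MPU_FAULT,
    ABORT_SYNC_EXTERNAL,
    ABORT_ASYNC_EXTERNAL,
    ABORT_SYNC_PARITY_ECC,
    ABORT_ASYNC_PARITY_ECC,
    ABORT_OTHER              /* for example alignment faults or debug events */
} abort_class_t;

/* ASSUMPTION: the 5-bit status field is {FSR[10], FSR[3:0]}. */
static inline uint32_t fsr_status(uint32_t fsr)
{
    return ((fsr >> 6) & 0x10u) | (fsr & 0x0Fu);
}

/* ASSUMPTION: the encodings below are illustrative. Verify each one
 * against the fault status register description before relying on it. */
abort_class_t classify_abort(uint32_t fsr)
{
    switch (fsr_status(fsr)) {
    case 0x00u:                       /* MPU background fault */
    case 0x0Du:                       /* MPU permission fault */
        return ABORT_MPU_FAULT;
    case 0x08u:
        return ABORT_SYNC_EXTERNAL;
    case 0x16u:
        return ABORT_ASYNC_EXTERNAL;
    case 0x19u:
        return ABORT_SYNC_PARITY_ECC;
    case 0x18u:
        return ABORT_ASYNC_PARITY_ECC;
    default:
        return ABORT_OTHER;
    }
}

/* For the classes listed here, the location of the error is recorded,
 * so the handler can go on to decide whether the fault is fixable.
 * ABORT_OTHER is conservatively reported as not recorded. */
bool abort_location_recorded(abort_class_t c)
{
    return c == ABORT_MPU_FAULT ||
           c == ABORT_SYNC_EXTERNAL ||
           c == ABORT_SYNC_PARITY_ECC;
}
```

The status value alone does not identify whether a parity or ECC error came from the cache or a TCM, or whether it was recoverable. A real handler would also read the appropriate Auxiliary Fault Status Register for that information.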

Table 8.1 shows which types of abort are typically fatal because either the location of the error is not recorded or the error is unrecoverable. Some aborts that are marked as not fatal might turn out to be fatal in some systems when the cause of the error has been determined. For example, an MPU background fault might indicate a stack overflow, which can be rectified, or it might indicate that, because of a bug, the software has accessed a nonexistent memory location, which can be fatal. These cases can be distinguished by determining the location where the error occurred. If an error is unrecoverable, that is, it is not a correctable parity or ECC error, and it is not a TCM external retry request, it is normally fatal regardless of whether the location of the error is recorded. When an abort is taken on an external TCM, parity, or ECC error, the appropriate Auxiliary Fault Status Register records whether the error was recoverable. See Fault Status and Address Registers.

Table 8.1. Types of aborts

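Type                            | Conditions                                                 | Source   | Synchronous | Fatal
--------------------------------|------------------------------------------------------------|----------|-------------|---------
MPU fault                       | Access not permitted by MPU[a]                             | MPU      | Yes         | No
Synchronous External            | Load using L2 memory interface                             | AXI, AHB | Yes         | No
Asynchronous External           | Store to Normal or Device memory using L2 memory interface | AXI, AHB | No          | Yes
Synchronous Parity/ECC Cache    | Load from cache[b]                                         | Cache    | Yes         | Maybe[c]
Synchronous ECC TCM             | Load/store from/to TCM[d]                                  | TCM      | Yes         | Maybe[c]
Synchronous TCM external error  | Load/store from/to TCM[e]                                  | TCM      | Yes         | Yes
Asynchronous Parity/ECC Cache   | Store to cache or cache maintenance operation[b]           | Cache    | No          | Maybe[c]
Asynchronous TCM external error | Store to TCM[e]                                            | TCM      | No          | Yes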

[a] See MPU faults for more information about the types of MPU fault.

[b] See Cache error detection and correction for more information about parity/ECC errors from the cache.

[c] These types of error can be correctable or uncorrectable. Uncorrectable errors are typically fatal. Correctable errors are automatically corrected by the hardware and might not cause the abort handler to be called. See Cache error detection and correction and TCM internal error detection and correction.

[d] See TCM internal error detection and correction for more information about ECC errors from the TCM.

[e] Aborts generated by external TCM errors are always unrecoverable, and therefore fatal. See External TCM errors for more information about external errors from the TCM.
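One way to use Table 8.1 in software is to hold it as a constant lookup table and apply the rule in footnote [c] to the rows marked Maybe. The following C sketch illustrates this under stated assumptions. All type and function names (abort_type_t, abort_info_t, abort_table, abort_is_fatal()) are hypothetical, and the correctable argument stands for the recoverability information that the handler would obtain from the appropriate Auxiliary Fault Status Register.

```c
#include <stdbool.h>

/* One enumerator per row of Table 8.1. Hypothetical names. */
typedef enum {
    ABT_MPU_FAULT,
    ABT_SYNC_EXTERNAL,
    ABT_ASYNC_EXTERNAL,
    ABT_SYNC_ECC_CACHE,
    ABT_SYNC_ECC_TCM,
    ABT_SYNC_TCM_EXTERNAL,
    ABT_ASYNC_ECC_CACHE,
    ABT_ASYNC_TCM_EXTERNAL,
    ABT_COUNT
} abort_type_t;

typedef enum { FATAL_NO, FATAL_YES, FATAL_MAYBE } fatality_t;

typedef struct {
    const char *source;      /* where the error originates */
    bool        synchronous; /* true if the location of the error is recorded */
    fatality_t  fatal;       /* typical fatality, as listed in the table */
} abort_info_t;

/* Transcription of Table 8.1. */
static const abort_info_t abort_table[ABT_COUNT] = {
    [ABT_MPU_FAULT]          = { "MPU",      true,  FATAL_NO    },
    [ABT_SYNC_EXTERNAL]      = { "AXI, AHB", true,  FATAL_NO    },
    [ABT_ASYNC_EXTERNAL]     = { "AXI, AHB", false, FATAL_YES   },
    [ABT_SYNC_ECC_CACHE]     = { "Cache",    true,  FATAL_MAYBE },
    [ABT_SYNC_ECC_TCM]       = { "TCM",      true,  FATAL_MAYBE },
    [ABT_SYNC_TCM_EXTERNAL]  = { "TCM",      true,  FATAL_YES   },
    [ABT_ASYNC_ECC_CACHE]    = { "Cache",    false, FATAL_MAYBE },
    [ABT_ASYNC_TCM_EXTERNAL] = { "TCM",      false, FATAL_YES   },
};

/* Returns the typical fatality of an abort. The correctable argument is
 * only consulted for rows marked Maybe, where an uncorrectable error is
 * treated as fatal. A result of false means "not typically fatal": the
 * caller might still decide the abort is fatal after examining the
 * location of the error. */
bool abort_is_fatal(abort_type_t type, bool correctable)
{
    switch (abort_table[type].fatal) {
    case FATAL_YES:
        return true;
    case FATAL_MAYBE:
        return !correctable;
    case FATAL_NO:
    default:
        return false;
    }
}
```

Keeping the table as data rather than as branching code makes it straightforward to compare the software policy against Table 8.1 row by row.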

Correctable errors

In a system in which the processor is configured to automatically correct ECC errors without taking an abort exception, you can still configure it to respond to such errors. Connect the event output or outputs that indicate a correctable error to an interrupt controller. When such an event occurs, the interrupt input to the processor is set, and the processor takes an interrupt exception. When your interrupt handler has identified the source of the interrupt as a correctable error, it can read the Correctable Fault Location Register (CFLR) to determine where the ECC error occurred. You can examine this information to identify trends in such errors. By masking the interrupt when necessary, your software can ensure that, while critical code is executing, the processor corrects the error automatically but delays examining information about the error until after the critical code has completed.
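The trend analysis described above can be sketched in C as a small log that the interrupt handler updates each time it identifies a correctable error. This is an illustrative sketch only. The names ecc_log_t, ecc_log_record(), and ecc_log_count(), and the capacity MAX_TRACKED, are hypothetical. The CFLR value is treated as an opaque 32-bit number and is passed in as a parameter, so decoding its fields and reading the register on the processor are outside the scope of the example.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 8u   /* hypothetical capacity of the log */

typedef struct {
    uint32_t cflr;       /* raw CFLR value, treated as opaque */
    uint32_t count;      /* number of times this value has been seen */
} ecc_entry_t;

typedef struct {
    ecc_entry_t entries[MAX_TRACKED];
    size_t      used;    /* number of valid entries */
    uint32_t    dropped; /* new locations seen after the log was full */
} ecc_log_t;

/* Called from the interrupt handler with the value read from the CFLR.
 * Increments the counter for a known location, or adds a new entry. */
void ecc_log_record(ecc_log_t *log, uint32_t cflr)
{
    for (size_t i = 0; i < log->used; i++) {
        if (log->entries[i].cflr == cflr) {
            log->entries[i].count++;
            return;
        }
    }
    if (log->used < MAX_TRACKED) {
        log->entries[log->used].cflr = cflr;
        log->entries[log->used].count = 1;
        log->used++;
    } else {
        log->dropped++;  /* log is full, record only that an event was lost */
    }
}

/* Returns how many times a given CFLR value has been recorded. */
uint32_t ecc_log_count(const ecc_log_t *log, uint32_t cflr)
{
    for (size_t i = 0; i < log->used; i++) {
        if (log->entries[i].cflr == cflr) {
            return log->entries[i].count;
        }
    }
    return 0;
}
```

A location whose count rises steadily might indicate a failing memory cell, whereas isolated events at unrelated locations are more likely to be transient. Because the handler only updates a fixed-size structure, it stays short, and the log can be inspected later by non-critical code.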

When the processor is in debug halt-state, any correctable error is corrected as appropriate, but the memory access is not repeated to fetch the correct data. Therefore, the instruction that generated the error does not complete successfully. Instead, the sticky synchronous abort flag in the DBGDSCR is set. See CP14 c1, Debug Status and Control Register.

Copyright © 2010-2011 ARM. All rights reserved. ARM DDI 0460C