14.2.1. Protection method

The following sections describe how RAM errors are managed:

Detecting errors

The Cortex-M7 processor uses ECC to detect errors in the cache RAMs.

Recovering from errors

The Cortex-M7 processor can recover from a RAM error detected in the cache by using a clean and invalidate and retry scheme. When an error is detected, as shown in Table 14.1, the corresponding index/way is cleaned and invalidated. When the clean and invalidate operation has completed, the requester retries its access. The ECC can also be used to correct single-bit errors in the RAM.

Instruction cache

In the instruction cache, lines are always clean so that invalidating the line is sufficient. The retried access then fetches the correct value from external memory.

Data cache

In the data cache, the cache line can be dirty. The correction of the RAM contents is done as part of the clean and invalidate operation for caches. This takes place in the write buffer and the corrected data is written back to external memory. The retried access then reads the correct value from external memory. If the data cannot be corrected then the error is non-recoverable.
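The recovery flow for both caches can be illustrated with a small software model. This is a minimal sketch, not a description of the hardware implementation: the types `cache_line_t` and `ecc_status_t` and the functions `instr_cache_fetch` and `data_cache_read` are hypothetical names introduced for illustration, a cache line is reduced to a single word, and the ECC check result is supplied as an input rather than computed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model types; they do not correspond to hardware registers. */
typedef enum { ECC_OK, ECC_CORRECTABLE, ECC_UNCORRECTABLE } ecc_status_t;
typedef enum { ACCESS_OK, ACCESS_UNRECOVERABLE } access_result_t;

typedef struct {
    bool valid;
    bool dirty;          /* always false for instruction cache lines */
    uint32_t data;       /* value held in the cache RAM, possibly corrupted */
    uint32_t corrected;  /* value the ECC logic would reconstruct */
    ecc_status_t ecc;    /* assumed outcome of the ECC check on this line */
} cache_line_t;

/* Instruction side: lines are always clean, so invalidating is sufficient. */
access_result_t instr_cache_fetch(cache_line_t *line, const uint32_t *ext_mem,
                                  uint32_t *out)
{
    if (line->valid && line->ecc != ECC_OK) {
        line->valid = false;                 /* invalidate the faulty line */
    }
    /* Retried (or normal) access: hit in the cache or fetch externally. */
    *out = line->valid ? line->data : *ext_mem;
    return ACCESS_OK;
}

/* Data side: correct, write back if dirty, invalidate, then retry. */
access_result_t data_cache_read(cache_line_t *line, uint32_t *ext_mem,
                                uint32_t *out)
{
    if (line->valid && line->ecc != ECC_OK) {
        if (line->ecc == ECC_UNCORRECTABLE) {
            return ACCESS_UNRECOVERABLE;     /* data cannot be corrected */
        }
        if (line->dirty) {
            *ext_mem = line->corrected;      /* corrected data written back */
        }
        line->valid = false;                 /* clean and invalidate done */
        line->dirty = false;
    }
    /* Retried (or normal) access: hit in the cache or read externally. */
    *out = line->valid ? line->data : *ext_mem;
    return ACCESS_OK;
}
```

In this model, a dirty line with a correctable error causes the corrected value to reach external memory before the line is invalidated, so the retried read returns the corrected value.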

Handling permanent errors

Permanent errors are handled as follows:

General behavior

If hard, or permanent, errors occur on the RAMs, the clean, invalidate and retry scheme might cause a deadlock in which the access is continuously replayed. To prevent this, error bank registers are provided to mask the faulty locations as unusable and invalid. There are two banks for each side of the memory system. When an error is detected, the location is pushed into a bank, which masks the corresponding valid bit of the location both when reading and when allocating a new line. The line is therefore no longer used unless the entry is reset. Because of implementation details, there is a short period of time when the line is still seen by the system, but is removed from the allocation pool.

The depth of the error bank determines how many errors the system can support. When this limit is reached, the system might livelock. So that the system can monitor the error bank status before it becomes full, a condition that can cause a potential deadlock, the processor provides information indicating the number of corrupted locations. This information is reported on several pins that signal the use of the error bank, that is, whether the error bank is empty or at least one error has been encountered.
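The masking effect of the error banks can be sketched as follows. This is an illustrative model only: the structure `bank_entry_t`, its fields, and the functions `location_masked` and `effective_valid` are assumptions made for this example and do not reflect the layout of the actual error bank registers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error bank entry; field names and widths are assumptions. */
typedef struct {
    bool valid;      /* entry holds a faulty location */
    bool locked;     /* entry is protected from being overwritten */
    uint16_t index;  /* cache index of the faulty location */
    uint8_t way;     /* cache way of the faulty location */
} bank_entry_t;

/* True if either of the two banks on this side records the location. */
bool location_masked(const bank_entry_t banks[2], uint16_t index, uint8_t way)
{
    for (int i = 0; i < 2; i++) {
        if (banks[i].valid && banks[i].index == index && banks[i].way == way) {
            return true;
        }
    }
    return false;
}

/* Valid bit as seen by lookups and by the allocator: a masked location is
 * treated as invalid and unusable, whatever the RAM valid bit says. */
bool effective_valid(const bank_entry_t banks[2], bool ram_valid,
                     uint16_t index, uint8_t way)
{
    return ram_valid && !location_masked(banks, index, way);
}
```

The model omits the short period during which a line is still seen by the system after it has been removed from the allocation pool.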

Error bank register behavior

Both the instruction side and the data side use the same algorithm to select the bank to update:

  • If there is a non-valid, unlocked bank then it is always allocated in preference to a valid bank.

  • If both banks are valid, or both banks are non-valid, and neither bank is locked then a round-robin counter updated on each allocation selects the bank to fill.

  • If there is one locked bank then the other bank is always allocated, whether or not it is already valid.

  • If both banks are locked then no allocation takes place.
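The selection rules above can be expressed as a short C sketch. The type `error_banks_t` and the function `allocate_error_bank` are hypothetical names for a software model, not hardware registers. The sketch assumes that the round-robin counter is a simple counter whose least significant bit selects the bank, and that it is incremented on every allocation, including allocations that the counter did not decide.

```c
#include <stdbool.h>

/* Hypothetical model of the two error banks on one side of the memory system. */
typedef struct {
    bool valid[2];
    bool locked[2];
    unsigned rr;    /* round-robin counter, updated on each allocation */
} error_banks_t;

/* Selects and fills a bank. Returns the bank index (0 or 1),
 * or -1 if both banks are locked and no allocation takes place. */
int allocate_error_bank(error_banks_t *b)
{
    int pick;

    if (b->locked[0] && b->locked[1]) {
        return -1;                       /* both locked: no allocation */
    }
    if (b->locked[0]) {
        pick = 1;                        /* one locked: use the other bank */
    } else if (b->locked[1]) {
        pick = 0;
    } else if (b->valid[0] != b->valid[1]) {
        pick = b->valid[0] ? 1 : 0;      /* prefer the non-valid bank */
    } else {
        pick = (int)(b->rr & 1u);        /* same state: round-robin */
    }

    b->valid[pick] = true;               /* the selected bank is filled */
    b->rr++;
    return pick;
}
```

Starting from two empty, unlocked banks, successive calls in this model fill bank 0, then bank 1, and then alternate between the two banks under the round-robin rule.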

Copyright © 2014-2016, 2018 Arm. All rights reserved. ARM DDI 0489F