ARM Technical Support Knowledge Articles

Bit-banded accesses versus read-modify-write accesses

Applies to: Cortex-M3, Cortex-M4


ARM architecture instruction sets include instructions to read or write memory in quantities of bytes, but not single bits.

Conventional methods for accessing a single bit involve reading one or more bytes into a register and using data processing instructions either to mask and shift the desired single bit for a read, or for a write, to modify the required bit and write back the byte or bytes.

Cortex-M3 and Cortex-M4 offer the chip designer a "bit-banding" option in hardware that allows a single bit in the memory system to be read or written by accessing an "alias" address. A single operation to access the alias address is converted in hardware into the required sequence of operations to return the single bit for a read or to modify the single bit for a write. These operations are available in a fixed range of addresses. In the SRAM space, each bit in the address range 0x20000000 to 0x200fffff is aliased at a word (4-byte) aligned address in the range 0x22000000 to 0x23fffffc. The same mechanism applies in the Peripheral space, where each bit in the address range 0x40000000 to 0x400fffff is aliased at a word-aligned address in the range 0x42000000 to 0x43fffffc.
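
The mapping from a bit in a bit-band region to its alias word can be sketched in C as follows. This is a minimal illustration, not code from ARM: the macro names and the helper bitband_alias_addr() are hypothetical, and the function only computes an address, it does not access memory. Each byte in the bit-band region occupies 32 bytes of alias space (8 bits times 4 bytes per alias word), and each bit within the byte adds a further 4 bytes.

```c
#include <stdint.h>

/* Region bases taken from the address ranges given in this article.
 * The macro names are illustrative assumptions. */
#define SRAM_BITBAND_BASE    0x20000000u
#define SRAM_ALIAS_BASE      0x22000000u
#define PERIPH_BITBAND_BASE  0x40000000u
#define PERIPH_ALIAS_BASE    0x42000000u

/* Hypothetical helper: return the alias address for bit `bit` (0..7)
 * of the byte at `byte_addr` inside the given bit-band region. */
static inline uint32_t bitband_alias_addr(uint32_t region_base,
                                          uint32_t alias_base,
                                          uint32_t byte_addr,
                                          uint32_t bit)
{
    uint32_t byte_offset = byte_addr - region_base;

    /* 32 alias bytes per bit-band byte, 4 alias bytes per bit. */
    return alias_base + byte_offset * 32u + bit * 4u;
}
```

With these assumptions, the first and last bits of the SRAM bit-band region map to the two ends of the alias range quoted above (0x22000000 and 0x23fffffc).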


The conventional method for accessing a single bit in a zero wait-state bufferable memory region on a Cortex-M3 or Cortex-M4 processor might require three instructions and four or five clock cycles.

  cycle 1:  load the byte(s) - issue the load address on the bus
  cycle 2:  receive the read data
  cycle 3:  for a write, modify the bit of interest; for a read, mask the other bits
  cycle 4:  for a write, store the modified byte(s); for a read, shift the bit of interest to bit[0]

Also, for a write:

  cycle 5:  the buffered store completes on the bus, while the processor executes the next instruction
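
In C, the conventional sequence above might look like the following sketch. The function names read_bit_rmw() and write_bit_rmw() are hypothetical, and the exact instructions and cycle counts depend on the compiler and on the surrounding code; the point is only that the load, the data processing step and (for a write) the store are all performed by software.

```c
#include <stdint.h>

/* Conventional read: load the byte, then shift and mask so that the
 * bit of interest ends up in bit[0] of the result. */
static inline uint32_t read_bit_rmw(const volatile uint8_t *byte, unsigned bit)
{
    return ((uint32_t)*byte >> bit) & 1u;
}

/* Conventional write: load the byte, modify one bit, store it back. */
static inline void write_bit_rmw(volatile uint8_t *byte, unsigned bit,
                                 uint32_t value)
{
    uint8_t tmp = *byte;                           /* load the byte       */
    tmp = (uint8_t)(tmp & ~(1u << bit));           /* clear the bit       */
    tmp = (uint8_t)(tmp | ((value & 1u) << bit));  /* insert the new bit  */
    *byte = tmp;                                   /* store the byte back */
}
```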

The bit-band operation to achieve the same result requires only a single instruction and one or two processor clock cycles.

For a read, the processor executes a load from the alias address, which the hardware converts into a load from the bit-band address. In most cases this includes a stall cycle while the processor waits for the loaded data value to be returned on the bus. However, if the next instruction to be executed is also a load or store, the address phase of that access can be pipelined with the data transfer phase of this load. The masking and shifting take place in the hardware with no additional latency, so the required bit appears in bit[0] of the target register when the load completes. Therefore a bit-band read, like a normal memory load, typically executes in two cycles in a zero wait-state memory system.

For a write, the processor executes a store to the alias address. For bufferable memory, a normal store completes in one clock cycle of the processor pipeline while the write buffer manages completion of the bufferable store on the bus. However, for a bit-band write, the hardware converts this into two back-to-back bus accesses, a load from the bit-band address immediately followed by a store to the same address. Manipulation of the required bit is handled in hardware between the load and the store operation with no additional latency. So the processor executes the bit-band store in one cycle, and memory is updated with the modified value two cycles later.
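
From the software side, a bit-band access is an ordinary word load or store to the alias address, as in this minimal sketch. The function names bitband_read() and bitband_write() are hypothetical, and the code assumes the device actually implements the bit-band option and that alias_addr lies in one of the alias ranges; on such a device only bit[0] of the stored value is used.

```c
#include <stdint.h>

/* Single load from the alias word: on a bit-band capable device the
 * hardware returns the selected bit in bit[0] of the result. */
static inline uint32_t bitband_read(uintptr_t alias_addr)
{
    return *(volatile uint32_t *)alias_addr;
}

/* Single store to the alias word: on a bit-band capable device the
 * hardware performs the load and store on the underlying location. */
static inline void bitband_write(uintptr_t alias_addr, uint32_t value)
{
    *(volatile uint32_t *)alias_addr = value;
}
```

For example, bitband_write(0x22000000u, 1u) would set bit 0 of the byte at 0x20000000 on a device with SRAM bit-banding, using one store instruction in place of the load, modify and store sequence of the conventional method.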

Copyright © 2011 ARM Limited. All rights reserved. External (Open), Non-Confidential