4.1.1. V7 Cache and MMU setup code

Example 4.1 shows you how to set up the caches, the MMU, and branch predictors. You can begin by disabling the MMU and caches, and invalidating the caches and TLB.

This example code is for the Cortex-A9 processor. Some of the Cortex-A processors automatically invalidate the L1 or the L2 caches, or both, at reset, but others require manual invalidation. Check the Technical Reference Manual (TRM) for a particular core to determine which options have been implemented.

The MMU TLBs must be invalidated. The branch target predictor hardware might not have to be explicitly invalidated, but it must be enabled by boot code. Branch prediction can safely be enabled at this point, and it can improve performance.
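
The Control Register bits that the boot code changes can be modelled on a host as plain bit masks. The following C sketch is illustrative only and is not part of the original boot code. The macro and function names are hypothetical, and the bit positions assume the ARMv7-A SCTLR layout (M = bit 0, C = bit 2, Z = bit 11, I = bit 12), which matches the masks used in Example 4.1.

```c
#include <assert.h>
#include <stdint.h>

/* SCTLR bit masks (hypothetical names, ARMv7-A bit positions). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data cache enable */
#define SCTLR_Z (1u << 11)  /* branch prediction enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Value to write back after clearing the MMU and L1 cache enables. */
static uint32_t sctlr_disable_mmu_and_caches(uint32_t sctlr)
{
    return sctlr & ~(SCTLR_M | SCTLR_C | SCTLR_I);
}

/* Value to write back to turn on branch prediction. */
static uint32_t sctlr_enable_branch_prediction(uint32_t sctlr)
{
    return sctlr | SCTLR_Z;
}

/* Usage example: the same order of operations as the boot code. */
static uint32_t sctlr_boot_sequence(uint32_t sctlr)
{
    sctlr = sctlr_disable_mmu_and_caches(sctlr);
    assert((sctlr & (SCTLR_M | SCTLR_C | SCTLR_I)) == 0u);
    return sctlr_enable_branch_prediction(sctlr);
}
```

On the target, each of these values would be read with MRC and written back with MCR, as the assembly listing does.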

Example 4.1. Setting up caches, MMU, and branch predictors

	@ Disable MMU.
	MRC p15, 0, r1, c1, c0, 0 								@ Read Control Register configuration data.
	BIC r1, r1, #0x1
	MCR p15, 0, r1, c1, c0, 0 								@ Write Control Register configuration data.
		
	@ Disable L1 Caches.
	MRC p15, 0, r1, c1, c0, 0 								@ Read Control Register configuration data.
	BIC r1, r1, #(0x1 << 12) 								@ Disable I Cache.
	BIC r1, r1, #(0x1 << 2) 								@ Disable D Cache.
	MCR p15, 0, r1, c1, c0, 0 								@ Write Control Register configuration data

	@ Invalidate L1 Caches.
	@ Invalidate Instruction cache.
	MOV r1, #0
	MCR p15, 0, r1, c7, c5, 0

	@ Invalidate Data cache.
	@ To make the code general purpose, calculate the
	@ cache size first and loop through each set + way.

	MRC p15, 1, r0, c0, c0, 0 								@ Read Cache Size ID.
	LDR r3, =0x1ff 								@ Mask for the set count (up to 512 sets).
	AND r0, r3, r0, LSR #13 								@ r0 = no. of sets - 1.

	MOV r1, #0 								@ r1 = way counter for way_loop.
way_loop:
	MOV r3, #0 								@ r3 = set counter for set_loop.
set_loop:
	MOV r2, r1, LSL #30 								@ Way number in bits [31:30] (assumes 4 ways).
	ORR r2, r2, r3, LSL #5 								@ Set number from bit 5 (assumes 32-byte lines).
	MCR p15, 0, r2, c7, c6, 2								@ Invalidate the line described by r2.
	ADD r3, r3, #1 								@ Increment set counter.
	CMP r0, r3 								@ Passed the last set (r0) yet?
	BGE set_loop 								@ If not, iterate set_loop,
	ADD r1, r1, #1 								@ else, move to the next way.
	CMP r1, #4 								@ Last way reached yet?
	BNE way_loop 								@ if not, iterate way_loop.

	@ Invalidate TLB
	MCR p15, 0, r1, c8, c7, 0

	@ Branch Prediction Enable.
	MOV r1, #0
	MRC p15, 0, r1, c1, c0, 0 								@ Read Control Register configuration data.
	ORR r1, r1, #(0x1 << 11) 								@ Global BP Enable bit.
	MCR p15, 0, r1, c1, c0, 0 								@ Write Control Register configuration data.
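
The set/way arithmetic in the data cache loop can be checked on a host. The following C sketch decodes a Cache Size ID value and builds the operand that the loop passes to the invalidate-by-set/way operation. It is a model under stated assumptions, not the processor's definitive behavior: the function names are hypothetical, the field layout assumes the ARMv7-A CCSIDR format (sets - 1 in bits [27:13], ways - 1 in bits [12:3], line size in bits [2:0]), and the value 0x001FE019 used in the usage note is a made-up example describing 256 sets, 4 ways, and 32-byte lines.

```c
#include <assert.h>
#include <stdint.h>

/* CCSIDR field decoding (assumed ARMv7-A layout). */
static uint32_t ccsidr_sets(uint32_t ccsidr)
{
    return ((ccsidr >> 13) & 0x7fffu) + 1u;
}

static uint32_t ccsidr_ways(uint32_t ccsidr)
{
    return ((ccsidr >> 3) & 0x3ffu) + 1u;
}

/* log2(bytes per line): the field holds log2(words per line) - 2. */
static uint32_t ccsidr_line_shift(uint32_t ccsidr)
{
    return (ccsidr & 0x7u) + 4u;
}

/* Smallest b such that (1 << b) >= n, for n >= 1. */
static uint32_t ceil_log2(uint32_t n)
{
    uint32_t b = 0u;
    while (b < 31u && (1u << b) < n)
        b++;
    return b;
}

/* Operand for a level 1 set/way operation: way in the top bits,
 * set starting at the line-size bit, level field [3:1] left at 0. */
static uint32_t setway_operand(uint32_t ccsidr, uint32_t way, uint32_t set)
{
    uint32_t way_bits = ceil_log2(ccsidr_ways(ccsidr));
    uint32_t operand;

    assert(way < ccsidr_ways(ccsidr));
    assert(set < ccsidr_sets(ccsidr));

    operand = set << ccsidr_line_shift(ccsidr);
    if (way_bits != 0u)               /* a direct-mapped cache has no way field */
        operand |= way << (32u - way_bits);
    return operand;
}

/* Number of lines that a complete set/way sweep must visit. */
static uint32_t setway_line_count(uint32_t ccsidr)
{
    return ccsidr_sets(ccsidr) * ccsidr_ways(ccsidr);
}
```

With the example value 0x001FE019, setway_operand(0x001FE019, 3, 255) gives 0xC0001FE0, which is the same as (3 << 30) | (255 << 5) in the assembly loop, and a full sweep covers 1024 lines. A sweep that stops one set early would visit only 1020 of them.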

Example 4.2 shows code that you can use to create your translation tables. The variable ttb_address holds the address of the initial translation table. This must be a 16KB area of memory, with a start address aligned to a 16KB boundary, to which an L1 translation table can be written.
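
The 16KB size and alignment follow from the table geometry: 4096 entries of 4 bytes each, with one entry per 1MB of the 4GB address space. The following C sketch works through that arithmetic on a host. The macro and function names are hypothetical, and the addresses in the usage note are arbitrary examples rather than values from any real memory map.

```c
#include <assert.h>
#include <stdint.h>

#define SECTION_SHIFT  20u                 /* each L1 entry maps 1MB */
#define L1_ENTRIES     4096u               /* 4GB / 1MB */
#define L1_TABLE_BYTES (L1_ENTRIES * 4u)   /* 4-byte descriptors: 16KB */

/* Nonzero when the table base sits on a 16KB boundary. */
static int ttb_is_aligned(uint32_t ttb_address)
{
    return (ttb_address & (L1_TABLE_BYTES - 1u)) == 0u;
}

/* Index of the L1 entry that translates a virtual address. */
static uint32_t l1_index(uint32_t va)
{
    return va >> SECTION_SHIFT;
}

/* Address of that entry, which is what "[r1, r3, LSL #2]" computes. */
static uint32_t l1_entry_address(uint32_t ttb_address, uint32_t va)
{
    assert(ttb_is_aligned(ttb_address));
    return ttb_address + (l1_index(va) << 2);
}
```

For example, with a table at the arbitrary address 0x80004000, the entry for virtual address 0x00100000 is at 0x80004004, while a base of 0x80002000 fails the alignment check.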

Example 4.2. Create translation tables

	@ Enable D-side Prefetch
	MRC p15, 0, r1, c1, c0, 1 								@ Read Auxiliary Control Register.
	ORR r1, r1, #(0x1 << 2) 								@ Enable D-side prefetch.
	MCR p15, 0, r1, c1, c0, 1 								@ Write Auxiliary Control Register.
	DSB
	ISB
	@ DSB causes completion of all cache maintenance operations appearing in program
	@ order before the DSB instruction.
	@ An ISB instruction causes the effect of all branch predictor maintenance
	@ operations before the ISB instruction to be visible to all instructions
	@ after the ISB instruction.

	@ Initialize the page table.
	@ Create a basic L1 page table in RAM, with 1MB sections containing a flat
	@ (VA=PA) mapping, all pages Full Access, Strongly Ordered.

	@ It would be faster to create this in a read-only section in an assembly file.

	@ r0 is the non-address part of the descriptor: 1MB section,
	@ AP = 0b11 (full access), domain 15, TEX = 0, C = 0, B = 0.
	LDR r0, =0b00000000000000000000110111100010
	LDR r1, ttb_address
	LDR r3, =4095									@ Loop counter.
write_pte:
	ORR r2, r0, r3, LSL #20									@ OR together address & default PTE bits.
	STR r2, [r1, r3, LSL #2]									@ Write PTE to TTB.
	SUBS r3, r3, #1									@ Decrement loop counter.
	BNE write_pte									@ Entry 0 is written separately.

	@ For the first entry in the table, you can make it Normal, cacheable,
	@ write-back, write-allocate.
	BIC r0, r0, #0b1100 									@ Clear CB bits.
	ORR r0, r0, #0b0100 									@ Inner write-back, write-allocate.
	BIC r0, r0, #0b111000000000000 									@ Clear TEX bits.
	ORR r0, r0, #0b101000000000000 									@ Set TEX as outer write-back, write-allocate.
	ORR r0, r0, #0b10000000000000000 									@ Shareable.
	STR r0, [r1]

	@ Initialize MMU.
	MOV r1, #0x0 								@ N = 0: TTBR0 translates the whole address space.
	MCR p15, 0, r1, c2, c0, 2 								@ Write Translation Table Base Control Register.
	LDR r1, ttb_address
	MCR p15, 0, r1, c2, c0, 0 								@ Write Translation Table Base Register 0.

	@ In this simple example, do not use TRE or Normal Memory Remap Register.
	@ Set all Domains to Client.
	LDR r1, =0x55555555
	MCR p15, 0, r1, c3, c0, 0 									@ Write Domain Access Control Register.

	@ Enable MMU
	MRC p15, 0, r1, c1, c0, 0 									@ Read Control Register configuration data.
	ORR r1, r1, #0x1 									@ Bit 0 is the MMU enable.
	MCR p15, 0, r1, c1, c0, 0 									@ Write Control Register configuration data.
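
The descriptor values that Example 4.2 writes can be reproduced on a host to check the bit patterns. The following C sketch is a model under stated assumptions, not a definitive implementation: the function names are hypothetical, and the field positions assume the ARMv7-A short-descriptor section format (S = bit 16, TEX = bits [14:12], AP = bits [11:10], domain = bits [8:5], C = bit 3, B = bit 2, type = bits [1:0]).

```c
#include <assert.h>
#include <stdint.h>

/* Build a 1MB section descriptor from its fields. */
static uint32_t section_desc(uint32_t pa, uint32_t tex, uint32_t c, uint32_t b,
                             uint32_t ap, uint32_t domain, uint32_t shareable)
{
    assert(tex <= 7u && c <= 1u && b <= 1u);
    assert(ap <= 3u && domain <= 15u && shareable <= 1u);

    return (pa & 0xFFF00000u)   /* section base address [31:20] */
         | (shareable << 16)
         | (tex << 12)
         | (ap << 10)
         | (domain << 5)
         | (c << 3)
         | (b << 2)
         | 0x2u;                /* [1:0] = 0b10 marks a section */
}

/* Descriptor that the listing stores at a given table index. */
static uint32_t flat_map_entry(uint32_t index)
{
    assert(index < 4096u);
    if (index == 0u)            /* first entry: Normal, write-back, shareable */
        return section_desc(0u, 5u, 0u, 1u, 3u, 15u, 1u);
    /* all other entries: Strongly-ordered, VA = PA */
    return section_desc(index << 20, 0u, 0u, 0u, 3u, 15u, 0u);
}

/* DACR value with all 16 domains set to the same 2-bit mode. */
static uint32_t dacr_all(uint32_t mode)
{
    assert(mode <= 3u);
    return mode * 0x55555555u;
}
```

Under these assumptions, flat_map_entry(1) gives 0x00100DE2 and flat_map_entry(0) gives 0x00015DE6, and dacr_all(1) gives the 0x55555555 value that sets every domain to Client.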

Copyright © 2014 ARM. All rights reserved. ARM DAI0425
Non-Confidential ID080414