ARM Paging
This page or section is a work in progress and may thus be incomplete. Its content may be changed in the near future.
Introduction
ARM CPUs are typically used in smaller applications than x86 CPUs, although the line is blurring. Because the different ARM architectures can differ significantly in their details, this page is aimed at ARMv7-A and ARMv8. ARMv7-M has no MMU, so it does not have the same concept of virtual memory. The author understands that paging on ARMv5 and ARMv6 is similar to ARMv7, but that ARMv4 is somewhat different.
ARMv7-A
ARMv7-A supports two different paging modes: the short descriptor format and the long descriptor format, described in sections B3.5 and B3.6 respectively of the ARMv7 reference manual (the ARM ARM). The long descriptor format is the ARM equivalent of the x86 PAE system. However, even the short descriptor format allows access to a 1TB physical address space, but only with a 16MB granularity. The characteristics of both formats, as described in the ARM ARM, are listed under Short Format and Long Format below.
Overview
Paging support is detected by reading the "coprocessor" (CP15) registers. It is reported in ID_MMFR0, which is read with:
MRC p15, 0, <Rt>, c0, c1, 4 ; Read ID_MMFR0 into Rt
The structure of the register is as follows:
31-28 | 27-24 | 23-20 | 19-16 | 15-12 | 11-8 | 7-4 | 3-0 |
---|---|---|---|---|---|---|---|
Innermost Shareability | FCSE Support | Auxiliary Registers | TCM Support | Shareability levels | Outermost Shareability | PMSA Support | VMSA Support |
- VMSA support - the field of interest for paging.
- 0b0000 - not supported (no paging)
- 0b0001 - implementation defined VMSA, i.e. a non-standard MMU
- 0b0010 - VMSAv6, with cache and TLB type registers. ARMv6 paging.
- 0b0011 - VMSAv7, with support for remapping and access flag. ARMv7-A, as described in the following section.
- 0b0100 - VMSAv7 with PXN bit supported.
- 0b0101 - VMSAv7, PXN and long format descriptors. LPAE is supported.
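As a minimal sketch of how the VMSA field might be decoded once ID_MMFR0 has been read into a variable, the helpers below extract bits [3:0] and test for PXN and long format support. The function names are hypothetical, and the `>=` comparisons assume that any higher field value keeps the features of the lower ones, which the list above only shows for values up to 0b0101.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the VMSA support field, ID_MMFR0[3:0]. */
static inline uint32_t id_mmfr0_vmsa(uint32_t id_mmfr0)
{
    return id_mmfr0 & 0xFu;
}

/* PXN in short format descriptors is reported from 0b0100 upwards
 * (assumption: higher values are supersets of lower ones). */
static inline bool vmsa_has_pxn(uint32_t id_mmfr0)
{
    return id_mmfr0_vmsa(id_mmfr0) >= 0x4u;
}

/* Long format descriptors (LPAE) are reported by 0b0101
 * (same superset assumption as above). */
static inline bool vmsa_has_long_format(uint32_t id_mmfr0)
{
    return id_mmfr0_vmsa(id_mmfr0) >= 0x5u;
}
```

The register value itself would come from the MRC instruction shown earlier; only the decoding is sketched here so that it can be checked on any host.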
Short Format
- Up to two levels of address lookup
- 32 bit input addresses
- Output addresses up to 40 bits
- Supports >32 bit Physical Addresses with supersections
- Support for No access, Client and Manager domains
- 32 bit table entries
Long Format
- Up to three levels of address lookup
- Input addresses of up to 40 bits, when used for stage 2 translations
- Output addresses of up to 40 bits
- 4KB assignment granularity across the entire PA range
- No support for domains, all memory regions are treated as in a Client domain
- 64-bit table entries
- Fixed 4KB table size, unless truncated by the size of the input address space
Note that the Large Physical Address Extension (LPAE) is an optional feature. Furthermore, if an implementation supports LPAE, it also supports the ARM multiprocessing extensions. The paging mode is controlled with the TTBCR (Translation Table Base Control Register).
Control Registers
TTBCR
31 | 30 | 29-28 | 27-26 | 25-24 | 23 | 22 | 21-19 | 18-16 | 15-14 | 13-12 | 11-10 | 9-8 | 7 | 6 | 5 | 4 | 3 | 2-0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
EAE | IDF | SH1 | ORGN1 | IRGN1 | EPD1 | A1 | SBZP | T1SZ | SBZP | SH0 | ORGN0 | IRGN0 | EPD0 | SBZP | PD1 | PD0 | SBZP | T0SZ |
- EAE - Extended Address Enable. SBZP if LPAE is not supported
The following fields are SBZP if EAE=0:
- IDF - Implementation Defined
- SH1 - Shareability attribute for memory associated with translation table walks using TTBR1.
00 | 01 | 10 | 11 |
---|---|---|---|
non-shareable | unpredictable | outer shareable | inner shareable |
- ORGN1 - Outer cacheability using TTBR1
00 | 01 | 10 | 11 |
---|---|---|---|
outer non-cacheable | outer write-back write-allocate cacheable | outer write-through cacheable | outer write-back no write-allocate cacheable |
- IRGN1 - Inner cacheability using TTBR1
00 | 01 | 10 | 11 |
---|---|---|---|
inner non-cacheable | inner write-back write-allocate cacheable | inner write-through cacheable | inner write-back no write-allocate cacheable |
- EPD1 - Disable Page walks with TTBR1. If 0, table walks are performed. Otherwise, a translation fault is generated.
- A1 - defines whether TTBR0 or TTBR1 defines the ASID, for 0 and 1 respectively. The ASID is the Address Space Identifier.
- SBZP - Should Be Zero or Preserved. This is more commonly called RES0.
- T1SZ - The size of the memory region addressed by TTBR1. 2^(32-T1SZ) is the size.
- SH0 - like SH1, but for TTBR0.
- ORGN0 - like ORGN1, but for TTBR0.
- IRGN0 - like IRGN1, but for TTBR0.
- EPD0 - like EPD1, but for TTBR0.
The following fields only apply when EAE is 0
- PD1 - like EPD1.
- PD0 - like EPD0.
The following field has a different meaning depending on EAE:
- T0SZ - like T1SZ, but for TTBR0. If EAE=0, this field is N instead.
- N - Indicates the width of the base address in TTBR0. The base address is bits [31:14-N]. If N=0, the format is compatible with ARMv5 and ARMv6. This field also determines whether TTBR0 or TTBR1 is used for the page walk.
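The field positions above can be turned into a TTBCR value with a few shifts. The following is a sketch under stated assumptions, not a definitive setup routine: the function and macro names are hypothetical, the short format variant leaves PD0 and PD1 clear, and the long format variant arbitrarily picks inner shareable, write-back write-allocate attributes for walks through both TTBRs.

```c
#include <assert.h>
#include <stdint.h>

#define TTBCR_EAE       (1u << 31)  /* select the long descriptor format */
#define TTBCR_RGN_WBWA  0x1u        /* ORGNx/IRGNx = 0b01: write-back write-allocate */
#define TTBCR_SH_INNER  0x3u        /* SHx = 0b11: inner shareable */

/* Short format (EAE=0): only N is set; PD0 = PD1 = 0 keeps walks enabled. */
static inline uint32_t ttbcr_short(uint32_t n)
{
    assert(n <= 7);
    return n;
}

/* Long format (EAE=1) with the same walk attributes for TTBR0 and TTBR1. */
static inline uint32_t ttbcr_long(uint32_t t0sz, uint32_t t1sz)
{
    assert(t0sz <= 7 && t1sz <= 7);
    /* walk packs SHx:ORGNx:IRGNx into six bits. */
    uint32_t walk = (TTBCR_SH_INNER << 4) | (TTBCR_RGN_WBWA << 2) | TTBCR_RGN_WBWA;
    return TTBCR_EAE
         | (walk << 24)   /* SH1, ORGN1, IRGN1 at bits [29:24] */
         | (t1sz << 16)   /* T1SZ at bits [18:16] */
         | (walk << 8)    /* SH0, ORGN0, IRGN0 at bits [13:8] */
         | t0sz;          /* T0SZ at bits [2:0] */
}
```

For example, `ttbcr_long(1, 1)` splits the 32-bit address space in half between TTBR0 and TTBR1.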
Accessing the TTBCR
To access TTBCR, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 2. For example:
MRC p15, 0, <Rt>, c2, c0, 2 ; Read TTBCR into Rt
MCR p15, 0, <Rt>, c2, c0, 2 ; Write Rt to TTBCR
(ARMv7-A ARM, Section B4.1, page 1728)
Here Rt denotes a register of your choice.
TTBR0
EAE=0
31-x | (x-1)-7 | 6 | 5 | 4-3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|
TTB0A | SBZP | IRGN[0] | NOS | RGN | IMP | S | C/IRGN[1] |
- TTB0A - Bits [31:x] of the TTB0 table base address. Must be 2^x aligned, where x = 14 - N as determined in the TTBCR.
- IRGN[0] - SBZP if Multiprocessor extensions are not present. Otherwise, bit zero of IRGN
- NOS - Not Outer Shareable. If 1, region is only inner shareable. Ignored when TTBR0.S == 0, SBZP if no distinction between outer or inner shareable.
- RGN - region bits. Outer cacheability attributes, see TTBCR.ORGN0.
- IMP - Implementation Defined
- S - Shareable. 0 - non-shareable, 1 - shareable.
- C - Cacheable. 0 - inner non-cacheable, 1 - inner cacheable. If Multiprocessor extensions are present, this is bit 1 of IRGN
- IRGN - inner region attributes. See TTBCR.IRGN0
EAE = 1
63-56 | 55-48 | 47-40 | 39-x | (x-1)-0 |
---|---|---|---|---|
SBZP | ASID | SBZP | BADDR | SBZP |
- ASID - Address Space Identifier
- BADDR - Bits [39:x] of base address of table. Must be 2^x aligned.
TTBR1
Same layout as TTBR0, except that x is fixed to 14 when EAE=0.
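The odd placement of the IRGN bits in the EAE=0 layout is easy to get wrong, so here is a small sketch of composing a short format TTBR value. The helper name and parameters are hypothetical; it assumes the multiprocessing extensions are present (so bits 6 and 0 form IRGN) and leaves NOS and IMP at zero. For TTBR1, pass n = 0 so that x is 14.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compose a short format (EAE=0) TTBR0/TTBR1 value.
 * irgn and rgn use the TTBCR.IRGN0 / TTBCR.ORGN0 encodings. */
static inline uint32_t ttbr_short(uint32_t table_pa, uint32_t n,
                                  uint32_t irgn, uint32_t rgn, bool shareable)
{
    assert(n <= 7 && irgn <= 3 && rgn <= 3);
    uint32_t align = 1u << (14 - n);         /* table must be 2^x aligned, x = 14 - N */
    assert((table_pa & (align - 1)) == 0);
    return table_pa
         | ((irgn & 1u) << 6)                /* IRGN[0] lives in bit 6 */
         | (rgn << 3)                        /* RGN at bits [4:3] */
         | ((uint32_t)shareable << 1)        /* S at bit 1 */
         | (irgn >> 1);                      /* IRGN[1] lives in bit 0 */
}
```

A 16KB-aligned table at 0x80004000 with write-back write-allocate walks (irgn = rgn = 0b01) and S set gives 0x8000404A.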
Accessing the TTBRx register
To access TTBR0 in an implementation that does not include the Large Physical Address Extension, or bits[31:0] of TTBR0 in an implementation that includes the Large Physical Address Extension, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 0. For example:
MRC p15, 0, <Rt>, c2, c0, 0 ; Read 32-bit TTBR0 into Rt
MCR p15, 0, <Rt>, c2, c0, 0 ; Write Rt to 32-bit TTBR0
In an implementation that includes the Large Physical Address Extension, to access all 64 bits of TTBR0, software performs a 64-bit read or write of the CP15 registers with <CRm> set to c2 and <opc1> set to 0. For example:
MRRC p15, 0, <Rt>, <Rt2>, c2 ; Read 64-bit TTBR0 into Rt (low word) and Rt2 (high word)
MCRR p15, 0, <Rt>, <Rt2>, c2 ; Write Rt (low word) and Rt2 (high word) to 64-bit TTBR0
In these MRRC and MCRR instructions, Rt holds the least-significant word of TTBR0, and Rt2 holds the most-significant word.
To access TTBR1 in an implementation that does not include the Large Physical Address Extension, or bits[31:0] of TTBR1 in an implementation that includes the Large Physical Address Extension, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 1. For example:
MRC p15, 0, <Rt>, c2, c0, 1 ; Read 32-bit TTBR1 into Rt
MCR p15, 0, <Rt>, c2, c0, 1 ; Write Rt to 32-bit TTBR1
In an implementation that includes the Large Physical Address Extension, to access all 64 bits of TTBR1, software performs a 64-bit read or write of the CP15 registers with <CRm> set to c2 and <opc1> set to 1. For example:
MRRC p15, 1, <Rt>, <Rt2>, c2 ; Read 64-bit TTBR1 into Rt (low word) and Rt2 (high word)
MCRR p15, 1, <Rt>, <Rt2>, c2 ; Write Rt (low word) and Rt2 (high word) to 64-bit TTBR1
In these MRRC and MCRR instructions, Rt holds the least-significant word of TTBR1, and Rt2 holds the most-significant word.
Page Tables
Now that we know how to set up the control registers, we need to know the in-memory structures. ARM is more flexible than x86 in terms of page size. While x86 supports 4KB and 4MB pages, ARM supports 4KB, 64KB and 1MB pages. It can also support 16MB pages (supersections), but this is optional. The first level translation table is 16KB in size when N = 0. Generally, its size is 2^(14-N) bytes.
Short Descriptor: Level 1
Invalid:
31-2 | 1 | 0 |
---|---|---|
Ignored | 0 | 0 |
Page Table:
31-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|
BADDR | IMP | Domain | SBZ | NS | PXN | 0 | 1 |
- BADDR - Page table base address, bits [31:10]. Must be 1KB aligned.
- IMP - Implementation defined
- Domain - used as a memory protection mechanism. 16 possible domains.
- SBZ - Should be Zero
- NS - Non-secure bit. This is used by the security extensions
- PXN - Privileged Execute Never - SBZ if PXN is not supported
Section:
31-20 | 19 | 18 | 17 | 16 | 15 | 14-12 | 11-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BADDR | NS | 0 | nG | S | AP[2] | TEX[2:0] | AP[1:0] | IMP | Domain | XN | C | B | 1 | PXN |
- BADDR - Section Base Address [31:20]. Must be 1MB aligned.
- XN - Execute never. Stops execution of page.
- nG - not global. Determines how this is marked in the TLB.
Supersection:
31-24 | 23-20 | 19 | 18 | 17 | 16 | 15 | 14-12 | 11-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
BADDR | BADDRM | NS | 1 | nG | S | AP[2] | TEX[2:0] | AP[1:0] | IMP | BADDRH | XN | C | B | 1 | PXN |
- BADDR - Supersection Base address [31:24]. 16MB aligned.
- BADDRM - Supersection base address [35:32], if supported.
- BADDRH - Supersection base address [39:36], if supported.
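To make the level 1 layouts concrete, the sketch below builds a page table entry and a section entry from the bit positions in the tables above. The helper names are hypothetical, and NS, nG, S, IMP and PXN are simply left at zero. The section example uses TEX=0b001, C=1, B=1, which the author understands to mean write-back write-allocate normal memory when TEX remap is disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Level 1 entry pointing at a 1KB-aligned level 2 table. */
static inline uint32_t l1_page_table(uint32_t l2_table_pa, uint32_t domain)
{
    assert((l2_table_pa & 0x3FFu) == 0 && domain <= 15);
    return l2_table_pa | (domain << 5) | 0x1u;   /* bits [1:0] = 0b01 */
}

/* Level 1 entry mapping a 1MB section. ap is AP[2:0], tex is TEX[2:0]. */
static inline uint32_t l1_section(uint32_t pa, uint32_t domain, uint32_t ap,
                                  uint32_t tex, bool c, bool b, bool xn)
{
    assert((pa & 0xFFFFFu) == 0 && domain <= 15 && ap <= 7 && tex <= 7);
    return pa
         | ((ap >> 2) << 15)          /* AP[2] at bit 15 */
         | (tex << 12)                /* TEX[2:0] at bits [14:12] */
         | ((ap & 3u) << 10)          /* AP[1:0] at bits [11:10] */
         | (domain << 5)              /* Domain at bits [8:5] */
         | ((uint32_t)xn << 4)
         | ((uint32_t)c << 3)
         | ((uint32_t)b << 2)
         | 0x2u;                      /* bit 1 = 1 with bit 18 = 0: section */
}
```

For instance, `l1_section(0x40000000, 0, 3, 1, true, true, false)` yields 0x40001C0E.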
Short Descriptor: Level 2
Invalid:
31-2 | 1 | 0 |
---|---|---|
Ignored | 0 | 0 |
Large Page:
31-16 | 15 | 14-12 | 11 | 10 | 9 | 8-6 | 5-4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|---|---|
BADDR | XN | TEX[2:0] | nG | S | AP[2] | SBZ | AP[1:0] | C | B | 0 | 1 |
- BADDR - Large Page Base Address [31:16]
Small page:
31-12 | 11 | 10 | 9 | 8-6 | 5-4 | 3 | 2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|
BADDR | nG | S | AP[2] | TEX[2:0] | AP[1:0] | C | B | 1 | XN |
- BADDR - Small Page Base Address [31:12]
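A short sketch of the level 2 side follows: splitting a virtual address into table indices and building a small page entry. The names are hypothetical. The level 1 index assumes N = 0. The level 2 index VA[19:12] is not stated above; it is inferred from a 1KB level 2 table holding 256 32-bit entries that together cover one 1MB level 1 slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Index into the level 1 table (assuming TTBCR.N = 0): VA[31:20]. */
static inline uint32_t l1_index(uint32_t va)
{
    return va >> 20;
}

/* Index into a level 2 table: VA[19:12] (inferred, see lead-in). */
static inline uint32_t l2_index(uint32_t va)
{
    return (va >> 12) & 0xFFu;
}

/* Level 2 entry mapping a 4KB small page; nG and S are left at zero. */
static inline uint32_t l2_small_page(uint32_t pa, uint32_t ap, uint32_t tex,
                                     bool c, bool b, bool xn)
{
    assert((pa & 0xFFFu) == 0 && ap <= 7 && tex <= 7);
    return pa
         | ((ap >> 2) << 9)           /* AP[2] at bit 9 */
         | (tex << 6)                 /* TEX[2:0] at bits [8:6] */
         | ((ap & 3u) << 4)           /* AP[1:0] at bits [5:4] */
         | ((uint32_t)c << 3)
         | ((uint32_t)b << 2)
         | 0x2u                       /* bit 1 = 1: small page */
         | (uint32_t)xn;              /* XN at bit 0 */
}
```

With these helpers, the address 0xC0123456 selects level 1 entry 0xC01 and level 2 entry 0x23.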
Long Descriptor: Level 1/2
Invalid:
63-1 | 0 |
---|---|
Ignored | 0 |
Block:
63-52 | 51-40 | 39-n | (n-1)-12 | 11-2 | 1 | 0 |
---|---|---|---|---|---|---|
UBAT | SBZP | ADDR | SBZP | LBAT | 0 | 1 |
- UBAT - Upper Block Attributes
- ADDR - Output Address [39:n]
- LBAT - Lower Block Attributes
- n - 30 for first level, 21 for 2nd level.
Table:
63 | 62-61 | 60 | 59 | 58-52 | 51-40 | 39-12 | 11-2 | 1 | 0 |
---|---|---|---|---|---|---|---|---|---|
NST | APT | XNT | PXNT | Ignored | SBZP | ADDR | Ignored | 1 | 1 |
Stage 1 attributes, SBZP at stage 2:
- NST - NSTable. For secure memory accesses, determines type of next level. Otherwise ignored.
- APT - APTable. Access permissions limit for next level lookup.
- XNT - XNTable. XN limit for subsequent lookups.
- PXNT - PXNTable. PXN limit for subsequent levels. SBZ for non-secure PL2 (hypervisor) level 1 translation tables.
Any stage:
- ADDR - Next level table address [39:12].
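The long format table and block entries can be sketched in the same way, this time with 64-bit values. The helper names are hypothetical, the table entry leaves NST, APT, XNT and PXNT at zero, and the block helper takes the upper and lower attribute bits as an opaque, already-positioned value.

```c
#include <assert.h>
#include <stdint.h>

#define LPAE_ADDR_MASK 0x000000FFFFFFF000ull   /* address bits [39:12] */

/* Level 1/2 entry pointing at a 4KB-aligned next-level table. */
static inline uint64_t lpae_table_desc(uint64_t next_table_pa)
{
    assert((next_table_pa & ~LPAE_ADDR_MASK) == 0);
    return next_table_pa | 0x3u;               /* bits [1:0] = 0b11 */
}

/* Level 1 (1GB, n = 30) or level 2 (2MB, n = 21) block entry. */
static inline uint64_t lpae_block_desc(uint64_t pa, int level, uint64_t attrs)
{
    assert(level == 1 || level == 2);
    unsigned n = (level == 1) ? 30 : 21;
    assert((pa >> 40) == 0 && (pa & ((1ull << n) - 1)) == 0);
    assert((attrs & (LPAE_ADDR_MASK | 0x3u)) == 0);  /* attributes only */
    return pa | attrs | 0x1u;                  /* bits [1:0] = 0b01 */
}
```

Note that a level 2 block can sit above 4GB, for example at 0x100200000, which is where the 64-bit entry width pays off.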
Long Descriptor: Level 3
Invalid:
63-1 | 0 |
---|---|
Ignored | 0 |
Reserved:
63-2 | 1 | 0 |
---|---|---|
SBZP | 0 | 1 |
Page:
63-52 | 51-40 | 39-12 | 11-2 | 1 | 0 |
---|---|---|---|---|---|
UPAT | SBZP | ADDR | LPAT | 1 | 1 |
- UPAT - Upper Page Attributes
- ADDR - Output Address [39:12]
- LPAT - Lower Page Attributes
Stage 1 Attributes
Upper:
63-59 | 58-55 | 54 | 53 | 52 |
---|---|---|---|---|
Ignored | Ignored | XN | PXN | CONT |
- CONT - Contiguous. 16 adjacent entries point to contiguous memory regions.
Lower:
11 | 10 | 9-8 | 7-6 | 5 | 4-2 |
---|---|---|---|---|---|
nG | AF | SH[1:0] | AP[2:1] | NS | AttrIndex[2:0] |
- AttrIndex - memory attributes index field
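Combining the level 3 page layout with the stage 1 attribute fields above, a page entry might be assembled as in this sketch. The names are hypothetical; the helper always sets AF and leaves nG, NS, PXN and CONT at zero, which is a simplification rather than a recommendation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define LPAE_XN  (1ull << 54)   /* upper attribute: execute never */
#define LPAE_AF  (1ull << 10)   /* lower attribute: access flag */

/* Level 3 stage 1 page entry. ap21 is AP[2:1], sh is SH[1:0]. */
static inline uint64_t lpae_page_desc(uint64_t pa, uint32_t attr_index,
                                      uint32_t ap21, uint32_t sh, bool xn)
{
    assert((pa >> 40) == 0 && (pa & 0xFFFu) == 0);
    assert(attr_index <= 7 && ap21 <= 3 && sh <= 3);
    return (xn ? LPAE_XN : 0)
         | pa                              /* output address [39:12] */
         | LPAE_AF
         | ((uint64_t)sh << 8)             /* SH[1:0] at bits [9:8] */
         | ((uint64_t)ap21 << 6)           /* AP[2:1] at bits [7:6] */
         | ((uint64_t)attr_index << 2)     /* AttrIndex at bits [4:2] */
         | 0x3u;                           /* bits [1:0] = 0b11: page */
}
```

For example, a non-executable inner shareable page at 0x80042000 using attribute index 1 encodes as 0x0040000080042707.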
Stage 2 Attributes
Upper:
63-59 | 58-55 | 54 | 53 | 52 |
---|---|---|---|---|
Ignored (system MMU) | Ignored (software) | XN | 0 | CONT |
Lower:
11 | 10 | 9-8 | 7-6 | 5-2 |
---|---|---|---|---|
0 | AF | SH[1:0] | HAP[2:1] | MemAttr[3:0] |
- HAP - Stage 2 Access Permissions
- MemAttr - Stage 2 memory attributes
Choice Between TTBR0 and TTBR1
Short format
TTBCR.N | First address translated with TTBR1 | TTBR0 table size | TTBR0 table index range |
---|---|---|---|
0b000 | TTBR1 not used | 16KB | VA[31:20] |
0b001 | 0x80000000 | 8KB | VA[30:20] |
0b010 | 0x40000000 | 4KB | VA[29:20] |
0b011 | 0x20000000 | 2KB | VA[28:20] |
0b100 | 0x10000000 | 1KB | VA[27:20] |
0b101 | 0x08000000 | 512 bytes | VA[26:20] |
0b110 | 0x04000000 | 256 bytes | VA[25:20] |
0b111 | 0x02000000 | 128 bytes | VA[24:20] |
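The short format table can be reproduced with a little arithmetic, which may help when choosing N. This is a sketch with hypothetical function names: TTBR0 is selected when the top N bits of the address are all zero, its table is 2^(14-N) bytes, and its index is VA[31-N:20].

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of the TTBR0 first level table for a given TTBCR.N. */
static inline uint32_t ttbr0_table_bytes(uint32_t n)
{
    assert(n <= 7);
    return 1u << (14 - n);
}

/* Returns 0 if va is translated through TTBR0, 1 if through TTBR1. */
static inline int short_format_ttbr(uint32_t va, uint32_t n)
{
    assert(n <= 7);
    if (n == 0)
        return 0;                        /* TTBR1 not used */
    return (va >> (32 - n)) == 0 ? 0 : 1;
}

/* Index into the TTBR0 first level table: VA[31-N:20]. */
static inline uint32_t ttbr0_l1_index(uint32_t va, uint32_t n)
{
    assert(n <= 7);
    return (va >> 20) & ((1u << (12 - n)) - 1u);
}
```

With N = 1, for example, everything from 0x80000000 upwards goes through TTBR1, giving the familiar split between a per-process lower half and a shared upper half.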
Long format
TTBCR.T0SZ | TTBCR.T1SZ | Input address range for TTBR0 | Input address range for TTBR1 |
---|---|---|---|
0b000 | 0b000 | All addresses | Not used |
M | 0b000 | 0 to 2^(32-M)-1 | 2^(32-M) to maximum address |
0b000 | N | 0 to 2^32-2^(32-N)-1 | 2^32-2^(32-N) to maximum address |
M | N | 0 to 2^(32-M)-1 | 2^32-2^(32-N) to maximum address |
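The four rows of the long format table can be expressed as one function. This is a sketch with a hypothetical name; when both sizes are non-zero, addresses that fall between the two ranges belong to neither TTBR, and the sketch returns -1 for them on the assumption that such accesses fault.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0 for TTBR0, 1 for TTBR1, -1 if va is in neither range. */
static inline int long_format_ttbr(uint32_t va, uint32_t t0sz, uint32_t t1sz)
{
    assert(t0sz <= 7 && t1sz <= 7);
    /* First address above the TTBR0 range: 2^(32-T0SZ). */
    uint64_t low_limit = 1ull << (32 - t0sz);
    /* First address of the TTBR1 range: 2^32 - 2^(32-T1SZ). */
    uint64_t high_start = (1ull << 32) - (1ull << (32 - t1sz));

    if (t0sz == 0 && t1sz == 0)
        return 0;                         /* all addresses use TTBR0 */
    if (t1sz == 0)
        return va < low_limit ? 0 : 1;    /* TTBR1 takes the remainder */
    if (t0sz == 0)
        return va >= high_start ? 1 : 0;  /* TTBR0 takes the remainder */
    if (va < low_limit)
        return 0;
    if (va >= high_start)
        return 1;
    return -1;                            /* gap between the two ranges */
}
```

With T0SZ = T1SZ = 2, for instance, TTBR0 covers the lowest 1GB, TTBR1 the highest 1GB, and the 2GB in between is unmapped by construction.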
Recursive Table Mapping
If you wish to use a recursive mapping trick, such as is done on x86, use the long format. The short format might be doable, but with much hassle. The long format has been designed with recursive mapping in mind, supports the full physical address space, and does not suffer from the mismatch between page size and table size that the short format has. However, the basic structure is similar to PAE, so if you did this naively, 1GB of the address space would be wasted. Instead, use the support for eliminating the first level of lookup by concatenating the second-level tables. Then point the last few entries to the appropriate points in the table. This reaps all the benefits of the long format while avoiding the bloat of a naive recursive mapping.