ARM Paging

From OSDev Wiki

This page is under construction! This page or section is a work in progress and may thus be incomplete. Its content may be changed in the near future.

Introduction

ARM CPUs are used in smaller applications than x86 CPUs, although the line is blurring. Due to the number of different ARM architectures, details of which can differ significantly, this page is aimed at ARMv7-A and ARMv8. ARMv7-M does not have the same concept of virtual memory - it does not have an MMU. The author understands that paging on ARMv5 and ARMv6 is similar to ARMv7, but ARMv4 is somewhat different.

ARMv7-A

ARMv7-A supports two different paging modes: the short descriptor format and the long descriptor format, described in sections B3.5 and B3.6 respectively of the ARMv7 reference manual. The long descriptor format is the ARM equivalent of the x86 PAE system. However, even the short descriptor format allows access to a 1TB physical address space, but only with a 16MB granularity. As is described in the ARM:

Overview

Paging support is detected by reading the "coprocessor" (CP15) registers. It is reported in ID_MMFR0, which is read with:

MRC p15, 0, <Rt>, c0, c1, 4 ; Read ID_MMFR0 into Rt

The structure of the register is as follows:

31-28 | 27-24 | 23-20 | 19-16 | 15-12 | 11-8 | 7-4 | 3-0
Innermost Shareability | FCSE Support | Auxiliary Registers | TCM Support | Shareability Levels | Outermost Shareability | PMSA Support | VMSA Support
  • VMSA support - the gold.
    • 0b0000 - not supported (no paging)
    • 0b0001 - implementation defined. Weird MMU somewhere
    • 0b0010 - VMSAv6, with cache and TLB type registers. ARMv6 paging.
    • 0b0011 - VMSAv7, with support for remapping and access flag. ARMv7-A, as described in the following section.
    • 0b0100 - VMSAv7 with PXN bit supported.
    • 0b0101 - VMSAv7, PXN and long format descriptors. EPAE is supported.
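
The VMSA support field can be decoded in software once ID_MMFR0 has been read. The following is a minimal sketch in C that runs on any host; the enum and function names are hypothetical, and the raw register value is assumed to have been obtained separately with the MRC instruction. Only the encodings listed above are used.

```c
#include <stdint.h>

/* Hypothetical names for the ID_MMFR0[3:0] encodings listed above. */
enum vmsa_level {
    VMSA_NONE        = 0x0, /* no paging */
    VMSA_IMPDEF      = 0x1, /* implementation defined */
    VMSA_V6          = 0x2, /* VMSAv6 */
    VMSA_V7          = 0x3, /* VMSAv7 with remapping and access flag */
    VMSA_V7_PXN      = 0x4, /* VMSAv7 with PXN */
    VMSA_V7_PXN_LPAE = 0x5  /* VMSAv7, PXN and long format descriptors */
};

/* Extract the VMSA support field, bits [3:0] of ID_MMFR0. */
static inline unsigned vmsa_support(uint32_t id_mmfr0)
{
    return id_mmfr0 & 0xFu;
}

/* Nonzero when the field reports long format descriptor support. */
static inline int has_long_descriptors(uint32_t id_mmfr0)
{
    return vmsa_support(id_mmfr0) == VMSA_V7_PXN_LPAE;
}
```

A kernel would typically call has_long_descriptors() early in boot to decide which of the two table formats to build.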

Short Format

  • Up to two levels of address lookup
  • 32 bit input addresses
  • Output addresses up to 40 bits
  • Supports >32 bit Physical Addresses with supersections
  • Support for No access, Client and Manager domains
  • 32 bit table entries

Long Format

  • Up to three levels of address lookup
  • Input addresses of up to 40 bits, when used for stage 2 translations
  • Output addresses of up to 40 bits
  • 4KB assignment granularity across the entire PA range
  • No support for domains, all memory regions are treated as in a Client domain
  • 64-bit table entries
  • Fixed 4KB table size, unless truncated by the size of the input address space

Note that the Large Physical Address Extension (LPAE) is an optional feature. Furthermore, if an implementation supports LPAE, it also supports the ARM multiprocessing extensions. The paging mode is controlled with the TTBCR (Translation Table Base Control Register).

Control Registers

TTBCR

31 | 30 | 29-28 | 27-26 | 25-24 | 23 | 22 | 21-19 | 18-16 | 15-14 | 13-12 | 11-10 | 9-8 | 7 | 6 | 5 | 4 | 3 | 2-0
EAE | IDF | SH1 | ORGN1 | IRGN1 | EPD1 | A1 | SBZP | T1SZ | SBZP | SH0 | ORGN0 | IRGN0 | EPD0 | SBZP | PD1 | PD0 | SBZP | T0SZ
  • EAE - Extended Address Enable. SBZP if LPAE is not supported

The following fields are SBZP if EAE=0

  • IDF - Implementation Defined
  • SH1 - Shareability attribute for memory associated with translation table walks using TTBR1.
00 | 01 | 10 | 11
non-shareable | unpredictable | outer shareable | inner shareable
  • ORGN1 - Outer cacheability for memory associated with translation table walks using TTBR1.
00 | 01 | 10 | 11
outer non-cacheable | outer write-back write-allocate cacheable | outer write-through cacheable | outer write-back no write-allocate cacheable
  • IRGN1 - Inner cacheability for memory associated with translation table walks using TTBR1.
00 | 01 | 10 | 11
inner non-cacheable | inner write-back write-allocate cacheable | inner write-through cacheable | inner write-back no write-allocate cacheable
  • EPD1 - Disable Page walks with TTBR1. If 0, table walks are performed. Otherwise, a translation fault is generated.
  • A1 - defines whether TTBR0 or TTBR1 defines the ASID, for 0 and 1 respectively. The ASID is the Address Space Identifier.
  • SBZP - Should Be Zero or Preserved. This is more commonly called RES0.
  • T1SZ - The size of the memory region addressed by TTBR1. 2^(32-T1SZ) is the size.
  • SH0 - like SH1, but for TTBR0.
  • ORGN0 - like ORGN1, but for TTBR0.
  • IRGN0 - like IRGN1, but for TTBR0.
  • EPD0 - like EPD1, but for TTBR0.

The following fields only apply when EAE is 0

  • PD1 - like EPD1.
  • PD0 - like EPD0.

This field can take different meanings

  • T0SZ - like T1SZ. If EAE=0, this is field N.
  • N - Indicates the width of the base address in TTBR0. The base address is bits [31:14-N]. If N=0, the format is compatible with ARMv5 and ARMv6. This field also determines whether TTBR0 or TTBR1 is used for the page walk.

Accessing the TTBCR

To access TTBCR, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 2. For example:

MRC p15, 0, <Rt>, c2, c0, 2 ; Read TTBCR into Rt
MCR p15, 0, <Rt>, c2, c0, 2 ; Write Rt to TTBCR

(ARMv7-A ARM, Section B4.1, page 1728)
Here Rt denotes a register of your choice.
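
The value written to TTBCR can be assembled from the bit positions in the table above. The following is a minimal sketch in C; the macro and function names are hypothetical, and the attribute choice in ttbcr_long_ttbr0_only() (inner shareable, write-back write-allocate table walks, TTBR1 walks disabled) is only one plausible configuration, not a recommendation from the reference manual.

```c
#include <stdint.h>

/* Bit positions follow the TTBCR layout above; names are hypothetical. */
#define TTBCR_EAE       (1u << 31)
#define TTBCR_EPD1      (1u << 23)
#define TTBCR_T1SZ(n)   (((uint32_t)(n) & 0x7u) << 16)
#define TTBCR_SH0(x)    (((uint32_t)(x) & 0x3u) << 12)
#define TTBCR_ORGN0(x)  (((uint32_t)(x) & 0x3u) << 10)
#define TTBCR_IRGN0(x)  (((uint32_t)(x) & 0x3u) << 8)
#define TTBCR_PD1       (1u << 5)
#define TTBCR_PD0       (1u << 4)
#define TTBCR_T0SZ(n)   ((uint32_t)(n) & 0x7u)
#define TTBCR_N(n)      ((uint32_t)(n) & 0x7u)   /* same bits as T0SZ when EAE=0 */

/* Short format (EAE=0): only N and the PD bits are meaningful. */
static inline uint32_t ttbcr_short(unsigned n, int disable_ttbr1_walks)
{
    uint32_t v = TTBCR_N(n);
    if (disable_ttbr1_walks)
        v |= TTBCR_PD1;
    return v;
}

/* Long format (EAE=1) using TTBR0 only: walks through TTBR1 are disabled,
 * and TTBR0 table walks are inner shareable (0b11) and write-back
 * write-allocate cacheable (0b01) at both cache levels. */
static inline uint32_t ttbcr_long_ttbr0_only(unsigned t0sz)
{
    return TTBCR_EAE | TTBCR_EPD1 |
           TTBCR_SH0(3) | TTBCR_ORGN0(1) | TTBCR_IRGN0(1) |
           TTBCR_T0SZ(t0sz);
}
```

The result would then be written with the MCR instruction shown above.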

TTBR0

EAE=0
31-x | (x-1)-7 | 6 | 5 | 4-3 | 2 | 1 | 0
TTB0A | SBZP | IRGN[0] | NOS | RGN | IMP | S | C/IRGN[1]
  • TTB0A - Bits [31:x] of the TTB0 table base address, where x = 14-N and N is taken from the TTBCR. The table must be 2^x byte aligned.
  • IRGN[0] - SBZP if Multiprocessor extensions are not present. Otherwise, bit zero of IRGN
  • NOS - Not Outer Shareable. If 1, region is only inner shareable. Ignored when TTBR0.S == 0, SBZP if no distinction between outer or inner shareable.
  • RGN - region bits. Outer cacheability attributes, see TTBCR.ORGN0.
  • IMP - Implementation Defined
  • S - Shareable. 0 - non-shareable, 1 - shareable.
  • C - Cacheable. 0 - inner non-cacheable, 1 - inner cacheable. If Multiprocessor extensions are present, this is bit 1 of IRGN
  • IRGN - inner region attributes. See TTBCR.IRGN0

EAE=1
63-56 | 55-48 | 47-40 | 39-x | (x-1)-0
SBZP | ASID | SBZP | BADDR | SBZP
  • ASID - Address Space Identifier
  • BADDR - Bits [39:x] of base address of table. Must be 2^x aligned.

TTBR1

See #TTBR0, except x is fixed to 14 when EAE=0.
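
For the short format (EAE=0), a TTBR value is the aligned table base combined with the low attribute bits. The following is a minimal sketch in C, assuming the Multiprocessor extensions are present so that bits 6 and 0 form IRGN; all names are hypothetical. The parameter x is 14-N for TTBR0 and 14 for TTBR1.

```c
#include <stdint.h>

/* Attribute bits of a short format TTBR, from the EAE=0 layout above. */
#define TTBR_IRGN0   (1u << 6)                      /* IRGN[0] */
#define TTBR_NOS     (1u << 5)
#define TTBR_RGN(x)  (((uint32_t)(x) & 0x3u) << 3)  /* outer cacheability */
#define TTBR_S       (1u << 1)
#define TTBR_IRGN1   (1u << 0)                      /* C, or IRGN[1] */

/* Build a short format TTBR value.
 * base  - physical address of the first-level table
 * x     - alignment exponent: 14-N for TTBR0, 14 for TTBR1
 * attrs - OR of the TTBR_* macros above
 * Returns 1 and stores the value in *out, or 0 if x is out of range
 * or base is not 2^x aligned. */
static inline int ttbr_short_make(uint32_t base, unsigned x,
                                  uint32_t attrs, uint32_t *out)
{
    if (x < 7 || x > 14)
        return 0;
    if (base & ((1u << x) - 1u))
        return 0;
    *out = base | (attrs & 0x7Fu);
    return 1;
}
```

Rejecting a misaligned base up front avoids silently writing address bits into the SBZP and attribute fields.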

Accessing the TTBRx register

To access TTBR0 in an implementation that does not include the Large Physical Address Extension, or bits[31:0] of TTBR0 in an implementation that includes the Large Physical Address Extension, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 0. For example:

MRC p15, 0, <Rt>, c2, c0, 0 ; Read 32-bit TTBR0 into Rt
MCR p15, 0, <Rt>, c2, c0, 0 ; Write Rt to 32-bit TTBR0

In an implementation that includes the Large Physical Address Extension, to access all 64 bits of TTBR0, software performs a 64-bit read or write of the CP15 registers with <CRm> set to c2 and <opc1> set to 0. For example:

MRRC p15, 0, <Rt>, <Rt2>, c2 ; Read 64-bit TTBR0 into Rt (low word) and Rt2 (high word)
MCRR p15, 0, <Rt>, <Rt2>, c2 ; Write Rt (low word) and Rt2 (high word) to 64-bit TTBR0

In these MRRC and MCRR instructions, Rt holds the least-significant word of TTBR0, and Rt2 holds the most-significant word.

To access TTBR1 in an implementation that does not include the Large Physical Address Extension, or bits[31:0] of TTBR1 in an implementation that includes the Large Physical Address Extension, software reads or writes the CP15 registers with <opc1> set to 0, <CRn> set to c2, <CRm> set to c0, and <opc2> set to 1. For example:

MRC p15, 0, <Rt>, c2, c0, 1 ; Read 32-bit TTBR1 into Rt
MCR p15, 0, <Rt>, c2, c0, 1 ; Write Rt to 32-bit TTBR1

In an implementation that includes the Large Physical Address Extension, to access all 64 bits of TTBR1, software performs a 64-bit read or write of the CP15 registers with <CRm> set to c2 and <opc1> set to 1. For example:

MRRC p15, 1, <Rt>, <Rt2>, c2 ; Read 64-bit TTBR1 into Rt (low word) and Rt2 (high word)
MCRR p15, 1, <Rt>, <Rt2>, c2 ; Write Rt (low word) and Rt2 (high word) to 64-bit TTBR1

In these MRRC and MCRR instructions, Rt holds the least-significant word of TTBR1, and Rt2 holds the most-significant word.
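
With EAE=1, the 64-bit TTBR value has to be split into the two words passed to MCRR. The following is a minimal sketch in C of composing the value from a table base and an ASID and then splitting it; the function names are hypothetical, and the caller is assumed to have aligned the base address as required.

```c
#include <stdint.h>

/* Compose a long format TTBR: BADDR in bits [39:x], ASID in bits [55:48].
 * The caller is assumed to pass a suitably aligned base address. */
static inline uint64_t ttbr_long_make(uint64_t baddr, uint8_t asid)
{
    return (baddr & 0x000000FFFFFFFFFFull) | ((uint64_t)asid << 48);
}

/* Word for Rt: the least-significant 32 bits. */
static inline uint32_t ttbr_low_word(uint64_t ttbr)
{
    return (uint32_t)ttbr;
}

/* Word for Rt2: the most-significant 32 bits. */
static inline uint32_t ttbr_high_word(uint64_t ttbr)
{
    return (uint32_t)(ttbr >> 32);
}
```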

Page Tables

Now that we know how to set up the control registers, we need to know the in-memory structures. ARM is more flexible than x86 in terms of page size. While x86 supports 4KB and 4MB pages, ARM supports 4KB, 64KB and 1MB pages. It can also support 16MB pages, but this is optional (guaranteed [...]). The first level translation table is 16KB in size when N = 0. In general, the table used with TTBR0 is 2^(14-N) bytes in size.

Short Descriptor: Level 1

Invalid:

31-2 | 1 | 0
Ignored | 0 | 0

Page Table:

31-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0
BADDR | IMP | Domain | SBZ | NS | PXN | 0 | 1
  • BADDR - Page table base address, bits [31:10]. Must be 1KB aligned.
  • IMP - Implementation defined
  • Domain - used as a memory protection mechanism. 16 possible domains.
  • SBZ - Should be Zero
  • NS - Non-secure bit. This is used by the security extensions
  • PXN - Privileged Execute Never - SBZ if PXN is not supported

Section:

31-20 | 19 | 18 | 17 | 16 | 15 | 14-12 | 11-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0
BADDR | NS | 0 | nG | S | AP[2] | TEX[2:0] | AP[1:0] | IMP | Domain | XN | C | B | 1 | PXN
  • BADDR - Section Base Address [31:20]. Must be 1MB aligned.
  • XN - Execute never. Stops execution of page.
  • nG - not global. Determines how this is marked in the TLB.

Supersection:

31-24 | 23-20 | 19 | 18 | 17 | 16 | 15 | 14-12 | 11-10 | 9 | 8-5 | 4 | 3 | 2 | 1 | 0
BADDR | BADDRM | NS | 1 | nG | S | AP[2] | TEX[2:0] | AP[1:0] | IMP | BADDRH | XN | C | B | 1 | PXN
  • BADDR - Supersection Base address [31:24]. 16MB aligned.
  • BADDRM - Supersection base address [35:32], if supported.
  • BADDRH - Supersection base address [39:36], if supported.
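
The first-level layouts above translate directly into shift-and-mask code. The following is a minimal sketch in C that encodes a page table descriptor and a section descriptor; the macro and function names are hypothetical, and the three-bit AP value is split across AP[2] and AP[1:0] as the section layout shows.

```c
#include <stdint.h>

/* Section attribute bits, from the Section layout above. */
#define L1_PXN        (1u << 0)
#define L1_B          (1u << 2)
#define L1_C          (1u << 3)
#define L1_XN         (1u << 4)
#define L1_DOMAIN(d)  (((uint32_t)(d) & 0xFu) << 5)
#define L1_TEX(t)     (((uint32_t)(t) & 0x7u) << 12)
#define L1_S          (1u << 16)
#define L1_NG         (1u << 17)
#define L1_NS         (1u << 19)
/* AP[2:0]: bits [1:0] go to descriptor bits [11:10], bit 2 goes to bit 15. */
#define L1_AP(ap)     ((((uint32_t)(ap) & 0x3u) << 10) | \
                       ((((uint32_t)(ap) >> 2) & 0x1u) << 15))

/* Page table descriptor: bits [1:0] = 0b01, base address in [31:10]. */
static inline uint32_t l1_page_table(uint32_t l2_table_pa, unsigned domain)
{
    return (l2_table_pa & 0xFFFFFC00u) | L1_DOMAIN(domain) | 0x1u;
}

/* Section descriptor: bit 1 = 1, bit 18 = 0, base address in [31:20].
 * The mask drops bit 18, bit 1 and any stray address bits from attrs. */
static inline uint32_t l1_section(uint32_t pa, uint32_t attrs)
{
    return (pa & 0xFFF00000u) | (attrs & 0x000BFFFDu) | 0x2u;
}
```

For example, l1_section(pa, L1_AP(3) | L1_C | L1_B | L1_S) describes a shareable 1MB mapping with C and B set and AP = 0b011.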

Short Descriptor: Level 2

Invalid:

31-2 | 1 | 0
Ignored | 0 | 0

Large Page:

31-16 | 15 | 14-12 | 11 | 10 | 9 | 8-6 | 5-4 | 3 | 2 | 1 | 0
BADDR | XN | TEX[2:0] | nG | S | AP[2] | SBZ | AP[1:0] | C | B | 0 | 1
  • BADDR - Large Page Base Address [31:16]

Small page:

31-12 | 11 | 10 | 9 | 8-6 | 5-4 | 3 | 2 | 1 | 0
BADDR | nG | S | AP[2] | TEX[2:0] | AP[1:0] | C | B | 1 | XN
  • BADDR - Small Page Base Address [31:12]
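
A two-level walk with small pages splits the virtual address into a first-level index, a second-level index and a page offset. The following is a minimal sketch in C, assuming N = 0 so that the first-level index is VA[31:20]; the second-level index is VA[19:12], which matches the 1KB (256 entry) second-level table. All names are hypothetical.

```c
#include <stdint.h>

/* Index into the first-level table (N = 0): VA[31:20]. */
static inline unsigned l1_index(uint32_t va)
{
    return va >> 20;
}

/* Index into a second-level table: VA[19:12]. */
static inline unsigned l2_index(uint32_t va)
{
    return (va >> 12) & 0xFFu;
}

/* Small page attribute bits, from the Small page layout above. */
#define L2_XN       (1u << 0)
#define L2_B        (1u << 2)
#define L2_C        (1u << 3)
#define L2_TEX(t)   (((uint32_t)(t) & 0x7u) << 6)
#define L2_S        (1u << 10)
#define L2_NG       (1u << 11)
/* AP[2:0]: bits [1:0] go to descriptor bits [5:4], bit 2 goes to bit 9. */
#define L2_AP(ap)   ((((uint32_t)(ap) & 0x3u) << 4) | \
                     ((((uint32_t)(ap) >> 2) & 0x1u) << 9))

/* Small page descriptor: bit 1 = 1, base address in [31:12]. */
static inline uint32_t l2_small_page(uint32_t pa, uint32_t attrs)
{
    return (pa & 0xFFFFF000u) | (attrs & 0xFFDu) | 0x2u;
}
```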

Long Descriptor: Level 1/2

Invalid:

63-1 | 0
Ignored | 0

Block:

63-52 | 51-40 | 39-n | (n-1)-12 | 11-2 | 1 | 0
UBAT | SBZP | ADDR | SBZP | LBAT | 0 | 1
  • UBAT - Upper Block Attributes
  • ADDR - Output Address [39:n]
  • LBAT - Lower Block Attributes
  • n - 30 for first level, 21 for 2nd level.

Table:

63 | 62-61 | 60 | 59 | 58-52 | 51-40 | 39-12 | 11-2 | 1 | 0
NST | APT | XNT | PXNT | Ignored | SBZP | ADDR | Ignored | 1 | 1

Stage 1 attributes, SBZP at stage 2:

  • NST - NSTable. For secure memory accesses, determines type of next level. Otherwise ignored.
  • APT - APTable. Access permissions limit for next level lookup.
  • XNT - XNTable. XN limit for subsequent lookups.
  • PXNT - PXNTable. PXN limit for subsequent levels. SBZ for non-secure PL2 (hypervisor) level 1 translation tables.

Any stage:

  • ADDR - Next level table address [39:12].

Long Descriptor: Level 3

Invalid:

63-1 | 0
Ignored | 0

Reserved:

63-2 | 1 | 0
SBZP | 0 | 1

Page:

63-52 | 51-40 | 39-12 | 11-2 | 1 | 0
UPAT | SBZP | ADDR | LPAT | 1 | 1
  • UPAT - Upper Page Attributes
  • ADDR - Output Address [39:12]
  • LPAT - Lower Page Attributes

Stage 1 Attributes

Upper:

63-59 | 58-55 | 54 | 53 | 52
Ignored | Ignored | XN | PXN | CONT
  • CONT - Contiguous. 16 adjacent entries point to contiguous memory regions.

Lower:

11 | 10 | 9-8 | 7-6 | 5 | 4-2
nG | AF | SH[1:0] | AP[2:1] | NS | AttrIndex[2:0]
  • AttrIndex - memory attributes index field
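
The long format block and page layouts, together with the stage 1 attribute fields above, can be encoded as follows. This is a minimal sketch in C with hypothetical names; the AttrIndex value only selects an entry in the memory attribute registers, whose contents are assumed to have been programmed elsewhere.

```c
#include <stdint.h>

/* Stage 1 lower attributes, bits [11:2]. */
#define LD_ATTRINDX(i)  (((uint64_t)(i) & 0x7u) << 2)
#define LD_NS           (1ull << 5)
#define LD_AP(ap21)     (((uint64_t)(ap21) & 0x3u) << 6)   /* AP[2:1] */
#define LD_SH(sh)       (((uint64_t)(sh) & 0x3u) << 8)
#define LD_AF           (1ull << 10)
#define LD_NG           (1ull << 11)
/* Stage 1 upper attributes, bits [54:52]. */
#define LD_CONT         (1ull << 52)
#define LD_PXN          (1ull << 53)
#define LD_XN           (1ull << 54)

#define LD_ATTR_MASK    0x0070000000000FFCull  /* bits [54:52] and [11:2] */

/* Level 3 page descriptor: bits [1:0] = 0b11, output address in [39:12]. */
static inline uint64_t ld_page(uint64_t pa, uint64_t attrs)
{
    return (pa & 0x000000FFFFFFF000ull) | (attrs & LD_ATTR_MASK) | 0x3ull;
}

/* Level 1 or 2 block descriptor: bits [1:0] = 0b01, output address in
 * [39:n] with n = 30 at level 1 and n = 21 at level 2. */
static inline uint64_t ld_block(uint64_t pa, unsigned level, uint64_t attrs)
{
    unsigned n = (level == 1) ? 30 : 21;
    uint64_t addr_mask = ((1ull << 40) - 1) & ~((1ull << n) - 1);
    return (pa & addr_mask) | (attrs & LD_ATTR_MASK) | 0x1ull;
}
```

Setting LD_AF when the entry is created avoids an Access flag fault on first use, if software-managed access flags are not wanted.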

Stage 2 Attributes

Upper:

63-59 | 58-55 | 54 | 53 | 52
Ignored (system MMU) | Ignored (software) | XN | 0 | CONT

Lower:

11 | 10 | 9-8 | 7-6 | 5-2
0 | AF | SH[1:0] | HAP[2:1] | MemAttr[3:0]
  • HAP - Stage 2 Access Permissions
  • MemAttr - Stage 2 memory attributes

Choice Between TTBR0 and TTBR1

Short format

TTBCR.N | First address translated with TTBR1 | TTBR0 table size | TTBR0 table index range
0b000 | TTBR1 not used | 16KB | VA[31:20]
0b001 | 0x80000000 | 8KB | VA[30:20]
0b010 | 0x40000000 | 4KB | VA[29:20]
0b011 | 0x20000000 | 2KB | VA[28:20]
0b100 | 0x10000000 | 1KB | VA[27:20]
0b101 | 0x08000000 | 512 bytes | VA[26:20]
0b110 | 0x04000000 | 256 bytes | VA[25:20]
0b111 | 0x02000000 | 128 bytes | VA[24:20]
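
The table above reduces to a simple rule: with N nonzero, TTBR1 is used whenever any of the top N bits of the virtual address is set. The following is a minimal sketch in C with hypothetical function names.

```c
#include <stdint.h>

/* Returns 1 if the short format walk for va uses TTBR1, 0 if it uses TTBR0.
 * n is TTBCR.N (0 to 7). */
static inline int short_uses_ttbr1(uint32_t va, unsigned n)
{
    if (n == 0)
        return 0;                  /* TTBR1 not used */
    return (va >> (32 - n)) != 0;  /* any of VA[31:32-N] set */
}

/* Size in bytes of the first-level table addressed by TTBR0: 2^(14-N). */
static inline uint32_t ttbr0_table_bytes(unsigned n)
{
    return 1u << (14 - n);
}
```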

Long format

TTBCR.T0SZ | TTBCR.T1SZ | TTBR0 input address range | TTBR1 input address range
0b000 | 0b000 | All addresses | Not used
M | 0b000 | 0 to 2^(32-M)-1 | 2^(32-M) to maximum address
0b000 | N | 0 to 2^32-2^(32-N)-1 | 2^32-2^(32-N) to maximum address
M | N | 0 to 2^(32-M)-1 | 2^32-2^(32-N) to maximum address
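
The four rows of the long format table can be folded into one function. The following is a minimal sketch in C with hypothetical names. One point goes beyond the table: when both T0SZ and T1SZ are nonzero, addresses that fall between the two ranges are treated here as generating a translation fault.

```c
#include <stdint.h>

enum ttbr_choice { USE_TTBR0, USE_TTBR1, TRANSLATION_FAULT };

/* Select the translation table base register for a 32-bit input address
 * in the long format. t0sz and t1sz are the 3-bit TTBCR fields. */
static inline enum ttbr_choice long_select_ttbr(uint32_t va,
                                                unsigned t0sz, unsigned t1sz)
{
    /* Exclusive upper bound of the TTBR0 region: 2^(32-T0SZ). */
    uint64_t ttbr0_limit = 1ull << (32 - t0sz);
    uint64_t ttbr1_start;

    if (t1sz == 0)
        ttbr1_start = ttbr0_limit;  /* TTBR1 takes whatever TTBR0 leaves */
    else
        ttbr1_start = (1ull << 32) - (1ull << (32 - t1sz));

    if (va >= ttbr1_start)
        return USE_TTBR1;
    if (va < ttbr0_limit)
        return USE_TTBR0;
    return TRANSLATION_FAULT;       /* gap between the two regions */
}
```

With T0SZ = T1SZ = 0 both bounds are 2^32, so every address selects TTBR0, matching the first row of the table.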

Recursive Table Mapping

If you wish to use a recursive mapping trick, such as is done on x86, use the long format. The short format might be doable, but with much hassle. The long format has been designed with recursive mapping in mind, supports the full physical address space, and does not suffer from the mismatch between page size and table size that the short format does. However, the basic structure is similar to PAE, so if you did this naively, 1GB of the address space would be wasted. Instead, use the support for eliminating the first level by concatenating the second-level tables. Then point the last few entries to the appropriate points in the table. This reaps all the benefits of the long format, while eliminating the bloat of the recursive mapping.

External References