User:Dbstream/Booting on x86

From OSDev Wiki

What this article is not

This article is not about writing a bootloader. It is also not a tutorial, so don't treat it like one. This article is about writing an operating system kernel that boots, more specifically writing a kernel for a 64-bit x86 machine that boots using an already existing boot protocol such as multiboot2.

This article is not a programming tutorial. It is not an x86 assembler tutorial. Some familiarity with the GNU as syntax is assumed.

There is a much more modern alternative to multiboot2 available, the Limine boot protocol. This article nevertheless focuses mainly on multiboot2, because that's where most mistakes will be made. A lot of the techniques in this article can be applied in other areas and for other boot protocols as well, but you should carefully consider whether or not there are other alternatives available. For example, I wouldn't do any unnecessary operations involving page tables in an assembler entry stub if I were writing a Limine kernel, and you cannot load a Limine kernel in lower-half memory.

Relocatable kernel image

It is trivial to write a multiboot2 kernel that can only be loaded in one place, but much more awesome to have a fully relocatable kernel image that can be loaded anywhere in the lower 4GiB of RAM. GRUB2 probably supports relocating an ELF64 image, but do we trust GRUB2? Personally, I have my doubts, not to speak of how multiboot2 enters the kernel in 32-bit mode.

This article assumes that you are writing a lower-half 64-bit kernel, which is very effective for a prekernel (which can be thought of as something in-between the bootloader and your "real" kernel, as it extracts the "real" kernel and constructs page tables that map it from the higher-half).

All C code is assumed to be compiled with -mcmodel=large -fPIE, and the kernel is expected to be linked with -pie.

Linker script

It is a good idea to link the kernel into an ELF64 image and then objcopy it into a binary. This can be done using the following Makefile fragment (please adapt to your own build system):

kernel: kernel.elf
        x86_64-elf-objcopy -O binary kernel.elf kernel

# note how 'kernel.lds' is explicitly listed as the first dependency
# of 'kernel.elf', so it will become the argument to '-T'.
kernel.elf: kernel.lds $(kernel-objs)
        x86_64-elf-ld $(LDFLAGS) -T $^ -o $@

The linker script will link the kernel at virtual address 0, for simplicity. This conflicts with lots of things already in memory at that location, but the kernel image is relocatable and will actually be loaded at some other place in the lower 4GiB of RAM.

The multiboot2 header needs to go very early in the kernel image, so why not place it first?

OUTPUT_FORMAT("elf64-x86-64")
OUTPUT_ARCH(i386:x86-64)
ENTRY(_start)

PHDRS
{
        /* not that these permissions actually matter, but it is good practice */
        head    PT_LOAD         FLAGS(7); /* RWX */
        text    PT_LOAD         FLAGS(5); /* R-X */
        rodata  PT_LOAD         FLAGS(4); /* R-- */
        data    PT_LOAD         FLAGS(6); /* RW- */
        dynamic PT_DYNAMIC      FLAGS(6); /* RW- */
}

SECTIONS
{
        . = 0;
        .head           : { *(.head) }                  :head
        .text           : { *(.text .text.*) }          :text
        .rodata         : { *(.rodata .rodata.*) }      :rodata
        .data           : { *(.data .data.*) }          :data
        .dynamic        : { *(.dynamic) }               :data :dynamic
        .bss            : { *(COMMON .bss .bss.*) }     :data
        __image_end = .;

        /DISCARD/ :
        {
                *(.comment)
                *(.eh_frame)
                *(.note .note.*)
        }
}

Header and 32-bit entry stub

We need to write the 32-bit entry stub in assembler, because it needs to be able to deal with being relocated. We also need a multiboot2 header. Because we're emitting a flat binary, we need the load address tag, entry point tag, and relocatable image tag. An example of how the multiboot2 header can look is shown below:

        .file "head.S"

        /*
         * Sometimes, we want to reference a symbol without generating a
         * relocation entry for it. Use this macro for that. It yields the
         * offset of a symbol from image_start, which equals its link-time
         * address because the linker script places image_start at address 0.
         */
#define linktime_va(x) (x - image_start)

        .section ".head", "awx"
        .globl image_start
image_start:

mb2_header:
        .long 0xe85250d6                                        /* MB2_MAGIC */
        .long 0                                                 /* MB2_ARCH_X86 */
        .long mb2_header_end - mb2_header                       /* header length */
        .long -(0xe85250d6 + 0 + (mb2_header_end - mb2_header)) /* checksum */
tag_info_req:
        .word 1           /* MB2_TAG_INFO_REQ */
        .word 0           /* flags */
        .long 8           /* size */
tag_load_addr:
        .word 2           /* MB2_TAG_LOAD_ADDR */
        .word 0           /* flags */
        .long 24          /* size */
        .long linktime_va(mb2_header)      /* header_addr */
        .long 0xffffffff                   /* load_addr (-1 means load from the beginning of the file) */
        .long 0                            /* load_end_addr (0 means load the entire file) */
        .long linktime_va(__image_end)     /* bss_end_addr */
tag_entry_addr:
        .word 3           /* MB2_TAG_ENTRY_ADDR */
        .word 0           /* flags */
        .long 12          /* size */
        .long linktime_va(_start)    /* entry_addr */
        .long 0           /* padding (for 8-byte alignment of tags) */
tag_relocatable:
        .word 10          /* MB2_TAG_RELOCATABLE */
        .word 0           /* flags */
        .long 24          /* size */
        .long 0           /* min_addr */
        .long 0xffffffff  /* max_addr */
        .long 0x1000      /* align */
        .long 0           /* preference (0 means "place me wherever") */
tag_end:
        .word 0           /* MB2_TAG_END */
        .word 0           /* flags */
        .long 8
mb2_header_end:
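
The checksum field is easy to get wrong, so it can help to check the arithmetic separately. The following is a minimal C sketch of the rule the header relies on: the magic, architecture, header length and checksum fields must sum to zero modulo 2^32. The function names are hypothetical and not part of any boot protocol library.

```c
#include <stdint.h>

/* Field values copied from the header above. */
#define MB2_MAGIC    0xe85250d6u
#define MB2_ARCH_X86 0u

/*
 * Compute the checksum field for a header that is 'header_length' bytes
 * long. Unsigned arithmetic wraps modulo 2^32, which is the behaviour
 * the checksum rule needs.
 */
uint32_t mb2_checksum(uint32_t header_length)
{
        return (uint32_t)(0u - (MB2_MAGIC + MB2_ARCH_X86 + header_length));
}

/* Return 1 if the four leading header fields sum to zero modulo 2^32. */
int mb2_header_is_valid(const uint32_t header[4])
{
        return (uint32_t)(header[0] + header[1] + header[2] + header[3]) == 0;
}
```

The header shown above is 96 bytes long (16 bytes of fixed fields plus tags of 8, 24, 16, 24 and 8 bytes), so mb2_checksum(96) yields 0x17adaeca, which is the value the assembler expression in the checksum field should produce.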

When we enter the kernel, we need to figure out the load offset:

        /*
         * 32-bit entry point into the kernel.
         *
         * We are in a very fragile environment. We are in 32-bit mode with
         * paging disabled, and we have been relocated to somewhere in memory
         * that we don't know. Luckily, we do know that %ebx points to the boot
         * information structure, so we can use it as scratch space for a stack,
         * letting us figure out where we are.
         */
 
        .code32

        .balign 16
        .globl _start
        .type _start, @function
_start:
        leal 8(%ebx), %esp          /* the push below lands in the 'reserved' field at 4(%ebx) */
        movl 4(%ebx), %eax          /* save it, in case its meaning changes in the future */
        call 1f                     /* the assembler will emit a 'near call', a relative jump */
1:      popl %ebp                   /* %ebp now holds the runtime address of this instruction */
        subl $linktime_va(1b), %ebp /* calculate the load offset */
        movl %eax, 4(%ebx)          /* restore 'reserved' */

        /* %ebp now holds the relocation offset, and we can do stuff with it. */
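
The stub above depends on the layout of the multiboot2 boot information structure: total_size sits at offset 0, reserved at offset 4, and the first tag at offset 8. As a rough illustration, here is that layout in C, together with a tag walker that later (64-bit) code might use. The struct and function names are hypothetical. Only the field layout and the 8-byte tag alignment come from the multiboot2 specification.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed part of the boot information structure that %ebx points to. */
struct mb2_info {
        uint32_t total_size;    /* offset 0: size of the whole structure */
        uint32_t reserved;      /* offset 4: the slot _start borrows as a stack */
};

/* Every tag starts with this header; tags are padded to 8 bytes. */
struct mb2_tag {
        uint32_t type;
        uint32_t size;
};

/* Return the first tag of the given type, or NULL if there is none. */
const struct mb2_tag *mb2_find_tag(const void *info, uint32_t type)
{
        const uint8_t *base = info;
        uint32_t total = ((const struct mb2_info *)info)->total_size;
        uint32_t off = sizeof(struct mb2_info);

        while (off + sizeof(struct mb2_tag) <= total) {
                const struct mb2_tag *tag = (const struct mb2_tag *)(base + off);

                if (tag->type == type)
                        return tag;
                if (tag->type == 0 || tag->size < sizeof(*tag))
                        break;                  /* end tag or malformed tag */
                off += (tag->size + 7) & ~7u;   /* round up to 8 bytes */
        }
        return NULL;
}
```

Because the call instruction pushes exactly four bytes below 8(%ebx), only the reserved field is touched and the tags themselves stay intact.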

Now that we have a load offset, our code will follow a generic pattern of fixing up pointers and loading stuff. Where we'd normally use movl $label, %eax, we instead need to use leal linktime_va(label)(%ebp), %eax. In simplified terms, we need to:

  • Load a Global Descriptor Table, with entries for 64-bit ring-zero.
  • Set up page tables.
  • Enter long mode.
  • Relocate ourselves using the _DYNAMIC table.
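
The last step in the list can be sketched in C. The sketch assumes that the image is linked at address 0 (as the linker script does), that the linker keeps the relocation table inside the loaded image, and that a -pie link of such a kernel only produces R_X86_64_RELATIVE entries. The struct and function names are hypothetical; the numeric constants are the standard ELF values.

```c
#include <stdint.h>

/* Minimal ELF64 definitions (values from the ELF gABI and x86-64 psABI). */
#define DT_NULL           0
#define DT_RELA           7
#define DT_RELASZ         8
#define DT_RELAENT        9
#define R_X86_64_RELATIVE 8

struct elf64_dyn  { int64_t d_tag; uint64_t d_val; };
struct elf64_rela { uint64_t r_offset; uint64_t r_info; int64_t r_addend; };

/*
 * Apply all R_X86_64_RELATIVE relocations of an image linked at address 0
 * and loaded at 'load_base'. 'dyn' is the runtime address of _DYNAMIC.
 * Returns 0 on success, -1 on a malformed table or unsupported type.
 */
int relocate_self(uintptr_t load_base, const struct elf64_dyn *dyn)
{
        uint64_t rela = 0, relasz = 0, relaent = sizeof(struct elf64_rela);

        for (; dyn->d_tag != DT_NULL; dyn++) {
                switch (dyn->d_tag) {
                case DT_RELA:    rela = dyn->d_val;    break;
                case DT_RELASZ:  relasz = dyn->d_val;  break;
                case DT_RELAENT: relaent = dyn->d_val; break;
                }
        }
        if (relaent < sizeof(struct elf64_rela))
                return -1;

        for (uint64_t off = 0; off + relaent <= relasz; off += relaent) {
                const struct elf64_rela *r = (const struct elf64_rela *)
                        (uintptr_t)(load_base + rela + off);

                if ((r->r_info & 0xffffffff) != R_X86_64_RELATIVE)
                        return -1;
                /* A relative relocation stores load base plus addend. */
                *(uint64_t *)(uintptr_t)(load_base + r->r_offset) =
                        load_base + (uint64_t)r->r_addend;
        }
        return 0;
}
```

Code like this has to run before anything dereferences a relocated pointer, so it should avoid global pointers, string literals reached through pointer tables, and similar constructs. The runtime address of _DYNAMIC would be computed from the load offset in the same way as every other address in the stub.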

After determining the relocation offset, we can relocate the hardware data structures, like the aforementioned GDT and page tables:

        leal linktime_va(boot_stack_top)(%ebp), %esp      /* set up a small stack, boot_stack_top
                                                             should point to the top of some free
                                                             range of memory.                     */

        subl $8, %esp                        /* allocate space for a 32-bit segment_ptr on the stack */
        movw $gdt_end - gdt - 1, (%esp)      /* segment_ptr::limit */
        leal linktime_va(gdt)(%ebp), %eax
        movl %eax, 2(%esp)                   /* segment_ptr::base */
        lgdt (%esp)
        addl $8, %esp

        /* later in the file */
        .balign 16
gdt:
        .quad 0                  /* offset 0x00: null segment */
        .quad 0x00009a000000ffff /* offset 0x08: 16-bit kernel CS (we don't use this) */
        .quad 0x000093000000ffff /* offset 0x10: 16-bit kernel DS (we don't use this) */
        .quad 0x00cf9a000000ffff /* offset 0x18: 32-bit kernel CS (we don't use this) */
        .quad 0x00cf93000000ffff /* offset 0x20: 32-bit kernel DS (we don't use this) */
        .quad 0x00af9b000000ffff /* offset 0x28: 64-bit kernel CS */
        .quad 0x00af93000000ffff /* offset 0x30: 64-bit kernel DS */
gdt_end:

        .balign 16
boot_stack:
        .fill 4096, 1, 0         /* this allocates space for a 4KiB stack */
boot_stack_top:                  /* top of the stack */

We need to have a GDT with 64-bit kernel entries so that we can switch to long mode later.
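
Long mode also requires paging to be enabled, so page tables must exist before the switch. The following C sketch shows one possible layout, assuming an identity map of the low 4GiB built from 2MiB pages. This matches a kernel loaded somewhere below 4GiB, but it is an assumption rather than something this article prescribes. All names are hypothetical. In the entry stub itself the same stores would have to be written in 32-bit assembler, with the table addresses computed from the load offset.

```c
#include <stdint.h>

#define PTE_PRESENT  0x001ull
#define PTE_WRITABLE 0x002ull
#define PTE_HUGE     0x080ull   /* PS bit: this PD entry maps a 2MiB page */

/* Six contiguous 4KiB tables: one PML4, one PDPT, four page directories. */
struct boot_page_tables {
        uint64_t pml4[512];
        uint64_t pdpt[512];
        uint64_t pd[4][512];
};

/*
 * Fill 't' so that it identity-maps the low 4GiB with 2MiB pages.
 * 'phys' is the physical (runtime) address of 't' and must be 4KiB
 * aligned. The tables are assumed to be zero-initialised by the caller.
 */
void build_identity_map(struct boot_page_tables *t, uint64_t phys)
{
        const uint64_t rw = PTE_PRESENT | PTE_WRITABLE;
        uint64_t pdpt_phys = phys + sizeof(t->pml4);
        uint64_t pd_phys = pdpt_phys + sizeof(t->pdpt);

        t->pml4[0] = pdpt_phys | rw;
        for (unsigned i = 0; i < 4; i++) {
                /* Each PDPT entry covers 1GiB via one page directory. */
                t->pdpt[i] = (pd_phys + i * sizeof(t->pd[0])) | rw;
                for (unsigned j = 0; j < 512; j++)
                        t->pd[i][j] = ((uint64_t)(i * 512 + j) << 21)
                                      | rw | PTE_HUGE;
        }
}
```

The physical address is passed separately from the pointer because every entry must contain the address the tables have at runtime, not the address they were linked at. CR3 would then be loaded with the same 'phys' value.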

TODO: finish writing this article.