User:Mduft/HigherHalf Kernel with 32-bit Paging

From OSDev Wiki

This page is under construction! This page or section is a work in progress and may thus be incomplete. Its content may be changed in the near future.


Hi, and welcome to this small tutorial. It is assembled from code that I wrote for my own hobby OS. As a prerequisite, I recommend that you familiarize yourself with the AT&T x86 assembler syntax used by the GNU assembler (which I will be using here). I will try to explain everything I do in great detail, but still, it's not trivial :)

I also suggest you have the Intel processor documentation at hand, especially Volume 3A, which explains 32-bit paging and all the related structures.


As a side note, I want to use this to advertise qemu as a development utility (no, I'm not affiliated with them ;)). I use it a lot, and it is really, really useful. For example, you can inspect the current page tables of your kernel right after loading CR3, which proves very useful when writing this code. Also, having a symbolic debugger (although "symbolic" is not that useful in assembler ;p) is the bare minimum you should be equipped with when going down the kernel development route...

Additionally, you don't have to worry about boot loaders and such, as long as your kernel is multiboot compliant.

Boot code Structure

I will give a rough outline of the kernel (if you can call my 3-filer a kernel ;p) as i will explain it here:

  • boot.S: This is the main meat of the startup code. It will contain code to:
    • set up an initial stack.
    • set up the initial paging structures (the kernel PD and two PTs).
    • identity map the low 1MB and all of the kernel.
    • map all the kernel to the higher half
    • enable paging
    • call the C kernel
    • write "PANIC!" on the screen in nice white-on-red letters
      (this is why you read this, right?)
  • link.ld: This will contain the instructions for the linker.

Also, I plan to provide the original files I use in my kernel, so you can have a closer look at them. When reading through them, you will notice that I have (basic) C++ support in them. You can either ignore it, or use it to extend your own kernel in this direction.

(you can find boot.s here, and link.ld here)

Thinking about it

At first, i want you to think a little about what we will need, and what we want to accomplish.

We want a kernel that runs (not loads!) at a high address, possibly so high that the machine does not have enough physical memory for that address to exist. So we need virtual memory (or a segmentation trick, described in Higher Half With GDT). We will use paging to map our kernel to a high address after it has been loaded to a low address (which should always be available physically).

For this to work correctly, the code in our kernel needs to know about this. Most of the kernel will have to be linked to a high address, so we don't need to worry about this anymore, once we're over the bootstrap phase. The initial bootstrap code, in contrast, needs to be linked to a low address, or be completely position independent.

For the paging stuff, we will need at least a PD (Page Directory) for the kernel, and two PTs (Page Tables). Why two, you might ask. The reason is that we need one PT for the low addresses (the lower 1MB, and the kernel in the lower half), and one for the addresses in the higher half. If your kernel grows larger and larger, it might even require more PTs, and you will for sure need to deal with PDs and PTs a lot when it comes to processes, where probably each process will have its own PD and associated PTs.

For this tutorial i chose some very common addresses:

  • 0x100000 as the physical load address of the kernel. This is the location the bootloader will load our kernel to.
  • 0xC0000000 as the offset into the higher half.

Those two added together, are the address where the kernel will finally be running (so 0xC0100000). You will see the use of KERNEL_HIGH_VMA all through the code to take into account the offset to the higher half. This is, because the linker assigns higher half addresses to some of the symbols. Since our kernel is loaded to lower half and "thinks" it is running in higher half, we need to be careful when using the addresses of symbols assigned by the linker. During the initial bootstrap, i'm doing a lot of adjustments from higher to lower half, so things are accessible as long as we don't have paging set up.
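
The address arithmetic described above can be sketched in C. This is only an illustration: the helper names virt_to_phys and phys_to_virt are made up for this example and do not appear in the kernel; the two constants are the ones used throughout the tutorial.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the tutorial's linker script. */
#define KERNEL_BOOT_VMA 0x00100000u  /* physical load address */
#define KERNEL_HIGH_VMA 0xC0000000u  /* offset into the higher half */

/* Hypothetical helpers: convert between a linked (higher half) address
 * and the physical address the bootloader actually loaded it to. */
static inline uint32_t virt_to_phys(uint32_t vaddr) { return vaddr - KERNEL_HIGH_VMA; }
static inline uint32_t phys_to_virt(uint32_t paddr) { return paddr + KERNEL_HIGH_VMA; }

/* Load address plus offset is where the kernel finally runs. */
static_assert(KERNEL_BOOT_VMA + KERNEL_HIGH_VMA == 0xC0100000u,
              "the kernel runs at 0xC0100000");
```

The boot code performs the same subtraction by hand (sub $KERNEL_HIGH_VMA, ...) whenever it needs the physical address of a symbol that was linked to the higher half.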


Now for the real code. I will go through it and explain every section as thoroughly as possible.

Global symbols

We will have only one global symbol in this file, and that's the entry point of the kernel. So the file starts with the following line:

.global bootstrap_ia32

The required .data stuff

We will need some room in our kernel to do the things we want to do. However, we don't (yet) have the means to allocate things dynamically, and we don't want to worry too much, so we will simply reserve all the required space in the executable itself. This way, the linker will take care of the kernel's bounds, and the bootloader (or qemu in our case) will take care of loading, checking for enough room, etc.

Use the .data section

First, we'll tell the assembler, that we want to put the following stuff in the .data section of the output file:

.section .data

Make room for the PD and PTs

Then, we need room for the PD and PTs. I'll reserve some space for them like this:

# the kernel page directory, the kernel page table (used to map the kernel
# itself from physical 0x100000 to virtual 0xC0100000) and the lowest page
# table (for the identity mapping of the low 1MB and the kernel).

.align 0x1000
_kernel_pd:
   .space 0x1000, 0x00
_kernel_pt:
   .space 0x1000, 0x00
_kernel_low_pt:
   .space 0x1000, 0x00

The above tells the assembler to pad the current position in the output file until it is aligned to 4K (0x1000 = 4096). Then we reserve 4K of space three times, for the PD and the two PTs. These symbols will end up in the .data section of the kernel, and thus will be there as soon as the bootloader has loaded the kernel. No need to worry about allocating them from "somewhere".

Make room for the Stack

Another thing we will need from the very start is the stack. We can't defer its creation until we can allocate memory, so we need to put it in here too. Of course you can make it way smaller if you like; Linux uses 16K or 8K if you tell it to do so.

# a whopping 64K of initial stack space.
.set INITSTACKSIZE, 0x10000
initstack:
   .space INITSTACKSIZE, 0x00


The last thing in the .data section is the strings we will be using from within the assembler code. There are not many of them, just the "PANIC!" I promised above :)

_msg_panic:
   .asciz "PANIC!"

The actual code

... or not quite yet. At least we tell the assembler to put the following things in the .text section.

Use the .text Section

The following tells the assembler to use the .text section. Also, I tell it to produce 32-bit code, which is not really a necessity, but I like to make it clear.

.section .text
.code32

Now for the exception to the rule

... the Multiboot header. It is put in the .text section for a special reason. The Multiboot specification states that the header must appear longword-aligned somewhere in the first 8192 bytes of the kernel. As you will see later on in the linker script, the .text section of this file is the first thing in the kernel, and thus has to contain the header.

.set ALIGN,         1<<0             # align loaded modules on page boundaries
.set MEMINFO,       1<<1             # provide memory map
.set FLAGS,         ALIGN | MEMINFO  # this is the Multiboot 'flag' field
.set MAGIC,         0x1BADB002       # 'magic number' lets bootloader find the header
.set CHECKSUM,      -(MAGIC + FLAGS) # checksum required

.align 4

   .long MAGIC
   .long FLAGS
   .long CHECKSUM

The above first sets some "absolute" symbols. They have a fixed value, and can be used as constants. Then it sets the alignment of what follows to 4 (longword; remember? the header has to be aligned like this... most likely we could omit this, since it is the first thing in the file anyway, and the chance that we are misaligned here is very small to impossible. But still, better to be on the safe side).

A good thing to know is, that the ".set symbols" are present in the symbol table of the output file, but do not take up any space in the file at the current location. This means, for example, that the current alignment does not change because of their declaration. We're still at the very same position in the output file.
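
As a quick sanity check of the header values, here is a small C sketch of the checksum rule (the three header fields must add up to zero modulo 2^32). The MB_ prefixed names and the multiboot_header_ok function are assumptions made for this example; only the values come from the code above.

```c
#include <assert.h>
#include <stdint.h>

#define MB_MAGIC    0x1BADB002u
#define MB_FLAGS    ((1u << 0) | (1u << 1))              /* ALIGN | MEMINFO */
#define MB_CHECKSUM ((uint32_t)(0u - (MB_MAGIC + MB_FLAGS)))

/* Hypothetical check, similar in spirit to what a bootloader does
 * when it scans the start of the kernel image for the header. */
static inline int multiboot_header_ok(uint32_t magic, uint32_t flags,
                                      uint32_t checksum)
{
    return magic == MB_MAGIC && (uint32_t)(magic + flags + checksum) == 0u;
}

static_assert((uint32_t)(MB_MAGIC + MB_FLAGS + MB_CHECKSUM) == 0u,
              "magic + flags + checksum must wrap around to zero");
```

With the flags used in this tutorial, the checksum field works out to 0xE4524FFB.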

Required symbols for the screen buffer

We also need some helpers for handling the screen (clearing it, and then, the point of the whole tutorial: printing "PANIC!"):

.set VIDEO_RAM,     0xB8000          # Video Memory, used to print to the screen.
.set VIDEO_DWORDS,  0x3E8            # The count of DWORDs (!) the screen buffer is large.

The entry point

Now the entry point function, the one that will be visible from the outside world. We will start by setting up the initial stack. Because (as we all know, right?) the stack grows downwards in memory, we need to set the ESP register to point to the end of the stack space.

bootstrap_ia32:
   # setup the stack
   mov   $(initstack + INITSTACKSIZE), %esp

   # adjust address to be physical, stack is in data segment, which is linked to
   # the kernels virtual higher half address.
   subl  $KERNEL_HIGH_VMA, %esp

This puts the address of the initstack symbol, with the length of the stack added to it, into ESP, the stack pointer. What you do not know yet (and what I will explain in detail when we come to the linker script): the stack symbol (as it is in the .data section) is assigned a higher half address by the linker. We cannot use that address yet, because paging is still off and we are running in the lower half, so we subtract the offset into the higher half to get the physical address. Without the subtraction, the kernel might still run on a machine with enough memory for the higher half address to exist physically (however, I haven't tried this!).

Next, we'll want to setup some form of initial paging, which enables the kernel to run code off the higher half addresses, and use all the other things that will be linked there.

   # setup boot paging to map kernel to higher half
   call init_boot_paging_ia32

I will discuss the init_boot_paging_ia32 function in detail just a few lines further down. Let's first finish the entry point, as it is not so complex.

Imagine we have paging set up all right (the init_boot_paging_ia32 function did its job, whatever this means), and can now use the correct higher half addresses all over the place. The first thing we'll want to do is to check whether it really works, and relocate the stack from the lower to the higher half.

"Relocate" is maybe a bad word for what we're doing. Either way, we're accessing the same physical memory, because we mapped the higher half to the lower half physical addresses. In fact, we can use both addresses just fine. Also, if you, say, write a byte to 0x100000 (please don't ;)), it will immediately appear at 0xC0100000 too; it's the same memory location.

   # adjust stack registers to point to the now mapped stack as virtual address
   addl  $KERNEL_HIGH_VMA, %esp
   mov   %esp, %ebp

Now the stack is used from its higher half mapped address.

Since we have everything set up now, we can test the lower 1MB mapping very easily by trying to clear the screen, which will access the screen buffer at 0xB8000. If the mapping doesn't work, this will cause a page fault, then a double fault, and finally a triple fault :)

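   push  %eax                      # keep the bootloader's value, EAX is used below
   cld                             # make sure stosl moves upwards in memory
   mov $VIDEO_RAM, %edi
   mov $VIDEO_DWORDS, %ecx
   mov $0x07200720, %eax
   rep stosl
   pop   %eax                      # restore EAX for the C kernel parameters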

This fills the whole screen buffer with light-grey-on-black blanks, and thus clears it. (Each word is built like this: 0x0000 -> background (0 = black), 0x0700 -> foreground (7 = light grey), 0x0020 -> " " (blank).)
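
How such a screen word is put together can be sketched in C. The vga_cell helper and the VGA_COLS/VGA_ROWS names are invented for this illustration, and the 80x25 text mode is an assumption that matches the VIDEO_DWORDS value used above.

```c
#include <assert.h>
#include <stdint.h>

#define VGA_COLS 80u   /* assumed standard text mode: 80 columns */
#define VGA_ROWS 25u   /* ... and 25 rows */

/* Build one 16-bit text-mode cell: background colour in the top nibble,
 * foreground colour in bits 8-11, character code in the low byte.
 * (Depending on the video mode, bit 15 may mean "blink" instead.) */
static inline uint16_t vga_cell(uint8_t bg, uint8_t fg, char ch)
{
    return (uint16_t)(((bg & 0xFu) << 12) | ((fg & 0xFu) << 8) | (uint8_t)ch);
}

/* Two bytes per cell, four bytes per DWORD: 80 * 25 * 2 / 4 = 1000. */
static_assert(VGA_COLS * VGA_ROWS * 2u / 4u == 0x3E8u,
              "VIDEO_DWORDS covers exactly one 80x25 screen");
```

Two blank cells packed into one DWORD give the 0x07200720 pattern stored by rep stosl, and a white-on-red cell has the 0x4F00 attribute that the print function uses further down.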

The only thing left to do (except error handling), is to call the main C kernel from here. First we push the parameters passed by the Multiboot compliant bootloader on the stack.

   # push parameters to the entry point (grub parameters)
   push  %eax
   push  %ebx

   #call boot # uncomment this if you have a C kernel entry function at hand :)

My kernel's main C function happens to be named "boot", but that's of course up to you. Now we actually have all we need. Still, we want to make sure that if boot ever returns, the CPU does not continue to execute junk or code it shouldn't, so we will halt the CPU. Note: you'll probably want to halt all APs too, if you put SMP support in your kernel...

the_end:
   mov $_msg_panic, %eax
   call boot_print_msg
   # halt the cpu... the kernel stopped.
   cli
   hlt

I think it is pretty obvious: the above loads the address of the "PANIC!" message into EAX, calls the print function (I will show it to you in a second...), then disables interrupts and halts the CPU.
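
For reference, a C entry function matching the two pushes before the call could look roughly like the sketch below. It assumes the usual cdecl convention, where the value pushed last (EBX, the physical address of the Multiboot information structure) becomes the first parameter and EAX (the bootloader's magic value) the second. The parameter names and the boot_args_valid helper are made up for this example.

```c
#include <stdint.h>

/* Value a Multiboot (version 1) compliant loader leaves in EAX. */
#define MULTIBOOT_BOOTLOADER_MAGIC 0x2BADB002u

/* Hypothetical helper: were we really started by a Multiboot loader? */
static inline int boot_args_valid(uint32_t mb_info_phys, uint32_t magic)
{
    return magic == MULTIBOOT_BOOTLOADER_MAGIC && mb_info_phys != 0u;
}

/* Sketch of the C entry point called from boot.S. */
void boot(uint32_t mb_info_phys, uint32_t magic)
{
    if (!boot_args_valid(mb_info_phys, magic))
        return;  /* returning drops back into the PANIC! path in boot.S */

    /* ... the real kernel initialisation would go here ... */
}
```

Note that mb_info_phys is a physical address; it can only be dereferenced directly if it lies in a range that is mapped, for example the identity mapped low 1MB.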

The simple printing function

OK, before we start with the paging code, here is the printing function I use to write the "PANIC!" on the screen. I'll leave it to you to figure out how it works; I guess things like these are described a lot in the wiki :)

boot_print_msg:
   # eax: address of the string to print.
   push %edx
   push %ebx

   mov $VIDEO_RAM, %edx

   _print_loop:
       movb (%eax), %bl
       xorb %bh, %bh
       cmpb $0x0, %bl
       je _end_print
       orw $0x4F00, %bx
       movw %bx, (%edx)
       add $0x2, %edx
       inc %eax
       jmp _print_loop

   _end_print:
   pop %ebx
   pop %edx
   ret

Initialize Paging

The following is my init function for the paging. It makes the PD point to the correct PTs. It also calls another function to fill the PTs with the correct entries (it "maps" virtual to physical addresses). Let's start with the top of the function:

init_boot_paging_ia32:
   # save registers used here.
   push %eax
   push %ebx
   push %edx
   push %ecx

At the end of the function, we will restore the registers, so the code behaves nicely and does not overwrite registers set by the bootloader (remember: grub passes two parameters in EAX and EBX).

Getting the PD and PTs right

Now, to insert the PDEs in the PD, we need to have the correct physical addresses of the PD and PTs. Those symbols (_kernel_pd, _kernel_pt, _kernel_low_pt; we declared those above in the .data section, remember?) are assigned higher half addresses by the linker, as you will see later on, when i explain the linker script. Since the higher half addresses are merely an "alias" to the lower half addresses, we can simply subtract the offset into higher half, to get the actual physical load address of those symbols. This way, we can access them without paging enabled.

   mov  $_kernel_pd, %eax          # get virtual address of kernel pd
   sub  $KERNEL_HIGH_VMA, %eax     # adjust to physical address

EAX now contains the physical address of the kernel's PD. Now the same for the "low" PT, the one mapping the first 1MB of memory and all the physical kernel addresses, as the kernel is loaded right above 1MB. With this single PT we can map up to 4MB, which leaves room for an approximately 3MB kernel above the 1MB of low memory.

   mov  $_kernel_low_pt, %ebx      # get virtual address of kernel low pt
   sub  $KERNEL_HIGH_VMA, %ebx     # adjust to physical address

If we want to set a PDE containing this PT, we just need to set the present flag. (Note: i don't clear the lower 12 bits of the address intentionally; it is not required, as i can be sure, that the structures are correctly aligned the way i declared them in the .data section - just in case you wonder if i forgot it ;)).

   or   $0x1, %ebx                 # set present flag

EBX is now in the correct form for a PDE. The only thing left to do is to actually set the entry in the PD (whose address is loaded into EAX):

   mov  %ebx, (%eax)               # set the pde

This was easy. The next one is a little more tricky. The low PT has the advantage of being at index zero of the PD. The higher half PT for the kernel doesn't have this advantage; we will first have to calculate the offset into the PD where we want to set the PDE.

   push %eax
   mov  $KERNEL_HIGH_VMA, %eax     # get virtual address offset
   shr  $22,  %eax                 # calculate index in the pd
   mov  $4, %ecx
   mul  %ecx                       # calculate byte offset (4bytes each entry)
   mov  %eax, %edx
   pop  %eax

The above code does this: it first saves the EAX register, as we'll need its value again later (and restores it at the end, see?). Then it loads the higher half kernel offset into EAX. I need this for a special reason: I assume that the whole kernel fits into the one PT for now, so any virtual address that should be mapped by the kernel PT will do. From it I can extract the index that the PDE should have in the PD. It is not really required to load the actual offset here (although it's the clearest way I can think of). Anything linked to the higher half will do (for example the address of the _kernel_pd or _kernel_pt symbols themselves). Next, EAX is shifted right by 22 bits to extract the index into the PD. Since all entries are 4 bytes each, we need to multiply the index by 4 to get the offset of the PDE from the PD base address. This offset is saved in the EDX register for later use.
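
The index calculation is easier to check in C than in assembler. A minimal sketch follows; the function names pd_index and pd_byte_offset are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define KERNEL_HIGH_VMA 0xC0000000u
#define PDE_SIZE        4u          /* each PD entry is 4 bytes */

/* The top 10 bits of a virtual address select the PD entry. */
static inline uint32_t pd_index(uint32_t vaddr)
{
    return vaddr >> 22;
}

/* Byte offset of that entry from the PD base address. */
static inline uint32_t pd_byte_offset(uint32_t vaddr)
{
    return pd_index(vaddr) * PDE_SIZE;
}

static_assert((KERNEL_HIGH_VMA >> 22) == 768u,
              "the higher half starts at PD entry 768");
```

So the PDE for the kernel PT lives at offset 0xC00 in the PD, while the low PT uses entry 0. The multiplication by four could also be written as a shift left by two bits.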

To get the right position for our PDE, we now just have to add the offset to the PD base pointer, we loaded before.

   push %eax                       # save the real address for later
   add  %edx, %eax                 # move the pointer in the pd to the correct entry.

Now that we know the correct location, the code to set the _kernel_pt PDE entry is very similar to the one used for the _kernel_low_pt, so i'll leave it up to you to figure out those four lines:

   mov  $_kernel_pt, %ebx          # get virtual address of kernel main pt
   sub  $KERNEL_HIGH_VMA, %ebx     # adjust to physical address
   or   $0x1, %ebx                 # mark present
   mov  %ebx, (%eax)               # set the pde

If we reach this position in the code, the PD and its PDEs are done. They point to the correct physical addresses, and we don't need to worry about them anymore, except when enabling paging below, where we will move the pointer to the PD into the CR3 register.

The real mapping

For this code, we will, again, have to assume for a moment, that we have a working function called boot_map_page_ia32 which takes the following parameters:

  1. EBX: the physical address of the kernel PD
  2. ECX: the virtual address which we want to map
  3. EDX: the physical address that the virtual address should be "redirected" to.

After calling the function with the correct parameters, accessing the given virtual address will really access the physical memory we passed in EDX. We'll come to the real implementation of the function a little further down the road (not too long to wait, don't worry ;)).

We need to map 3 things:

  1. We have to "identity map" the lowest MB in memory (except the first page! we don't want to be able to access 0x0; this should cause an exception!). Identity mapping means that the virtual and physical address are the same. We need this to be able to use some fixed addresses (video buffer at 0xB8000 for example). We could, of course, also map the low MB somewhere else, and adjust all pointers to the new location.
  2. We have to identity map all of the kernel in the lower half. If we didn't do this, enabling paging would lead to an immediate page fault, then a double fault, and then a triple fault of the CPU. This is because our EIP is currently somewhere in the physical load range of the kernel. After enabling paging, the CPU tries to fetch the next instruction, but the address it tries to read is a virtual one now! If the virtual address is not the same as the physical one, the CPU won't get the next instruction. (In fact, it will probably page fault, if nothing is mapped at that location.)
  3. Last but not least, we have to map all the kernel into higher half, so we can call the C kernel entry later on. This mapping is merely an "alias" to the lower half kernel. After all mappings are complete, we can access the same physical memory through both the higher and lower half addresses. If we like to, we can later on remove the lower half mapping, so that the only way to access the kernel's memory is through the higher half pointers.

OK, let's go. First, we need to have the physical address of the kernel PD in EBX. A few lines up, I pushed EAX, which contained the address we want. So all we have to do is pop it off the stack now.

   pop  %ebx                       # pop saved address of the kernel PD

Now we're ready to loop over the first MB. I'll set 0x100000 (1MB) into ECX, which will contain the virtual address we want to map. Each iteration will decrease ECX by 0x1000 (one page, 4K), and call the mapping function. We stop (without calling the mapping function), if ECX reaches zero.

   mov  $0x100000, %ecx            # map the low 1MB

   _idmap_first_mb_loop:
       mov %ecx, %edx              # phys == virt (identity mapping)
       call boot_map_page_ia32     # do the mapping
       sub $0x1000, %ecx           # one page down.
       jnz _idmap_first_mb_loop    # if not zero, continue (DON'T map zero :))

As you can see, the loop sets EDX to the same value as ECX, and thus "identity maps" all of those addresses. (Hint: later on, when you run your kernel in qemu, you can look at the mappings easily by typing "info tlb" in the qemu monitor).

That was easy, right? Now for something slightly more sophisticated. We'll identity map and "high-map" the kernel in one go. For this, we'll have to know the bounds of the kernel. The linker script provides a few additional symbols which help us there. The KERNEL_BOOT_VMA symbol is the absolute physical start address of the kernel, so we'll start there (I put it into ECX and increment it by a page on each loop run, until we reach the end of the kernel). The end of the kernel is determined by looking at the _core_end symbol. But careful: the _core_end symbol (as you will see when looking at the linker script) is linked to the higher half, so to get the physical end of the kernel, we need to subtract the higher half offset again.

   mov  $KERNEL_BOOT_VMA, %ecx     # this is the _very_ beginning :)
   mov  $_core_end, %eax           # virtual address of end
   sub  $KERNEL_HIGH_VMA, %eax     # now it is physical.

Now that we have all values set up (EBX, which is required too, wasn't changed in the meantime, so no need to worry about it; it has the correct value), we can go for the actual loop:

   _map_kernel:
       mov %ecx, %edx              # phys == virt (identity mapping)
       call boot_map_page_ia32     # do the mapping

       push %ecx
       add $KERNEL_HIGH_VMA, %ecx  # now map the virtual address to the same physical one
       call boot_map_page_ia32     # do it
       pop %ecx

       add $0x1000, %ecx           # on to the next page.
       cmp %eax, %ecx
       jle _map_kernel             # continue untill all the kernel is mapped.

This first sets EDX to identity map ECX, and creates the mapping (like in the last loop). But then, surprise, surprise, something more happens: I save ECX, as I need its value unchanged a few lines further down. Then I adjust ECX to point to the same address as before, but in the higher half. Calling the mapping function now creates a mapping from higher half (virtual) to lower half (physical) memory. Then ECX is restored and moved on to the next page we want to map, so we can check it against the kernel bounds. If ECX is less than or equal to the upper kernel bound, we continue; otherwise we're done.

Enable paging

That's it. We're nearly done. All the mappings are in place, all the PD and PTs are ready to be used (hopefully ;)). Now we can give it a try by activating all of this, which is really simple:

   mov  %ebx, %cr3                 # use the kernel pd
   mov  %cr0, %eax                 # get the current cr0 value
   or   $(1 << 31), %eax           # enable paging
   mov  %eax, %cr0                 # now re-set the cr0 register.

First, EBX is moved to CR3, which is the PD base pointer in 32-bit paging. Then, the paging bit is set in CR0, and that's it. Let's keep our fingers crossed that things continue to work ;)


Last thing to do for this function is to restore all the registers and return to the bootstrap_ia32 function. The first thing there will be, if you remember, adjusting the stack to use the virtual higher half addresses...

   # restore register values.
   pop  %ecx
   pop  %edx
   pop  %ebx
   pop  %eax
   ret


Mapping single pages

Oh - wait; we're not quite there, right? i think there is one more function we deferred for later implementation... the one that maps a single page, which we used above in the loops: boot_map_page_ia32.


Always the same boring function starts...

boot_map_page_ia32:
   # ebx: physical addr of kernel PD
   # ecx: the virtual address to map
   # edx: the physical address to map to

   push %eax
   push %ebx
   push %ecx
   push %edx

Additionally, we will immediately push some of the parameters, as we need to calculate offsets from them, and later have those values unmodified.

   push %edx                       # push physical address
   push %ecx                       # push virtual address

Find the PT in the PD

The first real code will find out, which PT we will have to use to be able to map the virtual address. If the given address is not within the two PTs we have already set up, the kernel will panic immediately (of course; there is no logic to create new PTs on the fly).

   mov  %ecx, %eax
   shr  $22, %eax
   mov  $4, %ecx
   mul  %ecx                       # now we have the offset in eax
   add  %eax, %ebx                 # now ebx points to the phys addr of a pt if present
   mov  (%ebx), %eax

To get the correct index, the above first loads the virtual address from ECX into EAX, then shifts it 22 bits to the right, as this will, again, result in the index into the PD, required to find the PT for this address. Then again, we have to multiply it by four, as each entry in the PD is four bytes long. Finally the calculated offset (now in EAX) is added to the physical address of the PD, which is passed in EBX (remember?). From this address, we can now load the PDE into EAX. Please remember that this is not an address yet, as it has flags ORed into it. We'll have to clear those flags before we can use it.

But first, just to be sure, we check if the present flag is set in the PDE, otherwise the given virtual address is out of bounds.

   mov  %eax, %ecx
   and  $0x1, %ecx                 # check present flag
   cmp  $0x0, %ecx
   je the_end                      # if zero, PANIC!

Now, as i said, we'll have to extract the physical PT address from the PDE.

   and  $0xFFFFF000, %eax          # clear off possible flags from the PDE.

Calculate PTE address

Finally, we have the correct physical address of a valid PT in EAX. We can now use this to find the correct slot in there, and set a PTE for our virtual address.

Remember, at the entrance of the function, we pushed both the physical, and then the virtual address passed in; now we need the virtual address again to calculate the correct index into the PT from it, so pop it off the stack:

   pop  %edx                       # the virtual address.

The following code will calculate the index, and put it into EBX:

   push %eax
   mov  %edx, %eax
   shr  $0xC, %eax                 # shift right to discard non-significant bits.
   and  $0x3FF, %eax               # and away not-relevant bits on the left.
   mov  $0x4, %ecx                 # each entry is 4 bytes
   mul  %ecx
   mov  %eax, %ebx                 # now in ebx: the offset into the PT for the PTE.
   pop  %eax

EAX is preserved (first pushed, then popped at the end of this section), as you can see. Then, I put the virtual address to map (in EDX) into EAX. After that, the whole thing is shifted right by twelve bits. Then the uninteresting bits on the left side are cleared, so only the range described in the Intel manuals is left over. This is the part of the address that serves as the index into the PT. As always, we now have to multiply it by four, since all PT entries are (exactly like the PD entries) 4 bytes long. The result is put into the EBX register.
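
The same calculation as a small C sketch (the function names pt_index and pt_byte_offset are invented for this example):

```c
#include <assert.h>
#include <stdint.h>

#define PTE_SIZE 4u   /* each PT entry is 4 bytes */

/* Bits 12..21 of a virtual address select the entry in the PT. */
static inline uint32_t pt_index(uint32_t vaddr)
{
    return (vaddr >> 12) & 0x3FFu;
}

/* Byte offset of that entry from the PT base address. */
static inline uint32_t pt_byte_offset(uint32_t vaddr)
{
    return pt_index(vaddr) * PTE_SIZE;
}

static_assert(((0xC0100000u >> 12) & 0x3FFu) == 256u,
              "the first kernel page uses PT entry 256");
```

For the first kernel page at 0xC0100000 this yields index 256 (offset 0x400), and for the video buffer at 0xB8000 it yields index 184 (offset 0x2E0).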

Setting the PTE

Let's evaluate our current state. EBX contains the offset from the PT base address for our virtual address to map. EAX still contains the base address of the PT. What we now additionally need is the physical address we have to map to. At the start of the function we pushed it on the stack, so pop it off now.

   pop  %edx                       # the physical target address

Now calculate the final location of the PTE, by adding up the base address and the offset, in EAX and EBX respectively.

   add  %ebx, %eax                 # add offset to pt. this is the final location now.

To get a valid (and more important: present) PTE, we need to OR some flags, in this case only the present flag to the final physical address. Then we can put the PTE (made up of the address and the present flag) to the location we just looked up in the PT (which is in EAX).

   or   $0x1, %edx                 # mark present...
   mov  %edx, (%eax)               # and insert into pt.


The rest is cleanup (restore registers), and return.

   pop %edx
   pop %ecx
   pop %ebx
   pop %eax
   ret
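
To see the whole routine at a glance, here is a rough C model of it that runs in user space. It is a sketch under stated assumptions, not the kernel's code: "physical memory" is simulated by a small array starting at the made-up address TABLES_PHYS, the names map_page, init_tables and phys_ptr are invented, and the function returns -1 where the assembler version jumps to the PANIC! path. Unlike the assembler, it also masks the low 12 bits of the physical address before setting the present flag.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_PRESENT 0x1u
#define FRAME_MASK   0xFFFFF000u

/* Simulated physical memory: three 4K frames (PD, low PT, kernel PT)
 * located at the hypothetical physical address TABLES_PHYS. */
#define TABLES_PHYS 0x00200000u
static uint32_t frames[3 * 1024];

/* Translate a simulated physical address into a host pointer. */
static uint32_t *phys_ptr(uint32_t paddr)
{
    assert(paddr >= TABLES_PHYS && paddr - TABLES_PHYS < sizeof frames);
    return &frames[(paddr - TABLES_PHYS) / sizeof frames[0]];
}

/* Point PD entry 0 at the low PT and entry 768 at the kernel PT. */
static void init_tables(void)
{
    uint32_t *pd = phys_ptr(TABLES_PHYS);
    pd[0]   = (TABLES_PHYS + 0x1000u) | PAGE_PRESENT;
    pd[768] = (TABLES_PHYS + 0x2000u) | PAGE_PRESENT;
}

/* Map one page; returns 0 on success, -1 if no PT is present. */
static int map_page(uint32_t pd_phys, uint32_t vaddr, uint32_t paddr)
{
    uint32_t pde = phys_ptr(pd_phys)[vaddr >> 22];
    if (!(pde & PAGE_PRESENT))
        return -1;                              /* out of bounds */

    uint32_t *pt = phys_ptr(pde & FRAME_MASK);  /* strip the flags */
    pt[(vaddr >> 12) & 0x3FFu] = (paddr & FRAME_MASK) | PAGE_PRESENT;
    return 0;
}
```

After init_tables(), a call like map_page(TABLES_PHYS, 0xC0100000, 0x00100000) stores 0x00100001 in entry 256 of the kernel PT, which is what the assembler version does with real memory.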


Wow. That was quite some code. I know, it is hard to get all this into your head ;)

Please also note that some of the code is not exactly the best or fastest way to do things, but the way I understood the concepts best. If you have suggestions, feel free to tell me! I'm eager to hear how you would improve the code!


To be able to actually build the kernel, you'll need one more file: the linker script, which tells the linker where to put things, which addresses to assign, and what symbols to create to help the boot code... Let's have a look at the contents:

Basic setup

The file starts with a few global symbols and instructions for the linker:

OUTPUT_FORMAT(elf32-i386)
OUTPUT_ARCH(i386)
ENTRY(bootstrap_ia32)
KERNEL_BOOT_VMA = 0x00100000;
KERNEL_HIGH_VMA = 0xC0000000;

This tells the linker that we want a 32-bit ELF executable containing i386 code, and that our entry-point function is named "bootstrap_ia32" (as defined in boot.S). Also, two very important absolute symbols are defined: KERNEL_BOOT_VMA, which denotes the physical start address of the kernel, and KERNEL_HIGH_VMA, which denotes the offset from the BOOT_VMA to the higher half. (It is important to understand that these are symbols, not just constants you can use from within the linker script. Those symbols are usable from the code, which we already did.)

The output Sections

Next comes the definition of the sections in the output file. This is only indirectly related to the sections in the input files. We can more or less freely move/rename and merge sections from the input to the output. Also, we can tell the linker that special files should be treated specially, which is of great value.

Let's have a look at the beginning of the section definition:

SECTIONS
{
   . = KERNEL_BOOT_VMA;

This tells the linker that the current address in the output file (denoted by '.') should be set to KERNEL_BOOT_VMA, which we defined as the start of the kernel.

The next thing tells the linker how to handle the code from our boot.S file (but not the data!):

   .boot :{
       */boot.o (.text)
   }

This creates a section called .boot in the output file. The linker puts into this section the contents of all .text sections from all input files whose name ends in '/boot.o'. (This means that passing our boot.o to the linker as 'boot.o' doesn't work; we'll need './boot.o' or some other path to the file.) We'll only have a single one of those objects, so that's cool with us :)

Also, the above makes the linker assign addresses from KERNEL_BOOT_VMA onwards to the .text symbols from boot.o, so we're linked to the lower half here.

Note: it would have been cool to have the PD and PT structures in the .boot section too, so we don't have to mess with KERNEL_HIGH_VMA offsets in the code all the time. However all i could come up with failed, so i had to put them into the .data section (and i tried a lot...)

The next instructions tell the linker, to set the current offset from which addresses are assigned to the higher half, which means, from here on, all symbols will be linked to higher half. Also, a symbol (_core_start) is assigned the current address. This symbol can be used from the C code if required.

   . += KERNEL_HIGH_VMA;
   _core_start = .;

Now, let's define the rest of the sections. First, we'll need a .text section, containing all the kernel code. The section is aligned on a page boundary, and the physical address is set to lower half, by subtracting the higher half offset again.

   .text ALIGN(0x1000) : AT(ADDR(.text) - KERNEL_HIGH_VMA) {
       _core_code = .;
       *(EXCLUDE_FILE (*/boot.o) .text)

       /* all readonly data is merged with this section, too */
       *(.rodata*)
   }

One more noteworthy thing from above is that the 'boot.o' file is now excluded from the .text section, as its .text contents are already in the .boot section, linked to the lower half.

The next section to define, is the .data section, which contains read/write data symbols.

   .data ALIGN (0x1000) : AT(ADDR(.data) - KERNEL_HIGH_VMA) {
       _core_data = .;
       *(.data)
   }

This should be pretty clear now, given what we know from above. The section is page aligned, and the physical load address is set to the lower half. A symbol is declared, denoting the start of the section, and all .data sections from all input files (also 'boot.o'!) are included here.

The last section to define is the .bss section. It will also contain common symbols, if there are any (currently there should not be any; one can create those with the .comm and .lcomm GNU assembler pseudo directives):

   .bss ALIGN (0x1000) : AT(ADDR(.bss) - KERNEL_HIGH_VMA) {
       _core_bss = .;
       *(COMMON)
       *(.bss)
       . = ALIGN(4096);
       _core_ebss = .;
   }

Again, page aligned, and the address is adjusted to the lower half. Multiple input sections are merged into this one output section (COMMON is special and means all symbols that are marked *COM* if you look at the objdump -x output for a given object file).

This section is padded until the current page is full, and an "end symbol" is declared, so that we are able to clear the .bss section (with ELF this is not necessary, as the bootloader already has to do it for us...).

Finally, another symbol is declared to denote the end of the kernel. We use this symbol in the code to determine the size of the kernel, and thus the amount of memory we need to map for the kernel to work.

   /* 4K alignment is guaranteed! */
   _core_end = .;
}

That's it. You should be able to compile and link the boot code now.

Compile and link

You should be able to compile and link the kernel like this:

gcc -Wall -Werror -Wextra -ffreestanding -fno-rtti -fno-exceptions -O0 -g -c -o boot.o boot.S
ld -T link.ld -o kernel ./boot.o


I suggest using qemu:

qemu -kernel kernel

If things worked out, you should see "PANIC!" on the screen now! Congratulations, paging worked. If nothing seems to happen, you did something wrong, and if the screen constantly flickers, you have done something wrong too (the machine triple faults and resets continuously). I hope, though, that there are no errors in the code I showed here.


Thanks for reading, and Good luck!