The following is an overview of all the steps I took for this demonstration. Each section has a short description, and a link to a page covering the details of that section along with the assembly code to implement it.

To use this example, set up the Development Environment as detailed below (basically, install NASM) and download the GitHub ZIP file containing all of the source. Assemble the source as described, and enjoy!

The ZIP file also has a fully-assembled (46 KiB) ISO image that you should be able to burn straight to CD, or use with your favourite virtualisation program. Of course, this makes it difficult to experiment...

Development Environment

Development Host

Main article: User:Johnburger/Demo/Host

I wanted this example to be available to as broad an audience as possible, so I've made very few assumptions. The development machine's OS can be Windows (tested on WinXP 32-bit and Win7 64-bit) or Linux (tested on Ubuntu 12.04), and I presume Macintosh as well.

Of course, to assemble the source code requires an assembler. I don't particularly like NASM (here's why), but it is portable and it does produce raw binary files, which is just perfect for what I need!

Target Machine

Main article: User:Johnburger/Demo/Target

The target machine needs to satisfy all of the following criteria:

Is at least a '386 (tested on an Olivetti 386SX-16, and Asus Core i7);
Has at least 640K of RAM (Bill Gates was right!);
Obeys the BIOS Boot Specification 1.01 (i.e. not EFI/UEFI);
Can boot off a Floppy, Hard Disk, USB, CD or ISO image (Yep! MultiBoot at its best!).

Note that I've tested this example with both Virtual PC and VMware Player on Windows, and VirtualBox on Linux (Ubuntu 12.04), and it worked fine.

Coding Style

Hierarchical Code

It must be my mindset: I think in hierarchies, and I've structured the source code that way too. Not only the code, but even the identifiers! NASM lets me use dots to categorise (and sub-categorise, and sub-sub-...) identifiers - it sometimes makes for some long names, but I hope it will make it clearer for readers too. (It certainly did for me!) Besides, it also let me do some interesting macro tricks.

The identifiers all start with a category: x86 (definitions), BIOS (definitions), Dev (definitions), Boot (definitions and code), Ints (definitions and code), etc. These categories correspond to a top-level directory of the hierarchy. The second level is the subject within the category being defined: maybe a component of the x86; or a particular device; or a phase of the boot process. Of course, these correspond to sub-directories of the hierarchy.

Finally, you'll note that there is only one .asm file in the whole system. That is because it %includes all the other files, which won't assemble standalone. They've all got the .inc extension to indicate this.

Use the Assembler as Much as Possible

Any calculations that can be performed at assembly time are CPU cycles that don't have to be performed at run-time. To that end, I've defined constants, and expressions, and tables, and even checksums so that as much as possible only raw numbers need to be output to the binary image. One of the advantages of NASM is that it has powerful macro facilities - hey, being able to write a checksum algorithm in macros to make our job easier has to be a good thing!
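As a rough illustration of the idea (not the macros actually used in the source), the sketch below keeps a running 16-bit sum in a preprocessor variable while emitting words, then emits the one word that brings the total to zero. The names Sum.Start, Sum.Word, Sum.End and Sum.Value are hypothetical.

```nasm
; Hypothetical macros: accumulate a 16-bit checksum at assembly time.
%macro  Sum.Start 0             ; Reset the running total
%assign Sum.Value 0
%endmacro

%macro  Sum.Word 1              ; Emit a word and add it to the total
        DW      %1
%assign Sum.Value (Sum.Value + (%1)) & 0FFFFh
%endmacro

%macro  Sum.End 0               ; Emit the word that makes the total zero
        DW      (10000h - Sum.Value) & 0FFFFh
%endmacro

        Sum.Start
        Sum.Word 0001h
        Sum.Word 0AA55h
        Sum.End                 ; Assembles as DW 55AAh
```

Nothing is summed at run-time: the binary image simply contains the three finished words.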

Define System Tables at Assembly Time

You'll notice that many tables have been built in source code, rather than at run time. Not all: some require the environment to be running to generate the correct values. To allow building tables in source code, many symbols and labels are defined to keep track of where the assembler is up to; and these symbols are often inter-related, and build upon each other such that changing one will change others.

To that end, I've also used macros to hide as many implementation details as possible at the time of invocation. Of course, you can look at the magic behind the scenes (Demo.inc, x86/Desc.inc and Pad/ISO/Defn.inc are the main files here), but it makes for concise one-liners at the point of invocation rather than convoluted NASM code to instantiate the various structures.

The Source Code

Main article: User:Johnburger/Demo/Demo

The main code simply %includes all the other code, in a hierarchical fashion to make each aspect (hopefully) easily digestible. The different sections (and their potential sub-sections) are outlined below.

Configurable settings

Main article: User:Johnburger/Demo/Demo.inc

The modifiable parameters that the rest of the code uses are embodied in this file. By all means experiment - but don't blame me when weird things happen! If you need help understanding what happened, however, please feel free to drop me a message.

Global Data

Main article: User:Johnburger/Demo/Data.inc

Although this Demonstrator uses Local Descriptor Tables, some data still needs to be Global. For example, interrupt handlers that need to store data, such as the Keyboard handler, can only be guaranteed access to a Data Segment if it resides in the Global Descriptor Table.

x86 Definitions

Main article: User:Johnburger/Demo/x86/x86

The designers of the '386 did more than just define opcodes to execute; they also defined a large number of system structures that define how the system will work in some pretty sophisticated scenarios. When coding for those, it pays to NOT use "magic numbers" - identifiers not only help comment the code, but make it less likely that you'll overlook the bug that has caused your masterpiece to triple-fault for the 1,847th time!

EFlags

Main article: User:Johnburger/Demo/x86/EFlags

It's not often that you need to access individual flags in the EFlags register: most of the time you can use Jcc to do the tests; or for the more sophisticated programmer, SETcc. But EFlags is 32 bits, and some of the flags are very useful - in particular, TF (the Trap Flag) raises a Single Step exception (INT 1) after Every. Single. Instruction. Set this, and you can step through your code!

To that end, I've defined every flag in EFlags as a bit-mask, using the Intel-defined name.
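A minimal sketch of what such definitions could look like. The dotted naming scheme here is an assumption based on the Coding Style section, and only some of the flags are shown; the bit positions are Intel's.

```nasm
        BITS    32
; Assumed names; bit positions as documented by Intel.
x86.EFlags.CF   EQU     00000001h       ; Carry
x86.EFlags.ZF   EQU     00000040h       ; Zero
x86.EFlags.SF   EQU     00000080h       ; Sign
x86.EFlags.TF   EQU     00000100h       ; Trap (single step)
x86.EFlags.IF   EQU     00000200h       ; Interrupt enable
x86.EFlags.DF   EQU     00000400h       ; Direction
x86.EFlags.OF   EQU     00000800h       ; Overflow
x86.EFlags.IOPL EQU     00003000h       ; I/O Privilege Level (2 bits)
x86.EFlags.NT   EQU     00004000h       ; Nested Task
x86.EFlags.RF   EQU     00010000h       ; Resume
x86.EFlags.VM   EQU     00020000h       ; Virtual 8086 Mode

; Example use: are interrupts currently enabled?
        PUSHFD
        POP     EAX
        TEST    EAX,x86.EFlags.IF       ; ZF clear if IF was set
```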

Segment

Main article: User:Johnburger/Demo/x86/Seg

Once you're in Protected Mode, a Segment Register takes on a whole new personality - that deserves its own set of definitions.
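For example, a Protected Mode Selector packs three fields into 16 bits: a Descriptor index, a Table Indicator and a Requested Privilege Level. The sketch below shows one way to name those fields; the identifiers and the Selector() helper are hypothetical, not taken from the source.

```nasm
; Assumed names; the field layout is Intel's.
x86.Seg.RPL     EQU     0003h           ; Bits 0-1: Requested Privilege Level
x86.Seg.TI      EQU     0004h           ; Bit 2: 0 = GDT, 1 = LDT
x86.Seg.Index   EQU     0FFF8h          ; Bits 3-15: Descriptor index

; Hypothetical helper: build a Selector from its three fields.
%define Selector(index, ti, rpl) (((index) << 3) | ((ti) << 2) | (rpl))

        MOV     AX,Selector(2, 1, 3)    ; LDT entry 2, RPL 3 = 0017h
        MOV     DS,AX
```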

Control Register 0 (CR0)

Main article: User:Johnburger/Demo/x86/CR0

Control Register 0 covers such things as whether the CPU is in Protected Mode, or in Paging Mode, or expecting Aligned accesses, or ... In short, you can change the entire architecture of your system by changing a couple of bits. Better define them, using Intel's defined names...

Descriptors

Main article: User:Johnburger/Demo/x86/Desc

Segment Descriptors reside in Descriptor Tables. There are three types: the Global Descriptor Table, the Local Descriptor Table, and the Interrupt Descriptor Table. Each has its own functions and limitations, but all have eight-byte Descriptors in them with a common structure. That means that not only do they deserve definitions, but since they're going to be used a lot throughout the system, some macros would be useful too...
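To give a feel for the kind of macro meant here, the following sketch scatters a Base and Limit across the eight bytes of a Segment Descriptor. The macro name Desc and its parameter order are assumptions for illustration; the real macros in x86/Desc.inc may look quite different.

```nasm
; Hypothetical macro: Desc Base, Limit, Access, Flags
%macro  Desc 4
        DW      (%2) & 0FFFFh                           ; Limit 15..0
        DW      (%1) & 0FFFFh                           ; Base 15..0
        DB      ((%1) >> 16) & 0FFh                     ; Base 23..16
        DB      %3                                      ; Access: P, DPL, S, Type
        DB      ((%4) & 0F0h) | (((%2) >> 16) & 0Fh)    ; G, D/B, AVL; Limit 19..16
        DB      ((%1) >> 24) & 0FFh                     ; Base 31..24
%endmacro

GDT:
        Desc    0, 0, 0, 0                      ; Mandatory null Descriptor
        Desc    0, 0FFFFFh, 9Ah, 0C0h           ; 4 GiB 32-bit Code, Ring 0
        Desc    0, 0FFFFFh, 92h, 0C0h           ; 4 GiB 32-bit Data, Ring 0
GDT.Size        EQU     $ - GDT                 ; 24 bytes
```

The Bases here are plain numbers. NASM only allows shifts and masks on scalar values, so a label used as a Base first has to be turned into a number (for example by subtracting the section start).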

Task State Segment (TSS)

Main article: User:Johnburger/Demo/x86/TSS

The largest structure defined by Intel (actually, that's not true - the current state of the Floating Point Unit, with its huge registers, is bigger...) is the Task State Segment (TSS). It is used for hardware Task Switching, but it is also used for some other mechanisms of the CPU (e.g. switching privilege levels). For a long time the Intel documentation actually stated that TSSes weren't required for systems that didn't use hardware switching - I disabused them of that quickly enough...

Interrupt Vector Table (IVT)

Main article: User:Johnburger/Demo/x86/IVT

While the TSS is a single structure, the Interrupt Vector Table is merely a collection of 256 FAR pointers - larger, true, but merely a collection. In Real Mode, the IVT is always at 0000h:0000h, and always contains pointers to routines - although IBM and Microsoft subverted this horribly when they started using various interrupts to point to data tables... Can you imagine what would happen if code did an INT 41h or INT 46h? Executing data as code should send shudders through any programmer...

BIOS Definitions

Main article: User:Johnburger/Demo/BIOS/BIOS

Whereas Intel defined the hardware silicon of the '386, IBM and other BIOS vendors added software extensions in the form of system calls, vectored through the INT instruction. Don't get me started on IBM not following Intel's mandate to reserve the first 32 interrupts for themselves - IBM's first interrupt (after Intel's last-defined one on the 8086) that they decided to use was to dump the current contents of the screen to a printer. Sounds like a debug function to me...

Regardless, the fact that the BIOS defined the operation of the PC gave it gravitas, and as the industry evolved various manufacturers published standards and specifications to make it easy for programmers to use their services. In short, their efforts required knowledge and information for programmers to use. In other words: definitions!

A20 Gate

Main article: User:Johnburger/Demo/BIOS/A20

RAM Map

Main article: User:Johnburger/Demo/BIOS/RAMMap

Keyboard

Main article: User:Johnburger/Demo/BIOS/Key

Video Graphics Array (VGA)

Main article: User:Johnburger/Demo/BIOS/VGA

Disk

Main article: User:Johnburger/Demo/BIOS/Disk

Master Boot Record (MBR)

Main article: User:Johnburger/Demo/BIOS/MBR

Memory Map

Main article: User:Johnburger/Demo/BIOS/MemMap

Device Definitions

Main article: User:Johnburger/Demo/Dev/Dev

To write a device driver to control a peripheral requires detailed knowledge of the programming interface to that device. Some are straightforward. Others are highly complex. And there are different manufacturers out there that each use their own proprietary interface to get their devices to behave like others of the same type.

One of the best things that has happened since the development of the original IBM PC was the definition of the Universal Serial Bus (USB) standard. There was one programming interface that all USB controller manufacturers had to implement, which meant that there was only one USB driver that needed to be written to work with all controllers! Of course, then USB 1.1 came out, requiring changes. Then USB 2.0. Then USB 3.0...

Regardless, even before that there were some industry-standard interfaces that all manufacturers adhered to, almost by necessity: back in the DOS days installing device drivers to handle proprietary formats wasn't very user friendly, with CONFIG.SYS editing and arcane IRQ, I/O and memory parameters. Far better to use the default pre-prepared interfaces that were already coded in the BIOS.

The following files have definitions for the industry-standard implementations of the devices. Note that the code that uses these definitions might be in the Boot section or the Executive section, or even the Interrupt handlers.

A20 Gate

Main article: User:Johnburger/Demo/Dev/A20

Programmable Interrupt Controller (PIC)

Main article: User:Johnburger/Demo/Dev/PIC

Programmable Interval Timer (PIT)

Main article: User:Johnburger/Demo/Dev/Timer

Keyboard

Main article: User:Johnburger/Demo/Dev/Key

Video Graphics Array (VGA)

Main article: User:Johnburger/Demo/Dev/VGA

Boot Sequence

Main article: User:Johnburger/Demo/Boot/Boot

When a PC first turns on, the familiar BIOS screen is shown while it initialises and tests various aspects of the system. It then determines which device to boot from, loads the boot sector (usually the first sector on the device, but for CDs it's the 17th - go figure, but I've used that to our advantage!), and then jumps to the loaded code.

Note that one of the first things that this boot code needs to do is load more code - the 512 or 2,048 bytes that the BIOS loads isn't very much! But before it can do anything, the code first needs to understand the starting point that it is working with.

Real Mode Entry

Main article: User:Johnburger/Demo/Boot/Entry

When the BIOS jumps to the loaded code, the CPU is in Real Mode. Given that the ultimate destination of this demonstration is Protected Mode, this boot code obviously needs to perform the switch. But once the switch is performed, the BIOS routines are no longer available. One way to handle this is to switch back whenever the BIOS is to be called. Another is to do the calls up front. The latter is what this Demonstrator does.

Test CPU

Main article: User:Johnburger/Demo/Boot/CPU

The first thing to do is confirm that the CPU is indeed a '386 or better. If not, there's no point in continuing!
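One well-known way to do this, sketched below, relies on how FLAGS bits 12-15 behave: on an 8086/8088 they always read as set, on a '286 in Real Mode they always read as clear, and on a '386 they can be changed. The routine name Is386 is hypothetical, and the actual Boot/CPU code may use a different test.

```nasm
        BITS    16
; Returns Carry clear on a '386 or better, Carry set otherwise.
Is386:
        PUSHF                           ; Preserve the caller's flags
        PUSHF
        POP     AX
        AND     AX,0FFFh                ; Try to clear bits 12-15
        PUSH    AX
        POPF
        PUSHF
        POP     AX
        AND     AX,0F000h
        CMP     AX,0F000h               ; 8086/8088: the bits are stuck on
        JE      .No
        OR      AX,0F000h               ; Try to set bits 12-15
        PUSH    AX
        POPF
        PUSHF
        POP     AX
        TEST    AX,0F000h               ; '286 Real Mode: the bits stay off
        JZ      .No
        POPF                            ; Restore the caller's flags
        CLC
        RET
.No:
        POPF                            ; Restore the caller's flags
        STC
        RET
```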

Load Rest of Code

Main article: User:Johnburger/Demo/Boot/Load

The next thing is to load the rest of the code. From where, and to what address though? Luckily the BIOS Boot Specification helps us with the From question: it provides the device number, and the sector is the "next" one. Where in memory is arbitrary: just make sure you avoid anything important - including the code currently being executed! Check the memory map.

Of course, ideally all the Real Mode stuff will fit in the first 510 bytes. (Why not 512? We need the 0xAA55 BIOS Signature at the end.) If we can manage that, then the rest of the code can be loaded anywhere we like, and this code can be abandoned - tossed away like a booster rocket on a multi-stage orbital insertion.
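A minimal sketch of such a boot sector follows. It assumes a floppy-style CHS read of a single sector and a destination of 0000h:7E00h; both are illustrative choices, and the real loader has to cope with hard disks and CDs too. Stack and segment register setup are omitted for brevity.

```nasm
        BITS    16
        ORG     7C00h                   ; Where the BIOS loads the boot sector
Boot:
        XOR     AX,AX
        MOV     ES,AX
        MOV     BX,7E00h                ; ES:BX = 0000h:7E00h (assumed target)
        MOV     AX,0201h                ; AH=02h Read Sectors, AL=1 sector
        MOV     CX,0002h                ; Cylinder 0, Sector 2: the "next" one
        MOV     DH,0                    ; Head 0; DL = drive, as left by the BIOS
        INT     13h
        JC      .Hang                   ; Carry set: the read failed
        JMP     0000h:7E00h             ; Abandon this sector: the booster rocket
.Hang:
        HLT
        JMP     .Hang

        TIMES   510-($-$$) DB 0         ; Pad out to 510 bytes...
        DW      0AA55h                  ; ...then the BIOS Signature
```

If the code grows past 510 bytes, the TIMES count goes negative and NASM refuses to assemble, which is a handy early warning.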

Position IDT

Main article: User:Johnburger/Demo/Boot/IDT

Part of the design of this Demonstrator is to define some startup system tables at assemble time. One particular table is the default Interrupt Descriptor Table - it costs more in assembly instructions to initialize the IDT than it does to simply (effectively) hard-code it. The problem is that hard-coding it means that it's part of the binary image just loaded - but it's not where we want it. Solution? Move it!

Although you can set up the IDT once you've entered Protected Mode, I found that having it set up from the very first Protected Mode instruction made for better fault debugging. Of course it took a long time even then before I had the fault handlers themselves working - but as soon as I did, I could see where the problem instruction was and development proceeded much faster.

Get RAM Size

Main article: User:Johnburger/Demo/Boot/RAM

We should have asked this question before loading more code - if there's not enough RAM, there's no point in continuing. But '386s were always delivered with at least 1 MB of RAM, which is more than enough for our purposes!

But it is important to find out not only what RAM there is, but what RAM there isn't - amount of Extended Memory, where memory holes might exist, what memory the BIOS is using, and even what RAM was detected as bad during boot. To find all this out requires the help of the BIOS, so we need to save this information away for later use.
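The sketch below shows one common way to collect this information: the INT 15h, EAX=0E820h system address map call, which returns one 20-byte entry per call. The buffer address and routine name are assumptions, ES is assumed to be zero, and the Demonstrator's own code may well use different BIOS calls.

```nasm
        BITS    16
RAMMap.Buffer   EQU     0500h           ; Hypothetical buffer in low RAM

; Returns BP = number of 20-byte entries stored at ES:RAMMap.Buffer.
GetRAMMap:
        XOR     EBX,EBX                 ; Continuation value: 0 = first entry
        XOR     BP,BP
        MOV     DI,RAMMap.Buffer
.Next:
        MOV     EAX,0000E820h
        MOV     EDX,534D4150h           ; 'SMAP' signature
        MOV     ECX,20                  ; Size of one entry
        INT     15h
        JC      .Done                   ; Unsupported, or no more entries
        CMP     EAX,534D4150h           ; BIOS must echo the signature
        JNE     .Done
        ADD     DI,20                   ; Keep this entry
        INC     BP
        TEST    EBX,EBX                 ; Zero means that was the last one
        JNZ     .Next
.Done:
        RET
```

Each entry holds a 64-bit base, a 64-bit length and a 32-bit type, which is what makes it possible to spot holes and reserved areas later on.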

Switch A20 Gate

Main article: User:Johnburger/Demo/Boot/A20

As a legacy of the original IBM PC, this one is never going away! To protect against Real Mode Wrap, at boot time the A20 line is gated off. This needs to be gated on before Extended Memory can be used - and that requires the BIOS. There are a couple of other techniques tried if the BIOS doesn't support it.
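As a hedged sketch, the following tries the BIOS first and falls back to the "Fast A20" bit in System Control Port A (port 92h). Not every machine implements that port, and the keyboard controller method is not shown; the routine name is hypothetical.

```nasm
        BITS    16
EnableA20:
        MOV     AX,2401h                ; BIOS: enable the A20 gate
        INT     15h
        JNC     .Done                   ; Carry clear: the BIOS did it
        IN      AL,92h                  ; Fallback: System Control Port A
        TEST    AL,02h                  ; Already enabled?
        JNZ     .Done
        OR      AL,02h                  ; Bit 1 = A20 enable
        AND     AL,0FEh                 ; Keep bit 0 clear: it resets the CPU!
        OUT     92h,AL
.Done:
        RET
```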

Set up Keyboard

Main article: User:Johnburger/Demo/Boot/Key

The keyboard doesn't need much initializing - the BIOS initialization works well enough. However, any special setup can be done here.

Switch Video Mode

Main article: User:Johnburger/Demo/Boot/VGA

For the purposes of this Demonstrator, I didn't want to have to deal with the complexities of pixel-addressable graphics. The PC's video card has a number of different text modes, and that's sufficient for displaying stuff for the user in a quick and convenient manner.

By default I use the 80x50 text mode, but for experimentation I've also provided the code to switch to 90x60 mode. Not every PC or Virtualisation program supports this, so if you do enable it prepare to revert if it doesn't work! Not having it also saves a large number of bytes in the initial boot sector...
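On a VGA, the usual BIOS route to 80x50 is to set the standard 80x25 text mode and then load the 8x8 ROM font, as in this small sketch. The 90x60 mode needs direct register programming and is not shown here.

```nasm
        BITS    16
        MOV     AX,0003h                ; AH=00h Set Mode, AL=03h 80x25 colour text
        INT     10h
        MOV     AX,1112h                ; Load the ROM 8x8 font...
        MOV     BL,0                    ; ...into font block 0
        INT     10h                     ; 400 scan lines / 8 = 50 rows
```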

Switch to Protected Mode

Main article: User:Johnburger/Demo/Boot/PM

Switching to protected mode is three instructions - four if you include disabling interrupts first (a very smart move!). After all, the previous code has already set up all of the system structures...

You can then JMP to the Protected Mode code - and considering that the first thing that needs to happen is for the Segment Registers to be updated, we might as well start with the Code Segment Register (CS) first: make the first JMP far. And since we're using LDTs for this Demonstrator, why not JMP into one?
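The sequence might be sketched as follows. This is a simplified stand-in, not the Demonstrator's code: it assumes DS is zero, uses a tiny flat GDT, and jumps to a GDT Selector. Jumping into an LDT additionally needs an LLDT, which can only be executed once the PE bit is set.

```nasm
        BITS    16
        ORG     7C00h                   ; Assumed load address

Selector.Code   EQU     0008h           ; Hypothetical: GDT entry 1, RPL 0

Start:
        CLI                             ; No interrupts during the switch
        LGDT    [GDT.Pointer]
        MOV     EAX,CR0                 ; The three instructions:
        OR      AL,01h                  ;   set CR0.PE...
        MOV     CR0,EAX                 ;   ...and we're in Protected Mode
        JMP     Selector.Code:Entry32   ; Far JMP loads CS with a Selector

        BITS    32
Entry32:                                ; DS, ES, SS etc. still need reloading
        HLT
        JMP     Entry32

        ALIGN   8
GDT:
        DQ      0                       ; Null Descriptor
        DQ      00CF9A000000FFFFh       ; Flat 4 GiB 32-bit Code, Ring 0
GDT.Pointer:
        DW      GDT.Pointer - GDT - 1   ; Limit
        DD      GDT                     ; Linear Base (valid because DS = 0)
```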

Interrupt Handlers

Main article: User:Johnburger/Demo/Ints/Ints

While developing, any unexpected interrupt is a calamity! To that end, I decided to simply dump the current state onto the screen, overwriting whatever was there already. This required all sorts of labels and register-to-hex display routines, but the result is unambiguous: something horrible has happened, and this is where it happened.

What then? All we can really do is reboot the computer, but it'd be nice to wait until you can write down the critical areas. (Screen dump? Don't make me laugh!) <Ctrl><Alt><Del> is a time-honoured tradition - let's make it that!

Of course, it'd also be nice to find out exactly which flavour of interrupt actually occurred. It'd be nice to point all Fault vectors to the same handler, but a little extra sophistication allows us to display the interrupt number as well. To do that, we need separate entry points for each vector: but that's OK, there's a complicating factor anyway.

A half-dozen of the Intel-defined exceptions don't follow the standard rules. (Typical hardware engineers. Sheesh!) As well as doing the normal things of pushing onto the stack EFLAGS, then CS, then EIP, some interrupts also push an Error Code. And that Error Code can be useful. That means that either those "special" interrupts need special handling - or that we can modify all the other interrupts to pretend that they push Error Codes too...

I decided to do just that: provide separate entry points for each Interrupt handler, and for those that didn't have an Error Code, push a Zero onto the Stack as though it did. From there, all Interrupt handlers would push their number onto the stack, and call the default Fault handler.
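A macro makes those separate entry points cheap to write. The sketch below is an assumption about how that could look: the macro name, the labels and the placeholder handler are all hypothetical, and it JMPs to the common handler for simplicity.

```nasm
        BITS    32
; Hypothetical macro: Vector Number [, 1 if the CPU pushes an Error Code]
%macro  Vector 1-2 0
Ints.Vector%1:
%if %2 == 0
        PUSH    DWORD 0                 ; Fake Error Code: all frames look alike
%endif
        PUSH    DWORD %1                ; Which vector this is
        JMP     Ints.Fault
%endmacro

        Vector  0                       ; Divide Error
        Vector  6                       ; Invalid Opcode
        Vector  8, 1                    ; Double Fault
        Vector  13, 1                   ; General Protection
        Vector  14, 1                   ; Page Fault

Ints.Fault:                             ; Placeholder for the real handler
        ADD     ESP,8                   ; Drop vector number and Error Code
        IRETD
```

With the stack frame identical for every vector, the common handler can find EIP, CS and EFLAGS at fixed offsets.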

Interrupt Descriptor Table (IDT)

Main article: User:Johnburger/Demo/Ints/IDT

It was less code to embed the initial entries of the starting IDT as data than to write code to reproduce it. In fact, given that I wanted the IDT to be available from the very first Protected Mode instruction, that IDT setup code had to be in Real Mode, where there was even less room! So I deliberately put the IDT at the beginning of the block loaded by the Real Mode loader for it to relocate to its final position.

Vectors

Main article: User:Johnburger/Demo/Ints/Vectors

These are the entry points for each of the Intel-reserved exception handlers.

Default Fault Handler

Main article: User:Johnburger/Demo/Ints/Fault

The default Fault handler basically smears the screen with a huge amount of information: the hex representation of every register I can get my hands on - with labels. In a nice bright-white-on-red-background, in case anyone misses the point. Start copying...

Needless to say, as a default handler it is less than subtle. Especially since it doesn't try to correct the problem - however, after displaying the registers it returns to the faulting instruction. That should immediately re-raise the exception, Dingling(tm) the interrupt location on the screen...

Note that some of the Fault sources are not simple interrupts: they're full-blown Task switches. I still wanted to display that information too, since not doing so is very misleading - it would indicate the current environment, not the faulting one. For those I need to extract the information, not from the Stack, but from the relevant TSS. Luckily the Backlink field of the current TSS holds the calling TSS, so I can use that to extract the relevant fields - just as soon as I work out which Segment Descriptor to load!

Incidentally: even though a fault has been displayed, you should still be able to invoke the debugger...

Single Step Handler

Main article: User:Johnburger/Demo/Ints/Single

To experiment with debugging, I've hooked the Single Step Interrupt (INT 1) to make it simply wait in a tight loop until either <Space> or <Enter> is pressed:

While waiting for a keypress, I decided to indicate that it was waiting by incrementing a screen location. That tight loop is very tight indeed...
If <Space> is pressed, the routine simply returns. After the next instruction is executed, this same Handler will be re-entered, waiting for the next keypress. Single Step!
If <Enter> is pressed, then it also returns: but before doing so, it turns off Single Stepping - for this Task at least!

To turn on Single Stepping in the first place, simply add the TRACE macro to the desired place. This sets TF, the Trap Flag, in the EFLAGS register.
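There is no instruction that sets TF directly, so a macro like this typically goes via the stack. The following is a sketch of one possible implementation, not necessarily the one in the source; the constant name is assumed.

```nasm
        BITS    32
x86.EFlags.TF   EQU     00000100h       ; Assumed name; bit 8 of EFLAGS

%macro  TRACE 0
        PUSHFD                          ; Copy EFLAGS onto the Stack
        OR      DWORD [ESP],x86.EFlags.TF
        POPFD                           ; Single Stepping starts after the
%endmacro                               ;   instruction that follows this one

        TRACE
        NOP                             ; Executes, then INT 1 is raised
```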

One cute place to add TRACE is just at the start of User's DrawFrame routine. That will mean that all the Tasks will be in Single Step mode, and you won't see anything until you hold down the Space Bar for long enough that the borders will start to be drawn. From there you can press <Enter> to continue individual Tasks until only a few "targets" remain.

Debug Handler

Main article: User:Johnburger/Demo/Ints/Debug/Debug

I really wanted to see the memory that I was so carefully constructing. System tables, stack depths, and even the screen - I wanted to look at them. But I also wanted to be able to interact with it: in short, I wanted a memory viewer that I could invoke at run-time. There's a key on the keyboard labeled <Break>. Let's see what pressing it does...

Initialisation

Main article: User:Johnburger/Demo/Ints/Debug/Init

First, I need to initialise the current run-time context. Prevent the code from switching to a different Task, set up some variables, then...

Show Registers

Main article: User:Johnburger/Demo/Ints/Debug/Regs

Display the saved registers. Luckily, there's a table already set up to define where the registers are saved in memory during an interrupt - just need to use the funky screen location parameter...

Show Memory

Main article: User:Johnburger/Demo/Ints/Debug/Show

Then we can display the current memory contents. We wouldn't want to raise an exception trying to access inaccessible memory, so pay careful attention to the current Segment's limits. Oh, and if the Segment happens to be an Expand Down segment, we need to totally invert the memory pointers for where we're looking at. In fact: let's take the opportunity to highlight the fact that it is an Expand Down Segment by displaying it at the bottom of the screen!

Keyboard

Main article: User:Johnburger/Demo/Ints/Debug/Key

Finally, a debugger without controls is not very useful. Let's look at the keypresses stored in the Global Data Segment and act on various direction arrows and context changing keys.

(Uh oh: what happens if we invoke the debugger while the debugger is running. Let's try it... Wow!)

Segment Not Present Handler

Main article: User:Johnburger/Demo/Ints/NoSeg

Just to show what could happen in a full-blown Operating System, I decided to implement a quick-'n'-dirty Segment Not Present exception handler. Of course, it assumes that the segment referenced really is present, but that the Segment Present bit in the descriptor is merely off.

This handler discovers the offending segment in the appropriate Descriptor Table, enables the Present bit, and returns. The faulting instruction will be re-executed, and (hopefully!) things will proceed as normal!

User System Calls

Main article: User:Johnburger/Demo/Ints/User

IBM used the INT paradigm to invoke system calls on the first PC. Microsoft continued that practice with MS-DOS. As more sophisticated techniques came about - especially SYSCALL - using INT has become less common. Indeed, Intel suggests using Call Gates since they're more Protected - they can automatically copy parameters between stacks.

But for purposes of this Demonstrator, I've added my own two INT calls, to change the way that the individual Tasks behave. Experiment!

Default IRQ Handler

Main article: User:Johnburger/Demo/Ints/IRQs

An Interrupt ReQuest (IRQ) is different from an internal exception or fault. For one thing, there's an (impatient) device out there! If we're not careful, the device will continue to interrupt - if we simply returned, the unacknowledged interrupt could immediately re-interrupt the CPU, effectively locking it up forever.

The correct procedure on an external interrupt is to "placate" the device to stop it from raising more interrupts for the same event, and then to acknowledge the interrupt on the PIC(s) that formed part of the process (in the order Slave then Master - this is important). If you then want to continue to process the interrupt, that's your lookout.
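Acknowledging the PICs is a matter of writing an End Of Interrupt command to each one, as in this short sketch. The constant names are assumptions; the port numbers and command value are the standard PC ones.

```nasm
Dev.PIC.Master  EQU     20h             ; Master PIC command port
Dev.PIC.Slave   EQU     0A0h            ; Slave PIC command port
Dev.PIC.EOI     EQU     20h             ; Non-specific End Of Interrupt

        MOV     AL,Dev.PIC.EOI
        OUT     Dev.PIC.Slave,AL        ; Only needed for IRQ 8 to IRQ 15
        OUT     Dev.PIC.Master,AL       ; Always needed
```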

The problem is: what should the default IRQ handler do? It won't know how to "placate" every device out there - it can't. So I don't even try. I could acknowledge the PIC(s) and simply return, at the risk of entering an infinite interrupt loop - but at another level: WHY is the PIC interrupting me for an unknown IRQ anyway?

When the PIC is initialised, all interrupts are turned off. That means that it shouldn't be generating interrupts for unknown sources. Therefore the default IRQ handler simply uses the default Fault handler, with an error code of 0. (I originally wanted 0C0DEh, but it used too many unnecessary bytes in the code!)

Timer IRQ Handler

Main article: User:Johnburger/Demo/Ints/Timer

A Timer interrupt is quite expected: in fact, my default rate is 1,000 times a second. Why? I'm not doing much other than updating the screen, so you get smoother animations...

So what does the Timer interrupt handler do? Apart from housekeeping, it simply examines the GDT looking for TSSes that are not marked Busy, starting after the current one (which is, of course, Busy!). If it finds one, it checks to see if it's currently Active (that complication comes about because of the Intel suggestion to use TSSes to handle certain exceptions.) If it's Active, it JMPs to it. If it isn't, it keeps looking, wrapping around to the beginning of the GDT after the last known one. If it comes all the way back to the current TSS, it simply returns - it's the only one runnable!

The effect of a JMP to a TSS is that the CPU performs a Task Switch. It stores the current CPU state in the current TSS, marks it as not Busy, then updates TR and loads the new state from the new TSS. If the system has been running for a while, odds on the new TSS is pointing to the instruction after the Timer handler's JMP instruction, so the code will resume by returning from the Timer IRQ handler.
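The search described above could be sketched like this. It is a simplification under several assumptions: DS:0 is a data alias of the GDT, EBX holds the Selector of the current TSS, GDT.Limit is a made-up value, and the "Active" check and PIC acknowledgement are left out.

```nasm
        BITS    32
x86.Desc.Access EQU     5               ; Offset of the Access byte
x86.Desc.Mask   EQU     9Fh             ; Present, S and Type bits
x86.Desc.TSS    EQU     89h             ; Present, available 32-bit TSS
GDT.Limit       EQU     0FFh            ; Hypothetical: 32 Descriptors

NextTask:
        MOV     EAX,EBX                 ; Start from the current TSS
.Loop:
        ADD     EAX,8                   ; Next Descriptor
        CMP     EAX,GDT.Limit
        JB      .Check
        XOR     EAX,EAX                 ; Wrap around to the start of the GDT
.Check:
        CMP     EAX,EBX
        JE      .None                   ; Back where we started: nothing to do
        MOV     CL,[EAX+x86.Desc.Access]
        AND     CL,x86.Desc.Mask
        CMP     CL,x86.Desc.TSS         ; A Busy TSS would be 8Bh
        JNE     .Loop
        PUSH    EAX                     ; Selector half of a FAR pointer
        PUSH    DWORD 0                 ; Offset half: ignored for a TSS
        JMP     FAR [ESP]               ; Task Switch!
        ADD     ESP,8                   ; Resumes here when switched back in
.None:
        RET
```

The null Descriptor at offset 0 has an Access byte of zero, so it fails the type test and is skipped naturally after wrapping.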

Keyboard IRQ Handler

Main article: User:Johnburger/Demo/Ints/Key

A keyboard interrupt is also quite expected. The handler doesn't have to do much: simply get the received byte from the keyboard (which incidentally acknowledges the interrupt) and store it away somewhere. Where?

With DOS, the answer is inside a 16-byte buffer in the BIOS Data Area (BDA) to be sampled by future calls to various "Get Next Key" BIOS calls. If the buffer fills up, the interrupt handler causes the Speaker to bleat - sorry, no Speaker code available in this Demonstrator (another exercise for the reader?)
With Windows, the key press (or key release - they both cause interrupts, thankfully!) generates yet another Event for the Event Queue: say no more.

For this demo, I don't envisage a significant delay between the key press and any Task waiting to handle it. After all, there are only two types of Tasks waiting for key presses: the SingleStep handler waiting for a <Space> or <Enter>; and any of the Fault handlers waiting for a <Ctrl><Alt><Del>. Therefore, I've decided to merely store the last keycode received.

It's that multiple key combination that deserves a description: when a key is pressed, the keyboard controller generates an interrupt to provide a unique(ish) code for the key. The top bit (0x80) of that code is zero. If that key is released, another interrupt is generated for the same key - but this time that top bit is set. Also note that if a key is held down, after a delay a succession of interrupts is generated, all with the same keycode, and all with their top bit zero.

Using that "make/break" keycode high-bit flag, it is possible to determine which keys are currently held down. As well as storing the last keycode found, the Key interrupt handler will also keep track of which of the "Shift" style of keys are currently pressed: <Shift>, <Ctrl>, or <Alt>. Note that since some keyboards have two copies of these keys (left and right, explaining the "(ish)" above - they generate the same keycodes, but with a prefix...), there will be multiple flags to represent the multiple keys.
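A cut-down sketch of that logic, tracking only <Ctrl>, might look like the following. The variable offsets and names are hypothetical, DS is assumed to already hold the Global Data Segment, and the prefix byte is ignored, so both <Ctrl> keys share one flag here (which is exactly the shortcut the real handler avoids).

```nasm
        BITS    32
Data.Key.Last   EQU     0               ; Hypothetical offsets into Global Data
Data.Key.Shifts EQU     1
Key.Shifts.Ctrl EQU     01h             ; Hypothetical flag bit

KeyIRQ:
        PUSH    EAX
        IN      AL,60h                  ; Read the keycode from the controller
        MOV     [Data.Key.Last],AL      ; Remember the last keycode
        MOV     AH,AL
        AND     AH,7Fh                  ; Strip the make/break bit
        CMP     AH,1Dh                  ; <Ctrl>?
        JNE     .EOI
        TEST    AL,80h                  ; Top bit set = released
        JNZ     .Released
        OR      BYTE [Data.Key.Shifts],Key.Shifts.Ctrl
        JMP     .EOI
.Released:
        AND     BYTE [Data.Key.Shifts],~Key.Shifts.Ctrl
.EOI:
        MOV     AL,20h                  ; Acknowledge the Master PIC
        OUT     20h,AL
        POP     EAX
        IRETD
```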

Utility Functions

Main article: User:Johnburger/Demo/Ints/Util

Different interrupt handlers need to perform similar functions, to display Hex values or switch to the next Task. This module holds those common routines.

User Task

Main article: User:Johnburger/Demo/User/User

Given that this Demonstrator will be running a number of User Tasks, and that it doesn't have a Task Loader (other than the one at Boot), I need a Task to run.

I decided to make the User Task run in User Mode instead of Supervisor Mode. This demonstrates the extra work that needs to be done to accommodate the new mode, and also provides a platform from which to test various User Mode features: protection from Privileged instructions (such as HLT), as well as the complexities of making a CALL to the Kernel (an exercise for the reader!).

Note that it is easiest to make each User Task the same. This isn't of course necessary, but it provides a number of tasks without having to write a new one every time.

Each User Task is simple - it has been given an area of the screen to work with, and works out what to do with the area that it has been given:

It draws a border in the allocated area - if one will fit;
If the area is only 1x1, it simply continually increments the screen location. This has the effect of showing that the Task is actually running.
Otherwise, it draws a "ball" using the defined character, and "bounces" it around in its allotted space. This may not cover every location in the defined area - a square area, for example, would only bounce the ball from the top left corner to the bottom right and back again.

User LDT

Main article: User:Johnburger/Demo/User/LDT

Each Task will have its own context: not only its own Task State Segment (TSS), but also its own Data, Stack, and Code. No, wait... since Code in Protected Mode is always Read-only, we can share the code between the Tasks.

Finally, they will each have their own Local Descriptor Table.

This actually makes the shared code easier - I can use constants for the different Segments, rather than organise new Segments for every new Task.

User Data

Main article: User:Johnburger/Demo/User/Data

The User code as given doesn't really need data: it can keep everything in its registers. However, as a Demonstrator it makes sense to show how it's done, so this is the Data segment - every Task will have its own copy.

User Code

Main article: User:Johnburger/Demo/User/Code

The User Task is given its starting parameters in registers, set up by the Executive when the TSS was created. It needs to use these parameters to work out how it is going to behave:

Whether to draw a frame;
Whether it is going to be bouncing or incrementing the ball.

Draw Frame

Main article: User:Johnburger/Demo/User/Frame

If the width is greater than 2, and the height is greater than 2, there's room for a border. This uses the high half of the ASCII character table to draw a border around the Task's allocated screen area. It then shrinks the allocated area to exclude the border from future calculations.

Executive

Main article: User:Johnburger/Demo/Exec/Exec

Now that we're in Protected Mode, we have the full protection of the CPU for all of the code we write. Of course, that means that we need to be careful - but at least we get the benefit of the CPU for detecting code problems!

Before we can start the Executive proper, which creates the User Tasks, we really need to finish setting up Protected Mode: install better Fault Handlers; set up some devices with hardware interrupts; and so on.

Some of these routines will be calling utility functions, to allocate RAM and Descriptor Table entries. These routines are described below.

Update Registers

Main article: User:Johnburger/Demo/Exec/Init

The Flags register and all of the Segment Registers need to be initialised now that we're in Protected Mode. CS was initialised during the JMP to here, so it's just the Stack and Data segment registers to do.

But, since we'll be using Supervisor and User modes, not to mention the CPU's Task-switching mechanism, it's also necessary to set up the Task Register - and a TSS to store stuff into.

Install Fault Handlers

Main article: User:Johnburger/Demo/Exec/Ints/Ints

Intel recommends that certain Faults should be handled by a full Task Switch, rather than merely an interrupt routine. For example, the following code while in Supervisor mode will halt the CPU:

    MOV   ESP,1
    PUSH  EAX

This code will:

Cause the CPU to underflow the stack, causing a Stack Fault.
That will cause the CPU to push the current EFLAGS, CS, EIP and Error Code onto the stack... Oops! Invalid Stack! Now it's a Double Fault!
That will cause the CPU to push the current EFLAGS, CS, EIP and Error Code onto the stack... Oops! Invalid Stack! Now it's a Triple Fault!
That will cause the CPU to shut down.
The PC often has hardware to detect this condition, and will simply reboot the computer when it sees the CPU has shut down. Fun fun fun!

At least, the above scenario will halt the CPU unless the Stack Fault handler is a separate Task - or at least the Double Fault handler is one. That will switch in a known Stack at Fault time, avoiding the above scenario.

Therefore, we initialise a number of TSSes to take over the Intel-suggested Fault handlers in the IDT:

Int 8 - Double Fault
Int 10 - Invalid TSS
Int 12 - Stack Fault

And we add some other interrupt handlers, for Debugging support and demonstration purposes.

Install Trace Handler

Main article: User:Johnburger/Demo/Exec/Ints/Trace

The Trace handler has already been written: it's just a matter of installing it.

Install Debug Handler

Main article: User:Johnburger/Demo/Exec/Ints/Debug

The Debug handler has already been written: it's just a matter of installing it.

Install No Segment Handler

Main article: User:Johnburger/Demo/Exec/Ints/NoSeg

The No Segment handler has already been written: it's just a matter of installing it.

Install Double Fault Handler

Main article: User:Johnburger/Demo/Exec/Ints/Double

The generic Fault handler has been written, and it is currently pointed to by the Double Fault IDT entry, but that's not good enough. It really ought to have its own context - especially its own Stack - so this module will set one up for it, and fix the IDT entry.

Given that it's not the only Fault handler that needs this special treatment, it will use Executive's more sophisticated generic Fault handler, which will finally CALL the above generic handler to give the same appearance as before, but in a safer context.

Install Stack Fault Handler

Main article: User:Johnburger/Demo/Exec/Ints/BadStack

The generic Fault handler has been written, and it is currently pointed to by the Stack Fault IDT entry, but that's not good enough. It really ought to have its own context - especially its own Stack - so this module will set one up for it, and fix the IDT entry.

Given that it's not the only Fault handler that needs this special treatment, it will use Executive's more sophisticated generic Fault handler, which will finally CALL the above generic handler to give the same appearance as before, but in a safer context.

Install Invalid TSS Handler

Main article: User:Johnburger/Demo/Exec/Ints/BadTSS

The generic Fault handler has been written, and it is currently pointed to by the Invalid TSS Fault IDT entry, but that's not good enough. It really ought to have its own context - especially its own Stack - so this module will set one up for it, and fix the IDT entry.

Given that it's not the only Fault handler that needs this special treatment, it will use Executive's more sophisticated generic Fault handler, which will finally CALL the above generic handler to give the same appearance as before, but in a safer context.

Generic Fault Handler Task

Main article: User:Johnburger/Demo/Exec/Ints/Fault

This is the more sophisticated generic Fault handler, which runs in its own Task context. That means that it needs to point to the registers in the faulting Task (using the TSS Back link field) - which means even more complex code!

Initialise PICs

Main article: User:Johnburger/Demo/Exec/PICs

Another IBM PC legacy that won't go away! IBM decided to ignore Intel's reservation of the first 32 CPU interrupts for themselves, and mapped the hardware Interrupt ReQuest (IRQ) vectors for IRQs 0 to 7 to Interrupts 8 to 15.

When Intel used some of those interrupts in its 80286, every system from then on either had to deal with the overlaying interrupt causes, or had to reprogram the PICs to change their mapping.

I've decided to use Interrupts 32 to 39 for IRQs 0-7, and Interrupts 40 to 47 for IRQs 8-15.
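The clash is easy to show with a little arithmetic. This Python sketch is a model, not the Demo's PIC code: it assumes the usual BIOS defaults (IRQs 0-7 at 08h, IRQs 8-15 at 70h) and a contiguous remapping starting at 32. All the names are hypothetical.

```python
LEGACY_BASES = (0x08, 0x70)   # BIOS defaults: IRQ 0-7 -> 08h-0Fh, IRQ 8-15 -> 70h-77h
REMAPPED_BASES = (32, 40)     # Remapped: IRQ 0-15 -> Interrupts 32-47

def irq_to_vector(irq, bases):
    """Return the interrupt vector a PIC raises for the given IRQ (0-15)."""
    master_base, slave_base = bases
    return master_base + irq if irq < 8 else slave_base + (irq - 8)

CPU_RESERVED = range(32)      # Intel reserves Interrupts 0-31 for the CPU

clashes = [irq for irq in range(16)
           if irq_to_vector(irq, LEGACY_BASES) in CPU_RESERVED]
print(clashes)                # IRQs whose legacy vector overlaps a CPU interrupt
```

With the legacy mapping, IRQs 0 to 7 all land on CPU-reserved vectors; with the remapped bases none of them do.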

Initialise Timer

Main article: User:Johnburger/Demo/Exec/Timer

At boot time, the BIOS programs the Timer to interrupt as slowly as possible, which ends up being 18.2 times a second. Given that the BIOS only uses this Timer to update the current date and time (with a resolution of only one second), and to turn off the Floppy Drive motor a little while after the last access (not a time-critical operation!), this seems like a reasonable choice.

Of course, if a multi-tasking system wants to give the appearance of doing lots of things at once, 18.2 times a second is rather slow. So what's better? 60 times? 100 times? 1,000 times? More? Experiment! I chose 1,000 times, just for the giggle factor.
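If you do experiment, the numbers come from dividing the Timer's input clock by a 16-bit count. The sketch below (Python, hypothetical names) assumes the standard PC input frequency of 1,193,182 Hz and that the largest count, 65536, is the one the BIOS uses.

```python
PIT_INPUT_HZ = 1_193_182      # Standard PC Timer input clock, about 1.19 MHz

def pit_divisor(target_hz):
    """Return the 16-bit reload value that best approximates target_hz."""
    divisor = round(PIT_INPUT_HZ / target_hz)
    if not 1 <= divisor <= 65536:
        raise ValueError("rate is out of range for a 16-bit counter")
    return divisor

def actual_rate(divisor):
    """Return the interrupt rate in Hz produced by a given divisor."""
    return PIT_INPUT_HZ / divisor

print(round(actual_rate(65536), 1))   # Slowest possible rate
print(pit_divisor(1000))              # Count for roughly 1,000 times a second
```

The slowest rate works out to 18.2 Hz, and 1,000 times a second needs a count of 1193. Note that the hardware treats a programmed count of 0 as 65536.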

And of course the Timer Interrupt handler needs to be installed, which will switch between the available tasks.

Initialise Keyboard

Main article: User:Johnburger/Demo/Exec/Key

There's not a lot that you need to do for the keyboard. You could instruct it to use a different mapping, but the BIOS-defined default works well enough for our purposes.

We just need to set up a Keyboard Interrupt handler to save away keypresses as they occur.

Create User Tasks

Main article: User:Johnburger/Demo/Exec/User/User

Well, everything that we want initialised is now ready. There's nothing more for it but to let the system begin! And that's as simple as merely enabling interrupts (actually, that was done earlier...): the Timer handler will cycle through the list of active Tasks. All we need to do is create them!
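Cycling through the list of Tasks amounts to a round-robin choice. The following Python sketch is a guess at the shape of that logic, not a transcription of the Timer handler; it models the System.TSS rule described under Experiments as a simple boolean flag on each Task.

```python
def next_task(tasks, current):
    """Pick the next Task to run, round-robin.

    tasks is a list of dicts, each with a boolean 'system' entry.
    """
    if tasks[current]["system"]:
        return current                    # Never switch away from a System Task
    count = len(tasks)
    for offset in range(1, count + 1):
        candidate = (current + offset) % count
        if not tasks[candidate]["system"]:
            return candidate              # Never switch to a System Task either
    return current

tasks = [{"system": False}, {"system": True}, {"system": False}]
print(next_task(tasks, 0))
```

Here Task 1 is skipped, so the switch from Task 0 goes straight to Task 2.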

Create User's LDT

Main article: User:Johnburger/Demo/Exec/User/LDT

The User's LDT is defined in the User's code, but that is so that it can access its symbols for the Selector offsets. It actually has to be populated by the Executive before use - which is this code!

Create User's TSS

Main article: User:Johnburger/Demo/Exec/User/TSS

Similarly, the TSS has to be populated by the Executive before the User code can start - which is this code!

Allocator Functions

Main article: User:Johnburger/Demo/Exec/Alloc/Alloc

Some utility functions are needed by both the Initialisation code and the Main Task itself, to allocate generic system structures for tailoring by the rest of the code. These are provided here.

RAM Allocator

Main article: User:Johnburger/Demo/Exec/Alloc/RAM

Keeping track of what RAM has been used and what is still available is important on any dynamic system. Since this demonstration will be allocating RAM but never returning it, it certainly makes the job easy! All we have to do is avoid allocating addresses that are already in use, or actually missing!

Of course, that's only the start of the problem. Once the RAM is allocated, it then needs to be assigned to a segment: without a segment descriptor, we can't access the RAM we've just allocated!
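An allocate-and-never-free scheme needs little more than a "next free address" and a "top of RAM". This Python class is a minimal sketch of the idea under that assumption; the class and its fields are hypothetical and say nothing about how the Demo stores its bookkeeping.

```python
class RamAllocator:
    """Hand out RAM upwards from a start address and never take it back."""

    def __init__(self, first_free, top_of_ram):
        self.next_free = first_free       # Everything below is already in use
        self.top_of_ram = top_of_ram      # Everything at or above is missing

    def alloc(self, size):
        """Return the base address of a new block, or None if RAM has run out."""
        base = self.next_free
        if base + size > self.top_of_ram:
            return None
        self.next_free = base + size
        return base

ram = RamAllocator(first_free=0x1000, top_of_ram=0x2000)
print(hex(ram.alloc(0x800)))
```

The returned base address is only half the job: it still has to be wrapped in a segment descriptor before any code can use it.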

DT Allocator

Main article: User:Johnburger/Demo/Exec/Alloc/DT

If RAM has been allocated, or a system table needs to be created, a descriptor for it needs to be added to a Descriptor Table. There are actually three descriptor tables that need updating: the GDT, the IDT, and the current LDT.
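For reference, this is how a memory Segment Descriptor is laid out in its 8 bytes. The Python function is only an illustration of the x86 format (the name and parameter split are mine); the Demo builds its descriptors with its own macros and allocator calls.

```python
import struct

def encode_descriptor(base, limit, access, flags):
    """Pack a Segment Descriptor into its 8-byte in-memory form.

    base: 32-bit, limit: 20-bit, access: byte 5 (Present, DPL, Type),
    flags: high nibble of byte 6 (Granularity, Default size).
    """
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                                # Limit 15..0
        base & 0xFFFF,                                 # Base 15..0
        (base >> 16) & 0xFF,                           # Base 23..16
        access & 0xFF,                                 # Access byte
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # Flags + Limit 19..16
        (base >> 24) & 0xFF,                           # Base 31..24
    )

# A flat 4 GiB, 32-bit, ring 0 Code Segment
print(encode_descriptor(0, 0xFFFFF, 0x9A, 0xC).hex())
```

The Base and Limit are scattered across the descriptor for 80286 compatibility, which is why a helper (or macro) is worth having.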

Executive's LDT

Main article: User:Johnburger/Demo/Exec/LDT

The Executive's LDT can be used for a number of things other than just holding the Segments needed to run it: it is also the perfect place to put the Segments required by the Executive's more sophisticated Fault handlers. Rather than cluttering the GDT with various non-global structures, I have defined this LDT as the LDT for the Fault handlers' TSSes.

Global Descriptor Table

Main article: User:Johnburger/Demo/Exec/GDT

And finally, the last thing: (arguably) the most important structure in the entire system, the Global Descriptor Table.

I've coded it last to allow it to grow, without having to reposition it at run-time first. Its current position is good enough - it's (up to) 64 kiB in size, so a healthy chunk of memory has been reserved for it, even though only the first few entries are defined at startup.

Padding

Main article: User:Johnburger/Demo/Pad/Sizes

The binary output from the assembler is exactly as large as it needs to be - which isn't likely to be a perfect size for any storage medium. For this reason we need to Pad the binary output to round it up to the next Sector size. But this is also a perfect opportunity to Pad it with the structures and tables necessary to create an ISO image - without using a post-assemble tool!

Before we can do any Padding, though, we need a definitive size for each of the different Segments in the code. This module simply re-opens each Segment and defines a symbol for its .Base, .Size and .Limit.

HardDisk

Main article: User:Johnburger/Demo/Pad/HardDisk

Padding for the Hard Disk is not difficult, since the hard part was the Boot record - and that was handled in Demo/Boot/Boot. All we have to do here is Pad to the next Sector size.
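The amount of Padding is a one-line calculation. This Python sketch assumes the usual 512-byte Hard Disk Sector; the function name is hypothetical, and the Demo itself does the equivalent arithmetic at assembly time.

```python
def padding_needed(size, sector_size=512):
    """Bytes to append so that size becomes a whole number of sectors."""
    return -size % sector_size

print(padding_needed(1000))   # 24 more bytes reach 1024
print(padding_needed(1024))   # Already a whole number of sectors
```

The same function works for other media by passing a different sector_size, for example 2048 for a CD-ROM.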

ISO

Main article: User:Johnburger/Demo/Pad/ISO/ISO

ISO 9660 is the detailed specification for the format recorded on a CD-ROM. Further, El Torito is the specification for making a CD-ROM bootable. Due to a quirk in these specifications, it's possible to "Pad" a binary image and turn it into a bootable CD-ROM image! As long as the original binary obeys some restrictions, at least...

I have tried to make these files as stand-alone as possible, so that they can be easily added to your own project. The main article describes the required definitions.

Definitions

Main article: User:Johnburger/Demo/Pad/ISO/Defn

ISO 9660 defines a number of small and large structures, so I've defined most of them here, along with some helper macros to make my life easier. You will (probably) not need to modify any definitions here for your own use.

Primary Volume

Main article: User:Johnburger/Demo/Pad/ISO/Primary

All CD-ROMs have a Primary Volume.

Boot Volume

Main article: User:Johnburger/Demo/Pad/ISO/Boot

Only Boot CD-ROMs have a Boot Volume. This is defined by the El Torito specification.

Terminator Volume

Main article: User:Johnburger/Demo/Pad/ISO/Terminator

At the end of the list of volumes there needs to be a Terminator Volume.

Catalog Sector

Main article: User:Johnburger/Demo/Pad/ISO/Catalog

The El Torito specification requires a Catalog of Boot entries. For simplicity, I only define one.

Path Table(s)

Main article: User:Johnburger/Demo/Pad/ISO/PathTable

They designed the CD File System in such a way that the same CD could be used by both little- and big-endian systems. Most of the time that is accomplished by recording the same multi-byte value twice, first in little-endian, then in big. They refer to this as "both-endian" format, and the reading software merely picks out the representation that it prefers.
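As a concrete illustration of "both-endian" (in Python rather than NASM macros, with hypothetical function names), a 32-bit value occupies eight bytes on the disc and a 16-bit value four:

```python
import struct

def both_endian_32(value):
    """Encode a 32-bit value little-endian first, then big-endian (8 bytes)."""
    return struct.pack("<I", value) + struct.pack(">I", value)

def both_endian_16(value):
    """Encode a 16-bit value little-endian first, then big-endian (4 bytes)."""
    return struct.pack("<H", value) + struct.pack(">H", value)

print(both_endian_32(0x12345678).hex())
```

A little-endian reader takes the first half of each field and a big-endian reader the second half.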

But there's one structure, the Path Table, where they decided that it was too difficult to maintain, so they defined that the entire structure be recorded twice, once in each format. Since for this Demonstrator there is practically no difference between the two versions, I have written the source code for the structure generically, and used macros where the endian-ness is important.

Then it is simply a matter of:

`%define`ing the macro as Little;
`%include`ing this file once;
Re-`%define`ing the macro as Big;
Re-`%include`ing this file a second time.

I love macros!

Root Directory

Main article: User:Johnburger/Demo/Pad/ISO/RootDir

This ISO image is a rarity, and could be mistaken for a coaster: it doesn't have any files on it! The important stuff is in the Boot sector and following loaded binary; there are no files per se required. Unfortunately, ISO 9660 requires a root directory - even if it's empty.

Floppy

Main article: User:Johnburger/Demo/Pad/Floppy

Old versions of VMware wouldn't allow you to use a file image instead of a physical Floppy Disk, unless the file image precisely matched a standard Floppy size. I suspect the file size was used internally by VMware to work out what drive geometry to use (sectors per track etc.). So for years I added a huge number of zeroes onto the end of my experiments to make them 720 kiB, 1,200 kiB or 1,440 kiB in size.

One day, after a few VMware upgrades, I forgot - and it worked anyway. So I offer this code here in case it will be of any use. Personally, I doubt it...
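In case it is of use, the number of zeroes to add can be worked out like this. It is a Python sketch with hypothetical names, using only the three Floppy sizes listed above:

```python
FLOPPY_SIZES_KIB = (720, 1200, 1440)   # The standard sizes mentioned above

def floppy_padding(image_bytes):
    """Return (target size in KiB, zero bytes to append) for the smallest fit."""
    for size_kib in FLOPPY_SIZES_KIB:
        target = size_kib * 1024
        if image_bytes <= target:
            return size_kib, target - image_bytes
    raise ValueError("image is too large for a standard Floppy")

print(floppy_padding(46 * 1024))
```

A 46 kiB image, for example, would be padded up to the 720 kiB size.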

Experiments

As well as adding MOV ESP,1 somewhere, there are a number of other experiments you can do with this code: some for the giggle factor, and others to explore the functionality of the CPU.

`HLT` vs `Yield` vs `LOOP`

In Demo/User/Code there are four different ways of getting the User-mode code to wait between each update of the ball position:

.Wait:
;                HLT                             ; Illegal User-mode instruction!

                  User.Halt                       ; So, use system-provided one

;                User.Yield                      ; This has a different effect

; This is an alternative to using HLT
;                PUSH            ECX
;                MOV             ECX,User.Delay  ; Wait for a little while
;                LOOP            $               ; About this long...
;                POP             ECX

You can uncomment each of them and see how it affects the behaviour of the individual Tasks - or comment them all out! I'm sorry I don't have a CPU usage display, but if you use a laptop and leave it running for a little while, pretty soon the CPU fan will become very noisy!

`System.TSS`

I defined the System.TSS flag to both prevent switching to a System Task, and to prevent switching from one. I'm not suggesting that you do it this way in your system, but you will need something! Take a look at what happens if you don't put it in:

In Demo/Ints/Debug/Init, comment out the CALL Ints.System.Init and CALL Ints.System.Done lines;
Assemble and start the system;
Invoke the Debugger by pressing <Break>;
Since the Debugged Task is no longer marked as special, the Task Switcher happily switches away from it and bounces balls through the display.
Select a different Segment with the <Right> arrow key;
Press <Break> again;
In all likelihood, you'll start Debugging a second Task, rather than Debugging the Debugger. You'll see the system try to display two different Segments, as well as bouncing balls.
Press <Break> a few more times, and press <Right> and <Left> arrow a bunch of times too;
Now you'll see fewer bouncing balls, but more and different Segments.
Keep pressing <Break> until you finally recurse one Task's Debugger deep enough to underflow its Stack.

Now the system is inside a System.TSS, courtesy of the Stack Fault (Int 0Ch) Task, and the Task Switcher stops the madness. You can now invoke the Debugger and examine the memory without being disturbed.

Default bit in Code Descriptor

The assembler has been told that User-mode code is USE32, so it uses the 32-bit forms for all the instructions. What happens if we get this wrong? Let's find out!

In Demo/Exec/User/LDT, there are some blocks of code to initialise the different User LDT entries. Go to the block that initialises the Code Segment:

                MOV             EAX,User.Base           ; Assembled User code
                MOV             ECX,User.Size
                MOV             DL,Type.Mem(Code, DPL3, NoRW) & ~x86.Desc.Type.Present
                MOV             DH,Gran.Mem(Byte, Def32)
                CALL            Exec.Alloc.LDT.Mem      ; Allocate LDT Entry

and change Def32 to Def16. This now tells the CPU to execute User code as 16-bit code - but we haven't told the assembler that!

Assemble and run the code. It should crash instantly with a GPF (Int 0Dh) in User Code (CS=002F), at (EIP) address 0000_0005. Take a look at the first few lines of User Code in the assembly listing:

                             <2>                 SEGMENT         User  VSTART=0  ALIGN=16
                             <2> 
                             <2> User.ColouredBall EQU           (User.BallColour << 8) | User.Ball
                             <2> 
                             <2> User.Entry:
00000000 FC                  <2>                 CLD                             ; Work forwards
00000001 891D[04000000]      <2>                 MOV             [User.Data.Row],EBX
00000007 668915[00000000]    <2>                 MOV             [User.Data.Left],DX  ; Left and Top
0000000E 66890D[02000000]    <2>                 MOV             [User.Data.Width],CX ; Width and Height
                             <2>

There isn't an instruction at 0000_0005! But if you look at the instruction at 0000_0001 you'll see the op-code 89 1D. In 32-bit mode this means "Move the contents of EBX into the following memory location" - followed by a 32-bit address. Only, the CPU is actually in 16-bit mode, where the same two bytes decode as MOV [DI],BX with no address bytes at all. The next two bytes, 04 00, then decode as ADD AL,0, so the CPU expects the following instruction to be at - you guessed it! - 0000_0005. And what's there? The op-code 00 00, which decodes as ADD [BX+SI],AL - which is way past DS's Limit, and causes the GPF.
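Walking the bytes with a tiny decoder shows where each mode thinks the instructions begin. This Python sketch handles only the op-codes that occur in these first few bytes, and assumes the bracketed addresses in the listing are emitted exactly as shown; it is an illustration, not a disassembler.

```python
# CLD; MOV [Row],EBX; 66h MOV [Left],DX - bytes taken from the listing above
CODE = bytes.fromhex("FC 891D04000000 668915 00000000")

def modrm_extra(code, pos, bits):
    """Count the SIB and displacement bytes that follow the ModRM at code[pos]."""
    mod, rm = code[pos] >> 6, code[pos] & 7
    if mod == 3:                          # Register operand: nothing follows
        return 0
    if bits == 16:
        if mod == 0:
            return 2 if rm == 6 else 0    # Only rm=110 carries a 16-bit address
        return 1 if mod == 1 else 2
    extra = 1 if rm == 4 else 0           # 32-bit addressing: rm=100 adds a SIB
    if mod == 0:
        sib_base_5 = rm == 4 and (code[pos + 1] & 7) == 5
        return extra + (4 if rm == 5 or sib_base_5 else 0)
    return extra + (1 if mod == 1 else 4)

def instruction_starts(code, bits):
    """List the offsets where the CPU would start decoding each instruction."""
    starts, pos = [], 0
    while pos < len(code):
        starts.append(pos)
        if code[pos] == 0x66:             # Operand-size prefix: address size unchanged
            pos += 1
        opcode = code[pos]
        if opcode == 0xFC:                # CLD
            pos += 1
        elif opcode == 0x04:              # ADD AL,imm8
            pos += 2
        elif opcode in (0x00, 0x89):      # ADD r/m8,r8 and MOV r/m,r
            pos += 2 + modrm_extra(code, pos + 1, bits)
        else:
            raise ValueError(f"opcode {opcode:02X} not handled by this sketch")
    return starts

print(instruction_starts(CODE, 32))
print(instruction_starts(CODE, 16))
```

Decoded as 32-bit code the instructions start at offsets 0, 1 and 7, matching the listing. Decoded as 16-bit code they start at 0, 1, 3 and 5 (and onwards), which is how the CPU arrives at an "instruction" at 0000_0005 that the assembler never wrote.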
