Kernel Debugging

From OSDev Wiki

Humans make mistakes. Some of these mistakes may end up being part of your OS. Since bugs are more difficult to find than to fix, this page provides a list of common techniques that can be used to isolate bugs in your OS.

Debug statements and log files

The first solution is probably the easiest, and depends on what kind of information you want to get back from your debugger.

The problem with using a debugger such as DDD or GDB directly is that it needs an OS to run on, which is of little use when the OS itself is what you want to debug.

Debugging is essentially being able to probe the contents of a variable at a specific breakpoint. When your program hits the breakpoint, you can probe the variable.

This can also be achieved without using a debugger, by instead inserting a line of code to write to the screen or to a log of some kind. This gives you the contents of the variable that you are interested in - but it means knowing in advance what variable to check, and when, and implies recompiling the kernel every time you want to check a different set of variables... but it is the simplest solution.
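
As a minimal sketch of such a debug statement: a freestanding kernel has no printf, so the value has to be formatted by hand and pushed through whatever character output routine the kernel already has. The names klog_hex, klog_putc_fn and klog_capture_putc are hypothetical and only serve the illustration; the capture sink exists solely so the formatter can be tried on a host.

```c
#include <stdint.h>

/* Hypothetical character sink: in a real kernel this would write to the
 * screen, a serial port or an in-memory log. */
typedef void (*klog_putc_fn)(char c);

/* Emit "<name>=0x<8 hex digits>\n" through the sink, without using libc. */
static void klog_hex(klog_putc_fn put, const char *name, uint32_t value)
{
    static const char digits[] = "0123456789abcdef";

    while (*name)
        put(*name++);
    put('=');
    put('0');
    put('x');
    /* Most significant nibble first */
    for (int shift = 28; shift >= 0; shift -= 4)
        put(digits[(value >> shift) & 0xF]);
    put('\n');
}

#if __STDC_HOSTED__
/* Host-side sink (assumption: used only for trying the formatter outside
 * the kernel). It collects the output in a NUL-terminated buffer. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char klog_capture[64];
static size_t klog_capture_len;

static void klog_capture_putc(char c)
{
    if (klog_capture_len + 1 < sizeof klog_capture) {
        klog_capture[klog_capture_len++] = c;
        klog_capture[klog_capture_len] = '\0';
    }
}
#endif
```

In kernel code a call might look like klog_hex(my_putc, "cr2", fault_address), where my_putc and fault_address stand for whatever your kernel provides.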

Pseudo-Breakpoints

In places where a full print or logging function is not feasible (such as when trying to isolate a single erroneous assembly language instruction), you can create a kind of 'pseudo-breakpoint' by inserting a "1: jmp 1b" instruction into the code. These can be used to perform a binary search (often referred to as a 'binary chop') through the code. The idea is to place the endless loop at a point roughly halfway through the part of the code suspected to be at fault; if the CPU reaches the loop and hangs there before the error occurs, then you know that the error is after the breakpoint, otherwise it must be in the code before the breakpoint. Repeat this procedure until the error is isolated. Unfortunately, this only works if the result of the error can be told apart from the hang caused by the loop itself, and it does little for a problem that occurs more than one iteration into a loop, such as an array overrun. However, you can use a virtual machine debugger to single step from a pseudo-breakpoint (see "Using Debuggers with VMs" below).

IMPORTANT NOTE #1: the HLT instruction is a privileged instruction, and as such it will only work in your kernel. The pseudo-breakpoint "1: jmp 1b" is unprivileged, and works from user mode too.

IMPORTANT NOTE #2: gcc thinks it is smarter than the programmer, so if you use "while(1);", it will conclude that everything after that loop is unreachable and REMOVE all of that code from the binary. You MUST use inline assembly so that gcc will keep your code as-is.

asm volatile ("1: jmp 1b");

Use a virtual machine

A virtual machine is a program that simulates another computer (Java coders should be familiar with the concept).

There are a number of virtual machines that can simulate x86 machines; my favorite is Bochs (http://bochs.sourceforge.net). Bochs is capable of setting breakpoints in any kind of software (even if it is compiled without debugging info!), and provides an additional "debugging out port" you can easily access from within your kernel code to print debug messages.
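
A minimal sketch of using that debug port from C, assuming the port in question is Bochs' I/O port 0xE9 and that the feature is switched on in your bochsrc (typically "port_e9_hack: enabled=1"). The function names are hypothetical. The hosted branch is an assumption added so the string loop can be exercised on a development machine; only the freestanding branch touches the real port.

```c
#include <stdint.h>

#define BOCHS_DEBUG_PORT 0xE9   /* Bochs debug output port */

#if __STDC_HOSTED__
/* Hosted build (testing only): record the bytes instead of doing port I/O. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char e9_capture[128];
static size_t e9_len;

static void e9_outb(uint8_t byte)
{
    if (e9_len + 1 < sizeof e9_capture) {
        e9_capture[e9_len++] = (char)byte;
        e9_capture[e9_len] = '\0';
    }
}
#else
/* Freestanding x86 build: a real OUT instruction to port 0xE9. */
static inline void e9_outb(uint8_t byte)
{
    __asm__ volatile ("outb %0, %1"
                      : /* no outputs */
                      : "a"(byte), "Nd"((uint16_t)BOCHS_DEBUG_PORT));
}
#endif

/* Send a NUL-terminated string to the debug port, one byte at a time. */
static void bochs_puts(const char *s)
{
    while (*s)
        e9_outb((uint8_t)*s++);
}
```

Text written this way shows up on the console that Bochs was started from, not on the emulated screen.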

The main downside to using a virtual machine like this is that all the code is displayed in assembler (or binary depending on what machine you choose) - instead of the C/C++ source you originally wrote. Also, simulating a virtual machine is slower than an actual machine, and the VM might not even behave exactly like the "real" hardware.

That being said, there are also a lot of other advantages to using a VM. For example, you don't have to reboot to test your new OS, you just start the VM.

Another virtual machine called Simics (https://www.simics.net) is capable not only of breakpoints and displaying register information, but it is also capable of opening a port for use with debugging with DDD (the simics command is 'gdb-remote'). Using this combination, it is possible to see your C source code as you step through the OS! However, the Bochs virtual machine is much faster at executing the OS than Simics and thus serves as a better virtual machine to run the OS, while Simics is the better debugger for those hard to find problems.

Using the serial port

Writing logfiles with QEMU

QEMU allows you to redirect everything that you send to the COM1 port to a file on your host computer. To enable this feature, add the following flag when launching QEMU:

-serial file:serial.log

... where "serial.log" is the path to the output file. Once you have this feature enabled, you can write log entries by simply writing characters to the COM1 port (reading from the file over the serial port is not supported).
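
Writing characters to COM1 can be sketched as follows, assuming the conventional PC layout with COM1 at I/O base 0x3F8 and the line status register at offset 5. The function names are hypothetical, and the hosted branch is a stand-in added only so the logic can be tried outside the kernel. On real hardware you would also have to program the baud rate and line settings first, which this sketch leaves out.

```c
#include <stdint.h>

#define COM1_BASE      0x3F8
#define COM1_DATA      (COM1_BASE + 0)  /* transmit holding register */
#define COM1_LSR       (COM1_BASE + 5)  /* line status register */
#define LSR_THR_EMPTY  0x20             /* bit 5: transmitter can take a byte */

#if __STDC_HOSTED__
/* Hosted build (testing only): a fake UART that is always ready and
 * records every byte written to the data register. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

static char com1_capture[128];
static size_t com1_len;

static uint8_t port_in(uint16_t port)
{
    return port == COM1_LSR ? LSR_THR_EMPTY : 0;
}

static void port_out(uint16_t port, uint8_t value)
{
    if (port == COM1_DATA && com1_len + 1 < sizeof com1_capture) {
        com1_capture[com1_len++] = (char)value;
        com1_capture[com1_len] = '\0';
    }
}
#else
/* Freestanding x86 build: real port I/O. */
static inline uint8_t port_in(uint16_t port)
{
    uint8_t value;
    __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
    return value;
}

static inline void port_out(uint16_t port, uint8_t value)
{
    __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
}
#endif

/* Wait until the transmitter is free, then send one character. */
static void serial_putc(char c)
{
    while ((port_in(COM1_LSR) & LSR_THR_EMPTY) == 0)
        ;   /* busy-wait */
    port_out(COM1_DATA, (uint8_t)c);
}

/* Write a NUL-terminated log line to COM1. */
static void serial_log(const char *msg)
{
    while (*msg)
        serial_putc(*msg++);
}
```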

On real hardware

When your real computer resets due to a programming error, anything you might have put on the screen will instantly vanish. If you're tampering with the video card, you will often find yourself with no visual debugging method at all. If you have a pair of computers connected with a null-modem cable, you can instead send all debug statements over the serial port and record them on your development machine, which is more stable. Using an actual serial terminal works just as well. It requires a bit of additional cabling, but it is fairly simple to set up and can prove to be a very good replacement for a VM log.

With remote debugger / GDB

Since serial works two ways, you can also control your kernel remotely in case of problems. This can be a simple interface, but you can also attach GDB onto the serial port and potentially get a full blown debugger running.

This is, however, rather tricky, since it requires additional hardware and special support coded into your kernel. You might want to read the kernel hacking how-to and (at minimum) chapter 20 of the GDB manual, and chances are that your debugger will introduce even more bugs at first.
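
To give an idea of what that special support involves, here is a small sketch of the packet framing used by GDB's remote serial protocol: each packet is sent as "$", the payload, "#", and a two-digit hex checksum that is the sum of the payload bytes modulo 256. The function name rsp_frame is hypothetical, the sketch uses strlen for brevity (a kernel stub would need its own), and it does not handle escaping of special characters or acknowledgements.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Frame `payload` as "$<payload>#<checksum>" into `out`.
 * Returns the number of characters written (excluding the terminating NUL),
 * or 0 if `out` is too small. */
static size_t rsp_frame(const char *payload, char *out, size_t out_size)
{
    static const char hex[] = "0123456789abcdef";
    size_t len;
    unsigned char sum = 0;

    assert(payload != NULL && out != NULL);
    len = strlen(payload);

    /* '$' + payload + '#' + two hex digits + NUL */
    if (out_size < len + 5)
        return 0;

    out[0] = '$';
    for (size_t i = 0; i < len; i++) {
        out[1 + i] = payload[i];
        sum = (unsigned char)(sum + (unsigned char)payload[i]);
    }
    out[1 + len] = '#';
    out[2 + len] = hex[sum >> 4];
    out[3 + len] = hex[sum & 0xF];
    out[4 + len] = '\0';
    return len + 4;
}
```

A stub in the kernel would send the framed bytes over the serial port and parse incoming packets from GDB using the same layout.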

Using Debuggers with VMs

Use GDB with QEMU

You can have QEMU wait for a GDB connection before it starts executing any code, so that you can debug from the very first instruction.

qemu -s -S <harddrive.img>

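...will set up QEMU to listen on port 1234 and wait for a GDB connection to it. (On current QEMU releases the binary is named after the target, for example qemu-system-i386 or qemu-system-x86_64.) Then, from a remote or local shell: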

gdb 
(gdb) target remote localhost:1234 

(Replace localhost with the remote IP address or hostname if necessary.) Then start execution:

(gdb) continue

But that's not all: you can compile your source code under GCC with debugging symbols using "-g". This will add all the debugging symbols to the kernel image itself (thus making it bigger). There is also a way to put all of the debugging information in a separate file using the "objcopy" tool, which is part of the GNU Binutils package.

objcopy --only-keep-debug kernel.elf kernel.sym

This will put the debugging information into a file called "kernel.sym". After that, to strip your executable of debugging information, you can do

objcopy --strip-debug kernel.elf

Or alternatively, if you are using a flat binary as your kernel image, you can do

objcopy -O binary kernel.elf kernel.bin

This produces a flat binary that can still be debugged using the previously extracted debug information.

You can import the symbols in GDB by pointing it at the file containing the debug information, either the extracted kernel.sym or the actual unstripped kernel image:

(gdb) symbol-file kernel.sym

From there, you can see the actual C source code as it runs, line by line! (Use the step or next command in GDB to execute one source line at a time; stepi executes a single machine instruction.)

Example:

$ qemu -s -S c.img
warning: could not open /dev/net/tun: no virtual network emulation
Waiting gdb connection on port 1234 

(gdb) target remote localhost:1234
Remote debugging using localhost:1234
0x0000fff0 in ?? ()
(gdb) symbol-file kernel.b
Reading symbols from kernel.b...done.

(gdb) break kmain
Breakpoint 1 at 0x101800: file kernel/kernel.c, line 12.

(gdb) continue

Breakpoint 1, kmain (mdb=0x341e0, magic=0) at kernel/kernel.c:12
12      {

The continue command above started code execution, which then stopped at the breakpoint set on kmain with "break kmain". The break command accepts any function in your kernel code. You can view the registers at any time with this command:

(gdb) info registers

I won't start explaining all the nice things about GDB, but as you can see, it is a very powerful tool for debugging OSes.

Alternatively, you can force a breakpoint in your code without knowing the name of the function or the address. Place an endless loop pseudo-breakpoint somewhere in your code:

asm volatile ("1: jmp 1b");

Then, when your VM hangs, press Ctrl+C in the terminal that's running gdb to stop execution and drop to the debugger prompt. There, the command

(gdb) set $pc += 2

will step over the endless loop (the jmp instruction is two bytes long), and you can start single stepping, executing one instruction at a time with

(gdb) si
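
A variation on this trick, not described above and offered only as a sketch, avoids the program counter arithmetic: spin on a volatile flag and clear the flag from the debugger. The names debug_hold and debug_wait are hypothetical. Because the flag is volatile, the compiler has to re-read it on every iteration and cannot remove the loop or the code after it.

```c
#if __STDC_HOSTED__
#include <assert.h>   /* only needed for host-side checks of this sketch */
#endif

/* Hypothetical flag: the kernel spins while it is non-zero. */
volatile int debug_hold = 1;

/* Call this where you want the kernel to wait for the debugger. */
void debug_wait(void)
{
    while (debug_hold)
        ;   /* each iteration re-reads the volatile flag from memory */
}
```

After interrupting the VM with Ctrl+C, "set var debug_hold = 0" followed by "continue" (or single stepping) at the gdb prompt lets the kernel leave the loop. This assumes GDB has symbols loaded for the kernel.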

Use bochs debugger

The easiest way to trigger a breakpoint in bochs is to place "xchg bx, bx" into your code (this requires magic breakpoints to be enabled in your bochsrc with "magic_break: enabled=1"). For example:

asm volatile ("xchg %bx, %bx");

Then, when you run the virtual machine, it will stop execution at that instruction and drop you at the debugger prompt. To single step from there, use

<bochs:1> s

Use VirtualBox debugger

Unfortunately, the "--start-dbg" command line option has been removed from VirtualBox, so there's no way to set up breakpoints before your VM starts execution. But you can use a similar trick as with GDB: place an endless loop pseudo-breakpoint somewhere in your code:

asm volatile ("1: jmp 1b");

Then, when execution hangs, open "Command line..." under the "Debug" menu (if you don't have a Debug menu in the machine window, you'll have to enable the debugger first; see the VirtualBox section below). In the debugger command line, the first thing you MUST do is stop the VM from running:

VBoxDbg> stop

This should dump the registers. But if not, then get the current RIP value with:

VBoxDbg> r

Once you have the current RIP, add 2 to it (the size of the jmp instruction) and set the new RIP. I couldn't find any way to reference RIP from the command line, so you have to use constants, for example:

VBoxDbg> r rip = 0xfffffffff1000102

Check if the current RIP correctly points to the instruction after the endless loop:

VBoxDbg> r

And you can start single stepping with

VBoxDbg> p

GUI frontends

While GDB provides a text-based user interface (available via the `-tui` command line option or by entering `wh` at the GDB prompt), you might want to use one of the available GUI frontends to GDB. These include but are not limited to:

* KDbg
* Insight
* DDD
* VisualKernel

Attaching to a QEMU session works similarly to the command line GDB described above.

Develop in hosted environment

Another possibility, which is also a great architectural exercise, is to code every software module in a hosted environment like Linux, and then port it to your OS. You can do this for kernel code too, not just usermode programs.

Suppose you want to develop your VFS interface implementation. You have already created the interface for block devices (it doesn't matter whether you have implemented it in your kernel yet). In this case, you can implement your block device interface as a set of wrappers that adapt your interface to POSIX calls. You then implement your VFS interface (i.e., the code that will manage the filesystem drivers in your kernel) on top of those wrappers. You test and debug your implementation entirely in the hosted environment, and when it is mature, you link it into your real kernel instead of against your hosted implementations. Finally, you test the newly introduced code in the freestanding environment to ensure it works there as well.
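
The wrapper idea can be sketched like this. Everything here is hypothetical: the struct blockdev interface, the function names, and the small has_boot_signature helper that stands in for "code built on top of the interface". For portability the sketch backs the device with C stdio calls on a disk image file instead of raw POSIX calls; the structure is the same either way.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 512

/* Hypothetical kernel-side block device interface. */
struct blockdev {
    int (*read_block)(struct blockdev *dev, uint32_t lba, void *buf);
    int (*write_block)(struct blockdev *dev, uint32_t lba, const void *buf);
    void *priv;     /* driver-private data; here a FILE * disk image */
};

/* Hosted wrappers: the "disk" is an ordinary file on the host. */
static int hosted_read_block(struct blockdev *dev, uint32_t lba, void *buf)
{
    FILE *img = dev->priv;

    if (fseek(img, (long)lba * BLOCK_SIZE, SEEK_SET) != 0)
        return -1;
    return fread(buf, BLOCK_SIZE, 1, img) == 1 ? 0 : -1;
}

static int hosted_write_block(struct blockdev *dev, uint32_t lba,
                              const void *buf)
{
    FILE *img = dev->priv;

    if (fseek(img, (long)lba * BLOCK_SIZE, SEEK_SET) != 0)
        return -1;
    if (fwrite(buf, BLOCK_SIZE, 1, img) != 1)
        return -1;
    return fflush(img) == 0 ? 0 : -1;
}

static void hosted_blockdev_init(struct blockdev *dev, FILE *img)
{
    assert(dev != NULL && img != NULL);
    dev->read_block = hosted_read_block;
    dev->write_block = hosted_write_block;
    dev->priv = img;
}

/* Code under development only ever sees struct blockdev, so the same
 * source can later be linked against the real kernel driver. */
static int has_boot_signature(struct blockdev *dev)
{
    unsigned char sector[BLOCK_SIZE];

    memset(sector, 0, sizeof sector);
    if (dev->read_block(dev, 0, sector) != 0)
        return 0;
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

In the real kernel, only the two wrapper functions and the init routine would be swapped for the actual driver; has_boot_signature and anything else layered on struct blockdev stays unchanged.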

Now, the Pros. First of all, you can use your favourite debugger. You can also use unit testing, for example, which is far better than testing software by hand, if you use the right method.

There are some Cons to this approach. For example, you are far from your target environment when you code like this. This is further aggravated by the fact that so-called freestanding environments are dramatically more sensitive to undefined behaviour, especially uninitialized variables. You can work around this limitation by asking the compiler to perform aggressive optimization while testing hosted, which makes software more sensitive to undefined behaviour, too. However, as the best debug environment is the final target environment, you will still want to test your code when you introduce it into your real kernel.

Another Con that will probably scare most people is that this approach requires you to consistently plan your interfaces beforehand. Depending on your specific requirements, you may still be able to avoid a too long planning phase. For example, if you want to throw away the hosted implementations once you get the modules working properly, then you don't have to bother maintaining the same interfaces forever.

Using an IDE

You can debug Linux kernel modules with Visual Studio if you use the VisualKernel plugin. Here's a tutorial showing a normal debugging session: http://visualkernel.com/tutorials/kgdb/

VirtualBox

Start your virtual machine with the command "VBoxManage startvm --putenv VBOX_GUI_DBG_ENABLED=true <Name>" and then a "Debug" menu will appear on the window. You can choose "Command Line" to open a debugging prompt.

Useful commands:

  • cpu x - switch CPU
  • r - view registers
  • dq <Address> - dump 48 bytes of memory at the given virtual address as quadwords
  • .pgmphystofile "File Path" - dump physical memory to file
  • info help/<Name> - view device information
