From OSDev Wiki
Jump to navigation Jump to search
Stack can also refer to the TCP/IP stack in Networking. This article discuss the datastructure and stacks used in architectures.
A normal stack, that grows upwards.

A stack is a datastructure. You can push and pop elements to and from it respectively. Unlike a FIFO (first in, first out) however, the popped elements from a stack are the elements that you pushed last. Because of this, a stack is also termed LIFO (last in, first out) or FILO (first in, last out).

In the X86 architecture, and many others, there is one stack used for code execution. It is used to store return pointers when calling routines, but you can also store temporary data and local variables on it.

Stack theory

Many languages and architectures have a stack at their disposal. When return values are stored on it, the concept of stack frames emerges. A stack is divided into a number of stack frames. Each stack frame contains the local/temporary data for the routine, parameters, and a return value to the former routine (the caller).

Stack example on the X86 architecture

On the X86 architecture the stack grows downwards. The stack frames have a certain structure regarding the calling convention. The CDECL calling convention is the most widely used. It is most likely used by your compiler. Two registers are used:

  • ESP: Extended Stack Pointer. 32 bit value containing the top-of-stack address (more accurately the ``bottom-of-stack`` on X86!)
  • EBP: Extended Base Pointer. 32 bit value defining the current stack frame, when using CDECL calling convention. It points at the current local data. It can also access the routine parameters.

Take care when implementing your kernel. If you use segmentation, the DS segment should be configured to have its base at the same address as SS does. Otherwise you may run into problems when passing pointers to local variables into functions, because normal GPRs can't access the stack the way you might think.

Here is an example stack. The elements are 4 byte words in protected mode:

Memory address:   Stack elements:

 0x105000      | Parameter 1 for routine 1  | \
               +----------------------------+  |
 0x104FFC      | First callers return addr. |   >  Stack frame 1
               +----------------------------+  |
 0x104FF8      | First callers EBP          | /
 0x104FF4   +->| Parameter 2 for routine 2  | \  <-- Routine 1's EBP
            |  +----------------------------+  |
 0x104FF0   |  | Parameter 1 for routine 2  |  |
            |  +----------------------------+  |
 0x104FEC   |  | Return address, routine 1  |  |
            |  +----------------------------+  |
 0x104FE8   +--| EBP value for routine 1    |   >  Stack frame 2
               +----------------------------+  |
 0x104FE4   +->| Local data                 |  | <-- Routine 2's EBP
            |  +----------------------------+  |
 0x104FE0   |  | Local data                 |  |
            |  +----------------------------+  |
 0x104FDC   |  | Local data                 | /
            |  +----------------------------+
 0x104FD8   |  | Parameter 1 for routine 3  | \
            |  +----------------------------+  |
 0x104FD4   |  | Return address, routine 2  |  |
            |  +----------------------------+   >  Stack frame 3
 0x104FD0   +--| EBP value for routine 2    |  |
               +----------------------------+  |
 0x104FCC   +->| Local data                 | /  <-- Routine 3's EBP
            |  +----------------------------+
 0x104FC8   |  | Return address, routine 3  | \
            |  +----------------------------+  |
 0x104FC4   +--| EBP value for routine 3    |  |
               +----------------------------+   >  Stack frame 4
 0x104FC0      | Local data                 |  | <-- Current EBP
               +----------------------------+  |
 0x104FBC      | Local data                 | /
 0x104FB8      |                            |    <-- Current ESP

The CDECL calling convention is described here:

Caller's responsibilities
  • Push parameters in reverse order (last parameter pushed first)
  • Perform the call
  • Pop the parameters, use them, or simply increment ESP to remove them (stack clearing)
  • The return value is stored in EAX
Callee's responsibilities (callee is the routine being called)
  • Store caller's EBP on the stack
  • Save current ESP in EBP
  • Code, storing local data on the stack
  • For a fast exit load the old ESP from EBP, else pop local data elements
  • Pop the old EBP and return – store return value in EAX

It looks like this in assembly (NASM):


    ; ...
    ; Caller responsibilities:
    PUSH  3         ; push the parameters in reverse order
    PUSH  2
    CALL  callee    ; perform the call
    ADD   ESP, 8    ; stack cleaning (remove the 2 words)
    ; ... Use the return value in EAX ...
    ; Callee responsibilities:
    PUSH  EBP       ; store caller's EBP
    MOV   EBP, ESP  ; save current stack pointer in EBP
    ; ... Code, store return value in EAX ...
    ; Callee responsibilities:
    MOV   ESP, EBP  ; remove an unknown number of local data elements
    POP   EBP       ; restore caller's EBP
    RET             ; return

The GCC compiler does all this automatically, but if you have to call C/C++ methods from assembly or reverse, you have to know the convention. Now look at one stack frame (the callee's):

| Parameter 3             |
| Parameter 2             |
| Parameter 1             |
| Caller's return address |
| Caller's EBP value      |
| Local variable          | <-- Current EBP
| Local variable          |
| Local variable          |
| Temporary data          |
| Temporary data          |
|                         | <-- Current ESP

Using EBP the callee can access both parameters and local variables:

MOV  EAX, [[EBP + 12]]  ; Load parameter 1 into EAX

MOV  EAX, [[EBP + 16]]  ; Load parameter 2

MOV  EAX, [[EBP + 4 * EBX + 12]]  ; Load parameter EBX (0-indexed)

MOV  EAX, [[EBP]]       ; Load local variable 1

MOV  EAX, [[EBP - 4]]   ; Load local variable 2

There are other calling conventions for X86. To name a few: Pascal calling convention, fastcall convention, stdcall. Read more on Wikipedia, see the links below.

Set up the stack

When creating a kernel you have to manually set up the stack. It can be done by the bootloader, but it is good to know how anyway.

If you go from real mode to protected mode, you have to set up the stack as well. This is because the SS segment might change, so ESP in protected mode does not point to the same location that SP in real mode did. If you switch a lot between real mode and protected mode, it can be nice for them to share the stack. You have to find a smart solution for that yourself. It can be done.

In protected mode you set up the stack by just moving a new pointer value into the ESP register:

MOV  ESP, 0x105000  ; Set the stack pointer

Remember that it grows downwards. You might allocate space for it in the kernel's .BSS section if it contains one:


    MOV  ESP, stack_end  ; Set the stack pointer


    RESB 4096  ; Reserve 4 KiB stack space

If your kernel is booted by a Multiboot-compliant bootloader, like GRUB, you are provided a memory map. You can set up the stack by looking for free memory chunks of the appropriate size. You just have to ensure that you don't overwrite any important data or code when setting the stack pointer.


The stack is easy to use, but it has one problem. There is no "end," so it is vulnerable to a variation of the buffer overflow attack. The attacker pushes more elements than the stack is able to contain, so elements are pushed outside the stack memory, overwriting code, which the attacker can then execute.

In X86 protected mode it can be solved by allocating a GDT descriptor solely for the stack, which defines its boundaries.

Stack trace

When debugging, a stack trace is often shown and can be helpful. Stack Trace describes how this can be done and provides sample code for X86 CDECL using the stack layout above.

Unwinding the stack

Unwinding the stack is complex. It is done when using exceptions, like in C++. It is performed when an exception is thrown. The purpose of unwinding the stack is to call the destructor of local objects of the stack frames and to remove stack frames until an appropriate landing pad is found. The landing pad is the try..catch block in C++ or Java. The catch block has to match the exception, i.e. a RuntimeException object can't be caught as a String object.

The unwinding algorithm depends on the architecture. Normally this algorithm is provided in the language runtime library. When using GCC and C++ it is defined in the libsupc++ library linked with your application. However this doesn't happen when creating a kernel. The libsupc++ library is also too bloated to use in kernel space.

See Also


Stack Trace - Trace the called functions from the stack


External Links