Inline Assembly/Examples

From OSDev Wiki

What follows is a collection of inline assembly functions so common that they should be useful to most OS developers using GCC. Other compilers may have intrinsic alternatives (see references). These functions are implemented using GNU extensions to the C language, so particular keywords may cause trouble if you disable GNU extensions (for example, by compiling with -std=c99 instead of -std=gnu99). You can still reach a disabled keyword such as asm through its alternate spelling in the reserved namespace, such as __asm__. Take care to get inline assembly exactly right: the compiler does not understand the assembly it emits and relies entirely on the constraints and clobbers you declare, so lying to the compiler can cause rare and nasty bugs.
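
As a minimal sketch of the alternate keywords, the following compiler barrier is written with __asm__ and __volatile__ so that it should still compile with GNU extensions disabled. The names compiler_barrier and set_then_read are illustrative only and are not part of any standard API; the empty template emits no instruction.

```c
/* Hypothetical helper: stops the compiler from reordering or caching
 * memory accesses across this point. Emits no machine instruction. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__ ( "" : : : "memory" );
}

/* Hypothetical usage: the store to *flag cannot be moved below the
 * barrier, and the reload cannot be hoisted above it. */
static inline int set_then_read(int *flag)
{
    *flag = 1;
    compiler_barrier();
    return *flag;
}
```

The "memory" clobber is what tells the compiler that the statement may read or write arbitrary memory, which is the same mechanism several functions on this page rely on.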


Memory access


Read an 8/16/32-bit value at a given memory location through a segment other than the default C data segment. Unfortunately there is no constraint for manipulating segment registers directly, so you have to issue the mov <reg>, <segmentreg> manually.

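static inline uint32_t farpeekl(uint16_t sel, void* off)
{
    uint32_t ret;
    /* sel is passed in a register: there is no mov-immediate-to-segment-register
     * encoding, and a stack-relative memory operand would be off after the push. */
    asm ( "push %%fs\n\t"
          "mov  %1, %%fs\n\t"
          "mov  %%fs:(%2), %0\n\t"
          "pop  %%fs"
          : "=r"(ret) : "r"(sel), "r"(off) );
    return ret;
}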


Write an 8/16/32-bit value to a segment:offset address. Note that, much like farpeek, this version of farpoke saves and restores the segment register used for the access.

static inline void farpokeb(uint16_t sel, void* off, uint8_t v)
{
    asm ( "push %%fs\n\t"
          "mov  %0, %%fs\n\t"
          "movb %2, %%fs:(%1)\n\t"
          "pop %%fs"
          : : "r"(sel), "r"(off), "r"(v) : "memory" );
    /* The asm writes memory that is not listed as an output, so "memory" is in the clobber list. */
}

I/O access


Sends an 8/16/32-bit value to an I/O location. Traditional names are outb, outw and outl respectively. The a constraint forces val into the eax register (al for a byte) before the asm statement is issued, and Nd lets a constant port number that fits in one byte be assembled as an immediate, freeing the edx register for other uses.

static inline void outb(uint16_t port, uint8_t val)
{
    asm volatile ( "outb %0, %1" : : "a"(val), "Nd"(port) );
    /* There's an outb %al, $imm8 encoding for compile-time constant port numbers that fit in 8 bits (N constraint).
     * Wider immediate constants would be truncated at assemble-time (e.g. "i" constraint).
     * The outb %al, %dx encoding is the only option for all other cases.
     * %1 expands to %dx because port is a uint16_t. %w1 could be used if the port number had a wider C type. */
}


Receives an 8/16/32-bit value from an I/O location. Traditional names are inb, inw and inl respectively.

static inline uint8_t inb(uint16_t port)
{
    uint8_t ret;
    asm volatile ( "inb %1, %0"
                   : "=a"(ret)
                   : "Nd"(port) );
    return ret;
}
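
The 16-bit and 32-bit variants named above can be sketched by changing only the value type, assuming the same constraints as outb and inb carry over: the a constraint then selects %ax or %eax instead of %al, while the port stays a uint16_t. This is an illustration rather than a tested driver interface, and it uses the __asm__ spelling so it does not depend on GNU extensions being enabled.

```c
#include <stdint.h>

/* 16-bit write: "a"(val) expands to %ax because val is a uint16_t. */
static inline void outw(uint16_t port, uint16_t val)
{
    __asm__ volatile ( "outw %0, %1" : : "a"(val), "Nd"(port) );
}

/* 32-bit write: "a"(val) expands to %eax because val is a uint32_t. */
static inline void outl(uint16_t port, uint32_t val)
{
    __asm__ volatile ( "outl %0, %1" : : "a"(val), "Nd"(port) );
}

/* 16-bit read into %ax. */
static inline uint16_t inw(uint16_t port)
{
    uint16_t ret;
    __asm__ volatile ( "inw %1, %0" : "=a"(ret) : "Nd"(port) );
    return ret;
}

/* 32-bit read into %eax. */
static inline uint32_t inl(uint16_t port)
{
    uint32_t ret;
    __asm__ volatile ( "inl %1, %0" : "=a"(ret) : "Nd"(port) );
    return ret;
}
```

These instructions are privileged (or gated by the I/O permission level), so the functions are only meant to be called from kernel code.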


Forces the CPU to wait for an I/O operation to complete. Only use this when there is nothing like a status register or an IRQ to tell you the data has been received.

static inline void io_wait(void)
{
    /* TODO: This is probably fragile. */
    asm volatile ( "jmp 1f\n\t"
                   "1:jmp 2f\n\t"
                   "2:" );
}

Alternatively, you may use another I/O cycle on an 'unused' port (which has the nice property of being CPU-speed independent):

static inline void io_wait(void)
{
    /* Port 0x80 is used for 'checkpoints' during POST. */
    /* The Linux kernel seems to think it is free for use :-/ */
    asm volatile ( "outb %%al, $0x80" : : "a"(0) );
    /* %%al instead of %0 makes no difference.  TODO: does the register need to be zeroed? */
}

Interrupt-related functions


Returns true if interrupts (IRQs) are enabled on the CPU.

static inline bool are_interrupts_enabled(void)
{
    unsigned long flags;
    asm volatile ( "pushf\n\t"
                   "pop %0"
                   : "=g"(flags) );
    return flags & (1 << 9);  /* bit 9 of (E)FLAGS is IF, the interrupt flag */
}


Load a new interrupt descriptor table.

static inline void lidt(void* base, uint16_t size)
{   // This function works in 32 and 64bit mode
    struct {
        uint16_t length;
        void*    base;
    } __attribute__((packed)) IDTR = { size, base };
    asm ( "lidt %0" : : "m"(IDTR) );  // let the compiler choose an addressing mode
}
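
The packed attribute matters because lidt reads a 2-byte limit immediately followed by the base address. The following sketch checks that layout and contrasts it with an unpacked struct; the names idtr_packed, idtr_padded and idtr_layout_ok are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the IDTR operand built in lidt(). */
struct idtr_packed {
    uint16_t length;
    void*    base;
} __attribute__((packed));

/* Hypothetical counter-example without the attribute: the compiler
 * pads after length so that base is naturally aligned. */
struct idtr_padded {
    uint16_t length;
    void*    base;
};

/* Returns 1 if the packed layout is a 2-byte limit immediately
 * followed by the base address, with no padding in between. */
static inline int idtr_layout_ok(void)
{
    return sizeof(struct idtr_packed) == sizeof(uint16_t) + sizeof(void*)
        && offsetof(struct idtr_packed, base) == sizeof(uint16_t);
}
```

In a kernel, the same condition could be turned into a compile-time check so that a change to the struct cannot silently break the descriptor.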

CPU-related functions


Request CPU identification. See CPUID for more information.

/* GCC has a <cpuid.h> header you should use instead of this. */
static inline void cpuid(int code, uint32_t* a, uint32_t* d)
{
    asm volatile ( "cpuid" : "=a"(*a), "=d"(*d) : "0"(code) : "ebx", "ecx" );
}
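
As a sketch of the <cpuid.h> route recommended in the comment above, the following reads the vendor string from leaf 0 with the header's __get_cpuid helper. The function name cpu_vendor is an assumption made for this example, not something the header provides.

```c
#include <cpuid.h>
#include <string.h>

/* Fills out[13] with the NUL-terminated vendor string.
 * Returns 1 on success, 0 if CPUID leaf 0 is unavailable. */
static inline int cpu_vendor(char out[13])
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;
    /* The vendor string is stored in EBX, EDX, ECX, in that order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
}
```

Unlike the hand-written version, the header returns all four registers, so nothing has to be listed as a clobber.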


Read the current value of the CPU's time-stamp counter, which the instruction returns in EDX:EAX. The time-stamp counter contains the number of clock ticks that have elapsed since the last CPU reset. The value is stored in a 64-bit MSR and is incremented after each clock cycle.

static inline uint64_t rdtsc(void)
{
    uint64_t ret;
    asm volatile ( "rdtsc" : "=A"(ret) );
    return ret;
}

This can be used to find out how much time certain functions take, which is very useful for testing, benchmarking, etc. Note: this is only an approximation.

On x86_64, the "A" constraint no longer means the "edx:eax" pair: for a 64-bit value it means either "rax" or "rdx". GCC may therefore take the result from only one register and ignore the other half. You instead need to combine the two halves manually with bit shifting:

uint64_t rdtsc(void)
{
    uint32_t low, high;
    asm volatile("rdtsc":"=a"(low),"=d"(high));
    return ((uint64_t)high << 32) | low;
}
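
To illustrate the benchmarking use mentioned above, here is a rough sketch that measures the ticks elapsed around a call. The names rdtsc_now, ticks_for and spin are hypothetical, and the result is only an approximation: the sketch does not serialize the instruction stream or pin the thread to one core.

```c
#include <stdint.h>

/* Same split-register read as above; usable on both i386 and x86_64. */
static inline uint64_t rdtsc_now(void)
{
    uint32_t low, high;
    __asm__ volatile ( "rdtsc" : "=a"(low), "=d"(high) );
    return ((uint64_t)high << 32) | low;
}

/* Hypothetical helper: returns the ticks elapsed while fn(arg) ran. */
static inline uint64_t ticks_for(void (*fn)(void*), void* arg)
{
    uint64_t start = rdtsc_now();
    fn(arg);
    return rdtsc_now() - start;
}

/* Hypothetical workload used for illustration: 1000 volatile increments. */
static void spin(void* arg)
{
    volatile unsigned* counter = arg;
    for (unsigned i = 0; i < 1000; i++)
        (*counter)++;
}
```

A caller would write something like ticks_for(spin, &n) and compare the result across runs rather than trust a single measurement.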

Read GCC Inline Assembly Machine Constraints for more details.


Read the value in a control register.

static inline unsigned long read_cr0(void)
{
    unsigned long val;
    asm volatile ( "mov %%cr0, %0" : "=r"(val) );
    return val;
}


Invalidates the TLB (Translation Lookaside Buffer) entry for one specific virtual address. The next memory reference to that page will be forced to re-read the PDE and PTE from main memory. It must be issued every time you update one of those tables. The m pointer holds a logical address, not a physical or virtual one: an offset in your ds segment.

static inline void invlpg(void* m)
{
    /* Clobber memory to avoid optimizer re-ordering access before invlpg, which may cause nasty bugs. */
    asm volatile ( "invlpg (%0)" : : "b"(m) : "memory" );
}
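
Since invlpg covers a single page, updating several consecutive page table entries means issuing it once per page. The following is a sketch under the assumption of 4 KiB pages; TLB_PAGE_SIZE, pages_spanned and invlpg_range are hypothetical names introduced for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define TLB_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

static inline void invlpg(void* m)
{
    __asm__ volatile ( "invlpg (%0)" : : "r"(m) : "memory" );
}

/* Number of pages touched by the byte range [start, start + len). */
static inline size_t pages_spanned(uintptr_t start, size_t len)
{
    if (len == 0)
        return 0;
    uintptr_t first = start & ~(uintptr_t)(TLB_PAGE_SIZE - 1);
    uintptr_t last  = (start + len - 1) & ~(uintptr_t)(TLB_PAGE_SIZE - 1);
    return (size_t)((last - first) / TLB_PAGE_SIZE) + 1;
}

/* Hypothetical helper: invalidate every page the range overlaps.
 * Only callable in ring 0, since invlpg is privileged. */
static inline void invlpg_range(void* start, size_t len)
{
    uintptr_t page = (uintptr_t)start & ~(uintptr_t)(TLB_PAGE_SIZE - 1);
    for (size_t n = pages_spanned((uintptr_t)start, len); n > 0; n--) {
        invlpg((void*)page);
        page += TLB_PAGE_SIZE;
    }
}
```

For large ranges it may be cheaper to flush the whole TLB instead; where that threshold lies depends on the CPU and is not covered here.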


Write a 64-bit value to an MSR. The A constraint stands for the concatenation of registers EDX and EAX (EDX:EAX).

static inline void wrmsr(uint32_t msr_id, uint64_t msr_value)
{
    asm volatile ( "wrmsr" : : "c" (msr_id), "A" (msr_value) );
}

On x86_64, this needs to be:

static inline void wrmsr(uint64_t msr, uint64_t value)
{
	uint32_t low = value & 0xFFFFFFFF;
	uint32_t high = value >> 32;
	asm volatile (
		"wrmsr"
		:
		: "c"(msr), "a"(low), "d"(high)
	);
}


Read a 64-bit value from an MSR. The A constraint stands for the concatenation of registers EDX and EAX (EDX:EAX).

static inline uint64_t rdmsr(uint32_t msr_id)
{
    uint64_t msr_value;
    asm volatile ( "rdmsr" : "=A" (msr_value) : "c" (msr_id) );
    return msr_value;
}

On x86_64, this needs to be:

static inline uint64_t rdmsr(uint64_t msr)
{
	uint32_t low, high;
	asm volatile (
		"rdmsr"
		: "=a"(low), "=d"(high)
		: "c"(msr)
	);
	return ((uint64_t)high << 32) | low;
}
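
The splitting and joining used by the x86_64 versions can be looked at in isolation as plain C. In this sketch the helper names msr_low, msr_high and msr_join are hypothetical; the point is the cast to uint64_t before the shift, without which a 32-bit value would be shifted by its full width.

```c
#include <stdint.h>

/* Low half goes to EAX, as wrmsr expects. */
static inline uint32_t msr_low(uint64_t value)
{
    return (uint32_t)(value & 0xFFFFFFFFu);
}

/* High half goes to EDX, as wrmsr expects. */
static inline uint32_t msr_high(uint64_t value)
{
    return (uint32_t)(value >> 32);
}

/* Rebuilds the 64-bit value from EDX:EAX, as after rdmsr.
 * The cast widens high before shifting it by 32 bits. */
static inline uint64_t msr_join(uint32_t low, uint32_t high)
{
    return ((uint64_t)high << 32) | low;
}
```

Because these helpers contain no assembly, they can be unit-tested on a host machine, which the privileged rdmsr and wrmsr instructions themselves cannot.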