User:Joeeagar/ElfLoading

From OSDev Wiki

The Executable and Linkable Format, or ELF, is the executable format used by most Unix-like systems today. It's a very versatile format (too much so, but more on that later).

Basic Information

Types of ELF Executables

There are many ways to link ELF executables:

  1. Absolute binaries. These are linked to run at a fixed, absolute base address. This is the most common way to build Unix binaries, but it only works on systems that support virtual memory.
  2. PIC, or Position Independent Code. These binaries use relative addressing to avoid relying on a specific base memory address. Library calls are implemented as jumps into what's called the Procedure Linkage Table (PLT), which stores the addresses of external library functions (themselves usually implemented as PIC code).
  3. DSO, or Dynamic Shared Object. This is basically ELF's version of DLLs. Rather than using relative addressing to jump to library calls stored in a PLT, library calls are patched on startup. This avoids the indirect jump, but at the cost of slower (in some cases much slower) start-up time.
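A loader can tell a fixed-address binary from a relocatable one by looking at the e_type field of the ELF header (the header layout is shown in the next section). The sketch below uses the e_type values from the ELF specification; the helper names elf_type_name and elf_can_relocate are made up for illustration. Note that position-independent executables and shared libraries both show up as ET_DYN.

```c
#include <stdint.h>

/* e_type values defined by the ELF specification. */
enum { ET_NONE = 0, ET_REL = 1, ET_EXEC = 2, ET_DYN = 3, ET_CORE = 4 };

/* Hypothetical helper: a human-readable name for an e_type value. */
static const char *elf_type_name(uint16_t e_type) {
    switch (e_type) {
    case ET_REL:  return "relocatable object";
    case ET_EXEC: return "absolute executable";
    case ET_DYN:  return "shared object or position-independent executable";
    case ET_CORE: return "core dump";
    default:      return "unknown";
    }
}

/* Hypothetical helper: nonzero if the loader is free to choose the
   base address, zero if the image must go where it was linked. */
static int elf_can_relocate(uint16_t e_type) {
    return e_type == ET_DYN;
}
```

A loader would typically refuse ET_REL and ET_CORE files outright, since neither is meant to be executed directly.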

ELF Binary Interpreters

As I said at the beginning, ELF is a very versatile format. In fact it's so versatile that there are many ways to do the same thing, and informal standards have sprung up as to which subset a system uses. Modern Unixes use something called ELF binary interpreters, typically (always?) tied to the C library. ELF interpreters are special programs that handle dynamic linking of executables (e.g. /lib/ld-linux.so.2 on 32-bit x86 Linux). The (absolute) path to the interpreter is hardwired into executables at build time, in a program header of type PT_INTERP.
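As a rough sketch of how a loader might find out whether a binary asks for an interpreter, the function below scans the program headers of a 32-bit ELF image for a PT_INTERP entry and returns the path it points to. The names rd16, rd32 and elf32_find_interp are made up for this example, the byte offsets are those of the 32-bit header layout, and the code assumes the host has the same byte order as the file. It doesn't check the ELF magic.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PT_INTERP = 3 };

/* Read unaligned fields via memcpy (host byte order assumed). */
static uint16_t rd16(const unsigned char *p) { uint16_t v; memcpy(&v, p, 2); return v; }
static uint32_t rd32(const unsigned char *p) { uint32_t v; memcpy(&v, p, 4); return v; }

/* Hypothetical helper: returns the interpreter path stored in a 32-bit
   ELF image, or NULL if there is none or the headers are malformed. */
static const char *elf32_find_interp(const unsigned char *image, size_t size) {
    if (size < 52)                       /* 52 = size of the 32-bit ELF header */
        return NULL;

    uint32_t phoff     = rd32(image + 28);   /* e_phoff */
    uint16_t phentsize = rd16(image + 42);   /* e_phentsize */
    uint16_t phnum     = rd16(image + 44);   /* e_phnum */

    for (uint16_t i = 0; i < phnum; i++) {
        size_t at = (size_t)phoff + (size_t)phentsize * i;
        if (at > size || size - at < 32)     /* header lies outside the file */
            return NULL;

        const unsigned char *ph = image + at;
        if (rd32(ph) != PT_INTERP)           /* p_type */
            continue;

        uint32_t off = rd32(ph + 4);         /* p_offset */
        uint32_t len = rd32(ph + 16);        /* p_filesz */
        if (len == 0 || off > size || size - off < len)
            return NULL;
        if (image[off + len - 1] != '\0')    /* path must be NUL-terminated */
            return NULL;
        return (const char *)(image + off);
    }
    return NULL;
}
```

A NULL result on a well-formed file means the binary is statically linked, which is the case the rest of this page deals with.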

Basic Program Loading

For now, let's assume our programs are statically linked and don't use an ELF interpreter. To load one we need to read the ELF header and the program headers. The structures below use the 32-bit layout; field sizes (and, for program headers, field order) differ in 64-bit ELF files:

#include <stdint.h>

#define EI_NIDENT 16 //size of the e_ident array

typedef struct ElfHeader {
  unsigned char e_ident[EI_NIDENT]; //should start with [0x7f 'E' 'L' 'F']
  uint16_t e_type;
  uint16_t e_machine;
  uint32_t e_version;
  uint32_t e_entry; //virtual address of the entry point
  uint32_t e_phoff; //start of program headers in file
  uint32_t e_shoff;
  uint32_t e_flags;
  uint16_t e_ehsize;
  uint16_t e_phentsize; //size of each program header
  uint16_t e_phnum; //number of program headers
  uint16_t e_shentsize;
  uint16_t e_shnum;
  uint16_t e_shstrndx;
} ElfHeader;

typedef struct ElfProgramHeader {
  uint32_t p_type;
  uint32_t p_offset; //offset of data in elf image
  uint32_t p_vaddr; //virtual load address
  uint32_t p_paddr; //physical load address, not used
  uint32_t p_filesz; //size of data in elf image
  uint32_t p_memsz; //size of data in memory; any excess over disk size is zero'd
  uint32_t p_flags;
  uint32_t p_align; //alignment
} ElfProgramHeader;


First read the main header. Check that e_ident starts with [0x7f, 'E', 'L', 'F'], i.e. the bytes [0x7f, 0x45, 0x4c, 0x46]. Then read the program headers, which are stored in a flat array starting at file offset e_phoff. There are e_phnum headers, each e_phentsize bytes long.
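Since the file may be truncated or corrupt, it's worth checking that the program header table actually fits inside the image before walking it. Below is a minimal sketch of such a check; the function name elf32_check_header and the struct name Elf32HeaderSketch are invented for this example, and the struct simply repeats the 32-bit ElfHeader layout from above so the snippet compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as ElfHeader above, repeated to keep this self-contained. */
typedef struct {
    unsigned char e_ident[16];
    uint16_t e_type, e_machine;
    uint32_t e_version, e_entry, e_phoff, e_shoff, e_flags;
    uint16_t e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx;
} Elf32HeaderSketch;

/* Hypothetical helper: returns 1 if the magic matches and the whole
   program header table lies inside the file, 0 otherwise. */
static int elf32_check_header(const unsigned char *image, size_t size) {
    Elf32HeaderSketch h;

    if (size < sizeof h)
        return 0;
    memcpy(&h, image, sizeof h);

    /* The literal is split so 'E' isn't read as part of the hex escape. */
    if (memcmp(h.e_ident, "\x7f" "ELF", 4) != 0)
        return 0;

    /* Each entry must be at least as large as a 32-bit program header. */
    if (h.e_phentsize < 32)
        return 0;

    /* Do the arithmetic in 64 bits so it cannot wrap around. */
    uint64_t table_end = (uint64_t)h.e_phoff +
                         (uint64_t)h.e_phentsize * h.e_phnum;
    return table_end <= size;
}
```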

The first thing we need to do is calculate the base address. This is rather annoying, since ELF doesn't store it directly. Rather, the base address is the smallest p_vaddr found among the loadable (PT_LOAD) program headers. Other header types must be skipped, since some of them (PT_GNU_STACK in GNU binutils output, for example) have a p_vaddr of zero. Note that the specification doesn't require p_vaddr itself to be a multiple of p_align, only that p_vaddr and p_offset be equal modulo p_align, which is why binaries generated via GNU binutils often have segments whose p_vaddr isn't aligned. A loader that maps whole pages will typically round the lowest p_vaddr down to a page boundary.
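The alignment rule is easy to get wrong, so here is a small sketch of it. The helper names is_pow2, segment_alignment_ok and align_down are made up for this example; the rule itself (p_align of 0 or 1 means no constraint, otherwise p_align is a power of two and p_vaddr equals p_offset modulo p_align) is taken from the ELF specification.

```c
#include <stdint.h>

static int is_pow2(uint32_t x) {
    return x != 0 && (x & (x - 1)) == 0;
}

/* Hypothetical helper: nonzero if a program header satisfies the
   specification's alignment rule. */
static int segment_alignment_ok(uint32_t p_vaddr, uint32_t p_offset,
                                uint32_t p_align) {
    if (p_align <= 1)
        return 1;                 /* 0 and 1 both mean "no alignment" */
    if (!is_pow2(p_align))
        return 0;
    /* Congruent modulo a power of two: the low bits of the difference are zero. */
    return ((p_vaddr - p_offset) & (p_align - 1)) == 0;
}

/* Hypothetical helper: round an address down to a multiple of align,
   which must be a power of two (e.g. the page size). */
static uint32_t align_down(uint32_t addr, uint32_t align) {
    return addr & ~(align - 1);
}
```

For instance, a segment with p_vaddr 0x08049f08 and p_offset 0xf08 passes the check with p_align 0x1000 even though p_vaddr itself isn't page aligned, and align_down(0x08049f08, 0x1000) gives the page it starts in.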

Here's an example of loading the main header and getting the base address (note that normally one would load the program headers into an array or linked list, which I've omitted to keep things simple):

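#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MIN2(a, b) ((a) < (b) ? (a) : (b))

int ELF_loadBinary(unsigned char *elfimage) {
    ElfHeader header;
    ElfProgramHeader ph;
    int i;
    uintptr_t base = UINTPTR_MAX;

    memcpy(&header, elfimage, sizeof(header));

    /*e_ident is not a NUL-terminated string, so compare exactly four
      bytes with memcmp (which, like strcmp, returns zero on a match).
      The literal is split so that 'E' isn't parsed as a hex digit.*/
    if (memcmp(header.e_ident, "\x7f" "ELF", 4) != 0) {
       printf("error!\n");
       return -1;
    }

    /*note that we copy each program header into ph
      to avoid any alignment errors
    */

    for (i=0; i<header.e_phnum; i++) {
        memcpy(&ph, elfimage + header.e_phoff + header.e_phentsize*i, sizeof(ph));

        /*only loadable segments count towards the base address
          (PT_LOAD comes from the segment type enumeration further down)*/
        if (ph.p_type != PT_LOAD)
            continue;

        base = MIN2(base, ph.p_vaddr);
    }

    return 0;
}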

Now that we have the base address, we need to figure out how much memory to allocate for the program image. To do this, for each loadable program header we subtract the base address from p_vaddr and add p_memsz, then take the maximum of those values:

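    #define MAX2(a, b) ((a) > (b) ? (a) : (b))

    uintptr_t size=0;

    for (i=0; i<header.e_phnum; i++) {
        uintptr_t segment_end;

        memcpy(&ph, elfimage + header.e_phoff + header.e_phentsize*i, sizeof(ph));

        /*as before, only loadable segments take up space in the image*/
        if (ph.p_type != PT_LOAD)
            continue;

        segment_end = ph.p_vaddr - base + ph.p_memsz;
        size = MAX2(size, segment_end);
    }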
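If the program headers have already been copied into an array, the base address and the image size can be worked out in a single pass by tracking the lowest start and the highest end of the loadable segments. The following is only a sketch of that idea; Phdr32 repeats the 32-bit program header layout so the snippet stands alone, and elf32_image_span is a name invented here.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr,
             p_filesz, p_memsz, p_flags, p_align;
} Phdr32;

/* Hypothetical helper: computes the base address and total size of the
   memory image described by n program headers. Returns 0 on success,
   -1 if there is no loadable segment or a segment wraps around the
   end of the address space. */
static int elf32_image_span(const Phdr32 *ph, size_t n,
                            uint32_t *base, uint32_t *size) {
    uint32_t lo = UINT32_MAX, hi = 0;
    int found = 0;

    for (size_t i = 0; i < n; i++) {
        if (ph[i].p_type != 1)            /* 1 == PT_LOAD */
            continue;

        uint32_t end = ph[i].p_vaddr + ph[i].p_memsz;
        if (end < ph[i].p_vaddr)          /* unsigned overflow */
            return -1;

        if (ph[i].p_vaddr < lo) lo = ph[i].p_vaddr;
        if (end > hi)           hi = end;
        found = 1;
    }

    if (!found)
        return -1;

    *base = lo;
    *size = hi - lo;
    return 0;
}
```

With two loadable segments at 0x08048000 (0x500 bytes) and 0x08049000 (0x300 bytes), this yields a base of 0x08048000 and a size of 0x1300; the gap between the segments is part of the image.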

Next we allocate the program image, and then load the program segments. There are different types of segments, so we'll need an enumeration of what those types are:

enum {
  PT_NULL,
  PT_LOAD,
  PT_DYNAMIC,
  PT_INTERP,
  PT_NOTE,
  PT_SHLIB,
  PT_PHDR,
  PT_LOPROC=0x70000000, //reserved
  PT_HIPROC=0x7FFFFFFF  //reserved
};

If you compiled an absolute binary (e.g. you didn't pass -fPIC to the compiler), you'll have to place the binary at the calculated base address. Otherwise you can put it wherever you want.

After allocating the program image, we need to load the program segments. In this simple example, our headers should each have a type (p_type) of PT_LOAD or PT_PHDR, and only the PT_LOAD ones are copied. For each PT_LOAD header, calculate where its segment goes: if using PIC code, subtract the base address from p_vaddr and then add the address of the allocated image; if you went with absolute binaries, use p_vaddr directly.

Next, zero the segment data, using the value in p_memsz. Technically you only have to zero the difference between p_memsz and p_filesz, but zeroing the whole thing is fine too. After zeroing, copy the segment data from the file image to the allocated image. It's located at p_offset, and has size p_filesz.

Here's some code:

    /*pick one of the two, depending on how the binary was linked*/
    unsigned char *finalimage = kmalloc(size); //for PIC
    /*unsigned char *finalimage = (unsigned char*) base;*/ //for absolute

    for (i=0; i<header.e_phnum; i++) {
        unsigned char *dest;

        memcpy(&ph, elfimage + header.e_phoff + header.e_phentsize*i, sizeof(ph));

        if (ph.p_type != PT_LOAD)
            continue;

        /*offset of the segment inside the image; this gives the right
          address in both cases, since an absolute image starts at base*/
        dest = finalimage + (ph.p_vaddr - base);

        memset(dest, 0, ph.p_memsz);
        memcpy(dest, elfimage + ph.p_offset, ph.p_filesz);
    }
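The loop above trusts the file completely. A more defensive version of the per-segment step might look like the sketch below, which checks that the source range lies inside the file and the destination range inside the allocated image, and only zeroes the part of the segment that isn't backed by file data. The names Phdr32 and load_segment are made up for this example, and Phdr32 just repeats the 32-bit program header layout.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t p_type, p_offset, p_vaddr, p_paddr,
             p_filesz, p_memsz, p_flags, p_align;
} Phdr32;

/* Hypothetical helper: copies one loadable segment into image, a buffer
   of image_size bytes whose first byte corresponds to address base.
   Returns 0 on success, -1 if the header doesn't fit the file or image. */
static int load_segment(unsigned char *image, size_t image_size, uint32_t base,
                        const unsigned char *file, size_t file_size,
                        const Phdr32 *ph) {
    /* The file can't provide more bytes than the segment occupies. */
    if (ph->p_filesz > ph->p_memsz)
        return -1;

    /* Source range must lie inside the file. */
    if (ph->p_offset > file_size || file_size - ph->p_offset < ph->p_filesz)
        return -1;

    /* Destination range must lie inside the image. */
    if (ph->p_vaddr < base)
        return -1;
    size_t start = ph->p_vaddr - base;
    if (start > image_size || image_size - start < ph->p_memsz)
        return -1;

    /* Copy the file-backed part, then zero the rest (typically .bss). */
    memcpy(image + start, file + ph->p_offset, ph->p_filesz);
    memset(image + start + ph->p_filesz, 0, ph->p_memsz - ph->p_filesz);
    return 0;
}
```

The caller is still expected to skip headers whose p_type isn't PT_LOAD before calling this.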

Now we're ready to start the binary! All we need to do is set up a stack, and jump to the entry point (which is stored in the e_entry field in the main header).

Well, not quite. You might think the entry point would be the address of the main function, but this isn't the case. It's a special function (usually called _start) that's provided by the libc. This function needs a bunch of parameters to start the program with (e.g. the stdin/stdout/stderr file descriptors, argv, the program's environment variables, etc). You'll need to read up on how your libc expects this data to be passed, either through the stack or through a special system call. For more information, see the following links:

[1] [2] [3]
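As one concrete example of passing this data through the stack, the System V ABI convention used on Linux has the stack pointer at entry point to argc, followed by the argv pointers, a NULL, the environment pointers, another NULL, and finally the auxiliary vector. The sketch below lays out those words in a plain array; build_initial_stack is a made-up name, the auxiliary vector contains nothing but its AT_NULL terminator, and your own OS is free to use a different convention as long as its libc agrees.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper. Fills words[] with the initial stack contents,
   lowest address first:
     argc, argv[0..argc-1], NULL, envp[0..], NULL, AT_NULL pair
   Returns the number of words written, or 0 if cap is too small.
   In a real loader the strings must already live in the new process's
   address space, and the stack pointer would point at words[0]. */
static size_t build_initial_stack(uintptr_t *words, size_t cap, int argc,
                                  char *const argv[], char *const envp[]) {
    size_t envc = 0, n = 0, i;

    while (envp[envc] != NULL)
        envc++;

    /* argc + argv pointers + NULL + envp pointers + NULL + AT_NULL pair */
    size_t need = 1 + (size_t)argc + 1 + envc + 1 + 2;
    if (need > cap)
        return 0;

    words[n++] = (uintptr_t)argc;
    for (i = 0; i < (size_t)argc; i++)
        words[n++] = (uintptr_t)argv[i];
    words[n++] = 0;                      /* end of argv */
    for (i = 0; i < envc; i++)
        words[n++] = (uintptr_t)envp[i];
    words[n++] = 0;                      /* end of envp */
    words[n++] = 0;                      /* auxv entry type: AT_NULL */
    words[n++] = 0;                      /* auxv entry value (unused) */
    return n;
}
```

For a program started as "prog -v" with a single environment variable this produces eight words. A libc written for Linux will usually expect further auxiliary vector entries, so treat this as the bare minimum.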
