Dynamic Linker

From OSDev Wiki

Sooner or later you'll reach the point where you want shared libraries. We won't discuss the difference between static and dynamic linking here; you should already be familiar with that.

This article is about ELF on the x86_64 architecture, but it can easily be adapted to other systems as the concepts are similar.

Homework

Memory Layout

I suppose you have created a new, empty address space and have already loaded the executable into it. I also assume that you've loaded the libc shared library after the executable's segments. And now you're stuck, because you don't know how to call printf from the executable.

As you loaded the executable yourself, you should know the virtual address of the ELF magic bytes in memory. Start from there.

Elf64_Ehdr *ehdr = (Elf64_Ehdr *)(ptr);

if(!kmemcmp(ehdr->e_ident, ELFMAG,SELFMAG) &&
   ehdr->e_ident[EI_CLASS] == ELFCLASS64 &&
   ehdr->e_ident[EI_DATA] == ELFDATA2LSB &&
   ehdr->e_type == ET_EXEC && ehdr->e_shnum > 0)
{
   // We have a valid image with sections
}

First we check the magic bytes and that the image is a little-endian ELF64. As an extra, we also check whether it's an executable and has a non-empty section table (we're going to need it).

Now let's see what the GNU toolchain does for us to find printf.

Segment Local Calls

In order to figure that out, first we should know how a segment local call works. For that we'll use a very minimal source.

void localfunction()
{
}

int main(int,char**)
{
    localfunction();
}

That compiles to:

$ objdump -d test
000000000020016f <localfunction>:
  20016f:	55                   	push   %rbp
  200170:	48 89 e5             	mov    %rsp,%rbp
  200173:	90                   	nop
  200174:	5d                   	pop    %rbp
  200175:	c3                   	retq   

0000000000200176 <main>:
  200176:	55                   	push   %rbp
  200177:	48 89 e5             	mov    %rsp,%rbp
  20017a:	b8 00 00 00 00       	mov    $0x0,%eax
  20017f:	e8 eb ff ff ff       	callq  20016f <localfunction>
  200184:	90                   	nop
  200185:	5d                   	pop    %rbp
  200186:	c3                   	retq

That's trivial: the call at 20017f uses RIP-relative addressing.
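To see where that target comes from, here is a small sketch that decodes a 5-byte "e8 rel32" call the way the CPU does: the signed 32-bit displacement is added to the address of the next instruction. The helper name call_target is made up for this illustration; it is not part of any toolchain.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: compute the destination of a near call
 * (opcode e8 followed by a little-endian signed 32-bit displacement). */
uint64_t call_target(uint64_t insn_addr, const uint8_t insn[5])
{
    assert(insn[0] == 0xe8);                 /* only "call rel32" is handled */

    /* assemble the displacement byte by byte (little endian) */
    uint32_t raw = (uint32_t)insn[1]
                 | (uint32_t)insn[2] << 8
                 | (uint32_t)insn[3] << 16
                 | (uint32_t)insn[4] << 24;
    int32_t rel = (int32_t)raw;              /* the displacement is signed */

    /* relative to the instruction that follows the 5-byte call */
    return insn_addr + 5 + (int64_t)rel;
}
```

For the call above, the bytes e8 eb ff ff ff at 20017f give 200184 - 0x15 = 20016f, which is exactly localfunction.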

Inter-segment Calls

Now let's modify the source a bit to use a libc call:

#include <stdio.h>

int main(int,char**)
{
    printf("Hello World");
}

Compile and see what's generated.

000000000020016f <main>:
  20016f:	55                   	push   %rbp
  200170:	48 89 e5             	mov    %rsp,%rbp
  200173:	48 8d 3d b6 01 00 00 	lea    0x1b6(%rip),%rdi        # 200330 <_DYNAMIC+0x110>
  20017a:	b8 00 00 00 00       	mov    $0x0,%eax
  20017f:	e8 8c 00 00 00       	callq  200210 <printf@plt>
  200184:	90                   	nop
  200185:	5d                   	pop    %rbp
  200186:	c3                   	retq   

0000000000200200 <printf@plt-0x10>:
  200200:	ff 35 4a 0e 00 00    	pushq  0xe4a(%rip)        # 201050 <_GLOBAL_OFFSET_TABLE_+0x8>
  200206:	ff 25 4c 0e 00 00    	jmpq   *0xe4c(%rip)        # 201058 <_GLOBAL_OFFSET_TABLE_+0x10>
  20020c:	0f 1f 40 00          	nopl   0x0(%rax)

0000000000200210 <printf@plt>:
  200210:	ff 25 4a 0e 00 00    	jmpq   *0xe4a(%rip)        # 201060 <_GLOBAL_OFFSET_TABLE_+0x18>
  200216:	68 00 00 00 00       	pushq  $0x0
  20021b:	e9 e0 ff ff ff       	jmpq   200200 <main+0x91>

What? Two more local functions? What happened here? The GNU toolchain uses lazy run-time linking, which means an address is not resolved until it's first referenced. To achieve that, it needs helper functions (generated into the .plt section in the text segment).

When the CPU executes this code, the instruction at 20017f first makes a normal local call. The function it calls, printf@plt, is one of the helpers, and its purpose is to load an address from GOT+0x18 and jump to it. By default, that value points to the next instruction (200216), which pushes 0 onto the stack and jumps to the other helper at 200200. That one hands control to the reference resolver, whose job is to replace the address in the GOT with a relocated address.

Because the resolver function is not known at link time, its address is also stored in the GOT, at GOT+0x10. What's more, it receives one argument, stored at GOT+0x8, which the helper pushes onto the stack. We can also spot that the instruction at 200206 is a jump and not a call, so the resolver never returns to this helper; instead it should jump to the relocated address.
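If the mechanism still feels abstract, it can be imitated in plain C with a function pointer. This is only a toy model under my own assumptions: got_slot stands in for the entry at GOT+0x18, lazy_stub for the push-and-jump path into the resolver, and real_function for printf. All of these names are invented, and the real helpers are of course assembly, not C.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

int resolver_runs = 0;                  /* counts how often the slow path ran */

/* stands in for the library function (printf in the article) */
int real_function(int x) { return 2 * x; }

int lazy_stub(int x);                   /* stands in for "pushq $0; jmpq 200200" */

/* One GOT entry. Initially it points back into the helper,
 * just like GOT+0x18 initially points at 200216. */
func_t got_slot = lazy_stub;

int lazy_stub(int x)
{
    resolver_runs++;
    got_slot = real_function;           /* the resolver patches the GOT entry... */
    return got_slot(x);                 /* ...and continues at the real target   */
}

/* stands in for printf@plt: always an indirect jump through the GOT entry */
int call_through_plt(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}
```

The first call through call_through_plt takes the slow path and patches the slot; every later call goes straight to real_function, which is the whole point of lazy binding.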

Implementing a dynamic linker

All that disassembly teaches us two things about the dynamic linker:

1. it has to locate and write the GOT
2. it has two parts: a load-time linker and a run-time resolver.

The first part runs before the thread is started and saves the second part's address and argument into the GOT. The second part, on the other hand, runs when the thread is already running and saves relocated addresses into the GOT. As you can see, both parts require the address of the GOT. To save resources, one should not locate the GOT twice: this is where the resolver's argument comes in.
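The GOT-writing half of the first part is tiny. Here is a minimal sketch, assuming the GOT is reachable by the loader at the same address the thread will later see it at, and that you already know where your resolver lives. The function name got_setup and its parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical load-time step: fill in the two reserved GOT entries.
 *   got           - where the GOT is mapped
 *   resolver_addr - entry point of the run-time resolver (the second part) */
void got_setup(uint64_t *got, uint64_t resolver_addr)
{
    assert(got != 0 && resolver_addr != 0);

    got[1] = (uint64_t)(uintptr_t)got;   /* GOT+0x8:  argument for the resolver */
    got[2] = resolver_addr;              /* GOT+0x10: address of the resolver   */

    /* got[3] and up (GOT+0x18...) are left alone: they still point back
     * into the helpers, so the first call of each function ends up in
     * the resolver. */
}
```

If your loader runs in a different address space than the thread (which is likely in a kernel), you'd write through the loader's mapping but store the thread's view of the GOT address in got[1].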

To proceed we'll have to locate the GOT in memory and figure out what entries it has.

Locating the GOT

Time to take a peek at what's in the object file.

$ readelf -a test
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [10] '.got.plt'        PROGBITS         0000000000201048  00001048
       0000000000000020  0000000000000008  WA       0     0     8

Symbol table .symtab contains 23 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
    15: 0000000000201048     0 OBJECT  LOCAL  DEFAULT   10 _GLOBAL_OFFSET_TABLE_

The symbol table shows that the GOT is at 201048. We can see the same value in the section headers at '.got.plt'. That means we don't have to resolve symbols in order to get the GOT's address, which simplifies the first part. We can also learn that the GOT is 32 (0x20) bytes long in our example.
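Looking up a section by name is something a section-header-based linker does all the time, so a small helper pays off. The following is a minimal sketch, assuming the whole file image is in memory and that the Elf64_* types come from a Linux-style <elf.h> (in your own kernel you'd supply equivalent definitions). find_section is a made-up name.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the section header with the given name,
 * or NULL if the image has no such section. */
const Elf64_Shdr *find_section(const void *image, const char *name)
{
    const uint8_t *base = image;
    const Elf64_Ehdr *ehdr = image;
    const Elf64_Shdr *shdr = (const Elf64_Shdr *)(base + ehdr->e_shoff);

    assert(ehdr->e_shstrndx < ehdr->e_shnum);

    /* section names live in the string table selected by e_shstrndx */
    const char *names = (const char *)(base + shdr[ehdr->e_shstrndx].sh_offset);

    for (size_t i = 0; i < ehdr->e_shnum; i++) {
        if (strcmp(names + shdr[i].sh_name, name) == 0)
            return &shdr[i];
    }
    return NULL;
}
```

With that, find_section(ptr, ".got.plt") would give you a header whose sh_addr is the GOT's address (201048 above) and whose sh_size is its length (0x20).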

What's in the GOT?

We already know that

1. GOT+0x0 is not used by the helper functions (the static linker reserves it for the address of _DYNAMIC)
2. GOT+0x8 is the argument to the second part
3. GOT+0x10 is the function reference to the second part

But what about the rest, starting at 201060 in our example? Here we have only one reference, so it's obvious, but what if we have more? How do we know which symbol is associated with which entry?

$ readelf -a test
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 4] '.rela.plt'       RELA             00000000002001e8  000001e8
       0000000000000018  0000000000000018  AI       2    10     8

Relocation section .rela.plt at offset 0x1e8 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000201060  000100000007 R_X86_64_JUMP_SLO 0000000000000000 printf + 0

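Conveniently, another table is recorded in the section headers. It's called '.rela.plt' and describes exactly that: each record holds the address of a GOT entry (Offset) and, in the upper 32 bits of Info, the index of the symbol it refers to.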
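Decoding one of these records takes only a few lines. This is a sketch under the assumption that the Elf64_* types and the ELF64_R_* macros come from a Linux-style <elf.h>, and that dynsym and dynstr point to the symbol and string tables the '.rela.plt' section links to (its Link field, typically '.dynsym' and that section's string table). The name plt_reloc_name is invented.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: return the name of the symbol a '.rela.plt'
 * record refers to, or NULL if the record is not a jump slot. */
const char *plt_reloc_name(const Elf64_Rela *rela,
                           const Elf64_Sym *dynsym,
                           const char *dynstr)
{
    assert(rela != NULL && dynsym != NULL && dynstr != NULL);

    /* lower 32 bits of r_info: relocation type */
    if (ELF64_R_TYPE(rela->r_info) != R_X86_64_JUMP_SLOT)
        return NULL;

    /* upper 32 bits of r_info: index into the symbol table;
     * st_name is an offset into the string table */
    return dynstr + dynsym[ELF64_R_SYM(rela->r_info)].st_name;
}
```

For the record shown above, Info is 000100000007: type 7 is R_X86_64_JUMP_SLOT and the symbol index is 1, and r_offset (201060) tells you which GOT entry belongs to that symbol.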

Where are my libraries?

So far we have assumed that the shared libraries are already loaded. That's the case with libc, but how do we know which other shared libraries the executable wants?

$ readelf -a test
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 6] '.dynamic'        DYNAMIC          0000000000200220  00000220
       0000000000000110  0000000000000010  WA       3     0     8

Dynamic section at offset 0x220 contains 12 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libc.so]

Not surprisingly, the answer lies in the section headers again. There's a table called '.dynamic'. It has several records, but the ones we are really interested in are those marked "NEEDED".
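Walking that table could look like the sketch below. It assumes a Linux-style <elf.h>, that dyn points to the first record of '.dynamic', and that dynstr points to the string table that section links to (Link is 3 in the dump above). The function name needed_libraries and its out-parameter convention are my own invention.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>

/* Hypothetical helper: collect the names of all NEEDED libraries.
 * Stores at most max pointers in names[] and returns how many it stored. */
size_t needed_libraries(const Elf64_Dyn *dyn, const char *dynstr,
                        const char **names, size_t max)
{
    size_t n = 0;

    assert(dyn != NULL && dynstr != NULL && names != NULL);

    /* the table is terminated by a DT_NULL record */
    for (; dyn->d_tag != DT_NULL; dyn++) {
        if (dyn->d_tag == DT_NEEDED && n < max)
            names[n++] = dynstr + dyn->d_un.d_val;  /* d_val: string table offset */
    }
    return n;
}
```

Each returned name (libc.so in our example) is what your loader has to find on disk and map before any symbol in it can be resolved.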

Symbol look up

To find out printf's address, we should first locate its symbol in the shared library.

$ readelf -a libc.so 
Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 2] '.dynsym'         DYNSYM           0000000100000218  00000218
       00000000000003c0  0000000000000018   A       3     1     8
  [ 3] '.dynstr'         STRTAB           00000001000005d8  000005d8
       0000000000000124  0000000000000000   A       0     0     1

Symbol table .dynsym contains 40 entries:
   Num:    Value          Size Type    Bind   Vis      Ndx Name
     8: 0000000100000175    93 FUNC    GLOBAL DEFAULT    1 printf

Bingo! It is 100000175 in our example.
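In code, the simplest possible lookup is a linear scan over '.dynsym'. The sketch below assumes a Linux-style <elf.h>; count is the section's size divided by its entry size (0x3c0 / 0x18 = 40 above), and base is whatever has to be added to symbol values for your mapping (0 if, as in this example, the library sits at the addresses it was linked for). lookup_symbol is a hypothetical name, and returning 0 for "not found" is just a convention chosen here.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: find a symbol defined by a shared library.
 * Returns its address, or 0 if the library does not define it. */
uint64_t lookup_symbol(const Elf64_Sym *dynsym, size_t count,
                       const char *dynstr, const char *name,
                       uint64_t base)
{
    assert(dynsym != NULL && dynstr != NULL && name != NULL);

    for (size_t i = 1; i < count; i++) {        /* entry 0 is the reserved null symbol */
        const Elf64_Sym *s = &dynsym[i];

        if (s->st_shndx == SHN_UNDEF)           /* imported by the library, not defined */
            continue;
        if (ELF64_ST_BIND(s->st_info) == STB_LOCAL)
            continue;                           /* not visible to other objects */
        if (strcmp(dynstr + s->st_name, name) == 0)
            return base + s->st_value;
    }
    return 0;
}
```

A linear scan is fine for getting started; with thousands of exported symbols you'd probably want something faster later on.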

Gimme code!

I've put all the above together in a very simple example, see elftool.c on gitlab.

When I run it on the executable it gives:

$ gcc elftool.c -o elftool
$ ./elftool -d mytestelf.o
Stringtable 000003f0 (118 bytes), symbols 000002b8 (312 bytes, one entry 24)

--- IMPORT ---
Dynamic 00001e20 (464 bytes, one entry 16):
  0. /lib/libc.so.6

GOT 00002000 (104 bytes), Rela 000004d0 (240 bytes, one entry 24):
  0. 00602018 +0 puts
  1. 00602020 +0 fread
  2. 00602028 +0 fclose
  3. 00602030 +0 printf
  4. 00602038 +0 strcmp
  5. 00602040 +0 ftell
  6. 00602048 +0 malloc
  7. 00602050 +0 fseek
  8. 00602058 +0 fopen
  9. 00602060 +0 exit

--- EXPORT ---

As you can see, here we have an import section but nothing to export. Now let's look at a shared library!

$ ./elftool -d /lib/libc.so.6
Stringtable 00011038 (23041 bytes), symbols 00003d90 (53928 bytes, one entry 24)

--- IMPORT ---
Dynamic 00197b60 (496 bytes, one entry 16):
  0. /lib/ld-linux-x86-64.so.2

GOT 00198000 (88 bytes), Rela 0001f760 (192 bytes, one entry 24):
  0. 00398050 +844d0 
  1. 00398048 +a8560 
  2. 00398040 +7ff80 
  3. 00398038 +867c0 
  4. 00398030 +823f0 
  5. 00398028 +82780 
  6. 00398020 +a8640 
  7. 00398018 +83b50 

--- EXPORT ---
  8. 0008e850 __strspn_c1
  9. 0006aad0 putwchar
 10. 000f8640 __gethostname_chk
 11. 0008e870 __strspn_c2
 12. 0010f210 setrpcent
 13. 0009eda0 __wcstod_l
 14. 0008e8a0 __strspn_c3
 15. 000e8d10 epoll_create
 16. 000d1b50 sched_get_priority_min
 17. 000f8660 __getdomainname_chk
 18. 000e8f20 klogctl
 19. 0002c380 __tolower_l
 20. 0004f440 dprintf
 21. 000b8e00 setuid
 22. 000a3d20 __wcscoll_l
... lot more lines to come ...

This time it has a hell of a lot of functions to export, and it also imports the Linux dynamic linker, with addend offsets in the GOT.

Summary

To summarize, a dynamic linker should look like this:

1. load-time linker
  1.1. locates the GOT via the '.got.plt' section, and saves that address at GOT+0x8
  1.2. stores the second part's address at GOT+0x10
  1.3. reads the '.dynamic' section to load shared libraries
2. run-time reference resolver
  2.1. it's called by a helper
  2.2. that helper places the index of the '.rela.plt' record and the address of the GOT as arguments on the stack
  2.3. it has to locate '.rela.plt' to get the symbol for the reference
  2.4. the symbol is looked up in the shared library's '.dynsym' section to get the relocated address
  2.5. that relocated address has to be saved in the GOT (index and base are on the stack)
  2.6. clean up the stack, restore registers and jump to the relocated address
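
Steps 2.3 to 2.5 can be sketched in C as follows. This is only an illustration under several assumptions of mine: the Elf64_* types come from a Linux-style <elf.h>, the load-time part has collected the table pointers into a struct (link_ctx, a made-up name, as is resolve), there is a single shared library, and it is mapped at the addresses it was linked for. The assembly wrapper that saves registers, fetches the two stack arguments and finally jumps to the returned address (steps 2.1, 2.2 and 2.6) is not shown.

```c
#include <assert.h>
#include <elf.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bookkeeping prepared by the load-time part. */
struct link_ctx {
    uint64_t         *got;         /* where the GOT is mapped                  */
    uint64_t          got_vaddr;   /* its address according to '.got.plt'      */
    const Elf64_Rela *rela_plt;    /* '.rela.plt' of the executable            */
    const Elf64_Sym  *exe_dynsym;  /* symbol table '.rela.plt' links to        */
    const char       *exe_dynstr;  /* and its string table                     */
    const Elf64_Sym  *lib_dynsym;  /* '.dynsym' of the shared library          */
    size_t            lib_nsyms;   /* number of entries in it                  */
    const char       *lib_dynstr;  /* '.dynstr' of the shared library          */
};

/* Resolve the reference with the given '.rela.plt' index, patch the GOT
 * and return the address to jump to (0 if the symbol was not found). */
uint64_t resolve(struct link_ctx *c, uint64_t index)
{
    assert(c != NULL && c->got != NULL);

    /* 2.3: which symbol does this reference want? */
    const Elf64_Rela *r = &c->rela_plt[index];
    const char *name = c->exe_dynstr
                     + c->exe_dynsym[ELF64_R_SYM(r->r_info)].st_name;

    /* 2.4: look it up among the symbols the library defines */
    for (size_t i = 1; i < c->lib_nsyms; i++) {
        const Elf64_Sym *s = &c->lib_dynsym[i];
        if (s->st_shndx == SHN_UNDEF)
            continue;
        if (strcmp(c->lib_dynstr + s->st_name, name) == 0) {
            /* 2.5: r_offset is the address of the GOT entry to patch */
            size_t slot = (r->r_offset - c->got_vaddr) / sizeof(uint64_t);
            c->got[slot] = s->st_value;
            return s->st_value;
        }
    }
    return 0;
}
```

With the numbers from this article, index 0 names printf, the lookup yields 100000175, and the entry at 201060 (GOT+0x18) is overwritten with it, so the next call through printf@plt no longer reaches the resolver.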

That's all, hope it helps somebody! Good luck with implementing your own dynamic linker!