TLB
The Translation Lookaside Buffer (TLB) is a cache of virtual-to-physical page translations used in many systems that support memory paging. When the processor needs to translate a virtual address into a physical address, it consults the TLB first. On x86 systems, TLB misses are handled transparently by hardware, which walks the paging structures itself. The operating system is notified, by means of a page fault exception, only if the required page directory or page table entry is not marked present.
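The lookup order described above can be illustrated with a small software model. This is only a sketch: the single-level `page_table`, the direct-mapped four-slot `tlb`, and the names `translate` and `xlate_result_t` are all hypothetical, and a real x86 TLB is a hardware structure whose organization is not visible to software.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12   /* 4 KiB pages */
#define TLB_SLOTS  4    /* tiny direct-mapped TLB, for illustration only */
#define NUM_PAGES  16   /* size of the toy single-level page table */

typedef struct { bool present; uint32_t frame; } pte_t;
typedef struct { bool valid; uint32_t vpn; uint32_t frame; } tlb_entry_t;

typedef enum { XLATE_TLB_HIT, XLATE_WALK, XLATE_PAGE_FAULT } xlate_result_t;

static pte_t page_table[NUM_PAGES];
static tlb_entry_t tlb[TLB_SLOTS];

/* Translate vaddr; on success the physical address is stored in *paddr. */
static xlate_result_t translate(uint32_t vaddr, uint32_t *paddr)
{
    assert(paddr);
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    uint32_t offset = vaddr & ((1u << PAGE_SHIFT) - 1);
    tlb_entry_t *slot = &tlb[vpn % TLB_SLOTS];

    /* 1. Consult the TLB first. */
    if (slot->valid && slot->vpn == vpn) {
        *paddr = (slot->frame << PAGE_SHIFT) | offset;
        return XLATE_TLB_HIT;
    }

    /* 2. TLB miss: the modeled hardware walks the page table. */
    if (vpn >= NUM_PAGES || !page_table[vpn].present)
        return XLATE_PAGE_FAULT;   /* 3. Only now is the OS involved. */

    /* Cache the translation so that the next access hits. */
    slot->valid = true;
    slot->vpn = vpn;
    slot->frame = page_table[vpn].frame;
    *paddr = (slot->frame << PAGE_SHIFT) | offset;
    return XLATE_WALK;
}
```

In this model, the first access to a present page returns `XLATE_WALK`, a repeated access returns `XLATE_TLB_HIT`, and an access to a page that is not present returns `XLATE_PAGE_FAULT` without touching the TLB.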
Usage implications
Like a regular CPU cache, the TLB is mostly transparent. There are, however, two cases that the operating system must handle itself.
Modification of paging structures
The TLB is not transparently informed of changes made to paging structures. Therefore the TLB has to be flushed upon such a change. On x86 systems, this can be done by writing to the page directory base register (CR3):
movl %cr3,%eax
movl %eax,%cr3
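Note: setting the global (G) bit in a page directory or page table entry prevents the corresponding translation from being flushed by a CR3 reload, provided global pages are enabled (the PGE bit in CR4). This is useful for keeping translations such as those for interrupt handlers cached across address space switches.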
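The effect of the global bit on a CR3 reload can be modeled in a few lines of C. This is a sketch under stated assumptions: `tlb_entry_t`, `tlb_fill` and `tlb_flush_non_global` are hypothetical names, the eight-slot array stands in for the hardware TLB, and global pages are assumed to be enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_SLOTS 8   /* arbitrary size for the toy model */

typedef struct {
    bool valid;
    bool global;      /* copy of the G bit from the paging structure entry */
    uint32_t vpn;     /* virtual page number */
    uint32_t frame;   /* physical frame number */
} tlb_entry_t;

static tlb_entry_t tlb[TLB_SLOTS];

/* Install a translation into a given slot of the model. */
static void tlb_fill(size_t slot, uint32_t vpn, uint32_t frame, bool global)
{
    assert(slot < TLB_SLOTS);
    tlb[slot] = (tlb_entry_t){ .valid = true, .global = global,
                               .vpn = vpn, .frame = frame };
}

/* Model of a CR3 reload: drop every cached translation except global
 * ones. Returns the number of entries that were dropped. */
static size_t tlb_flush_non_global(void)
{
    size_t dropped = 0;
    for (size_t i = 0; i < TLB_SLOTS; i++) {
        if (tlb[i].valid && !tlb[i].global) {
            tlb[i].valid = false;
            dropped++;
        }
    }
    return dropped;
}
```

After filling one global kernel translation and two ordinary ones, a call to `tlb_flush_non_global()` drops only the two ordinary entries; the global entry stays valid.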
An alternative to reloading CR3, and a better one for small mapping modifications (creating, removing, or changing a mapping), is the invlpg instruction, which invalidates the cached translation for a single page. invlpg is mostly used in page unmapping and remapping routines to invalidate the previously cached translation. If neither invlpg nor some other TLB flush method is used, the old mapping may remain cached, and the processor may keep using the stale translation, with unpredictable results.
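The stale-translation problem can be reproduced in a toy model. The sketch below is not kernel code: `lookup_frame`, `tlb_invalidate_page` and `remap_page` are hypothetical helpers, the page table is a flat array, and `tlb_invalidate_page` only imitates what invlpg does for one page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TLB_SLOTS 4    /* tiny direct-mapped TLB, for illustration only */
#define NUM_PAGES 16   /* size of the toy page table */

typedef struct { bool present; uint32_t frame; } pte_t;
typedef struct { bool valid; uint32_t vpn; uint32_t frame; } tlb_entry_t;

static pte_t page_table[NUM_PAGES];
static tlb_entry_t tlb[TLB_SLOTS];

/* Return the frame the modeled CPU would use for vpn, filling the TLB on
 * a miss. Assumes the page is present; fault handling is left out. */
static uint32_t lookup_frame(uint32_t vpn)
{
    assert(vpn < NUM_PAGES);
    tlb_entry_t *slot = &tlb[vpn % TLB_SLOTS];
    if (!(slot->valid && slot->vpn == vpn)) {
        assert(page_table[vpn].present);
        *slot = (tlb_entry_t){ .valid = true, .vpn = vpn,
                               .frame = page_table[vpn].frame };
    }
    return slot->frame;
}

/* Model of invlpg: drop only the cached translation for one page. */
static void tlb_invalidate_page(uint32_t vpn)
{
    tlb_entry_t *slot = &tlb[vpn % TLB_SLOTS];
    if (slot->valid && slot->vpn == vpn)
        slot->valid = false;
}

/* Remap vpn to new_frame; the caller chooses whether to invalidate. */
static void remap_page(uint32_t vpn, uint32_t new_frame, bool invalidate)
{
    assert(vpn < NUM_PAGES);
    page_table[vpn] = (pte_t){ .present = true, .frame = new_frame };
    if (invalidate)
        tlb_invalidate_page(vpn);
}
```

If a page is remapped with `invalidate` set to false after it has been accessed once, `lookup_frame` keeps returning the old frame; remapping with `invalidate` set to true makes the next lookup pick up the new frame.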
Note that the invlpg instruction was introduced with the i486 and is not part of the i386 instruction set. A properly written i386-compatible kernel must therefore select the relevant code at compilation time, depending on the target machine. An example routine declaration and its source code follow:
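/* C prototype: invalidate the cached translation for one page. */
void vm_page_inval(void *);
/* Assembly implementation (AT&T syntax, run through the C preprocessor). */
#include <kconfig.h>
.globl vm_page_inval
vm_page_inval:
#if TARGET_MACHINE >= TARGET_MACHINE_I486
movl 4(%esp),%eax	/* first argument: an address within the page */
invlpg (%eax)		/* invalidate only that page's translation */
#else /* TARGET_MACHINE < TARGET_MACHINE_I486 */
movl %cr3,%eax		/* no invlpg on the i386: reload CR3, */
movl %eax,%cr3		/* which flushes the entire TLB */
#endif
ret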
Note that reloading CR3 should only be necessary when switching between process address spaces. Using it to flush the entire TLB after a small mapping change is overkill in most situations, because every non-global translation must then be fetched from the paging structures again.
Multi-processor consistency
The above is more complicated in the multi-processor case. If another processor could also be affected by a page table write (because of shared memory, or multiple threads of the same process), you must also flush the TLBs on those processors. This requires some form of inter-processor communication, typically an inter-processor interrupt (IPI); the technique is commonly called a TLB shootdown.
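The bookkeeping behind such a cross-processor flush can be sketched as a single-threaded model. Everything here is an assumption made for illustration: `cpu_t`, `cpu_cache`, `local_invalidate` and `tlb_shootdown` are hypothetical names, address spaces are identified by a plain integer, and a direct function call stands in for the inter-processor interrupt. A real kernel would also have to wait until every target processor acknowledges the request before reusing the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_CPUS  4
#define TLB_SLOTS 4   /* tiny direct-mapped per-CPU TLB, for illustration */

typedef struct { bool valid; uint32_t vpn; uint32_t frame; } tlb_entry_t;

typedef struct {
    tlb_entry_t tlb[TLB_SLOTS];
    int address_space;   /* hypothetical ID of the address space in use */
} cpu_t;

static cpu_t cpus[NUM_CPUS];

/* Record that a CPU runs address_space and has cached vpn -> frame. */
static void cpu_cache(int cpu, int address_space, uint32_t vpn, uint32_t frame)
{
    assert(cpu >= 0 && cpu < NUM_CPUS);
    cpus[cpu].address_space = address_space;
    cpus[cpu].tlb[vpn % TLB_SLOTS] =
        (tlb_entry_t){ .valid = true, .vpn = vpn, .frame = frame };
}

/* What each CPU does on receiving the request (models a local invlpg). */
static void local_invalidate(cpu_t *cpu, uint32_t vpn)
{
    tlb_entry_t *slot = &cpu->tlb[vpn % TLB_SLOTS];
    if (slot->valid && slot->vpn == vpn)
        slot->valid = false;
}

/* Invalidate vpn on every CPU that is running address_space. Returns the
 * number of CPUs notified. The direct call stands in for an IPI. */
static int tlb_shootdown(int address_space, uint32_t vpn)
{
    int notified = 0;
    for (int i = 0; i < NUM_CPUS; i++) {
        if (cpus[i].address_space != address_space)
            continue;   /* this CPU cannot hold the stale mapping */
        local_invalidate(&cpus[i], vpn);
        notified++;
    }
    return notified;
}
```

Restricting the flush to processors that run the affected address space keeps unrelated processors undisturbed, which matters because each real IPI interrupts whatever the target processor was doing.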