Program Memory Allocation Types

NB: The original version of this was based on a post discussing automatic memory management.

By its nature, a program's thread of execution requires a certain amount of memory for storing both code and data during its operation. While some of this memory usage can be determined statically when a program is compiled or assembled, most programs also have some memory use which cannot be determined until the program is loaded into memory, and in many cases, there is some whose size can only be determined over the course of the program's execution.

There are four general models of memory usage in most programs, including operating systems: read-only memory, static global memory, lexically allocated memory, and dynamically allocated memory. All programs use the first two to some degree, the overwhelming majority of programs use the third (and most programming languages are designed to apply it as a matter of course), and the fourth is used in a significant proportion of programs, including any program in a language which implements Garbage collection as a primary language feature.

Read-Only Memory

Read-only memory, as the name implies, is a region of memory in which constant values are stored; the program should never attempt to change the values of addresses in read-only memory. This includes the memory for the program code (often called the .text section), as well as data constants (the .rodata section). In some instances, a compiler may place constants representing program literals (e.g., quoted strings, hard-coded numeric values) in .rodata as well.

Most modern systems incorporate some form of hardware page or segment protection to prevent read-only memory from being altered. Trying to change something in read-only memory on such systems will result in a memory protection fault of some kind. Management of read-only memory sections is by necessity a function of the operating system, both for the system itself and for userspace applications.
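
As a concrete illustration, consider the following C fragment. The section names and exact placement are toolchain-dependent (a typical GCC/ELF layout is assumed here), but the general behaviour is common:

    /* Typically emitted into .rodata by the compiler (placement varies by toolchain). */
    const int primes[4] = { 2, 3, 5, 7 };
    const char *greeting = "hello";   /* the string literal itself is stored read-only */

    void clobber(void)
    {
        char *p = (char *)greeting;
        p[0] = 'H';   /* undefined behaviour: on a system with page protection,
                         this write to read-only memory raises a protection fault */
    }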

Global Memory

Global memory is used for storing variables which are in scope for the entire program and whose size is known when a process is launched (in most cases, it can be determined at compile time). This is a fixed amount of space given to the program when it starts, and it cannot be increased or decreased.

Some global memory may be initialized when the program starts, using values that are part of the executable image, but there may also be a separate area of global memory (usually referred to as the .bss section) which is simply an amount of memory that the loader clears when the program starts - the executable file stores only the size of the .bss section, to avoid having a large blob of useless empty memory in the file.
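
For example, in C (again assuming a typical ELF toolchain; the exact section names vary), initialized and uninitialized globals end up in different sections:

    int counter = 42;           /* initialized: placed in .data, its value is stored in the executable image */
    int buffer[4096];           /* uninitialized: only its size is recorded; the loader zero-fills it in .bss */
    static long running_total;  /* also .bss: guaranteed to read as zero before main() runs */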

Lexically-Allocated Memory

Lexically-allocated memory, also called function local memory, is memory which is automatically allocated in a block when a function begins and deallocated when it returns. It can be seen as an area of 'scratch memory' which the program can use for temporary values whose lifespan is determined by the running time of the function using them, and which can always be automatically reclaimed for reuse after the function exits.

In many cases, the program is given a large, but fixed, block of memory for this purpose, whose size can generally be set when the program is launched; whether the program can request additional memory for this block, and how, is a matter of OS design. While a process is in principle free to use this memory as it chooses, most operating systems and high-level languages impose standard conventions on its use to ensure that the basic purpose - memory automatically allocated on function entry and automatically freed on function exit - is upheld, and many CPU instruction sets have hardware support for these conventions.

The function's local memory is structured as an activation record or activation frame, which contains the function's local variables (or references to them), as well as a means of accessing the function's arguments and the address to which the function should return when it exits. Most of the time, the size of an activation record is fixed at compile time, but some kinds of local variables (most notably strings) can be dynamically sized for a given function invocation in some languages, so long as the size doesn't change once the activation record is created.
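
C99 variable-length arrays are one example of such a dynamically sized local: the array's size is chosen when the function is entered, but cannot change afterwards. A minimal sketch:

    #include <stdio.h>

    void print_squares(int n)
    {
        int squares[n];                 /* size chosen per invocation, then fixed */
        for (int i = 0; i < n; i++)
            squares[i] = i * i;
        for (int i = 0; i < n; i++)
            printf("%d ", squares[i]);
        printf("\n");
    }   /* the array's storage is reclaimed automatically on return */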

Since functions may call other functions, or even call themselves recursively, in general each call to a function has to be provided with a separate activation record. This ensures that a given function has its own version of each of its variables; this means that recursive calls do not overwrite the caller's variables. The activation records are thus managed in a 'Last In, First Out' (LIFO) fashion, a fact which is used by most systems in implementing them, as will be described below in the section on stack allocation.
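
A recursive function in C illustrates this: each invocation gets its own copy of its parameter and local variables in a fresh activation record, so the deeper calls cannot disturb the shallower ones:

    unsigned long factorial(unsigned int n)
    {
        if (n <= 1)
            return 1;
        /* the recursive call gets its own activation record with its own n,
           so this call's n is unchanged when the callee returns */
        return n * factorial(n - 1);
    }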

The primary exception to this is when the last action a function performs is a call to another function, which is known as a tail call. In this instance, the compiler can reduce the short-term memory load by means of Tail Call Optimization, in which the existing activation record is reused for the function being called. Many compilers offer TCO as an optimization option, and some languages mandate it, either as the default for all tail calls or as something which can be requested through language pragmas.
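
The factorial above can be rewritten with an accumulator so that the recursive call is the last thing the function does. The C standard does not mandate TCO, but compilers such as GCC and Clang will usually apply it at normal optimization levels, reusing a single activation record for the whole chain of calls:

    unsigned long factorial_acc(unsigned int n, unsigned long acc)
    {
        if (n <= 1)
            return acc;
        return factorial_acc(n - 1, n * acc);   /* tail call: nothing remains to do afterwards */
    }

Calling factorial_acc(n, 1) computes the same result as factorial(n).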

Stack Allocation

Stack allocation is a widely used method for implementing lexically-scoped memory allocation. Because activation records are created and freed in a LIFO order, it is possible to push the activation records (or at least references to them) onto a stack data structure. This program stack has been near-ubiquitous for languages designed after the early 1960s, and many ISAs have either hardware support for the program stack, or a register convention that amounts to the same thing.

The general model for a register-based stack is to have a set of registers set aside as pointers that define both the total size of the stack and its current state. The mechanisms for sizing the .stack section vary, but the basic mechanisms for manipulating it are more or less universal.

The stack's dynamic state is defined by a stack pointer, a register which holds the address on the stack where the most recently added element is found. When a new element is pushed onto the stack, the stack pointer is decremented (for a stack that grows downward - for a variety of reasons, most hardware stacks grow down by default, so the rest of this discussion assumes this direction), and the data for the new top of the stack is copied to the location addressed by the stack pointer.

When an element is popped off of the stack, the stack pointer is incremented by the same amount, which has the effect of freeing the memory used for the previous top element.
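
The pointer arithmetic involved can be sketched in C by simulating a downward-growing stack (the sizes and names here are purely illustrative, not those of any particular ABI):

    #include <stdint.h>

    #define STACK_WORDS 1024

    static uintptr_t stack_area[STACK_WORDS];
    /* start one element past the end: the stack grows toward lower addresses */
    static uintptr_t *stack_ptr = stack_area + STACK_WORDS;

    static void push(uintptr_t value)
    {
        --stack_ptr;          /* decrement first (grow downward)... */
        *stack_ptr = value;   /* ...then store the new top-of-stack element */
    }

    static uintptr_t pop(void)
    {
        uintptr_t value = *stack_ptr;   /* read the current top element */
        ++stack_ptr;                    /* incrementing the pointer frees its slot */
        return value;
    }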

To manage an activation record on the stack, a second register, the base pointer (also called a frame pointer), is used; it points into the stack data at a fixed reference point for the current activation record.

When a function is to be called, the caller pushes the function's arguments onto the stack in an order defined by a Calling Convention, which specifies how arguments and local variables are laid out. Depending on the convention, saving the caller's own state (e.g., the registers it is actively using) may be the caller's responsibility, or that of the called function, or some combination of the two (e.g., certain registers may be 'sacred' and have to be preserved by the called function, while others can be used freely by the called function and it is up to the caller to save them before the call).

The calling convention may also specify that some or all of the arguments be passed in registers rather than on the stack, to reduce memory use and call overhead, which could affect what the caller needs to preserve for its own use.

Once the arguments are pushed, the call to the function is made by pushing the address of the instruction just past the call site as the return address for the function, and jumping to the entry point of the called function. Some ISAs have a special instruction for this purpose, which either pushes the value to the stack directly or stores it into a return address register; for example, the CALL instruction on the Intel x86 instruction set. When a return-address register is used instead, the function is responsible for saving its own return address in its activation record explicitly, though this does allow a function that makes no function calls of its own to avoid the overhead of pushing the address.

When control is passed to the function, the function first saves the current base pointer (the base pointer of the calling function). It then copies the current stack pointer to the base pointer, which can then be used as a base for offsets up into the arguments, or down into local variables. Finally, the stack pointer is decremented by the total amount of memory needed for all of the function's local variables.

When the function exits, it copies its base pointer back to the stack pointer, then pops the caller's saved base pointer back into the base pointer register, which has the effect of restoring the stack to its state at the function's entry. It can then return to the caller, often through a special instruction such as RET on the x86 (which also removes the return address from the stack).
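
Putting the pieces together, the following C function is annotated with the frame layout a compiler might produce under the classic 32-bit x86 cdecl convention with frame pointers (assumed here purely for illustration; modern ABIs often pass arguments in registers and may omit the frame pointer entirely):

    /* After the prologue (save caller's base pointer; base pointer = stack
       pointer; stack pointer -= space for locals), the frame looks like:
         [base pointer + 12] -> second argument (b)
         [base pointer + 8]  -> first argument (a)
         [base pointer + 4]  -> return address pushed by CALL
         [base pointer + 0]  -> caller's saved base pointer
         [base pointer - 4]  -> local variable (sum)
       The epilogue reverses this: stack pointer = base pointer, pop the
       saved base pointer, then RET pops the return address and jumps to it. */
    int add_pair(int a, int b)
    {
        int sum = a + b;   /* lives at a negative offset from the base pointer */
        return sum;
    }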


Dynamically Allocated Memory

Dynamic memory, also called Heap Memory, is memory used for data when neither the size nor the lifespan of the variable is known beforehand. It is a region of memory that can be allocated to programs in pieces, at run time, without being automatically reclaimed when the function that allocated it ends (some languages have a special means of reclaiming dynamic memory after it is no longer in use, called garbage collection, but that is not relevant here). Most operating systems have a system call which allows a program to request enough memory to hold one or more values of a given type; it returns a pointer to the newly allocated memory space. There is usually a corresponding system call which allows the program to return the allocated memory to the system.

Between allocation and release, the memory remains available for use, so long as the program keeps at least one pointer to the block; the address of that memory can be passed from one pointer to another and treated just like a pointer to any other non-constant memory. However, once the system releases that memory, all pointers to it become invalid and should not be used - preferably, you should null them after releasing the memory they point to. When the process exits, the operating system should release the memory used, even if it wasn't deallocated, but as long as the program is running, the memory will remain allocated until it is freed.
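
In C, for example, these requests are usually made through the library routines malloc() and free(), which sit on top of whatever the operating system provides. A minimal sketch of the allocate/use/release cycle:

    #include <stdlib.h>

    void demo(size_t n)
    {
        int *values = malloc(n * sizeof *values);   /* request space for n ints */
        if (values == NULL)
            return;                                 /* the request can fail */

        values[0] = 123;                            /* use the memory freely while it is allocated */

        free(values);                               /* hand the block back when finished */
        values = NULL;                              /* null the pointer so it cannot become a wild pointer */
    }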

There are just three rules to follow with dynamic memory: don't lose the starting address of the memory block, or you won't be able to deallocate it later; always remember to deallocate it after you are done with it; and don't try to access the memory after it has been deallocated. These rules are trickier to follow than they sound, however, as it is easy to lose track of where the memory is and how you are using it at any given time. Failing to release memory when you are finished with it is a memory leak, while a pointer that points to a freed block (or one that is invalid for some other reason) is a wild pointer.

So, if it is so complicated, why bother with dynamic memory? Because you can make arrays of arbitrary size without knowing how big they have to be beforehand. Also, you can use it to allocate structures with pointers in them, which can then point to other dynamically allocated memory, letting you chain the allocated memory into more complex data structures such as lists and trees. It simply wouldn't be practical to build these structures without dynamic allocation.
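
A singly linked list is the classic example: each node is allocated on demand and points to the next, so the structure can grow to any size. The helper below is a hypothetical sketch, not a complete list implementation:

    #include <stdlib.h>

    struct node {
        int value;
        struct node *next;   /* points to another dynamically allocated node, or NULL */
    };

    /* add a new value to the front of the list, returning the new head */
    static struct node *push_front(struct node *head, int value)
    {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            return head;     /* out of memory: leave the list unchanged */
        n->value = value;
        n->next = head;
        return n;
    }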

This brings us to application memory management. Most of the time, programs do not fetch a new block of memory from the operating system each and every time they need one; the overhead of system calls is too high, and it can potentially slow the whole system down as the OS manages the allocations. So the usual practice is to allocate a larger block of memory than is actually needed, and use that as a kind of scratch space to allocate memory from. The allocations and frees are usually wrapped up in a pair of library functions corresponding to the system calls, and the program only calls the OS again if it needs more. The local allocator has to keep track of the local heap memory, but usually this isn't too complicated.
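
The idea can be sketched with a crude 'bump' allocator that hands out pieces of one large block; real allocators (including the one behind malloc()) are far more sophisticated, reusing freed blocks and asking the OS for more space as needed:

    #include <stddef.h>
    #include <stdlib.h>

    #define POOL_SIZE (64 * 1024)

    static unsigned char *pool;   /* one large block, obtained up front (via malloc() here for simplicity) */
    static size_t pool_used;

    static void *pool_alloc(size_t size)
    {
        if (pool == NULL) {
            pool = malloc(POOL_SIZE);            /* fetch the big block once */
            if (pool == NULL)
                return NULL;
        }
        size = (size + 15) & ~(size_t)15;        /* keep each allocation 16-byte aligned */
        if (pool_used + size > POOL_SIZE)
            return NULL;                         /* a fuller allocator would request more from the OS */
        void *p = pool + pool_used;
        pool_used += size;
        return p;
    }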

However, this leaves the programmer to keep track of the actual pointers to the memory, which can be very complicated indeed. For this reason, various approaches for automating memory management have been devised, with Garbage collection being the most common (though a related form called 'reference counting' is often used as well, and is often confused with it). Basically, in garbage collection, all allocations go through the collector's own allocation functions, so the collector knows about every allocated block, and a part of the program - the garbage collector - periodically checks whether the blocks of allocated memory still have any pointers pointing at them. When a block has no pointers to it, the allocator frees it for the program, without any action on the part of the programmer. Older garbage collectors had to stop the program in order to do this, but some modern multithreaded ones can do so in parallel with the program's other threads.
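
Of the two, reference counting is the easier to sketch briefly: every block carries a count of the references to it, and the block is freed the moment that count drops to zero. (Unlike a tracing collector, this simple scheme cannot reclaim structures that point to each other in a cycle.) The names below are illustrative only:

    #include <stdlib.h>

    struct refcounted {
        size_t refs;          /* number of live references to this block */
        /* ... payload fields would follow ... */
    };

    static void retain(struct refcounted *obj)
    {
        obj->refs++;          /* a new reference has been taken */
    }

    static void release(struct refcounted *obj)
    {
        if (--obj->refs == 0) /* last reference dropped: reclaim the block */
            free(obj);
    }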

While GC is most associated with interpreters, it is applicable to compiled code as well - it was originally developed for early Lisp implementations at the start of the 1960s. It is possible to provide a garbage collector as a library and require the programmer to follow the GC discipline, but it is far more common for it to be implemented in the language itself. It has a number of other advantages for both security and simplicity, especially in scripting languages - it allows the language translator to abstract away the allocation and deallocation of variable memory entirely - but there are distinct costs to it as well. There are endless variations on garbage collection meant to reduce the overhead of doing the collection, but a certain amount of overhead is inevitable when garbage collection is used.