User:Pancakes/BitmapHeapImplementationEnhanced

From OSDev Wiki
Jump to navigation Jump to search

Bitmap Heap Implementation Enhanced

I was working on another project and realized that I could use the same code for a heap as I could to manage physical memory. I did not want to clutter that page with lots of code so I placed it here.

The original page is here and contains a lot of information about using the heap.

This heap provides four extra functions.

int k_heapBMAddBlockEx(KHEAPBM *heap, uintptr addr, uint32 size, uint32 bsize, KHEAPBLOCKBM *b, uint8 *bm, uint8 isBMInside);
void *k_heapBMAllocBound(KHEAPBM *heap, uint32 size, uint32 mask);
uintptr k_heapBMGetBMSize(uintptr size, uint32 bsize);
void k_heapBMSet(KHEAPBM *heap, uintptr ptr, uintptr size, uint8 rval);

The k_heapBMAddBlockEx allows you to set the actual block management structure and bitmap in a different memory location. The is useful because normally the header will offset the actual data onto a CPU word sized boundary. While a CPU word sized (for example 4-byte/32-bit) boundary is useful for a normal heap, when working with say 4KB or 16KB boundary you will have problems and additional complications trying to make it work. For example if you wanted to manage memory from address 0 to the 8MB mark and make the bitmap handle chunks in 4KB sections all aligned on a boundary then the header is going to cause that to be impossible.

Two Ways To Initialize That Are Equivalent

To explain let me show you two ways to initialize the bitmap that are equivalent:

KHEAPBM       bmh;

#define SIZE  0x100000
#define BSIZE 16

/* the simple and normal way */
k_heapBMInit(&bmh);
k_heapBMAddBlock(&bmh, (uintptr)malloc(SIZE), SIZE, BSIZE);

/* the advanced way */
uintptr       addr;
uint8         *bm;
KHEAPBLOCKBM  *block;

k_heapBMInit(&bmh);
addr = (uintptr)malloc(SIZE);
b = (KHEAPBLOCKBM*)addr;
bm = (uint8*)(addr + sizeof(KHEAPBLOCKBM));
k_heapBMAddBlockEx(&bmh, addr + sizeof(KHEAPBLOCKBM), SIZE - sizeof(KHEAPBLOCKBM), BSIZE, b, bm, 1);

Both, produce the same result. In actuality when you call k_heapBMAddBlock it does this then it calls k_heapBMAddBlockEx where Ex stands for extension. Now, let me show you an advantage of using the extension function.

Allocating Header And Bitmap Outside

uintptr       addr;
uint8         *bm;
KHEAPBLOCKBM  *block;

k_heapBMInit(&bmh);
addr = (uintptr)malloc(SIZE);
b = (KHEAPBLOCKBM*)malloc(sizeof(KHEAPBLOCKBM));
bm = (uint8*)malloc(k_heapBMGetBMSize(SIZE, BSIZE));
k_heapBMAddBlockEx(&bmh, addr, SIZE, BSIZE, b, bm, 0);

Now, the bitmap and block header are stored outside of the originally allocated data. This is useful because if you want to alloc data aligned on a boundary beyond the CPU word size it is now possible because you have removed the block structure from offsetting your data. But, what about the bitmap? I mean the bitmap would have used whole blocks anyway right? Well, true, but the bitmap might have wasted some precious memory in a machine with a limited amount of memory to begin with because the bitmap will normally occupy whole blocks. So if your block size was 4K and you have 8MB of memory then the bitmap would have used an entire 4K (one block) instead of only 2K which might be kind of wasteful.

Allocate Header Outside With Bitmap At Beginning

My point is now you not only have control over removing the header, but also the ability to save a little space by also allocating the bitmap outside of the heap data and not have it subject to using whole blocks and that can save a little space.

Let me also show you how to leave the bitmap but allocate the header outside:

k_heapBMInit(&bmh);
addr = (uintptr)malloc(SIZE);
b = (KHEAPBLOCKBM*)malloc(sizeof(KHEAPBLOCKBM));
bm = (uint8*)addr;
k_heapBMAddBlock(&bmh, addr, SIZE, BSIZE, b, bm, 1);

You can notice the bm is set to the beginning and at the moment the implementation does not support it being set anywhere else unless you use the k_heapBMSet which is explained below.

Example To Manage Physical Memory Using Two Heaps

Let me show an example of using the bitmap implementation to manage physical memory.

        #define KRAMADDR    0x0               
        #define KCHKIOFF    (0x1000 * 4)       /* 16KB */
        #define KCHKISIZE   0x1000             /* 4KB */
        #define KRAMSIZE    (1024 * 1024 * 8)  /* 8 MB */

        KHEAPBM      hphy, hchk;

	k_heapBMInit(&hphy);
	k_heapBMInit(&hchk);
	/* get a bit of memory to start with for small chunk */
	k_heapBMAddBlock(&hchk, KCHKIOFF, KCHKISIZE, 16);
	/* add block but place header in chunk heap to keep alignment */
        /* also allocate bitmap inside chunk heap */
	k_heapBMAddBlockEx(&hphy, KRAMADDR, KRAMSIZE, 0x1000, (KHEAPBLOCKBM*)k_heapBMAlloc(&hchk, sizeof(KHEAPBLOCKBM)), (uint8*)k_heapBMAlloc(&hchk, k_heapBMGetBMSize(KRAMSIZE, 0x1000)), 0);
        k_heapBMSet(&hphy, KCHKIOFF, KCHKISIZE, 5);

Now, we used k_heapBMSet to remove the region KCHKIOFF of size KCHKISIZE by setting it as used inside the heap basically removing it from being allocated. Now, you can allocate on a 16KB boundary from the beginning of RAM by removing the header and bitmap, and still keeping track of them by allocating an initial small size heap.

You want a 4KB page then use hphy. You need to allocate a structure then use hchk. You want 16KB then allocate from either hphy and hchk (you need support to code to make hchk allocate more memory if it does not have enough or runs out see blow).

You need 16KB allocated on a 16KB boundary then see one of the sections below.

You may notice the last argument of k_heapBMSet is 5. Try not to use the value 1, 0, or 2 as the implementation uses that to designate allocations. If you use any of those you will find your region getting freed or allocated. So use something > 2. You can use anything between 3 and 255.

Feeding Heap With Pages/Chunks/Sections

From this point you could keep your small chunk heap (hchk) feed by handing allocates from your larger size heap (hphy). The hphy stands for physical page heap and hchk for chunk heap.

void *kalloc(uintptr size) {
     void *ptr;
     /* access hphy and hchk... */
     ptr = k_heapBMAlloc(hchk, size);
     if (!ptr) {
         /* try allocation again... */
         if (size > 0x500) {
             k_heapBMAddBlock(hchk, k_heapAlloc(hphy, size * 2), size * 2);
         } else {
             k_heapBMAddBlock(hchk, k_heapAlloc(hphy, 0x1000), 0x1000);
         }
         ptr = k_heapBMAlloc(hchk, size);
     }
     /* either it worked or it did not */
     return ptr;
}

The above is a rough way to feed your small chunk heap with your larger physical page heap. It can be optimized. For example you might be able to allocate less than size * 2 but I just used that because its whole number integer multiplication and simple to understand.

= Allocating On Boundaries

uintptr   ptr;
ptr = k_heapBMAllocBound(hphy, 1024 * 16, 14);

Here we request a pointer to a memory region 16KB in size that is allocated on a 16KB boundary. The 16KB boundary is specified by asking for the bottom 14 bits to be zero. As 16KB is 0x4000 which has the bottom 14 bits set to zero. If it is unable to find a region that satisfies this condition then you get a null/zero pointer.

Code

Here is the updated bitmap implementation providing the new extensions:

/* Leonard Kevin McGuire Jr ([email protected]) (www.kmcg3413.net) */
typedef struct _KHEAPBLOCKBM {
	struct _KHEAPBLOCKBM	*next;
	uint32					size;
	uint32					used;
	uint32					bsize;
	uint32					lfb;
	uintptr					data;
	uint8					*bm;
} KHEAPBLOCKBM;

typedef struct _KHEAPBM {
	KHEAPBLOCKBM			*fblock;
} KHEAPBM;

void k_heapBMInit(KHEAPBM *heap);
int k_heapBMAddBlock(KHEAPBM *heap, uintptr addr, uint32 size, uint32 bsize);
int k_heapBMAddBlockEx(KHEAPBM *heap, uintptr addr, uint32 size, uint32 bsize, KHEAPBLOCKBM *b, uint8 *bm, uint8 isBMInside);
void *k_heapBMAlloc(KHEAPBM *heap, uint32 size);
void k_heapBMFree(KHEAPBM *heap, void *ptr);
uintptr k_heapBMGetBMSize(uintptr size, uint32 bsize);
void *k_heapBMAllocBound(KHEAPBM *heap, uint32 size, uint32 mask);
void k_heapBMSet(KHEAPBM *heap, uintptr ptr, uintptr size, uint8 rval);

void k_heapBMInit(KHEAPBM *heap) {
	heap->fblock = 0;
}

int k_heapBMAddBlock(KHEAPBM *heap, uintptr addr, uint32 size, uint32 bsize) {
	KHEAPBLOCKBM			*b;
	uintptr					bmsz;
	uint8					*bm;

	b = (KHEAPBLOCKBM*)addr;
	bmsz = k_heapBMGetBMSize(size, bsize);
	bm = (uint8*)(addr + sizeof(KHEAPBLOCKBM));
	/* important to set isBMInside... (last argument) */
	return k_heapBMAddBlockEx(heap, addr + sizeof(KHEAPBLOCKBM), size - sizeof(KHEAPBLOCKBM), bsize, b, bm, 1);
}

uintptr k_heapBMGetBMSize(uintptr size, uint32 bsize) {
	return size / bsize;
}

int k_heapBMAddBlockEx(KHEAPBM *heap, uintptr addr, uint32 size, uint32 bsize, KHEAPBLOCKBM *b, uint8 *bm, uint8 isBMInside) {
	uint32				bcnt;
	uint32				x;
	
	b->size = size;
	b->bsize = bsize;
	b->data = addr;
	b->bm = bm;
	
	b->next = heap->fblock;
	heap->fblock = b;

	bcnt = size / bsize;
	
	/* clear bitmap */
	for (x = 0; x < bcnt; ++x) {
		bm[x] = 0;
	}

	bcnt = (bcnt / bsize) * bsize < bcnt ? bcnt / bsize + 1 : bcnt / bsize;
	
	/* if BM is not inside leave this space avalible */
	if (isBMInside) {
		/* reserve room for bitmap */
		for (x = 0; x < bcnt; ++x) {
			bm[x] = 5;
		}
	}
	
	b->lfb = bcnt - 1;
	
	b->used = bcnt;
	
	return 1;
}

static uint8 k_heapBMGetNID(uint8 a, uint8 b) {
	uint8		c;	
	for (c = a + 1; c == b || c == 0; ++c);
	return c;
}

void *k_heapBMAlloc(KHEAPBM *heap, uint32 size) {
	return k_heapBMAllocBound(heap, size, 0);
}

void *k_heapBMAllocBound(KHEAPBM *heap, uint32 size, uint32 bound) {
	KHEAPBLOCKBM		*b;
	uint8				*bm;
	uint32				bcnt;
	uint32				x, y, z;
	uint32				bneed;
	uint8				nid;
	uint32				max;

	bound = ~(~0 << bound);
	/* iterate blocks */
	for (b = heap->fblock; b; b = b->next) {
		/* check if block has enough room */
		if (b->size - (b->used * b->bsize) >= size) {
			bcnt = b->size / b->bsize;		
			bneed = (size / b->bsize) * b->bsize < size ? size / b->bsize + 1 : size / b->bsize;
			bm = (uint8*)b->bm;
			
			for (x = (b->lfb + 1 >= bcnt ? 0 : b->lfb + 1); x != b->lfb; ++x) {
				/* just wrap around */
				if (x >= bcnt) {
					x = 0;
				}		
				
				/*
					this is used to allocate on specified boundaries larger than the block size
				*/
				if ((((x * b->bsize) + b->data) & bound) != 0)
					continue;	
				
				if (bm[x] == 0) {	
					/* count free blocks */
					max = bcnt - x;
					for (y = 0; bm[x + y] == 0 && y < bneed && y < max; ++y);

					/* we have enough, now allocate them */
					if (y == bneed) {
						/* find ID that does not match left or right */
						nid = k_heapBMGetNID(bm[x - 1], bm[x + y]);
						
						/* allocate by setting id */
						for (z = 0; z < y; ++z) {
							bm[x + z] = nid;
						}
						/* optimization */
						b->lfb = (x + bneed) - 2;

						/* count used blocks NOT bytes */
						b->used += y;
						return (void*)((x * b->bsize) + b->data);
					}
					
					/* x will be incremented by one ONCE more in our FOR loop */
					x += (y - 1);
					continue;
				}
			}
		}
	}
	
	return 0;
}

void k_heapBMSet(KHEAPBM *heap, uintptr ptr, uintptr size, uint8 rval) {
	KHEAPBLOCKBM		*b;	
	uintptr				ptroff, endoff;
	uint32				bi, x, ei;
	uint8				*bm;
	uint8				id;
	uint32				max;
	
	for (b = heap->fblock; b; b = b->next) {
		/* check if region effects block */
		if (
			/* head end resides inside block */
			(ptr >= b->data && ptr < b->data + b->size) ||
			/* tail end resides inside block */
			((ptr + size) >= b->data && (ptr + size) < b->data + b->size) ||
			/* spans across but does not start or end in block */
			(ptr < b->data && (ptr + size) > b->data + b->size)
		) {
			/* found block */
			if (ptr >= b->data) {
				ptroff = ptr - b->data;  /* get offset to get block */
				/* block offset in BM */
				bi = ptroff / b->bsize;
			} else {
				/* do not start negative on bitmap */
				bi = 0;
			}
			
			/* access bitmap pointer in local variable */
			bm = b->bm;

			ptr = ptr + size;
			endoff = ptr - b->data;
			
			/* end index inside bitmap */
			ei = (endoff / b->bsize) * b->bsize < endoff ? (endoff / b->bsize) + 1 : endoff / b->bsize;
			++ei;
			
			/* region could span past end of a block so adjust */
			max = b->size / b->bsize;
			max = ei > max ? max : ei;
			
			/* set bitmap buckets */
			for (x = bi; x < max; ++x) {
				bm[x] = rval;
			}
			
			/* update free block count */
			if (rval == 0) {
				b->used -= ei - bi;
			} else {
				b->used += ei - bi;
			}
			
			/* do not return as region could span multiple blocks.. so check the rest */
		}
	}
	
	/* this error needs to be raised or reported somehow */
	return;
}

void k_heapBMFree(KHEAPBM *heap, void *ptr) {
	KHEAPBLOCKBM		*b;	
	uintptr				ptroff;
	uint32				bi, x;
	uint8				*bm;
	uint8				id;
	uint32				max;
	
	for (b = heap->fblock; b; b = b->next) {
		if ((uintptr)ptr > b->data && (uintptr)ptr < b->data + b->size) {
			/* found block */
			ptroff = (uintptr)ptr - b->data;  /* get offset to get block */
			/* block offset in BM */
			bi = ptroff / b->bsize;
			/* .. */
			bm = b->bm;
			/* clear allocation */
			id = bm[bi];

			max = b->size / b->bsize;
			for (x = bi; bm[x] == id && x < max; ++x) {
				bm[x] = 0;
			}
			/* update free block count */
			b->used -= x - bi;
			return;
		}
	}
	
	/* this error needs to be raised or reported somehow */
	return;
}

Example Of Making The Most Of Memory On An ARM System

Just for an example. This is how I might start my memory management on an ARM based platform, but it could be very similar for the X86/X64 or another architecture. I like the ARM because it is a bit simpler. The X86 has many more memory areas that need to be protected.

/* the intial kernel stack and the exception stack */
#define KSTACKSTART 0x2000		/* descending */
#define KSTACKEXC   0x3000		/* descending */
/* somewhere to place the kernel state structure */
#define KSTATEADDR	0x3000
/* the address of the start of usable RAM and it's length in bytes */
#define KRAMADDR	0x4000
#define KRAMSIZE	(1024 * 1024 * 8)
/* the size of a physical memory page */
#define KPHYPAGESIZE		4096
/* the block size of the chunk heap (kmalloc, kfree) */
#define KCHKHEAPBSIZE		16

/* KSTATE is a structure that serves as a global variable area for the kernel */
ks = (KSTATE*)KSTATEADDR;

k_heapBMInit(&ks->hphy);
k_heapBMInit(&ks->hchk);
/* get a bit of memory to start with for small chunk */
k_heapBMAddBlock(&ks->hchk, 4 * 7, KRAMADDR - (4 * 7), 16);

/* state structure */
k_heapBMSet(&ks->hchk, KSTATEADDR, sizeof(KSTATE), 5);
/* stacks (can free KSTACKSTART later) */
k_heapBMSet(&ks->hchk, KSTACKSTART - 0x1000, 0x1000, 6);
k_heapBMSet(&ks->hchk, KSTACKEXC - 0x1000, 0x1000, 7);

/* add block but place header and bitmap in chunk heap to keep alignment */
k_heapBMAddBlockEx(&ks->hphy, KRAMADDR, KRAMSIZE - KRAMADDR, 0x1000, (KHEAPBLOCKBM*)k_heapBMAlloc(&ks->hchk, sizeof(KHEAPBLOCKBM)), (uint8*)k_heapBMAlloc(&ks->hchk, k_heapBMGetBMSize(KRAMSIZE - KRAMADDR, 0x1000)), 0);
	
/* remove kernel too (you might have to remove it from hchk or hphy), in my case hphy */
k_heapBMSet(&ks->hphy, 0x10000, 0x4000, 8);

Something to handle is the amount of memory needed to map the maximum memory installed. You may need to dynamically calculate KRAMADDR so you have enough space to allocate for the KHEAPBLOCKBM structure and the bitmap. For example for 4GB of memory and a 4K page size your bitmap would be 1MB in size. A simple check that displays a message may be sufficient. Such as a warning message that says the system was unable to initialize and why. But, actually increasing KRAMADDR if needed should work fine.

My chunk heap (small sized allocation heap) uses 16-byte blocks and is initial setup between memory address 4*7 and KRAMSTART. The RAM actually starts at 0x0 so it is a bit misleading. I then allocate the memory needed for the physical page heap from my chunk heap and set it to go from KRAMADDR with size KRAMADDR - KRAMSIZE since I used a bit out of the total RAM size.

Then I set each region as used in my small chunk heap where the initial kernel stack resides and the exception stack. I also skipped past the exception table with 4*7. This helps to make the most of my potentially limited memory by allocating 16-byte blocks to live between those areas. Once task switching is up and running and I can free the area occupied by the initial kernel stack.

Also, notice I use a 6 for the initial kernel stack and a 7 for the exception stack. This is because if I used the same number and tried to free the initial kernel stack it would free both. Unless, I used the k_heapBMSet and gave it a 0 which is also another way, but I just preferred to keep them separate so I did not have to reference the KSTACKSTART again when releasing the memory.

Original Implementation

For the original implementation see Bitmap Heap Implementation. I kept these separate to reduce the complexity of people just looking for a basic heap implementation with only the minimal necessary functionality. I would recommend that check out that page just to get an idea of what has changed.