User:Requimrar/FAT32
FAT32 Implementation
Hello. Unlike the mainline FAT article on the Wiki, this is entirely focused on FAT32 and that only. It also should have no naming inconsistencies. Note that my codebase is in C++, and the FAT32 driver is an object, so "this->" is seen frequently. Do not expect to copy and paste code. Also, this article will not have tables or whatever reference material, such as byte offsets for each value, etc. The closest you'll get to that here is a struct {} definition. For the actual spec, use either the fatgen103 document or the Wikipedia article.
The code below is entirely to explain how FAT32 works. It is not meant to be implemented in an actual system. With an actual driver you need more abstraction, and should be dealing with VFS nodes and not cluster numbers, for instance.
There are also fictitious types, such as List<> and Vector<> (the angled brackets are c++ generic templates), it is up to you, the reader, to implement this yourself. It also uses fictitious methods like AllocateBuffer, foreach(type name in list), etc. This will not compile, and don't expect it to.
Helper Functions
First up, some helper functions. Make these file-static or something.
struct DirectoryEntry
{
char name[8];
char ext[3];
uint8_t attrib;
uint8_t userattrib;
char undelete;
uint16_t createtime;
uint16_t createdate;
uint16_t accessdate;
uint16_t clusterhigh;
uint16_t modifiedtime;
uint16_t modifieddate;
uint16_t clusterlow;
uint32_t filesize;
} __attribute__ ((packed));
uint64_t FAT32FSDriver::ClusterToLBA(uint32_t cluster)
{
return this->FirstUsableCluster + cluster * this->SectorsPerCluster - (2 * this->SectorsPerCluster);
}
The first struct should be fairly self-explanatory, it is simply a packed struct representing the values in a directory entry. The second function is used to convert a cluster number (say cluster 4) into an LBA address you can use to read from a disk (like 4412). Do not worry about the values, I'll explain them later.
BPB Setup
I'll just provide code here. It should be self explanatory. This is also where FirstUsableCluster is defined, it is simply the first data cluster in the FAT volume. (ie. after the two FATS)
this->BytesPerSector = *((uint16_t*)((uintptr_t) fat + 11));
this->SectorsPerCluster = *((uint8_t*)((uintptr_t) fat + 13));
this->ReservedSectors = *((uint16_t*)((uintptr_t) fat + 14));
this->NumberOfFATs = *((uint8_t*)((uintptr_t) fat + 16));
this->NumberOfDirectories = *((uint16_t*)((uintptr_t) fat + 17));
if((uint16_t)(*((uint16_t*)(fat + 17))) > 0)
this->TotalSectors = *((uint16_t*)((uintptr_t) fat + 17));
else
this->TotalSectors = *((uint32_t*)((uintptr_t) fat + 32));
this->HiddenSectors = *((uint32_t*)((uintptr_t) fat + 28));
this->FATSectorSize = *((uint32_t*)((uintptr_t) fat + 36));
this->RootDirectoryCluster = *((uint32_t*)((uintptr_t) fat + 44));
this->FSInfoCluster = *((uint16_t*)((uintptr_t) fat + 48));
this->BackupBootCluster = *((uint16_t*)((uintptr_t) fat + 50));
this->FirstUsableCluster = this->partition->GetStartLBA() + this->ReservedSectors
+ (this->NumberOfFATs * this->FATSectorSize);
As I said, it should be fairly self explanatory, since the variable names are quite verbose. One point to note: since we're dealing only with FAT32, we can assume a FAT32 EBPB exists. In your own code, check that you're actually dealing with a FAT32 volume.
Traversing a Path
I shall assume that you know how to split a path like /usr/local/lib/libc.a into its named components, and I assume you know how to iterate through a list to match them up. The following code will not take care of nuances you need to be aware of, such as the same name on different levels, eg. you have /folder/name and /folder/something/name. There are probably more efficient ways to write this.
// The actual function will probably return a value, such as a found/not found, or a VFS node or whatever.
// Again, this is not meant to be copied.
void Traverse(const char* path)
{
// or whatever here, basically split the paths into its components.
List<const char*> pathparts = split(path, '/');
// get the root directory.
List<FSObject>* dirs = ReadDir(this->RootDirectoryCluster);
// iterate through it.
foreach(FSObject fso in dirs)
{
// again, FSObject should be a VFS node of some kind, with FS specific information stored in a pointer within.
// I'll just assume it has a member called 'cluster' at this point.
// TODO
}
}
// The following function takes a cluster number of a directory, then lists its contents.
List<FSObject>* ReadDir(uint32_t cluster)
{
// if the cluster number is zero, assume we're trying to read from the root directory.
if(cluster == 0)
cluster = this->RootDirectoryCluster;
// Get the cluster chain of the directory. This contains the directory entries themselves.
// We will implement GetClusterChain later.
uint64_t numclus = 0;
List<uint32_t>* clusters = this->GetClusterChain(node, &numclus);
// try and read each cluster into a contiguous buffer.
// this is the total number of bytes the directory is (in terms of on-disk dirents)
uint64_t dirsize = numclus * this->SectorsPerCluster * 512;
// allocate a buffer, rounding up the size to a page.
uint64_t buf = AllocateBuffer(numclus * this->SectorsPerCluster * 512);
uint64_t obuf = buf;
assert(clusters);
foreach(uint32_t cluster in clusters)
{
// read "SectorsPerCluster * 512" bytes at the LBA "ClusterToLBA(cluster)" into buf.
ReadFromDisk(ClusterToLBA(cluster), buf, this->SectorsPerCluster * 512);
// increment buf, so we don't trash its contents.
buf += this->SectorsPerCluster * 512;
}
buf = obuf;
// we now have the complete data of the directory.
// ... to be implemented
}
This is the basic framework for reading a directory's contents: get its cluster (which is provided as an argument), get the cluster chain from the FAT table, read all the clusters into a buffer. It should be trivial to, for instance, do a for or while loop to iterate through each DirectoryEntry, incrementing your pointer offset by sizeof(DirectoryEntry)
each time.
You may have noticed the missing GetClusterChain() method, which returns a list of clusters (since FAT is a linked list filesystem), gotten from the FAT. We will look at it now.
Reading a Cluster Chain
List<uint32_t>* GetClusterChain(uint32_t firstcluster, uint64_t* numclus)
{
// setup some stuff.
uint32_t Cluster = firstcluster;
uint32_t cchain = 0;
List<uint32_t>* ret = new List<uint32_t>();
// here we assume your 'ReadFromDisk' only does 512 bytes at a time or something.
auto buf = AllocateBuffer(512);
auto obuf = buf;
do
{
// these formulas are gotten from the 'fatgen103' document.
// 'FatSector' is the LBA of the disk at which the FAT entry you want is located.
// 'FatOffset' is the offset in bytes from the beginning of 'FatSector' to the cluster.
uint32_t FatSector = (uint32_t) this->partition->GetStartLBA() + this->ReservedSectors + ((Cluster * 4) / 512);
uint32_t FatOffset = (Cluster * 4) % 512;
// read 512 bytes into buf.
ReadFromDisk(FatSector, buf, 512);
// cast to an array so we get an easier time.
uint8_t* clusterchain = (uint8_t*) buf;
// using FatOffset, we just index into the array to get the value we want.
cchain = *((uint32_t*)&clusterchain[FatOffset]) & 0x0FFFFFFF;
// the value of 'Cluster' will change by the next iteration.
// Because we're nice people, we need to include the first cluster in the list of clusters we return.
ret->push_back(Cluster);
// since cchain is the next cluster in the list, we just modify the things, shouldn't be too hard to grasp.
Cluster = cchain;
// numclus tells the caller how many clusters are in the chain.
(*numclus)++;
} while((cchain != 0) && !((cchain & 0x0FFFFFFF) >= 0x0FFFFFF8));
// check that there's more entries in the chain, if not break the loop.
// the reason for using a do-while should be obvious.
return ret;
}
Most of the difficulty in understanding FAT is the math, calculating the sectors and clusters and whatnot. The above function is as simple as it gets. It reads one sector of the FAT at a time, then follows the linked list. You can add optimisations, such as reading more than 512 bytes at a time, or checking if you even need to read again (since many FAT entries are consecutive, especially on a nearly-empty volume)
Allocating a Cluster
This begins write support. I have yet to port that functionality from my old kernel into my current one, so this will take a while.
TODO.
Conclusion
I hope that this article gave you a clearer and less confusing picture on how FAT32 works and the concepts behind it. Feel free to find me on irc at #osdev: @zhiayang, or anybody else online.