LEAN

From OSDev Wiki
Jump to navigation Jump to search
Filesystems
Virtual Filesystems

VFS

Disk Filesystems
CD/DVD Filesystems
Network Filesystems
Flash Filesystems

LEAN is a (recursive) backronym that stands for "Lean yet Effective Allocation and Naming" and is an alternative to the FAT filesystem used by the FreeDOS-32 and FYSOS projects.

Information

LEAN uses an extent-based allocation algorithm and is designed to use the little-endian byte ordering. There have been a total of 8 revisions (at the time of writing) and the changelog for each release can be found in the specification (see link below). LEAN provides the following features:

  • Supports Unicode long file names (in contrary to FAT long names, patented by Microsoft).
  • Supports extended attributes (name-value metadata, embedded in files or forks).
  • Requires only a small memory footprint and isn't too intensive on the CPU.
  • Fairly easy to implement and to understand.
  • Suitable for both large and small volumes.
  • Hard and symbolic links.
  • POSIX-compatible.

A more detailed overview of the filesystem's actual capabilities:

  • arbitrary sized allocation unit (usually matching the sector size of the media, but not required)
  • A maximum volume size of 2^63 - 1 sectors
  • A maximum file size of 2^64 - 1 bytes
  • File names of up to 4068 bytes (case sensitive and in UTF-8 format)

Partition Layout

To improve locality in space and to help dynamically change the size of a volume, the disk is logically subdivided into bands, made up of several contiguous blocks. Bands are equally sized, perhaps except the last one, and must be made up of a power of two worth of blocks. A volume contains at least one band. Bands are numbered starting from 0, where block 0 is the first block of band 0.

Each band contains a portion of the bitmap representing the allocation state of the blocks of that band. That is, each bitmap portion represents the state of closely located blocks. The bitmap portion for each band must start at the first block of the band. Band 0 is an exception in that it contains blocks reserved for the boot loader and the superblock, thus a specific field is used to tell where the bitmap of band 0 starts. See the image below.

lean_layout.png

Superblock

The superblock is a structure containing fundamental information about a LEAN volume. The superblock must be stored in one of any block in range from 1 to 32, inclusive. The specification specifies how to find this superblock.

Member Definition
uint32_t checksum The checksum value for the superblock. All 32-bit dwords in this block or included in the calculation.
uint32_t magic This must be equal to 0x4E41454C (the 'LEAN' characters in ASCII), and it must be used to identify a valid LEAN file system. It should be used to find the superblock backup in case the primary copy gets lost or damaged.
uint16_t fsVersion This field identifies the version of the LEAN file system, and it is provided for future development. The high byte identifies the major version number and the low byte the minor version number (for example 0x0110 would mean version 1.16). At present, in the alpha development stage, it must be set to 0x0008 (that is version 0.8) and drivers must not try to access an unknown file system version, backward compatibility making no sense.
uint8_t preallocCount The count minus one of contiguous blocks that the driver should try to preallocate when a new block is to be added to a file. A value of zero means that the driver should append a single block. Values greater than zero are recommended, especially for slowly-growing files such as directories, to reduce fragmentation and improve locality.
uint8_t logBlocksPerBand The base-2 logarithm of the number of blocks per band. The latter is easily computed as 1 << logBlocksPerBand. For 512-byte blocks, it must be at least equal to 12 (4096 bits per band, that is 512 bytes of a block), so that a bitmap portion is always made up of an integral amount of blocks. This field must be equal to 13 for 1024-byte blocks, 14 for 2048-byte blocks, 15 for 4096-byte blocks, and so on. This means, when using 512-byte blocks, the smallest band size is 4096 sectors, that is 2 MiB.
uint32_t state This field identifies the state of the file system.

Bit 0 (the LSB) is the clean unmount bit: it must be set to '0' while the volume is mounted, and it must set '1' while the volume is unmounted. Thus, if a driver finds this bit set to '0' when mounting a volume, it means the volume was not unmounted properly. Bit 1 is the error bit: it must be normally '0'. Drivers must set it to '1' in case of major errors, like when file system corruption is detected. File system checkup programs may use this bit to know if some major problem happened with the volume. Bits 2 to 31 are reserved for future use.

uint8_t uuid[16] A 128-bit universally unique identifier (UUID) that should uniquely identify the volume. A driver should use it to check for media change in a removable media drive. A driver may use it for automatic volume identification. Drivers must create a sequence of 16 bytes as unique as possible, for example using a good random number generator or one of the canonical algorithms for UUID generation.
char volumeLabel[64] A descriptive string, encoded in Unicode UTF-8, null-terminated, to help the user identify the volume. It may be empty.
uint64_t blockCount The total number of blocks that form the volume.
uint64_t freeBlockCount The number of free (or available, or unallocated) blocks in the volume. A driver must keep it consistent to the rest of the file system to measure the free volume space to store files, directories and book keeping. A value of zero must mean disk full. Drivers may choose to not update this field every time there is a change if performance is a constraint or the disk has limited write capabilities (e.g. flash memories). However, this field must be updated when dismounting the volume.
uint64_t primarySuper The address of the block containing the primary copy of the superblock. A driver must use this number to know where to write changes made to the superblock back to disk. This field is provided for robustness, for example against repair utilities operating on a disk containing a LEAN file system image file.
uint64_t backupSuper The address of the block containing the backup copy of the superblock. A driver must use this number to know where to write the backup copy when changes are made to the superblock.

In the event the primary copy of the superblock gets lost or damaged, recovery utilities should search for the backup copy of the superblock looking for valid magic and checksum fields, and may also use the backupSuper field itself for further validation.

uint64_t bitmapStart The address of the block where the bitmap of the first band (that is, band 0) starts. This is, usually but not mandated, the block right after the superblock. For any other band, the bitmap must start in the first block of the band. The size of the bitmap in each band is related to the logBlocksPerBand field.
uint64_t rootInode The inode number (that is, the address of the first block) of the root directory. This is, usually but not mandated, the block right after the bitmap for band 0.
uint64_t badInode The inode number (that is, the address of the first block) of a pseudo-file to track bad blocks. This pseudo-file should be generated and managed by a file system checkup program, and should be made up of any blocks that the program detected as bad. This ensures that a driver can not allocate them. If the volume has no bad blocks, badInode must be zero. See a later section for the format of this pseudo-file.
uint64_t journalInode The inode number of the "file" representing the Journal data. This field may be zero indicating no Journal. An implementation must verify the Journal Data before using it. It is outside this specification on how this Journal is to be formatted/used. See a later section for more information.
uint8_t logBlockSize The base-2 logarithm of the size of a block in bytes when this volume was formatted. A value of 9 is 512, 10 = 1024, 11 = 2048, 12 = 4096, and so on. It is calculated as 1 << logBlockSize. This value must be at least 8.
uint8_t reserved0[7] Padding, reserved for future use. See the definition for Reserved
uint8_t reserved1[344](1) Reserved for future use. These bytes pad the size of the superblock structure to the size of the block used and must be included in checksum calculation. A value of 344 is used for 512-byte blocks. See the definition for Reserved.

(1) 344 for 512-byte blocks, 865 for 1024-byte blocks, 1880 for 2048-byte blocks, 3928 for 4096-byte blocks, etc.

Bitmap

The complete bitmap is a non-contiguous data structure, made up of blocks that, as a whole, form an array of bits, each of them indicating the allocated/unallocated state of each block of a volume. The bitmap is not contiguous because its blocks are subdivided into bands, instead of being stored in a single chunk in a fixed location. Each Band has its own bitmap.

Inode

All fundamental metadata of a file are stored in a structure called inode structure. Such file metadata include file size, attributes and time stamps, as well as the means to access directly- and indirectly-addressable extents of a file.

Member Definition
uint32_t checksum The checksum value for the inode structure. Any optional inline extended attributes or file data following the inode structure are not part of the inode structure itself, thus must not be included in checksum calculation.
uint32_t magic This must be equal to 0x45444F4E (the 'NODE' characters in ASCII).
uint8_t extentCount The number of valid extent specifications stored in the inode structure, that is the count of the first elements in the extentStarts and extentSizes arrays that refer to actual file extents, from 1 (the first extent contains the inode structure itself) to extentsPerInode = 6. Directly-addressable extents must be used from the first to the last before using indirect blocks. If extentCount < extentsPerInode, unused extent specification are undefined. If extentCount < extentsPerInode and any of indirectCount, firstIndirect, or lastIndirect is not zero, this inode is considered corrupt.
uint8_t reserved[3] Padding, reserved for future use. See the definition for Reserved.
uint32_t indirectCount The number of indirect blocks owned by the file.
uint32_t linkCount If this file is not a fork, this field must be equal to the number of hard links (the count of directory entries) referring to this file. If this file is a fork, this field must be equal to the number of non-fork files referring to this fork. Drivers may implement fork sharing and copy-on-write. In any case, for non-deleted files this field must be at least 1.
uint32_t uid The numeric identifier of the user that owns the file. Interpretation of this field is operating system dependent. A mono-user driver must use zero.
uint32_t gid The numeric identifier of the group that owns of the file. Interpretation of this field is operating system dependent. A mono-user driver must use zero.
uint32_t attributes The attributes mask for the file, see enum InodeAttributes. It includes POSIX permissions, behavior flags and the file type specification. Interpretation of permissions is operating system dependent. A mono-user driver must use the owner user permission bits only (ia?USR). Note that the "file type" values are mutually exclusive enumerated values, they are not single bits that can be combined. The iaFmtMask mask can be used to retrieve the file type.
uint64_t fileSize The size in bytes of the file data, thus excluding the size of the inode structure, the optional extra space for inline extended attributes, and indirect blocks.
uint64_t blockCount The number of data blocks (thus excluding indirect blocks, and including the first block with the inode structure) allocated for the file. Must always be at least 1. A driver may preallocate more blocks than required for a specified fileSize to implement strategies to reduce fragmentation.
int64_t accessTime Last access time stamp. This field is computed as the signed number of microseconds since 1970-01-01 00:00:00.000000 UTC, not including leap seconds. Drivers may choose to not update this field if performance is a constraint or the disk has limited write capabilities (e.g. flash memories). A mount-time flag or an open-time flag may be implemented to disable last access time stamps, and the implementation may want this to be the default. Initially it must have the same value as creationTime. If the operating system semantics allows it, drivers may delay update of this time stamp instead of updating it at every access.
int64_t statusChangeTime Status modification time stamp. This field is computed as the signed number of microseconds since 1970-01-01 00:00:00.000000 UTC, not including leap seconds. Initially it must have the same value as creationTime. If the operating system semantics allows it, drivers may delay update of this time stamp instead of updating it at every modification to the inode structure.
int64_t modificationTime Last data modification time stamp. This field is computed as the signed number of microseconds since 1970-01-01 00:00:00.000000 UTC, not including leap seconds. Initially it must have the same value as creationTime. If the operating system semantics allows it, drivers may delay update of this time stamp instead of updating it at every write to the file.
int64_t creationTime File creation time stamp. This field is computed as the signed number of microseconds since 1970-01-01 00:00:00.000000 UTC, not including leap seconds. Depending on the operating system semantics, this field may be greater than the other time stamps if this file is created as a copy of an existing file.
uint64_t firstIndirect The address of the first indirect block of the file. If the file has no indirect blocks the value of this field must be zero. Example: If there is only 1 indirect block currently being used, this will have the same value as lastIndirect.
uint64_t lastIndirect The address of the last indirect block of the file. If the file has no indirect blocks the value of this field must be zero. Example: If there is only 1 indirect block currently being used, this will have the same value as firstIndirect.
uint64_t fork If this file is not a fork, this field is the address of the first block of the optional fork associated with this file, or zero if this file has not a fork. If this file is a fork, this field must be zero.
uint64_t extentStarts[extentsPerInode] The array of starting block addresses of the extent specifications. The count of valid extent specifications, starting from index 0, is determined by the extentCount field. Any unused extent specification is unspecified. Note that the first element of this array is also -by definition- the address of the block containing this inode structure.
uint32_t extentSizes[extentsPerInode] The array of extent sizes of the extent specifications. Each element represents the size in blocks of the corresponding extent whose first block is specified in extentStarts.

Indirect Addressing

If a file needs more than extentsPerInode extents, indirect addressing is used for subsequent extents. An indirect block is a block storing an array of extentsPerIndirect extent specifications instead of file data. Indirect blocks are linked together to form a doubly-linked list in order to support arbitrary file lengths.

Member Definition
uint32_t checksum The checksum value for the indirect block. All bytes in the block do count for checksum calculation.
uint32_t magic This must be equal to 0x58444E49 (the 'INDX' characters in ASCII).
uint64_t blockCount The total number of blocks addressable using this indirect block. A driver should use this information to quickly traverse the list of indirect blocks to find the desired data block. It must be equal to the sum of the used extentSizes (see below).
uint64_t inode The inode number of the file this indirect block belongs to. This is provided for robustness.
uint64_t thisblock The address of the block storing this indirect block. This is provided for robustness.
uint64_t prevIndirect The address of the previous indirect block. If there is no previous indirect block, this field must be zero. A driver may use this value to traverse the list of indirect blocks in reverse order.
uint64_t nextIndirect The address of the next indirect block. If there is no next indirect block, this field must be zero. This is the primary means for a driver to traverse the list of indirect blocks to seek into a file.
uint16_t extentCount The number of valid extent specifications stored in the indirect block, that is the count of the first elements in the extentStarts and extentSizes arrays that refer to actual file extents. If the indirect block is not the last indirect block of a file, all extent specifications must be in use, that is extentCount must be equal to extentsPerIndirect.
uint8_t reserved0[2] Padding, reserved for future use. See the definition for Reserved.
uint32_t reserved1 Padding, reserved for future use. See the definition for Reserved.
uint64_t extentStarts[extentsPerIndirect] The array of starting block addresses of the extent specifications.
uint32_t extentSizes[extentsPerIndirect] The array of extent sizes of the extent specifications. Each element represents the size in blocks of the corresponding extent whose first block is specified in extentStarts.
uint8_t reserved2[n] Zero to eleven bytes. A 512-byte block will have zero bytes here. A 4096-byte block will have 8 bytes here. The value of these bytes are included in the checksum calculation. See the definition for Reserved.

Tools

The creators of LEAN have a reference implementation and binaries of a LEAN file manager (Windows/Debian/Linux) available on their website. See this page for more information.

More Information

The specification, linked below, details these structures and their members in more detail, as well as information on other aspects of the file system.

See Also

Threads

External Links