USTAR

The USTAR format was originally designed for tapes. It is very widely used, and tools to create tar archives are available for practically every OS. Treated as a filesystem, it is also very simple and extremely easy to implement.

As USTAR is a POSIX standard (POSIX.1-1988 and POSIX.1-2001), it is well defined and very well documented. Plenty of example code is available, and the GNU tar utility is open source.

Tar uses 512-byte blocks, the same size as the sectors of floppies and traditional disks, so an archive can be written to a floppy or a partition as-is.

Format Details

Each file and directory has a 512-byte header block containing its metadata (think of it as an i-node that also holds the filename). If the file is not empty, the header block is followed by data blocks holding the file contents, padded up to a multiple of 512 bytes.
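The rounding rule above can be sketched as two small helpers. The names `TAR_BLOCK_SIZE`, `tar_data_blocks` and `tar_entry_span` are made up for this illustration and are not part of any standard API.

```c
#include <stddef.h>

#define TAR_BLOCK_SIZE 512

/* Number of 512-byte data blocks occupied by a file of `size` bytes
 * (0 for an empty file). */
static size_t tar_data_blocks(size_t size)
{
    return (size + TAR_BLOCK_SIZE - 1) / TAR_BLOCK_SIZE;
}

/* Distance in bytes from one header to the next: the header block
 * itself plus the padded data blocks. */
static size_t tar_entry_span(size_t size)
{
    return (1 + tar_data_blocks(size)) * TAR_BLOCK_SIZE;
}
```

A 1025-byte file, for instance, needs three data blocks, so the next header starts 2048 bytes after the current one.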

Offset  Size  Description
0       100   File name
100     8     File mode (octal)
108     8     Owner's numeric user ID (octal)
116     8     Group's numeric group ID (octal)
124     12    File size in bytes (octal)
136     12    Last modification time in numeric Unix time format (octal)
148     8     Checksum for header record
156     1     Type flag
157     100   Name of linked file
257     6     UStar indicator "ustar" then NUL
263     2     UStar version "00"
265     32    Owner user name
297     32    Owner group name
329     8     Device major number
337     8     Device minor number
345     155   Filename prefix
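For reference, the layout above could be expressed as a C structure along the following lines. This is only a sketch: the field names follow the usual POSIX naming, while the trailing `pad` member is an assumption added here to fill the 500 bytes of fields up to a full 512-byte block. The compile-time checks need a C11 compiler.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */

/* One 512-byte USTAR header block. Every member is a char array,
 * so the compiler inserts no padding between fields. */
struct ustar_header {
    char name[100];      /* offset   0 */
    char mode[8];        /* offset 100 */
    char uid[8];         /* offset 108 */
    char gid[8];         /* offset 116 */
    char size[12];       /* offset 124, ASCII octal */
    char mtime[12];      /* offset 136, ASCII octal */
    char chksum[8];      /* offset 148 */
    char typeflag;       /* offset 156 */
    char linkname[100];  /* offset 157 */
    char magic[6];       /* offset 257, "ustar" then NUL */
    char version[2];     /* offset 263, "00" */
    char uname[32];      /* offset 265 */
    char gname[32];      /* offset 297 */
    char devmajor[8];    /* offset 329 */
    char devminor[8];    /* offset 337 */
    char prefix[155];    /* offset 345 */
    char pad[12];        /* unused, fills the block to 512 bytes */
};

static_assert(sizeof(struct ustar_header) == 512, "header must be one block");
static_assert(offsetof(struct ustar_header, size) == 124, "size field offset");
static_assert(offsetof(struct ustar_header, magic) == 257, "magic field offset");
static_assert(offsetof(struct ustar_header, prefix) == 345, "prefix field offset");
```

Casting a block buffer to `struct ustar_header *` then gives named access to the fields instead of raw offsets.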

The only trick is that the file size is not stored in binary, but as an ASCII octal string. For example, 1025 is stored as the 11 digits '00000002001' followed by a terminating NUL (or space), which together fill the 12-byte field.
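To illustrate the encoding, a writer could fill the size field roughly as follows. `tar_format_size` is a hypothetical helper name, and the sketch assumes the common convention of 11 octal digits followed by a NUL.

```c
#include <stdio.h>

/* Write `size` into a 12-byte USTAR size field as 11 zero-padded
 * octal digits plus a terminating NUL.
 * Returns 0 on success, -1 if the value needs more than 11 digits. */
static int tar_format_size(char field[12], unsigned long long size)
{
    if (size >= (1ULL << 33))   /* 8^11 == 2^33, the first value that does not fit */
        return -1;
    snprintf(field, 12, "%011llo", size);
    return 0;
}
```

Calling `tar_format_size(field, 1025)` leaves the string "00000002001" in `field`. The 11-digit limit is also why a plain USTAR entry cannot describe a file of 8 GiB or more.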

The Type flag field tells what kind of entry the header describes.

Type flag Meaning
'0' or (ASCII NUL) Normal file
'1' Hard link
'2' Symbolic link
'3' Character device
'4' Block device
'5' Directory
'6' Named pipe (FIFO)
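A directory listing routine might translate the flag into text with a simple switch. The function name `tar_type_name` is an assumption for this sketch, which covers only the values listed in the table above and reports anything else as unknown.

```c
/* Map a USTAR type flag byte to a short description. */
static const char *tar_type_name(char typeflag)
{
    switch (typeflag) {
    case '\0':                     /* older archives use NUL for normal files */
    case '0': return "normal file";
    case '1': return "hard link";
    case '2': return "symbolic link";
    case '3': return "character device";
    case '4': return "block device";
    case '5': return "directory";
    case '6': return "named pipe (FIFO)";
    default:  return "unknown";
    }
}
```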

That is pretty much all you need to know for a basic implementation.

Example Code

We need a helper function to convert an ASCII octal number into binary:

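/* Convert an ASCII octal field of at most 'size' characters to an integer.
 * Stops early at the first non-octal character (the NUL or space terminator). */
int oct2bin(const unsigned char *str, int size)
{
    int n = 0;
    const unsigned char *c = str;
    while (size-- > 0 && *c >= '0' && *c <= '7') {
        n *= 8;
        n += *c - '0';
        c++;
    }
    return n;
}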

Then file lookup is as simple as:

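#include <string.h>

/* Returns the file size and stores a pointer to the file data in *out.
 * Returns 0 if the file is not found (*out is then left untouched). */
int tar_lookup(unsigned char *archive, char *filename, char **out)
{
    unsigned char *ptr = archive;

    /* every valid header carries the "ustar" magic at offset 257 */
    while (!memcmp(ptr + 257, "ustar", 5)) {
        int filesize = oct2bin(ptr + 0x7c, 11);   /* size field at offset 124 */
        if (!memcmp(ptr, filename, strlen(filename) + 1)) {
            *out = (char *)(ptr + 512);           /* data follows the header block */
            return filesize;
        }
        /* skip the header block plus the data rounded up to 512 bytes */
        ptr += (((filesize + 511) / 512) + 1) * 512;
    }
    return 0;
}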
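The lookup loop trusts the "ustar" magic alone. A stricter reader could also verify the checksum field at offset 148, which holds the sum of all 512 header bytes taken as unsigned values, with the checksum field itself counted as eight ASCII spaces. The following is a minimal sketch of that check; the names `tar_checksum` and `tar_header_valid` are made up for this example.

```c
#include <stddef.h>

/* Sum all 512 header bytes as unsigned values, counting the 8-byte
 * checksum field (offsets 148..155) as ASCII spaces. */
static unsigned int tar_checksum(const unsigned char *header)
{
    unsigned int sum = 0;
    for (size_t i = 0; i < 512; i++)
        sum += (i >= 148 && i < 156) ? (unsigned int)' ' : header[i];
    return sum;
}

/* Returns 1 if the stored checksum matches the computed one, else 0. */
static int tar_header_valid(const unsigned char *header)
{
    unsigned int stored = 0;
    size_t i = 148;

    /* tolerate leading spaces, then read octal digits up to the
     * first non-octal character (NUL or space terminator) */
    while (i < 156 && header[i] == ' ')
        i++;
    for (; i < 156 && header[i] >= '0' && header[i] <= '7'; i++)
        stored = stored * 8 + (unsigned int)(header[i] - '0');

    return stored == tar_checksum(header);
}
```

An all-zero block, such as the padding that ends an archive, fails this check, because its computed sum is 256 (eight spaces) while the stored value parses as 0.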

If you don't load the entire archive into memory, you have to read the header block from the disk at the beginning of the loop (before 'int filesize'), and seek to the next header, skipping any data blocks, at the end of the loop (instead of 'ptr +=').
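A block-at-a-time variant might look like the sketch below. Everything about the device interface is an assumption made for illustration: `read_block_fn` stands in for whatever sector-read routine your kernel provides, and `struct mem_disk` with `mem_read_block` is a RAM-backed stand-in so the example is self-contained. The sketch ignores the filename prefix field and rejects names longer than 100 characters.

```c
#include <string.h>

/* Hypothetical driver hook: read 512-byte block number `lba` into `buf`,
 * returning 0 on success and nonzero on failure. */
typedef int (*read_block_fn)(void *dev, unsigned long lba, unsigned char *buf);

/* Stand-in "disk" for illustration: an archive image held in RAM. */
struct mem_disk { const unsigned char *image; unsigned long blocks; };

static int mem_read_block(void *dev, unsigned long lba, unsigned char *buf)
{
    const struct mem_disk *d = dev;
    if (lba >= d->blocks)
        return -1;
    memcpy(buf, d->image + lba * 512UL, 512);
    return 0;
}

/* Find `filename`. On success, store the block number of its first data
 * block in *data_lba and return the file size. Returns -1 if the file is
 * not found or a read fails. */
static long tar_find_on_disk(read_block_fn read_block, void *dev,
                             const char *filename, unsigned long *data_lba)
{
    unsigned char hdr[512];
    unsigned long lba = 0;

    if (strlen(filename) > 100)
        return -1;

    for (;;) {
        /* load the header block at the start of each iteration */
        if (read_block(dev, lba, hdr) != 0)
            return -1;
        if (memcmp(hdr + 257, "ustar", 5) != 0)
            return -1;                       /* no more entries */

        /* parse the octal size field at offset 124 */
        long size = 0;
        for (int i = 0; i < 11 && hdr[124 + i] >= '0' && hdr[124 + i] <= '7'; i++)
            size = size * 8 + (hdr[124 + i] - '0');

        if (strncmp((const char *)hdr, filename, 100) == 0) {
            *data_lba = lba + 1;             /* data starts right after the header */
            return size;
        }
        /* seek past this header and its padded data blocks */
        lba += 1 + (unsigned long)((size + 511) / 512);
    }
}
```

Only one 512-byte buffer is needed, since the data blocks of non-matching entries are skipped without being read. On platforms where `long` is 32 bits, sizes of 2 GiB and above would need a wider type.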
