A Virtual File System (VFS) is neither an on-disk file system nor a network file system. As such, it is neither a data structure (like ReiserFS, NTFS or FAT) nor a network protocol (like NFS). It is just an abstraction that many operating systems provide to applications, so don't let the name scare you.
Virtual file systems separate the high-level interface to the file system from the low-level interfaces that different implementations (FAT, ext3, etc.) may require, giving applications transparent access to storage devices. This allows for greater flexibility, especially if one wants to support several file systems.
The VFS sits between the higher-level user-space operations on the file system and the file system drivers. Having a VFS interface implies having the idea of mount points. A mount point is a path inside the virtual file system tree that represents an in-use file system. This file system may be on a local device, in memory or stored on a networked device.
Disambiguation of the underlying semantics of a VFS
A VFS, in concrete terms then, provides a uniform access path and subsystem for a group of file systems of the same type. To date, there are three common types of file system: hierarchical (the most common), tag-based and database file systems.
The VFS model used in DOS and Microsoft Windows assigns a letter of the alphabet to each accessible file system on the machine. This type of VFS is the simplest to implement, but it is restricted to 26 mounted file systems and becomes increasingly complex as features are added.
When a file is requested, the VFS checks which drive the file is on and then passes the request on to the relevant driver.
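The drive-letter lookup described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the `fs_driver` structure, the `drives` table and the `vfs_assign` and `vfs_resolve` functions are hypothetical names invented for this example, and the code assumes an ASCII-compatible character set.

```c
#include <stddef.h>

#define MAX_DRIVES 26   /* one slot per letter, A: through Z: */

/* Hypothetical driver record; a real one would also carry function pointers. */
typedef struct fs_driver {
    const char *name;
} fs_driver;

static fs_driver *drives[MAX_DRIVES];

/* Map a drive letter to a table index, or -1 if it is not a letter. */
static int drive_index(char c)
{
    if (c >= 'A' && c <= 'Z') return c - 'A';
    if (c >= 'a' && c <= 'z') return c - 'a';
    return -1;
}

/* Attach a driver to a drive letter; returns 0 on success, -1 on a bad letter. */
int vfs_assign(char letter, fs_driver *drv)
{
    int i = drive_index(letter);
    if (i < 0)
        return -1;
    drives[i] = drv;
    return 0;
}

/* Resolve "C:/dir/file": pick the driver by letter and hand back the
 * remainder of the path (everything after the colon) through *rest. */
fs_driver *vfs_resolve(const char *path, const char **rest)
{
    if (path == NULL || path[0] == '\0' || path[1] != ':')
        return NULL;
    int i = drive_index(path[0]);
    if (i < 0 || drives[i] == NULL)
        return NULL;
    if (rest != NULL)
        *rest = path + 2;
    return drives[i];
}
```

The fixed-size table is what makes this model cheap to implement, and it is also where the limit of 26 mounted file systems comes from.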
Mount Point List
A more complex model is that of a mount point list. This system maintains a list of mounted file systems and where they are mounted. When a file is requested, the list is scanned to determine which file system the file is on. The rest of the path is then passed on to that file system's driver to fetch the file. This design is quite versatile but becomes slow when a large number of mount points are in use.
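As a rough sketch of the list scan described above, the following C code walks a linked list of mounts and picks the longest mount point that covers the requested path. The `mount` structure and the `vfs_find_mount` function are hypothetical names for this example, and choosing the longest matching prefix is an assumption about how nested mount points would be handled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical mount record: where a file system is attached and who serves it. */
typedef struct mount {
    const char   *point;    /* absolute path such as "/" or "/mnt/usb" */
    const char   *fs_name;  /* stand-in for a pointer to the real driver */
    struct mount *next;
} mount;

/* Nonzero if `point` (of length `len`) is a prefix of `path` that ends
 * on a path component boundary. */
static int mount_covers(const char *point, const char *path, size_t len)
{
    if (strncmp(point, path, len) != 0)
        return 0;
    if (len == 1 && point[0] == '/')    /* the root covers every absolute path */
        return 1;
    return path[len] == '/' || path[len] == '\0';
}

/* Scan the whole list, keep the longest covering mount point, and return
 * the remainder of the path for the driver through *rest. */
const mount *vfs_find_mount(const mount *list, const char *path, const char **rest)
{
    const mount *best = NULL;
    size_t best_len = 0;

    for (const mount *m = list; m != NULL; m = m->next) {
        size_t len = strlen(m->point);
        if (mount_covers(m->point, path, len) && (best == NULL || len > best_len)) {
            best = m;
            best_len = len;
        }
    }
    if (best != NULL && rest != NULL)
        *rest = (best_len == 1) ? path : path + best_len;  /* root keeps its slash */
    return best;
}
```

Every lookup visits every entry, which is the speed problem mentioned above: the cost grows linearly with the number of mount points.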
A VFS model that can be very efficient is the node graph. This model maintains a graph of file system nodes, each of which can represent a file, folder, mount point or other type of file. A node graph can be faster to traverse than a list, but it is more complex and, if a large number of nodes is needed, can take up a large amount of memory.
Each node in a node graph stores the name, permissions and inode in a structure, along with pointers to file I/O functions such as Read, Write, Read Dir and Find Dir.
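One possible shape for such a node is sketched below in C. The field names, the function pointer signatures and the `hello_node` demo file are all assumptions made for this example; real kernels differ in the details. The point is that generic code such as `vfs_read` only ever calls through the pointers and never needs to know which driver is behind a node.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct vfs_node;

/* Hypothetical signatures for the per-node I/O functions. */
typedef size_t (*read_fn)(struct vfs_node *, size_t offset, size_t size, uint8_t *buf);
typedef size_t (*write_fn)(struct vfs_node *, size_t offset, size_t size, const uint8_t *buf);
typedef struct vfs_node *(*readdir_fn)(struct vfs_node *, size_t index);
typedef struct vfs_node *(*finddir_fn)(struct vfs_node *, const char *name);

typedef struct vfs_node {
    const char *name;
    uint32_t    permissions;
    uint32_t    inode;
    read_fn     read;       /* any pointer may be NULL if unsupported */
    write_fn    write;
    readdir_fn  readdir;
    finddir_fn  finddir;
} vfs_node;

/* Generic entry points: dispatch to the driver, or report 0 bytes. */
size_t vfs_read(vfs_node *node, size_t offset, size_t size, uint8_t *buf)
{
    return (node != NULL && node->read != NULL) ? node->read(node, offset, size, buf) : 0;
}

size_t vfs_write(vfs_node *node, size_t offset, size_t size, const uint8_t *buf)
{
    return (node != NULL && node->write != NULL) ? node->write(node, offset, size, buf) : 0;
}

/* Demo "driver": a read-only file whose contents live in a static array. */
static const char hello_text[] = "hello";

static size_t hello_read(vfs_node *node, size_t offset, size_t size, uint8_t *buf)
{
    size_t total = sizeof hello_text - 1;   /* exclude the terminating NUL */
    (void)node;
    if (offset >= total)
        return 0;
    if (size > total - offset)
        size = total - offset;              /* clamp to the end of the file */
    memcpy(buf, hello_text + offset, size);
    return size;
}

vfs_node hello_node = {
    .name = "hello.txt", .permissions = 0444, .inode = 1,
    .read = hello_read,     /* write, readdir and finddir stay NULL */
};
```

With this layout, a directory node would supply `readdir` and `finddir` instead of `read` and `write`, and a mount point could be modelled as a node whose pointers lead into another file system's driver.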
These models are the basic designs on which a VFS can be built, but each has its problems. Scanning a list of mount points and then passing the remainder of the path to the file system is usable for a simple OS, but it requires a large amount of repeated code, because each driver must be able to parse a path reliably. A node graph, on the other hand, requires a node for each file and directory on the system to be present in memory at the same time; otherwise features like mount points would have to be constantly refreshed.
A compromise between these two systems is to keep a list of mounted file systems and use it to determine which mount point a file lies on, and then to store file information and methods in nodes that do not have to reside in memory permanently.
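A minimal sketch of this compromise, under stated assumptions, might look like the C code below: a mount list selects the driver, and the driver fills in a caller-provided node on demand, so no node has to stay resident. The names `lazy_node`, `lazy_mount`, `vfs_lookup` and the demo driver are hypothetical and exist only for this example.

```c
#include <stddef.h>
#include <string.h>

/* Node filled in on demand; the caller owns the storage and may discard it. */
typedef struct lazy_node {
    char     name[32];
    unsigned inode;
} lazy_node;

/* Hypothetical mount record with a per-driver lookup routine. */
typedef struct lazy_mount {
    const char *point;                                    /* e.g. "/mnt/demo" */
    int (*lookup)(const char *rel_path, lazy_node *out);  /* 0 if found, -1 if not */
    struct lazy_mount *next;
} lazy_mount;

/* Find the longest mount point covering `path`, then ask its driver to
 * describe the file at the remaining path. Returns 0 on success, -1 otherwise. */
int vfs_lookup(const lazy_mount *list, const char *path, lazy_node *out)
{
    const lazy_mount *best = NULL;
    size_t best_len = 0;

    for (const lazy_mount *m = list; m != NULL; m = m->next) {
        size_t len = strlen(m->point);
        int is_root = (len == 1 && m->point[0] == '/');
        if (strncmp(m->point, path, len) != 0)
            continue;
        if (!is_root && path[len] != '/' && path[len] != '\0')
            continue;                       /* prefix ends mid-component */
        if (best == NULL || len > best_len) {
            best = m;
            best_len = len;
        }
    }
    if (best == NULL || best->lookup == NULL)
        return -1;
    return best->lookup(best_len == 1 ? path : path + best_len, out);
}

/* Demo driver that knows a single file, "/readme". */
static int demo_lookup(const char *rel_path, lazy_node *out)
{
    if (strcmp(rel_path, "/readme") != 0)
        return -1;
    strcpy(out->name, "readme");
    out->inode = 7;
    return 0;
}

lazy_mount demo_mount = { "/mnt/demo", demo_lookup, NULL };
```

In this sketch the driver still parses the remainder of the path itself; a fuller design might instead walk the path one component at a time through a per-node find-directory function, possibly with a cache of recently used nodes.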
- The VFS and the initrd in JamesM's Kernel Development Tutorials
- 2001 Linux VFS article part 1 and part 2 on FreeOS
- 1996 Linux VFS article in the Linux Kernel Hackers' Guide
- The Use of Name Spaces in Plan 9 - Describes how Plan 9 takes advantage of both private namespaces and a VFS that matches the 9P distributed file system protocol.