C Sharp

From OSDev Wiki

Please note that the correct title of this article is C#; because of technical restrictions, it is titled "C Sharp".

C# is an object-oriented programming language developed by Microsoft, with its compiler and runtime now maintained as open-source projects under the .NET Foundation, and standardized by Ecma and ISO. Its syntax is similar to C++, but with significant differences in functionality. Typically, source files written in C# are compiled into an intermediate bytecode language called CIL (also referred to as IL), which is then just-in-time compiled into machine code by the .NET runtime. As of .NET 7, traditional ahead-of-time compilation is also available. A standard library (also called a system module throughout .NET's tooling) provides well-known base classes, such as System.Object, and is automatically referenced by regular C# projects. For kernels written in C#, though, a custom standard library usually needs to be provided.

Why write a kernel in C#?

The primary benefit of writing a kernel in C# is the type and memory safety the language provides. If a region of code is not explicitly marked as unsafe, it can be assumed that it will not perform any invalid memory accesses. This idea can be extended to formally verify a region of IL bytecode as memory safe, simply by rejecting code that contains instructions known to be unsafe. As a verified region of code is certain to be well-behaved, a kernel may run it without any address space switch or protection ring change. Because two safe (or managed) processes can run in the same address space, there is potential for an extremely efficient microkernel implementation, as an IPC call would be equivalent to a plain function invocation. A similar concept has been explored with Java in the JX Operating System. It is important to note, however, that Java manages dependencies in a fundamentally different way than C#, and is known to be more restrictive.

The main disadvantage of writing a C# kernel is the complexity of the custom standard library it needs in order to function properly. In addition to the regular functions one may find in the standard libraries of Rust or C, C# also requires the system module to provide a garbage collection implementation, as well as many internal data types that the Roslyn C# compiler (csc) relies on for newer syntax features. Depending on the IL-to-native code compiler used, intrinsic features (which are often undocumented) also need to be implemented.

Writing a C# kernel also requires a capable GC. A good choice for a garbage collector living in kernel mode would be an on-the-fly GC, which only needs to stop one thread at a time. This can happen while the thread is idle (e.g. waiting for an I/O operation to complete), so the thread's execution is not noticeably paused. The kernel's code should also avoid unnecessary allocations that may stress the GC; for example, instead of allocating a temporary buffer with var example = new int[16], a better solution would be to use a small stack allocation, e.g. Span<int> example = stackalloc int[16]. Another viable option is to use custom pool/allocation functions. As of C# 8.0, you may use the disposable ref structs feature to safely and automatically free temporary buffers that are retrieved with such methods, similar to C++'s RAII. Both stack-allocated memory and pool/custom allocation functionality bypass the GC.
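The two allocation-avoiding techniques described above can be sketched as follows. This is a minimal illustration that assumes the regular .NET Span<T> type is available. ScratchPool, Rental and BufferDemo are hypothetical names invented for this example, and the pool's backing array is created once with new only for brevity; a kernel would more likely back it with statically reserved memory.

```csharp
using System;

// Hypothetical single-slot pool: hands out one scratch buffer at a time.
static class ScratchPool
{
    // Allocated once up front (assumption: a kernel would use reserved memory instead).
    private static readonly int[] storage = new int[64];
    private static bool inUse;

    public static bool InUse => inUse;

    public static Rental Rent(int length)
    {
        if (inUse || length > storage.Length)
            throw new InvalidOperationException("scratch buffer unavailable");
        inUse = true;
        return new Rental(storage.AsSpan(0, length));
    }

    // A ref struct cannot implement IDisposable, but since C# 8.0 an
    // accessible Dispose() method is enough for it to be used in `using`.
    public ref struct Rental
    {
        public readonly Span<int> Data;

        internal Rental(Span<int> data) { Data = data; }

        // Returns the buffer to the pool when the `using` scope ends.
        public void Dispose() { inUse = false; }
    }
}

static class BufferDemo
{
    // Temporary buffer in the current stack frame; the GC never sees it.
    public static int SumOnStack()
    {
        Span<int> example = stackalloc int[16];
        return FillAndSum(example);
    }

    // Temporary buffer rented from the pool and released automatically.
    public static int SumFromPool()
    {
        using (ScratchPool.Rental rental = ScratchPool.Rent(16))
        {
            return FillAndSum(rental.Data);
        }
    }

    // Writes 0, 1, 2, ... into the buffer and returns the sum of its elements.
    private static int FillAndSum(Span<int> buffer)
    {
        int sum = 0;
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i;
            sum += buffer[i];
        }
        return sum;
    }
}
```

Neither method allocates per call. The stack version is the simplest, while the pooled version suits buffers that are too large for the kernel stack.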

Compiling your kernel to machine code

There are two approaches one could take to converting C# code to machine code:

  1. Use a compiler which directly converts C# (or a superset of it) to machine code. This is the approach taken by Microsoft's research OS, Singularity, which defines the Sing# language for easier interaction with the underlying hardware. The Sing# code is converted to machine code by their compiler, Bartok.
  2. Create a compiler which converts CIL to machine code. This leverages the C# to CIL compilers already present in .NET or the Mono project. This is the approach taken by many open source C# kernels, including SharpOS, Cosmos, the MOSA project and tysos.

Both of the above approaches could be performed ahead-of-time (AOT) or just-in-time (JIT). The AOT approach is simpler to begin with: the compiler runs on your development system and produces executable files (e.g. ELF) which can then be loaded directly by a standard boot loader. A JIT design would require a substantial amount of code that is loaded before your kernel and converts the kernel into machine code before running it. The problem here is that the JIT compiler would likely need a lot of services which only your kernel could provide. The best combination is probably to AOT compile your kernel and JIT compiler and have them loaded, after which all other processes can be JIT compiled.

An AOT compiler is built into .NET 7.0 and later. In order to use it, you can:

  • Use ilc, the IL compiler used internally for Native AOT (NAOT) compilation, provided by .NET 7.0 and later. ILC may be invoked manually after installing it as a .NET tool with dotnet tool install -g ilc. The compiler accepts managed CLR assemblies in the form of DLL files. You can get such files by compiling your C# source code with csc, which can be installed the same way as ILC.
  • Use the Native AOT deployment provided by the .NET build system. Note that this requires a large amount of tweaking to customize the default Linux/Windows build rules. For an example, the "ZeroSharp: C# for systems programming" demonstration may be of interest. The repository only uses features that are built into .NET.

Alternatively, for a more streamlined experience, the bflat toolchain provides a simple way to use ILC, alongside a simple, systems-programming-oriented "zerolib" standard library. It interfaces with ILC automatically. Do note that bflat is not integrated with .NET; it also uses its own separate version of the Roslyn C# compiler.

The runtime

Several C# operations (or their equivalents in CIL) require a functioning runtime (in .NET, called the CLR, the Common Language Runtime). For example, the expected result of the newobj CIL instruction, when applied to a reference type, is to create an object on a garbage-collected heap. This requires a malloc-like function and some method of performing garbage collection on the heap. With careful coding (mainly using static or stack-defined objects), it is possible to have your kernel avoid heap-allocating newobj instructions until you have initialized your memory allocator.
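As a rough sketch of the "static or stack-defined objects" idea, early-boot state can be kept in value types so that nothing is requested from the managed heap before an allocator exists. BumpAllocator and EarlyBoot are hypothetical names, and the addresses are plain numbers here rather than real memory. Where the static field ends up (for example, in the image's data section) depends on the IL-to-native compiler and is an assumption, not a guarantee.

```csharp
// Hypothetical early-boot allocator, written as a struct so that
// constructing it does not allocate an object on the managed heap.
struct BumpAllocator
{
    private ulong next;          // next free address
    private readonly ulong end;  // one past the last usable address

    public BumpAllocator(ulong start, ulong end)
    {
        next = start;
        this.end = end;
    }

    // Returns the address of `size` bytes aligned to `align`
    // (assumed to be a power of two), or 0 if the region is exhausted.
    public ulong Allocate(ulong size, ulong align)
    {
        ulong aligned = (next + align - 1) & ~(align - 1);
        if (aligned > end || size > end - aligned)
            return 0;
        next = aligned + size;
        return aligned;
    }
}

static class EarlyBoot
{
    // Static field of struct type: no reference-type object is created.
    // (Assumption: the AOT compiler places it in static data.)
    private static BumpAllocator heap;

    public static void Init(ulong start, ulong end)
    {
        heap = new BumpAllocator(start, end);
    }

    public static ulong Alloc(ulong size, ulong align)
    {
        return heap.Allocate(size, align);
    }
}
```

Once such an allocator is running, it can serve as the malloc-like primitive behind heap allocations, with garbage collection layered on top later.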

You may also need to provide an encapsulation of the run-time type information for each class, and some way to implement the System.Object.GetType() method and the typeof() operator so that they return an object semantically equivalent to System.Type. This is not necessarily required when using Native AOT (ILC) compilation.

The standard library

The standard library is a basic component of the .NET runtime and provides definitions of all the standard types (e.g. System.Int32 for a 32-bit signed integer, System.String for a string, System.Collections.Generic.List<T> for a list of objects of type T). To write any sort of meaningful code in your kernel, you are probably going to need to provide implementations for most of these.

While using the BCL (Base Class Library - .NET's standard library) is theoretically possible, it is generally not suited for any kind of kernel development - it assumes that the running program is in user mode, and that most system facilities, such as file I/O, are already configured. Projects such as Cosmos handle this by selectively replacing certain BCL methods that rely on code specific to a particular operating system, a technique called plugging.
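The general idea of plugging can be illustrated with a small reflection-based sketch. This is not Cosmos's actual API: PlugForAttribute, HostClock, HostClockPlug and PlugResolver are all hypothetical names, and a real toolchain would perform the substitution at compile time rather than through run-time reflection.

```csharp
using System;
using System.Reflection;

// Hypothetical marker: "this class supplies replacements for Target".
[AttributeUsage(AttributeTargets.Class)]
sealed class PlugForAttribute : Attribute
{
    public Type Target { get; }
    public PlugForAttribute(Type target) { Target = target; }
}

// Stand-in for a library class whose method needs a host operating system.
static class HostClock
{
    public static long NowTicks()
    {
        throw new PlatformNotSupportedException();
    }
}

// Kernel-side replacement with the same method name and signature.
[PlugFor(typeof(HostClock))]
static class HostClockPlug
{
    // Placeholder value; a real kernel would read a hardware timer here.
    public static long NowTicks() { return 42; }
}

static class PlugResolver
{
    // Returns the plug method matching `original` by declaring type, name
    // and parameter types, or `original` itself when no plug matches.
    public static MethodInfo Resolve(MethodInfo original, params Type[] plugTypes)
    {
        Type[] parameterTypes =
            Array.ConvertAll(original.GetParameters(), p => p.ParameterType);

        foreach (Type plug in plugTypes)
        {
            PlugForAttribute attr = plug.GetCustomAttribute<PlugForAttribute>();
            if (attr == null || attr.Target != original.DeclaringType)
                continue;

            MethodInfo replacement = plug.GetMethod(
                original.Name,
                BindingFlags.Public | BindingFlags.Static,
                null, parameterTypes, null);
            if (replacement != null)
                return replacement;
        }
        return original;
    }
}
```

A compiler applying this idea would emit a call to the replacement wherever the original method is referenced, so the host-specific code never reaches the kernel image.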

Writing a custom core library more suitable for kernel-mode programming is a better solution; the number of classes that need to be implemented depends on the toolset used. For reference, you can use bflat's minimal runtime library.
