User:Shikhin/Tutorial SMP

From OSDev Wiki
Difficulty level: Advanced

SMP, or Symmetric Multiprocessing, is one method of having multiple processors in one computer system. Many hobbyist OS developers leave SMP support for the future. However, even though supporting SMP itself is comparatively easy, retrofitting it later tends to produce a "one big lock" design, and the chaos that creates often leads to rewrites.

Starting with SMP support early is therefore recommended, and explaining how to do so is the aim of this tutorial. The tutorial covers basic locking primitives; however, it expects the reader to have a basic knowledge of virtual memory and multiprocessing systems.

Moreover, unlike other tutorials, this one doesn't believe in providing full source. In some cases it contains pseudocode and table structures where appropriate; however, full source should NOT be expected.

Multiprocessing

A multiprocessing system is one in which multiple processors execute code simultaneously. Multiprocessing systems can be divided into two sub-categories:

  • UMA, or Uniform Memory Access, architectures are those in which all processors share the same physical memory, with uniform access times.
  • NUMA, or Non-Uniform Memory Access, architectures are, by contrast, those in which processors access their own local memory faster than non-local memory.

Contrary to popular belief, NUMA and SMP architectures are not mutually exclusive, as recent Intel processors demonstrate. For simplicity's sake, this article covers optimization for neither NUMA nor SMT architectures.

Terminology

I use a few terms throughout the tutorial which may be a little tough to understand. This glossary covers all the terminology I use.

  • SMP, or Symmetric Multiprocessing, is one method of having multiple processors in a computer.
  • SMT, or Simultaneous Multithreading, is another type of multiprocessing, where the idle time in a processor is used for another "thread".
  • UMA, or Uniform Memory Access, refers to an architecture where all processors share the same physical memory with uniform access times.
  • NUMA, or Non-Uniform Memory Access, refers to an architecture where processors access their own local memory faster than non-local memory.
  • MPS, or the MultiProcessor Specification, is a deprecated specification developed by Intel, and has been superseded by ACPI.
  • ACPI, or Advanced Configuration and Power Interface, is a configuration standard for the PC, developed by Intel, Microsoft and Toshiba. ACPI provides a set of tables, which are useful to us.
  • APIC, or Advanced Programmable Interrupt Controller, is the updated Intel standard for the older PIC. It is used in multiprocessor systems and is an integral part of all recent Intel (and compatible) processors. The APIC is used for sophisticated interrupt redirection, and for sending interrupts between processors.
  • When the computer starts, control is handed to a BSP, or BootStrap Processor. This is what your code is currently running on.
  • The other processors are classified as APs, or Application Processors. These are the processors we will be attempting to start.

Gathering Information

To get a list of all the APs in a system, either the MPS or the ACPI tables can be used. The MPS tables are deprecated and have been superseded by ACPI. They have also proven to be buggy in SMT cases, and thus aren't recommended. However, on systems where ACPI isn't present, the MPS tables can be used as a fallback.

These tables provide a list of all the APICs in a computer, which can then be initialized.

MPS

The MPS, or MultiProcessor Specification, tables are present in all SMP systems. If the table isn't present in a system, the system can be assumed to be a UniProcessor system.

MP Floating Point Structure

To use these tables, the MP Floating Point Structure (named the "MP Floating Pointer Structure" in the MultiProcessor Specification) must first be found. As the name suggests, its location is not fixed: it "floats", and must be searched for.

The structure lies on a 16-byte boundary, and is marked by the signature "_MP_". To find it, the following areas must be searched, in order:

a) In the first kilobyte of Extended BIOS Data Area (EBDA), or

b) Within the last kilobyte of system base memory (e.g., 639K-640K for systems with 640KB of base memory or 511K-512K for systems with 512 KB of base memory) if the EBDA segment is undefined, or

c) In the BIOS ROM address space between 0xF0000 and 0xFFFFF.
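
Each of the three areas above can be scanned in the same way. The following is only a minimal sketch, not the author's code: the function name mp_scan_region is hypothetical, the region is assumed to be already mapped and to start on a 16-byte boundary, and memcmp is assumed to be available (a freestanding kernel would need its own).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Scan `length` bytes starting at `base` for the "_MP_" signature.
// Only offsets that are multiples of 16 are checked, because the
// structure always lies on a 16-byte boundary (`base` itself is
// assumed to be 16-byte aligned).
// Returns a pointer to the first match, or NULL if there is none.
static const uint8_t *mp_scan_region(const uint8_t *base, size_t length)
{
    for (size_t offset = 0; offset + 16 <= length; offset += 16) {
        if (memcmp(base + offset, "_MP_", 4) == 0)
            return base + offset;
    }
    return NULL;
}
```

A caller would try the first kilobyte of the EBDA, then the last kilobyte of base memory, then 0xF0000-0xFFFFF, stopping at the first non-NULL result.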


The location of the EBDA isn't standardized; its address can be found in the BDA. The word at 0x040E usually contains the address of the EBDA shifted right by four bits (that is, its real-mode segment). Thus, the address of the EBDA can be found by doing something like *((uint16_t*)0x040E) << 4. The author recommends performing sanity checks on this address, since this location isn't well standardized either. You could check that it falls within the EBDA area, as listed in the EBDA article.
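
The shift and the sanity check might be combined roughly as follows. This is a sketch under assumptions: the function name and the two bounds are hypothetical, and the bounds assume that a plausible EBDA lies somewhere in 0x80000-0x9FFFF, just below the 640 KB mark.

```c
#include <stdint.h>

// Assumed bounds for a plausible EBDA (not taken from the tutorial).
#define EBDA_MIN 0x80000u
#define EBDA_MAX 0xA0000u

// Convert the segment word read from 0x040E into a physical address.
// Returns 0 if the result falls outside the assumed EBDA area, in
// which case the caller should fall back to the other search areas.
static uint32_t ebda_address_from_segment(uint16_t segment)
{
    uint32_t address = (uint32_t)segment << 4;

    if (address < EBDA_MIN || address >= EBDA_MAX)
        return 0;
    return address;
}
```

On real hardware the argument would be the word read from 0x040E, assuming that address is accessible from wherever the code runs.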

If the check fails or the MP Floating Point Structure isn't present in the EBDA region, the last kilobyte of system base memory should be searched. If you use a Multiboot-compatible bootloader, the mem_lower field at offset 4 of the Multiboot information structure gives the amount of base memory in kilobytes. If you are writing your own bootloader, then int 0x12 can be used to get the amount of base memory.

If the MP Floating Point Structure can't be found in this area either, then the area between 0xF0000 and 0xFFFFF should be searched. The MP Floating Point Structure has the following format:
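
As a small worked example, the start of that last kilobyte can be derived from the base memory size in kilobytes, whether it came from mem_lower or from int 0x12. The function name below is hypothetical, and rejecting values above 640 is an added sanity check rather than something the tutorial requires.

```c
#include <stdint.h>

// Given the amount of base memory in kilobytes, return the physical
// address at which the last kilobyte of base memory starts.
// Returns 0 for values that cannot be valid (0, or more than 640 KB).
static uint32_t last_kilobyte_of_base_memory(uint32_t base_memory_kb)
{
    if (base_memory_kb == 0 || base_memory_kb > 640)
        return 0;
    return (base_memory_kb - 1) * 1024u;
}
```

For 640 KB of base memory this gives 0x9FC00 (639K), and for 512 KB it gives 0x7FC00 (511K), matching the ranges quoted in the search list.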

#include <stdint.h>

typedef struct
{
    // The signature; must contain "_MP_". The structure lies on a 16-byte boundary.
    uint8_t  Signature[4];

    // The physical address of the MP Configuration Table.
    uint32_t MPConfigurationTable;

    // The length of this structure, in 16-byte units. This field *should* contain 0x01, meaning 16 bytes.
    uint8_t  Length;

    // The version number of the MP Specification. A value of 1 indicates 1.1, 4 indicates 1.4.
    uint8_t  Version;

    // The checksum of the structure: all of its bytes, including this one, must sum to zero.
    uint8_t  Checksum;

    // The five MP feature information bytes.
    uint8_t  FeatureBytes[5];
} MPFloatingPoint;

If the first feature byte is non-zero, the MP Configuration Table is not present, and a default configuration, defined in Chapter 5 of the MultiProcessor Specification, is used. For the sake of simplicity, I assume that the first feature byte is always zero.

However, if you want your OS to run in all supported configurations, every default configuration, as defined in Chapter 5, must be accounted for.
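
The decision described above could be made explicit along these lines. This is only an illustrative sketch: the enum and function names are hypothetical, and treating a zero table address as invalid is an assumption (the specification describes the pointer as zero when no configuration table exists).

```c
#include <stdint.h>

// Where the MP configuration should be taken from.
enum mp_config_source {
    MP_CONFIG_TABLE,    // parse the MP Configuration Table
    MP_CONFIG_DEFAULT,  // use a default configuration (Chapter 5)
    MP_CONFIG_INVALID   // neither is usable
};

// `feature_byte_1` is FeatureBytes[0]; `table_address` is the
// MPConfigurationTable field of the floating structure.
static enum mp_config_source
mp_classify_configuration(uint8_t feature_byte_1, uint32_t table_address)
{
    if (feature_byte_1 != 0)
        return MP_CONFIG_DEFAULT;   // the value names the default type
    if (table_address == 0)
        return MP_CONFIG_INVALID;   // no table and no default type
    return MP_CONFIG_TABLE;
}
```

An OS that follows this tutorial's simplification would only handle MP_CONFIG_TABLE and treat the other two results as "fall back to a single processor".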

Recommended Usage

First of all, the areas described above must be searched. Once the structure is found, its checksum must be verified: all the bytes of the structure, including the checksum byte itself, must add up to zero (modulo 256).

If the checksum fails, the system must be assumed to be a UniProcessor system.
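
A byte-wise checksum of this kind could be verified as in the sketch below. The function name is hypothetical. The length passed in would be the structure's Length field multiplied by 16, which is normally 16 bytes.

```c
#include <stddef.h>
#include <stdint.h>

// Returns non-zero if the `length` bytes at `table` sum to zero
// modulo 256, which is what a valid checksum byte guarantees.
static int mp_checksum_ok(const void *table, size_t length)
{
    const uint8_t *bytes = table;
    uint8_t sum = 0;

    for (size_t i = 0; i < length; i++)
        sum = (uint8_t)(sum + bytes[i]);
    return sum == 0;
}
```

If this check fails for a structure that was found by its signature, the structure should not be trusted, and the system is treated as UniProcessor as described above.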

MP Configuration Table

Recommended Usage

ACPI

Extra Notes

See Also

Threads

External Links