IOAPIC



The Intel I/O Advanced Programmable Interrupt Controller is used to distribute external interrupts in a more advanced manner than that of the standard 8259 PIC. With the I/O APIC, interrupts can be distributed to physical or logical (clusters of) processors and can be prioritized. Each I/O APIC typically handles 24 external interrupts.

Detecting I/O APIC

In order to detect the existence of an I/O APIC (or multiple ones), the Intel MultiProcessor (MP) tables or the ACPI tables (specifically, the MADT) must be parsed. In the MP tables, configuration table entries with an entry type of 0x02 describe I/O APICs. Parsing will tell you how many I/O APICs exist (if any), and each one's APIC ID, base MMIO address and first IRQ (or GSI - Global System Interrupt). For more information on parsing the MP tables, see the MP Tables link in the External Links section below. For example, you can have 2 I/O APICs, the first handling IRQs 0 - 23 and the second handling IRQs 24 - 47.
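As an illustration of the ACPI route, here is a minimal sketch that walks the variable-length entries of a MADT and collects the I/O APIC records (entry type 1). It assumes the table has already been located and is readable as a byte buffer; `struct ioapic_info` and `madt_find_ioapics` are names made up for this example. The offsets used (entries start at byte 44; an I/O APIC entry is 12 bytes: type, length, ID, reserved, 32-bit address, 32-bit GSI base) follow the MADT layout in the ACPI specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 36-byte SDT header + 4-byte Local APIC address + 4-byte flags */
#define MADT_ENTRIES_OFFSET 44u
#define MADT_TYPE_IOAPIC    1u

/* Hypothetical record type used only by this sketch. */
struct ioapic_info {
    uint8_t  id;        /* I/O APIC ID */
    uint32_t mmio_base; /* physical address of IOREGSEL */
    uint32_t gsi_base;  /* first GSI handled by this I/O APIC */
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Walk the MADT entries and store up to `max` I/O APIC records in `out`.
 * `madt_len` is the Length field from the table header.
 * Returns the number of I/O APIC entries seen (which may exceed `max`).
 */
size_t madt_find_ioapics(const uint8_t *madt, size_t madt_len,
                         struct ioapic_info *out, size_t max)
{
    assert(madt != NULL && out != NULL);

    size_t found = 0;
    size_t off = MADT_ENTRIES_OFFSET;

    while (off + 2 <= madt_len) {
        uint8_t type = madt[off];
        uint8_t len  = madt[off + 1];

        if (len < 2 || off + len > madt_len)
            break; /* malformed entry: stop rather than run off the table */

        if (type == MADT_TYPE_IOAPIC && len >= 12) {
            if (found < max) {
                out[found].id        = madt[off + 2];
                out[found].mmio_base = read_le32(madt + off + 4);
                out[found].gsi_base  = read_le32(madt + off + 8);
            }
            found++;
        }
        off += len;
    }
    return found;
}
```

Each record found this way gives the base address to use as 'apic_base' and the first GSI of that I/O APIC; the number of inputs still has to be read from IOAPICVER.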

Programming the I/O APIC

Each I/O APIC has a set of 2 or 3 (depending on version) 32-bit registers and a number of 64-bit registers (one per IRQ). The 64-bit registers have to be accessed as two 32-bit reads/writes. All registers are accessed indirectly: there are actually only two 32-bit registers mapped in memory, called IOREGSEL and IOWIN. You put the register index in IOREGSEL, and then you can read/write the selected register through IOWIN. The first three registers contain general information about this I/O APIC, while the remaining registers contain the specific configuration for each IRQ.

IOAPICID

This register has index 0 (you write 0 to IOREGSEL and then read from IOWIN). It is a read/write register with almost all bits reserved. The only interesting field is in bits 24 - 27: the APIC ID for this device (each peripheral which is interfaced with the APIC bus needs an APIC ID, not only CPUs). You will find this ID in the ACPI/MP tables as well.

IOAPICVER

This register (index 1) contains the I/O APIC Version in bits 0 - 7, and the Max Redirection Entry which is "how many IRQs can this I/O APIC handle - 1". It is encoded in bits 16 - 23.

IOAPICARB

This register (index 2) contains in bits 24 - 27 the APIC Arbitration ID. TODO

IOREDTBL

Following these, there are two 32-bit registers for each IRQ. The first IRQ has indexes 0x10 and 0x11, the second 0x12 and 0x13, the third 0x14 and 0x15, and so on. So the Redirection Entry registers for IRQ n are at indexes 0x10 + n * 2 and 0x10 + n * 2 + 1. The first of the two registers gives access to the low uint32_t (bits 31:0), and the second to the high uint32_t (bits 63:32). Each redirection entry is made of the following fields:
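The index arithmetic above, and the rule that a 64-bit entry is accessed as two 32-bit halves, can be sketched as follows. To keep the example runnable outside a kernel, the I/O APIC is replaced by a made-up software model (`struct ioapic_sim`, `sim_read`, `sim_write`) that simply stores one value per register index; in a real driver those two helpers would be the IOREGSEL/IOWIN accessors instead. All names here are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define IOAPIC_REDTBL_LO(n) (0x10u + 2u * (n))      /* bits 31:0  */
#define IOAPIC_REDTBL_HI(n) (0x10u + 2u * (n) + 1u) /* bits 63:32 */
#define SIM_ENTRIES 24u
#define SIM_REGS    (0x10u + 2u * SIM_ENTRIES)

/* Hypothetical stand-in for the hardware: one slot per register index. */
struct ioapic_sim {
    uint32_t regs[SIM_REGS];
};

static uint32_t sim_read(const struct ioapic_sim *io, uint8_t index)
{
    assert(index < SIM_REGS);
    return io->regs[index];
}

static void sim_write(struct ioapic_sim *io, uint8_t index, uint32_t value)
{
    assert(index < SIM_REGS);
    io->regs[index] = value;
}

/* Read redirection entry n as one 64-bit value from its two halves. */
uint64_t redtbl_read(const struct ioapic_sim *io, unsigned n)
{
    assert(n < SIM_ENTRIES);
    uint64_t lo = sim_read(io, (uint8_t)IOAPIC_REDTBL_LO(n));
    uint64_t hi = sim_read(io, (uint8_t)IOAPIC_REDTBL_HI(n));
    return lo | (hi << 32);
}

/* Write redirection entry n as two 32-bit halves. */
void redtbl_write(struct ioapic_sim *io, unsigned n, uint64_t entry)
{
    assert(n < SIM_ENTRIES);
    /* High half first, so the destination is in place before the low
       half (which holds the vector and the mask bit) is written. */
    sim_write(io, (uint8_t)IOAPIC_REDTBL_HI(n), (uint32_t)(entry >> 32));
    sim_write(io, (uint8_t)IOAPIC_REDTBL_LO(n), (uint32_t)entry);
}
```

Writing the high half before the low half is a design choice of this sketch rather than something the text above requires.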

Field | Bits | Description
Vector | 0 - 7 | The interrupt vector that will be raised on the specified CPU(s).
Delivery Mode | 8 - 10 | How the interrupt will be sent to the CPU(s). It can be 000 (Fixed), 001 (Lowest Priority), 010 (SMI), 100 (NMI), 101 (INIT) and 111 (ExtINT). In most cases you want Fixed mode, or Lowest Priority if you don't want to suspend a high priority task on some important Processor/Core/Thread.
Destination Mode | 11 | Specifies how the Destination field shall be interpreted. 0: Physical Destination, 1: Logical Destination.
Delivery Status | 12 | If 0, the IRQ is idle and waiting for something to happen (or it has fired and has already been processed by the Local APIC(s)). If 1, the IRQ has been sent to the Local APICs but is still waiting to be delivered.
Pin Polarity | 13 | 0: Active high, 1: Active low. For ISA IRQs assume Active High unless otherwise specified in Interrupt Source Override descriptors of the MADT or in the MP Tables.
Remote IRR | 14 | TODO
Trigger Mode | 15 | 0: Edge, 1: Level. For ISA IRQs assume Edge unless otherwise specified in Interrupt Source Override descriptors of the MADT or in the MP Tables.
Mask | 16 | Just like in the old PIC, you can temporarily disable this IRQ by setting this bit, and re-enable it by clearing the bit.
Destination | 56 - 63 | This field is interpreted according to the Destination Mode bit. If Physical destination is chosen, then this field is limited to bits 56 - 59 (only 16 CPUs addressable). You put here the APIC ID of the CPU that you want to receive the interrupt. TODO: Logical destination format...
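The field layout above can be packed into a 64-bit value with plain shifts, which avoids depending on compiler-specific bitfield ordering. The following is a minimal sketch; `struct redir_fields`, `redir_encode` and `redir_set_masked` are hypothetical names, and the read-only Delivery Status and Remote IRR bits are simply left at 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Delivery Mode encodings as listed in the field description. */
enum {
    DELIVERY_FIXED           = 0,
    DELIVERY_LOWEST_PRIORITY = 1,
    DELIVERY_SMI             = 2,
    DELIVERY_NMI             = 4,
    DELIVERY_INIT            = 5,
    DELIVERY_EXTINT          = 7
};

#define REDIR_MASK_BIT (UINT64_C(1) << 16)

/* Hypothetical unpacked form of the writable fields. */
struct redir_fields {
    uint8_t vector;          /* bits 0 - 7   */
    uint8_t delivery_mode;   /* bits 8 - 10  */
    bool    logical_dest;    /* bit 11       */
    bool    active_low;      /* bit 13       */
    bool    level_triggered; /* bit 15       */
    bool    masked;          /* bit 16       */
    uint8_t destination;     /* bits 56 - 63 */
};

/* Pack the fields into the 64-bit redirection entry format. */
uint64_t redir_encode(const struct redir_fields *f)
{
    assert(f != NULL);
    assert(f->delivery_mode <= 7);
    /* In physical mode only bits 56 - 59 are usable (APIC IDs 0 - 15). */
    assert(f->logical_dest || f->destination <= 0x0F);

    uint64_t e = 0;
    e |= (uint64_t)f->vector;
    e |= (uint64_t)f->delivery_mode   << 8;
    e |= (uint64_t)f->logical_dest    << 11;
    e |= (uint64_t)f->active_low      << 13;
    e |= (uint64_t)f->level_triggered << 15;
    e |= (uint64_t)f->masked          << 16;
    e |= (uint64_t)f->destination     << 56;
    return e;
}

/* Return a copy of the entry with the Mask bit set or cleared. */
uint64_t redir_set_masked(uint64_t entry, bool masked)
{
    return masked ? (entry | REDIR_MASK_BIT) : (entry & ~REDIR_MASK_BIT);
}
```

For example, an unmasked, edge triggered, active high, Fixed delivery entry for vector 0x21 sent to the CPU with APIC ID 0 encodes to 0x21.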

IOREGSEL and IOWIN

IOREGSEL is a memory-mapped register select register that is used to access all the other I/O APIC registers. The IOWIN register is the 'data' register. Once IOREGSEL has been set, IOWIN can be used to read or write the register selected by IOREGSEL. The actual position in memory of the two registers is specified in the ACPI MADT and/or in the MP table. IOREGSEL is at the address specified, and IOWIN is at the same address + 0x10.

#include <stdint.h>

#define IOAPICID          0x00
#define IOAPICVER         0x01
#define IOAPICARB         0x02
#define IOAPICREDTBL(n)   (0x10 + 2 * (n)) // lower 32 bits (add +1 for upper 32 bits)

void write_ioapic_register(const uintptr_t apic_base, const uint8_t offset, const uint32_t val) 
{
    /* tell IOREGSEL where we want to write to */
    *(volatile uint32_t*)(apic_base) = offset;
    /* write the value to IOWIN */
    *(volatile uint32_t*)(apic_base + 0x10) = val; 
}
 
uint32_t read_ioapic_register(const uintptr_t apic_base, const uint8_t offset)
{
    /* tell IOREGSEL where we want to read from */
    *(volatile uint32_t*)(apic_base) = offset;
    /* return the data from IOWIN */
    return *(volatile uint32_t*)(apic_base + 0x10);
}

/* @class IOAPIC
 *
 * Sample driver code that controls an IOAPIC. It handles one IOAPIC and exposes
 * some functions. It is purely illustrative, i.e. you should add locking support,
 * link all IOAPIC objects in a data structure, and much more. Here we are just
 * showing what you need to handle and how to do it in C++.
 *
 * Also note that the IOAPIC registers "may" cross a page boundary, so you may
 * need to map the physical base to two consecutive pages (i.e. allocate twice the
 * amount of memory from the VMM).
 */
class IOAPIC
{
public:
        enum TriggerMode
        {
                EDGE  = 0,
                LEVEL = 1,
        };

        enum DestinationMode
        {
                PHYSICAL = 0,
                LOGICAL  = 1
        };

        union RedirectionEntry
        {
                struct
                {
                        uint64_t vector       : 8;
                        uint64_t delvMode     : 3;
                        uint64_t destMode     : 1;
                        uint64_t delvStatus   : 1;
                        uint64_t pinPolarity  : 1;
                        uint64_t remoteIRR    : 1;
                        uint64_t triggerMode  : 1;
                        uint64_t mask         : 1;
                        uint64_t reserved     : 39;
                        uint64_t destination  : 8;
                 };
                 struct
                 {
                        uint32_t lowerDword;
                        uint32_t upperDword;
                 };
        };

        unsigned char id(){ return (apicId); }
        unsigned char ver(){ return (apicVer); }
        unsigned char redirectionEntries(){ return (redirEntryCnt); }
        unsigned long globalInterruptBase(){ return (globalIntrBase); }

        IOAPIC(unsigned long physRegs, unsigned long apicId, unsigned long gsib)
        {
                this->physRegs = physRegs;
                this->virtAddr = KernelMemory::allocatePage(PAGE_SIZE);

                /* Map virtAddr to the physical registers. Note that your paging code may not
                   support automatically aligning physRegs to page boundaries. Be sure to check! */
                EnsureMapping(virtAddr, physRegs, PagePresent | PageReadWrite | PageCacheDisable);

                /* Re-apply the offset of the registers within the page. */
                virtAddr += physRegs % PAGE_SIZE;

                /* The ID is IOAPICID[27:24]. The value read back from hardware is cached;
                   the apicId parameter (from the ACPI/MP tables) is not used here. */
                this->apicId = (read(IOAPICID) >> 24) & 0x0F;
                apicVer = read(IOAPICVER) & 0xFF; // version is IOAPICVER[7:0]

                // max. redirection entry is given by IOAPICVER[23:16]
                redirEntryCnt = ((read(IOAPICVER) >> 16) & 0xFF) + 1;
                globalIntrBase = gsib;
        }

        /*
         * Bit of an assignment here - implement this on your own. Fill the lowerDword &
         * upperDword fields of RedirectionEntry using
         *                                 ent.lowerDword = read(IOAPICREDTBL(entNo));
         *                                 ent.upperDword = read(IOAPICREDTBL(entNo) + 1);
         *                                 return (ent);
         *
         * Be sure to check that entNo < redirectionEntries()
         *
         * @param entNo - entry no. for which redir-entry is required
         * @return entry associated with entry no.
         */
        RedirectionEntry getRedirEntry(unsigned char entNo);

        /*
         * Bit of an assignment here - implement this on your own. Use the lowerDword &
         * upperDword fields of RedirectionEntry using
         *                               write(IOAPICREDTBL(entNo), entry->lowerDword);
         *                               write(IOAPICREDTBL(entNo) + 1, entry->upperDword);
         *
         * Be sure to check that entNo < redirectionEntries()
         *
         * @param entNo - entry no. of the redir-entry to write
         * @param entry - ptr to entry to write
         */
        void writeRedirEntry(unsigned char entNo, RedirectionEntry *entry);

private:
        /*
         * This field contains the physical-base address for the IOAPIC
         * can be found using an IOAPIC-entry in the ACPI 2.0 MADT.
         */
        unsigned long physRegs;

        /*
         * Holds the base address of the registers in virtual memory. This
         * address is non-cacheable (see paging).
         */
        unsigned long virtAddr;

        /*
         * Software has complete control over the apic-id. Also, hardware
         * won't automatically change its apic-id so we could cache it here.
         */
        unsigned char apicId;

        /*
         * Hardware-version of the apic, mainly for display purpose. ToDo: specify
         * more purposes.
         */
        unsigned char apicVer;

        /*
         * Although the number of entries in current IOAPICs is 24, it may change.
         * To retain compatibility make sure you use this.
         */
        unsigned char redirEntryCnt;

        /*
         * The first IRQ which this IOAPIC handles. This is only found in the
         * IOAPIC entry of the ACPI 2.0 MADT. It isn't found in the IOAPIC
         * registers.
         */
        unsigned long globalIntrBase;

        /*
         * Reads the data present in the register at offset regOff.
         *
         * @param regOff - the register's offset which is being read
         * @return the data present in the register associated with that offset
         */
        uint32_t read(unsigned char regOff)
        {
                *(uint32_t volatile*) virtAddr = regOff;
                return *(uint32_t volatile*)(virtAddr + 0x10);
        }

        /*
         * Writes the data into the register associated. 
         *
         * @param regOff - the register's offset which is being written
         * @param data - dword to write to the register
         */
        void write(unsigned char regOff, uint32_t data)
        {
                *(uint32_t volatile*) virtAddr = regOff;
                *(uint32_t volatile*)(virtAddr + 0x10) = data;
        }
};

'apic_base' is the memory base address of the selected IOAPIC. The base addresses can be found by enumerating the IOAPICs from the MP or ACPI tables.
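When more than one IOAPIC is present, the 'apic_base' to use depends on which IOAPIC owns the interrupt in question. A possible lookup, assuming each IOAPIC has been recorded with its base address, its first GSI and its number of redirection entries, is sketched below. `struct ioapic_desc` and `ioapic_for_gsi` are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-IOAPIC record built while enumerating the tables. */
struct ioapic_desc {
    uintptr_t apic_base;   /* mapped address of IOREGSEL */
    uint32_t  gsi_base;    /* first GSI handled */
    uint32_t  entry_count; /* Max Redirection Entry + 1 */
};

/*
 * Find the IOAPIC that handles `gsi`. On success, returns its descriptor
 * and stores the input (pin) number within that IOAPIC in *pin.
 * Returns NULL if no IOAPIC covers the GSI.
 */
const struct ioapic_desc *ioapic_for_gsi(const struct ioapic_desc *list,
                                         size_t count, uint32_t gsi,
                                         uint32_t *pin)
{
    assert(list != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (gsi >= list[i].gsi_base &&
            gsi - list[i].gsi_base < list[i].entry_count) {
            if (pin != NULL)
                *pin = gsi - list[i].gsi_base;
            return &list[i];
        }
    }
    return NULL;
}
```

With two IOAPICs covering GSIs 0 - 23 and 24 - 47, GSI 30 resolves to the second IOAPIC, input 6, so its redirection entry is at index 0x10 + 6 * 2 of that IOAPIC.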


IO APIC Inputs

How other hardware (devices, etc) uses IO APIC inputs is completely arbitrary - the motherboard/chipset designer can hard-wire anything they like to any IO APIC input. For the motherboard designer's convenience, most legacy IRQs are often (but not always) connected "1:1" to IO APIC inputs (e.g. IO APIC input #1 may be the same as PIC chip input #1) as this makes firmware a little easier (e.g. no need for "Interrupt Source Override" entries in ACPI's MADT/APIC table), but this is not a requirement of any standard and not something that useful operating system software can rely on.

To correctly determine how IO APIC inputs are used (and how they must be configured - as active high or active low, and as edge triggered or level triggered), operating system software must either:

1) Use ACPI's MADT/APIC table to determine how legacy IRQs are mapped to IO APIC inputs; then use ACPI's AML (with a suitable interpreter) to determine how PCI devices are connected to IO APIC inputs

2) Use Intel's "MultiProcessor Specification" tables to determine how both legacy IRQs and PCI IRQs are mapped to IO APIC inputs. Note that Intel's "MultiProcessor Specification" is deprecated (an operating system should use ACPI where possible, and fall back to MultiProcessor Specification tables if ACPI doesn't exist or can't be used)

3) Provide (many) motherboard specific drivers, where each driver is able to use motherboard specific information to determine how IO inputs are used

4) Use an excessively "clever" auto-detection scheme (with a high risk of misconfiguration and race conditions). These schemes typically begin with whatever information can be obtained easily (e.g. determining legacy IRQs from ACPI's MADT/APIC table), assume everything else can be configured as "level triggered active low" (to suit PCI), and then ask device drivers to repeatedly force their device to generate an IRQ (while ruling out IO APIC inputs that weren't triggered within a certain period of time after the IRQ was caused) until the specific IO APIC input that the device uses can be determined.
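For option 1, the legacy IRQ part can be sketched as a lookup over the MADT's Interrupt Source Override entries (entry type 2). The example below assumes a pointer to the start of the MADT entry area and its length, and defaults to a 1:1 mapping with active high, edge triggered signalling when no override exists, as described for ISA IRQs in the redirection entry fields. `struct irq_route` and `isa_irq_route` are names invented for this sketch; the entry layout (type, length, bus, source IRQ, 32-bit GSI, 16-bit flags) follows the ACPI specification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MADT_TYPE_ISO 2u /* Interrupt Source Override */

/* Hypothetical result type for this sketch. */
struct irq_route {
    uint32_t gsi;
    bool     active_low;
    bool     level_triggered;
};

static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Resolve a legacy ISA IRQ to a GSI plus polarity/trigger settings. */
struct irq_route isa_irq_route(const uint8_t *entries, size_t len, uint8_t irq)
{
    assert(entries != NULL || len == 0);

    /* Default when no override exists: GSI == IRQ, active high, edge. */
    struct irq_route r = { irq, false, false };
    size_t off = 0;

    while (off + 2 <= len) {
        uint8_t type = entries[off];
        uint8_t elen = entries[off + 1];

        if (elen < 2 || off + elen > len)
            break; /* malformed entry */

        /* Byte 2 is the bus (0 = ISA), byte 3 the source IRQ. */
        if (type == MADT_TYPE_ISO && elen >= 10 &&
            entries[off + 2] == 0 && entries[off + 3] == irq) {
            uint16_t flags = read_le16(entries + off + 8);
            r.gsi = read_le32(entries + off + 4);
            /* Flags bits 0-1: polarity (3 = active low);
               bits 2-3: trigger mode (3 = level). A value of 0 means
               "conforms to the bus", treated here as the ISA default. */
            r.active_low      = (flags & 0x3u) == 0x3u;
            r.level_triggered = ((flags >> 2) & 0x3u) == 0x3u;
            return r;
        }
        off += elen;
    }
    return r;
}
```

PCI interrupt routing is not covered by this sketch; as noted in option 1, that requires evaluating ACPI's AML.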

External Links

MP Tables

I/O APIC