A keyboard keystroke arrives. A network packet is received. The timer fires. A disk read completes. A power failure is detected. All within the same microsecond.
Which interrupt should the CPU service first?
This is the interrupt priority problem. Not all interrupts are equally urgent. A power failure warning demands immediate attention—the system has milliseconds to save state. A keyboard keystroke can wait a few milliseconds without anyone noticing. The network packet is somewhere in between.
Interrupt priority is the mechanism that ensures the most critical events are handled first, that less important interrupts don't starve out more important ones, and that the interrupt handling system itself doesn't collapse under load.
By the end of this page, you will understand the complete interrupt priority architecture: hardware priority mechanisms in PICs and APICs, the concept of interrupt masks, priority-based preemption, and how operating systems manage priority to ensure responsiveness while preventing starvation. You'll learn practical priority assignment strategies used in production systems.
Without priority, interrupts would be handled in simple arrival order (FIFO). This seems fair, but causes serious problems:
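Scenario: Low-Priority Interrupt Blocks Critical Event
Suppose a keyboard interrupt arrives just ahead of a power-failure warning. In strict arrival order, the warning waits until the keyboard handler has finished, using up time the system needs to save state.
Priority solves this by allowing urgent interrupts to be serviced ahead of, and to preempt, less urgent ones.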
| Source | Urgency | Consequence of Delay | Typical Priority |
|---|---|---|---|
| NMI / Power Failure | Critical | Data loss, system damage | Highest (unmaskable) |
| Machine Check | Critical | Hardware fault escalation | Very High |
| Timer | High | Scheduler inaccuracy, timing errors | High |
| Disk I/O Complete | Medium | Process blocked longer | Medium |
| Network Packet | Medium | Increased latency, possible packet loss | Medium |
| Keyboard/Mouse | Low-Medium | User perceived delay | Medium-Low |
| USB Device | Low | Device response delay | Low |
High priority means faster service but can starve lower-priority interrupts. Too many high-priority interrupts can cause livelock (CPU can't run user code). Priority assignment is a balance between responsiveness for critical events and fairness for routine ones.
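To make the contrast with arrival-order servicing concrete, here is a minimal sketch in C. The `pending_irq_t` type, the function names, and any handler times fed into it are illustrative assumptions, not measurements. It compares how long one pending interrupt waits under FIFO servicing and under strict priority servicing.

```c
#include <assert.h>
#include <stddef.h>

/* One pending interrupt in a toy model (all names are illustrative). */
typedef struct {
    const char *source;  /* device name, for readability only */
    int priority;        /* larger value = more urgent */
    int handler_us;      /* assumed handler run time in microseconds */
} pending_irq_t;

/* Wait before pending[target] is serviced when interrupts are handled
 * strictly in array (arrival) order. */
int wait_fifo_us(const pending_irq_t *pending, size_t n, size_t target) {
    assert(target < n);
    int wait = 0;
    for (size_t i = 0; i < target; i++)
        wait += pending[i].handler_us;
    return wait;
}

/* Wait before pending[target] is serviced when the most urgent pending
 * interrupt always runs first (ties broken by arrival order). */
int wait_priority_us(const pending_irq_t *pending, size_t n, size_t target) {
    assert(target < n);
    int wait = 0;
    for (size_t i = 0; i < n; i++) {
        if (i == target)
            continue;
        int more_urgent = pending[i].priority > pending[target].priority;
        int same_but_earlier =
            pending[i].priority == pending[target].priority && i < target;
        if (more_urgent || same_but_earlier)
            wait += pending[i].handler_us;
    }
    return wait;
}
```

With a power-failure entry placed last in the array but given the largest `priority`, `wait_fifo_us` returns the sum of every earlier handler, while `wait_priority_us` returns 0. The cost moves to the low-priority entries, which is the starvation risk described above.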
The Intel 8259A Programmable Interrupt Controller implements a fixed priority scheme by default. Understanding this legacy architecture illuminates fundamental priority concepts.
Default Priority Ordering:
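In the default (fully nested) mode, IRQ numbers directly determine priority: on each chip IRQ0 is highest and IRQ7 is lowest, and because the slave PIC is cascaded through the master's IRQ2 input, IRQ8-15 take IRQ2's place in the ordering.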
The priority ordering for the cascaded PC/AT configuration:
IRQ0 > IRQ1 > IRQ2(8-15) > IRQ3 > IRQ4 > IRQ5 > IRQ6 > IRQ7
Within the slave (IRQ2):
IRQ8 > IRQ9 > IRQ10 > IRQ11 > IRQ12 > IRQ13 > IRQ14 > IRQ15
| Rank | IRQ | Default Device | Priority Reason |
|---|---|---|---|
| 1 | IRQ0 | System Timer | Critical for timekeeping, scheduling |
| 2 | IRQ1 | Keyboard | User input must be responsive |
| 3 | IRQ8 | Real-Time Clock | Time-critical periodic events |
| 4 | IRQ9 | Available / ACPI | System management events |
| 5 | IRQ10 | Available | PCI devices |
| 6 | IRQ11 | Available | PCI devices |
| 7 | IRQ12 | PS/2 Mouse | User input |
| 8 | IRQ13 | FPU | Math coprocessor errors |
| 9 | IRQ14 | Primary IDE | Disk operations |
| 10 | IRQ15 | Secondary IDE | Secondary disk/CD-ROM |
| 11 | IRQ3 | COM2/4 | Serial communication |
| 12 | IRQ4 | COM1/3 | Serial/modem |
| 13 | IRQ5 | LPT2/Sound | Audio/parallel port |
| 14 | IRQ6 | Floppy | Floppy disk (legacy) |
| 15 | IRQ7 | LPT1 | Printer (lowest priority) |
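The ranking in the table follows mechanically from the cascade. As a small sketch, the hypothetical helper `pcat_irq_rank` below (not part of any PIC driver API) computes the rank of a legacy IRQ line in the default fully nested mode.

```c
#include <assert.h>

/* Rank (1 = highest priority) of a legacy IRQ line in the cascaded PC/AT
 * arrangement, default fully nested mode. Returns 0 for IRQ2, which is the
 * cascade input rather than a device line. */
int pcat_irq_rank(int irq) {
    assert(irq >= 0 && irq <= 15);
    if (irq < 2)
        return irq + 1;       /* IRQ0, IRQ1  -> ranks 1, 2   */
    if (irq == 2)
        return 0;             /* cascade input, no rank      */
    if (irq >= 8)
        return irq - 8 + 3;   /* IRQ8..IRQ15 -> ranks 3..10  */
    return irq + 8;           /* IRQ3..IRQ7  -> ranks 11..15 */
}
```

For example, `pcat_irq_rank(14)` is smaller than `pcat_irq_rank(4)`, so the primary IDE channel outranks COM1 even though its IRQ number is larger.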
How PIC Priority Works:
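The 8259A uses the In-Service Register (ISR) to implement priority: when an interrupt is acknowledged, its ISR bit is set, and until an End of Interrupt (EOI) command clears that bit the PIC holds back requests of equal or lower priority while still forwarding higher-priority ones.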
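The interaction between pending requests, masks, and in-service bits can be modelled in a few lines. The following is a software sketch of a single 8259A in fully nested mode, not driver code. The `pic_model_t` type and function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Software model of one 8259A in fully nested mode (hypothetical names). */
typedef struct {
    uint8_t irr;  /* Interrupt Request Register: bit n set = IRQn pending    */
    uint8_t isr;  /* In-Service Register: bit n set = IRQn being serviced    */
    uint8_t imr;  /* Interrupt Mask Register: bit n set = IRQn masked        */
} pic_model_t;

/* Returns the IRQ the PIC would signal to the CPU next, or -1 if none. */
int pic_next_irq(const pic_model_t *pic) {
    uint8_t pending = pic->irr & (uint8_t)~pic->imr;
    for (int irq = 0; irq < 8; irq++) {   /* scan from highest priority */
        uint8_t bit = (uint8_t)(1u << irq);
        if (pic->isr & bit)
            return -1;   /* this level is in service: it and all lower wait */
        if (pending & bit)
            return irq;  /* outranks everything currently in service */
    }
    return -1;
}

/* CPU acknowledges IRQn: request bit cleared, in-service bit set. */
void pic_acknowledge(pic_model_t *pic, int irq) {
    assert(irq >= 0 && irq < 8);
    uint8_t bit = (uint8_t)(1u << irq);
    pic->irr &= (uint8_t)~bit;
    pic->isr |= bit;
}

/* Non-specific EOI: clears the highest-priority in-service bit. */
void pic_eoi_nonspecific(pic_model_t *pic) {
    pic->isr &= (uint8_t)(pic->isr - 1);  /* drop the lowest set bit */
}
```

With IRQ1 in service, a pending IRQ4 stays blocked until the EOI for IRQ1, whereas a newly raised IRQ0 is signalled immediately.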
```c
// 8259A PIC priority modes and configuration
// The PIC supports several priority modes via OCW2 and OCW3
// (Operation Command Words 2 and 3)

#include <stdint.h>

// Standard PC/AT command ports. outb()/inb() are the usual port I/O
// helpers, assumed to be provided elsewhere in the kernel.
#define PIC1_COMMAND 0x20
#define PIC2_COMMAND 0xA0
void outb(uint16_t port, uint8_t value);
uint8_t inb(uint16_t port);

// Fully Nested Mode (Default)
// - IRQ0 highest, IRQ7 lowest
// - While servicing IRQn, only IRQ0 through IRQ(n-1) can interrupt
// - In effect after initialization; OCW2 0x00 only clears automatic rotation
#define PIC_OCW2_FULLY_NESTED 0x00

// Specific EOI - Clear specific ISR bit
// Allows more control over when lower-priority IRQs can interrupt
void pic_specific_eoi(uint8_t irq) {
    if (irq >= 8) {
        outb(PIC2_COMMAND, 0x60 | ((irq - 8) & 0x07)); // Specific EOI on slave
        outb(PIC1_COMMAND, 0x60 | 0x02);               // EOI for cascade line
    } else {
        outb(PIC1_COMMAND, 0x60 | (irq & 0x07));       // Specific EOI for IRQn
    }
}

// Rotating Priority (Rotate on Non-Specific EOI)
// - After servicing IRQn, IRQn becomes lowest priority
// - Provides more fairness, prevents starvation
// - 0xA0 is a command, not a persistent mode: it acts as a non-specific EOI
//   that also rotates, so send it at the end of each handler instead of 0x20
void pic_rotating_eoi(uint8_t irq) {
    if (irq >= 8) {
        outb(PIC2_COMMAND, 0xA0); // Rotate on non-specific EOI (slave)
    }
    outb(PIC1_COMMAND, 0xA0);     // Rotate on non-specific EOI (master)
}

// Special Mask Mode
// - A handler masks its own level in the IMR and still receives every
//   other unmasked IRQ, including lower-priority ones
// - Useful for long-running handlers that should not hold off other IRQs
void pic_enable_special_mask(void) {
    outb(PIC1_COMMAND, 0x68); // OCW3: Set special mask mode
    outb(PIC2_COMMAND, 0x68);
}

// Polling Mode
// - Software polls the PIC for pending interrupts instead of taking INTR
// - Gives the OS full control over interrupt ordering
// - The read that follows the poll command counts as the acknowledge
uint8_t pic_poll(void) {
    outb(PIC1_COMMAND, 0x0C); // OCW3: Poll command
    uint8_t status = inb(PIC1_COMMAND);
    if (status & 0x80) {
        return status & 0x07; // Highest-priority pending IRQ
    }
    return 0xFF; // No interrupt pending
}
```

The 8259A's fixed priority requires careful IRQ assignment. If a device that generates many interrupts is on a high-priority IRQ, lower-priority devices may starve. The APIC's programmable priority levels address this limitation.
The Advanced Programmable Interrupt Controller (APIC) provides a sophisticated, programmable priority system that addresses the limitations of the 8259A PIC.
APIC Priority Concepts:
The APIC uses a 256-level priority scheme based directly on the interrupt vector number:
Higher vector numbers = Higher priority. Vector 255 is highest; vector 16 is lowest (vectors 0-15 are invalid for APIC).
| Priority Class | Vector Range | Typical Use |
|---|---|---|
| 15 (Highest) | 0xF0-0xFF | Spurious interrupt, critical system |
| 14 | 0xE0-0xEF | High-priority device interrupts |
| 13 | 0xD0-0xDF | Device interrupts |
| 12 | 0xC0-0xCF | Device interrupts |
| 11 | 0xB0-0xBF | Device interrupts |
| 10 | 0xA0-0xAF | Device interrupts |
| 9 | 0x90-0x9F | Device interrupts |
| 8 | 0x80-0x8F | Device interrupts |
| 7 | 0x70-0x7F | Device interrupts |
| 6 | 0x60-0x6F | Device interrupts |
| 5 | 0x50-0x5F | Device interrupts |
| 4 | 0x40-0x4F | Device interrupts |
| 3 | 0x30-0x3F | Low-priority devices, IPI |
| 2 | 0x20-0x2F | Legacy PIC IRQs (typical remap range) |
| 1 | 0x10-0x1F | Reserved for CPU exceptions (not for devices) |
| 0 (Lowest) | 0x00-0x0F | Invalid (CPU exceptions) |
Task Priority Register (TPR):
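The Local APIC's TPR sets a priority threshold for the current CPU: interrupts whose priority class is less than or equal to the class in the TPR are held pending, and only interrupts in a higher class are delivered.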
This allows fine-grained interrupt throttling without the global effect of disabling all interrupts.
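As a worked illustration of the threshold rule, the sketch below models the delivery decision in plain C. It assumes the usual definition of the processor priority class as the larger of the TPR class and the class of the highest vector currently in service. The function names are hypothetical and the code touches no hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Priority class = upper four bits of the vector. */
uint8_t vector_class(uint8_t vector) {
    return (uint8_t)(vector >> 4);
}

/* Processor priority class: the larger of the TPR class and the class of
 * the highest-priority vector in service (pass -1 if none is in service). */
uint8_t ppr_class(uint8_t tpr, int highest_in_service_vector) {
    uint8_t tpr_cls = (uint8_t)(tpr >> 4);
    uint8_t isr_cls = highest_in_service_vector < 0
                          ? 0
                          : vector_class((uint8_t)highest_in_service_vector);
    return tpr_cls > isr_cls ? tpr_cls : isr_cls;
}

/* A pending fixed interrupt is delivered only if its class is strictly
 * greater than the processor priority class. */
bool lapic_would_deliver(uint8_t vector, uint8_t tpr,
                         int highest_in_service_vector) {
    assert(vector >= 16);  /* vectors 0-15 are not valid APIC vectors */
    return vector_class(vector) > ppr_class(tpr, highest_in_service_vector);
}
```

For instance, with the TPR set to 0x80 a pending vector 0x80 is held back while 0x91 is delivered. With the TPR at 0 but vector 0x81 in service, a pending 0x85 also waits, because it is in the same class.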
```c
// APIC priority management

#include <stdint.h>

// Local APIC register offsets (memory-mapped)
#define LAPIC_TPR       0x080  // Task Priority Register
#define LAPIC_PPR       0x0A0  // Processor Priority Register (read-only)
#define LAPIC_EOI       0x0B0  // End of Interrupt Register
#define LAPIC_ISR_BASE  0x100  // In-Service Register (8 32-bit registers)
#define LAPIC_IRR_BASE  0x200  // Interrupt Request Register
#define LAPIC_ICR_LOW   0x300  // Interrupt Command Register (low)
#define LAPIC_ICR_HIGH  0x310  // Interrupt Command Register (high)

// LAPIC base address (memory-mapped, typically 0xFEE00000)
#define LAPIC_BASE ((volatile uint32_t*)0xFEE00000)

// Read/write LAPIC registers
static inline uint32_t lapic_read(uint32_t offset) {
    return LAPIC_BASE[offset / 4];
}

static inline void lapic_write(uint32_t offset, uint32_t value) {
    LAPIC_BASE[offset / 4] = value;
}

// Set Task Priority Register - block interrupts at or below this class
void lapic_set_priority(uint8_t priority_class) {
    // TPR format: bits 7:4 = priority class, bits 3:0 = subpriority
    // Setting only the class blocks all vectors in that class and below
    lapic_write(LAPIC_TPR, (uint32_t)priority_class << 4);
}

// Get current Processor Priority (what CPU is using for decisions)
uint8_t lapic_get_processor_priority(void) {
    return (lapic_read(LAPIC_PPR) >> 4) & 0x0F;
}

// Block all interrupts (maximum TPR)
void lapic_disable_interrupts(void) {
    lapic_set_priority(15);
}

// Allow all interrupts (minimum TPR)
void lapic_enable_interrupts(void) {
    lapic_set_priority(0);
}

// Example: Run critical section with elevated priority
void run_with_elevated_priority(void (*func)(void)) {
    uint32_t old_tpr = lapic_read(LAPIC_TPR);
    lapic_set_priority(14);            // Block everything below class 15
    func();                            // Execute critical section
    lapic_write(LAPIC_TPR, old_tpr);   // Restore
}

// Vector assignment strategy for optimal priority
#define VECTOR_TIMER    0xF0  // Highest device priority
#define VECTOR_IPI      0xE0  // Inter-processor interrupts
#define VECTOR_NIC      0x80  // Network - medium priority
#define VECTOR_DISK     0x70  // Disk - medium-low
#define VECTOR_USB      0x50  // USB - low
#define VECTOR_KEYBOARD 0x40  // Keyboard/mouse - lowest device
```

Interrupt masking is the ability to selectively enable or disable interrupts. Masking is essential for protecting critical sections, preventing race conditions, and managing interrupt load.
Levels of Masking:
Interrupt masking operates at multiple levels, each with different granularity and overhead:
| Level | Mechanism | Scope | Overhead | Use Case |
|---|---|---|---|---|
| CPU | CLI/STI (IF flag) | All maskable interrupts, this CPU | Very low (~1 cycle) | Brief critical sections |
| CPU | TPR (APIC) | Per-priority class, this CPU | Low (~10 cycles) | Priority-based blocking |
| Device | Device-specific register | Single device | Medium (I/O) | Device-specific control |
| PIC/APIC | IMR/LVT mask bit | Single IRQ line | Medium (I/O) | Disable a specific interrupt |
| APIC | Destination mask | Per-CPU delivery | Medium | CPU affinity control |
CLI (Clear Interrupt Flag) and STI (Set Interrupt Flag) provide the fastest masking mechanism:
```c
// CPU-level interrupt masking (x86-64, GCC/Clang inline assembly)

// Disable all maskable interrupts
static inline void cli(void) {
    asm volatile("cli" ::: "memory");
}

// Enable all maskable interrupts
static inline void sti(void) {
    asm volatile("sti" ::: "memory");
}

// Save flags and disable interrupts - returns previous state
static inline unsigned long irq_save(void) {
    unsigned long flags;
    asm volatile(
        "pushfq\n\t"
        "pop %0\n\t"
        "cli"
        : "=r"(flags)
        :
        : "memory"
    );
    return flags;
}

// Restore previous interrupt state
static inline void irq_restore(unsigned long flags) {
    asm volatile(
        "push %0\n\t"
        "popfq"
        :
        : "r"(flags)
        : "memory", "cc"
    );
}

// Safe critical section pattern (nestable)
void critical_section(void) {
    unsigned long flags = irq_save();  // May already be disabled

    // Critical code here...
    // No interrupts can preempt this

    irq_restore(flags);  // Restore previous state
}
```

Keep interrupt-disabled sections as short as possible. Long CLI periods cause interrupt latency, missed timer ticks, and poor system responsiveness. Linux ships an irqsoff latency tracer specifically to find the longest interrupts-off windows.
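Why save and restore rather than a plain CLI/STI pair? The sketch below models the interrupt flag as an ordinary variable so it can run in user space. All names are illustrative stand-ins for the real instructions. It shows that an inner helper ending in an unconditional "sti" re-enables interrupts while the outer critical section is still active.

```c
#include <assert.h>
#include <stdbool.h>

bool interrupts_enabled = true;  /* stand-in for the CPU's IF flag */

void model_cli(void) { interrupts_enabled = false; }
void model_sti(void) { interrupts_enabled = true; }

/* Save the current state and disable; mirrors the save-flags pattern. */
bool model_irq_save(void) {
    bool was_enabled = interrupts_enabled;
    interrupts_enabled = false;
    return was_enabled;
}

void model_irq_restore(bool was_enabled) {
    interrupts_enabled = was_enabled;
}

/* Inner helper written with cli/sti: unconditionally re-enables. */
void inner_cli_sti(void) {
    model_cli();
    model_sti();
}

/* Inner helper written with save/restore: leaves the state as found. */
void inner_save_restore(void) {
    bool flags = model_irq_save();
    model_irq_restore(flags);
}

/* Runs `inner` inside an outer critical section and reports whether
 * interrupts were enabled again before the outer section ended. */
bool outer_section_leaks(void (*inner)(void)) {
    assert(inner != 0);
    bool flags = model_irq_save();
    inner();
    bool leaked = interrupts_enabled;  /* should still be false here */
    model_irq_restore(flags);
    return leaked;
}
```

`outer_section_leaks(inner_cli_sti)` reports a leak and `outer_section_leaks(inner_save_restore)` does not, which is why the save/restore form is the one that nests safely.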
Priority inversion occurs when a high-priority task is blocked waiting for a resource held by a low-priority task. This effectively inverts the intended priority ordering and can cause serious problems, including system hangs.
Classic Interrupt Priority Inversion Scenario:
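Suppose process-context code takes a lock and is then interrupted, on the same CPU, by a handler that needs the same lock. The handler spins forever, because the code that could release the lock cannot run until the handler returns.
This is why interrupt handlers typically avoid locks, use lock-free data structures, or use special spinlocks with interrupt masking.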
In 1997, the Mars Pathfinder spacecraft experienced repeated system resets due to priority inversion. A low-priority information gathering task held a mutex needed by a high-priority bus management task. A medium-priority communications task would run, preventing the low-priority task from completing and releasing the mutex. The fix was enabling priority inheritance in the real-time OS.
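The effect of priority inheritance can be seen in a toy scheduler model. This is a sketch under simplifying assumptions (one mutex, one waiter, no real threads), and every type and function name in it is hypothetical. It is not how VxWorks or any particular RTOS implements the feature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    const char *name;
    int base_priority;       /* larger = more important */
    int effective_priority;  /* may be boosted by inheritance */
    bool ready;              /* false while blocked on the mutex */
} task_t;

typedef struct {
    task_t *owner;  /* NULL when free */
    bool inherit;   /* is priority inheritance enabled? */
} pi_mutex_t;

/* Try to take the mutex. On contention the caller blocks and, if
 * inheritance is on, lends its priority to the current owner. */
bool pi_mutex_lock(pi_mutex_t *m, task_t *t) {
    if (m->owner == NULL) {
        m->owner = t;
        return true;
    }
    t->ready = false;
    if (m->inherit && t->effective_priority > m->owner->effective_priority)
        m->owner->effective_priority = t->effective_priority;
    return false;
}

/* Release the mutex, drop any boost, and hand over to `waiter` (or NULL). */
void pi_mutex_unlock(pi_mutex_t *m, task_t *waiter) {
    assert(m->owner != NULL);
    m->owner->effective_priority = m->owner->base_priority;
    m->owner = waiter;
    if (waiter != NULL)
        waiter->ready = true;
}

/* Scheduler: the ready task with the highest effective priority runs. */
task_t *pick_next(task_t *tasks, size_t n) {
    task_t *best = NULL;
    for (size_t i = 0; i < n; i++) {
        if (tasks[i].ready &&
            (best == NULL ||
             tasks[i].effective_priority > best->effective_priority))
            best = &tasks[i];
    }
    return best;
}
```

With three tasks (low, medium, high), once the low task owns the mutex and the high task blocks on it, `pick_next` chooses the medium task when `inherit` is false, so the high task waits indefinitely. With `inherit` set, the low task runs at the borrowed priority, releases the mutex, and the high task proceeds.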
Solutions to Priority Inversion:
```c
// Spinlock with interrupt disabling - prevents priority inversion
// Relies on irq_save()/irq_restore() from the CPU-level masking code

#include <stdbool.h>
#include <stdint.h>

typedef struct {
    volatile bool locked;  // byte-sized flag, as __atomic_test_and_set expects
} spinlock_t;

// Acquire spinlock, saving and disabling interrupts
void spin_lock_irqsave(spinlock_t *lock, unsigned long *flags) {
    *flags = irq_save();  // Disable interrupts, save previous state

    // Spin until we acquire the lock
    while (__atomic_test_and_set(&lock->locked, __ATOMIC_ACQUIRE)) {
        // Hint to CPU that we're spinning
        asm volatile("pause" ::: "memory");
    }
}

// Release spinlock, restoring interrupt state
void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags) {
    __atomic_clear(&lock->locked, __ATOMIC_RELEASE);
    irq_restore(flags);
}

// Placeholder types so the usage example is complete
struct device_state { uint64_t receive_count; };
struct device_stats { uint64_t receive_count; };

// Usage example: protecting shared data accessed by interrupt handler
static spinlock_t device_lock;
static struct device_state shared_state;

// Called from interrupt handler
void device_interrupt_handler(void) {
    unsigned long flags;
    spin_lock_irqsave(&device_lock, &flags);

    // Access shared_state safely
    shared_state.receive_count++;

    spin_unlock_irqrestore(&device_lock, flags);
}

// Called from process context (can be preempted)
void device_read_stats(struct device_stats *stats) {
    unsigned long flags;
    spin_lock_irqsave(&device_lock, &flags);

    // No interrupt can preempt and modify while we read
    stats->receive_count = shared_state.receive_count;

    spin_unlock_irqrestore(&device_lock, flags);
}
```

Operating systems implement interrupt priority at multiple levels, combining hardware priority with software policy to achieve responsiveness, fairness, and stability.
Linux Interrupt Processing Model:
Linux divides interrupt handling into layers with different priorities:
| Layer | Priority | Context | Can Sleep? | Typical Use |
|---|---|---|---|---|
| NMI | Highest | NMI | No | Watchdog, profiling, fatal errors |
| Hard IRQ | Very High | Interrupt | No | Device interrupt acknowledgment |
| Soft IRQ | High | Softirq | No | Network packet processing, timers |
| Tasklet | High | Softirq | No | Per-device deferred processing |
| Workqueue | Medium | Process | Yes | Extended device processing |
| Kernel Thread | Varies | Process | Yes | Background kernel work |
| User Process | Lowest | Process | Yes | Application code |
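The split between a short high-priority layer and a longer deferred layer can be sketched with a small queue: the top half only records that something happened, and a bottom half does the real work later. This is a single-threaded illustration with hypothetical names. A real kernel would add atomics or locking and would use its own softirq or workqueue machinery instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WORK_QUEUE_SLOTS 8  /* arbitrary capacity for the sketch */

typedef struct {
    int items[WORK_QUEUE_SLOTS];
    size_t head;  /* count of events consumed so far */
    size_t tail;  /* count of events recorded so far */
} work_queue_t;

/* "Top half": runs at hard-IRQ priority, so it only records the event.
 * Returns false (event dropped) when the queue is full. */
bool top_half_record(work_queue_t *q, int event) {
    if (q->tail - q->head == WORK_QUEUE_SLOTS)
        return false;
    q->items[q->tail % WORK_QUEUE_SLOTS] = event;
    q->tail++;
    return true;
}

/* "Bottom half": runs later at lower priority and does the real work.
 * Returns the number of events processed. */
size_t bottom_half_drain(work_queue_t *q, void (*process)(int event)) {
    assert(process != NULL);
    size_t handled = 0;
    while (q->head != q->tail) {
        process(q->items[q->head % WORK_QUEUE_SLOTS]);
        q->head++;
        handled++;
    }
    return handled;
}

/* Example consumer: adds up the event values it is given. */
int processed_sum = 0;
void sum_event(int event) { processed_sum += event; }
```

Recording three events and then calling `bottom_half_drain(&q, sum_event)` processes all three in arrival order. If the bottom half falls behind by more than `WORK_QUEUE_SLOTS` events, `top_half_record` starts dropping them, which is the overload behavior the higher layers have to be designed around.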
Windows Interrupt Request Levels (IRQLs):
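Windows defines explicit Interrupt Request Levels (IRQLs) that formalize priority, rising from PASSIVE_LEVEL (ordinary thread execution) through APC_LEVEL and DISPATCH_LEVEL (the scheduler and deferred procedure calls) to the device IRQLs and finally HIGH_LEVEL.
Code running at a given IRQL can only be preempted by interrupts at a higher IRQL.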
When assigning interrupt vectors in custom systems: place time-critical devices (NIC, storage) in the high priority classes (vectors 0xA0-0xEF), place interactive devices (keyboard, mouse, GPU) in the medium classes (0x50-0x8F), and place background devices (sensors, USB) in the lower classes (0x30-0x4F). Leave 0xF0-0xFF for system-critical interrupts.
Real-Time Systems have strict timing requirements where interrupt latency must be bounded and predictable. This requires special priority management beyond what general-purpose systems provide.
Hard Real-Time Requirements:
| Application | Max Latency | Consequence of Failure | System Type |
|---|---|---|---|
| Anti-lock Brakes (ABS) | < 100 μs | Vehicle collision | Hard real-time |
| Industrial Robot Control | < 1 ms | Damaged products, injury | Hard real-time |
| Audio Processing | < 5 ms | Audible glitches | Soft real-time |
| Video Streaming | < 33 ms | Dropped frames | Soft real-time |
| Desktop UI | < 50 ms | User-perceived lag | Best-effort |
| Background Download | < 1 s | Slower throughput | Best-effort |
Real-Time Linux (PREEMPT_RT):
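The PREEMPT_RT patch set moves Linux toward hard real-time behavior by running most interrupt handlers in schedulable kernel threads, converting most kernel spinlocks into preemptible locks with priority inheritance, and making nearly all kernel code preemptible.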
```c
// Threaded interrupt handler (Linux PREEMPT_RT-friendly)
// struct my_device, INTERRUPT_PENDING, and the device_*() and
// process_packet() helpers stand in for driver-specific code.

#include <linux/interrupt.h>

// Minimal hard IRQ handler - just checks if device interrupted us
static irqreturn_t device_hard_irq(int irq, void *dev_id) {
    struct my_device *dev = dev_id;

    // Quick check: did OUR device interrupt?
    if (!(device_read_status(dev) & INTERRUPT_PENDING)) {
        return IRQ_NONE;  // Not our interrupt
    }

    // Mask device interrupt to prevent immediate re-fire
    device_mask_interrupt(dev);

    // Return: schedule threaded handler
    return IRQ_WAKE_THREAD;
}

// Threaded handler - runs in schedulable kernel thread context
static irqreturn_t device_threaded_irq(int irq, void *dev_id) {
    struct my_device *dev = dev_id;

    // Full interrupt processing - can take time
    // Runs with interrupts enabled, can be preempted by higher-priority threads
    while (device_has_data(dev)) {
        struct data_packet *pkt = device_read_packet(dev);
        process_packet(dev, pkt);
    }

    // Re-enable device interrupt
    device_unmask_interrupt(dev);

    return IRQ_HANDLED;
}

// Register threaded interrupt handler
int setup_device_interrupt(struct my_device *dev) {
    return request_threaded_irq(
        dev->irq,             // IRQ number
        device_hard_irq,      // Fast hardirq handler
        device_threaded_irq,  // Threaded handler
        IRQF_SHARED,          // Flags
        "my_device",          // Name
        dev                   // Device pointer
    );
}

// Benefit: The threaded handler can be assigned a real-time priority
// to ensure timely processing without blocking other interrupts
```

PREEMPT_RT Linux can achieve worst-case latencies under 100 microseconds on commodity hardware, which suits many real-time applications. For much tighter requirements, a dedicated RTOS such as VxWorks or QNX, or bare-metal firmware, is usually needed.
We've explored interrupt priority—the mechanisms that ensure critical events receive timely attention while maintaining system stability. From hardware priority in PICs to software priority policies in operating systems, priority management is essential for responsive, reliable systems.
Module Complete:
You have now completed the Interrupts and Exceptions module.
These concepts form the foundation of how operating systems interact with hardware and respond to exceptional conditions—essential knowledge for kernel development, driver programming, and systems debugging.