Interrupts And Exceptions - Learning Module

Interrupt Priority

The Triage Problem: Which Interrupt First?

A keyboard keystroke arrives. A network packet is received. The timer fires. A disk read completes. A power failure is detected. All within the same microsecond.

Which interrupt should the CPU service first?

This is the interrupt priority problem. Not all interrupts are equally urgent. A power failure warning demands immediate attention—the system has milliseconds to save state. A keyboard keystroke can wait a few milliseconds without anyone noticing. The network packet is somewhere in between.

Interrupt priority is the mechanism that ensures the most critical events are handled first, that less important interrupts don't starve out more important ones, and that the interrupt handling system itself doesn't collapse under load.

What You Will Learn

By the end of this page, you will understand the complete interrupt priority architecture: hardware priority mechanisms in PICs and APICs, the concept of interrupt masks, priority-based preemption, and how operating systems manage priority to ensure responsiveness while preventing starvation. You'll learn practical priority assignment strategies used in production systems.

The Need for Interrupt Priority

Without priority, interrupts would be handled in simple arrival order (FIFO). This seems fair, but causes serious problems:

Scenario: Low-Priority Interrupt Blocks Critical Event

Network device generates interrupt (moderate priority)
Handler begins processing packets—takes several milliseconds
Power failure detected (critical priority)—but interrupts are disabled during handler!
By the time network handler finishes, power is lost
System crashes without saving state

Priority solves this by allowing:

Critical interrupts to preempt lower-priority handlers
The system to prioritize time-sensitive events
Graceful handling of interrupt storms (ignore low-priority floods)

Interrupt Urgency Examples
Source	Urgency	Consequence of Delay	Typical Priority
NMI / Power Failure	Critical	Data loss, system damage	Highest (unmaskable)
Machine Check	Critical	Hardware fault escalation	Very High
Timer	High	Scheduler inaccuracy, timing errors	High
Disk I/O Complete	Medium	Process blocked longer	Medium
Network Packet	Medium	Increased latency, possible packet loss	Medium
Keyboard/Mouse	Low-Medium	User perceived delay	Medium-Low
USB Device	Low	Device response delay	Low

Priority is a Tradeoff

High priority means faster service but can starve lower-priority interrupts. Too many high-priority interrupts can cause livelock (CPU can't run user code). Priority assignment is a balance between responsiveness for critical events and fairness for routine ones.

8259 PIC Priority Mechanism

The Intel 8259A Programmable Interrupt Controller implements a fixed priority scheme by default. Understanding this legacy architecture illuminates fundamental priority concepts.

Default Priority Ordering:

In the default mode, IRQ numbers directly determine priority:

IRQ0 = Highest priority
IRQ7 = Lowest priority (on master)
IRQ8 = Higher than IRQ3-7 (slave connects through IRQ2)

The priority ordering for the cascaded PC/AT configuration:

IRQ0 > IRQ1 > IRQ2(8-15) > IRQ3 > IRQ4 > IRQ5 > IRQ6 > IRQ7

Within the slave (IRQ2):

IRQ8 > IRQ9 > IRQ10 > IRQ11 > IRQ12 > IRQ13 > IRQ14 > IRQ15

PC/AT IRQ Priority (Highest to Lowest)
Rank	IRQ	Default Device	Priority Reason
1	IRQ0	System Timer	Critical for timekeeping, scheduling
2	IRQ1	Keyboard	User input must be responsive
3	IRQ8	Real-Time Clock	Time-critical periodic events
4	IRQ9	Available / ACPI	System management events
5	IRQ10	Available	PCI devices
6	IRQ11	Available	PCI devices
7	IRQ12	PS/2 Mouse	User input
8	IRQ13	FPU	Math coprocessor errors
9	IRQ14	Primary IDE	Disk operations
10	IRQ15	Secondary IDE	Secondary disk/CD-ROM
11	IRQ3	COM2/4	Serial communication
12	IRQ4	COM1/3	Serial/modem
13	IRQ5	LPT2/Sound	Audio/parallel port
14	IRQ6	Floppy	Floppy disk (legacy)
15	IRQ7	LPT1	Printer (lowest priority)
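
The ranking in the table follows mechanically from the cascade wiring: the slave's eight lines inherit the priority slot of IRQ2. A minimal sketch of that mapping is shown below. The helper names `pic_irq_rank` and `pic_irq_outranks` are hypothetical, chosen for illustration only, and the ranks assume the default fully nested ordering described above.

```c
#include <stdint.h>

// Hypothetical helper: rank of an IRQ in the cascaded PC/AT layout,
// where 1 is the highest priority and 15 the lowest.
// Returns 0 for IRQ2 (the cascade input) and for invalid IRQ numbers.
unsigned pic_irq_rank(uint8_t irq) {
    if (irq <= 1)              return irq + 1u;        // IRQ0, IRQ1 -> ranks 1, 2
    if (irq >= 8 && irq <= 15) return irq - 8u + 3u;   // IRQ8-15   -> ranks 3-10
    if (irq >= 3 && irq <= 7)  return irq - 3u + 11u;  // IRQ3-7    -> ranks 11-15
    return 0;                                          // IRQ2 or out of range
}

// Nonzero if IRQ a has higher priority than IRQ b in the default ordering.
int pic_irq_outranks(uint8_t a, uint8_t b) {
    unsigned ra = pic_irq_rank(a);
    unsigned rb = pic_irq_rank(b);
    return ra != 0 && rb != 0 && ra < rb;
}
```

For example, `pic_irq_outranks(12, 4)` is true: the PS/2 mouse on the slave outranks COM1 on the master, even though its IRQ number is larger.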

How PIC Priority Works:

The 8259A uses the In-Service Register (ISR) to implement priority:

When an interrupt is acknowledged, its bit is set in the ISR
While an ISR bit is set, IRQs of the same or lower priority are blocked
Higher-priority IRQs can still interrupt
When handler sends EOI, ISR bit is cleared
Blocked lower-priority IRQs can now be serviced
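
The steps above can be modelled in a few lines of ordinary C, which makes the blocking rule easy to experiment with away from real hardware. This is only a sketch of a single 8259A in fully nested mode: the `pic_model_*` names are hypothetical, and the model ignores edge/level triggering and the cascade.

```c
#include <stdint.h>

// Software model of one 8259A in fully nested mode (illustrative only).
typedef struct {
    uint8_t irr;  // Interrupt Request Register: pending IRQs
    uint8_t isr;  // In-Service Register: IRQs currently being serviced
    uint8_t imr;  // Interrupt Mask Register: 1 = IRQ masked
} pic_model_t;

// Index of the lowest set bit (the highest priority), or -1 if none is set.
static int lowest_set_bit(uint8_t bits) {
    for (int i = 0; i < 8; i++)
        if (bits & (1u << i)) return i;
    return -1;
}

// A device asserts its IRQ line.
void pic_model_raise(pic_model_t *p, int irq) {
    p->irr |= (uint8_t)(1u << irq);
}

// CPU acknowledge cycle: returns the IRQ to deliver, or -1 if nothing
// may interrupt the handler that is currently in service.
int pic_model_ack(pic_model_t *p) {
    int req = lowest_set_bit(p->irr & (uint8_t)~p->imr);
    int cur = lowest_set_bit(p->isr);
    if (req < 0) return -1;                 // nothing pending and unmasked
    if (cur >= 0 && req >= cur) return -1;  // same or lower priority: blocked
    p->irr &= (uint8_t)~(1u << req);        // request is consumed
    p->isr |= (uint8_t)(1u << req);         // and becomes in-service
    return req;
}

// Non-specific EOI: clear the highest-priority in-service bit.
void pic_model_eoi(pic_model_t *p) {
    int cur = lowest_set_bit(p->isr);
    if (cur >= 0) p->isr &= (uint8_t)~(1u << cur);
}
```

Raising IRQ4, acknowledging it, and then raising IRQ6 leaves IRQ6 pending until the IRQ4 handler sends its EOI, while an IRQ1 raised in the meantime nests immediately.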

pic_priority.c
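// 8259A PIC priority modes and configuration
#include <stdint.h>

// Standard PC/AT command ports. outb()/inb() are assumed to be the
// kernel's port I/O helpers, with the argument order outb(port, value).
#define PIC1_COMMAND  0x20  // Master PIC command port
#define PIC2_COMMAND  0xA0  // Slave PIC command port

// The PIC supports several priority modes via OCW2 and OCW3
// (Operation Command Words 2 and 3)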
 
// Fully Nested Mode (Default)
// - IRQ0 highest, IRQ7 lowest
// - While servicing IRQn, only IRQ0 through IRQ(n-1) can interrupt
#define PIC_OCW2_FULLY_NESTED     0x00
 
// Specific EOI - Clear specific ISR bit
// Allows more control over when lower-priority IRQs can interrupt
void pic_specific_eoi(uint8_t irq) {
    uint8_t ocw2 = 0x60 | (irq & 0x07);  // Specific EOI for IRQn
    if (irq >= 8) {
        outb(PIC2_COMMAND, 0x60 | ((irq - 8) & 0x07));
        outb(PIC1_COMMAND, 0x60 | 0x02);  // EOI for cascade line
    } else {
        outb(PIC1_COMMAND, ocw2);
    }
}
 
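// Rotating Priority (Rotate on Non-Specific EOI)
// - 0xA0 is itself an EOI command, not a mode switch: it clears the
//   highest-priority ISR bit and makes that IRQ the lowest priority
// - Send it in place of the normal EOI at the end of each handler
// - Provides more fairness, prevents starvation
void pic_rotating_eoi(uint8_t irq) {
    if (irq >= 8) {
        outb(PIC2_COMMAND, 0xA0);         // Rotate on non-specific EOI (slave)
        outb(PIC1_COMMAND, 0x60 | 0x02);  // Specific EOI for cascade line
    } else {
        outb(PIC1_COMMAND, 0xA0);         // Rotate on non-specific EOI (master)
    }
}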
 
// Special Mask Mode
// - While set, an in-service IRQ no longer blocks lower-priority IRQs;
//   only IRQs masked in the IMR are inhibited
// - Useful for long-running handlers: mask your own IRQ, then let the others in
void pic_enable_special_mask(void) {
    // Set Special Mask Mode
    outb(PIC1_COMMAND, 0x68);  // OCW3: Set special mask mode
    outb(PIC2_COMMAND, 0x68);
}
 
// Polling Mode
// - Disable automatic interrupt signaling
// - Software polls PIC for pending interrupts
// - Gives OS full control over interrupt ordering
uint8_t pic_poll(void) {
    outb(PIC1_COMMAND, 0x0C);  // OCW3: Poll command
    uint8_t status = inb(PIC1_COMMAND);
    if (status & 0x80) {
        return status & 0x07;  // Return highest-priority pending IRQ
    }
    return 0xFF;  // No interrupt pending
}

PIC Priority Limitations

The 8259A's fixed priority requires careful IRQ assignment. If a device that generates many interrupts is on a high-priority IRQ, lower-priority devices may starve. The APIC's programmable priority levels address this limitation.
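
The starvation risk, and the effect of rotating priority on it, can be seen in a small simulation. The sketch below is a simplified model, not 8259A driver code: `pic_pick_rotating` and `pic_simulate` are hypothetical names, and every IRQ in `always_pending` is assumed to re-assert immediately after being serviced.

```c
#include <stdint.h>

// Pick the highest-priority pending IRQ when `lowest` currently holds the
// lowest priority. Priority descends lowest+1, lowest+2, ... wrapping
// modulo 8, so lowest = 7 gives the fixed order IRQ0 > IRQ1 > ... > IRQ7.
// Returns -1 if nothing is pending.
int pic_pick_rotating(uint8_t pending, int lowest) {
    for (int step = 1; step <= 8; step++) {
        int irq = (lowest + step) % 8;
        if (pending & (1u << irq)) return irq;
    }
    return -1;
}

// Service `rounds` interrupts while every IRQ in `always_pending` stays
// asserted, counting how often each IRQ is serviced.
// rotate != 0 models automatic rotation after each service.
void pic_simulate(uint8_t always_pending, int rounds, int rotate,
                  unsigned counts[8]) {
    int lowest = 7;  // default: IRQ7 has the lowest priority
    for (int i = 0; i < 8; i++) counts[i] = 0;
    for (int r = 0; r < rounds; r++) {
        int irq = pic_pick_rotating(always_pending, lowest);
        if (irq < 0) return;
        counts[irq]++;
        if (rotate) lowest = irq;  // serviced IRQ drops to lowest priority
    }
}
```

With IRQ0 and IRQ3 both permanently asserted (`always_pending = 0x09`), fixed priority services only IRQ0, whereas rotation alternates between the two.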

APIC Priority Architecture

The Advanced Programmable Interrupt Controller (APIC) provides a sophisticated, programmable priority system that addresses the limitations of the 8259A PIC.

APIC Priority Concepts:

The APIC uses a 256-level priority scheme based directly on the interrupt vector number:

Priority Class: Bits 7:4 of the vector (16 classes)
Sub-priority: Bits 3:0 of the vector (16 levels per class)

Higher vector numbers = Higher priority. Vector 255 is highest; vector 16 is lowest (vectors 0-15 are invalid for APIC).

APIC Priority Classes
Priority Class	Vector Range	Typical Use
15 (Highest)	0xF0-0xFF	Spurious interrupt, critical system
14	0xE0-0xEF	High-priority device interrupts
13	0xD0-0xDF	Device interrupts
12	0xC0-0xCF	Device interrupts
11	0xB0-0xBF	Device interrupts
10	0xA0-0xAF	Device interrupts
9	0x90-0x9F	Device interrupts
8	0x80-0x8F	Device interrupts
7	0x70-0x7F	Device interrupts
6	0x60-0x6F	Device interrupts
5	0x50-0x5F	Device interrupts
4	0x40-0x4F	Device interrupts
3	0x30-0x3F	Low-priority devices, IPI
2	0x20-0x2F	Legacy IRQs (remapped 8259 PIC), low-priority devices
1	0x10-0x1F	Reserved for CPU exceptions (not used for device interrupts)
0 (Lowest)	0x00-0x0F	Invalid (CPU exceptions)
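
Because the priority is encoded in the vector number itself, extracting it is pure bit manipulation. The helpers below are a minimal sketch with hypothetical names (`apic_priority_class` and friends), not part of any existing API.

```c
#include <stdint.h>

// Priority class: upper four bits of the vector (0-15).
uint8_t apic_priority_class(uint8_t vector) {
    return (uint8_t)(vector >> 4);
}

// Sub-priority within the class: lower four bits of the vector (0-15).
uint8_t apic_sub_priority(uint8_t vector) {
    return (uint8_t)(vector & 0x0F);
}

// Vectors 0-15 are invalid for the APIC.
int apic_vector_valid(uint8_t vector) {
    return vector >= 16;
}

// Nonzero if vector a has higher priority than vector b.
// Comparing the raw vectors compares class first, then sub-priority.
int apic_outranks(uint8_t a, uint8_t b) {
    return a > b;
}
```

For instance, vector 0x83 belongs to class 8 with sub-priority 3, so it outranks 0x80 within the same class and everything in classes 0 through 7.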

Task Priority Register (TPR):

The Local APIC's TPR sets a priority threshold for the current CPU:

If TPR = N, interrupts with priority class ≤ N are blocked
Only interrupts with priority class > TPR are delivered
Setting TPR = 0 allows all interrupts
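Setting TPR = 15 blocks all vectored interrupts delivered through the local APIC (similar in effect to CLI, although NMI, SMI, and INIT are unaffected and the IF flag is left untouched)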

This allows fine-grained interrupt throttling without the global effect of disabling all interrupts.
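
The threshold rule can be written as a small predicate. The sketch below models only the priority class comparison. It also folds in the in-service vector, because the processor priority that the local APIC actually compares against is the higher of the TPR class and the class of the highest-priority interrupt being serviced. The function names are hypothetical, and `in_service_vector` is assumed to be 0 when no handler is running.

```c
#include <stdint.h>

// Class part of the processor priority: the higher of the TPR class and
// the class of the highest-priority in-service vector (0 if none).
uint8_t lapic_model_ppr_class(uint8_t tpr, uint8_t in_service_vector) {
    uint8_t tpr_class = (uint8_t)(tpr >> 4);
    uint8_t isr_class = (uint8_t)(in_service_vector >> 4);
    return tpr_class > isr_class ? tpr_class : isr_class;
}

// Nonzero if a fixed interrupt with this vector would be delivered now.
// `tpr` is the raw register value: class in bits 7:4.
int lapic_model_deliverable(uint8_t tpr, uint8_t in_service_vector,
                            uint8_t vector) {
    return (vector >> 4) > lapic_model_ppr_class(tpr, in_service_vector);
}
```

With `tpr = 0x80`, vector 0x8F is held back (class 8 is not above 8) while 0x90 is delivered. A handler running at vector 0xA5 has the same effect on other class 10 vectors even when the TPR is 0.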

apic_priority.c
// APIC priority management
#include <stdint.h>
 
// Local APIC register offsets (memory-mapped)
#define LAPIC_TPR       0x080  // Task Priority Register
#define LAPIC_PPR       0x0A0  // Processor Priority Register (read-only)
#define LAPIC_EOI       0x0B0  // End of Interrupt Register
#define LAPIC_ISR_BASE  0x100  // In-Service Register (8 32-bit registers)
#define LAPIC_IRR_BASE  0x200  // Interrupt Request Register
#define LAPIC_ICR_LOW   0x300  // Interrupt Command Register (low)
#define LAPIC_ICR_HIGH  0x310  // Interrupt Command Register (high)
 
// LAPIC base address (memory-mapped, typically 0xFEE00000)
#define LAPIC_BASE  ((volatile uint32_t*)0xFEE00000)
 
// Read/write LAPIC registers
static inline uint32_t lapic_read(uint32_t offset) {
    return LAPIC_BASE[offset / 4];
}
 
static inline void lapic_write(uint32_t offset, uint32_t value) {
    LAPIC_BASE[offset / 4] = value;
}
 
// Set Task Priority Register - block interrupts at or below this priority class
void lapic_set_priority(uint8_t priority_class) {
    // TPR format: bits 7:4 = priority class, bits 3:0 = subpriority
    // Setting only the class effectively blocks all vectors in lower classes
    lapic_write(LAPIC_TPR, (uint32_t)priority_class << 4);
}
 
// Get current Processor Priority (what CPU is using for decisions)
uint8_t lapic_get_processor_priority(void) {
    return (lapic_read(LAPIC_PPR) >> 4) & 0x0F;
}
 
// Block all vectored APIC interrupts (maximum TPR); NMI, SMI, and INIT are unaffected
void lapic_disable_interrupts(void) {
    lapic_set_priority(15);
}
 
// Allow all interrupts (minimum TPR)
void lapic_enable_interrupts(void) {
    lapic_set_priority(0);
}
 
// Example: Run critical section with elevated priority
void run_with_elevated_priority(void (*func)(void)) {
    uint32_t old_tpr = lapic_read(LAPIC_TPR);  // Save the full register value
    lapic_set_priority(14);  // Block classes 0-14; only class 15 (0xF0-0xFF) gets through
    
    func();  // Execute critical section
    
    lapic_write(LAPIC_TPR, old_tpr);  // Restore
}
 
// Vector assignment strategy for optimal priority
#define VECTOR_TIMER     0xF0  // Highest device priority
#define VECTOR_IPI       0xE0  // Inter-processor interrupts
#define VECTOR_NIC       0x80  // Network - medium priority
#define VECTOR_DISK      0x70  // Disk - medium-low
#define VECTOR_USB       0x50  // USB - low
#define VECTOR_KEYBOARD  0x40  // Keyboard/mouse - lowest device

Interrupt Masking: Selective Interrupt Control

Interrupt masking is the ability to selectively enable or disable interrupts. Masking is essential for protecting critical sections, preventing race conditions, and managing interrupt load.

Levels of Masking:

Interrupt masking operates at multiple levels, each with different granularity and overhead:

Interrupt Masking Levels
Level	Mechanism	Scope	Overhead	Use Case
CPU	CLI/STI (IF flag)	All maskable interrupts, this CPU	Very low (a few cycles)	Brief critical sections
CPU	TPR (APIC)	Per-priority class, this CPU	Low (one register write)	Priority-based blocking
Device	Device-specific register	Single device	Medium (I/O)	Device-specific control
PIC/APIC	IMR/LVT mask bit	Single IRQ line	Medium (I/O)	Disable a specific interrupt
APIC	Destination mask	Per-CPU delivery	Medium	CPU affinity control

CLI (Clear Interrupt Flag) and STI (Set Interrupt Flag) provide the fastest masking mechanism:

cpu_masking.c
// CPU-level interrupt masking
 
// Disable all maskable interrupts
static inline void cli(void) {
    asm volatile("cli" ::: "memory");
}
 
// Enable all maskable interrupts
static inline void sti(void) {
    asm volatile("sti" ::: "memory");
}
 
// Save flags and disable interrupts - returns previous state
static inline unsigned long irq_save(void) {
    unsigned long flags;
    asm volatile(
        "pushfq\n\t"   // Push RFLAGS onto the stack
        "pop %0\n\t"   // Pop it into the output register
        "cli"          // Clear IF: disable maskable interrupts
        : "=r"(flags)
        :
        : "memory"
    );
    return flags;
}
 
// Restore previous interrupt state
static inline void irq_restore(unsigned long flags) {
    asm volatile(
        "push %0\n\t"  // Push the saved RFLAGS value
        "popfq"        // Pop it back into RFLAGS, restoring IF
        :
        : "r"(flags)
        : "memory", "cc"
    );
}
 
// Safe critical section pattern (nestable)
void critical_section(void) {
    unsigned long flags = irq_save();  // May already be disabled
    
    // Critical code here...
    // No interrupts can preempt this
    
    irq_restore(flags);  // Restore previous state
}
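
Why save and restore rather than a bare CLI/STI pair? The difference only shows up when critical sections nest. The user-space sketch below replaces the real IF flag with a boolean so that it can run anywhere. All `sim_*` names are hypothetical stand-ins for the real primitives above.

```c
#include <stdbool.h>

static bool sim_if = true;  // Simulated IF flag: true = interrupts enabled

// Model of irq_save(): remember the old state, then disable.
bool sim_irq_save(void) { bool old = sim_if; sim_if = false; return old; }

// Model of irq_restore(): put back whatever state was saved.
void sim_irq_restore(bool old) { sim_if = old; }

bool sim_irqs_enabled(void) { return sim_if; }

// Inner critical section written with save/restore (nestable).
void sim_inner_nestable(void) {
    bool flags = sim_irq_save();
    // ... short critical work ...
    sim_irq_restore(flags);
}

// Inner critical section written with a bare cli/sti pair (not nestable).
void sim_inner_cli_sti(void) {
    sim_if = false;  // cli
    // ... short critical work ...
    sim_if = true;   // sti: unconditionally re-enables interrupts
}

// Run `inner` inside an outer critical section and report whether the
// outer section was still protected after `inner` returned.
bool sim_outer_still_protected(void (*inner)(void)) {
    bool flags = sim_irq_save();
    inner();
    bool still_disabled = !sim_irqs_enabled();
    sim_irq_restore(flags);
    return still_disabled;
}
```

Calling `sim_outer_still_protected(sim_inner_nestable)` reports that the outer section stays protected, while the `sim_inner_cli_sti` variant silently re-enables interrupts halfway through it.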

CLI Duration

Keep interrupt-disabled sections as short as possible. Long CLI periods cause interrupt latency, missed timer ticks, and poor system responsiveness. On Linux, the irqsoff latency tracer can be used to find code paths that keep interrupts disabled for too long.

Priority Inversion: When Priority Goes Wrong

Priority inversion occurs when a high-priority task is blocked waiting for a resource held by a low-priority task. This effectively inverts the intended priority ordering and can cause serious problems, including system hangs.

Classic Interrupt Priority Inversion Scenario:

Low-priority interrupt handler A begins, acquires lock L
High-priority interrupt B fires, preempts A
Handler B needs lock L—it's held by A
B spins waiting for L
A cannot run (B has higher priority and is running)
System is deadlocked

This is why interrupt handlers typically avoid locks, use lock-free data structures, or use special spinlocks with interrupt masking.
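
As one illustration of the lock-free option, a single-producer, single-consumer ring buffer lets an interrupt handler hand data to process context without ever waiting. The sketch below uses C11 atomics. The names (`spsc_ring_t`, `ring_push`, `ring_pop`) and the eight-slot size are assumptions made for the example, and it is only correct with exactly one producer and one consumer.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8u  // Must be a power of two

typedef struct {
    uint32_t    slots[RING_SIZE];
    atomic_uint head;  // Next slot to write; only the producer stores it
    atomic_uint tail;  // Next slot to read; only the consumer stores it
} spsc_ring_t;

// Producer side (for example, the interrupt handler). Never blocks:
// if the ring is full the value is dropped and false is returned.
bool ring_push(spsc_ring_t *r, uint32_t value) {
    unsigned head = atomic_load_explicit(&r->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE) return false;  // Full
    r->slots[head % RING_SIZE] = value;
    // Release: the slot write becomes visible before the new head does.
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

// Consumer side (for example, process context). Never blocks:
// returns false if the ring is empty.
bool ring_pop(spsc_ring_t *r, uint32_t *out) {
    unsigned tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail) return false;  // Empty
    *out = r->slots[tail % RING_SIZE];
    // Release: the slot read completes before the slot is handed back.
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}
```

Neither side can spin on the other, so the deadlock in the scenario above cannot occur. The cost is that a full ring drops data instead of waiting.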

The Mars Pathfinder Incident

In 1997, the Mars Pathfinder spacecraft experienced repeated system resets due to priority inversion. A low-priority information gathering task held a mutex needed by a high-priority bus management task. A medium-priority communications task would run, preventing the low-priority task from completing and releasing the mutex. The fix was enabling priority inheritance in the real-time OS.

Solutions to Priority Inversion:

Priority Inversion Prevention

•Priority Inheritance: When high-priority task blocks on low-priority task's lock, temporarily boost low-priority task's priority to that of the waiting task
•Priority Ceiling: Assign each lock a ceiling priority; task holding lock runs at ceiling priority, preventing intermediate-priority tasks from preempting
•Lock-Free Data Structures: Use atomic operations and wait-free algorithms that don't require locks
•Disable Preemption While Holding Lock: Prevent higher-priority interrupts during critical section (simple but limits concurrency)
•Per-CPU Data: Avoid sharing data between CPUs/interrupt handlers when possible, eliminating need for locks
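
Priority inheritance, the first item in the list, is easiest to understand as bookkeeping on the lock owner. The following is a toy model only: `task_t`, `pi_mutex_t`, and `pick_next` are hypothetical, larger numbers are assumed to mean higher priority, and a real implementation must also handle wait queues and chains of nested locks.

```c
#include <stddef.h>

typedef struct task {
    int base_prio;       // Assigned priority; larger = more urgent
    int effective_prio;  // Priority the scheduler actually uses
} task_t;

typedef struct {
    task_t *owner;  // NULL when the mutex is free
} pi_mutex_t;

// Try to take the mutex. On contention, lend the waiter's priority to the
// current owner and return 0 (the caller would then block).
int pi_mutex_trylock(pi_mutex_t *m, task_t *t) {
    if (m->owner == NULL) {
        m->owner = t;
        return 1;
    }
    if (t->effective_prio > m->owner->effective_prio)
        m->owner->effective_prio = t->effective_prio;  // Inherit
    return 0;
}

// Release the mutex and drop any inherited boost.
void pi_mutex_unlock(pi_mutex_t *m) {
    m->owner->effective_prio = m->owner->base_prio;
    m->owner = NULL;
}

// Scheduler stand-in: the runnable task with the highest effective priority.
task_t *pick_next(task_t *tasks[], size_t n) {
    task_t *best = NULL;
    for (size_t i = 0; i < n; i++)
        if (best == NULL || tasks[i]->effective_prio > best->effective_prio)
            best = tasks[i];
    return best;
}
```

Replaying the Pathfinder pattern with priorities 1, 5, and 9: once the high-priority task contends for the mutex, the low-priority owner runs at 9, so the medium-priority task can no longer keep it off the CPU.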

spinlock_irq.c
// Spinlock with interrupt disabling - prevents priority inversion
 
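typedef struct {
    // Byte-sized flag: __atomic_test_and_set and __atomic_clear are
    // defined for bool/char objects, not int
    volatile unsigned char locked;
} spinlock_t;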
 
// Acquire spinlock, saving and disabling interrupts
void spin_lock_irqsave(spinlock_t *lock, unsigned long *flags) {
    *flags = irq_save();  // Disable interrupts, save previous state
    
    // Spin until we acquire the lock
    while (__atomic_test_and_set(&lock->locked, __ATOMIC_ACQUIRE)) {
        // Hint to CPU that we're spinning
        asm volatile("pause" ::: "memory");
    }
}
 
// Release spinlock, restoring interrupt state
void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags) {
    __atomic_clear(&lock->locked, __ATOMIC_RELEASE);
    irq_restore(flags);
}
 
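// Usage example: protecting shared data accessed by interrupt handler
// (illustrative types; a real driver would define its own)
struct device_state { unsigned long receive_count; };
struct device_stats { unsigned long receive_count; };
static spinlock_t device_lock;
static struct device_state shared_state;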
 
// Called from interrupt handler
void device_interrupt_handler(void) {
    unsigned long flags;
    spin_lock_irqsave(&device_lock, &flags);
    
    // Access shared_state safely
    shared_state.receive_count++;
    
    spin_unlock_irqrestore(&device_lock, flags);
}
 
// Called from process context (can be preempted)
void device_read_stats(struct device_stats *stats) {
    unsigned long flags;
    spin_lock_irqsave(&device_lock, &flags);
    
    // No interrupt can preempt and modify while we read
    stats->receive_count = shared_state.receive_count;
    
    spin_unlock_irqrestore(&device_lock, flags);
}

Operating System Priority Policies

Operating systems implement interrupt priority at multiple levels, combining hardware priority with software policy to achieve responsiveness, fairness, and stability.

Linux Interrupt Processing Model:

Linux divides interrupt handling into layers with different priorities:

Linux Interrupt Priority Layers
Layer	Priority	Context	Can Sleep?	Typical Use
NMI	Highest	NMI	No	Watchdog, profiling, fatal errors
Hard IRQ	Very High	Interrupt	No	Device interrupt acknowledgment
Soft IRQ	High	Softirq	No	Network packet processing, timers
Tasklet	High	Softirq	No	Per-device deferred processing
Workqueue	Medium	Process	Yes	Extended device processing
Kernel Thread	Varies	Process	Yes	Background kernel work
User Process	Lowest	Process	Yes	Application code

Windows Interrupt Request Levels (IRQLs):

Windows defines explicit Interrupt Request Levels (IRQLs) that formalize priority. The numeric values below are those used on 32-bit x86; 64-bit Windows uses a smaller range (0-15):

PASSIVE_LEVEL (0): Normal thread execution
APC_LEVEL (1): Asynchronous Procedure Calls
DISPATCH_LEVEL (2): Scheduler, DPC
DEVICE_LEVEL (3-26): Device interrupts
PROFILE_LEVEL (27): Profiling timer
CLOCK_LEVEL (28): Clock interrupt
IPI_LEVEL (29): Inter-processor interrupt
POWER_LEVEL (30): Power failure
HIGH_LEVEL (31): Machine check, NMI

Code running at a given IRQL can only be preempted by interrupts at a higher IRQL.
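
The preemption rule can be captured in a small model: an interrupt at or below the current level stays pending until the level drops. The sketch below is not Windows kernel code. The `irql_*` names are hypothetical, levels are assumed to lie in the range 0 to 31, and at most one pending interrupt per level is tracked.

```c
#include <stdint.h>

typedef struct {
    uint8_t  current;  // Current IRQL, 0..31
    uint32_t pending;  // Bit n set = an interrupt at IRQL n is waiting
} irql_model_t;

// Highest pending level strictly above the current IRQL, or -1 if none.
int irql_next_deliverable(const irql_model_t *m) {
    for (int level = 31; level > m->current; level--)
        if (m->pending & (UINT32_C(1) << level)) return level;
    return -1;
}

// An interrupt arrives at `level`. Returns 1 if it preempts immediately,
// 0 if it is recorded as pending (or the level is out of range).
int irql_request(irql_model_t *m, uint8_t level) {
    if (level > 31) return 0;
    if (level > m->current) return 1;
    m->pending |= UINT32_C(1) << level;
    return 0;
}

// Raise the IRQL and return the previous level so it can be restored.
uint8_t irql_raise(irql_model_t *m, uint8_t new_level) {
    uint8_t old = m->current;
    m->current = new_level;
    return old;
}

// Lower the IRQL. Returns the level of the pending interrupt that now
// fires (and clears it), or -1 if nothing is deliverable.
int irql_lower(irql_model_t *m, uint8_t new_level) {
    m->current = new_level;
    int next = irql_next_deliverable(m);
    if (next >= 0) m->pending &= ~(UINT32_C(1) << next);
    return next;
}
```

While the model sits at level 28, a request at level 5 is deferred and a request at level 29 preempts. Lowering back to 0 then releases the deferred interrupts from the highest level down.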

Choosing Vector Numbers

When assigning interrupt vectors in custom systems: place time-critical devices (NIC, storage) in high priority classes (0xA0-0xEF), place interactive devices (keyboard, mouse, GPU) in medium classes (0x50-0x9F), and place background devices (sensors, USB) in lower classes (0x30-0x4F). Leave 0xF0-0xFF for system-critical interrupts.

Real-Time Interrupt Priority

Real-time systems have strict timing requirements: interrupt latency must be bounded and predictable. This requires special priority management beyond what general-purpose systems provide.

Hard Real-Time Requirements:

Maximum interrupt latency: guaranteed upper bound (microseconds)
Jitter: minimal variation in response time
Priority: strictly enforced, no inversion
Determinism: same inputs always produce same timing

Interrupt Latency Requirements by Domain
Application	Max Latency	Consequence of Failure	System Type
Anti-lock Brakes (ABS)	< 100 μs	Vehicle collision	Hard real-time
Industrial Robot Control	< 1 ms	Damaged products, injury	Hard real-time
Audio Processing	< 5 ms	Audible glitches	Soft real-time
Video Streaming	< 33 ms	Dropped frames	Soft real-time
Desktop UI	< 50 ms	User-perceived lag	Best-effort
Background Download	< 1 s	Slower throughput	Best-effort
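
Checking a system against figures like these usually starts with measured samples. The sketch below summarizes a set of latency samples against a deadline. The `latency_stats_t` layout and `latency_analyze` name are hypothetical, and the caller is assumed to supply samples in nanoseconds. An observed maximum is only a lower bound on the true worst case, so this supports testing but does not prove a hard real-time guarantee.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t min_ns;     // Best observed latency
    uint64_t max_ns;     // Worst observed latency
    uint64_t jitter_ns;  // Spread between worst and best
    size_t   misses;     // Samples that exceeded the deadline
} latency_stats_t;

// Summarize n latency samples against a deadline (all in nanoseconds).
latency_stats_t latency_analyze(const uint64_t *samples_ns, size_t n,
                                uint64_t deadline_ns) {
    latency_stats_t s = {0, 0, 0, 0};
    if (n == 0) return s;
    s.min_ns = s.max_ns = samples_ns[0];
    for (size_t i = 0; i < n; i++) {
        if (samples_ns[i] < s.min_ns) s.min_ns = samples_ns[i];
        if (samples_ns[i] > s.max_ns) s.max_ns = samples_ns[i];
        if (samples_ns[i] > deadline_ns) s.misses++;
    }
    s.jitter_ns = s.max_ns - s.min_ns;
    return s;
}
```

For a hard real-time target such as the 100 μs row in the table, any nonzero `misses` count is a failure. For soft real-time targets, the miss rate and jitter are the more useful figures.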

Real-Time Linux (PREEMPT_RT):

The PREEMPT_RT patch set brings Linux much closer to hard real-time behavior by:

Converting spinlocks to sleeping locks: Most spinlocks become rt_mutex with priority inheritance
Threaded interrupt handlers: Hardware ISRs are minimal; processing runs in schedulable kernel threads
Fully preemptible kernel: Nearly all kernel code can be preempted
High-resolution timers: Microsecond-accurate timekeeping
Priority inheritance everywhere: Prevents priority inversion in all lock types

threaded_irq.c
// Threaded interrupt handler (Linux PREEMPT_RT-friendly)
 
// Minimal hard IRQ handler - just checks if device interrupted us
static irqreturn_t device_hard_irq(int irq, void *dev_id) {
    struct my_device *dev = dev_id;
    
    // Quick check: did OUR device interrupt?
    if (!(device_read_status(dev) & INTERRUPT_PENDING)) {
        return IRQ_NONE;  // Not our interrupt
    }
    
    // Mask device interrupt to prevent immediate re-fire
    device_mask_interrupt(dev);
    
    // Return: schedule threaded handler
    return IRQ_WAKE_THREAD;
}
 
// Threaded handler - runs in schedulable kernel thread context
static irqreturn_t device_threaded_irq(int irq, void *dev_id) {
    struct my_device *dev = dev_id;
    
    // Full interrupt processing - can take time
    // Runs with interrupts enabled, can be preempted by higher-priority threads
    
    while (device_has_data(dev)) {
        struct data_packet *pkt = device_read_packet(dev);
        process_packet(dev, pkt);
    }
    
    // Re-enable device interrupt
    device_unmask_interrupt(dev);
    
    return IRQ_HANDLED;
}
 
// Register threaded interrupt handler
int setup_device_interrupt(struct my_device *dev) {
    return request_threaded_irq(
        dev->irq,              // IRQ number
        device_hard_irq,       // Fast hardirq handler
        device_threaded_irq,   // Threaded handler
        IRQF_SHARED,           // Flags
        "my_device",           // Name
        dev                    // Device pointer
    );
}
 
// Benefit: The threaded handler can be assigned a real-time priority
// to ensure timely processing without blocking other interrupts

Real-Time in Practice

A well-tuned PREEMPT_RT system can typically keep worst-case latencies under roughly 100 microseconds on commodity hardware, which is suitable for many real-time applications. For tighter or formally guaranteed bounds, a dedicated RTOS such as VxWorks or QNX, or bare-metal firmware, is usually needed.

Summary: Interrupt Priority

We've explored interrupt priority—the mechanisms that ensure critical events receive timely attention while maintaining system stability. From hardware priority in PICs to software priority policies in operating systems, priority management is essential for responsive, reliable systems.

Key Takeaways

•Priority ensures critical events are handled first: Not all interrupts are equally urgent; priority ordering prevents critical events from being delayed by routine ones.
•8259A PIC uses fixed IRQ-based priority: Lower IRQ numbers have higher priority; this fixed scheme requires careful device assignment.
•APIC provides 256-level programmable priority: Vector number determines priority class; TPR allows dynamic priority thresholding.
•Interrupt masking operates at multiple levels: From global CLI to per-device masks, each offers different granularity and overhead tradeoffs.
•Priority inversion must be prevented: High-priority handlers blocked by low-priority lock holders can hang the system; interrupt-disabling spinlocks (spin_lock_irqsave) and priority inheritance are solutions.
•OSes layer interrupt priority in software: Linux separates hard IRQ, soft IRQ, tasklets, and workqueues; Windows uses explicit IRQLs.
•Real-time systems require bounded latency: PREEMPT_RT, threaded interrupts, and priority inheritance bring Linux close to hard real-time response.

Module Complete:

You have now completed the Interrupts and Exceptions module. You've learned:

Hardware interrupts: External signals from devices
Software interrupts: Traps, faults, and aborts from executing code
Interrupt handling: CPU state saving, privilege transitions, and return
Vector tables: IVT and IDT structure and configuration
Interrupt priority: Hardware and software priority management

These concepts form the foundation of how operating systems interact with hardware and respond to exceptional conditions—essential knowledge for kernel development, driver programming, and systems debugging.

Module Complete

Congratulations! You've completed the Interrupts and Exceptions module. You now have a deep understanding of one of the most fundamental mechanisms in computer systems—the interrupt architecture that enables responsive, efficient computing. This knowledge is essential for any engineer working on operating systems, embedded systems, or low-level software development.