We've established that disabling interrupts is a powerful mechanism for achieving mutual exclusion on uniprocessor systems. But with great power comes great responsibility—and in operating systems, great restriction.
You, as a regular user-space program, cannot disable interrupts.
If you try to execute the cli instruction from a normal application, the CPU will immediately raise a general protection fault (#GP). Your program will be terminated. The operating system protects this capability jealously, and for excellent reasons.
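To make this concrete, here is a minimal sketch (assuming Linux on x86-64 and GCC/Clang inline assembly, with a hypothetical file name) that you can compile and run. The CPU raises #GP on the `cli`, and the kernel translates that fault into a SIGSEGV that kills the process:

```c
/* cli_demo.c - attempt to disable interrupts from ring 3 (the process will be killed) */
#include <stdio.h>

int main(void) {
    printf("About to execute cli at ring 3...\n");
    fflush(stdout);              /* make sure the message is visible before we die */

    __asm__ volatile("cli");     /* CPU raises #GP here; the kernel delivers SIGSEGV */

    printf("Never reached.\n");
    return 0;
}
```

Running it typically ends with "Segmentation fault", which is the kernel's user-visible translation of the general protection fault.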
This restriction isn't an arbitrary design choice—it's a fundamental security boundary. If any program could disable interrupts at will, the entire operating system model would collapse. A malicious or buggy program could monopolize the CPU indefinitely, freeze the system, prevent critical I/O operations, and render all other programs—including the operating system itself—powerless.
This page explores the privilege model that restricts interrupt control, why this restriction exists, and how the operating system manages the boundary between privileged and unprivileged code.
The ability to disable interrupts is restricted to ring 0 (kernel mode) on x86 processors or equivalent privileged modes on other architectures. User-space code running at ring 3 cannot execute interrupt control instructions. This is enforced by hardware, not software—a user program cannot bypass this restriction, period.
Modern CPUs implement a privilege level or protection ring system that controls what code can do based on its privilege level. This is a hardware-enforced mechanism—no amount of clever programming can circumvent it.
x86/x64 Protection Rings:
Intel and AMD processors implement four privilege levels, called rings 0 through 3:
Ring 0 (Kernel Mode): Highest privilege. Can execute any instruction, access any memory, modify any hardware state. The operating system kernel runs here.
Ring 1 & 2: Intermediate levels, historically intended for device drivers and services. Rarely used in modern operating systems (most run drivers in ring 0).
Ring 3 (User Mode): Lowest privilege. Restricted instruction set, restricted memory access, no hardware manipulation. All user applications run here.
How Privilege Is Tracked:
The CPU tracks the current privilege level (CPL) in the Code Segment register (CS) on x86 architectures. Specifically, the two least significant bits of CS indicate the current ring:
- CS.RPL = 0 → Ring 0 (kernel mode)
- CS.RPL = 3 → Ring 3 (user mode)

Every instruction the CPU executes is checked against the current privilege level. If an instruction requires higher privilege than the CPL, the CPU raises a fault.
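As a small illustration (a sketch assuming x86-64 with GCC-style inline assembly), the low two bits of CS can be read from user space; reading a segment selector is unprivileged, so this is a harmless way to observe your own privilege level:

```c
/* cpl_demo.c - print the current privilege level encoded in CS */
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t cs;
    __asm__ volatile("mov %%cs, %0" : "=r"(cs));   /* reading CS is allowed at ring 3 */
    printf("CS = 0x%04x, CPL = %d\n", cs, cs & 3); /* a normal process prints CPL = 3 */
    return 0;
}
```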
| Instruction | Operation | Required Privilege | Fault if Unprivileged |
|---|---|---|---|
| CLI | Clear Interrupt Flag (disable) | Ring 0 (with IOPL=0) | General Protection (#GP) |
| STI | Set Interrupt Flag (enable) | Ring 0 (with IOPL=0) | General Protection (#GP) |
| IN/OUT | Port I/O | Ring 0 (or IOPL allows) | General Protection (#GP) |
| LGDT/LIDT | Load GDT/IDT register | Ring 0 only | General Protection (#GP) |
| MOV CR*, DRx | Modify control/debug regs | Ring 0 only | General Protection (#GP) |
| HLT | Halt CPU | Ring 0 only | General Protection (#GP) |
| WRMSR/RDMSR | Access Model-Specific Regs | Ring 0 only | General Protection (#GP) |
| INVLPG | Invalidate TLB entry | Ring 0 only | General Protection (#GP) |
x86 has an additional mechanism called IOPL (I/O Privilege Level) in the EFLAGS register. If IOPL >= CPL, even ring 3 code can execute CLI/STI. However, only ring 0 code can change IOPL. Modern operating systems always set IOPL=0, ensuring only kernel code can control interrupts. This mechanism exists for legacy compatibility, not as a general permission system.
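You can observe IOPL from user space because PUSHF copies the whole flags register, even though user code cannot change those bits. A minimal sketch, assuming Linux on x86-64:

```c
/* iopl_demo.c - read EFLAGS.IOPL (bits 12-13) via PUSHFQ */
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint64_t rflags;
    __asm__ volatile("pushfq\n\tpopq %0" : "=r"(rflags));
    printf("IOPL = %llu\n", (unsigned long long)((rflags >> 12) & 3));
    /* A modern OS keeps IOPL = 0, so CLI/STI remain kernel-only */
    return 0;
}
```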
While x86's protection rings are well-known, other processor architectures implement similar privilege models with their own terminology and specific mechanisms.
ARM Architecture:
ARM processors use Exception Levels (EL) rather than rings:
- EL0: user applications (least privileged)
- EL1: the operating system kernel
- EL2: the hypervisor
- EL3: the secure monitor / firmware (most privileged)
The CPSR (Current Program Status Register) includes the I (IRQ disable) and F (FIQ disable) bits. The instructions CPSID/CPSIE (Change Processor State, Interrupt Disable/Enable) are only executable at EL1 or higher.
```asm
@ ARM Exception Levels and Interrupt Control
@ ==========================================

@ At EL0 (User Mode) - This will FAULT!
@ ----------------------------------------
cpsid i              @ Attempt to disable IRQ
                     @ RESULT: Undefined Instruction exception
                     @ Trapped to EL1 handler

@ At EL1 (Kernel Mode) - This succeeds
@ ----------------------------------------
cpsid i              @ Disable IRQ interrupts
                     @ CPSR.I bit is set to 1
                     @ IRQ interrupts are now masked

cpsie i              @ Enable IRQ interrupts
                     @ CPSR.I bit is cleared to 0
                     @ IRQ interrupts are now unmasked

@ Using MRS/MSR (works at EL1+)
@ ----------------------------------------
mrs  x0, daif        @ Read interrupt mask flags
orr  x0, x0, #0x80   @ Set bit 7 (IRQ mask)
msr  daif, x0        @ Write back, disabling IRQ

@ ARMv8 DAIF register bits:
@ D (bit 9): Debug exceptions mask
@ A (bit 8): SError mask
@ I (bit 7): IRQ mask
@ F (bit 6): FIQ mask

@ Exception Level Transition:
@ ----------------------------
@ EL0 -> EL1: SVC instruction (system call)
@ EL1 -> EL0: ERET instruction (return from exception)
@
@ Interrupt control is ONLY available at EL1, EL2, EL3
```

RISC-V Architecture:
RISC-V uses a simpler privilege model with machine mode, supervisor mode, and user mode. Interrupt control is managed through CSR (Control and Status Register) access:
```asm
# RISC-V Privilege Modes and Interrupt Control
# ==============================================

# RISC-V Privilege Levels:
# M-mode (Machine mode): Highest privilege (firmware, bootloader)
# S-mode (Supervisor mode): OS kernel
# U-mode (User mode): Applications

# CSR Access from U-mode - FAILS
# --------------------------------
csrci mstatus, 0x8      # Attempt to clear MIE bit
                        # RESULT: Illegal instruction exception
                        # Trapped to S-mode or M-mode

# CSR Access from S-mode or M-mode - SUCCEEDS
# --------------------------------------------
# Disable interrupts:
csrci mstatus, 0x8      # Clear MIE (Machine Interrupt Enable)
# Or:
csrci sstatus, 0x2      # Clear SIE (Supervisor Interrupt Enable)

# Enable interrupts:
csrsi mstatus, 0x8      # Set MIE bit
# Or:
csrsi sstatus, 0x2      # Set SIE bit

# Save and restore pattern:
csrrc a0, mstatus, 0x8  # Read mstatus into a0, clear MIE
# ... critical section ...
csrs  mstatus, a0       # Restore original MIE state
```

| Architecture | Privilege Levels | Kernel Mode | User Mode | Interrupt Control Restriction |
|---|---|---|---|---|
| x86/x64 | 4 rings (0-3) | Ring 0 | Ring 3 | CLI/STI require Ring 0 (or IOPL) |
| ARM (v7 and earlier) | User/System/FIQ/IRQ/SVC/... | SVC, System modes | User mode | CPSID/CPSIE require privileged mode |
| ARM (v8/v9) | EL0, EL1, EL2, EL3 | EL1 | EL0 | DAIF access requires EL1+ |
| RISC-V | M, S, U modes | S-mode (or M-mode) | U-mode | CSR access requires S/M-mode |
| PowerPC | User/Supervisor | Supervisor (MSR.PR=0) | User (MSR.PR=1) | MSR.EE modification is privileged |
| MIPS | User/Supervisor/Kernel | Kernel mode | User mode | Status.IE requires kernel mode |
Despite different terminology and specific mechanisms, ALL modern processor architectures restrict interrupt control to privileged modes. This is a universal security principle: user code must not be able to control CPU interrupt behavior. The specific instructions and registers vary, but the protection is always present.
The restriction of interrupt control to kernel mode isn't arbitrary—it addresses fundamental security and stability requirements. Let's analyze what would happen if user-space programs could disable interrupts.
Threat Model: Malicious Interrupt Disabling
Imagine a world where any program could execute cli (disable interrupts). Consider the attacks that become possible; the code example below sketches three of them.
The Fundamental Problem: Preemption Is Essential
Modern operating systems rely on preemption to provide:
- Fair sharing of the CPU among competing processes
- Responsiveness to I/O completion and user input
- The ability to reclaim the CPU from buggy or malicious programs
If user code could prevent preemption, the operating system would lose control of the machine. The OS would become merely a suggestion—a program could simply refuse to give up the CPU.
```c
/* HYPOTHETICAL: If user code could disable interrupts */
/* This is why the restriction exists! */

#include <stdio.h>

/* Illustrative helpers, assumed to exist elsewhere */
void do_privileged_operation(void);
void precise_timing_measurement(void);

/* Inline assembly to disable interrupts (x86) */
static inline void cli(void) {
    asm volatile("cli");   /* This would FAIL with #GP in reality */
}

static inline void sti(void) {
    asm volatile("sti");
}

/* ATTACK 1: System freeze */
void dos_attack(void) {
    cli();   /* Disable timer interrupts */

    while (1) {
        /* Do nothing - system is now frozen */
        /* Scheduler never runs */
        /* No process can execute */
        /* Even Ctrl+C doesn't work (that's an interrupt!) */
    }
    /* sti() never reached */
}

/* ATTACK 2: Priority inversion exploitation */
void race_condition_attack(int *privileged_flag) {
    /*
     * Normal case: OS might preempt us after our check
     * With CLI: we guarantee the window stays open
     */
    cli();

    if (*privileged_flag) {
        /* This check happens */
    }

    /* Without CLI, scheduler might run here, changing flag */
    /* With CLI, NO preemption - flag stays the same */

    if (*privileged_flag) {
        /* We STILL have access - race won! */
        do_privileged_operation();
    }

    sti();
}

/* ATTACK 3: Timing side-channel attack */
void timing_attack(void) {
    cli();

    /*
     * With interrupts disabled:
     * - No timer variance from interrupts
     * - Perfect timing measurements possible
     * - Side-channel attacks become trivially precise
     */
    precise_timing_measurement();

    sti();
}

/*
 * All these attacks are PREVENTED by the privilege restriction.
 * User code attempting cli/sti will trigger #GP fault.
 * The OS catches the fault and terminates the process.
 */
```

Interrupt control touches the most fundamental trust boundary in computing: the separation between the operating system (trusted) and user applications (untrusted). If this boundary is breached, no other security mechanism can compensate. Privilege checks exist precisely to maintain this boundary at the hardware level.
Since user-space code cannot disable interrupts directly, any synchronization that requires interrupt disabling must involve the kernel. Let's examine how applications request privileged operations.
System Calls: The Controlled Gateway
The only legitimate way for user code to access privileged functionality is through system calls. A system call is a controlled transition from user mode to kernel mode, where:
- The program executes a special trap instruction (syscall on x64, svc on ARM)
- The CPU switches to kernel mode and jumps to a fixed entry point chosen by the kernel, not by the caller
- The kernel validates the request, performs the privileged work on the program's behalf, and returns to user mode
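As a minimal sketch (assuming Linux on x86-64 and glibc's `syscall()` wrapper, with a hypothetical file name), here is what that controlled transition looks like from the application's side; every privileged effect happens inside the kernel between the SYSCALL instruction and the return:

```c
/* syscall_demo.c - enter ring 0 the only sanctioned way: through a system call */
#include <unistd.h>
#include <sys/syscall.h>

int main(void) {
    const char msg[] = "hello from user mode, via the kernel\n";

    /* glibc's syscall() executes the SYSCALL instruction: the CPU switches to
       ring 0, jumps to the kernel's fixed entry point, the kernel validates
       the arguments, performs the write, and returns to ring 3. */
    syscall(SYS_write, 1, msg, sizeof msg - 1);
    return 0;
}
```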
What User Code Can Do:
Given the restriction, what options do user-space programs have for synchronization?
Futexes (Fast User-space mutexes): Acquire the lock with an atomic operation in user space in the uncontended case; call into the kernel only when a thread actually has to block (a raw futex sketch appears below)
Pthread Mutexes: Library-level abstractions that use futexes and kernel support
Atomic Operations: Many atomic instructions ARE available in user mode (like lock cmpxchg)
Signal Blocking: Can request the kernel to block signals (like interrupts for the process)
Notice that none of these involve directly disabling CPU interrupts—they either use atomic guarantees or request kernel assistance.
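To show what "call into the kernel only when blocking is needed" looks like at the lowest level, here is a deliberately simple futex-based lock. This is a sketch assuming Linux and the raw SYS_futex interface; the lock-word encoding and function names are illustrative, not a real library API:

```c
/* futex_lock.c - a two-state futex lock: fast path in user space, slow path in the kernel */
#define _GNU_SOURCE
#include <stdatomic.h>
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

static atomic_int lock_word = 0;   /* 0 = free, 1 = held */

static void futex_lock(void) {
    int expected = 0;
    /* Fast path: an uncontended acquire is a single atomic CAS, no kernel entry. */
    while (!atomic_compare_exchange_strong(&lock_word, &expected, 1)) {
        /* Slow path: ask the kernel to sleep until the word is no longer 1. */
        syscall(SYS_futex, &lock_word, FUTEX_WAIT, 1, NULL, NULL, 0);
        expected = 0;   /* retry the CAS after waking up */
    }
}

static void futex_unlock(void) {
    atomic_store(&lock_word, 0);
    /* Wake at most one waiter, if any. */
    syscall(SYS_futex, &lock_word, FUTEX_WAKE, 1, NULL, NULL, 0);
}
```

A production implementation (such as glibc's pthread mutex) adds a third "locked with waiters" state so the unlock path can skip the FUTEX_WAKE syscall entirely when nobody is sleeping.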
```c
/* User-space synchronization without interrupt control */
#include <pthread.h>
#include <stdatomic.h>
#include <signal.h>

/* Illustrative helpers, assumed to exist elsewhere */
void shared_data_operation(void);
void signal_sensitive_operation(void);

/* METHOD 1: Pthread mutex (uses futex internally) */
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void critical_section_pthread(void) {
    pthread_mutex_lock(&mutex);
    /* Protected by mutex - kernel handles contention */
    shared_data_operation();
    pthread_mutex_unlock(&mutex);
}

/* METHOD 2: Atomic operations (no kernel needed for simple ops) */
atomic_int counter = 0;

void increment_atomic(void) {
    /*
     * atomic_fetch_add compiles to: lock add [counter], 1
     * The LOCK prefix is user-mode accessible!
     * No interrupt disabling needed - atomicity via bus lock
     */
    atomic_fetch_add(&counter, 1);
}

/* Compare-and-swap is also atomic */
int try_lock(atomic_int *lock) {
    int expected = 0;
    /*
     * Compiles to: lock cmpxchg [lock], 1
     * Again, user-mode accessible
     */
    return atomic_compare_exchange_strong(lock, &expected, 1);
}

/* METHOD 3: Signal masking (process-level, not CPU-level) */
void signal_safe_section(void) {
    sigset_t new_mask, old_mask;
    sigfillset(&new_mask);

    /* Ask kernel to block signals to THIS PROCESS */
    sigprocmask(SIG_BLOCK, &new_mask, &old_mask);

    /*
     * Signals (like SIGINT from Ctrl+C) won't interrupt us
     * But timer interrupts still fire!
     * Other processes still run!
     * This is PROCESS-level, not CPU-level
     */
    signal_sensitive_operation();

    sigprocmask(SIG_SETMASK, &old_mask, NULL);
}

/*
 * KEY INSIGHT:
 * User code achieves synchronization through:
 * - Atomic instructions (hardware support, no kernel)
 * - Kernel-mediated locks (futex, mutexes)
 * - Process-level signal control (not interrupt control)
 *
 * User code NEVER directly controls CPU interrupts.
 */
```

Modern user-space synchronization relies heavily on atomic operations (cmpxchg, lock prefix) rather than interrupt control. These atomic operations ARE available in user mode because they don't prevent system operation; they only ensure memory access atomicity. This is a carefully designed security/functionality balance.
Modern systems often run in virtualized environments, which adds another layer of complexity to privilege management. The interaction between interrupt control and virtualization deserves special attention.
The Virtualization Challenge:
In a virtualized environment, a guest operating system believes it's running at ring 0. It executes cli expecting to disable interrupts. But the hypervisor cannot allow this to actually disable physical interrupts—that would allow the guest to freeze the entire machine, including other VMs.
VMX (Intel VT-x) and SVM (AMD-V) Solutions:
Hardware virtualization extensions introduce additional privilege levels:
- VMX root mode (Intel) / host mode (AMD): where the hypervisor runs, with ultimate control over the hardware
- VMX non-root mode / guest mode: where the guest OS runs; even the guest's ring 0 is subordinate to the hypervisor
When guest code executes cli, the hardware can be configured to either:
- Trap the instruction, causing a VM exit so the hypervisor can emulate its effect, or
- Virtualize it, letting the instruction update a shadowed or virtual interrupt flag without touching the physical IF

The table below compares these cases:
| Scenario | Guest Executes CLI | Actual Effect | Physical Interrupts |
|---|---|---|---|
| Bare metal | cli in ring 0 | IF = 0, interrupts disabled | Disabled on this CPU |
| VM (trap on CLI) | cli in guest kernel | VM exit to hypervisor | Still enabled! HV handles it |
| VM (shadow RFLAGS) | cli in guest kernel | Virtual IF = 0 | Physical IF unchanged |
| VM (VIF) | cli with VIF enabled | VIF flag = 0 in RFLAGS | Physical IF unchanged |
| Container | cli in container | #GP fault (still ring 3!) | N/A - not even attempted |
Virtual Interrupt Flag (VIF):
Intel processors support a Virtual Interrupt Flag (VIF) mechanism. When enabled:
- cli and sti instructions modify VIF, not IF
- The physical IF, and therefore real interrupt delivery, stays under hypervisor control
- A pending virtual interrupt is tracked via the VIP (Virtual Interrupt Pending) flag and delivered when the guest sets VIF again
```c
/* Hypervisor handling of privileged interrupt control */

#include <stdint.h>
#include <stdbool.h>

#define RFLAGS_IF (1ULL << 9)       /* IF is bit 9 of RFLAGS */

struct vm_state {
    int      id;                    /* VM identifier */
    uint64_t guest_rflags;          /* Guest's view of RFLAGS */
    uint64_t guest_rip;             /* Instruction pointer */
    bool     virtual_if;            /* Virtual interrupt flag */
    int      pending_interrupt;     /* Vector waiting for injection */
    bool     interrupt_pending;
    /* ... other state ... */
};

/*
 * VM exit handler for CLI instruction
 * Called when guest executes CLI and it's configured to trap
 */
void handle_cli_vmexit(struct vm_state *vm) {
    /*
     * Guest wants to disable interrupts.
     * We don't actually disable physical interrupts!
     * We just update the guest's virtual state.
     */

    /* Clear the virtual IF in guest's saved RFLAGS */
    vm->guest_rflags &= ~RFLAGS_IF;
    vm->virtual_if = false;

    /* Log for debugging/auditing if needed */
    vmm_log("Guest %d disabled (virtual) interrupts at RIP=0x%lx",
            vm->id, vm->guest_rip);

    /* Advance guest RIP past the CLI instruction */
    vm->guest_rip += 1;   /* CLI is 1 byte */

    /* Resume guest execution */
    vmresume(vm);

    /*
     * The physical CPU still has IF=1
     * Physical timer interrupts still fire
     * Hypervisor still gets control
     * Other VMs still run
     * Just this guest "thinks" interrupts are off
     */
}

/* Virtual interrupt injection */
void inject_virtual_interrupt(struct vm_state *vm, int vector) {
    if (vm->virtual_if) {
        /* Guest has interrupts "enabled" - inject immediately */
        vm->pending_interrupt = vector;
        vm->interrupt_pending = true;
        vmresume_with_interrupt(vm);
    } else {
        /* Guest has interrupts "disabled" - queue for later */
        queue_virtual_interrupt(vm, vector);
        /* Will be delivered when guest executes STI */
    }
}

/*
 * KEY INSIGHT:
 * Virtualization demonstrates why privilege enforcement is crucial.
 * Even kernel-mode interrupt control must be contained to prevent
 * one VM from affecting others.
 * The hypervisor maintains ultimate control by virtualizing the
 * privileged operations themselves.
 */
```

Virtualization creates a hierarchy of privilege: user < guest kernel < hypervisor < hardware. At each level, the lower level's interrupt control is virtualized by the higher level. This nesting allows complex cloud environments where multiple OSes safely share physical hardware, each believing it has full control.
Having established that interrupt control is restricted to kernel mode, let's examine how kernel developers must responsibly use this power. Even within the kernel, interrupt disabling carries significant implications.
The Latency Problem:
While interrupts are disabled:
- The timer interrupt cannot fire, so the scheduler cannot preempt the current code
- Device interrupts (network packets, disk completions, keyboard input) sit pending and are serviced late
- On real-time systems, worst-case latency grows with the longest interrupt-off window
Every microsecond spent with interrupts disabled is a microsecond of potential missed events. The kernel must minimize this time.
```c
/* Examples of good and bad interrupt disabling in kernel code */

#include <linux/spinlock.h>
#include <linux/irqflags.h>
#include <linux/mutex.h>
#include <linux/percpu.h>

static DEFINE_MUTEX(my_mutex);

/* BAD: Long critical section */
void bad_long_critical_section(void) {
    unsigned long flags;

    local_irq_save(flags);

    /* BAD: Doing extensive work with interrupts off */
    for (int i = 0; i < 1000000; i++) {
        process_item(i);   /* Each call takes time! */
    }

    local_irq_restore(flags);
    /* This could block interrupts for seconds! */
}

/* GOOD: Minimal critical section */
void good_minimal_critical_section(void) {
    unsigned long flags;

    /* Do heavy work with interrupts enabled */
    prepare_data();

    local_irq_save(flags);
    /* Only the critical update is protected */
    atomic_counter++;   /* Tiny critical section */
    local_irq_restore(flags);

    /* Continue non-critical work */
    cleanup_and_log();
}

/* BAD: Blocking with interrupts disabled */
void bad_blocking(void) {
    unsigned long flags;

    local_irq_save(flags);

    /* DEADLY: This will never complete! */
    wait_for_io_completion();   /* Needs interrupts to complete! */

    local_irq_restore(flags);   /* Never reached */
}

/* GOOD: Using appropriate lock type */
void good_lock_choice(void) {
    /* If you might block, use a mutex (not spinlock) */
    mutex_lock(&my_mutex);

    /* OK to do potentially blocking work here */
    result = disk_read_sync(buffer);

    mutex_unlock(&my_mutex);
    /* Interrupts were never fully disabled,
       just the mutex provides mutual exclusion */
}

/* GOOD: Per-CPU data eliminates need for interrupt disable */
struct stats {
    unsigned long events;
    unsigned long irq_accessed_field;
};

DEFINE_PER_CPU(struct stats, cpu_statistics);

void update_stats_percpu(void) {
    unsigned long flags;

    /*
     * No need to disable interrupts if only this CPU
     * accesses its own per-CPU data.
     *
     * Still need to prevent preemption to ensure we're
     * on the same CPU throughout:
     */
    preempt_disable();
    this_cpu_inc(cpu_statistics.events);
    preempt_enable();

    /* If interrupt handlers also access, then: */
    local_irq_save(flags);   /* Only for IRQ handler sync */
    this_cpu_inc(cpu_statistics.irq_accessed_field);
    local_irq_restore(flags);
}
```

In real-time Linux (PREEMPT_RT), even kernel spinlocks become sleeping locks to minimize interrupt-disabled time. Only 'raw' spinlocks actually disable interrupts. This extreme design shows how seriously real-time systems take interrupt latency. If you're in a latency-sensitive environment, every local_irq_disable() needs justification.
We've thoroughly examined why interrupt disabling is restricted to privileged code. Let's consolidate the key insights:
- Every modern architecture restricts interrupt control (CLI/STI, CPSID/CPSIE, CSR writes) to its kernel or privileged mode, and the restriction is enforced by hardware
- Without it, any program could freeze the machine, win races against the scheduler, or make timing attacks trivially precise
- User space synchronizes through atomic instructions and kernel-mediated primitives (futexes, mutexes, signal masking), never by touching the interrupt flag
- Hypervisors apply the same principle one level up: even a guest kernel's interrupt control is virtualized
- Within the kernel, interrupt-off windows must be kept as short as possible
What's Next:
Having covered the mechanism, the single-processor limitation, and the privilege requirement, we'll now examine the broader limitations of the interrupt disable approach—why it's not the universal solution to synchronization even in the contexts where it's available.
You now understand why interrupt disabling is a privileged operation, how hardware enforces this restriction, and why the security implications demand such strict control. You've also seen how kernel code must responsibly wield this power. Next, we'll explore additional limitations beyond privilege and processor count.