After thoroughly examining the interrupt disable approach—its mechanism, limitations, and constraints—we arrive at the crucial question: When should you actually use it?
Despite the long list of limitations we've discussed, interrupt disabling remains a valuable and sometimes irreplaceable tool in the kernel developer's arsenal. The key is understanding the specific scenarios where its strengths outweigh its weaknesses.
This page synthesizes everything we've learned into practical guidance. We'll examine the canonical use cases where interrupt disabling is appropriate, provide a decision-making framework for synchronization choices, and illustrate with real-world examples from production kernels.
Remember: there are no universally "best" synchronization primitives. Each tool serves specific purposes, and mastery lies in choosing the right tool for each situation.
When deciding whether to disable interrupts, ask: 'What am I actually protecting against?' If the only threat is code running on THIS CPU—for example, an interrupt handler that touches the same per-CPU data—interrupt disabling is often appropriate. If you need mutual exclusion across CPUs, you need additional mechanisms (spinlocks, atomics, and so on).
Let's enumerate the scenarios where interrupt disabling is not just appropriate but often the correct—or only—choice.
Use Case 1: Per-CPU Data Protection
The clearest case for interrupt disabling is when accessing data that is strictly per-CPU. Each CPU maintains its own copy of this data, so there's no cross-CPU race condition to address. The only threat is competition between thread-context code and interrupt handlers running on the same CPU.
```c
/* USE CASE 1: Per-CPU data protection */

#include <linux/percpu.h>
#include <linux/irqflags.h>

/* Each CPU has its own runqueue structure */
DEFINE_PER_CPU(struct rq, runqueues);

/* Per-CPU statistics - interrupt handlers also update these */
DEFINE_PER_CPU(struct kernel_stat, kstat);

void update_runqueue(void)
{
    unsigned long flags;
    struct rq *rq;

    /*
     * Scenario:
     * - We're updating current CPU's runqueue
     * - An interrupt might fire, and the IRQ handler
     *   might also need to update the runqueue
     * - No OTHER CPU ever touches OUR runqueue
     *
     * Solution: Disable interrupts (no spinlock needed!)
     */
    local_irq_save(flags);

    rq = this_cpu_ptr(&runqueues);
    rq->nr_switches++;
    rq->curr = next_task;
    /* Safe: no interrupt can access rq while we're here */

    local_irq_restore(flags);
}

void scheduler_tick(void)
{
    unsigned long flags;
    struct kernel_stat *ks;

    /*
     * Even simpler: atomically increment per-CPU counter.
     * local_irq_save/restore ensures we don't race with
     * interrupt handlers on THIS CPU that might also
     * update statistics.
     */
    local_irq_save(flags);

    ks = this_cpu_ptr(&kstat);
    ks->cpu_time[USER]++;   /* Or SYSTEM, IDLE, etc. */

    local_irq_restore(flags);
}

/*
 * KEY INSIGHT:
 * - No spinlock is needed because no other CPU accesses our data
 * - Interrupt disabling alone provides mutual exclusion
 * - This is EFFICIENT: no cache line bouncing, no bus locking
 */
```

Use Case 2: Serialization with Interrupt Handlers
When thread context code and interrupt handlers both access the same data, interrupt disabling ensures the thread can't be interrupted during its critical section. This is often combined with spinlocks for SMP safety.
```c
/* USE CASE 2: Serialization with interrupt handlers */

struct device_data {
    spinlock_t lock;
    struct list_head pending_requests;
    unsigned int requests_in_flight;
};

/* Thread context: submits work to device */
void submit_device_request(struct device_data *dev, struct request *req)
{
    unsigned long flags;

    /*
     * Why spin_lock_irqsave?
     *
     * 1. We need to prevent other CPUs from accessing
     *    dev->pending_requests (spinlock provides this)
     *
     * 2. We need to prevent the device's IRQ handler
     *    (which runs on THIS CPU) from accessing the
     *    list while we're modifying it
     *
     * spin_lock_irqsave = disable IRQ + acquire spinlock
     */
    spin_lock_irqsave(&dev->lock, flags);

    list_add_tail(&req->list, &dev->pending_requests);
    dev->requests_in_flight++;
    trigger_device_dma(dev, req);

    spin_unlock_irqrestore(&dev->lock, flags);
}

/* Interrupt handler: processes completed work */
irqreturn_t device_irq_handler(int irq, void *data)
{
    struct device_data *dev = data;
    struct request *req;

    /*
     * Handler already has its IRQ disabled (by hardware).
     * We still need the spinlock for other CPUs.
     *
     * spin_lock (not irqsave) is sufficient here:
     * - Our IRQ is already masked by the interrupt controller
     * - We're not worried about this specific IRQ re-entering
     * - Other interrupts might fire, but they won't access our data
     */
    spin_lock(&dev->lock);

    req = list_first_entry(&dev->pending_requests, struct request, list);
    list_del(&req->list);
    dev->requests_in_flight--;

    spin_unlock(&dev->lock);

    complete_request(req);
    return IRQ_HANDLED;
}

/*
 * WHY NOT just spinlock in thread context?
 *
 * Consider:
 *   Thread:  spin_lock(&dev->lock);
 *            [IRQ fires, same CPU]
 *   Handler: spin_lock(&dev->lock);   // DEADLOCK!
 *
 * Handler can't acquire the lock because Thread holds it.
 * Thread can't release the lock because it's been interrupted.
 * DEADLOCK.
 *
 * With irqsave: the IRQ can't fire while Thread holds the lock.
 * Problem solved.
 */
```

Use Case 3: Hardware Register Manipulation
Some hardware requires precise timing or sequencing of register accesses. Allowing interrupts during these sequences could violate hardware requirements or cause undefined behavior.
```c
/* USE CASE 3: Hardware register manipulation */

/* Example: Unlocking flash memory for writing */
void flash_write_enable(struct flash_device *flash)
{
    unsigned long flags;

    /*
     * Flash chips often require precise unlock sequences.
     * If this sequence is interrupted, the chip may:
     * - Remain locked (write fails)
     * - Enter an undefined state
     * - Require reset/power cycle
     *
     * We MUST complete the sequence atomically.
     */
    local_irq_save(flags);

    /* Typical flash unlock sequence */
    writeb(0xAA, flash->base + 0x5555);
    writeb(0x55, flash->base + 0x2AAA);
    writeb(0xA0, flash->base + 0x5555);   /* Write enable */

    /* Write the actual data */
    writeb(data, flash->base + target_addr);

    /* Wait for completion (polling, not interrupts) */
    while (readb(flash->base) & BUSY_BIT)
        cpu_relax();

    local_irq_restore(flags);
}

/* Example: Programming PCI configuration space */
void pci_config_write(struct pci_dev *dev, int where, u32 val)
{
    unsigned long flags;

    /*
     * PCI configuration access uses port I/O to the root complex.
     * The CONFIG_ADDRESS and CONFIG_DATA registers must be
     * accessed as an atomic pair.
     *
     * If interrupted between address and data writes:
     * - Another driver might change CONFIG_ADDRESS
     * - We'd write to the wrong device/register
     */
    raw_spin_lock_irqsave(&pci_lock, flags);

    /* Set address: bus, device, function, register */
    outl(0x80000000 | (dev->bus << 16) | (dev->devfn << 8) | where,
         0xCF8);          /* CONFIG_ADDRESS */

    /* Write data */
    outl(val, 0xCFC);     /* CONFIG_DATA */

    raw_spin_unlock_irqrestore(&pci_lock, flags);
}

/*
 * NOTE: These use raw_spin_lock_irqsave in PREEMPT_RT kernels
 * because hardware access truly needs non-preemptible execution.
 * A regular spin_lock would become a sleeping lock in RT kernels,
 * which is wrong for hardware operations.
 */
```

Given the range of synchronization options available, how do you decide when interrupt disabling is appropriate? Here's a structured decision framework:
Step 1: Identify What You're Protecting Against
As framed above, ask whether the threat is an interrupt handler on this CPU, other CPUs, or both. If only same-CPU interrupt handlers can race with you, interrupt disabling alone is sufficient; if other CPUs can also reach the data, you need a spinlock or another cross-CPU mechanism as well.
Step 2: Consider Duration
Even when interrupt disabling is appropriate, duration matters:
| Duration | Appropriateness | Actions |
|---|---|---|
| < 1μs | Always appropriate | No concerns |
| 1-10μs | Generally appropriate | Use for short, bounded operations |
| 10-100μs | Use with caution | Verify no real-time requirements violated |
| 100μs-1ms | Avoid if possible | Restructure to enable/disable between chunks |
| > 1ms | Almost never appropriate | Design flaw; refactor required |
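As a concrete illustration of the 'Restructure to enable/disable between chunks' guidance, here is a minimal sketch (the buffer type, CHUNK_SIZE, and process_chunk() are all hypothetical) that briefly reopens the interrupt window between bounded pieces of work:

```c
#include <linux/irqflags.h>
#include <linux/minmax.h>
#include <linux/types.h>

#define CHUNK_SIZE 64   /* hypothetical: sized so one chunk stays well under 100μs */

struct my_buffer {      /* hypothetical buffer type */
    size_t len;
    u8 data[];
};

void process_chunk(u8 *p, size_t n);   /* hypothetical, bounded helper */

void process_large_buffer(struct my_buffer *buf)
{
    unsigned long flags;
    size_t i;

    for (i = 0; i < buf->len; i += CHUNK_SIZE) {
        size_t n = min_t(size_t, CHUNK_SIZE, buf->len - i);

        local_irq_save(flags);
        process_chunk(&buf->data[i], n);   /* one bounded chunk, IRQs off */
        local_irq_restore(flags);
        /* Interrupt window: pending IRQs on this CPU are serviced here */
    }
}
```

Each pass through the loop keeps interrupts off for only one bounded chunk, so the worst-case latency is set by the chunk size rather than the total buffer length.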
Step 3: Check Constraints
Before using interrupt disabling, verify none of these constraints apply:
Can the critical section block? If there's ANY chance of blocking (waiting for I/O, acquiring a sleeping lock, allocating memory that might wait), you CANNOT use interrupt disabling; see the sketch after this checklist.
Are there real-time requirements? If your system has hard real-time deadlines, every interrupt-disabled section contributes to worst-case latency.
What's the call depth? If your critical section calls other functions, do you know their IRQ requirements and durations?
Is this on the fast path? Frequently executed code paths accumulate latency. Rare error-handling paths are more tolerant.
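To make the first constraint concrete, here is a minimal sketch (the event structures, queue, and helper names are hypothetical) showing how to keep anything that can sleep out of the IRQ-off window:

```c
#include <linux/slab.h>
#include <linux/list.h>
#include <linux/irqflags.h>

struct event {                 /* hypothetical */
    struct list_head list;
    int value;
};

struct event_queue {           /* hypothetical */
    struct list_head events;
};

void queue_event(struct event_queue *q, int value)
{
    unsigned long flags;
    struct event *e;

    /*
     * WRONG inside an IRQ-off section: kmalloc(GFP_KERNEL) may sleep.
     * Allocate BEFORE disabling interrupts, or use GFP_ATOMIC.
     */
    e = kmalloc(sizeof(*e), GFP_ATOMIC);
    if (!e)
        return;
    e->value = value;

    local_irq_save(flags);
    list_add_tail(&e->list, &q->events);   /* short and non-blocking */
    local_irq_restore(flags);
}
```

The same reasoning applies to sleeping locks (mutexes) and to any I/O wait: move them outside the interrupt-disabled window or choose a non-sleeping variant.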
```c
/* Decision framework examples */

/* SCENARIO 1: Updating per-CPU counter from IRQ handler */
/* Analysis:
 * - Data: per-CPU counter → no cross-CPU races
 * - Threat: only same-CPU interrupt handlers
 * - Duration: single increment → ~nanoseconds
 * - Blocking: no
 *
 * DECISION: local_irq_save/restore ALONE is perfect
 */
void update_percpu_counter(void)
{
    unsigned long flags;

    local_irq_save(flags);
    __this_cpu_inc(my_counter);   /* Single operation */
    local_irq_restore(flags);
}

/* SCENARIO 2: Modifying global device list */
/* Analysis:
 * - Data: global list → cross-CPU races possible
 * - Threat: other CPUs + IRQ handlers
 * - Duration: list manipulation → ~microseconds
 * - Blocking: no
 *
 * DECISION: spin_lock_irqsave (spinlock + IRQ disable)
 */
void add_device_to_global_list(struct device *dev)
{
    unsigned long flags;

    spin_lock_irqsave(&device_list_lock, flags);
    list_add(&dev->list, &device_list);
    spin_unlock_irqrestore(&device_list_lock, flags);
}

/* SCENARIO 3: Loading module from filesystem */
/* Analysis:
 * - Data: module catalog → cross-process races
 * - Threat: other threads (not really an IRQ concern)
 * - Duration: potentially long (filesystem I/O!)
 * - Blocking: YES - disk I/O blocks
 *
 * DECISION: mutex (sleeping lock, NOT spinlock/irq_save)
 */
int load_module(const char *name)
{
    int ret;

    mutex_lock(&module_mutex);            /* Can sleep! */
    ret = find_and_load_module(name);     /* May block on I/O */
    if (ret == 0)
        add_to_loaded_modules(name);
    mutex_unlock(&module_mutex);

    return ret;
}

/* SCENARIO 4: Extremely time-sensitive calibration */
/* Analysis:
 * - Data: hardware calibration registers
 * - Threat: interrupts interfering with timing
 * - Duration: ~50-100μs for calibration loop
 * - Blocking: no
 * - Frequency: once at boot
 *
 * DECISION: local_irq_save acceptable despite duration
 * (one-time cost, timing accuracy critical)
 */
void calibrate_tsc(void)
{
    unsigned long flags;
    u64 start, end, cycles;

    local_irq_save(flags);   /* Long, but once at boot */

    start = rdtsc();
    udelay(100);             /* Busy-wait, not sleep! */
    end = rdtsc();

    local_irq_restore(flags);

    cycles = end - start;
    tsc_calibration = cycles / 100;
}
```

When you've determined that interrupt disabling is appropriate for your situation, follow these best practices to minimize risks and maximize correctness.
```c
/* Best practices demonstration */

#include <linux/spinlock.h>
#include <linux/irqflags.h>
#include <linux/lockdep.h>

/*
 * BEST PRACTICE: Document IRQ context requirements
 */

/**
 * process_urgent_event - handles time-critical events
 * @event: event to process
 *
 * Context: Can be called from any context, including IRQ.
 *          Function will disable interrupts internally.
 *          Critical section duration: < 5μs.
 *
 * Return: 0 on success, negative error code on failure.
 */
int process_urgent_event(struct event *event)
{
    unsigned long flags;
    int result;

    /* Do validation with IRQs enabled (might be slow) */
    if (!validate_event(event))
        return -EINVAL;

    /* Minimal critical section */
    local_irq_save(flags);
    result = quick_event_handler(event);   /* < 5μs */
    local_irq_restore(flags);

    /* Logging and cleanup with IRQs enabled */
    log_event_result(event, result);
    return result;
}

/*
 * BEST PRACTICE: Use spin_lock_irqsave for combined protection
 */
static DEFINE_SPINLOCK(data_lock);

void update_shared_data(struct data *d)
{
    unsigned long flags;

    /*
     * GOOD: Single call handles both IRQ save and lock.
     * Easier to audit than separate operations.
     */
    spin_lock_irqsave(&data_lock, flags);
    d->field1++;
    d->field2 = compute_field2();
    spin_unlock_irqrestore(&data_lock, flags);

    /*
     * BAD: Separate operations, harder to verify correct nesting
     *   local_irq_save(flags);
     *   spin_lock(&data_lock);
     *   ...
     *   spin_unlock(&data_lock);
     *   local_irq_restore(flags);
     */
}

/*
 * BEST PRACTICE: Use lockdep annotations when needed
 */
void function_requires_irqs_off(void)
{
    /* Assert that interrupts are already disabled */
    lockdep_assert_irqs_disabled();

    /* Proceed with confidence */
    do_quick_work();
}

void function_requires_irqs_on(void)
{
    /* Assert that interrupts are enabled */
    lockdep_assert_irqs_enabled();

    /* Safe to do potentially lengthy work */
    do_lengthy_work();
}

/*
 * BEST PRACTICE: Restructure to minimize IRQ-disable time
 */

/* BAD: Long critical section */
void process_list_bad(struct list_head *items)
{
    unsigned long flags;
    struct item *item;

    local_irq_save(flags);
    list_for_each_entry(item, items, list) {
        process_item(item);   /* Could be slow! */
    }
    local_irq_restore(flags);
}

/* GOOD: Short critical sections, IRQs enabled between */
void process_list_good(struct list_head *items)
{
    unsigned long flags;
    struct item *item, *safe;
    struct item *batch[16];
    int count = 0;

    /* Phase 1: Extract items (short critical section) */
    local_irq_save(flags);
    list_for_each_entry_safe(item, safe, items, list) {
        list_del(&item->list);
        batch[count++] = item;
        if (count >= 16)
            break;
    }
    local_irq_restore(flags);

    /* Phase 2: Process (IRQs enabled!) */
    for (int i = 0; i < count; i++) {
        process_item(batch[i]);   /* Can be slow, IRQs enabled */
    }

    /* Repeat until the list is empty */
}
```

Let's examine how interrupt disabling is used in practice within the Linux kernel—one of the most scrutinized and optimized codebases in existence.
Example 1: Scheduler Context Switch
The Linux scheduler disables interrupts during the actual context switch to ensure atomicity:
```c
/* Simplified from kernel/sched/core.c */

/*
 * context_switch - switch to the new MM and the new thread's register state.
 */
static __always_inline struct rq *
context_switch(struct rq *rq, struct task_struct *prev,
               struct task_struct *next)
{
    /*
     * The scheduler is THE canonical example of IRQ-disabled sections.
     * We cannot allow an interrupt during the actual CPU state switch.
     *
     * Why not?
     * - The CPU registers are in an inconsistent state
     * - Stack pointers are being changed
     * - The 'current' task changes mid-operation
     * - An interrupt now would corrupt everything
     */

    /* Already holding rq->lock with IRQs disabled */
    prepare_lock_switch(rq, next);

    /* Switch to next's memory mapping */
    if (!prev->mm) {
        /* Kernel thread - borrow next's mm */
        next->active_mm = prev->active_mm;
        prev->active_mm = NULL;
    } else {
        /* User task - switch to its address space */
        switch_mm(prev->active_mm, next->mm, next);
    }

    /*
     * THE CRITICAL MOMENT: switch_to() does the actual
     * register swap. This is assembly-level, architecture-
     * specific code that saves prev's registers and
     * loads next's registers.
     *
     * Duration: ~500ns - 2μs typically
     */
    switch_to(prev, next, prev);

    /* When we return here, we're running as 'next' */
    return finish_task_switch(prev);
}
```

Example 2: Timer Interrupt Handling
The timer tick handler demonstrates the IRQ-disabled → bottom-half pattern:
```c
/* Simplified from kernel/time/timer.c */

/*
 * This is called from the hardware timer interrupt.
 * The interrupt handler runs with IRQs disabled.
 */
void update_process_times(int user_tick)
{
    struct task_struct *p = current;

    /*
     * We're in hardirq context: interrupts are disabled.
     * Keep this FAST. Just update counters.
     */

    /* Account CPU time: ~20-50 CPU cycles */
    account_process_tick(p, user_tick);

    /* Update jiffies and trigger the timer subsystem */
    run_local_timers();

    /* Check if the current task should be preempted */
    scheduler_tick();

    /*
     * Total duration: ~1-5μs
     * This is acceptable because the timer tick is crucial.
     *
     * HEAVY work (running expired timers, sending signals)
     * is deferred to softirq context with IRQs enabled.
     */
}

/*
 * run_timer_softirq - runs after IRQs are re-enabled
 * Called from softirq context (IRQs ENABLED)
 */
static void run_timer_softirq(struct softirq_action *h)
{
    struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);

    /* Now we can do the expensive work with IRQs enabled */
    spin_lock_irq(&base->lock);

    while (time_before_eq(base->clk, jiffies)) {
        /* Expire each timer: might trigger callbacks */
        expire_timers(base, heads);
    }

    spin_unlock_irq(&base->lock);
}
```

Example 3: Network Device Driver
The NAPI subsystem shows the modern approach to minimizing IRQ-disabled time in high-throughput paths:
```c
/* Simplified from a typical network driver using NAPI */

/*
 * NAPI (New API) is designed to minimize interrupt overhead
 * while maintaining high throughput.
 *
 * Philosophy:
 * 1. Hardware interrupt: VERY short, disable further IRQs
 * 2. Software poll: Process many packets with IRQs ENABLED
 * 3. Repeat as needed
 */

/* Hardware interrupt handler */
static irqreturn_t my_nic_interrupt(int irq, void *dev_id)
{
    struct my_nic_dev *dev = dev_id;

    /*
     * We're in hardirq context: this IRQ is disabled on this CPU.
     * Get out FAST.
     */

    /* Acknowledge interrupt to hardware (~50 cycles) */
    iowrite32(IRQ_ACK, dev->regs + IRQ_STATUS);

    /* Disable further RX interrupts from this device */
    iowrite32(0, dev->regs + RX_IRQ_ENABLE);

    /* Schedule NAPI poll to run in softirq context */
    napi_schedule(&dev->napi);

    return IRQ_HANDLED;

    /*
     * Total duration: < 1μs
     * Even at millions of packets/sec, this is sustainable
     */
}

/* NAPI poll function - runs in softirq context */
static int my_nic_poll(struct napi_struct *napi, int budget)
{
    struct my_nic_dev *dev = container_of(napi, struct my_nic_dev, napi);
    int processed = 0;

    /*
     * NOW we're in softirq context.
     * INTERRUPTS ARE ENABLED!
     * Timer interrupts fire. Other devices' IRQs work.
     * This is where we do the heavy lifting.
     */

    /* Process up to 'budget' packets (budget ~64-128) */
    while (processed < budget && rx_ring_has_packets(dev)) {
        struct sk_buff *skb = receive_packet(dev);
        netif_receive_skb(skb);   /* May trigger the protocol stack */
        processed++;
    }

    /* If we processed everything, re-enable interrupts */
    if (processed < budget) {
        napi_complete_done(napi, processed);
        iowrite32(1, dev->regs + RX_IRQ_ENABLE);
    }

    return processed;

    /*
     * Duration: 10s-100s of μs (processing many packets)
     * But IRQs are ENABLED the whole time!
     * Other system functions continue normally.
     */
}
```

Notice the consistent pattern in all these examples: the hardirq handler (IRQ disabled) is extremely short—just acknowledge the event and schedule further work. The actual processing happens in softirq or tasklet context with interrupts enabled. This 'short hard, long soft' pattern is the standard for high-performance kernel code.
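Outside of networking, the same deferral pattern can be sketched with a tasklet; everything below (my_dev, IRQ_ACK_REG, heavy_processing()) is a hypothetical placeholder rather than a real driver:

```c
#include <linux/interrupt.h>
#include <linux/io.h>

#define IRQ_ACK_REG 0x04   /* hypothetical register offset */

struct my_dev {                        /* hypothetical device */
    void __iomem *regs;
    struct tasklet_struct task;
};

void heavy_processing(struct my_dev *dev);   /* hypothetical, slow work */

/* Softirq context: interrupts are enabled, heavy work goes here */
static void my_dev_tasklet(struct tasklet_struct *t)
{
    struct my_dev *dev = from_tasklet(dev, t, task);

    heavy_processing(dev);
}

/* Hardirq context: acknowledge and defer, nothing more */
static irqreturn_t my_dev_irq(int irq, void *data)
{
    struct my_dev *dev = data;

    iowrite32(1, dev->regs + IRQ_ACK_REG);
    tasklet_schedule(&dev->task);
    return IRQ_HANDLED;
}

static int my_dev_setup_irq(struct my_dev *dev, int irq)
{
    tasklet_setup(&dev->task, my_dev_tasklet);
    return request_irq(irq, my_dev_irq, 0, "my_dev", dev);
}
```

In newer kernels the same split is often expressed with threaded interrupts (request_threaded_irq()) or workqueues instead of tasklets, but the division of labor is identical: acknowledge in hardirq context, do the heavy processing with interrupts enabled.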
Before choosing interrupt disabling, consider whether alternative mechanisms might be more appropriate for your use case:
| Mechanism | When to Use | Advantages | Disadvantages |
|---|---|---|---|
| RCU (Read-Copy-Update) | Read-heavy workloads with rare updates | Readers never block, excellent scaling | Complex update semantics, memory overhead |
| Per-CPU variables | Data that's truly per-CPU | No cross-CPU locking overhead | Can't share data directly between CPUs |
| Atomic operations | Simple counters, flags, pointers | No scheduler latency impact | Limited to single-word operations |
| Mutex | Complex critical sections that may block | Can sleep, fairer | Higher overhead than spinlock |
| RW locks | Read-heavy, shared data | Multiple readers simultaneously | Writer starvation possible |
| Lock-free data structures | High contention, performance critical | No blocking ever | Complex to implement correctly |
| Disable preemption only | Per-CPU data, no IRQ handler access | IRQs still work, lower latency | Doesn't protect against IRQs |
```c
/* Alternatives that avoid interrupt disabling */

#include <linux/rculist.h>
#include <linux/percpu.h>
#include <linux/atomic.h>

/*
 * ALTERNATIVE 1: RCU for read-mostly data
 * No IRQ disabling needed for readers!
 */
struct config_data __rcu *global_config;

struct config_data *read_config(void)
{
    struct config_data *cfg;

    rcu_read_lock();   /* Just a preempt_disable on most configs */

    cfg = rcu_dereference(global_config);
    /* Read cfg fields */
    use_config(cfg);

    rcu_read_unlock();

    return cfg;
    /* NO interrupt disabling! Readers proceed at full speed */
}

void update_config(struct config_data *new_cfg)
{
    struct config_data *old_cfg;

    /* Writer uses a mutex, but readers aren't affected */
    mutex_lock(&config_mutex);

    old_cfg = rcu_dereference_protected(global_config,
                                        lockdep_is_held(&config_mutex));
    rcu_assign_pointer(global_config, new_cfg);

    mutex_unlock(&config_mutex);

    synchronize_rcu();   /* Wait for all readers */
    kfree(old_cfg);
}

/*
 * ALTERNATIVE 2: Atomic operations for simple data
 * No locking at all!
 */
atomic_t connection_count = ATOMIC_INIT(0);
atomic_t bytes_transferred = ATOMIC_INIT(0);

void track_connection(int bytes)
{
    atomic_inc(&connection_count);
    atomic_add(bytes, &bytes_transferred);
    /* No critical section, no IRQ concerns */
}

long get_stats(void)
{
    int conns = atomic_read(&connection_count);
    int bytes = atomic_read(&bytes_transferred);

    /* Reads are atomic, no locking */
    return conns * bytes;
}

/*
 * ALTERNATIVE 3: preempt_disable for per-CPU data
 * When IRQ handlers don't access the data
 */
DEFINE_PER_CPU(struct cache_stats, local_cache);

void update_cache_hit(void)
{
    /*
     * If no interrupt handler accesses local_cache,
     * we only need to prevent migrating to another CPU,
     * not prevent interrupts entirely.
     */
    preempt_disable();
    __this_cpu_inc(local_cache.hits);
    preempt_enable();

    /*
     * Lower overhead than IRQ disable.
     * Interrupts can still fire and be serviced.
     * Only prevents preemption (CPU migration).
     */
}

/*
 * DECISION TREE:
 *
 * Is data per-CPU?
 * ├─ Yes: Do IRQ handlers access it?
 * │   ├─ Yes: local_irq_save + __this_cpu_*
 * │   └─ No:  preempt_disable + __this_cpu_*
 * └─ No:  Is it read-mostly?
 *     ├─ Yes: Consider RCU
 *     └─ No:  Is it a simple counter/flag?
 *         ├─ Yes: Use atomic_t
 *         └─ No:  Use spinlock (+ irqsave if IRQ-accessed)
 */
```

More sophisticated mechanisms like RCU have higher implementation complexity but often provide better performance characteristics. The choice depends on your specific workload pattern, contention level, and performance requirements. For simple cases, straightforward spinlock+irqsave is often the right choice; for hot paths in production systems, investing in RCU or lock-free approaches pays off.
We've completed our comprehensive exploration of the interrupt disable approach. Let's consolidate the decision-making guidance:
Module Complete: Disabling Interrupts
Across these five pages, we've covered the mechanism of interrupt disabling, its limitations and constraints, the canonical use cases, a decision framework for choosing among synchronization tools, and best practices illustrated with real-world kernel examples.
You now have a complete understanding of this foundational synchronization technique, including its power, its limitations, and its proper application in operating system development.
Congratulations! You've mastered the interrupt disable approach to mutual exclusion. You understand when to use it (per-CPU data, IRQ handler synchronization, combined with spinlocks), when to avoid it (long operations, blocking, real-time paths), and the best practices for implementation. This knowledge prepares you for more advanced synchronization topics like locks, semaphores, and lock-free algorithms.