CPU Execution Modes

Level: Beginner

Duration: 60 mins

Topic: CPU Execution Modes

Kernel Mode (Supervisor Mode)

The Seat of Power

If User Mode is a carefully constructed prison, Kernel Mode is the warden's office—with master keys to every door, cameras in every room, and a killswitch for the entire facility.

Kernel Mode (also called Supervisor Mode, Privileged Mode, or Ring 0) is the CPU execution state where the operating system kernel runs. In this mode, code has unrestricted access to:

Every byte of physical memory
Every hardware device and peripheral
Every processor feature and control register
Every running process's state and data

This is absolute power. And as the saying goes, absolute power requires absolute responsibility. A single bug in Kernel Mode code doesn't crash one application—it crashes the entire system. A security vulnerability in Kernel Mode code doesn't compromise one user's data—it potentially exposes everything on the machine to attackers.

What You Will Learn

By the end of this page, you will understand: (1) What makes Kernel Mode 'privileged' and how this is architecturally enforced, (2) The specific operations that become possible in Kernel Mode, (3) Why the OS kernel must run in this mode, (4) The responsibilities and dangers of Kernel Mode execution, and (5) How modern systems minimize the amount of code running with full privileges.

Defining Kernel Mode

Kernel Mode is the processor execution state characterized by maximum privileges. Code executing in Kernel Mode can perform any operation the CPU is capable of, without hardware-enforced restrictions.

The formal definition:

Kernel Mode is the most privileged processor execution state, wherein executing code has unfettered access to all hardware resources, memory addresses, and processor features, limited only by the physical capabilities of the hardware itself.

Why 'Kernel' Mode?

The name reflects what runs here: the kernel—the core of the operating system that manages all system resources. The kernel is not a single program but a collection of:

Scheduler — Decides which process runs next
Memory Manager — Allocates and protects memory
File System — Organizes persistent storage
Device Drivers — Communicate with hardware
Interrupt Handlers — Respond to hardware signals
System Call Interface — Serves application requests

All of these require hardware access that would be dangerous to grant to arbitrary applications.

Kernel Mode Nomenclature Across Platforms
Platform/Architecture	Name for Kernel Mode	Technical Designation
x86/x64	Kernel Mode / Ring 0	CPL = 0
ARM (Cortex-A)	Privileged Mode / EL1	Exception Level 1
ARM (older)	Supervisor Mode (SVC)	Mode bits in CPSR
RISC-V	Supervisor Mode (S-mode)	Privilege encoding 01 (tracked internally by the hart)
MIPS	Kernel Mode	KSU field = 00 in Status
PowerPC	Supervisor State	MSR[PR] = 0

Beyond Kernel Mode: Hypervisor Mode

Modern processors often have an even more privileged mode than traditional Kernel Mode: Hypervisor Mode (Ring -1, EL2 on ARM). This allows a hypervisor to control multiple operating systems, each of which believes it's running in Kernel Mode. The hypervisor can intercept and virtualize the 'kernel's' privileged operations. This creates a hierarchy: Hardware → Hypervisor → Kernel → User applications.

The Powers of Kernel Mode

Kernel Mode unlocks every capability the processor provides. Let's examine the specific powers that become available when the processor's privilege level is elevated to Kernel Mode.

1. Unrestricted Memory Access

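Kernel Mode code can reach any physical memory address, because the kernel controls the page tables and can map whatever it needs. With paging enabled, its accesses still go through address translation, and settings such as CR0.WP, SMEP, and SMAP can restrict them, but only Kernel Mode code can change those settings. Reachable memory includes: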

All user-space memory of every process
Kernel data structures
Memory-mapped device registers
Even addresses that aren't RAM (like video memory, PCI config space)

2. Control Register Access

The CPU's control registers configure fundamental processor behavior:

x86_control_registers.txt
// x86/x64 Control Registers (accessible only in Ring 0)
 
CR0 - Protection Enable, Paging Enable, Write Protect
      ├── PE (bit 0): Enable protected mode
      ├── PG (bit 31): Enable paging
      └── WP (bit 16): Write-protect in kernel mode
 
CR2 - Page Fault Linear Address
      └── Contains faulting address after page fault
 
CR3 - Page Directory Base Register (PDBR)
      ├── Physical address of top-level page table
      └── Changing this switches address spaces!
 
CR4 - Extended Control Features
      ├── PAE: Physical Address Extension
      ├── PSE: Page Size Extension (4MB pages)
      ├── SMEP: Supervisor Mode Execution Prevention
      ├── SMAP: Supervisor Mode Access Prevention
      └── PCIDE: Process-Context Identifiers Enable
 
CR8 (x64 only) - Task Priority Level
      └── Controls interrupt filtering
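
The bit positions in the CR0 entry above can be turned into a small decoder. This is a minimal sketch that only interprets an integer; it does not read the real register, which user-space code cannot do. The sample value and the function name `decode_cr0` are assumptions made for illustration.

```python
# Bit positions taken from the CR0 listing above.
CR0_FLAGS = {"PE": 0, "WP": 16, "PG": 31}

def decode_cr0(value):
    """Return the state of each flag named in CR0_FLAGS for a CR0 value."""
    return {name: bool((value >> bit) & 1) for name, bit in CR0_FLAGS.items()}

# Illustrative value only; it was not read from real hardware.
sample = 0x80050033
flags = decode_cr0(sample)
print(flags)
```

With this sample value all three flags decode as set, which corresponds to protected mode with paging and supervisor write protection enabled.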

3. Interrupt and Exception Control

Kernel Mode code can:

Disable/enable CPU interrupts (CLI/STI instructions)
Configure the Interrupt Descriptor Table (LIDT instruction)
Set up interrupt handlers
Acknowledge and dismiss hardware interrupts
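
To make the effect of disabling interrupts concrete, here is a toy model of the interrupt flag. It is a sketch under simplifying assumptions: real interrupt controllers do not keep an unbounded queue, and the class and method names are hypothetical, chosen only to mirror the CLI and STI instructions.

```python
class InterruptController:
    """Toy model of the CPU interrupt flag; not a real hardware interface."""

    def __init__(self):
        self.interrupts_enabled = True
        self.pending = []     # interrupts raised while masked
        self.delivered = []   # interrupts handed to a handler, in order

    def cli(self):
        """Mask interrupts, as the CLI instruction does."""
        self.interrupts_enabled = False

    def sti(self):
        """Unmask interrupts and deliver everything that queued up."""
        self.interrupts_enabled = True
        while self.pending:
            self.delivered.append(self.pending.pop(0))

    def raise_irq(self, vector):
        """A device signals an interrupt with the given vector number."""
        if self.interrupts_enabled:
            self.delivered.append(vector)
        else:
            self.pending.append(vector)

pic = InterruptController()
pic.raise_irq(32)          # delivered immediately
pic.cli()                  # enter a critical section
pic.raise_irq(33)          # held back while masked
pic.raise_irq(34)
held = list(pic.pending)
pic.sti()                  # leave the critical section
print(held, pic.delivered)
```

The model also shows why a kernel that masks interrupts and never unmasks them freezes the machine: nothing in `pending` is ever delivered.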

4. Direct Hardware I/O

Two methods of hardware communication become available:

Port I/O: IN/OUT instructions to communicate with devices via I/O ports
Memory-Mapped I/O: Direct read/write to device registers mapped into physical address space
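
Memory-mapped I/O amounts to ordinary loads and stores at fixed offsets inside a device's register window. The sketch below simulates that idea with a `bytearray` standing in for the mapped region; the register offsets and helper names are hypothetical, and a real driver would operate on a kernel mapping of the device's physical address range.

```python
import struct

# Hypothetical register layout for an imaginary device.
REG_STATUS = 0x00
REG_DATA = 0x04

def mmio_write32(window, offset, value):
    """Store a 32-bit little-endian value at a register offset."""
    struct.pack_into("<I", window, offset, value)

def mmio_read32(window, offset):
    """Load a 32-bit little-endian value from a register offset."""
    return struct.unpack_from("<I", window, offset)[0]

device = bytearray(8)      # stands in for the mapped register window
mmio_write32(device, REG_DATA, 0xDEADBEEF)
mmio_write32(device, REG_STATUS, 0x1)
print(hex(mmio_read32(device, REG_DATA)), mmio_read32(device, REG_STATUS))
```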

5. Privileged Instructions

Dozens of CPU instructions that are forbidden in User Mode become available:

Examples of Privileged Instructions (x86)

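•CLI / STI — Clear/Set Interrupt Flag (disable/enable interrupts); IOPL-sensitive, so permitted only when CPL ≤ IOPL (in practice, Ring 0)
•IN / OUT — Port-based I/O with hardware devices; also IOPL-sensitive, unless the TSS I/O permission bitmap grants access to the port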
•HLT — Halt the CPU until next interrupt (enter low-power state)
•LGDT / LIDT — Load Global/Interrupt Descriptor Table registers
•MOV to CRn — Modify control registers
•WRMSR / RDMSR — Write/Read Model-Specific Registers
•INVLPG — Invalidate TLB entry for an address
•IRET — Return from interrupt; not privileged in itself, but it is how the kernel drops back to a less privileged level (it can never raise privilege)
•LMSW — Load Machine Status Word (legacy CR0 load)
•CLTS — Clear Task-Switched flag in CR0

With Great Power...

Every power available in Kernel Mode is also a potential weapon. Write to the wrong control register? System crash. Disable interrupts and enter a loop? System freeze. Corrupt page tables? Random processes die or, worse, subtly corrupt data. This is why the kernel codebase is so carefully reviewed and tested—bugs here are catastrophic.

Why the OS Needs Kernel Mode

The operating system's fundamental responsibilities require powers that cannot be safely granted to user applications. Let's examine each core kernel function and understand why it demands privileged execution.

1. Process Management: The Scheduler

The scheduler decides which process runs next. This requires:

Reading/writing the saved state (registers) of every process
Manipulating the CPU's execution context
Setting timer interrupts to enforce time slices
Accessing the ready queue (kernel data structure)
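
The interplay between the ready queue and time slices can be sketched as a round-robin simulation. This is an illustrative model only: the function name, the process names, and the burst times are assumptions, and a real scheduler preempts via a hardware timer interrupt rather than a loop.

```python
from collections import deque

def round_robin(burst_times, time_slice):
    """Return the sequence of (pid, ran_for) slices under round-robin."""
    ready = deque(burst_times.items())    # the ready queue
    timeline = []
    while ready:
        pid, remaining = ready.popleft()
        ran = min(time_slice, remaining)
        timeline.append((pid, ran))
        if remaining > ran:
            # Time slice expired: preempt the process and requeue it.
            ready.append((pid, remaining - ran))
    return timeline

timeline = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
print(timeline)
```

If a process could disable the timer, the preemption branch would never run for it, which is the "unpreemptable process" risk in concrete form.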

2. Memory Management: The Virtual Memory System

Virtual memory requires:

Creating and modifying page tables (controls all memory access)
Accessing physical memory addresses directly
Handling page faults (loading pages from disk)
Maintaining the Translation Lookaside Buffer (TLB)
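
Address translation and page faults can be illustrated with a single-level page table. Real hardware uses multi-level tables and caches translations in the TLB; this sketch assumes 4 KB pages, and the dictionary layout and names are hypothetical.

```python
PAGE_SIZE = 4096

class PageFault(Exception):
    """Raised when a virtual address has no present mapping."""

def translate(page_table, vaddr):
    """Map a virtual address to a physical address via page_table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)   # virtual page number, offset
    entry = page_table.get(vpn)
    if entry is None or not entry["present"]:
        raise PageFault(hex(vaddr))
    return entry["frame"] * PAGE_SIZE + offset

page_table = {
    0: {"frame": 7, "present": True},
    1: {"frame": 0, "present": False},   # e.g. swapped out to disk
}
print(hex(translate(page_table, 0x0123)))
```

In a real kernel the page-fault handler would load the missing page and retry the access; here the exception simply propagates to the caller.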

Kernel Functions and Their Required Privileges
Kernel Function	Privileged Operations Required	Risk If User-Accessible
Scheduling	Timer interrupt control, context switching	Process could make itself unpreemptable
Memory Management	Page table modification, TLB control	Process could access any memory
File System	Disk I/O, inode manipulation	Process could corrupt filesystem
Network Stack	NIC hardware access, packet routing	Process could sniff all network traffic
Device Drivers	Direct hardware communication	Process could damage hardware
Security/Access Control	Permission checking, credential management	Process could bypass all security

3. Device Drivers: Hardware Communication

Devices don't speak C or Python—they communicate through:

Hardware registers at specific memory addresses
I/O ports (legacy PC devices)
Interrupts signaling events
DMA (Direct Memory Access) transfers

All of these require Kernel Mode access. A user-space process with direct hardware access could:

Read any memory via DMA
Crash the system by misconfiguring devices
Bypass all file system permissions by reading disks directly

4. Interrupt Handling

When hardware needs attention (keyboard press, network packet, timer tick), it triggers an interrupt. The CPU:

Saves current execution state
Automatically switches to Kernel Mode
Jumps to a predetermined handler address

Interrupt handlers run in Kernel Mode because they must:

Access device registers to acknowledge the interrupt
Update kernel data structures (e.g., packet queues)
Potentially wake up sleeping processes
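
The three steps the CPU takes on an interrupt (save state, switch mode, jump to the handler) can be modelled as follows. This is a sketch with hypothetical names; the vector number and handler are made up for illustration, and real hardware saves considerably more state.

```python
USER, KERNEL = 3, 0

class Cpu:
    """Toy CPU that tracks only its mode and program counter."""

    def __init__(self):
        self.mode = USER
        self.pc = 0x4000
        self.saved = []          # stands in for state saved on entry
        self.vector_table = {}   # vector number -> handler function
        self.log = []

    def interrupt(self, vector):
        self.saved.append((self.mode, self.pc))   # 1. save current state
        self.mode = KERNEL                        # 2. switch to Kernel Mode
        self.vector_table[vector](self)           # 3. jump to the handler
        self.mode, self.pc = self.saved.pop()     # return from interrupt

def keyboard_handler(cpu):
    # Record the mode the handler observed while running.
    cpu.log.append(("keyboard", cpu.mode))

cpu = Cpu()
cpu.vector_table[33] = keyboard_handler
cpu.interrupt(33)
print(cpu.log, cpu.mode, hex(cpu.pc))
```

The handler sees Kernel Mode while it runs, and the interrupted code resumes in User Mode at the same program counter afterwards.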

The Minimal Privilege Principle

Good OS design runs as little code in Kernel Mode as possible, an application of the principle of least privilege. The more code runs with full privileges, the larger the attack surface and the higher the probability of devastating bugs. This principle motivated microkernel architectures, where even device drivers run in User Mode and communicate with a minimal kernel via IPC.

Architectural Implementation of Kernel Mode

Just as User Mode is enforced by hardware, Kernel Mode status is tracked and enforced at the silicon level. Let's examine how different architectures implement this.

x86/x64 Implementation (Protection Rings)

The x86 architecture defines four protection rings (0-3), though most OSes use only two:

Ring 0 (CPL=0): Kernel Mode — Full privileges
Ring 3 (CPL=3): User Mode — Restricted privileges

The Current Privilege Level (CPL) is stored in bits 0-1 of the Code Segment (CS) register. Every memory access and privileged instruction checks this value.
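
Because the privilege level lives in the low two bits of a segment selector, it can be extracted with a mask. The sketch below splits a selector into its fields; the two sample values are selectors commonly seen for kernel and user code on x86-64 Linux and should be treated as illustrative rather than guaranteed.

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    return {
        "privilege": selector & 0b11,                    # bits 0-1
        "table": "LDT" if selector & 0b100 else "GDT",   # bit 2
        "index": selector >> 3,                          # bits 3-15
    }

print(decode_selector(0x10))   # kernel code selector in this example
print(decode_selector(0x33))   # user code selector in this example
```

When the selector is the one loaded in CS, the `privilege` field is the CPL: 0 for the first value and 3 for the second.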

What happens with each instruction:

privilege_check_algorithm.txt
// Simplified model of the checks the CPU applies to each instruction:

function checkInstructionPrivilege(instruction) {
    CPL = CS.PrivilegeLevel;  // Current Privilege Level (bits 0-1 of CS)

    // Check 1: Privileged instruction outside Ring 0?
    if (instruction.isPrivileged && CPL != 0) {
        raise GeneralProtectionFault("#GP");
    }

    // Check 2: Memory access privilege?
    if (instruction.accessesMemory) {
        targetPage = translateAddress(instruction.address);

        // Paging treats CPL 3 as user and CPL 0-2 as supervisor
        if (CPL == 3 && targetPage.supervisorOnly) {
            // User code trying to access a supervisor page
            raise PageFault();
        }

        // Write check: supervisor writes to read-only pages fault
        // only when CR0.WP is set
        if (instruction.isWrite && !targetPage.writeable
                && (CPL == 3 || CR0.WP)) {
            raise PageFault();
        }
    }
}
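
The checks in the listing above can be exercised with a small runnable model. This is a sketch, not a description of the actual microarchitecture: the function and field names are hypothetical, and the `cr0_wp` parameter is an added assumption that models the CR0 write-protect flag for supervisor writes.

```python
class Fault(Exception):
    """Stands in for a CPU exception such as #GP or #PF."""

def check_access(cpl, privileged=False, page=None, write=False, cr0_wp=True):
    """Raise Fault if the modelled CPU would refuse the operation."""
    if privileged and cpl != 0:
        raise Fault("#GP")
    if page is not None:
        is_user = cpl == 3
        if is_user and not page["user"]:
            raise Fault("#PF: supervisor page")
        if write and not page["writable"] and (is_user or cr0_wp):
            raise Fault("#PF: read-only page")
    return True

def faults(**kwargs):
    """Return the fault message for an operation, or None if it is allowed."""
    try:
        check_access(**kwargs)
        return None
    except Fault as exc:
        return str(exc)

kernel_page = {"user": False, "writable": True}
print(faults(cpl=3, privileged=True))                # user runs a privileged instruction
print(faults(cpl=3, page=kernel_page))               # user touches a kernel page
print(faults(cpl=0, page=kernel_page, write=True))   # kernel writes its own page
```

The first two operations fault and the third is allowed, matching the intent of the two checks in the listing.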

ARM Implementation (Exception Levels)

ARM's 64-bit architecture (ARMv8-A and later) replaces the older processor modes with four Exception Levels (EL0-EL3):

Level	Purpose	Example Code
EL0	User applications	Web browsers, games, servers
EL1	Operating system kernel	Linux, Windows, macOS kernel
EL2	Hypervisor	KVM, Hyper-V, VMware ESXi
EL3	Secure monitor	ARM TrustZone secure firmware

Transitions between levels are explicit and controlled. Moving to a higher (more privileged) level requires taking an exception, such as a hardware interrupt or a system call made with the SVC instruction. Moving to a lower level is done by returning from the exception with the ERET instruction.
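
On AArch64, privileged software can read its own level from the CurrentEL system register, which holds the exception level in bits [3:2]. The sketch below only decodes integer values; the helper name and the `ROLES` table are illustrative, and the register itself is not readable from EL0.

```python
def current_el(currentel_value):
    """Extract the exception level from a CurrentEL register value."""
    return (currentel_value >> 2) & 0b11   # EL is held in bits [3:2]

# Roles taken from the table above.
ROLES = {0: "user applications", 1: "OS kernel", 2: "hypervisor", 3: "secure monitor"}

for raw in (0x4, 0x8, 0xC):
    level = current_el(raw)
    print(hex(raw), "EL%d" % level, ROLES[level])
```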

RISC-V Implementation (Machine/Supervisor/User Modes)

RISC-V defines three modes with clean separation:

Machine Mode (M): The most privileged, used for firmware
Supervisor Mode (S): For the OS kernel
User Mode (U): For applications

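The current mode is tracked internally by the hart and is not directly readable from a control and status register (CSR). What the CSRs record is the previous mode: on a trap into M-mode, the MPP (Machine Previous Privilege) field of mstatus holds the mode that was interrupted, and the SPP field of sstatus does the same for traps into S-mode. The MRET and SRET instructions return to the recorded mode.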
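
The MPP field of mstatus records the privilege mode that was active before a trap, using the encodings 0 for U, 1 for S, and 3 for M. The following sketch decodes MPP (bits 12:11) and the one-bit SPP field (bit 8) from an integer; the function name and sample values are assumptions made for illustration.

```python
PRIV_NAMES = {0: "U", 1: "S", 3: "M"}   # encoding 2 is reserved

def previous_modes(mstatus):
    """Return the previous-privilege fields recorded in an mstatus value."""
    mpp = (mstatus >> 11) & 0b11   # MPP: bits 12:11
    spp = (mstatus >> 8) & 0b1     # SPP: bit 8 (0 = U, 1 = S)
    return {
        "MPP": PRIV_NAMES.get(mpp, "reserved"),
        "SPP": PRIV_NAMES[spp],
    }

print(previous_modes(0x1800))   # MPP bits both set
print(previous_modes(0x0900))   # MPP = 1 and SPP = 1
```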

Ring -1: Hardware Virtualization

Intel VT-x and AMD-V added a mode that sits beneath Ring 0 for hypervisors (called VMX root operation on Intel). The guest OS kernel still runs at CPL 0, but in a non-root guest mode where selected privileged operations trap to the hypervisor instead of taking effect directly. This lets the hypervisor stay in control of the guest kernels it hosts, and the arrangement is informally called 'Ring -1'.

The Kernel Stack and Context

When the CPU transitions to Kernel Mode, it doesn't just flip a bit—it switches to an entirely different execution context. Understanding this context is crucial for comprehending how the OS maintains control.

The Kernel Stack

Every process has (at least) two stacks:

User Stack — Located in user-space memory; used during normal application execution
Kernel Stack — Located in kernel memory; used when the kernel executes on behalf of this process

When a system call or interrupt occurs:

CPU saves the user-mode stack pointer
CPU loads the kernel stack pointer (stored in a special register or table)
CPU pushes the user's return address, flags, and other context onto the kernel stack
Execution continues using the kernel stack

Why Separate Stacks?

Security: User code cannot corrupt the kernel's stack or inspect its contents
Isolation: Kernel can run complex nested operations without overflowing the user stack
Reliability: The user stack pointer is under application control and may be invalid, so the kernel never has to depend on it
Size Control: Kernel stacks are typically small (8-16KB) because kernel code shouldn't recurse deeply

The Task State Segment (TSS) on x86

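On x86, the TSS (Task State Segment) stores the kernel stack pointer. When an interrupt or exception moves the CPU from Ring 3 to Ring 0:

CPU reads RSP0 (Ring 0 stack pointer) from the TSS
Sets RSP to that kernel stack pointer
Pushes the old SS:RSP (user stack), followed by RFLAGS, CS, and RIP, onto the new kernel stack

This happens in hardware as part of interrupt delivery, before any kernel code executes, ensuring a secure transition. The x86-64 SYSCALL instruction is an exception: it does not switch stacks, so the kernel's entry code loads the kernel stack pointer itself.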
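
The stack switch can be traced with a small simulation of 64-bit interrupt entry from Ring 3. It is a sketch under stated assumptions: the register values, the kernel code selector `0x10`, and the dictionary used as stack memory are all made up for illustration, and error codes and other details are omitted.

```python
def enter_kernel(cpu, tss_rsp0, kernel_stack):
    """Model the hardware stack switch on an interrupt from Ring 3."""
    old_ss, old_rsp = cpu["ss"], cpu["rsp"]
    cpu["rsp"] = tss_rsp0                  # load kernel stack pointer from TSS
    # Push the interrupted context; the stack grows downward.
    for value in (old_ss, old_rsp, cpu["rflags"], cpu["cs"], cpu["rip"]):
        cpu["rsp"] -= 8
        kernel_stack[cpu["rsp"]] = value
    cpu["cs"] = 0x10                       # hypothetical kernel code selector

RSP0 = 0xFFFF_8000_0001_0000               # illustrative kernel stack top
cpu = {"ss": 0x2B, "rsp": 0x7FFF_F000, "rflags": 0x202, "cs": 0x33, "rip": 0x4010}
stack = {}
enter_kernel(cpu, RSP0, stack)
print(hex(cpu["rsp"]), hex(stack[cpu["rsp"]]))
```

After entry, the top of the kernel stack holds the user's return address, with CS, RFLAGS, RSP, and SS above it, which is the frame a return-from-interrupt later consumes.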

Kernel-Mode Context:

Beyond the stack switch, entering Kernel Mode establishes:

Access to all kernel data structures
Ability to use privileged instructions
Different page table permissions (supervisor pages become accessible)
Potentially different segment limits (legacy x86)

Kernel Stack Overflow: A Critical Vulnerability

Kernel stacks are intentionally small. If kernel code recurses too deeply or allocates too much on the stack, it overflows into other kernel data—a critical security vulnerability. Modern kernels use guard pages and static analysis to detect potential stack overflows. The Linux kernel typically uses 8KB or 16KB kernel stacks per thread.

Dangers and Responsibilities

Kernel Mode is not just powerful—it's dangerous. The same unrestricted access that enables system management also enables catastrophic failures. Understanding these risks explains why kernel development is treated with such care.

Failure Modes in Kernel Mode:

Common Kernel-Mode Failures

•Null Pointer Dereference — Accessing address 0 normally faults immediately because page 0 is left unmapped; if something is mapped there, the kernel silently reads or writes the wrong data instead. Unlike in user space, there's no signal handler to catch it, so the result is a kernel oops or panic.
•Use-After-Free — Freeing memory then accessing it may corrupt other kernel data structures, causing seemingly unrelated failures later.
•Buffer Overflow — Writing beyond allocated bounds overwrites adjacent kernel data, potentially including control structures such as function pointers.
•Deadlock — Taking locks in an inconsistent order can freeze the kernel. Lockup detectors and watchdog timers can notice the hang, but recovery usually means a panic or reboot rather than killing the stuck thread.
•Interrupt Handling Bugs — A crash in an interrupt handler can leave hardware in an inconsistent state.
•Race Conditions — Concurrent access to shared data without proper locking causes data corruption.

Bug in User Mode

•Process crashes with segfault
•Other processes unaffected
•OS continues running
•Restart the app and continue
•Data loss limited to one process

Bug in Kernel Mode

•System panic / BSOD
•All processes die instantly
•Machine requires reboot
•Open files may be corrupted
•Potential data loss system-wide

Security Vulnerabilities in Kernel Mode:

Kernel vulnerabilities are among the most severe classes of software vulnerability:

Vulnerability Type	Impact
Privilege Escalation	Attacker gains root/SYSTEM access
Arbitrary Read	Attacker can read any memory (passwords, keys)
Arbitrary Write	Attacker can modify any memory (disable security)
Code Execution	Attacker can run code with kernel privileges
Denial of Service	Attacker can crash entire system

The Meltdown and Spectre vulnerabilities demonstrated that even correct kernel code can leak data through hardware-level side channels. Meltdown in particular led to the development of Kernel Page Table Isolation (KPTI), which separates user and kernel page tables more completely; Spectre required a different set of mitigations.

Defense in Depth

Modern kernels employ multiple defense layers: SMEP and SMAP stop the kernel from unintentionally executing or accessing user pages, KASLR randomizes the kernel address layout, W^X ensures memory is never both writable and executable, stack canaries detect stack buffer overflows before a function returns, and CFI (Control Flow Integrity) makes control-flow hijacking such as return-oriented programming much harder.

Minimizing Kernel Mode Code

Given the dangers of Kernel Mode, a key design principle in modern operating systems is to minimize the code that runs with full privileges. Several architectural approaches achieve this:

1. Microkernels

The microkernel architecture moves traditionally kernel-mode components to user space:

File systems → User-space servers
Device drivers → User-space processes
Network stack → User-space implementation

The kernel provides only:

Process/thread management
Memory management
Inter-process communication (IPC)

Examples: Mach, QNX, seL4, MINIX 3
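
The message-passing style that microkernels rely on can be sketched with a user-space "file server" that only ever sees request messages. This is a single-threaded illustration with hypothetical names; in a real microkernel the kernel transfers the message and schedules the server, and none of the examples listed above exposes this exact interface.

```python
from queue import Queue

class FileServer:
    """Hypothetical user-space file-system server."""

    def __init__(self):
        self.inbox = Queue()
        self.files = {"/etc/motd": "hello"}

    def serve_one(self):
        """Handle one request message and send a reply."""
        reply_to, op, path = self.inbox.get()
        if op == "read" and path in self.files:
            reply_to.put(("ok", self.files[path]))
        else:
            reply_to.put(("error", "not found"))

def ipc_call(server, op, path):
    """Send a request message and wait for the reply."""
    reply = Queue()
    server.inbox.put((reply, op, path))
    server.serve_one()   # stands in for the kernel scheduling the server
    return reply.get()

fs = FileServer()
print(ipc_call(fs, "read", "/etc/motd"))
print(ipc_call(fs, "read", "/missing"))
```

A bug in `FileServer` would take down only that server, which is the fault-isolation argument for this design; the cost is the extra message round trip on every request.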

Monolithic vs. Microkernel Design
Aspect	Monolithic Kernel	Microkernel
Kernel-mode code	Large (millions of LoC)	Small (thousands of LoC)
Driver location	Kernel space	User space
Bug impact	Crashes system	Crashes one service
Performance	Faster (no IPC overhead)	Slower (IPC for everything)
Security	Large attack surface	Smaller attack surface
Examples	Linux, Windows, BSD	QNX, seL4, MINIX 3

2. User-Mode Drivers

Even in monolithic kernels, some drivers can run in user space:

USB drivers (via libusb)
Network drivers (DPDK)
GPU drivers (Mesa/Vulkan components)
Filesystem drivers (FUSE)

These use kernel interfaces that grant controlled access to hardware while isolating driver bugs.

3. Sandboxed Kernel Components

Some OS designs sandbox even kernel components:

eBPF (Linux): Safe bytecode running in kernel, verified before execution
Virtualization-based security (Windows): Uses the hypervisor to enforce code integrity on kernel-mode drivers in Windows 10/11
seL4: Formally verified microkernel with mathematical proofs of security

4. Hardware Assistance

Modern hardware helps reduce kernel exposure:

IOMMU: Prevents DMA attacks on kernel memory
Intel SGX / ARM TrustZone: Isolated execution enclaves
Hardware virtualization: Allows hypervisor to contain kernel bugs

The Linux Compromise

Linux uses a 'modular monolithic' approach. The kernel is monolithic (drivers run in kernel space), but it's modular (drivers can be loaded/unloaded dynamically). This provides good performance while allowing some flexibility. However, a buggy module still has full kernel privileges and can crash the entire system.

Summary: Kernel Mode

Kernel Mode is the privileged execution environment where the operating system kernel runs with unrestricted access to all system resources. Let's consolidate our understanding:

Key Takeaways

•Kernel Mode provides unrestricted access — All memory, all hardware, all processor features are accessible.
•The OS kernel requires these privileges — Scheduling, memory management, device I/O, and interrupt handling all need privileged operations.
•Hardware enforces the privilege level — The CPL (x86), Exception Level (ARM), or privilege mode (RISC-V) is checked by the CPU itself.
•Separate kernel stacks maintain isolation — Each process has a dedicated kernel stack for privileged execution.
•Bugs in Kernel Mode are catastrophic — A crash in kernel code crashes the entire system, not just one process.
•Security vulnerabilities are critical — Kernel exploits can give attackers complete system control.
•Minimizing kernel code is a design goal — Microkernels, user-mode drivers, and sandboxing reduce the privileged attack surface.

Looking ahead:

We've now seen both User Mode and Kernel Mode. But how does the processor actually track which mode it's in? The next page examines the Mode Bit—the specific hardware mechanism that encodes the current privilege level and controls access to privileged operations.

Page Complete

You now understand Kernel Mode: the privileged execution state where the OS runs with complete hardware access. This power enables system management but creates risks—bugs crash systems and vulnerabilities compromise security. Understanding both modes is essential to grasping how operating systems balance capability with safety.
