Imagine a medieval castle with concentric walls. The outermost wall protects the city below. The next wall guards the castle courtyard. The innermost wall protects the keep, where the king resides. Each wall represents a level of trust—enemies must breach multiple barriers to reach the most sensitive areas.
Protection rings apply this same principle to computing. The CPU implements concentric privilege levels, with the most critical code (the kernel) at the innermost ring and the least trusted code (user applications) at the outermost. Each ring can access its own resources and those of outer rings, but cannot directly access inner rings without going through controlled gates.
This architecture, pioneered by the Multics operating system in the 1960s and implemented in the Intel 80286 and all subsequent x86 processors, remains the foundation of hardware protection in modern systems.
By the end of this page, you will understand the protection ring architecture, how hardware enforces ring boundaries, the role of each ring in typical operating systems, the evolution from multi-ring to two-ring systems, and how virtualization has resurrected unused rings.
Protection rings are a hierarchical mechanism for organizing protection domains. Each ring is assigned a privilege level, with lower numbers indicating higher privilege. Code at ring N can access all resources available to rings N through (max ring), but cannot directly access resources in rings 0 through N-1.
Formal Definition:
A protection ring system consists of:
x86 Protection Rings:
Intel x86 processors implement 4 protection rings (0-3), encoded in 2 bits of the code segment selector:
Ring 0: Kernel mode (highest privilege)
Ring 1: Device drivers (unused by most operating systems)
Ring 2: System services (unused by most operating systems)
Ring 3: User mode (lowest privilege)
Despite having 4 rings available, most operating systems use only Ring 0 (kernel) and Ring 3 (user). Ring 1 and 2 are unused because: (1) Unix was designed for architectures with only 2 modes; (2) Some CPU features only distinguish Ring 0 from non-Ring-0; (3) Portability concerns—other architectures may not have 4 rings.
The CPU enforces ring boundaries through multiple hardware mechanisms. Understanding these mechanisms reveals what protection rings actually guarantee.
Current Privilege Level (CPL):
The CPL is stored in bits 0-1 of the CS (Code Segment) register. It represents the ring in which the currently executing code operates.
CS Register: [Index (bits 3-15)] [TI (bit 2)] [RPL (bits 0-1)]
When code is executing: CPL = RPL of the current CS
Ring 0: CPL = 0b00 (binary 00)
Ring 1: CPL = 0b01 (binary 01)
Ring 2: CPL = 0b10 (binary 10)
Ring 3: CPL = 0b11 (binary 11)
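The selector layout above can be unpacked with a few shifts and masks. The following is a minimal sketch, not taken from any particular kernel; the names `selector_fields`, `make_selector`, `decode_selector`, and `cpl_from_cs` are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded fields of a 16-bit segment selector (struct name is illustrative). */
struct selector_fields {
    uint16_t index; /* bits 3-15: entry number in the GDT or LDT */
    uint8_t  ti;    /* bit 2: table indicator, 0 = GDT, 1 = LDT */
    uint8_t  rpl;   /* bits 0-1: requested privilege level */
};

/* Build a selector from its three fields. */
uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192 && ti <= 1 && rpl <= 3);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}

/* Split a selector back into its fields. */
struct selector_fields decode_selector(uint16_t sel)
{
    struct selector_fields f;
    f.index = (uint16_t)(sel >> 3);
    f.ti    = (uint8_t)((sel >> 2) & 0x1u);
    f.rpl   = (uint8_t)(sel & 0x3u);
    return f;
}

/* For the selector currently loaded in CS, the low two bits are the CPL. */
unsigned cpl_from_cs(uint16_t cs)
{
    return cs & 0x3u;
}
```

For instance, the selector value 0x33 decodes to index 6 in the GDT with RPL 3, so code running with CS = 0x33 is executing in Ring 3.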
Descriptor Privilege Level (DPL):
Every segment descriptor (in the GDT/LDT) and gate descriptor has a DPL field specifying the minimum privilege required to access it:
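struct segment_descriptor {
    u16 limit_low;    // Segment limit, bits 0-15
    u16 base_low;     // Base address, bits 0-15
    u8  base_mid;     // Base address, bits 16-23
    u8  type:4;       // Segment or gate type
    u8  s:1;          // 1=code/data, 0=system
    u8  dpl:2;        // Descriptor Privilege Level (0-3)
    u8  p:1;          // Present
    u8  limit_high:4; // Segment limit, bits 16-19
    u8  avl:1;        // Available for OS use
    u8  l:1;          // 64-bit code segment
    u8  d:1;          // Default operand size (0=16-bit, 1=32-bit)
    u8  g:1;          // Granularity (0=byte units, 1=4 KiB units)
    u8  base_high;    // Base address, bits 24-31
};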
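Because C leaves bit-field layout to the compiler, code that inspects a raw descriptor often treats it as a 64-bit integer instead. The sketch below assumes the descriptor has been read as one little-endian `uint64_t`, which places DPL at bits 45-46; the function names and the `DESC_*` macros are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions within the raw 64-bit descriptor. */
#define DESC_DPL_SHIFT 45u /* DPL occupies bits 45-46 */
#define DESC_P_SHIFT   47u /* Present bit */
#define DESC_L_SHIFT   53u /* 64-bit code segment flag */

/* Extract the Descriptor Privilege Level (0-3). */
unsigned descriptor_dpl(uint64_t desc)
{
    return (unsigned)((desc >> DESC_DPL_SHIFT) & 0x3u);
}

/* True if the Present bit is set. */
bool descriptor_present(uint64_t desc)
{
    return ((desc >> DESC_P_SHIFT) & 0x1u) != 0;
}

/* True if the L (64-bit code segment) bit is set. */
bool descriptor_long_mode(uint64_t desc)
{
    return ((desc >> DESC_L_SHIFT) & 0x1u) != 0;
}

/* Return a copy of desc with its DPL field replaced. */
uint64_t descriptor_with_dpl(uint64_t desc, unsigned dpl)
{
    assert(dpl <= 3);
    desc &= ~((uint64_t)0x3 << DESC_DPL_SHIFT);
    return desc | ((uint64_t)dpl << DESC_DPL_SHIFT);
}
```

As an illustration, the value 0x00AF9B000000FFFF describes a present 64-bit code segment with DPL 0, and changing only its DPL to 3 yields 0x00AFFB000000FFFF.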
Access Control Rules:
When code attempts to access a segment or call through a gate, the CPU performs privilege checks:
For Data Segments:
Access allowed if: CPL ≤ DPL and RPL ≤ DPL
Code in Ring 0 can access Ring 0, 1, 2, 3 data. Code in Ring 3 can only access Ring 3 data.
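The data-segment rule can be written as a single comparison against the weaker of CPL and RPL. This is a minimal model of the check stated above, not hardware-accurate emulation; `data_segment_access_ok` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Lower numbers mean more privilege, so "at least as privileged as the
 * segment" translates to "numerically <= DPL".
 */
bool data_segment_access_ok(unsigned cpl, unsigned rpl, unsigned dpl)
{
    assert(cpl <= 3 && rpl <= 3 && dpl <= 3);
    /* The less privileged (numerically larger) of CPL and RPL decides. */
    unsigned effective = (cpl > rpl) ? cpl : rpl;
    return effective <= dpl;
}
```

Note that a Ring 0 caller using a selector with RPL = 3 is refused access to a DPL 0 segment. That is the purpose of RPL: it lets the kernel deliberately weaken its own access when acting on a selector supplied by less trusted code.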
For Code Segments (via CALL/JMP):
For Call Gates (privilege transition):
Access allowed if: CPL ≤ DPL_gate
After transition: CPL = DPL_code_segment
A Ring 3 process can only use gates with DPL=3. Upon traversing the gate, CPL changes to the target segment's DPL.
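A sketch of the call-gate decision is shown below. It models a CALL through a gate to a non-conforming code segment, and it adds two details not spelled out above: the selector's RPL is checked alongside CPL, and a gate never transfers control to less privileged code. The function name and the `-1` "fault" return value are assumptions made for illustration.

```c
#include <assert.h>

/*
 * Returns the CPL after a successful transfer, or -1 where the CPU
 * would raise #GP instead.
 */
int call_gate_transfer(unsigned cpl, unsigned rpl,
                       unsigned gate_dpl, unsigned target_dpl)
{
    assert(cpl <= 3 && rpl <= 3 && gate_dpl <= 3 && target_dpl <= 3);

    /* The caller must be privileged enough to use the gate at all. */
    unsigned effective = (cpl > rpl) ? cpl : rpl;
    if (effective > gate_dpl)
        return -1;

    /* The target must be at least as privileged as the caller. */
    if (target_dpl > cpl)
        return -1;

    /* The new CPL is the DPL of the target code segment. */
    return (int)target_dpl;
}
```

With these rules, a Ring 3 caller reaches Ring 0 only through a gate the kernel has marked DPL = 3, and it always lands at the entry point the kernel chose when it built the gate.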
| From → To | Mechanism | Condition | CPL After |
|---|---|---|---|
| Ring 3 → Ring 0 | Interrupt/Trap Gate | Gate DPL = 3 for software INT n; not checked for hardware interrupts and exceptions | 0 |
| Ring 3 → Ring 0 | SYSCALL instruction | MSRs configured by kernel | 0 |
| Ring 0 → Ring 3 | IRET instruction | Target CS has RPL = 3 | 3 |
| Ring 0 → Ring 3 | SYSRET instruction | Implicit return to user | 3 |
| Ring N → Ring N | Normal CALL/JMP | CPL = DPL_target | N (unchanged) |
| Ring 3 → Ring 3 | All normal operations | Always allowed within ring | 3 (unchanged) |
Beyond segment access, protection rings determine which CPU instructions and operations are permitted. Ring 0 has exclusive access to many critical operations.
Privileged Instructions (Ring 0 Only):
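Most of these instructions can only execute when CPL = 0. The exceptions are the I/O and interrupt-flag instructions, which are IOPL-sensitive and follow the IOPL rules described later in this section. Executing any of them without sufficient privilege triggers a General Protection Fault (#GP):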
| Category | Instructions | Purpose |
|---|---|---|
| Control Registers | MOV CR0/CR2/CR3/CR4 | Control CPU modes, paging, features |
| Debug Registers | MOV DR0-DR7 | Hardware breakpoints, debug control |
| Model-Specific Registers | RDMSR, WRMSR | CPU configuration, syscall setup |
| Interrupt Flag | CLI, STI | Enable/disable interrupts (allowed when CPL ≤ IOPL) |
| I/O Port Access | IN, OUT, INS, OUTS | Read/write I/O ports (allowed when CPL ≤ IOPL or the TSS I/O bitmap permits) |
| Descriptor Tables | LGDT, LIDT, LLDT, LTR | Load descriptor-table and task registers |
| Cache Control | INVD, WBINVD | Invalidate CPU caches |
| TLB Control | INVLPG, INVPCID | Invalidate TLB entries |
| Halt | HLT | Halt CPU until interrupt |
I/O Privilege Level (IOPL):
The IOPL field in RFLAGS (bits 12-13) provides additional control over I/O operations:
RFLAGS: [...] [IOPL (12-13)] [...]
If CPL ≤ IOPL:
    IN, OUT, INS, OUTS are allowed
    CLI, STI are allowed (interrupt flag control)
If CPL > IOPL:
    I/O ops check the I/O Permission Bitmap in TSS
    CLI, STI cause #GP exception
The I/O Permission Bitmap:
The TSS contains a bitmap where each bit controls access to one I/O port (65536 ports = 8192 bytes). Even in Ring 3, if the corresponding bit is 0, access is allowed:
struct tss_struct {
    // ...
    u16 io_map_base; // Offset to I/O permission bitmap
    // The bitmap follows the TSS
    // Bit N = 0: Port N accessible from Ring 3
    // Bit N = 1: Port N requires Ring 0
};
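The IOPL rule and the bitmap lookup described above combine into one decision. The sketch below is a simplified model: it checks a single byte-wide port, whereas real hardware checks every port covered by a multi-byte access. All function names and the `IO_BITMAP_BYTES` macro are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define IO_BITMAP_BYTES 8192u /* 65536 ports, one bit each */

/* Set every bit: no port is accessible through the bitmap. */
void io_bitmap_deny_all(uint8_t *bitmap)
{
    for (size_t i = 0; i < IO_BITMAP_BYTES; i++)
        bitmap[i] = 0xFF;
}

/* Clear the bit for one port, granting access to it. */
void io_bitmap_allow(uint8_t *bitmap, uint16_t port)
{
    bitmap[port / 8] &= (uint8_t)~(1u << (port % 8));
}

/* CLI/STI are allowed only when CPL <= IOPL. */
bool interrupt_flag_change_ok(unsigned cpl, unsigned iopl)
{
    assert(cpl <= 3 && iopl <= 3);
    return cpl <= iopl;
}

/* IN/OUT on one byte-wide port: IOPL first, then the bitmap. */
bool port_access_ok(unsigned cpl, unsigned iopl,
                    const uint8_t *bitmap, uint16_t port)
{
    assert(cpl <= 3 && iopl <= 3 && bitmap != NULL);
    if (cpl <= iopl)
        return true; /* privileged enough; bitmap is not consulted */
    return ((bitmap[port / 8] >> (port % 8)) & 1u) == 0;
}
```

A kernel could use this arrangement to hand one user process a single device's ports while every other port keeps faulting with #GP.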
Ring 0 code can modify any CPU state, access any memory, disable all interrupts, and completely subvert the operating system. This is why kernel vulnerabilities are so severe—a bug in Ring 0 code cannot be contained by the ring system.
Although x86 provides four rings, modern operating systems (Linux, Windows, macOS, BSD) use only Ring 0 and Ring 3. This simplification has both historical and practical reasons.
Why Not Use All Four Rings?
Unix Legacy: Unix was developed on the PDP-11, where it used only two modes (kernel and user). When ported to x86, developers kept the two-mode model.
Portability: Other architectures (ARM, MIPS, SPARC) traditionally had only two privilege levels. Using only Ring 0/3 maximizes portability.
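x86 Implementation Details: Many x86 features only distinguish "Ring 0" from "not Ring 0", or supervisor (Rings 0-2) from user (Ring 3). For example, the paging U/S bit treats Rings 0, 1, and 2 identically, so page-level protection cannot separate Ring 1 code from the kernel.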
Diminishing Returns: Intermediate rings add complexity without proportional security benefits. If driver code in Ring 1 is compromised, it can often escalate to Ring 0 anyway.
Alternative Approaches:
Some operating systems have attempted to isolate drivers from the kernel:
| Approach | Example | How It Works |
|---|---|---|
| Microkernel | MINIX, seL4 | Drivers run in Ring 3; kernel is minimal |
| User-mode drivers | Windows UMDF | Certain drivers run as user processes |
| Paravirtualization | Xen (32-bit PV guests) | Guest kernel and its drivers in Ring 1, hypervisor in Ring 0 |
| Hardware isolation | SR-IOV | Hardware provides per-device isolation |
| IOMMU | Intel VT-d | Restricts device DMA to specific memory |
These approaches provide driver isolation at the cost of complexity and performance.
Virtualization introduces interesting complications to the ring model. When running a guest operating system, both the hypervisor and the guest kernel want to be in Ring 0—but there can be only one true Ring 0.
The Original Problem:
Without hardware virtualization support:
Software Virtualization Approaches:
Ring Deprivileging, 0/1/3 Model (Guest kernel in Ring 1):
Hypervisor: Ring 0 (true Ring 0)
Guest Kernel: Ring 1 (thinks it's Ring 0)
Guest User: Ring 3 (true Ring 3)
Problem: Ring 1 lacks some Ring 0 features, so privileged instructions must be trapped and emulated. The guest can also discover that it is not really in Ring 0 (for example, by reading CS), a problem known as ring aliasing.
Ring Compression, 0/3/3 Model (Guest kernel in Ring 3):
Hypervisor: Ring 0 (true Ring 0)
Guest Kernel: Ring 3 (thinks it's Ring 0, heavily emulated)
Guest User: Ring 3 (same as guest kernel!)
Problem: Guest kernel and guest user share a ring, so the ring mechanism alone cannot isolate the guest kernel from guest user space.
Hardware Virtualization Solution:
Intel VT-x and AMD-V introduce a new layer below Ring 0:
┌─────────────────────────────────────┐
│ Guest Mode (VMX non-root) │
│ ┌──────────────────────────────┐ │
│ │ Ring 0: Guest Kernel │ │
│ │ Ring 3: Guest User Space │ │
│ └──────────────────────────────┘ │
├─────────────────────────────────────┤
│ Host Mode (VMX root) │
│ Ring 0: Hypervisor │
└─────────────────────────────────────┘
VMX Root Mode: The hypervisor runs here. It has true control.
VMX Non-Root Mode: Guest OS runs here. Guest Ring 0 is real Ring 0 from the guest's perspective, but the hypervisor can intercept and emulate any operation.
People sometimes call the hypervisor mode "Ring -1," but this is informal. The hypervisor runs in Ring 0 of VMX root mode. The guest kernel runs in Ring 0 of VMX non-root mode. They're both Ring 0, just in different CPU modes with different privileges.
| Mode | Ring | What Runs Here | Power Level |
|---|---|---|---|
| VMX root | Ring 0 | Hypervisor | Maximum (true supervisor) |
| VMX root | Ring 3 | Hypervisor user tools | Limited (under hypervisor) |
| VMX non-root | Ring 0 | Guest kernel | Controlled (can be intercepted) |
| VMX non-root | Ring 3 | Guest applications | Minimal (under guest kernel) |
ARM processors use a different privilege model called Exception Levels (EL). Unlike x86, where two of the four rings usually sit unused, ARM's exception levels were designed with virtualization and secure execution built in.
ARM Exception Levels:
| Level | Name | Typical Use | x86 Equivalent |
|---|---|---|---|
| EL0 | User | Applications | Ring 3 |
| EL1 | Supervisor | OS Kernel | Ring 0 |
| EL2 | Hypervisor | Virtualization | VMX root Ring 0 |
| EL3 | Secure Monitor | TrustZone | (no direct equivalent) |
Key Differences from x86:
TrustZone: The Secure World:
ARM TrustZone creates two parallel "worlds"—Normal and Secure—that run side by side:
┌────────────────────────────────────────────────────┐
│ EL3: Secure Monitor │
│ (Manages transitions between worlds) │
├─────────────────────────┬──────────────────────────┤
│ Normal World │ Secure World │
│ │ │
│ EL2: Hypervisor │ EL2: (optional) │
│ EL1: Linux Kernel │ EL1: Secure Kernel │
│ EL0: Applications │ EL0: Trusted Apps │
└─────────────────────────┴──────────────────────────┘
Secure World code has access to memory, keys, and hardware that Normal World cannot see. This enables:
ARM EL transitions are asymmetric: you can only increase EL through exceptions (interrupts, syscalls, faults), and can only decrease EL through explicit return instructions (ERET). This prevents unprivileged code from directly jumping to higher ELs.
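The asymmetry can be captured in two small predicates. This is a simplified model under stated assumptions: it ignores Secure versus Normal world and whether EL2 or EL3 is implemented, and it allows an exception to be taken to the same level (for example, an interrupt arriving while the kernel is already at EL1). The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* An exception is taken to the same or a higher EL, and never to EL0. */
bool exception_entry_ok(unsigned from_el, unsigned to_el)
{
    assert(from_el <= 3 && to_el <= 3);
    return to_el >= 1 && to_el >= from_el;
}

/* ERET returns to the same or a lower EL; it is not usable from EL0. */
bool exception_return_ok(unsigned from_el, unsigned to_el)
{
    assert(from_el <= 3 && to_el <= 3);
    return from_el >= 1 && to_el <= from_el;
}
```

Neither predicate admits a path from EL0 straight into EL1 code at an address of the caller's choosing. Entry always goes through an exception vector that the higher level installed.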
Protection rings interact with the paging system to provide memory protection. Page table entries contain bits that enforce ring-based access control on every memory access.
The User/Supervisor (U/S) Bit:
Each page table entry contains a U/S bit:
Page Table Entry (64-bit):
┌──────────────────────┬─────────────┬─────────────┬─────────────┬───────────┐
│ Address (bits 12-51) │ Other flags │ U/S (bit 2) │ R/W (bit 1) │ P (bit 0) │
└──────────────────────┴─────────────┴─────────────┴─────────────┴───────────┘
                                            ↑
                                   User/Supervisor bit
If U/S = 0 and CPL = 3:
    → Page Fault (#PF) on any access
If U/S = 1:
    → Accessible from Ring 0, 1, 2, or 3
SMEP, SMAP, and PKU:
Modern processors add extra protections beyond the basic U/S bit:
Supervisor Mode Execution Prevention (SMEP): Prevents Ring 0 from executing code in user pages. If an attacker controls user memory and tricks the kernel into jumping there, SMEP blocks execution.
CR4.SMEP = 1:
    If CPL < 3 (supervisor mode) and page has U/S = 1:
        Instruction fetch → #PF
Supervisor Mode Access Prevention (SMAP): Prevents Ring 0 from reading/writing user pages unless explicitly enabled. Protects against confused deputy attacks.
CR4.SMAP = 1:
    If CPL < 3 (supervisor mode) and page has U/S = 1 and AC flag = 0:
        Data access → #PF
Kernel must use STAC/CLAC instructions:
    STAC: Set AC flag (allow user access)
    CLAC: Clear AC flag (restore protection)
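The U/S, SMEP, and SMAP rules above can be combined into one access decision. The sketch below is a simplified model that deliberately ignores the Present, R/W, and NX bits and treats Rings 0-2 alike as supervisor mode; the `cpu_state` struct, the `access_kind` enum, and `page_access_ok` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

enum access_kind { ACCESS_READ, ACCESS_WRITE, ACCESS_FETCH };

/* The pieces of CPU state that matter for this check. */
struct cpu_state {
    unsigned cpl; /* current privilege level, 0-3 */
    bool smep;    /* CR4.SMEP */
    bool smap;    /* CR4.SMAP */
    bool ac;      /* RFLAGS.AC, toggled by STAC/CLAC */
};

/* Returns true if the access proceeds, false where #PF would be raised. */
bool page_access_ok(const struct cpu_state *cpu, bool page_is_user,
                    enum access_kind kind)
{
    assert(cpu->cpl <= 3);

    if (cpu->cpl == 3)
        return page_is_user; /* user mode reaches only U/S = 1 pages */

    if (!page_is_user)
        return true;         /* supervisor access to a supervisor page */

    if (kind == ACCESS_FETCH)
        return !cpu->smep;   /* SMEP blocks executing user pages */

    return !(cpu->smap && !cpu->ac); /* SMAP blocks data access unless AC = 1 */
}
```

In this model, a kernel copy-from-user routine would set AC with STAC, perform the copy, and clear it again with CLAC, so the window in which user memory is reachable stays as short as possible.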
Protection Keys for Userspace (PKU): Lets user space tag its own pages with one of 16 protection keys and set per-key read/write restrictions, enforced via the PKRU register.
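PKRU holds two bits per key: an access-disable bit at position 2k and a write-disable bit at position 2k+1 for key k. The sketch below models only that lookup and ignores details such as instruction fetches (which PKU does not restrict) and supervisor-mode writes; `pkru_allows` is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a data access to a page tagged with `key` is permitted. */
bool pkru_allows(uint32_t pkru, unsigned key, bool is_write)
{
    assert(key < 16);
    bool access_disabled = ((pkru >> (2 * key)) & 1u) != 0;
    bool write_disabled  = ((pkru >> (2 * key + 1)) & 1u) != 0;

    if (access_disabled)
        return false; /* neither reads nor writes are allowed */
    if (is_write && write_disabled)
        return false; /* reads pass, writes do not */
    return true;
}
```

Because PKRU is an ordinary register that user code can update, switching a whole group of pages between writable and read-only needs no system call and no page-table change.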
| Feature | Introduced | Protection Provided | Controlled By |
|---|---|---|---|
| U/S bit | i386 (1985) | User vs. supervisor pages | Page table entry |
| NX bit | Pentium 4/AMD K8 | Non-executable pages | Page table entry |
| SMEP | Ivy Bridge (2012) | No user code exec in Ring 0 | CR4 register |
| SMAP | Broadwell (2014) | No user data access in Ring 0 | CR4 + RFLAGS.AC |
| PKU | Skylake-SP (2017) | 16 user-space protection keys | PKRU register |
| CET | Tiger Lake (2020) | Shadow stack, IBT | CR4 + MSRs |
Not all systems use protection rings the same way. Embedded and real-time environments often have different requirements that influence ring usage.
Embedded Systems:
Many embedded systems disable protection rings entirely or run everything in Ring 0:
┌─────────────────────────────────┐
│ Single Ring (Ring 0) │
│ Application + RTOS + Drivers │
│ All privileged │
└─────────────────────────────────┘
Why?
Real-Time Systems:
Hard real-time systems may avoid ring transitions to guarantee timing:
ARM Cortex-M: Two-Level Protection
ARM Cortex-M processors (common in microcontrollers) use a simpler two-level model:
| Mode | Privilege | Purpose |
|---|---|---|
| Thread Mode | Privileged or unprivileged (selected by CONTROL.nPRIV) | Normal application code |
| Handler Mode | Always privileged | Exception/interrupt handlers |
The Memory Protection Unit (MPU) on Cortex-M provides region-based protection without a full MMU:
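// Configure MPU region 0 (ARMv7-M MPU, CMSIS register names)
MPU->RNR  = 0;            // Select region 0
MPU->RBAR = 0x20000000;   // Base address (SRAM), must be aligned to the region size
MPU->RASR =
    (0x13 << 1) |         // SIZE = 0x13: region is 2^(0x13+1) bytes = 1MB
    (0x3 << 24) |         // AP = 0b011: full access, privileged and unprivileged
    (1 << 0);             // Enable region
// Region settings take effect only once the MPU itself is enabled
MPU->CTRL = MPU_CTRL_ENABLE_Msk | MPU_CTRL_PRIVDEFENA_Msk;
__DSB(); __ISB();         // Make sure the new configuration is in force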
This provides isolation without the overhead of full paging and ring transitions.
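The magic number `0x13` in the region setup comes from the ARMv7-M encoding, where a SIZE field of s selects a region of 2^(s+1) bytes. The sketch below computes that field and the resulting RASR value on a host machine, without touching hardware; it assumes power-of-two regions between 32 bytes and 2 GiB, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* SIZE field for a power-of-two region: log2(bytes) - 1. */
uint32_t mpu_size_field(uint32_t region_bytes)
{
    assert(region_bytes >= 32);                       /* smallest region */
    assert((region_bytes & (region_bytes - 1)) == 0); /* power of two */
    uint32_t log2 = 0;
    while ((region_bytes >> log2) != 1u)
        log2++;
    return log2 - 1;
}

/* The region base address must be a multiple of the region size. */
bool mpu_base_aligned(uint32_t base, uint32_t region_bytes)
{
    return (base & (region_bytes - 1)) == 0;
}

/* Compose RASR from size, access permission (AP, 3 bits), and enable. */
uint32_t mpu_rasr_value(uint32_t region_bytes, uint32_t ap)
{
    assert(ap <= 7);
    return (mpu_size_field(region_bytes) << 1) | (ap << 24) | 1u;
}
```

For a 1 MiB region with AP = 0x3, this reproduces the value written to `MPU->RASR` in the snippet above.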
Running without protection rings means any bug can crash the system and any vulnerability grants full control. For connected IoT devices, this is increasingly problematic. Newer embedded architectures (ARMv8-M TrustZone-M, RISC-V PMP) add protection features specifically for embedded use cases.
We've explored the hierarchical protection model that underpins modern processor security. Let's consolidate the key insights:
What's Next:
Protection rings provide the mechanism for privilege separation, but they don't tell us how much privilege code should have. The next page explores the Principle of Least Privilege—the fundamental security guideline that code should have only the minimum rights necessary for its task. This principle guides the design of secure systems and the proper use of protection domains.
You now understand how protection rings provide hierarchical privilege separation enforced by hardware. This knowledge is essential for understanding kernel architecture, system call implementation, and privilege escalation attacks.