If the segment limit answers "How much memory?" then protection bits answer the equally critical question: "What can you do with it?"
Protection bits in segment descriptors encode the rules governing memory access: whether code can be executed, data can be read, or modifications are allowed. These rules are enforced by the CPU in hardware, so the operating system's security policy is checked on every memory operation rather than left to software convention.
When a process attempts an operation that violates its segment's protection settings, the CPU doesn't just fail silently or produce garbage results. Instead, it raises a hardware exception, alerting the operating system to the violation before any damage can be done. This is the foundation of privilege separation, kernel protection, and secure computing.
Protection bits implement the principle of least privilege at the hardware level: code only gets the permissions it needs, nothing more. A read-only data segment cannot be modified. A non-executable data segment cannot be jumped into. These constraints prevent entire classes of security vulnerabilities.
By the end of this page, you will master protection bits including read/write/execute permissions, descriptor privilege levels (DPL), system vs. user segments, conforming code segments, privilege checking algorithms, and violation handling. You'll understand how hardware enforces security policies.
Protection bits are encoded in the segment descriptor's access byte (byte 5 of the 8-byte descriptor), whose low four bits form the type field. Let's dissect the complete structure.
Segment Descriptor Access Byte Layout:
Bit 7: P (Present) - Is segment in memory?
Bit 6-5: DPL (Descriptor Privilege Level) - Required privilege (0-3)
Bit 4: S (System) - 0=System descriptor, 1=Code/Data descriptor
Bit 3-0: Type - Specific attributes depending on S bit
┌───────────────────────────────────────────────────────────────┐
│ Byte 5: Access Byte │
├─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬──────────────┤
│ P │ DPL │ DPL │ S │ Type│ Type│ Type│ Type│ Bit Position │
│ (7) │ (6) │ (5) │ (4) │ (3) │ (2) │ (1) │ (0) │ │
└─────┴─────┴─────┴─────┴─────┴─────┴─────┴─────┴──────────────┘
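The layout above can be unpacked with a few shifts and masks. The following is a minimal sketch, not code from any particular kernel; the struct `access_fields_t` and the function names are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Decoded view of a descriptor's access byte (names are illustrative). */
typedef struct {
    unsigned present; /* bit 7:    P */
    unsigned dpl;     /* bits 6-5: DPL */
    unsigned s;       /* bit 4:    1 = code/data, 0 = system */
    unsigned type;    /* bits 3-0: type field */
} access_fields_t;

access_fields_t decode_access_byte(uint8_t access)
{
    access_fields_t f;
    f.present = (access >> 7) & 0x1u;
    f.dpl     = (access >> 5) & 0x3u;
    f.s       = (access >> 4) & 0x1u;
    f.type    = access & 0xFu;
    return f;
}

/* Usage: 0x9A = 1 00 1 1010 is a present, ring-0, execute/read code segment. */
void decode_access_byte_example(void)
{
    access_fields_t f = decode_access_byte(0x9A);
    assert(f.present == 1 && f.dpl == 0 && f.s == 1 && f.type == 0xA);
}
```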
Type Field Interpretation (S=1, Code/Data Segments):
| Bit 3 | Bit 2 | Bit 1 | Bit 0 | Segment Type | Meaning |
|---|---|---|---|---|---|
| 0 | 0 | 0 | A | Data, Read-Only | Read-Only Data |
| 0 | 0 | 1 | A | Data, Read/Write | Writable Data |
| 0 | 1 | 0 | A | Data, Read-Only, Expand-Down | Stack, Read-Only |
| 0 | 1 | 1 | A | Data, Read/Write, Expand-Down | Stack, Writable |
| 1 | 0 | 0 | A | Code, Execute-Only | Non-Readable Code |
| 1 | 0 | 1 | A | Code, Execute/Read | Readable Code |
| 1 | 1 | 0 | A | Code, Execute-Only, Conforming | Conforming Non-Readable |
| 1 | 1 | 1 | A | Code, Execute/Read, Conforming | Conforming Readable |
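Understanding the Type Bits:
Bit 3: Code/Data - 0=Data segment, 1=Code segment
Bit 2: Expand-Down (data segments) or Conforming (code segments)
Bit 1: W (data segments: writable) or R (code segments: readable)
Bit 0: A (Accessed) - Set by the CPU when the segment is loaded into a segment register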
Critical Point: Code Segments Are Never Writable
Notice there is no "writable code" type. Code segments can only be executed (and optionally read). To modify code, you must access it through a writable data segment with the same base—a deliberate design to prevent self-modifying code accidents and certain exploits.
A mnemonic: For data segments, bit 1 = Write. For code segments, bit 1 = Read. The W/R bit controls optional capability. Execute is implicit for code (bit 3=1). Read is implicit for data (bit 3=0). This asymmetry reflects fundamental segment roles.
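The asymmetry in the mnemonic can be written down as four small predicates over the 4-bit type field of a code/data descriptor (S=1). This is a sketch under that assumption; the macro and function names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TYPE_CODE     0x8u /* bit 3: 1 = code, 0 = data */
#define TYPE_EC       0x4u /* bit 2: expand-down (data) / conforming (code) */
#define TYPE_WR       0x2u /* bit 1: writable (data) / readable (code) */
#define TYPE_ACCESSED 0x1u /* bit 0: accessed */

bool type_is_code(uint8_t type)
{
    return (type & TYPE_CODE) != 0;
}

/* Read is implicit for data; for code it depends on bit 1. */
bool type_can_read(uint8_t type)
{
    return !type_is_code(type) || (type & TYPE_WR) != 0;
}

/* Write is never possible for code; for data it depends on bit 1. */
bool type_can_write(uint8_t type)
{
    return !type_is_code(type) && (type & TYPE_WR) != 0;
}

/* Execute is implicit for code and impossible for data. */
bool type_can_execute(uint8_t type)
{
    return type_is_code(type);
}

void type_bits_example(void)
{
    assert(type_can_read(0x2) && type_can_write(0x2) && !type_can_execute(0x2));
    assert(!type_can_read(0x8) && !type_can_write(0x8) && type_can_execute(0x8));
}
```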
Read permission determines whether data can be loaded from a segment into CPU registers. The interpretation differs between code and data segments.
Data Segments: Implicitly Readable
All data segments are readable. The type field's bit 1 controls writeability, not readability. You cannot have a write-only data segment—every data segment permits reads.
Data Segment Types:
Type 0 (0000): Read-Only Data
Type 2 (0010): Read/Write Data
Type 4 (0100): Read-Only Data, Expand-Down (Stack)
Type 6 (0110): Read/Write Data, Expand-Down (Stack)
Code Segments: Optionally Readable
Code segments have a read bit (bit 1) that controls whether code can be read as data: with R=0 the segment is execute-only, and with R=1 it is execute/read.
Why Execute-Only Code?
Execute-only code prevents:
However, execute-only code has challenges: the CPU needs to read instructions to decode them (this is allowed for instruction fetch), but MOV or other data read instructions cannot access execute-only segments.
// Hardware logic for read permission checking (illustrative model)
bool check_read_permission(segment_descriptor_t* desc, access_type_t access) {
    // For instruction fetch, always allowed if code segment and present
    if (access == INSTRUCTION_FETCH) {
        return desc->type.code;  // Must be code segment
    }

    // For data read access
    if (desc->type.code) {
        // Code segment: check if readable bit is set
        return desc->type.readable;
    } else {
        // Data segment: always readable
        return true;
    }
}

// Example: Attempt to read through different segment types
void demonstrate_read_permissions() {
    // Each MOV reads data through the named segment register.
    // That segment must be either:
    //  - A data segment (any type, all are readable), or
    //  - A code segment with readable bit set
    // Intel syntax; with GCC this assumes -masm=intel.
    __asm__ volatile (
        "mov eax, dword ptr ds:[0x1000]\n\t"  // Read via DS - OK for any data segment
        "mov ebx, dword ptr cs:[0x1000]\n\t"  // Read via CS - OK only if CS is readable
        ::: "eax", "ebx"
    );
}

In modern systems with paging, the NX (No-Execute) bit in page tables provides execute protection at page granularity. Combined with read-only pages, this achieves W^X (Write XOR Execute): memory is either writable or executable, never both. In practice this has superseded segment-based execute protection.
Write permission is the most security-critical permission. Allowing writes opens the door to data modification, corruption, and potentially exploits. The x86 architecture is deliberately restrictive about what can be written.
Data Segments: Controlled by W Bit
For data segments, bit 1 of the type field (the W bit) controls write capability: W=0 makes the segment read-only, and W=1 makes it read/write.
Code Segments: Never Writable
Code segments cannot be written, period. There is no "writable code" type. This is an intentional security feature:
Write Protection Scenarios:
| Segment Type | Write Attempt | Result |
|---|---|---|
| Data, Read-Only | MOV [mem], reg | #GP Fault |
| Data, Read/Write | MOV [mem], reg | Success |
| Data, Expand-Down, RO | PUSH value | Cannot serve as a stack: loading SS with a non-writable segment already raises #GP |
| Data, Expand-Down, RW | PUSH value | Success |
| Code, Any Type | MOV [cs:offset], reg | #GP Fault (always) |
| System Segment | Any write | #GP Fault when the selector is loaded into a data segment register (system segments are not for data) |
Why Separate Read-Only Data Segments?
Read-only data segments serve several purposes:
Modifying "Constant" Data:
To modify data in a read-only segment (e.g., runtime patching), you must:
This is sometimes used for JIT compilation, dynamic linking, or debugging.
// Demonstrating write protection via aliased segments

// Global read-only data
const int important_const = 42;  // In .rodata section

// Attempt to modify through proper channels
void modify_const_dangerous(int new_value) {
    // Method 1: Change page permissions (modern approach)
    void* addr = (void*)&important_const;
    mprotect(page_align(addr), PAGE_SIZE, PROT_READ | PROT_WRITE);
    *(int*)addr = new_value;  // Now succeeds
    mprotect(page_align(addr), PAGE_SIZE, PROT_READ);  // Restore

    // Method 2: Use writable data selector (legacy x86 segmentation)
#ifdef USE_SEGMENTS
    uint16_t rw_selector = get_writable_alias_selector();
    __asm__ volatile (
        "push ds\n\t"                  // Save original DS
        "mov ax, %1\n\t"
        "mov ds, ax\n\t"               // Load the writable alias
        "mov dword ptr [%0], %2\n\t"   // Write through the alias
        "pop ds\n\t"                   // Restore original DS
        :: "r"(&important_const), "r"(rw_selector), "r"(new_value)
        : "ax", "memory"
    );
#endif
}

// This will fault without the above tricks
void will_crash() {
    int* p = (int*)&important_const;
    *p = 100;  // SIGSEGV - writing to read-only segment/page
}

Self-modifying code requires writing to memory that will later be executed. With segment protection, this typically means creating a data segment aliased over the code segment, modifying through the data segment, then executing through the code segment. Depending on the architecture, code modification may also require explicit synchronization afterward, such as an instruction cache flush or a serializing instruction.
Execute permission is implicit in the segment type: code segments are executable, data segments are not. This fundamental distinction is the first line of defense against code injection attacks.
The Code/Data Distinction:
Type Bit 3 = 0: Data Segment (NOT executable)
Type Bit 3 = 1: Code Segment (Executable)
Execution Enforcement:
The CPU only fetches instructions from the segment referenced by CS (Code Segment register). This segment must have type bit 3 = 1. Any attempt to execute code violating this rule fails:
Why Data Segments Cannot Be Executed:
The Modern NX/XD Bit:
Segmentation's code/data distinction operates at segment granularity—an entire segment is either code or data. Modern systems use the NX (No-Execute) bit in page tables for page-level (4KB) granularity:
Paging-level NX complements segment-level protection, allowing fine-grained control:
Final Execute Permission =
(CS segment is code type) AND
(Page NX bit = 0) AND
(Other checks pass: privilege, etc.)
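The conjunction above translates directly into a boolean function. This is only a model of the rule as stated here; the struct `fetch_ctx_t` and its fields are hypothetical, and `other_checks_ok` stands in for the privilege, limit, and presence checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inputs to one instruction fetch (names are illustrative). */
typedef struct {
    uint8_t cs_type;         /* 4-bit type field of the CS descriptor */
    bool    page_nx;         /* NX bit of the page holding the instruction */
    bool    other_checks_ok; /* privilege, limit, presence, ... */
} fetch_ctx_t;

bool fetch_allowed(const fetch_ctx_t *ctx)
{
    bool cs_is_code = (ctx->cs_type & 0x8u) != 0; /* type bit 3 = 1 */
    return cs_is_code && !ctx->page_nx && ctx->other_checks_ok;
}

void fetch_allowed_example(void)
{
    fetch_ctx_t code_page  = { 0xA, false, true }; /* code segment, NX clear */
    fetch_ctx_t stack_page = { 0xA, true,  true }; /* same segment, NX set */
    assert(fetch_allowed(&code_page));
    assert(!fetch_allowed(&stack_page));
}
```

Any single failing term is enough to block the fetch, which is why a flat code segment covering all memory still cannot execute from a page marked NX.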
The W^X Principle:
Best practice is W^X (Write XOR Execute): memory should be either writable or executable, never both simultaneously. This is achieved by:
Execute permissions are checked at multiple levels: 1) Segment type (code vs. data), 2) Privilege level (CPL vs. DPL), 3) Page permissions (NX bit). All must allow execution for an instruction fetch to succeed. This defense-in-depth makes exploits significantly harder.
The Descriptor Privilege Level (DPL) is a 2-bit field (bits 5-6 of the access byte) that specifies the privilege level required to access the segment. This is the foundation of the protection ring model.
Privilege Levels:
DPL = 0: Kernel (most privileged)
DPL = 1: System services (rarely used)
DPL = 2: System services (rarely used)
DPL = 3: User applications (least privileged)
The Privilege Checking Rule (Simplified):
For accessing a data segment:
max(CPL, RPL) ≤ DPL
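Where:
CPL = Current Privilege Level - privilege of the currently executing code (low 2 bits of CS)
RPL = Requested Privilege Level - low 2 bits of the selector being used for the access
DPL = Descriptor Privilege Level - stored in the target segment's descriptor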
What This Means:
| CPL | RPL | max(CPL,RPL) | DPL=0 | DPL=1 | DPL=2 | DPL=3 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | ✓ | ✓ | ✓ | ✓ |
| 0 | 3 | 3 | ✗ | ✗ | ✗ | ✓ |
| 3 | 0 | 3 | ✗ | ✗ | ✗ | ✓ |
| 3 | 3 | 3 | ✗ | ✗ | ✗ | ✓ |
| 1 | 2 | 2 | ✗ | ✗ | ✓ | ✓ |
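Each row of the table follows from one comparison. A small sketch of the data-segment rule, with function names chosen here purely for illustration:

```c
#include <assert.h>
#include <stdbool.h>

/* The numerically larger (less privileged) of CPL and RPL. */
unsigned effective_privilege(unsigned cpl, unsigned rpl)
{
    return cpl > rpl ? cpl : rpl;
}

/* Data-segment rule: max(CPL, RPL) <= DPL. */
bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    return effective_privilege(cpl, rpl) <= dpl;
}

void data_access_example(void)
{
    assert(data_access_allowed(0, 0, 0));  /* kernel, kernel selector */
    assert(!data_access_allowed(0, 3, 0)); /* kernel using an RPL=3 selector */
    assert(data_access_allowed(1, 2, 2));  /* max(1,2)=2 <= 2 */
}
```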
Why Use RPL?
RPL prevents a subtle attack: a user program passes a selector to a kernel system call, hoping the kernel (CPL=0) will access kernel-private data on behalf of the user.
With RPL=3 in the selector, even though the kernel has CPL=0, the max(0,3)=3, so DPL=0 segments remain inaccessible. The kernel uses the user's privilege level for the access check.
Practical Example:
// User calls read(fd, buffer, count)
// Kernel needs to verify 'buffer' is accessible
// Without RPL protection (BAD):
if (CPL allows access to buffer)
copy_to_user(buffer, data); // Could access kernel memory!
// With RPL protection (GOOD):
if (max(CPL, buffer_selector.RPL) allows access)
copy_to_user(buffer, data); // RPL=3 blocks kernel segment access
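For the check above to work, the selector the kernel uses must carry the caller's privilege in its RPL field. The sketch below shows one way a kernel might stamp that in, similar in effect to the x86 ARPL instruction; `adjust_rpl` is a hypothetical helper, not an existing API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Raise the RPL field (low 2 bits) of a caller-supplied selector to at
 * least the caller's privilege level. The RPL is never lowered.
 */
uint16_t adjust_rpl(uint16_t selector, unsigned caller_cpl)
{
    unsigned rpl = selector & 0x3u;
    if (rpl < caller_cpl)
        selector = (uint16_t)((selector & ~0x3u) | (caller_cpl & 0x3u));
    return selector;
}

void adjust_rpl_example(void)
{
    /* A ring-3 caller passes selector 0x10 (index 2, GDT, RPL=0). */
    uint16_t sel = adjust_rpl(0x10, 3);
    assert(sel == 0x13);        /* same index and table, RPL is now 3 */
    assert((sel & 0x3u) == 3);  /* max(CPL=0, RPL=3) = 3 blocks DPL=0 data */
}
```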
Kernel Data Protection:
Kernel segments have DPL=0. User code (CPL=3) attempting to access them:
max(CPL=3, RPL) ≤ DPL=0?
max(3, anything) ≤ 0?
3 ≤ 0? → FALSE → #GP Fault
This is why user code cannot read kernel memory (segment-level protection), and why kernel exploits are so valuable to attackers.
Don't confuse DPL (segment attribute), CPL (current code privilege), and RPL (selector field). CPL is dynamic (changes during execution), RPL is per-access (stored in each selector used), and DPL is static (stored in the segment descriptor). The interplay between all three determines access rights.
Bit 4 of the access byte (the S bit) distinguishes between system segments and code/data segments.
S Bit Interpretation:
S=0: System descriptor (TSS, LDT, gates)
S=1: Code or data segment descriptor
System Descriptors (S=0):
System segments are not for holding user code or data. They describe system data structures used by the CPU:
Why Separate System Segments?
| Type | Binary | Description |
|---|---|---|
| 0x1 | 0001 | 16-bit TSS (Available) |
| 0x2 | 0010 | LDT |
| 0x3 | 0011 | 16-bit TSS (Busy) |
| 0x4 | 0100 | 16-bit Call Gate |
| 0x5 | 0101 | Task Gate |
| 0x6 | 0110 | 16-bit Interrupt Gate |
| 0x7 | 0111 | 16-bit Trap Gate |
| 0x9 | 1001 | 32-bit TSS (Available) |
| 0xB | 1011 | 32-bit TSS (Busy) |
| 0xC | 1100 | 32-bit Call Gate |
| 0xE | 1110 | 32-bit Interrupt Gate |
| 0xF | 1111 | 32-bit Trap Gate |
Security Implications:
Attempting to load a system descriptor as a code or data segment causes a #GP. This prevents:
Creating Code/Data vs. System Descriptors:
// Code segment (S=1, Type=1010 = Code, Execute/Read), here at ring 0
access_byte = 0b10011010; // P=1, DPL=0, S=1, Type=1010 (0x9A)
// 32-bit TSS descriptor (S=0, Type=1001)
access_byte = 0b10001001; // P=1, DPL=0, S=0, Type=1001
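Rather than writing the binary literals by hand, the byte can be assembled from its fields. A minimal sketch; `make_access_byte` is an illustrative helper name, and the ring-3 values in the example are the ones commonly seen in flat-model GDTs.

```c
#include <assert.h>
#include <stdint.h>

/* Pack P (bit 7), DPL (bits 6-5), S (bit 4), and Type (bits 3-0). */
uint8_t make_access_byte(unsigned present, unsigned dpl,
                         unsigned s, unsigned type)
{
    return (uint8_t)(((present & 0x1u) << 7) |
                     ((dpl     & 0x3u) << 5) |
                     ((s       & 0x1u) << 4) |
                     (type     & 0xFu));
}

void make_access_byte_example(void)
{
    assert(make_access_byte(1, 0, 1, 0xA) == 0x9A); /* ring-0 code, execute/read */
    assert(make_access_byte(1, 0, 0, 0x9) == 0x89); /* 32-bit TSS (available) */
    assert(make_access_byte(1, 3, 1, 0xA) == 0xFA); /* ring-3 code, execute/read */
    assert(make_access_byte(1, 3, 1, 0x2) == 0xF2); /* ring-3 data, read/write */
}
```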
Call gates provide controlled entry points into privileged code. A user-mode program can 'call' through a call gate to execute kernel code, but only at the specific entry point defined by the gate, not at arbitrary kernel addresses. Gates were the architectural mechanism for system calls before the SYSENTER/SYSCALL instructions; in practice most operating systems entered the kernel through interrupt or trap gates (for example, int 0x80 on Linux) rather than call gates.
The conforming bit (bit 2 of type field, when bit 3 = 1 for code segments) creates a special type of code segment with unique privilege behavior.
Normal (Non-Conforming) Code:
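For non-conforming code segments, a direct JMP or CALL requires CPL to equal the target's DPL. Code at any other privilege level must go through a gate.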
Conforming Code:
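For conforming code segments, a direct JMP or CALL is allowed whenever CPL ≥ DPL (numerically), and CPL does not change: the code runs at the caller's privilege level.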
Use Case: Shared Utility Code
Imagine a math library that should be usable by both kernel and user code. With conforming segments:
Math library segment: DPL=0, Conforming=1
Kernel code (CPL=0) calls math_sin():
- CPL (0) ≥ DPL (0) → Allowed
- After call: CPL remains 0
- Can access kernel data
User code (CPL=3) calls math_sin():
- CPL (3) ≥ DPL (0) → Allowed (conforming!)
- After call: CPL remains 3
- Cannot access kernel data (still Ring 3)
- Code runs with caller's privilege
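The two outcomes in this example can be modeled with a short function. This is a simplified sketch that ignores RPL and gates and covers only direct JMP/CALL transfers; `transfer_result_t` and `direct_transfer` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool     allowed;
    unsigned new_cpl; /* CPL after the transfer (unchanged for direct transfers) */
} transfer_result_t;

/* Direct JMP/CALL to a code segment, without going through a gate. */
transfer_result_t direct_transfer(unsigned cpl, unsigned target_dpl,
                                  bool conforming)
{
    transfer_result_t r = { false, cpl };
    if (conforming)
        r.allowed = (cpl >= target_dpl); /* equal or less privileged callers */
    else
        r.allowed = (cpl == target_dpl); /* exact match required */
    return r;
}

void direct_transfer_example(void)
{
    /* Math library segment: DPL=0, conforming. */
    transfer_result_t user = direct_transfer(3, 0, true);
    assert(user.allowed && user.new_cpl == 3);     /* still ring 3 */
    transfer_result_t kernel = direct_transfer(0, 0, true);
    assert(kernel.allowed && kernel.new_cpl == 0); /* still ring 0 */
}
```

Because `new_cpl` always equals the caller's CPL, the model makes the key property visible: a direct transfer never raises privilege, whether or not the target is conforming.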
Security Analysis:
Conforming segments are not a security hole because:
Why Not Make All Code Conforming?
Because most kernel code needs to access kernel data (DPL=0). If kernel code ran at caller's CPL=3, it couldn't access its own data structures. Conforming is only appropriate for stateless, data-independent utility code.
x86-64 Note:
In 64-bit long mode, conforming is less relevant because:
Both conforming segments and call gates allow cross-ring calls. The key difference: call gates CHANGE CPL (privilege escalation possible), while conforming segments PRESERVE CPL (no escalation). Use call gates for syscall entry points; use conforming for shared utility code.
Let's consolidate how the CPU performs protection checks on every memory access. This multi-stage verification happens in hardware, adding essentially zero overhead.
Complete Access Validation Algorithm:
function validate_memory_access(selector, offset, access_type):
    // STAGE 1: Selector Validation
    if selector.index >= descriptor_table_limit:
        raise #GP(selector)      // Selector out of range
    descriptor = descriptor_table[selector.index]

    // STAGE 2: Presence Check
    if not descriptor.present:
        raise #NP(selector)      // Segment not present

    // STAGE 3: Type Compatibility
    if access_type == EXECUTE:
        if not descriptor.is_code():
            raise #GP(selector)  // Cannot execute data or system segments
    if access_type == WRITE:
        if descriptor.is_code():
            raise #GP(0)         // Cannot write to code
        if not descriptor.is_writable():
            raise #GP(0)         // Data is read-only
    if access_type == READ:
        if descriptor.is_code() and not descriptor.is_readable():
            raise #GP(selector)  // Execute-only code

    // STAGE 4: Privilege Check
    effective_priv = max(CPL, selector.RPL)
    if descriptor.is_code() and not descriptor.is_conforming():
        // Non-conforming code: exact match required
        if CPL != descriptor.DPL:
            raise #GP(selector)
    else if descriptor.is_code() and descriptor.is_conforming():
        // Conforming code: CPL >= DPL allowed
        if CPL < descriptor.DPL:
            raise #GP(selector)
    else:
        // Data segment: effective_priv <= DPL
        if effective_priv > descriptor.DPL:
            raise #GP(selector)

    // STAGE 5: Limit Check
    access_size = get_operand_size(instruction)
    if not within_limit(offset, access_size, descriptor):
        if descriptor.is_stack():
            raise #SS(0)         // Stack fault
        else:
            raise #GP(0)         // General protection

    // ALL CHECKS PASSED - compute linear address
    linear_address = descriptor.base + offset
    return linear_address

Performance of Protection Checks:
All these checks happen in a single clock cycle (or are pipelined across a few cycles) because:
When Checks Are Performed:
Protection faults cause pipeline flushes and exception handling. This is expensive (hundreds of cycles), but faults should be rare in correct programs. The fast path (all checks pass) is the common case and essentially free. The slow path (fault handling) is costly but necessary for security.
When a protection check fails, the CPU generates a fault, saves context, and transfers control to the OS exception handler. The handler receives information about what went wrong and must decide how to respond.
Exception Types for Protection Violations:
| Exception | Vector | Error Code | Causes |
|---|---|---|---|
| #GP | 13 | Selector or 0 | Most protection violations, limit exceeded, privilege violation |
| #SS | 12 | Selector or 0 | Stack limit exceeded, invalid stack segment |
| #NP | 11 | Selector | Segment not present in memory |
| #TS | 10 | Selector | Invalid TSS during task switch |
| #DF | 8 | 0 | Double fault (exception during exception) |
Error Code Contents:
For selector-based faults (#GP with selector, #NP, #TS), the error code contains:
┌────────────────────────────────────────────────────────────────┐
│ Bits 15-3: Selector Index (segment number) │
│ Bit 2: TI (Table Indicator: 0=GDT, 1=LDT) │
│ Bit 1: IDT (1 if fault during IDT access) │
│ Bit 0: EXT (1 if external event caused fault, e.g., interrupt)│
└────────────────────────────────────────────────────────────────┘
Example Fault Handler:
// General Protection Fault handler
void gp_fault_handler(interrupt_frame_t* frame, uint32_t error_code) {
    // Decode error code
    bool is_external = error_code & 0x01;
    bool is_idt = error_code & 0x02;
    bool is_ldt = error_code & 0x04;
    uint16_t selector_index = (error_code >> 3) & 0x1FFF;

    // Get faulting instruction
    void* fault_eip = (void*)frame->eip;
    uint8_t* instruction = (uint8_t*)fault_eip;

    // Analyze the fault
    fault_info_t info = {
        .type = FAULT_GENERAL_PROTECTION,
        .eip = fault_eip,
        .selector = selector_index,
        .error_code = error_code,
    };

    if (error_code == 0) {
        // Limit check failure or non-selector fault
        info.reason = analyze_limit_or_permission_fault(frame);
    } else {
        // Selector-based fault
        info.reason = analyze_selector_fault(selector_index, is_ldt);
    }

    // Log fault details
    klog(LOG_WARN, "GP Fault: PID=%d EIP=0x%08x Reason=%s",
         current_pid(), (uint32_t)frame->eip, fault_reason_str(info.reason));

    // User-mode fault: deliver signal
    if ((frame->cs & 0x03) == 3) {  // Saved CS has RPL 3, so CPL was 3
        siginfo_t si = {
            .si_signo = SIGSEGV,
            .si_code = SEGV_ACCERR,
            .si_addr = fault_eip,
        };
        deliver_signal(current_process(), SIGSEGV, &si);
        return;  // If handler installed, may return here
    }

    // Kernel-mode fault: panic
    kernel_panic("GP fault in kernel: EIP=0x%08x EC=0x%04x",
                 (uint32_t)frame->eip, error_code);
}

Protection faults in user mode typically result in SIGSEGV delivery (and usually process termination). Protection faults in kernel mode indicate kernel bugs and usually cause a kernel panic. There is generally no safe way to recover from a kernel protection violation, because the kernel's invariants can no longer be trusted.
We've comprehensively explored protection bits—the mechanism that enforces access control on memory operations. Let's consolidate the key concepts:
What's Next:
With base, limit, and protection bits covered, we'll complete our segment table exploration with segment table location—understanding where segment tables reside in memory, how the CPU finds them via GDTR and LDTR registers, and how the OS manages multiple descriptor tables for different processes.
You now have comprehensive knowledge of segment protection bits—permissions, privilege levels, system/user distinction, conforming segments, and hardware enforcement. This understanding is fundamental for OS development, security research, and understanding how modern systems protect memory.