Segment selectors are the visible interface between software and the x86 segmentation architecture. While segment descriptors contain the detailed information about memory segments (base, limit, access rights), it's the selector—a compact 16-bit value—that software actually manipulates and that segment registers hold.
Every memory reference in protected mode involves a selector. When you load a value into CS, DS, SS, ES, FS, or GS, you're loading a selector. When the processor fetches an instruction or accesses data, it uses the selector in the corresponding segment register to find the appropriate descriptor. Understanding selectors is essential for comprehending how protected mode memory addressing actually works.
By the end of this page, you will understand the complete selector format and bit layout, how the index, TI, and RPL fields interact, the rules for loading selectors into different segment registers, how the Requested Privilege Level (RPL) provides caller protection, selector validation and the exceptions that can occur, and practical examples of selector usage in kernel and user code.
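A segment selector is a 16-bit value that encodes three distinct pieces of information: a 13-bit index into a descriptor table, a 1-bit table indicator (TI) that chooses between the GDT and the LDT, and a 2-bit Requested Privilege Level (RPL).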
Let's examine each field in detail:
```text
Segment Selector (16 bits):
┌─────────────────────────────────┬────┬─────┐
│ 15 14 13 12 11 10 9 8 7 6 5 4 3 │  2 │ 1 0 │
│ Index (13 bits)                 │ TI │ RPL │
│ Descriptor Table Index          │    │     │
└─────────────────────────────────┴────┴─────┘
```

Field Breakdown:

| Bits | Field | Description |
|---|---|---|
| 15-3 | Index | 13-bit index into descriptor table. Range: 0 to 8191 (2^13 - 1). Multiplied by 8 to get byte offset. |
| 2 | TI | Table Indicator. 0 = Global Descriptor Table (GDT), 1 = Local Descriptor Table (LDT). |
| 1-0 | RPL | Requested Privilege Level. 0 = Ring 0 (highest privilege), 1 = Ring 1, 2 = Ring 2, 3 = Ring 3 (lowest privilege). |

Computing the Byte Offset:
To convert a selector to a byte offset within the descriptor table:
Byte Offset = Index × 8
= (Selector >> 3) × 8
= (Selector & 0xFFF8)
Since each descriptor is 8 bytes, the index field is essentially pre-shifted by 3 bits. The lower 3 bits (TI and RPL) are "free" bits that don't affect the byte offset calculation.
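The offset arithmetic above can be sketched in C. This is a minimal illustration, not code from any particular kernel; the function names and the `table_base` parameter are hypothetical, standing in for the base address held in GDTR or the LDTR cache.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of the descriptor: mask off TI (bit 2) and RPL (bits 1-0). */
static inline uint32_t selector_byte_offset(uint16_t selector)
{
    return selector & 0xFFF8u;
}

/* Address of the descriptor, given the table's base address (assumed input). */
static inline uint64_t descriptor_address(uint64_t table_base, uint16_t selector)
{
    return table_base + selector_byte_offset(selector);
}

/* Confirms that masking and shift-then-multiply agree for every selector. */
static inline void check_offset_forms_agree(void)
{
    for (uint32_t s = 0; s <= 0xFFFFu; s++)
        assert(selector_byte_offset((uint16_t)s) == (s >> 3) * 8u);
}
```

Because the index already sits three bits up, a single AND is enough; no shift or multiply is needed at run time.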
Example Selector Calculations:
| Selector (Hex) | Binary | Index | TI | RPL | Table | Byte Offset |
|---|---|---|---|---|---|---|
| 0x0008 | 0000 0000 0000 1000 | 1 | 0 | 0 | GDT | 8 |
| 0x0010 | 0000 0000 0001 0000 | 2 | 0 | 0 | GDT | 16 |
| 0x001B | 0000 0000 0001 1011 | 3 | 0 | 3 | GDT | 24 |
| 0x0023 | 0000 0000 0010 0011 | 4 | 0 | 3 | GDT | 32 |
| 0x002B | 0000 0000 0010 1011 | 5 | 0 | 3 | GDT | 40 |
| 0x000F | 0000 0000 0000 1111 | 1 | 1 | 3 | LDT | 8 |
| 0x0017 | 0000 0000 0001 0111 | 2 | 1 | 3 | LDT | 16 |
| 0x0000 | 0000 0000 0000 0000 | 0 | 0 | 0 | GDT | 0 (null) |
Common selector patterns in typical OS configurations: 0x08 = Kernel code (GDT index 1, Ring 0). 0x10 = Kernel data (GDT index 2, Ring 0). 0x1B = User code (GDT index 3, Ring 3, includes RPL=3). 0x23 = User data (GDT index 4, Ring 3, includes RPL=3). Note how user-mode selectors have their RPL bits set to 3 to match the user's privilege level.
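Going the other way, a selector can be assembled from its three fields. The helper below is a hypothetical sketch (the name `make_selector` and the `SEL_TI_*` constants are not from the original text) that reproduces the common values listed above.

```c
#include <assert.h>
#include <stdint.h>

enum { SEL_TI_GDT = 0, SEL_TI_LDT = 1 };

/* Pack index (bits 15-3), TI (bit 2) and RPL (bits 1-0) into a selector. */
static inline uint16_t make_selector(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index <= 0x1FFFu);  /* 13-bit index */
    assert(ti <= 1u);
    assert(rpl <= 3u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

For instance, `make_selector(3, SEL_TI_GDT, 3)` yields 0x1B and `make_selector(4, SEL_TI_GDT, 3)` yields 0x23, matching the user code and user data selectors in the note above.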
The Requested Privilege Level (RPL) is one of the more subtle aspects of x86 protection. It provides a mechanism for more privileged code to voluntarily limit its access rights when operating on behalf of less privileged code.
The Problem RPL Solves:
Imagine a Ring 0 kernel routine that accepts a pointer (in the form of a segment selector and offset) from a Ring 3 user program. If the kernel simply uses its Ring 0 privilege to access that location, it could inadvertently access kernel memory that the user program shouldn't be able to reach.
This is called the confused deputy problem—the kernel (the deputy) can be confused into misusing its privilege on behalf of an unprivileged caller.
The RPL Solution:
RPL provides a way to limit access rights. When the processor checks access to a segment, it uses the effective privilege level:
Effective Privilege Level = MAX(CPL, RPL)
Where CPL is the Current Privilege Level and RPL is the Requested Privilege Level from the selector.
Access is granted if: Effective Privilege Level ≤ DPL
By setting RPL = 3 in a selector passed by user code, even if the kernel (CPL=0) uses that selector, the effective privilege level becomes MAX(0, 3) = 3. The kernel cannot accidentally access Ring 0-only segments through user-provided pointers.
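The rule can be written down as a small C model. This is only a sketch of the data-segment check described above; the function names are illustrative, and the model ignores everything else the processor validates (descriptor type, present bit, table limits).

```c
#include <assert.h>
#include <stdbool.h>

/* Effective privilege level: the numerically larger (weaker) of CPL and RPL. */
static inline unsigned effective_privilege(unsigned cpl, unsigned rpl)
{
    assert(cpl <= 3u && rpl <= 3u);
    return cpl > rpl ? cpl : rpl;
}

/* Data-segment access is allowed only if MAX(CPL, RPL) <= DPL. */
static inline bool data_access_allowed(unsigned cpl, unsigned rpl, unsigned dpl)
{
    assert(dpl <= 3u);
    return effective_privilege(cpl, rpl) <= dpl;
}
```

With CPL=0, RPL=3 and DPL=0 the function returns false: the kernel is refused access to a Ring 0 segment when it goes through a selector stamped with the user's privilege level.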
The ARPL Instruction:
The processor provides the ARPL (Adjust RPL) instruction specifically for this purpose:
ARPL r/m16, r16 ; Adjust RPL field of destination to be >= source RPL
The kernel can use ARPL to ensure that any selector provided by user code has its RPL set to at least the user's privilege level. If the user passed a selector with RPL=0, ARPL would raise it to RPL=3 (the user's CPL).
```asm
; Kernel code validating a user-provided selector
; User passes selector in BX, user's CS selector in AX (with correct RPL)

validate_user_selector:
    ; BX = user-provided selector (might have RPL=0 - suspicious!)
    ; AX = caller's CS selector (has correct RPL in bits 0-1)

    arpl bx, ax             ; Adjust BX's RPL to be >= AX's RPL
                            ; If BX had RPL=0 and AX has RPL=3,
                            ; BX now has RPL=3

    jnz .rpl_was_ok         ; ZF clear if no adjustment was needed

    ; ZF set: RPL was raised - user was trying something suspicious
    ; Log security event or take action
    call log_privilege_escalation_attempt

.rpl_was_ok:
    ; Now safe to use BX - RPL is appropriate for caller
    mov ds, bx
    ; Access through DS is checked at MAX(CPL, RPL) privilege
    ret

; Example: User passes 0x10 (kernel data, RPL=0), CPL was 3
; After ARPL: BX becomes 0x13 (RPL changed to 3)
; Now any access via this selector is checked at Ring 3 level
```

The ARPL instruction is not available in 64-bit mode. In 64-bit mode, segmentation is largely disabled, and the flat memory model with paging handles protection. The same opcode (0x63) is reassigned to MOVSXD (move with sign-extend from 32 to 64 bits). A 64-bit kernel that still has to vet selectors supplied by 32-bit compatibility-mode code cannot execute ARPL itself, so any equivalent validation must be done in software.
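Where ARPL cannot be executed, its effect is easy to reproduce in software. The following C function is a sketch of that idea, assuming the caller's selector is available as a plain integer; `arpl_emulate` is a hypothetical name, and its boolean return value mirrors the ZF result of the real instruction (set when the RPL had to be raised).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Software equivalent of ARPL dest, src.
 * If dest's RPL is lower than src's RPL, raise it to match and return true
 * (the instruction would set ZF). Otherwise leave dest alone and return false.
 */
static inline bool arpl_emulate(uint16_t *dest, uint16_t src)
{
    assert(dest != NULL);
    unsigned dest_rpl = *dest & 3u;
    unsigned src_rpl = src & 3u;

    if (dest_rpl < src_rpl) {
        *dest = (uint16_t)((*dest & 0xFFFCu) | src_rpl);
        return true;
    }
    return false;
}
```

Applied to the example above, a user-supplied 0x10 checked against a caller CS of 0x1B becomes 0x13 and the function returns true; a well-formed 0x23 is left unchanged.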
Loading a selector into a segment register is a complex operation that involves multiple validation checks. The processor behavior differs based on which segment register is being loaded and what type of segment is being referenced.
Segment Register Categories:
The six segment registers have different rules:
Validation Checks During Load:
| Register | Null Selector | Valid Descriptor Types | Special Rules |
|---|---|---|---|
| CS | Not allowed | Code segments only | Only via JMP FAR, CALL FAR, RET FAR, IRET, interrupt |
| SS | Causes #GP | Writable data segments | CPL must equal DPL; CPL must equal RPL |
| DS, ES | Allowed | Data or readable code | Standard privilege check |
| FS, GS | Allowed | Data or readable code | Standard privilege check; often used for TLS |
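The rules in the table can be modelled as a decision function. The sketch below is a simplified protected-mode model, not a complete description of the hardware: `struct desc_info` is a hypothetical summary of a descriptor, `table_limit` is the GDT or LDT limit in bytes, and CS loads are left out because they only happen through control transfers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum seg_reg_kind { SEG_SS, SEG_DATA };  /* SEG_DATA covers DS, ES, FS, GS */
enum load_result { LOAD_OK, LOAD_GP, LOAD_NP, LOAD_SS_FAULT };

/* Hypothetical, already-decoded view of the referenced descriptor. */
struct desc_info {
    bool present;
    bool is_code;
    bool readable;    /* meaningful for code segments */
    bool writable;    /* meaningful for data segments */
    bool conforming;  /* meaningful for code segments */
    unsigned dpl;
};

static inline enum load_result check_segment_load(enum seg_reg_kind kind,
                                                  uint16_t sel,
                                                  uint16_t table_limit,
                                                  const struct desc_info *d,
                                                  unsigned cpl)
{
    unsigned rpl = sel & 3u;
    assert(cpl <= 3u);

    /* Null selector: index 0 in the GDT, any RPL. */
    if ((sel & 0xFFFCu) == 0)
        return kind == SEG_SS ? LOAD_GP : LOAD_OK;

    /* All 8 bytes of the descriptor must lie within the table limit. */
    if ((uint32_t)(sel & 0xFFF8u) + 7u > table_limit)
        return LOAD_GP;

    assert(d != NULL);
    if (kind == SEG_SS) {
        /* SS: writable data segment with RPL == CPL and DPL == CPL. */
        if (d->is_code || !d->writable || rpl != cpl || d->dpl != cpl)
            return LOAD_GP;
        return d->present ? LOAD_OK : LOAD_SS_FAULT;
    }

    /* DS/ES/FS/GS: data segment or readable code segment. */
    if (d->is_code && !d->readable)
        return LOAD_GP;
    /* Privilege check applies to data and non-conforming code segments. */
    if (!(d->is_code && d->conforming)) {
        unsigned epl = cpl > rpl ? cpl : rpl;
        if (epl > d->dpl)
            return LOAD_GP;
    }
    return d->present ? LOAD_OK : LOAD_NP;
}
```

Note the asymmetry for a not-present descriptor: loading it into SS is reported as a stack-segment fault, while the same descriptor loaded into DS, ES, FS, or GS is reported as segment-not-present.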
```asm
; Various segment register loading operations

; --- Loading DS, ES, FS, GS (direct MOV) ---
mov ax, 0x10                ; Kernel data selector
mov ds, ax                  ; Load DS with selector
                            ; Processor validates selector, loads descriptor

; Loading null selector (allowed for DS, ES, FS, GS)
xor ax, ax                  ; AX = 0 (null selector)
mov fs, ax                  ; FS = null (OK)
; Any access via FS will now fault!

; --- Loading SS (strict rules) ---
mov ax, 0x10                ; Data segment selector (Ring 0)
mov ss, ax                  ; Load SS
; Note: Next instruction after MOV SS is protected from interrupts
mov esp, stack_top          ; Set stack pointer (safe from interrupt)

; --- Loading CS (only via control transfer) ---
; Cannot do: mov cs, ax -- illegal!

; Far jump loads new CS
jmp 0x08:new_code_location  ; CS = 0x08, EIP = new_code_location

; Far call loads new CS (old value pushed to stack)
call 0x08:far_procedure     ; CS = 0x08, return address pushed

; Far return pops CS from stack
retf                        ; CS and EIP popped from stack

; Interrupt loads CS from IDT
int 0x80                    ; CS loaded from IDT[0x80] gate descriptor
```

When a selector is loaded into a segment register, the processor also loads the descriptor information into a hidden (non-visible) part of the register called the descriptor cache. This cache includes the base address, limit, and access rights. Subsequent memory accesses use this cached information, avoiding repeated descriptor table lookups. The cache is only updated when the segment register is reloaded.
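The descriptor cache behaviour can be illustrated with a toy model in C. Everything here is hypothetical and heavily simplified: the descriptor is stored unpacked rather than in its 8-byte hardware layout, only a GDT-like array is modelled, and no validation is performed beyond a bounds check.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified, unpacked descriptor (not the hardware bit layout). */
struct descriptor {
    uint32_t base;
    uint32_t limit;
    uint8_t access;
};

struct segment_register {
    uint16_t selector;        /* visible part */
    struct descriptor cache;  /* hidden part, filled on load */
};

/* Loading copies the descriptor into the hidden cache. */
static inline void load_segment(struct segment_register *reg,
                                const struct descriptor *table,
                                unsigned entries, uint16_t selector)
{
    unsigned index = selector >> 3;
    assert(index < entries);
    reg->selector = selector;
    reg->cache = table[index];
}

/* Address translation consults only the cache, never the table. */
static inline uint32_t translate(const struct segment_register *reg,
                                 uint32_t offset)
{
    assert(offset <= reg->cache.limit);
    return reg->cache.base + offset;
}
```

In this model, editing a table entry after `load_segment` has no effect on `translate` until the register is loaded again, which is the same reason an OS reloads segment registers after changing a live descriptor.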
The Code Segment (CS) register is special. It determines the privilege level of the currently executing code (CPL is stored in the low 2 bits of CS), and it can only be modified through control transfer instructions—never by direct MOV.
Why CS is Protected:
Directly modifying CS would allow arbitrary privilege escalation. Imagine user code (Ring 3) doing:
mov ax, 0x08 ; Kernel code selector, Ring 0
mov cs, ax ; If this worked, instant Ring 0!
By restricting CS modification to control transfers that the processor validates, the architecture ensures privilege changes go through proper checks.
Control Transfers That Modify CS:
| Instruction | Purpose | Privilege Check | Notes |
|---|---|---|---|
| JMP FAR | Jump to far address | Non-conforming: CPL = DPL | Direct inter-segment jump |
| CALL FAR | Call far procedure | Through gates or direct | Pushes return address |
| RET FAR | Return from far call | RPL of return CS ≥ CPL | Pops CS:EIP from stack |
| IRET | Return from interrupt | Complex privilege rules | Restores full state |
| INT n / INT3 | Software interrupt | Via IDT gate | Trap/interrupt to handler |
| Hardware Interrupt | Device interrupt | Via IDT gate | Asynchronous entry |
| Exception | CPU-detected error | Via IDT gate | Fault/trap/abort |
| SYSENTER | Fast system call | Sets CS from MSR | Direct to Ring 0 |
| SYSCALL | Fast system call (64-bit) | Sets CS from MSR | Direct to Ring 0 |
Conforming vs. Non-Conforming Code Segments:
Code segments have a "conforming" bit (bit 2 of the type field) that affects privilege checking:
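Non-Conforming Code Segments (C=0): a direct far jump or call succeeds only when CPL equals the target segment's DPL and the selector's RPL is no greater than CPL. Reaching a segment at a different privilege level requires a gate.
Conforming Code Segments (C=1): a direct far jump or call succeeds when the target segment's DPL is numerically less than or equal to CPL, so less privileged code may enter it. CPL is not changed by the transfer.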
The conforming mechanism allows shared code (like math libraries) to be accessible from multiple privilege levels without privilege changes, but the code runs at the caller's privilege level.
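The two cases can be compared side by side in a short C predicate. This is a sketch of the direct (gate-less) far JMP/CALL privilege check only; the function name is illustrative, and type, present-bit, and limit checks are left out.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Privilege check for a direct far JMP/CALL to a code segment.
 * Non-conforming: CPL must equal DPL, and RPL must not exceed CPL.
 * Conforming:     DPL must be numerically <= CPL (equal or more privileged).
 * In both cases CPL is unchanged after the transfer.
 */
static inline bool direct_far_transfer_allowed(unsigned cpl, unsigned rpl,
                                               unsigned dpl, bool conforming)
{
    assert(cpl <= 3u && rpl <= 3u && dpl <= 3u);
    if (conforming)
        return dpl <= cpl;
    return dpl == cpl && rpl <= cpl;
}
```

Ring 3 code targeting a DPL=0 segment is rejected when the segment is non-conforming and accepted when it is conforming, but in the accepted case the code still executes at Ring 3.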
To legitimately transfer from Ring 3 to Ring 0 code (like a system call in the classical model), the transfer must go through a call gate. Call gates are special descriptors that specify an entry point in Ring 0 code and perform the privilege transition. Modern systems often use SYSENTER/SYSCALL instead, but call gates remain architecturally important.
When selector loading validation fails, the processor generates an exception. Understanding which exception corresponds to which failure is essential for debugging and operating system development.
Exceptions Related to Segment Operations:
| Exception | Vector | Name | Cause |
|---|---|---|---|
| #GP | 13 (0x0D) | General Protection Fault | Most selector/segment violations |
| #NP | 11 (0x0B) | Segment Not Present | Descriptor's Present bit is 0 |
| #SS | 12 (0x0C) | Stack-Segment Fault | Stack segment violations (SS loading issues) |
| #TS | 10 (0x0A) | Invalid TSS | Problem with Task State Segment |
Detailed #GP Causes for Selector Operations:
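The General Protection Fault (#GP) is the most common exception during selector operations. Specific causes include loading a null selector into SS, using a selector whose index lies beyond the descriptor table limit, violating the privilege rules (for example, CPL=3 code loading a DPL=0 data segment), and accessing memory at an offset beyond the segment limit.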
```c
/* Exception scenarios during selector operations */

/* Example 1: #GP from null selector in SS */
void trigger_gp_null_ss(void) {
    __asm__ volatile(
        "xor %%ax, %%ax\n"          /* AX = 0 (null selector) */
        "mov %%ax, %%ss\n"          /* #GP! Cannot load null into SS */
        : : : "ax"
    );
}

/* Example 2: #GP from index out of bounds */
/* If GDT has 6 entries (limit = 0x2F = 47) */
void trigger_gp_oob(void) {
    __asm__ volatile(
        "mov $0x38, %%ax\n"         /* Selector 0x38 = index 7 */
        "mov %%ax, %%ds\n"          /* #GP! Index 7 > max index 5 */
        : : : "ax"
    );
}

/* Example 3: #GP from privilege violation */
/* User mode (Ring 3) trying to load kernel selector */
void user_mode_violation(void) {
    /* This would be in Ring 3 code */
    __asm__ volatile(
        "mov $0x10, %%ax\n"         /* Kernel data selector (DPL=0) */
        "mov %%ax, %%ds\n"          /* #GP! CPL=3 > DPL=0 */
        : : : "ax"
    );
}

/* Example 4: #NP from not-present segment */
void trigger_np(void) {
    /* Assumes GDT entry 10 has P=0 */
    __asm__ volatile(
        "mov $0x50, %%ax\n"         /* Selector for entry 10 */
        "mov %%ax, %%ds\n"          /* #NP! Segment not present */
        : : : "ax"
    );
}

/* Example 5: #GP from offset beyond limit */
void trigger_gp_limit(void) {
    /* Assumes DS points to segment with limit < 0x10000 */
    __asm__ volatile(
        "mov $0xFFFFFFFF, %%eax\n"  /* Large offset */
        "mov %%ds:(%%eax), %%bl\n"  /* #GP! Offset > limit */
        : : : "eax", "bl"
    );
}
```

When #GP occurs due to a segment selector issue, the processor pushes an error code containing the offending selector. The error code format is: bits 15-3 = selector index, bit 2 = TI (table indicator), bit 1 = IDT flag (if exception from IDT), bit 0 = EXT flag (if external event). This error code helps identify which selector caused the fault.
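A fault handler can unpack that error code with a few masks. The struct and function below are a hypothetical sketch following the bit layout just described; the sketch assumes the caller passes only the low 16 bits of the pushed error code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct selector_error_code {
    unsigned index;  /* bits 15-3: descriptor index (or vector, if idt is set) */
    bool ldt;        /* bit 2: TI, meaningful only when idt is false */
    bool idt;        /* bit 1: index refers to an IDT gate */
    bool external;   /* bit 0: caused by an external event */
};

static inline struct selector_error_code decode_selector_error(uint32_t code)
{
    struct selector_error_code e;

    assert((code & ~0xFFFFu) == 0);  /* assumption: caller masked to 16 bits */
    e.external = (code & 0x1u) != 0;
    e.idt      = (code & 0x2u) != 0;
    e.ldt      = (code & 0x4u) != 0;
    e.index    = (code >> 3) & 0x1FFFu;
    return e;
}
```

For the out-of-bounds case in Example 2, an error code of 0x38 decodes to index 7 in the GDT with neither the IDT nor the EXT flag set.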
Understanding selector patterns used in real operating systems helps solidify the concepts. Here are common patterns and their purposes.
Standard Flat Model GDT Layout:
Most modern OSes use a flat memory model with these standard selectors:
| Selector | Index | Name | DPL | Purpose |
|---|---|---|---|---|
| 0x00 | 0 | NULL | - | Required null descriptor |
| 0x08 | 1 | KERNEL32_CS | 0 | 32-bit kernel code (compatibility) |
| 0x10 | 2 | KERNEL_CS | 0 | 64-bit kernel code |
| 0x18 | 3 | KERNEL_DS | 0 | Kernel data |
| 0x20 | 4 | USER32_CS | 3 | 32-bit user code (compatibility) |
| 0x28 \| 3 = 0x2B | 5 | USER_DS | 3 | User data |
| 0x30 \| 3 = 0x33 | 6 | USER_CS | 3 | 64-bit user code |
| 0x40 | 8 | TSS | 0 | Task State Segment (16 bytes) |
```c
/* Common selector manipulation patterns */
#include <stdint.h>

/* Selector constants (example for typical Linux-like setup) */
#define KERNEL_CS 0x10
#define KERNEL_DS 0x18
#define USER_CS   0x33  /* 0x30 | RPL 3 */
#define USER_DS   0x2B  /* 0x28 | RPL 3 */

/* Assumed kernel-provided helper that executes WRMSR; not defined here */
void wrmsr(uint32_t msr, uint64_t value);

/* Pattern 1: IRET to user mode */
void return_to_user(uint64_t rip, uint64_t rsp, uint64_t rflags) {
    __asm__ volatile(
        /* Push in order: SS, RSP, RFLAGS, CS, RIP */
        "push %0\n"   /* SS = USER_DS */
        "push %1\n"   /* RSP = user stack pointer */
        "push %2\n"   /* RFLAGS */
        "push %3\n"   /* CS = USER_CS */
        "push %4\n"   /* RIP = user entry point */
        "iretq\n"     /* Return to user mode */
        :
        : "r"((uint64_t)USER_DS), "r"(rsp), "r"(rflags),
          "r"((uint64_t)USER_CS), "r"(rip)
    );
}

/* Pattern 2: Reading current selectors */
void read_selectors(uint16_t *cs, uint16_t *ds, uint16_t *ss) {
    __asm__ volatile(
        "mov %%cs, %0\n"
        "mov %%ds, %1\n"
        "mov %%ss, %2\n"
        : "=r"(*cs), "=r"(*ds), "=r"(*ss)
    );
}

/* Pattern 3: Switching data segments (rare in modern code) */
void switch_to_kernel_segments(void) {
    __asm__ volatile(
        "mov %0, %%ax\n"
        "mov %%ax, %%ds\n"
        "mov %%ax, %%es\n"
        :
        : "i"(KERNEL_DS)
        : "ax"
    );
}

/* Pattern 4: Using FS/GS for per-CPU or TLS data */
void setup_gs_base(void *base) {
    /* In 64-bit mode, use WRGSBASE or MSR writes */
#ifdef __FSGSBASE__
    /* Built with FSGSBASE enabled (e.g. -mfsgsbase); the OS must also
       have set CR4.FSGSBASE or WRGSBASE raises #UD */
    __asm__ volatile("wrgsbase %0" : : "r"(base));
#else
    /* Use MSR (IA32_GS_BASE = 0xC0000101) */
    wrmsr(0xC0000101, (uint64_t)base);
#endif
}

/* Pattern 5: Extracting selector components */
static inline int selector_index(uint16_t sel) {
    return sel >> 3;
}

static inline int selector_is_ldt(uint16_t sel) {
    return (sel & 0x04) != 0;
}

static inline int selector_rpl(uint16_t sel) {
    return sel & 0x03;
}

static inline int selector_is_null(uint16_t sel) {
    return (sel & 0xFFFC) == 0;  /* Index and TI both zero */
}
```

Selector Usage in System Calls:
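When a user program makes a system call (SYSCALL instruction on x86-64), the processor loads CS and SS with kernel selectors derived from the IA32_STAR MSR instead of reading them from a gate descriptor.
On SYSRET (return to user mode), the user CS and SS selectors are likewise derived from IA32_STAR, with their RPL forced to 3.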
The entire transition is mediated by selectors, but the MSR provides them rather than explicit loading.
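How the MSR yields those selectors explains the ordering of the GDT shown earlier. The sketch below models the derivation from IA32_STAR as the architecture defines it: SYSCALL takes its base from bits 47:32, and SYSRET takes its base from bits 63:48. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

struct syscall_selectors {
    uint16_t kernel_cs;  /* loaded by SYSCALL */
    uint16_t kernel_ss;  /* loaded by SYSCALL */
    uint16_t user_cs32;  /* loaded by SYSRET to compatibility mode */
    uint16_t user_ss;    /* loaded by SYSRET */
    uint16_t user_cs64;  /* loaded by SYSRET to 64-bit mode */
};

/* Build an IA32_STAR value from the two selector bases. */
static inline uint64_t make_star(uint16_t syscall_base, uint16_t sysret_base)
{
    assert((syscall_base & 3u) == 0);  /* kernel base is expected to have RPL 0 */
    return ((uint64_t)sysret_base << 48) | ((uint64_t)syscall_base << 32);
}

/* Derive the selectors the processor would load from a given IA32_STAR. */
static inline struct syscall_selectors selectors_from_star(uint64_t star)
{
    uint16_t syscall_base = (uint16_t)(star >> 32);
    uint16_t sysret_base = (uint16_t)(star >> 48);
    struct syscall_selectors s;

    s.kernel_cs = (uint16_t)(syscall_base & 0xFFFCu);
    s.kernel_ss = (uint16_t)(syscall_base + 8u);
    s.user_cs32 = (uint16_t)(sysret_base | 3u);
    s.user_ss   = (uint16_t)((sysret_base + 8u) | 3u);
    s.user_cs64 = (uint16_t)((sysret_base + 16u) | 3u);
    return s;
}
```

With a SYSCALL base of 0x10 and a SYSRET base of 0x20, this gives kernel CS 0x10, kernel SS 0x18, user SS 0x2B, and 64-bit user CS 0x33, which is why the table places kernel data directly after 64-bit kernel code, and user data between the 32-bit and 64-bit user code entries.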
In 64-bit long mode, selectors remain present but their role is significantly reduced. While the selector format is unchanged, much of the segmentation functionality is disabled.
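What Changes in Long Mode:
In 64-bit mode the processor treats the bases of CS, DS, ES, and SS as zero and does not apply segment limit checks, so these registers no longer relocate or bound memory accesses. FS and GS are the exception: their base addresses are still added to the effective address.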
The Critical Remaining Role: CS.L Bit:
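In long mode, the most important selector function is determining whether code runs in 64-bit mode or 32-bit compatibility mode. The L bit of the code segment descriptor that CS selects decides this: L=1 means 64-bit mode, and L=0 means compatibility mode.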
This is how Windows and Linux run 32-bit applications on 64-bit systems—the OS provides a code segment with L=0 for legacy applications.
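The L bit lives at bit 53 of the 8-byte code segment descriptor, next to the D (default operand size) bit at bit 54. The following C sketch classifies a raw descriptor accordingly; the enum and function names are hypothetical, and the descriptor values in the usage note are the flat code segment encodings commonly seen in tutorial GDTs rather than values taken from this page.

```c
#include <assert.h>
#include <stdint.h>

enum code_mode {
    MODE_64BIT,      /* L=1, D=0 */
    MODE_COMPAT_32,  /* L=0, D=1 */
    MODE_COMPAT_16,  /* L=0, D=0 */
    MODE_RESERVED    /* L=1, D=1 is a reserved combination */
};

/* Classify a code segment descriptor as seen by a CPU in long mode. */
static inline enum code_mode long_mode_code_mode(uint64_t descriptor)
{
    unsigned l = (unsigned)(descriptor >> 53) & 1u;
    unsigned d = (unsigned)(descriptor >> 54) & 1u;

    /* Must be a code segment: S (bit 44) and executable (bit 43) both set. */
    assert(((descriptor >> 43) & 3u) == 3u);

    if (l)
        return d ? MODE_RESERVED : MODE_64BIT;
    return d ? MODE_COMPAT_32 : MODE_COMPAT_16;
}
```

A descriptor of 0x00AF9A000000FFFF classifies as 64-bit code, while 0x00CF9A000000FFFF, which differs only in the L and D bits, classifies as 32-bit compatibility code.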
FS and GS in 64-bit Mode:
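FS and GS are the remaining segments with active base addresses. They're used for thread-local storage (TLS) in user mode and for per-CPU data in the kernel. For example, %fs:0x28 or similar is often the stack canary location.
The bases for FS and GS can be set via the FSGSBASE instructions (RDFSBASE/WRFSBASE and RDGSBASE/WRGSBASE) or via the IA32_FS_BASE and IA32_GS_BASE MSRs: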
```c
/* 64-bit mode selector operations */
#include <stdint.h>

struct percpu_data;  /* Opaque here; the layout is defined by the kernel */

/* Check if running in 64-bit mode based on CS */
int is_64bit_mode(void) {
    uint16_t cs;
    __asm__ volatile("mov %%cs, %0" : "=r"(cs));
    /* In actual 64-bit code, we're obviously 64-bit,
       but this shows how to read CS */
    return (cs == 0x10) || (cs == 0x33);  /* Kernel64 or User64 CS */
}

/* Read FS base using RDFSBASE (requires FSGSBASE support) */
void *read_fs_base(void) {
    void *base;
    __asm__ volatile("rdfsbase %0" : "=r"(base));
    return base;
}

/* Write GS base using WRGSBASE */
void write_gs_base(void *base) {
    __asm__ volatile("wrgsbase %0" : : "r"(base));
}

/* Alternative: Read FS base via MSR (needs no FSGSBASE support,
   but RDMSR only executes at CPL 0) */
void *read_fs_base_msr(void) {
    uint32_t lo, hi;
    __asm__ volatile(
        "mov $0xC0000100, %%ecx\n"  /* IA32_FS_BASE */
        "rdmsr\n"
        : "=a"(lo), "=d"(hi)
        :
        : "ecx"
    );
    return (void *)((uint64_t)hi << 32 | lo);
}

/* Access TLS data via FS segment */
uint64_t get_tls_value(int offset) {
    uint64_t value;
    __asm__ volatile(
        "mov %%fs:(%1), %0\n"
        : "=r"(value)
        : "r"((uint64_t)offset)
    );
    return value;
}

/* Per-CPU data access pattern (Linux GS-based) */
struct percpu_data *get_current_percpu(void) {
    struct percpu_data *ptr;
    __asm__ volatile(
        "mov %%gs:0, %0\n"  /* percpu area pointer at GS:0 */
        : "=r"(ptr)
    );
    return ptr;
}
```

In 64-bit mode, the SWAPGS instruction exchanges the GS base value with the value in MSR IA32_KERNEL_GS_BASE. This is used during syscall/sysret transitions: user mode has its own GS base (for TLS), kernel mode has its own (for per-CPU data). SWAPGS atomically switches between them. Forgetting SWAPGS during a syscall entry is a common source of bugs.
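The exchange SWAPGS performs can be pictured with a two-field model. This is purely illustrative C with hypothetical names; on real hardware the swap is a single privileged instruction, not a function call, and the two values live in the GS descriptor cache and the IA32_KERNEL_GS_BASE MSR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct gs_state {
    uint64_t gs_base;         /* active GS base used by GS-relative accesses */
    uint64_t kernel_gs_base;  /* value parked in IA32_KERNEL_GS_BASE */
};

/* Model of SWAPGS: exchange the active base with the parked one. */
static inline void swapgs_model(struct gs_state *s)
{
    assert(s != NULL);
    uint64_t tmp = s->gs_base;
    s->gs_base = s->kernel_gs_base;
    s->kernel_gs_base = tmp;
}
```

Because the operation is a pure exchange, it must be executed exactly once on entry and once on exit. A path that skips one of the two leaves the kernel running on the user's GS base, or returns to user mode with the kernel's.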
We've explored segment selectors—the compact 16-bit values that form the bridge between software segment register operations and the hardware descriptor table architecture.
What's Next:
With selectors, GDT, and LDT covered, we'll examine the evolution to x86-64—how the move to 64-bit computing fundamentally changed segmentation, what remains, what's deprecated, and how modern operating systems leverage the simplified memory model while maintaining backward compatibility.
You now understand segment selectors in detail—their format, purpose, privilege implications, loading rules, and role in both 32-bit protected mode and 64-bit long mode. This knowledge is essential for low-level systems programming, security analysis, and understanding how x86 hardware enforces memory protection.