In the architecture of segmented memory, no field is more fundamental than the segment base address. This deceptively simple value—a physical memory address—is the anchor point for every memory access within a segment. It is the mechanism by which a program's logical view of memory is mapped onto the physical reality of RAM chips and memory controllers.
The base address answers a critical question: Where in physical memory does this segment begin?
Every byte the CPU fetches from a segment, every instruction executed, every variable read or written, has its physical location calculated by adding an offset to this base. Understanding the segment base is understanding the fundamental bridge between logical and physical address spaces in segmented systems.
Moreover, the base address enables one of the most powerful features of operating systems: relocation. A program can be compiled once, loaded anywhere in memory, and run correctly—all because the OS simply adjusts the segment base addresses to reflect the actual load location.
By the end of this page, you will deeply understand how segment base addresses work, how they enable dynamic relocation and memory sharing, the hardware mechanisms for base loading, memory layout implications, and how bases interact with limits for complete address space definition.
At its core, the segment base participates in a simple but profound equation that is the heart of segmented address translation:
Physical Address = Segment Base + Offset
This equation is executed by hardware—specifically the Memory Management Unit (MMU)—for every memory access in a segmented system. Let's examine each component:
The Offset:
The offset is the address specified by the program. When a programmer writes array[i] or a compiler generates MOV EAX, [EBX+4], the actual numerical address used is an offset within a segment. The program has no direct knowledge of where that segment is physically located.
The Base:
The base is the starting physical address of the segment, maintained in the segment table entry and cached in the CPU's segment registers. The base represents the mapping from the program's logical segment to physical memory.
The Translation:
Every time the CPU needs to access memory, it:
```
// Segmented Address Translation Algorithm
function translate_address(segment_selector, offset):
    // Step 1: Look up segment descriptor
    descriptor = segment_table[segment_selector.index]

    // Step 2: Check if segment is present
    if not descriptor.present:
        raise SegmentNotPresentFault(segment_selector)

    // Step 3: Check privilege level
    effective_privilege = max(CPL, segment_selector.RPL)
    if effective_privilege > descriptor.DPL:
        raise GeneralProtectionFault("Privilege violation")

    // Step 4: Check bounds
    effective_limit = descriptor.limit
    if descriptor.granularity == PAGE:
        effective_limit = (descriptor.limit + 1) * 4096 - 1

    if descriptor.expand_down:
        if offset <= effective_limit or offset > MAX_OFFSET:
            raise GeneralProtectionFault("Bounds violation")
    else:  // expand up
        if offset > effective_limit:
            raise GeneralProtectionFault("Bounds violation")

    // Step 5: Calculate physical address
    physical_address = descriptor.base + offset

    // Step 6: Update accessed bit if needed
    if not descriptor.accessed:
        descriptor.accessed = true

    return physical_address
```

The Power of Indirection:
This simple addition creates a powerful level of indirection. The program knows nothing about physical memory locations—it only deals with offsets. The OS can place the segment anywhere in physical memory simply by adjusting the base. This separation of concerns is a cornerstone of modern operating system design:
No recompilation, no relinking, no modification of the program is required when its physical location changes.
The number of bits allocated to the base address field directly determines the maximum amount of physical memory that segments can address. This relationship has driven much of the evolution of CPU architectures.
Historical Progression:
| CPU/Mode | Base Address Bits | Max Physical Memory | Historical Context |
|---|---|---|---|
| 8086 (Real Mode) | 20 (16-bit seg × 16) | 1 MB | Original IBM PC, 1981 |
| 80286 (Protected) | 24 bits | 16 MB | First protected mode, 1982 |
| 80386 (Protected) | 32 bits | 4 GB | Full 32-bit support, 1985 |
| x86-64 (Long Mode) | Flat model, 64 bits | Theoretical: 16 EB | Modern systems, paging dominates |
Real Mode Segmentation (8086):
In the original 8086 processor, segments worked differently. The segment register held a 16-bit value that was multiplied by 16 (shifted left 4 bits) and added to a 16-bit offset:
Physical = (Segment Register × 16) + Offset
= (Segment Register << 4) + Offset
For example, segment 0x1234 with offset 0x5678:
Physical = 0x1234 × 16 + 0x5678
= 0x12340 + 0x5678
= 0x179B8
This gave 20 bits of address space (1 MB), which was astounding for personal computers in 1981 but quickly became constraining.
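The translation is easy to model in C. This small sketch (illustrative only) reproduces the worked example above and shows a consequence of the scheme: many different segment:offset pairs alias the same physical byte.

```c
#include <stdint.h>
#include <stdio.h>

/* 8086 real-mode translation: physical = segment * 16 + offset */
static uint32_t real_mode_physical(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

int main(void) {
    /* The worked example: 0x1234:0x5678 -> 0x179B8 */
    printf("0x%05X\n", (unsigned)real_mode_physical(0x1234, 0x5678));

    /* Aliasing: a different segment:offset pair reaches the same byte */
    printf("0x%05X\n", (unsigned)real_mode_physical(0x1700, 0x09B8));
    return 0;
}
```

Both calls print 0x179B8, demonstrating that real-mode addresses are not unique.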
Protected Mode Evolution (80286+):
The 80286 introduced protected mode with true segment descriptors. The base address became a field in the descriptor rather than a scaled segment register value. The 80286 used a 24-bit base (16 MB), and the 80386 extended this to 32 bits (4 GB).
Modern 64-bit Systems:
In x86-64 long mode, segmentation is essentially disabled for user-mode code—the base of code and data segments is forced to 0, creating a flat memory model. The FS and GS segment registers are exceptions, retaining their bases for thread-local storage and kernel data structures.
Physical address space in 64-bit systems is determined by paging, not segmentation. Modern CPUs support 48-52 bits of physical address (256 TB to 4 PB), far beyond what 32-bit segment bases could address.
The base address field size created hard limits on addressable memory. This is why the transition from 16-bit to 32-bit computing was so significant: it wasn't just about register width, but about escaping the 16 MB (80286) and later 4 GB (80386) ceilings on usable RAM. The same pressure drove the transition to 64-bit computing.
One of the most powerful capabilities enabled by segment base addresses is dynamic relocation—the ability to move a process's memory to a different physical location while the process is running, without requiring any changes to the code or its logical addresses.
The Relocation Problem:
In early computing, programs were compiled for specific memory addresses. If a program expected to run at address 0x1000, it had to be loaded exactly there. With multiple programs, this created impossible conflicts.
The Segment Base Solution:
With segmented addressing, programs are compiled using offsets from the start of their segments, not absolute addresses. The OS can load the segment anywhere in physical memory and simply set the base address accordingly.
Relocation Process:
1. Pause the Process: The OS suspends execution of the process being relocated.
2. Copy Memory Contents: The segment's contents are copied from the old location to the new location. For a 64KB segment, this is a straightforward memory copy.
3. Update Segment Base: The segment table entry's base field is updated to the new physical address.
4. Flush Caches: Any cached copies of the segment descriptor (in segment register caches) must be invalidated or updated.
5. Resume Execution: The process continues running. Its code hasn't changed and all offsets remain valid; only the base has moved. (A C sketch of this sequence follows the list.)
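Here is a minimal sketch of that sequence, reusing the hypothetical kernel types from the other examples on this page (process_t, segment_descriptor_t) and illustrative helpers such as pause_process and flush_segment_cache; it is named move_segment to avoid clashing with the relocate_segment helper referenced later.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: move one segment to a new physical address.
 * The types and helper functions are illustrative, not a real API. */
int move_segment(process_t* proc, segment_id_t seg_id,
                 uintptr_t new_physical_base) {
    segment_descriptor_t* seg = get_segment(proc, seg_id);

    pause_process(proc);                       /* 1. suspend execution */

    /* 2. copy contents; the limit is the highest valid offset, so the
     *    segment occupies limit + 1 bytes */
    memcpy((void*)new_physical_base,
           (void*)(uintptr_t)seg->base,
           (size_t)seg->limit + 1);

    seg->base = (uint32_t)new_physical_base;   /* 3. update the base */

    flush_segment_cache(proc, seg_id);         /* 4. invalidate caches */

    resume_process(proc);                      /* 5. continue running */
    return 0;
}
```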
Why Relocation Matters:

Relocation lets the OS compact fragmented memory, swap a segment out and bring it back at a different address, and load a program anywhere without recompiling or relinking it.
While powerful, segment relocation has costs. Copying memory takes time proportional to segment size—relocating a 100MB segment requires copying 100MB of data. During this time, the process is paused. For large segments, this can cause noticeable latency. This overhead is one reason paging (with its smaller, fixed-size units) often dominates over pure segmentation for memory management.
An elegant use of segment bases is memory sharing—allowing multiple processes to share the same physical memory by pointing their segment bases to the same location. This is fundamental for shared libraries, inter-process communication, and system efficiency.
The Aliasing Concept:
When two segment table entries have the same base address, they "alias" the same physical memory. Writes through one segment are immediately visible through the other.
Shared Library Example:
Process A's Code Segment: Base = 0x40000000, Limit = 0x100000
Process B's Code Segment: Base = 0x40000000, Limit = 0x100000
↓ Same physical memory ↓
Shared Library Code
Both processes execute the same physical copy of the library code. If the library is 1 MB, only 1 MB is used, not 2 MB. With 100 processes sharing the same library, the savings are enormous.
Benefits of Segment Sharing:
| Benefit | Description | Impact |
|---|---|---|
| Memory Efficiency | Single physical copy serves multiple processes | Dramatic RAM savings for common libraries |
| Cache Efficiency | Shared code stays in CPU cache | Better instruction cache hit rates |
| Faster Process Creation | fork() inherits parent's segment bases | No immediate memory copy needed |
| IPC Performance | Shared data segments for communication | Zero-copy data sharing |
| Consistency | Updates to shared code affect all users | Single point of patching |
Sharing with Different Permissions:
Processes can share the same physical memory but with different access rights. This is achieved by having different segment descriptors (with different protection bits) pointing to the same base:
Process A: Base=0x50000, Limit=0x10000, Read/Write
Process B: Base=0x50000, Limit=0x10000, Read-Only
Process A can modify the shared region; Process B can only read it. This is useful for scenarios like a server process writing data that multiple client processes read.
Shared Memory Implementation:
```c
// OS kernel code to create a shared segment

// Structure to track shared memory regions
typedef struct {
    void*  physical_base;
    size_t size;
    int    ref_count;
    int    permissions;
} shared_region_t;

// Create a shared segment mapping for a process
int create_shared_segment(process_t* proc, shared_region_t* region,
                          int local_selector, int permissions) {
    // Requested permissions must be a subset of the region's permissions
    if ((permissions & ~region->permissions) != 0) {
        return -EPERM;  // Requested more than allowed
    }

    // Allocate a segment descriptor slot
    segment_descriptor_t* desc = allocate_descriptor(proc, local_selector);
    if (desc == NULL)
        return -ENOMEM;

    // Set up the descriptor to point to shared physical memory
    desc->base    = (uint32_t)(uintptr_t)region->physical_base;
    desc->limit   = region->size - 1;
    desc->present = 1;
    desc->dpl     = 3;  // User accessible
    desc->type    = (permissions & PERM_WRITE) ? DATA_READWRITE
                                               : DATA_READONLY;

    // Increment reference count
    region->ref_count++;

    return 0;  // Success
}
```

When sharing segments, the OS must carefully manage permissions. A process should not gain write access to memory that should be read-only. The DPL and protection bits in segment descriptors enforce this at the hardware level, preventing privilege escalation through shared memory.
For address translation to be efficient, the CPU cannot fetch segment descriptors from memory on every instruction. Instead, segment information is cached in special registers. Understanding this caching is crucial for system programming.
Segment Register Architecture:
In x86 architecture, each segment register (CS, DS, SS, ES, FS, GS) has two parts:
┌─────────────────────────────────────────────────────────────────┐
│ Segment Register (e.g., DS) │
├─────────────────┬───────────────────────────────────────────────┤
│ Selector │ Hidden Descriptor Cache │
│ (16 bits) │ Base(32b) | Limit(20b) | Attributes(16b) │
│ Visible to │ Invisible to Software │
│ Software │ Loaded automatically when │
│ │ selector is loaded into register │
└─────────────────┴───────────────────────────────────────────────┘
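As a mental model, the two halves can be pictured as a pair of C structs. This is a sketch only; the hidden part lives in CPU-internal state, not in memory.

```c
#include <stdint.h>

/* The hidden descriptor cache: filled automatically from the GDT/LDT
 * whenever a new selector is loaded into the register. */
typedef struct {
    uint32_t base;        /* Cached Base[31:0] from the descriptor  */
    uint32_t limit;       /* Cached Limit[19:0], scaled by the G bit */
    uint16_t attributes;  /* Cached type, DPL, present flag, etc.    */
} descriptor_cache_t;

/* A segment register: only the selector is visible to software. */
typedef struct {
    uint16_t selector;          /* Written by MOV DS, AX and friends */
    descriptor_cache_t hidden;  /* Used for every address translation */
} segment_register_t;
```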
Loading a Segment Register:
MOV DS, AX    ; AX contains a segment selector

Executing this instruction makes the CPU read the corresponding descriptor from the GDT/LDT, perform privilege and validity checks, and load the descriptor's base, limit, and attributes into the hidden cache.

Performance Implications:
Because the descriptor is cached, address translation is extremely fast—just an integer addition. The memory read from the descriptor table only happens when the segment register is explicitly loaded with a new selector.
This is why programs typically load segment registers once (during initialization) and then use them extensively. Frequent segment register loads would be slow due to the required memory access and privilege checks.
Descriptor Table Updates:
A critical question arises: what happens if the OS modifies a segment descriptor in the GDT/LDT while a program has that segment loaded?
Answer: The cached copy is not automatically updated. The segment register continues using the old cached values until the segment is reloaded.
Implications: after modifying a descriptor in the GDT or LDT, the OS must explicitly force the affected segment registers to reload, as the patterns below show.
```
; Force reload of DS segment register
; This refreshes the hidden cache from the descriptor table
reload_data_segment:
    mov ax, ds          ; Save current selector
    mov ds, ax          ; Reload it - forces descriptor cache refresh
    ret

; Common pattern after modifying GDT entries
refresh_all_data_segments:
    push ds
    push es
    push fs
    push gs
    ; Loading with same values forces cache refresh
    pop gs
    pop fs
    pop es
    pop ds
    ret

; Context switch sequence that naturally reloads segments
context_switch:
    ; Save current context...
    ; Load new segment selectors (from new process's context)
    mov ds, [new_context.ds]    ; Caches new descriptor
    mov es, [new_context.es]
    mov fs, [new_context.fs]
    mov gs, [new_context.gs]
    mov ss, [new_context.ss]
    ; Load code segment via far jump
    jmp [new_context.cs]:[new_context.eip]
```

Modern OSes typically set all segment bases to 0 (flat model) and don't change them during normal operation. The FS and GS segments are exceptions—they're used for thread-local storage (TLS) on Linux and Windows, with the base set to the current thread's TLS block. This requires per-CPU or per-thread descriptor table entries.
The segment base addresses collectively define a process's view of physical memory. How the OS chooses these bases determines the memory layout—where code, data, and stack reside, how much they can grow, and how processes are isolated from each other.
Traditional Segmented Layout:
In a purely segmented system, each segment occupies a contiguous physical region. A typical process layout might be:
Physical Memory Layout:
0x00000000 ┌──────────────────────────┐
│ OS Kernel Code │ (Protected, DPL=0)
│ OS Kernel Data │
0x00100000 ├──────────────────────────┤
│ Process A Code │ Base=0x00100000
0x00150000 ├──────────────────────────┤
│ Process A Data │ Base=0x00150000
0x001A0000 ├──────────────────────────┤
│ Process A Stack │ Base=0x001A0000 (Expand-Down)
0x001B0000 ├──────────────────────────┤
│ Process B Code │ Base=0x001B0000
0x00200000 ├──────────────────────────┤
│ Process B Data │ Base=0x00200000
0x00280000 ├──────────────────────────┤
│ Free Space │
├──────────────────────────┤
│ ... │
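The diagram can be restated as a small table of bases and limits. This sketch (hypothetical values matching the diagram, with the limit as the highest valid offset) checks that no segment runs into its neighbor.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    const char* name;
    uint32_t    base;
    uint32_t    limit;  /* highest valid offset within the segment */
} layout_entry_t;

/* Bases taken from the layout diagram above */
static const layout_entry_t layout[] = {
    { "Process A Code",  0x00100000, 0x0004FFFF },
    { "Process A Data",  0x00150000, 0x0004FFFF },
    { "Process A Stack", 0x001A0000, 0x0000FFFF },
    { "Process B Code",  0x001B0000, 0x0004FFFF },
    { "Process B Data",  0x00200000, 0x0007FFFF },
};

int main(void) {
    /* Each segment must end before the next one begins */
    for (unsigned i = 0; i + 1 < sizeof layout / sizeof layout[0]; i++) {
        assert(layout[i].base + layout[i].limit < layout[i + 1].base);
    }
    return 0;
}
```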
Base Selection Considerations:

When choosing bases, the OS must respect any alignment requirements, leave growth room for segments that expand (heap and stack), keep user segments clear of the kernel's region, and prevent one process's segments from colliding with another's.
Overlapping Segments:
Interestingly, segment bases and limits don't prevent segments from overlapping in physical memory. Two segments could have bases such that their physical ranges intersect:
Segment A: Base=0x1000, Limit=0x3000 → 0x1000-0x4000
Segment B: Base=0x2000, Limit=0x2000 → 0x2000-0x4000
Overlap: 0x2000-0x4000
This can be intentional (for aliasing/sharing) or a bug. The hardware doesn't prevent it—protection is about what operations are allowed, not about ensuring segments don't overlap.
Flat Model as a Special Case:
In a flat memory model (used by modern OSes), all segment bases are set to 0 and limits to maximum:
Code Segment: Base=0, Limit=0xFFFFFFFF (4GB)
Data Segment: Base=0, Limit=0xFFFFFFFF (4GB)
Stack Segment: Base=0, Limit=0xFFFFFFFF (4GB)
This effectively disables segmentation—all segments cover the entire address space. Protection and isolation are then handled by paging instead. This is simpler to manage but loses some benefits of true segmentation.
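For concreteness, the classic flat-model GDT can be written down as three hand-encoded 8-byte descriptors. This is a minimal sketch; the access bytes 0x9A and 0x92 denote ring-0 code and data respectively.

```c
#include <stdint.h>

/* Flat model: base = 0, limit field = 0xFFFFF with G = 1 (page
 * granularity), giving an effective limit of 4 GB. */
static const uint64_t flat_gdt[] = {
    0x0000000000000000ULL, /* null descriptor (required first entry)      */
    0x00CF9A000000FFFFULL, /* code: base=0, limit=0xFFFFF, G=1, exec/read  */
    0x00CF92000000FFFFULL, /* data: base=0, limit=0xFFFFF, G=1, read/write */
};
```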
Many systems combined segmentation and paging. Segmentation handled logical organization (code, data, stack separation), while paging handled physical memory management (allocating RAM, swapping). The segment base pointed to a linear address, which was then translated via paging to a physical address. This two-stage translation provided the benefits of both mechanisms.
While the base and limit are separate fields, they work as a unit to define the segment's physical extent. Understanding their interaction is crucial for proper memory management.
Defining the Segment Bounds:

For an expand-up segment (normal code and data), valid offsets run from 0 through Limit, so the segment occupies physical addresses Base through Base + Limit.

For an expand-down segment (stacks), the test is inverted: valid offsets run from Limit + 1 up to the maximum offset (0xFFFF or 0xFFFFFFFF, depending on the segment's size attribute). Decreasing the limit makes more low offsets valid, which is how the stack grows downward.
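A small sketch of the two rules in C, following the same limit convention as the translation pseudocode earlier on this page:

```c
#include <stdbool.h>
#include <stdint.h>

/* For expand-up segments, `limit` is the highest valid offset.
 * For expand-down segments, `limit` is the highest *invalid* offset;
 * the upper bound (0xFFFFFFFF here) is implicit in the 32-bit type. */
static bool offset_valid(uint32_t offset, uint32_t limit, bool expand_down) {
    if (expand_down) {
        /* Valid offsets run from limit + 1 up to the maximum offset */
        return offset > limit;
    }
    /* Expand-up: valid offsets run from 0 through limit inclusive */
    return offset <= limit;
}
```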
Segment Growth Scenarios:
| Segment Type | Growth Direction | How to Grow | Constraint |
|---|---|---|---|
| Heap (Data) | Upward | Increase limit | Must not overlap next segment |
| Stack | Downward | Decrease expand-down limit | Must not overlap previous segment |
| Code | Typically fixed | Relink with larger code | Static after loading |
Growing the Heap:
When a program calls sbrk(), or when malloc() needs more space, the OS can grow the data segment by increasing its limit, provided the enlarged segment would not overlap its neighbor.
Growing the Stack:
Stack growth is more complex because of the expand-down arrangement: the OS grows the stack by decreasing the expand-down limit, which exposes new valid offsets below the current ones.
Example: Segment Collision Prevention
```c
// Check if a segment can grow without collision
bool can_grow_segment(process_t* proc, segment_id_t seg_id,
                      size_t additional_bytes) {
    segment_descriptor_t* seg = get_segment(proc, seg_id);

    // Calculate new end address
    uint32_t current_end = seg->base + seg->limit;
    uint32_t new_end = current_end + additional_bytes;

    // Check against all other segments of this process
    for (int i = 0; i < MAX_SEGMENTS; i++) {
        if (i == seg_id) continue;

        segment_descriptor_t* other = get_segment(proc, i);
        if (!other->present) continue;

        uint32_t other_start = other->base;
        uint32_t other_end = other->base + other->limit;

        // Check for overlap with grown segment
        if (new_end > other_start && seg->base < other_end) {
            return false;  // Would collide
        }
    }

    // Check against system memory regions
    if (new_end > get_max_user_address()) {
        return false;  // Would exceed user space
    }

    return true;  // Safe to grow
}

int grow_segment(process_t* proc, segment_id_t seg_id,
                 size_t additional_bytes) {
    if (!can_grow_segment(proc, seg_id, additional_bytes)) {
        // Need to relocate first
        if (!relocate_segment(proc, seg_id, additional_bytes)) {
            return -ENOMEM;  // Cannot grow
        }
    }

    segment_descriptor_t* seg = get_segment(proc, seg_id);

    // Allocate physical memory for new region
    if (!allocate_physical(seg->base + seg->limit + 1, additional_bytes)) {
        return -ENOMEM;
    }

    // Increase the limit
    seg->limit += additional_bytes;

    // Force cache refresh
    flush_segment_cache(proc, seg_id);

    return 0;  // Success
}
```

Because segments are contiguous and variable-sized, external fragmentation is inevitable. Over time, free memory becomes scattered in small chunks, none large enough for a new segment. This requires compaction (expensive) or clever allocation strategies. Paging avoids this by using fixed-size units, which is a key reason it replaced pure segmentation.
The x86 architecture provides a concrete example of how base addresses are encoded in segment descriptors. The format is notoriously complex due to historical compatibility requirements.
x86 Segment Descriptor Format (8 bytes):
┌───────────┬────────────────────────────────────────┐
│ Byte 7    │ Base[31:24]                            │
│ Byte 6    │ G | D/B | L | AVL | Limit[19:16]       │
│ Byte 5    │ P | DPL | S | Type                     │
│ Byte 4    │ Base[23:16]                            │
│ Bytes 3-2 │ Base[15:0]                             │
│ Bytes 1-0 │ Limit[15:0]                            │
└───────────┴────────────────────────────────────────┘
Base Address Fields:

The 32-bit base is split across three non-contiguous fields: Base[15:0] in bytes 2-3, Base[23:16] in byte 4, and Base[31:24] in byte 7.

To extract the 32-bit base address:
base = (descriptor[7] << 24) | (descriptor[4] << 16) | (descriptor[2] | (descriptor[3] << 8))
Why Split Across Non-Contiguous Bytes?
This strange layout is for backward compatibility with the 80286. The 286 used 6-byte descriptors with a 24-bit base. When the 386 added 32-bit addressing, the extra 8 bits of base had to go somewhere new—they were placed in what was reserved/available space in the 286 format, resulting in the split.
```c
// x86 segment descriptor manipulation

typedef struct {
    uint16_t limit_low;    // Bytes 0-1: Limit[15:0]
    uint16_t base_low;     // Bytes 2-3: Base[15:0]
    uint8_t  base_mid;     // Byte 4:  Base[23:16]
    uint8_t  access;       // Byte 5:  Access byte (P, DPL, S, Type)
    uint8_t  granularity;  // Byte 6:  Limit[19:16], flags
    uint8_t  base_high;    // Byte 7:  Base[31:24]
} __attribute__((packed)) gdt_entry_t;

// Extract 32-bit base address from descriptor
static inline uint32_t get_descriptor_base(const gdt_entry_t* desc) {
    return (uint32_t)desc->base_low |
           ((uint32_t)desc->base_mid << 16) |
           ((uint32_t)desc->base_high << 24);
}

// Set base address in descriptor
static inline void set_descriptor_base(gdt_entry_t* desc, uint32_t base) {
    desc->base_low  = (uint16_t)(base & 0xFFFF);
    desc->base_mid  = (uint8_t)((base >> 16) & 0xFF);
    desc->base_high = (uint8_t)((base >> 24) & 0xFF);
}

// Extract limit (20-bit, not considering granularity)
static inline uint32_t get_descriptor_limit(const gdt_entry_t* desc) {
    return (uint32_t)desc->limit_low |
           (((uint32_t)desc->granularity & 0x0F) << 16);
}

// Create a complete segment descriptor
void create_gdt_entry(gdt_entry_t* desc, uint32_t base, uint32_t limit,
                      uint8_t access, uint8_t flags) {
    // Set base (split across 3 fields)
    set_descriptor_base(desc, base);

    // Set limit (split across 2 fields)
    if (limit > 0xFFFFF) {
        // Use page granularity if limit > 1MB
        limit >>= 12;
        flags |= 0x80;  // Set G bit
    }
    desc->limit_low   = limit & 0xFFFF;
    desc->granularity = (limit >> 16) & 0x0F;
    desc->granularity |= (flags & 0xF0);

    // Set access byte
    desc->access = access;
}
```

In x86-64 long mode, the descriptor format changes for system segments (TSS, LDT), which expand to 16 bytes to accommodate 64-bit base addresses. However, code and data segment descriptors remain 8 bytes, and the base field is largely ignored—the CPU forces base=0 for user-mode segments in 64-bit mode, implementing a flat memory model.
Even in modern 64-bit systems that use a flat memory model, segment bases find an important use: Thread-Local Storage (TLS). The FS and GS segment registers retain functional bases, allowing per-thread and per-CPU data access.
Why TLS Needs Segment Bases:
In a multithreaded program, some data needs to be per-thread: the C library's errno, the stack-protector canary, and any variable declared with __thread or thread_local.

Without segment bases, accessing this data would require finding the current thread on every access: a function call or a lookup in some global thread table, just to locate the right copy of the variable.
With segment bases, the process is much simpler:

MOV EAX, FS:[0x10]

TLS Location With FS/GS:
Thread 1: GS Base = 0x7F1234560000 → Thread 1's TLS Block
Thread 2: GS Base = 0x7F1234570000 → Thread 2's TLS Block
Thread 3: GS Base = 0x7F1234580000 → Thread 3's TLS Block
All threads use identical code:
MOV RAX, GS:[0x28] ; Read thread-local variable at offset 0x28
Each thread gets its own value based on its GS base
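In C, this is exactly what __thread compiles down to. A minimal POSIX-threads example; on x86-64 Linux, gcc and clang typically emit an FS-relative access for the increment, matching the table that follows.

```c
/* build with: cc -pthread tls_demo.c */
#include <pthread.h>
#include <stdio.h>

__thread int per_thread_counter = 0;  /* one copy per thread */

static void* worker(void* arg) {
    (void)arg;
    per_thread_counter++;  /* compiles to an FS-relative load/store */
    printf("counter = %d\n", per_thread_counter);  /* always prints 1 */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Each thread sees counter = 1 because each increment touches that thread's own copy, located through its own FS base.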
| OS | FS Register | GS Register |
|---|---|---|
| Linux (user) | Thread pointer (glibc TLS) | Not typically used |
| Linux (kernel) | Not used | Per-CPU data (via SWAPGS) |
| Windows (user) | TEB on 32-bit Windows | TEB on 64-bit Windows |
| Windows (kernel, x64) | Not used | Processor Control Region (KPCR) |
| macOS (x86-64) | Not typically used | Thread-local storage |
Setting the FS/GS Base:
Modern x86-64 CPUs provide special MSRs (Model Specific Registers) to set the FS and GS bases without going through the GDT: IA32_FS_BASE and IA32_GS_BASE, plus IA32_KERNEL_GS_BASE, which the SWAPGS instruction exchanges with IA32_GS_BASE on kernel entry.
The WRFSBASE and WRGSBASE instructions (if enabled) allow user-mode code to set these bases directly, improving thread creation performance.
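On x86-64 Linux, for example, user code can set these bases through the arch_prctl system call (the kernel writes the MSR, or uses WRFSBASE/WRGSBASE, on the caller's behalf). A minimal sketch:

```c
#include <asm/prctl.h>    /* ARCH_SET_GS, ARCH_GET_GS */
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t gs_block[8];  /* stand-in for a per-thread data block */

int main(void) {
    /* Point the GS base at our block */
    syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)gs_block);

    /* Read it back to confirm */
    unsigned long base = 0;
    syscall(SYS_arch_prctl, ARCH_GET_GS, &base);
    printf("GS base = %#lx\n", base);
    return 0;
}
```

After this call, instructions like MOV RAX, GS:[0x8] address gs_block directly, with no table lookup.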
Security Implications:
The GS-relative access pattern is used for stack protector (canary) checks:
; Function prologue with stack protector
function_start:
sub rsp, 0x28
mov rax, gs:[0x28] ; Load canary from TLS
mov [rsp+0x20], rax ; Store on stack
; ... function body ...
; Function epilogue
mov rax, [rsp+0x20] ; Retrieve canary from stack
xor rax, gs:[0x28] ; Compare with TLS value
jnz __stack_chk_fail ; If different, buffer overflow detected
add rsp, 0x28 ; Release the stack frame
ret
Using segment bases for TLS access is extremely efficient—it's just a memory load with an implied base. No table lookups, no function calls, no global variable access to find the current thread. This matters because TLS is accessed frequently: errno checks, stack canary verification on every function return, and thread-local variables throughout the code.
We've explored the segment base address from every angle: mathematical foundations, hardware implementation, OS usage, and modern applications. The key insights:

- Every memory access is translated as Physical Address = Segment Base + Offset, performed in hardware by the MMU.
- Because programs deal only in offsets, the OS can relocate a segment simply by updating its base.
- Pointing multiple descriptors at the same base enables zero-copy sharing, with per-process permissions.
- Hidden descriptor caches in the segment registers make translation a single addition per access.
- Even in flat-model 64-bit systems, the FS and GS bases survive to serve thread-local storage.
What's Next:
Having mastered the base address, we'll now examine the segment limit in comparable depth. The limit field defines segment boundaries and enables hardware bounds checking—a critical protection mechanism that catches buffer overflows and other memory errors at the point of access.
You now have a complete understanding of segment base addresses—their role in address translation, how they enable relocation and sharing, their implementation in hardware, and their continued relevance for thread-local storage. This knowledge is essential for systems programming and OS internals.