In the architecture of segmented memory, no field is more fundamental than the segment base address. This deceptively simple value—a physical memory address—is the anchor point for every memory access within a segment. It is the mechanism by which a program's logical view of memory is mapped onto the physical reality of RAM chips and memory controllers.
The base address answers a critical question: Where in physical memory does this segment begin?
Every byte the CPU fetches from a segment, every instruction executed, every variable read or written, has its physical location calculated by adding an offset to this base. Understanding the segment base is understanding the fundamental bridge between logical and physical address spaces in segmented systems.
Moreover, the base address enables one of the most powerful features of operating systems: relocation. A program can be compiled once, loaded anywhere in memory, and run correctly—all because the OS simply adjusts the segment base addresses to reflect the actual load location.
By the end of this page, you will deeply understand how segment base addresses work, how they enable dynamic relocation and memory sharing, the hardware mechanisms for base loading, memory layout implications, and how bases interact with limits for complete address space definition.
At its core, the segment base participates in a simple but profound equation that is the heart of segmented address translation:
Physical Address = Segment Base + Offset
This equation is executed by hardware—specifically the Memory Management Unit (MMU)—for every memory access in a segmented system. Let's examine each component:
The Offset:
The offset is the address specified by the program. When a programmer writes array[i] or a compiler generates MOV EAX, [EBX+4], the actual numerical address used is an offset within a segment. The program has no direct knowledge of where that segment is physically located.
The Base:
The base is the starting physical address of the segment, maintained in the segment table entry and cached in the CPU's segment registers. The base represents the mapping from the program's logical segment to physical memory.
The Translation:
Every time the CPU needs to access memory, it:
```
// Segmented Address Translation Algorithm
function translate_address(segment_selector, offset):
    // Step 1: Look up segment descriptor
    descriptor = segment_table[segment_selector.index]

    // Step 2: Check if segment is present
    if not descriptor.present:
        raise SegmentNotPresentFault(segment_selector)

    // Step 3: Check privilege level
    effective_privilege = max(CPL, segment_selector.RPL)
    if effective_privilege > descriptor.DPL:
        raise GeneralProtectionFault("Privilege violation")

    // Step 4: Check bounds
    effective_limit = descriptor.limit
    if descriptor.granularity == PAGE:
        effective_limit = (descriptor.limit + 1) * 4096 - 1

    if descriptor.expand_down:
        if offset <= effective_limit or offset > MAX_OFFSET:
            raise GeneralProtectionFault("Bounds violation")
    else:  // expand up
        if offset > effective_limit:
            raise GeneralProtectionFault("Bounds violation")

    // Step 5: Calculate physical address
    physical_address = descriptor.base + offset

    // Step 6: Update accessed bit if needed
    if not descriptor.accessed:
        descriptor.accessed = true

    return physical_address
```

The Power of Indirection:
This simple addition creates a powerful level of indirection. The program knows nothing about physical memory locations—it only deals with offsets. The OS can place the segment anywhere in physical memory simply by adjusting the base. This separation of concerns is a cornerstone of modern operating system design:
No recompilation, no relinking, no modification of the program is required when its physical location changes.
The number of bits allocated to the base address field directly determines the maximum amount of physical memory that segments can address. This relationship has driven much of the evolution of CPU architectures.
Historical Progression:
| CPU/Mode | Base Address Bits | Max Physical Memory | Historical Context |
|---|---|---|---|
| 8086 (Real Mode) | 20 (16-bit seg × 16) | 1 MB | Original IBM PC, 1981 |
| 80286 (Protected) | 24 bits | 16 MB | First protected mode, 1982 |
| 80386 (Protected) | 32 bits | 4 GB | Full 32-bit support, 1985 |
| x86-64 (Long Mode) | Flat model, 64 bits | Theoretical: 16 EB | Modern systems, paging dominates |
Real Mode Segmentation (8086):
In the original 8086 processor, segments worked differently. The segment register held a 16-bit value that was multiplied by 16 (shifted left 4 bits) and added to a 16-bit offset:
Physical = (Segment Register × 16) + Offset
= (Segment Register << 4) + Offset
For example, segment 0x1234 with offset 0x5678:
Physical = 0x1234 × 16 + 0x5678
= 0x12340 + 0x5678
= 0x179B8
This gave 20 bits of address space (1 MB), which was astounding for personal computers in 1981 but quickly became constraining.
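The translation is easy to model in C. This small sketch (illustrative only) reproduces the worked example above and shows a consequence of the scheme: many different segment:offset pairs alias the same physical byte.

```c
#include <stdint.h>
#include <stdio.h>

/* 8086 real-mode translation: physical = segment * 16 + offset */
static uint32_t real_mode_physical(uint16_t segment, uint16_t offset) {
    return ((uint32_t)segment << 4) + offset;
}

int main(void) {
    /* The worked example: 0x1234:0x5678 -> 0x179B8 */
    printf("0x%05X\n", (unsigned)real_mode_physical(0x1234, 0x5678));

    /* Aliasing: a different segment:offset pair reaches the same byte */
    printf("0x%05X\n", (unsigned)real_mode_physical(0x1700, 0x09B8));
    return 0;
}
```

Both calls print 0x179B8, demonstrating that real-mode addresses are not unique.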
Protected Mode Evolution (80286+):
The 80286 introduced protected mode with true segment descriptors. The base address became a field in the descriptor rather than a scaled segment register value. The 80286 used a 24-bit base (16 MB), and the 80386 extended this to 32 bits (4 GB).
Modern 64-bit Systems:
In x86-64 long mode, segmentation is essentially disabled for user-mode code—the base of code and data segments is forced to 0, creating a flat memory model. The FS and GS segment registers are exceptions, retaining their bases for thread-local storage and kernel data structures.
Physical address space in 64-bit systems is determined by paging, not segmentation. Modern CPUs support 48-52 bits of physical address (256 TB to 4 PB), far beyond what 32-bit segment bases could address.
The base address field size created hard limits on addressable memory. This is why the transition from 16-bit to 32-bit computing was so significant: it wasn't just about register width, but about escaping the 16 MB (80286) and later 4 GB (80386) ceilings on usable RAM. The same pressure drove the transition to 64-bit computing.
One of the most powerful capabilities enabled by segment base addresses is dynamic relocation—the ability to move a process's memory to a different physical location while the process is running, without requiring any changes to the code or its logical addresses.
The Relocation Problem:
In early computing, programs were compiled for specific memory addresses. If a program expected to run at address 0x1000, it had to be loaded exactly there. With multiple programs, this created impossible conflicts.
The Segment Base Solution:
With segmented addressing, programs are compiled using offsets from the start of their segments, not absolute addresses. The OS can load the segment anywhere in physical memory and simply set the base address accordingly.
Relocation Process:
1. Pause the Process: The OS suspends execution of the process being relocated.
2. Copy Memory Contents: The segment's contents are copied from the old location to the new location. For a 64KB segment, this is a straightforward memory copy.
3. Update Segment Base: The segment table entry's base field is updated to the new physical address.
4. Flush Caches: Any cached copies of the segment descriptor (in segment register caches) must be invalidated or updated.
5. Resume Execution: The process continues running. Its code hasn't changed and all offsets remain valid; only the base has moved. (A C sketch of this sequence follows the list.)
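Here is a minimal sketch of that sequence, reusing the hypothetical kernel types from the other examples on this page (process_t, segment_descriptor_t) and illustrative helpers such as pause_process and flush_segment_cache; it is named move_segment to avoid clashing with the relocate_segment helper referenced later.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: move one segment to a new physical address.
 * The types and helper functions are illustrative, not a real API. */
int move_segment(process_t* proc, segment_id_t seg_id,
                 uintptr_t new_physical_base) {
    segment_descriptor_t* seg = get_segment(proc, seg_id);

    pause_process(proc);                       /* 1. suspend execution */

    /* 2. copy contents; the limit is the highest valid offset, so the
     *    segment occupies limit + 1 bytes */
    memcpy((void*)new_physical_base,
           (void*)(uintptr_t)seg->base,
           (size_t)seg->limit + 1);

    seg->base = (uint32_t)new_physical_base;   /* 3. update the base */

    flush_segment_cache(proc, seg_id);         /* 4. invalidate caches */

    resume_process(proc);                      /* 5. continue running */
    return 0;
}
```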
Why Relocation Matters:

Relocation lets the OS compact fragmented memory, swap a segment out and bring it back at a different address, and load a program anywhere without recompiling or relinking it.
While powerful, segment relocation has costs. Copying memory takes time proportional to segment size—relocating a 100MB segment requires copying 100MB of data. During this time, the process is paused. For large segments, this can cause noticeable latency. This overhead is one reason paging (with its smaller, fixed-size units) often dominates over pure segmentation for memory management.
An elegant use of segment bases is memory sharing—allowing multiple processes to share the same physical memory by pointing their segment bases to the same location. This is fundamental for shared libraries, inter-process communication, and system efficiency.
The Aliasing Concept:
When two segment table entries have the same base address, they "alias" the same physical memory. Writes through one segment are immediately visible through the other.
Shared Library Example:
Process A's Code Segment: Base = 0x40000000, Limit = 0x100000
Process B's Code Segment: Base = 0x40000000, Limit = 0x100000
↓ Same physical memory ↓
Shared Library Code
Both processes execute the same physical copy of the library code. If the library is 1 MB, only 1 MB is used, not 2 MB. With 100 processes sharing the same library, the savings are enormous.
Benefits of Segment Sharing:
| Benefit | Description | Impact |
|---|---|---|
| Memory Efficiency | Single physical copy serves multiple processes | Dramatic RAM savings for common libraries |
| Cache Efficiency | Shared code stays in CPU cache | Better instruction cache hit rates |
| Faster Process Creation | fork() inherits parent's segment bases | No immediate memory copy needed |
| IPC Performance | Shared data segments for communication | Zero-copy data sharing |
| Consistency | Updates to shared code affect all users | Single point of patching |
Sharing with Different Permissions:
Processes can share the same physical memory but with different access rights. This is achieved by having different segment descriptors (with different protection bits) pointing to the same base:
Process A: Base=0x50000, Limit=0x10000, Read/Write
Process B: Base=0x50000, Limit=0x10000, Read-Only
Process A can modify the shared region; Process B can only read it. This is useful for scenarios like a server process writing data that multiple client processes read.
Shared Memory Implementation:
```c
// OS kernel code to create a shared segment

// Structure to track shared memory regions
typedef struct {
    void*  physical_base;
    size_t size;
    int    ref_count;
    int    permissions;
} shared_region_t;

// Create a shared segment mapping for a process
int create_shared_segment(process_t* proc, shared_region_t* region,
                          int local_selector, int permissions) {
    // Requested permissions must be a subset of the region's permissions
    if ((permissions & ~region->permissions) != 0) {
        return -EPERM;  // Requested more than allowed
    }

    // Allocate a segment descriptor slot
    segment_descriptor_t* desc = allocate_descriptor(proc, local_selector);
    if (desc == NULL)
        return -ENOMEM;

    // Set up the descriptor to point to shared physical memory
    desc->base    = (uint32_t)(uintptr_t)region->physical_base;
    desc->limit   = region->size - 1;
    desc->present = 1;
    desc->dpl     = 3;  // User accessible
    desc->type    = (permissions & PERM_WRITE) ? DATA_READWRITE
                                               : DATA_READONLY;

    // Increment reference count
    region->ref_count++;

    return 0;  // Success
}
```

When sharing segments, the OS must carefully manage permissions. A process should not gain write access to memory that should be read-only. The DPL and protection bits in segment descriptors enforce this at the hardware level, preventing privilege escalation through shared memory.
For address translation to be efficient, the CPU cannot fetch segment descriptors from memory on every instruction. Instead, segment information is cached in special registers. Understanding this caching is crucial for system programming.
Segment Register Architecture:
In x86 architecture, each segment register (CS, DS, SS, ES, FS, GS) has two parts:
┌─────────────────────────────────────────────────────────────────┐
│ Segment Register (e.g., DS) │
├─────────────────┬───────────────────────────────────────────────┤
│ Selector │ Hidden Descriptor Cache │
│ (16 bits) │ Base(32b) | Limit(20b) | Attributes(16b) │
│ Visible to │ Invisible to Software │
│ Software │ Loaded automatically when │
│ │ selector is loaded into register │
└─────────────────┴───────────────────────────────────────────────┘
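As a mental model, the two halves can be pictured as a pair of C structs. This is a sketch only; the hidden part lives in CPU-internal state, not in memory.

```c
#include <stdint.h>

/* The hidden descriptor cache: filled automatically from the GDT/LDT
 * whenever a new selector is loaded into the register. */
typedef struct {
    uint32_t base;        /* Cached Base[31:0] from the descriptor  */
    uint32_t limit;       /* Cached Limit[19:0], scaled by the G bit */
    uint16_t attributes;  /* Cached type, DPL, present flag, etc.    */
} descriptor_cache_t;

/* A segment register: only the selector is visible to software. */
typedef struct {
    uint16_t selector;          /* Written by MOV DS, AX and friends */
    descriptor_cache_t hidden;  /* Used for every address translation */
} segment_register_t;
```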
Loading a Segment Register:
MOV DS, AX    ; AX contains a segment selector

Executing this instruction makes the CPU read the corresponding descriptor from the GDT/LDT, perform privilege and validity checks, and load the descriptor's base, limit, and attributes into the hidden cache.

Performance Implications:
Because the descriptor is cached, address translation is extremely fast—just an integer addition. The memory read from the descriptor table only happens when the segment register is explicitly loaded with a new selector.
This is why programs typically load segment registers once (during initialization) and then use them extensively. Frequent segment register loads would be slow due to the required memory access and privilege checks.
Descriptor Table Updates:
A critical question arises: what happens if the OS modifies a segment descriptor in the GDT/LDT while a program has that segment loaded?
Answer: The cached copy is not automatically updated. The segment register continues using the old cached values until the segment is reloaded.
Implications: after modifying a descriptor in the GDT or LDT, the OS must explicitly force the affected segment registers to reload, as the patterns below show.
```
; Force reload of DS segment register
; This refreshes the hidden cache from the descriptor table
reload_data_segment:
    mov ax, ds          ; Save current selector
    mov ds, ax          ; Reload it - forces descriptor cache refresh
    ret

; Common pattern after modifying GDT entries
refresh_all_data_segments:
    push ds
    push es
    push fs
    push gs
    ; Loading with same values forces cache refresh
    pop gs
    pop fs
    pop es
    pop ds
    ret

; Context switch sequence that naturally reloads segments
context_switch:
    ; Save current context...
    ; Load new segment selectors (from new process's context)
    mov ds, [new_context.ds]    ; Caches new descriptor
    mov es, [new_context.es]
    mov fs, [new_context.fs]
    mov gs, [new_context.gs]
    mov ss, [new_context.ss]
    ; Load code segment via far jump
    jmp [new_context.cs]:[new_context.eip]
```

Modern OSes typically set all segment bases to 0 (flat model) and don't change them during normal operation. The FS and GS segments are exceptions—they're used for thread-local storage (TLS) on Linux and Windows, with the base set to the current thread's TLS block. This requires per-CPU or per-thread descriptor table entries.
The segment base addresses collectively define a process's view of physical memory. How the OS chooses these bases determines the memory layout—where code, data, and stack reside, how much they can grow, and how processes are isolated from each other.
Traditional Segmented Layout:
In a purely segmented system, each segment occupies a contiguous physical region. A typical process layout might be:
Physical Memory Layout:
0x00000000 ┌──────────────────────────┐
│ OS Kernel Code │ (Protected, DPL=0)
│ OS Kernel Data │
0x00100000 ├──────────────────────────┤
│ Process A Code │ Base=0x00100000
0x00150000 ├──────────────────────────┤
│ Process A Data │ Base=0x00150000
0x001A0000 ├──────────────────────────┤
│ Process A Stack │ Base=0x001A0000 (Expand-Down)
0x001B0000 ├──────────────────────────┤
│ Process B Code │ Base=0x001B0000
0x00200000 ├──────────────────────────┤
│ Process B Data │ Base=0x00200000
0x00280000 ├──────────────────────────┤
│ Free Space │
├──────────────────────────┤
│ ... │
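The diagram can be restated as a small table of bases and limits. This sketch (hypothetical values matching the diagram, with the limit as the highest valid offset) checks that no segment runs into its neighbor.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    const char* name;
    uint32_t    base;
    uint32_t    limit;  /* highest valid offset within the segment */
} layout_entry_t;

/* Bases taken from the layout diagram above */
static const layout_entry_t layout[] = {
    { "Process A Code",  0x00100000, 0x0004FFFF },
    { "Process A Data",  0x00150000, 0x0004FFFF },
    { "Process A Stack", 0x001A0000, 0x0000FFFF },
    { "Process B Code",  0x001B0000, 0x0004FFFF },
    { "Process B Data",  0x00200000, 0x0007FFFF },
};

int main(void) {
    /* Each segment must end before the next one begins */
    for (unsigned i = 0; i + 1 < sizeof layout / sizeof layout[0]; i++) {
        assert(layout[i].base + layout[i].limit < layout[i + 1].base);
    }
    return 0;
}
```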
Base Selection Considerations:

When choosing bases, the OS must respect any alignment requirements, leave growth room for segments that expand (heap and stack), keep user segments clear of the kernel's region, and prevent one process's segments from colliding with another's.
Overlapping Segments:
Interestingly, segment bases and limits don't prevent segments from overlapping in physical memory. Two segments could have bases such that their physical ranges intersect:
Segment A: Base=0x1000, Limit=0x3000 → 0x1000-0x4000
Segment B: Base=0x2000, Limit=0x2000 → 0x2000-0x4000
Overlap: 0x2000-0x4000
This can be intentional (for aliasing/sharing) or a bug. The hardware doesn't prevent it—protection is about what operations are allowed, not about ensuring segments don't overlap.
Flat Model as a Special Case:
In a flat memory model (used by modern OSes), all segment bases are set to 0 and limits to maximum:
Code Segment: Base=0, Limit=0xFFFFFFFF (4GB)
Data Segment: Base=0, Limit=0xFFFFFFFF (4GB)
Stack Segment: Base=0, Limit=0xFFFFFFFF (4GB)
This effectively disables segmentation—all segments cover the entire address space. Protection and isolation are then handled by paging instead. This is simpler to manage but loses some benefits of true segmentation.
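For concreteness, the classic flat-model GDT can be written down as three hand-encoded 8-byte descriptors. This is a minimal sketch; the access bytes 0x9A and 0x92 denote ring-0 code and data respectively.

```c
#include <stdint.h>

/* Flat model: base = 0, limit field = 0xFFFFF with G = 1 (page
 * granularity), giving an effective limit of 4 GB. */
static const uint64_t flat_gdt[] = {
    0x0000000000000000ULL, /* null descriptor (required first entry)      */
    0x00CF9A000000FFFFULL, /* code: base=0, limit=0xFFFFF, G=1, exec/read  */
    0x00CF92000000FFFFULL, /* data: base=0, limit=0xFFFFF, G=1, read/write */
};
```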
Many systems combined segmentation and paging. Segmentation handled logical organization (code, data, stack separation), while paging handled physical memory management (allocating RAM, swapping). The segment base pointed to a linear address, which was then translated via paging to a physical address. This two-stage translation provided the benefits of both mechanisms.
While the base and limit are separate fields, they work as a unit to define the segment's physical extent. Understanding their interaction is crucial for proper memory management.
Defining the Segment Bounds:

For an expand-up segment (normal code and data), valid offsets run from 0 through Limit, so the segment occupies physical addresses Base through Base + Limit.

For an expand-down segment (stacks), the test is inverted: valid offsets run from Limit + 1 up to the maximum offset (0xFFFF or 0xFFFFFFFF, depending on the segment's size attribute). Decreasing the limit makes more low offsets valid, which is how the stack grows downward.
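A small sketch of the two rules in C, following the same limit convention as the translation pseudocode earlier on this page:

```c
#include <stdbool.h>
#include <stdint.h>

/* For expand-up segments, `limit` is the highest valid offset.
 * For expand-down segments, `limit` is the highest *invalid* offset;
 * the upper bound (0xFFFFFFFF here) is implicit in the 32-bit type. */
static bool offset_valid(uint32_t offset, uint32_t limit, bool expand_down) {
    if (expand_down) {
        /* Valid offsets run from limit + 1 up to the maximum offset */
        return offset > limit;
    }
    /* Expand-up: valid offsets run from 0 through limit inclusive */
    return offset <= limit;
}
```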
Segment Growth Scenarios:
| Segment Type | Growth Direction | How to Grow | Constraint |
|---|---|---|---|
| Heap (Data) | Upward | Increase limit | Must not overlap next segment |
| Stack | Downward | Decrease expand-down limit | Must not overlap previous segment |
| Code | Typically fixed | Relink with larger code | Static after loading |
Growing the Heap:
When a program calls sbrk(), or when malloc() needs more space, the OS can grow the data segment by increasing its limit, provided the enlarged segment would not overlap its neighbor.
Growing the Stack:
Stack growth is more complex because of the expand-down arrangement: the OS grows the stack by decreasing the expand-down limit, which exposes new valid offsets below the current ones.
Example: Segment Collision Prevention
```c
// Check if a segment can grow without collision
bool can_grow_segment(process_t* proc, segment_id_t seg_id,
                      size_t additional_bytes) {
    segment_descriptor_t* seg = get_segment(proc, seg_id);

    // Calculate new end address
    uint32_t current_end = seg->base + seg->limit;
    uint32_t new_end = current_end + additional_bytes;

    // Check against all other segments of this process
    for (int i = 0; i < MAX_SEGMENTS; i++) {
        if (i == seg_id) continue;

        segment_descriptor_t* other = get_segment(proc, i);
        if (!other->present) continue;

        uint32_t other_start = other->base;
        uint32_t other_end = other->base + other->limit;

        // Check for overlap with grown segment
        if (new_end > other_start && seg->base < other_end) {
            return false;  // Would collide
        }
    }

    // Check against system memory regions
    if (new_end > get_max_user_address()) {
        return false;  // Would exceed user space
    }

    return true;  // Safe to grow
}

int grow_segment(process_t* proc, segment_id_t seg_id,
                 size_t additional_bytes) {
    if (!can_grow_segment(proc, seg_id, additional_bytes)) {
        // Need to relocate first
        if (!relocate_segment(proc, seg_id, additional_bytes)) {
            return -ENOMEM;  // Cannot grow
        }
    }

    segment_descriptor_t* seg = get_segment(proc, seg_id);

    // Allocate physical memory for new region
    if (!allocate_physical(seg->base + seg->limit + 1, additional_bytes)) {
        return -ENOMEM;
    }

    // Increase the limit
    seg->limit += additional_bytes;

    // Force cache refresh
    flush_segment_cache(proc, seg_id);

    return 0;  // Success
}
```

Because segments are contiguous and variable-sized, external fragmentation is inevitable. Over time, free memory becomes scattered in small chunks, none large enough for a new segment. This requires compaction (expensive) or clever allocation strategies. Paging avoids this by using fixed-size units, which is a key reason it replaced pure segmentation.
The x86 architecture provides a concrete example of how base addresses are encoded in segment descriptors. The format is notoriously complex due to historical compatibility requirements.
x86 Segment Descriptor Format (8 bytes):
┌───────────┬────────────────────────────────────────┐
│ Byte 7    │ Base[31:24]                            │
│ Byte 6    │ G | D/B | L | AVL | Limit[19:16]       │
│ Byte 5    │ P | DPL | S | Type                     │
│ Byte 4    │ Base[23:16]                            │
│ Bytes 3-2 │ Base[15:0]                             │
│ Bytes 1-0 │ Limit[15:0]                            │
└───────────┴────────────────────────────────────────┘
Base Address Fields:

The 32-bit base is split across three non-contiguous fields: Base[15:0] in bytes 2-3, Base[23:16] in byte 4, and Base[31:24] in byte 7.

To extract the 32-bit base address:
base = (descriptor[7] << 24) | (descriptor[4] << 16) | (descriptor[2] | (descriptor[3] << 8))
Why Split Across Non-Contiguous Bytes?
This strange layout is for backward compatibility with the 80286. The 286 used 6-byte descriptors with a 24-bit base. When the 386 added 32-bit addressing, the extra 8 bits of base had to go somewhere new—they were placed in what was reserved/available space in the 286 format, resulting in the split.
```c
// x86 segment descriptor manipulation

typedef struct {
    uint16_t limit_low;    // Bytes 0-1: Limit[15:0]
    uint16_t base_low;     // Bytes 2-3: Base[15:0]
    uint8_t  base_mid;     // Byte 4:  Base[23:16]
    uint8_t  access;       // Byte 5:  Access byte (P, DPL, S, Type)
    uint8_t  granularity;  // Byte 6:  Limit[19:16], flags
    uint8_t  base_high;    // Byte 7:  Base[31:24]
} __attribute__((packed)) gdt_entry_t;

// Extract 32-bit base address from descriptor
static inline uint32_t get_descriptor_base(const gdt_entry_t* desc) {
    return (uint32_t)desc->base_low |
           ((uint32_t)desc->base_mid << 16) |
           ((uint32_t)desc->base_high << 24);
}

// Set base address in descriptor
static inline void set_descriptor_base(gdt_entry_t* desc, uint32_t base) {
    desc->base_low  = (uint16_t)(base & 0xFFFF);
    desc->base_mid  = (uint8_t)((base >> 16) & 0xFF);
    desc->base_high = (uint8_t)((base >> 24) & 0xFF);
}

// Extract limit (20-bit, not considering granularity)
static inline uint32_t get_descriptor_limit(const gdt_entry_t* desc) {
    return (uint32_t)desc->limit_low |
           (((uint32_t)desc->granularity & 0x0F) << 16);
}

// Create a complete segment descriptor
void create_gdt_entry(gdt_entry_t* desc, uint32_t base, uint32_t limit,
                      uint8_t access, uint8_t flags) {
    // Set base (split across 3 fields)
    set_descriptor_base(desc, base);

    // Set limit (split across 2 fields)
    if (limit > 0xFFFFF) {
        // Use page granularity if limit > 1MB
        limit >>= 12;
        flags |= 0x80;  // Set G bit
    }
    desc->limit_low   = limit & 0xFFFF;
    desc->granularity = (limit >> 16) & 0x0F;
    desc->granularity |= (flags & 0xF0);

    // Set access byte
    desc->access = access;
}
```

In x86-64 long mode, the descriptor format changes for system segments (TSS, LDT), which expand to 16 bytes to accommodate 64-bit base addresses. However, code and data segment descriptors remain 8 bytes, and the base field is largely ignored—the CPU forces base=0 for user-mode segments in 64-bit mode, implementing a flat memory model.
Even in modern 64-bit systems that use a flat memory model, segment bases find an important use: Thread-Local Storage (TLS). The FS and GS segment registers retain functional bases, allowing per-thread and per-CPU data access.
Why TLS Needs Segment Bases:
In a multithreaded program, some data needs to be per-thread: the C library's errno, the stack-protector canary, and any variable declared with __thread or thread_local.

Without segment bases, accessing this data would require finding the current thread on every access: a function call or a lookup in some global thread table, just to locate the right copy of the variable.
With segment bases, the process is much simpler:

MOV EAX, FS:[0x10]

TLS Location With FS/GS:
Thread 1: GS Base = 0x7F1234560000 → Thread 1's TLS Block
Thread 2: GS Base = 0x7F1234570000 → Thread 2's TLS Block
Thread 3: GS Base = 0x7F1234580000 → Thread 3's TLS Block
All threads use identical code:
MOV RAX, GS:[0x28] ; Read thread-local variable at offset 0x28
Each thread gets its own value based on its GS base
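In C, this is exactly what __thread compiles down to. A minimal POSIX-threads example; on x86-64 Linux, gcc and clang typically emit an FS-relative access for the increment, matching the table that follows.

```c
/* build with: cc -pthread tls_demo.c */
#include <pthread.h>
#include <stdio.h>

__thread int per_thread_counter = 0;  /* one copy per thread */

static void* worker(void* arg) {
    (void)arg;
    per_thread_counter++;  /* compiles to an FS-relative load/store */
    printf("counter = %d\n", per_thread_counter);  /* always prints 1 */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}
```

Each thread sees counter = 1 because each increment touches that thread's own copy, located through its own FS base.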
| OS | FS Register | GS Register |
|---|---|---|
| Linux (user) | Thread pointer (glibc TLS) | Not typically used |
| Linux (kernel) | Not used | Per-CPU data (via SWAPGS) |
| Windows (user) | TEB on 32-bit Windows | TEB on 64-bit Windows |
| Windows (kernel, x64) | Not used | Processor Control Region (KPCR) |
| macOS (x86-64) | Not typically used | Thread-local storage |
Setting the FS/GS Base:
Modern x86-64 CPUs provide special MSRs (Model Specific Registers) to set the FS and GS bases without going through the GDT: IA32_FS_BASE and IA32_GS_BASE, plus IA32_KERNEL_GS_BASE, which the SWAPGS instruction exchanges with IA32_GS_BASE on kernel entry.
The WRFSBASE and WRGSBASE instructions (if enabled) allow user-mode code to set these bases directly, improving thread creation performance.
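On x86-64 Linux, for example, user code can set these bases through the arch_prctl system call (the kernel writes the MSR, or uses WRFSBASE/WRGSBASE, on the caller's behalf). A minimal sketch:

```c
#include <asm/prctl.h>    /* ARCH_SET_GS, ARCH_GET_GS */
#include <sys/syscall.h>
#include <unistd.h>
#include <stdint.h>
#include <stdio.h>

static uint64_t gs_block[8];  /* stand-in for a per-thread data block */

int main(void) {
    /* Point the GS base at our block */
    syscall(SYS_arch_prctl, ARCH_SET_GS, (unsigned long)gs_block);

    /* Read it back to confirm */
    unsigned long base = 0;
    syscall(SYS_arch_prctl, ARCH_GET_GS, &base);
    printf("GS base = %#lx\n", base);
    return 0;
}
```

After this call, instructions like MOV RAX, GS:[0x8] address gs_block directly, with no table lookup.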
Security Implications:
The GS-relative access pattern is used for stack protector (canary) checks:
; Function prologue with stack protector
function_start:
sub rsp, 0x28
mov rax, gs:[0x28] ; Load canary from TLS
mov [rsp+0x20], rax ; Store on stack
; ... function body ...
; Function epilogue
mov rax, [rsp+0x20] ; Retrieve canary from stack
xor rax, gs:[0x28] ; Compare with TLS value
jnz __stack_chk_fail ; If different, buffer overflow detected
add rsp, 0x28 ; Release the stack frame
ret
Using segment bases for TLS access is extremely efficient—it's just a memory load with an implied base. No table lookups, no function calls, no global variable access to find the current thread. This matters because TLS is accessed frequently: errno checks, stack canary verification on every function return, and thread-local variables throughout the code.
We've explored the segment base address from every angle: mathematical foundations, hardware implementation, OS usage, and modern applications. The key insights:

- Every memory access is translated as Physical Address = Segment Base + Offset, performed in hardware by the MMU.
- Because programs deal only in offsets, the OS can relocate a segment simply by updating its base.
- Pointing multiple descriptors at the same base enables zero-copy sharing, with per-process permissions.
- Hidden descriptor caches in the segment registers make translation a single addition per access.
- Even in flat-model 64-bit systems, the FS and GS bases survive to serve thread-local storage.
What's Next:
Having mastered the base address, we'll now examine the segment limit in comparable depth. The limit field defines segment boundaries and enables hardware bounds checking—a critical protection mechanism that catches buffer overflows and other memory errors at the point of access.
You now have a complete understanding of segment base addresses—their role in address translation, how they enable relocation and sharing, their implementation in hardware, and their continued relevance for thread-local storage. This knowledge is essential for systems programming and OS internals.