When you write a program, you don't think of memory as a flat, undifferentiated array of bytes. You think in terms of structure: your code goes in one place, your global variables in another, your stack grows downward, your heap grows upward, and your dynamically loaded libraries occupy their own regions. This is the programmer's natural view of memory—organized, structured, and meaningful.
But paging, despite its elegance in solving fragmentation, imposes a fundamentally different view. To a paging system, memory is a uniform collection of fixed-size pages—there's no distinction between a page holding code and one holding data. The logical structure of your program is invisible to the hardware.
Segmentation bridges this gap. It organizes memory into logical segments—variable-sized blocks that correspond to the meaningful units of a program. Each segment represents a complete, logically distinct unit: the main program's code, a library module, a symbol table, a stack, or a heap. This alignment between memory organization and program structure is what makes segmentation compelling.
By the end of this page, you will understand: what logical segments are and why they exist, the fundamental distinction between segments and pages, how segments reflect program structure, the historical motivation for segmentation, addressing within segmented memory, and the relationship between segmentation and other memory management techniques.
A logical segment is a contiguous block of memory that represents a complete, meaningful unit of a program. Unlike pages, which are arbitrary fixed-size chunks created for hardware convenience, segments correspond to the logical divisions that programmers naturally create when writing software.
Consider a typical C program. When compiled and loaded, it consists of several distinct logical units: a text segment holding machine instructions, a data segment holding initialized globals, a BSS segment holding uninitialized globals, a heap for dynamic allocations, and a stack for call frames and local variables.
Each of these is a logical segment—a coherent unit with its own purpose, access patterns, and lifetime. The key insight of segmentation is that memory management should respect these logical boundaries.
The Formal Definition:
Formally, a segment is defined by a tuple (segment_number, base_address, limit). The segment number uniquely identifies the segment within the process's address space. The base address indicates where the segment begins in physical memory. The limit specifies the segment's size, ensuring access remains within bounds.
An address in a segmented system is a two-component address: (segment_number, offset). To access memory, the hardware uses the segment number to index the segment table, retrieves that segment's base and limit, checks that the offset is less than the limit (trapping if it is not), and adds the offset to the base to form the physical address.
This two-dimensional addressing scheme is fundamental to how segmentation represents the programmer's view of memory.
The term 'segment' comes from the idea of dividing (segmenting) a program into its natural parts. Just as a worm's body has segments that are complete functional units, a program has segments that are complete logical units. This biological metaphor captures the essential idea: each segment is complete and meaningful on its own.
Segmentation emerged in the 1960s as computer scientists grappled with fundamental questions about memory organization. The systems of that era faced challenges that made segmentation a natural solution.
The Problem with Flat Address Spaces
Early computers used flat, one-dimensional address spaces. A program was loaded starting at some base address, and all addresses were relative to that base. This simplicity came with significant problems: code, data, and stack could not be given different protections; relocating a loaded program meant rewriting every embedded address; sharing a routine between programs was impractical; and a stray pointer could silently corrupt any part of the program.
| System | Year | Segmentation Innovation | Impact |
|---|---|---|---|
| Burroughs B5000 | 1961 | Tagged architecture with code/data segments | Pioneered structured memory |
| Multics | 1965 | Segments + pages, rich segment attributes | Defined modern segmentation concepts |
| Intel 8086 | 1978 | Four 64KB segments (CS, DS, SS, ES) | Brought segmentation to microprocessors |
| Intel 80286 | 1982 | Protected mode with segment descriptors | Added hardware protection to segments |
| Intel 80386 | 1985 | Segments + paging combined | Full modern implementation |
The Multics Vision
The most influential early segmented system was Multics (Multiplexed Information and Computing Service), developed at MIT starting in 1964. Multics introduced a revolutionary concept: treat all of memory as a collection of named segments that persist independently of processes.
In Multics, segments were named and persisted independently of any process; files and memory were unified, since a file was simply a segment that could be mapped and addressed directly; and each segment carried its own access attributes and protection domain.
This vision was so ambitious that Multics was considered overengineered by some, leading Ken Thompson and Dennis Ritchie to create Unix as a simpler alternative. Yet the concepts pioneered by Multics—segments with attributes, protection domains, and the unification of files and memory—remain influential today.
When Intel designed the 8086 processor in 1978, they needed to address more than 64KB of memory with 16-bit registers. Their solution: use segment registers that provide a base address, shifted left by 4 bits, added to an offset. This allowed 1MB addressing (20 bits) with 16-bit components. While pragmatic, this design choice forced generations of programmers to wrestle with 'near' and 'far' pointers—a consequence of hardware segmentation meeting real-world constraints.
Understanding the distinction between segments and pages is crucial for grasping why both exist and how they can complement each other. These two approaches to memory organization represent fundamentally different philosophies.
The Philosophical Divide
The segment-page dichotomy reflects a deeper tension in systems design: logical organization vs. physical efficiency.
Segmentation says: "Memory should be organized the way programmers think. Give each logical unit its own space, let it grow as needed, protect it according to its purpose."
Paging says: "Memory should be organized for efficient use. Divide everything into uniform chunks that can be shuffled, swapped, and managed without fragmentation."
Neither view is wrong—they're solving different problems. The insight of modern systems is that they can be combined: use segmentation at the logical level (for protection, sharing, and programmer convenience) and paging at the physical level (for memory management efficiency).
Where Each Excels:
| Aspect | Segmentation Better | Paging Better |
|---|---|---|
| Matching program structure | ✓ | |
| Eliminating external fragmentation | | ✓ |
| Enabling sharing by semantic unit | ✓ | |
| Simplifying memory management | | ✓ |
| Fine-grained protection | ✓ | |
| Supporting virtual memory | | ✓ |
| Handling variable-size data | ✓ | |
Because segments have variable sizes and live in physical memory, allocation and deallocation create external fragmentation—scattered free regions that can't satisfy large requests even when total free memory is sufficient. This problem is why pure segmentation is rarely used alone; combining with paging eliminates external fragmentation while preserving segmentation's logical benefits.
In a segmented memory system, every address consists of two components: a segment selector (or segment number) and an offset within that segment. This two-dimensional addressing is fundamental to how segmentation works.
```c
#include <stdint.h>

// Conceptual representation of a segmented address
// In a segmented system, addresses have two components:

// Logical Address Structure
struct logical_address {
    uint16_t segment;  // Segment selector (which segment)
    uint32_t offset;   // Offset within the segment
};

// Example: Address "segment 3, offset 0x1A4"
// This means: byte 0x1A4 within the 3rd segment

// The hardware performs translation:
// 1. Look up segment 3 in the segment table
// 2. Get the base address of segment 3 (e.g., 0x4000)
// 3. Check that offset (0x1A4) < segment limit
// 4. Physical address = base + offset = 0x4000 + 0x1A4 = 0x41A4

// In Intel x86 notation, this might be written as:
// segment:offset -> CS:0x1A4 (for code segment)
// segment:offset -> DS:0x400 (for data segment)
```

Advantages of Two-Dimensional Addressing
Natural Separation: Different segments occupy different address spaces. A reference to "segment 2, offset 100" and "segment 7, offset 100" access completely different memory locations, even though the offsets are identical.
Independent Relocation: Each segment can be moved in physical memory independently. Only the segment table entry needs updating; all offsets within the segment remain valid.
Natural Bounds Checking: Each segment has an associated limit. Any access beyond this limit generates a hardware trap, catching buffer overflows and pointer errors at the source.
Meaningful Addresses: Addresses carry semantic information. "Code segment, offset X" means instruction at position X. "Stack segment, offset Y" means stack location Y. This aids debugging and security.
The Translation Process
When a program issues a memory reference, the following occurs:

1. The address is split into a segment number and an offset.
2. The segment number indexes the segment table (or selects a cached segment descriptor), yielding the segment's base, limit, and permissions.
3. The offset is compared against the limit; an out-of-bounds access raises a trap.
4. The access type (read, write, or execute) is checked against the segment's permissions.
5. The base and offset are added to produce the physical address.
In many segmented architectures (like Intel x86), common instructions use implicit segment registers. Code fetches automatically use the Code Segment (CS), stack operations use the Stack Segment (SS), and most data references use the Data Segment (DS). Programmers can override these defaults for specific accesses, but the implicit mapping reduces the burden of managing segments in everyday code.
The fundamental insight of segmentation is that programs are not amorphous blobs of data—they have structure. Segmentation makes this structure visible to the hardware, enabling memory management that respects the program's logical organization.
A Typical Program's Segment Structure:
When a program is compiled and linked, it naturally divides into segments that reflect different purposes and access patterns:
| Segment | Contents | Access | Lifespan | Growth |
|---|---|---|---|---|
| Text/Code | Machine instructions | Execute + Read | Static (process lifetime) | Never changes |
| Data (initialized) | Global/static vars with initial values | Read + Write | Static (process lifetime) | Never changes |
| BSS | Uninitialized global/static vars | Read + Write | Static (process lifetime) | Never changes |
| Heap | Dynamic allocations (malloc) | Read + Write | Dynamic | Grows upward on demand |
| Stack | Local vars, call frames, return addrs | Read + Write | Dynamic | Grows downward on call, shrinks on return |
| Shared libs | Dynamically linked library code/data | Varies by section | Process lifetime (ref-counted) | Never changes after load |
Why This Structure Matters for Memory Management:
Different Protection Requirements
Different Sharing Potential
Different Growth Patterns
Different Lifetime Requirements
```c
#include <stdlib.h>  // for malloc/free

// Example: How a C program maps to segments
// Consider this simple program:

int initialized_global = 42;    // DATA segment (has initial value)
int uninitialized_global;       // BSS segment (zero-initialized)
const char* message = "Hello";  // DATA segment (pointer + string in RODATA)

void helper_function() {        // TEXT segment
    int local_var;              // STACK segment (created at runtime)
    // ...
}

int main() {                    // TEXT segment
    int* heap_array;            // STACK segment (the pointer itself)
    heap_array = malloc(100 * sizeof(int)); // HEAP segment allocation
    helper_function();
    free(heap_array);           // Returns memory to HEAP
    return 0;
}

// Memory layout (approximate):
//
// High addresses ┌─────────────────┐
//                │      STACK      │ ← grows downward
//                │   (local_var,   │
//                │   heap_array)   │
//                ├─────────────────┤
//                │        ↓        │
//                │  (free space)   │
//                │        ↑        │
//                ├─────────────────┤
//                │      HEAP       │ ← grows upward
//                │ (malloc'd data) │
//                ├─────────────────┤
//                │       BSS       │ ← uninitialized_global
//                ├─────────────────┤
//                │      DATA       │ ← initialized_global
//                ├─────────────────┤
//                │     RODATA      │ ← "Hello" string
//                ├─────────────────┤
//                │      TEXT       │ ← main, helper_function
// Low addresses  └─────────────────┘
```

On Unix-like systems, executable files use ELF (Executable and Linkable Format), which explicitly defines program segments. When you run 'readelf -l program', you see the program headers describing each segment: its type, virtual address, physical address, file size, memory size, and flags. These ELF segments directly correspond to the logical segments loaded into memory.
One of segmentation's most elegant features is how naturally it enables memory sharing. Because segments correspond to logical program units, sharing segments means sharing meaningful components—not arbitrary pages that happen to overlap.
The Sharing Scenario:
Consider 50 users all running the same text editor. Without sharing, each process loads its own copy of the editor's code, so 50 identical, read-only copies of the same instructions occupy memory.
With segment sharing, one physical copy of the code segment serves all 50 users: every process's segment table points to the same physical memory, and only the private data and stack segments are duplicated.
How Sharing Works:
Segment Table Entries Point to Same Physical Memory
Reference Counting
Copy-on-Write for Data Segments
What Can Be Shared:
| Segment Type | Sharable? | Notes |
|---|---|---|
| Code (Text) | Always | Multiple readers, no writers |
| Read-only Data | Always | Constant strings, lookup tables |
| Initialized Data | With COW | Each process gets private copy on write |
| BSS | With COW | Copy only the portion written |
| Heap | No | Process-private by nature |
| Stack | Never | Fundamental to process identity |
| Shared Libraries | Code: Yes, Data: COW | This is their purpose |
The C library (libc) is used by nearly every program on a Unix system. With segmentation, one copy of libc's code resides in memory, shared by hundreds of processes. The memory savings are enormous. This is why shared libraries are called 'shared'—they share segments across process boundaries, not just share access to the same file on disk.
Protection in a segmented system is remarkably natural because access control aligns with program structure. Each segment can have its own access permissions, and those permissions make semantic sense.
Segment-Level Protection Attributes: a segment descriptor typically carries Read, Write, and Execute permissions, a privilege level (the DPL on x86), and a present/valid bit, so each segment's protections can match its purpose.
Protection Scenarios:
Scenario 1: Preventing Code Injection
// Attacker tries to execute data as code
// Data segment has permissions: Read, Write, NO Execute
// Hardware blocks execution attempt → Protection Fault
Scenario 2: Preventing Code Modification
// Bug or attack tries to overwrite code
// Code segment has permissions: Read, Execute, NO Write
// Hardware blocks write attempt → Protection Fault
Scenario 3: Protecting Kernel Segments
// User process tries to access kernel data
// Kernel segment DPL = 0, User process CPL = 3
// DPL < CPL → General Protection Fault
Scenario 4: Stack Smashing Protection
// Buffer overflow on stack injects code
// Stack segment: Read, Write, NO Execute
// Even if injected, code cannot execute
These protections happen in hardware at every memory access, with no software overhead. The segment descriptor is cached in the segment register, so protection checks add zero cycles to most memory operations.
For years, Intel x86 processors lacked a no-execute bit for pages (x86 segmentation had E/X permission, but widely-used flat memory models bypassed it). The AMD64 architecture finally added the NX (No-eXecute) bit, and Intel followed with XD (eXecute Disable). This seemingly small addition dramatically improved security by allowing operating systems to mark data regions as non-executable at the page level, complementing segment-level protections.
You might wonder: if paging solved fragmentation so elegantly, why do modern systems still use segments? The answer is nuanced. Pure segmentation has largely given way to paging, but segmentation concepts persist in important ways.
Current State of Segmentation:
| System/Architecture | Segmentation Role | Details |
|---|---|---|
| x86-64 (Long Mode) | Vestigial | Base fixed at 0, limit disabled; FS/GS used for TLS |
| Linux (x86-64) | Minimal | Uses FS for thread-local storage, GS for per-CPU data |
| Windows (x86-64) | Minimal | Similar TLS use; segments effectively unused for MM |
| ARM (64-bit) | None | Pure paging with no segmentation hardware |
| RISC-V | None | Clean paging design, no segmentation |
| WebAssembly | Conceptual | Linear memory with bounds checking echoes segmentation |
Why Segmentation Retreated: variable-sized segments cause external fragmentation; compilers and languages came to assume a flat address space, leaving segment registers unused; two-dimensional addresses complicate pointer arithmetic; and paging alone proved sufficient for protection and virtual memory, so newer architectures such as ARM and RISC-V never implemented segmentation hardware at all.
Where Segmentation Concepts Survive:
Even though hardware segmentation has faded, operating systems maintain segment-like abstractions internally. Linux's vm_area_struct describes contiguous regions with consistent permissions—essentially software segments. The logical concepts of segmentation remain valuable; only the hardware implementation has shifted to paging.
This page has provided a comprehensive exploration of logical segments as the foundational concept of memory segmentation. Let's consolidate the key insights: segments are variable-sized blocks that mirror a program's logical units; addresses are two-dimensional (segment, offset) pairs translated through a per-process segment table; segments enable natural sharing and semantically meaningful protection; their variable size causes external fragmentation, which paging avoids; and modern systems keep segmentation's logical concepts in software while using paging hardware underneath.
What's Next:
With the theoretical foundation established, the next page examines the specific segment types found in typical programs: code segments, data segments, and stack segments. We'll explore how these segments differ in their contents, access patterns, and management requirements—understanding that forms the basis for practical segmentation implementation.
You now understand what logical segments are, why they exist, and how they provide a programmer's view of memory organization. This foundational knowledge prepares you both for understanding specific segment types and for appreciating how segmentation combines with paging in modern systems.