One of segmentation's most defining characteristics is that segments can be any size. A segment holding a single global variable might be 4 bytes. A segment containing a massive data structure could be hundreds of megabytes. There's no fixed quantum, no power-of-two requirement, no hardware-imposed granularity.
This flexibility is both segmentation's greatest strength and its most significant challenge.
The strength: Segments match the actual size of logical program components. No wasted space from rounding to page boundaries. No artificial splitting of coherent units. The memory organization perfectly reflects the program's structure.
The challenge: Variable-size allocation inevitably leads to external fragmentation. As segments of different sizes are allocated and freed, the memory becomes a patchwork of free and used regions. Eventually, you may have plenty of free memory in total, but no single contiguous region large enough for a new segment.
This page explores variable segment sizes in depth: why they matter, how they're managed, and what consequences they have for system design.
By the end of this page, you will understand: why segments have variable sizes while pages are fixed, the advantages of variable sizing for different segment types, the mathematics of segment size distribution, how variable sizes lead to external fragmentation, strategies for managing variable-size allocation, the trade-offs between segments and pages regarding size flexibility, and modern approaches to the fragmentation problem.
The variable size of segments isn't an arbitrary design choice—it emerges directly from segmentation's goal of matching memory organization to program structure. Programs don't come in uniform chunks, so why should their memory?
Program Components Have Diverse Sizes:
Consider a typical application's logical components:
| Component | Typical Size Range | Why This Size? |
|---|---|---|
| Small utility function | 50-200 bytes | Few instructions, simple logic |
| String constants table | 1-10 KB | Collection of error messages, labels |
| Main program module | 100 KB - 1 MB | Core application logic |
| Static data tables | 1 KB - 100 MB | Lookup tables, databases |
| Stack | 64 KB - 8 MB | Depends on recursion depth, local variables |
| Heap | 1 MB - many GB | Dynamic allocation, grows as needed |
| Shared library code | 100 KB - 10 MB | libc, GUI libraries, etc. |
| Memory-mapped file | bytes - GB | Depends entirely on file size |
The Mismatch with Fixed-Size Allocation:
If we were forced to use fixed-size units (like 4KB pages), these diverse components would suffer:
Small segments waste space: A 50-byte function occupies a full 4KB page. That's 98.8% waste—internal fragmentation.
Large segments get fragmented: A 10MB data table becomes 2,560 separate pages. The logical unity is lost; the pages might be scattered throughout physical memory.
Semantic boundaries are invisible: The system can't distinguish the page holding code from the page holding data. Protection must be applied per-page, not per-unit.
Variable sizing aligns with reality: A 50-byte segment takes 50 bytes. A 10MB segment takes 10MB. No artificial padding, no arbitrary splitting. The memory layout matches the program's structure.
Pages are sized for hardware efficiency—power of two, matching cache lines and TLB entries. Segments are sized for logical meaning—matching the actual size of program components. Neither is wrong; they serve different purposes. Modern systems often combine both: segments for logical organization, pages for physical memory management within segments.
Unlike pages (whose size is fixed by hardware), segment sizes are determined by the program's structure. Different segments get their sizes from different sources:
Compile-Time Determined Sizes:
Many segment sizes are fixed when the program is compiled and linked: the code segment (.text) is the total size of all compiled functions, the initialized-data segment (.data) is the total size of all initialized globals, and the zero-initialized segment (.bss) is the total size of uninitialized statics.
Runtime Determined Sizes:
Other segments have sizes that aren't known until the program runs: the stack depends on call depth and local variables, the heap depends on dynamic allocation, and memory-mapped regions depend on the sizes of the files mapped.
```c
// Example: Where segment sizes come from

// CODE SEGMENT SIZE = sum of all compiled functions
// Typically 10s of KB to MBs depending on program complexity

// DATA SEGMENT SIZE
int initialized_array[1000] = {1, 2, 3, /* ... */ };  // 4000 bytes
double coefficients[100] = {1.0, 2.0, /* ... */ };    // 800 bytes
char greeting[] = "Hello, World!";                    // 14 bytes
// .data size = 4814 bytes (plus padding/alignment)

// BSS SEGMENT SIZE
int uninitialized_buffer[10000];  // 40000 bytes (not in executable)
static double workspace[5000];    // 40000 bytes (not in executable)
// .bss size = 80000 bytes (stored as just the number in executable)

// STACK SEGMENT SIZE
// Initial: 8 MB (typical default)
// Grows as needed, up to ulimit -s maximum
void recursive_function(int depth) {
    char local_buffer[4096];  // 4KB per call on stack
    if (depth > 0) {
        recursive_function(depth - 1);  // Stack grows with each call
    }
}
// 1000-deep recursion = ~4MB stack usage

// HEAP SEGMENT SIZE
// Starts at 0, grows with allocations
void* data = malloc(1024 * 1024);  // Request 1MB
// Heap grows to at least 1MB (plus allocator overhead)
```

When you compile a program, the resulting executable file (ELF on Linux, PE on Windows) contains headers specifying each segment's size. For .text and .data, both file size and memory size are recorded. For .bss, only the memory size is stored—there's no point writing zeros to the file. Use 'readelf -l program' to see these sizes.
Managing variable-size segments isn't just an engineering challenge—it's a mathematical one. Understanding the mathematical properties of segment allocation helps predict system behavior and design better algorithms.
The Fragmentation Problem Formally:
Consider a memory of size M that has been subjected to a sequence of allocations and deallocations. At any moment, some total amount A is allocated and the remaining F = M − A is free, scattered across some number of holes.
Even when F is large, if it's scattered across many holes, we may fail to allocate a segment of size S < F because no single hole is large enough.
The 50-Percent Rule:
A remarkable result from Knuth's steady-state analysis (1973), known as the fifty-percent rule:
Under steady-state allocation/deallocation with First Fit, if N blocks are allocated, approximately N/2 holes exist.
If you have 1000 allocated segments, expect ~500 holes interspersed among them. A related corollary—the unused-memory rule—shows that under the same assumptions roughly one-third of memory can end up lost to holes. This wasted space is external fragmentation, unavoidable with variable-size allocation.
| State | Allocated Segments | Holes | Total Free | Largest Hole |
|---|---|---|---|---|
| Initial | 0 | 1 | 1000 KB | 1000 KB |
| After 10 allocations | 10 | ~5 | 700 KB | ~200 KB |
| After 50 alloc/dealloc cycles | 25 | ~12 | 500 KB | ~50 KB |
| After 200 cycles | 50 | ~25 | 400 KB | ~20 KB |
| Heavily fragmented | 100 | ~50 | 300 KB | ~5 KB |
Size Distribution Matters:
The distribution of segment sizes dramatically affects fragmentation:
Uniform sizes: If all segments are the same size, fragmentation is minimal. Each freed spot perfectly fits a new segment.
Wide size variance: If segments range from 10 bytes to 10 MB, fragmentation is severe. Large segments leave small unusable holes when freed; small segments can't fill large holes efficiently.
Power-law distribution: Real workloads often follow power-law distributions—many small segments, few large ones. This causes particular fragmentation patterns that allocation algorithms must handle.
Best vs. Worst Case:
Best case for fragmentation:
[ A ][ B ][ C ][ D ][ E ] // Contiguous, no holes
Worst case for fragmentation:
[ A ][ ][ B ][ ][ C ][ ][ D ][ ][ E ] // Alternating, maximum holes
Typical long-running system:
[ A ][ ][ B ][ ][C][ D ][ ][ E ] // Mixed, some holes
No allocation algorithm can completely prevent external fragmentation with variable-size segments. The 50-percent rule is a statistical reality. Systems must either: (1) tolerate fragmentation and its consequences, (2) periodically compact memory (expensive), or (3) use fixed-size allocation like paging. Pure segmentation always fights this battle.
When a new segment needs to be allocated, the system must find a suitable free region. With variable-size segments and potentially many holes, this decision requires a placement algorithm.
Classic Placement Algorithms:
| Algorithm | Description | Time Complexity | Fragmentation Tendency |
|---|---|---|---|
| First Fit | Use first hole large enough | O(n) worst case | Good—tends to leave large holes at end |
| Next Fit | Like First Fit, but start from last allocation | O(n) worst case | Moderate—spreads allocation evenly |
| Best Fit | Use smallest hole that fits | O(n) or O(log n) with tree | Poor—leaves many tiny unusable holes |
| Worst Fit | Use largest hole available | O(n) or O(1) with heap | Poor—fragments large holes quickly |
First Fit in Detail:
First Fit is often the best general-purpose choice. It searches from the beginning of memory and uses the first hole large enough to hold the segment.
Memory: [Used][ Free 100KB ][Used][ Free 50KB ][Used]
Allocate 30KB segment:
- Check first hole: 100KB ≥ 30KB ✓
- Allocate 30KB at start of hole
- Result: [Used][New][Free 70KB][Used][Free 50KB][Used]
Advantages: simple to implement, fast in practice (the search stops at the first match), and it tends to leave the large holes at the high end of memory intact.
Disadvantages: small leftover fragments accumulate near the start of memory, so searches slow down over time as they wade through holes that are too small to use.
Best Fit in Detail:
Best Fit searches all holes and uses the one closest in size to the request.
Memory: [Used][Free 100KB][Used][Free 50KB][Used][Free 35KB][Used]
Allocate 30KB segment:
- Check all holes: 100KB, 50KB, 35KB
- Best fit: 35KB (closest to 30KB)
- Result: [Used][Free 100KB][Used][Free 50KB][Used][New][Free 5KB][Used]
The 5KB leftover is likely too small to be useful—creating a tiny, unusable fragment. This is Best Fit's fatal flaw.
```c
// Conceptual implementation of segment allocation algorithms
#include <stddef.h>

typedef struct hole {
    size_t address;
    size_t size;
    struct hole* next;
} Hole;

Hole* free_list;  // Linked list of holes

void remove_hole(Hole* h);  // Unlinks a hole from free_list (not shown)

// First Fit: use first hole that's large enough
void* first_fit(size_t request_size) {
    Hole* h = free_list;
    while (h != NULL) {
        if (h->size >= request_size) {
            void* address = (void*)h->address;
            // Reduce hole size, or remove it if fully used
            h->address += request_size;
            h->size -= request_size;
            if (h->size == 0) {
                remove_hole(h);  // Remove empty hole from list
            }
            return address;
        }
        h = h->next;
    }
    return NULL;  // No suitable hole found
}

// Best Fit: use smallest hole that's large enough
void* best_fit(size_t request_size) {
    Hole* best = NULL;
    Hole* h = free_list;
    while (h != NULL) {
        if (h->size >= request_size) {
            if (best == NULL || h->size < best->size) {
                best = h;  // Found a better fit
            }
        }
        h = h->next;
    }
    if (best != NULL) {
        void* address = (void*)best->address;
        best->address += request_size;
        best->size -= request_size;
        if (best->size == 0) {
            remove_hole(best);
        }
        return address;
    }
    return NULL;  // No suitable hole found
}
```

Simulation studies consistently show First Fit outperforming Best Fit despite intuition suggesting otherwise. First Fit's speed advantage (stopping early) and tendency to preserve large holes typically result in better overall memory utilization than Best Fit's creation of tiny fragments.
Variable-size segments bring another challenge: some segments need to change size during execution. The stack grows and shrinks with function calls. The heap grows with allocations. How do we handle segments that don't stay put?
The Growth Challenge:
When a segment needs to grow, there are three possibilities:
The adjacent space is free: Simply extend the segment into the neighboring hole. Best case—cheap and efficient.
The adjacent space is occupied: The segment cannot grow in place. Options: relocate the entire segment to a larger hole elsewhere, move the neighboring segment out of the way, or deny the growth request.
No large enough hole exists anywhere: Memory exhaustion. Must free something, swap something out, or fail.
Strategies for Growable Segments:
1. Reserve Extra Space: Allocate more than initially needed, anticipating growth. The stack often starts with 8MB even if initially using only KB. The downside: wasted space if the segment doesn't grow.
2. Direction Convention: Stack grows down, heap grows up. By placing them at opposite ends of the address space, they can grow toward each other without conflict.
High ┌───────────────┐
│ Stack │ grows ↓
│ ↓ │
├───────────────┤
│ │
│ (free) │
│ │
├───────────────┤
│ ↑ │
│ Heap │ grows ↑
Low └───────────────┘
3. Virtual Address Space Buffer: Reserve a large virtual address range but don't allocate physical memory until needed. The segment can grow into the reserved range, and the OS maps physical pages on demand.
4. Segment Relocation: If a segment must move, update its base address in the segment table. All existing references remain valid because they're (segment, offset) pairs—the offset doesn't change.
Shrinkage:
Shrinking is easier than growing: the space given up at the end of the segment simply becomes a new hole (merged with any adjacent hole), and nothing has to move.
Modern systems largely avoid segment growth problems by using virtual memory. Each process has a huge virtual address space (128TB on x86-64). 'Growth' just means mapping more physical pages into already-reserved virtual addresses. There's no need to find contiguous physical memory or relocate anything.
When external fragmentation becomes severe—memory is 40% free but no hole exceeds 10KB—the most direct solution is compaction: moving all segments to one end of memory, consolidating all free space into one contiguous region.
The Compaction Process:
Before Compaction:
[A][ ][B][ ][ C ][ ][ D ][ ][ E ][ ]
After Compaction:
[A][B][ C ][ D ][ E ][ ]
All segments are now contiguous, leaving one large free block at the end.
The Cost of Compaction:
Compaction is expensive. Consider a system with 16GB of RAM, half containing segments: copying 8GB of data at a memory bandwidth on the order of 10 GB/s takes roughly a second of pure data movement; during that time the segments being moved cannot be used, and every relocated segment's base address must be updated in the segment tables.
When to Compact:
Given the cost, compaction should be infrequent: trigger it only when an allocation fails despite sufficient total free memory, when fragmentation crosses a threshold (say, the largest hole falls below some fraction of total free space), or during idle periods when the pause is least disruptive.
Alternatives to Full Compaction:
Partial Compaction: Move only some segments to free up enough space for a pending allocation. Less disruptive than full compaction.
Incremental Compaction: Move one segment at a time during idle periods. Gradually improves fragmentation without long freezes.
Eager Coalescing: When a segment is freed, immediately merge its space with adjacent holes. Prevents fragmentation buildup.
Paging eliminates the need for compaction because pages are fixed-size. Any free frame can hold any page—there's no 'wrong size' problem. When memory is fragmented into 1000 free frames scattered everywhere, that's fine; we can allocate 1000 pages. This is segmentation's primary disadvantage: variable sizes require compaction or suffer fragmentation.
The tension between variable-size segments (good for logical organization, bad for fragmentation) and fixed-size pages (bad for logical organization, good for fragmentation) led to a hybrid solution: segmented paging.
The Insight:
Why choose? Use segments for the programmer's view and pages for physical memory management: each segment is divided into fixed-size pages, and each segment gets its own page table mapping those pages to physical frames.
This gives: variable-size logical segments (no artificial splitting of program components), fixed-size physical allocation (no external fragmentation), and per-segment protection layered on per-page mapping.
How It Works:
A logical address has three components: (segment, page, offset)
Example:
Logical Address: (Segment 2, Page 5, Offset 0x100)
Segment Table:
[0] → Page Table at 0x5000 (for Code segment)
[1] → Page Table at 0x6000 (for Data segment)
[2] → Page Table at 0x7000 (for Stack segment) ← Use this
Stack Segment's Page Table (at 0x7000):
[0] → Frame 12
[1] → Frame 89
[2] → Frame 45
[3] → Frame 3
[4] → Frame 77
[5] → Frame 47 ← Use this
...
Physical Address: Frame 47 base + 0x100
= (47 × 4096) + 256
= 192512 + 256 = 192768
Intel x86 Protected Mode:
The Intel 80386 and its successors use exactly this scheme: the segmentation unit first translates a logical (segment, offset) address into a linear address, and the paging unit then translates the linear address into a physical one.
The segment provides protection attributes and a base address; paging handles physical memory allocation.
Segmented paging combines segment benefits (logical grouping, meaningful protection, variable logical sizes) with paging benefits (no external fragmentation, efficient physical memory use, easy swapping). The cost is more complex address translation—but hardware handles this efficiently with TLBs and cached translations.
Pure hardware segmentation with variable sizes has largely given way to paging, but the concept of variable-size logical regions persists in modern operating systems.
Software Segments: Virtual Memory Areas (VMAs):
Linux maintains a list of Virtual Memory Areas (VMAs) for each process. Each VMA describes a contiguous range of virtual addresses with consistent properties—essentially a software segment:
```c
// Linux kernel's vm_area_struct (simplified)
struct vm_area_struct {
    unsigned long vm_start;  // Start virtual address
    unsigned long vm_end;    // End virtual address (exclusive)

    // Protection and flags
    pgprot_t vm_page_prot;   // Page protection (RWX)
    unsigned long vm_flags;  // VM_READ, VM_WRITE, VM_EXEC, etc.

    // Backing store
    struct file* vm_file;    // Mapped file (if any)
    unsigned long vm_pgoff;  // Offset in file

    // Linkage
    struct vm_area_struct* vm_next;  // Next VMA in process
    struct rb_node vm_rb;            // For fast lookup
};

// Example: A process's VMAs (from /proc/pid/maps)
// Start        End          Perm  Description
// 00400000    -00452000     r-xp  Code segment (.text)
// 00651000    -00652000     r--p  Read-only data
// 00652000    -00653000     rw-p  Initialized data (.data)
// 00653000    -00674000     rw-p  Heap
// 7fff6000    -7fff8000     rw-p  Stack
// 7ffff7a00000-7ffff7bcd000 r-xp  libc.so code
```

Key Insight:
VMAs are variable-size regions—exactly like segments—but they're managed at the page level. The VMA defines the logical region; pages within it are allocated to frames as needed. This gives: variable-size logical regions with meaningful per-region permissions, page-granularity physical allocation with no external fragmentation, and demand paging within each region.
Huge Pages: Variable Physical Granularity:
Modern systems support huge pages (2MB or 1GB on x86-64) alongside standard 4KB pages. This introduces limited variability at the physical level:
| Page Size | TLB Coverage per Entry | Use Case |
|---|---|---|
| 4 KB | 4 KB | General purpose, fine-grained |
| 2 MB | 2 MB | Large data structures, databases |
| 1 GB | 1 GB | Very large mappings, VMs |
Huge pages reduce TLB misses for large, contiguous allocations—a nod to the insight that sometimes, bigger is better.
NUMA and Variable Allocation:
NUMA (Non-Uniform Memory Access) systems have another dimension of variability: location. Memory from different NUMA nodes has different access latencies. Allocation policies must consider not just size but placement across nodes.
On Linux, 'cat /proc/self/maps' shows your shell's VMAs. You'll see the code segment, libraries, stack, heap, and more—each with its start address, end address, permissions, and backing (file or anonymous). This is the modern manifestation of segmentation: logical regions with meaningful attributes, backed by pages.
This page has explored the defining characteristic of segments—their variable size—and its profound implications for memory management. To consolidate the key insights: segments are variable-size because program components are; variable sizing eliminates internal fragmentation but guarantees external fragmentation; the fifty-percent rule makes that fragmentation a statistical certainty; placement algorithms like First Fit manage it but cannot eliminate it; compaction cures it at substantial cost; and modern systems resolve the tension by layering variable-size logical regions (VMAs) on fixed-size pages.
What's Next:
We've explored the technical aspects of segments. The next page steps back to examine the programmer's view—how segmentation presents memory to the programmer, how it simplifies certain programming tasks, and how the logical structure of segments aligns with the way developers think about their programs.
You now understand why segments have variable sizes, the mathematical consequences of variable-size allocation, strategies for managing variable-size memory, and how modern systems blend variable logical regions with fixed-size physical pages. This knowledge is essential for understanding both historical segmentation and modern memory management.