One of segmentation's most defining characteristics is that segments can be any size. A segment holding a single global variable might be 4 bytes. A segment containing a massive data structure could be hundreds of megabytes. There's no fixed quantum, no power-of-two requirement, no hardware-imposed granularity.
This flexibility is both segmentation's greatest strength and its most significant challenge.
The strength: Segments match the actual size of logical program components. No wasted space from rounding to page boundaries. No artificial splitting of coherent units. The memory organization perfectly reflects the program's structure.
The challenge: Variable-size allocation inevitably leads to external fragmentation. As segments of different sizes are allocated and freed, the memory becomes a patchwork of free and used regions. Eventually, you may have plenty of free memory in total, but no single contiguous region large enough for a new segment.
This page explores variable segment sizes in depth: why they matter, how they're managed, and what consequences they have for system design.
By the end of this page, you will understand: why segments have variable sizes while pages are fixed, the advantages of variable sizing for different segment types, the mathematics of segment size distribution, how variable sizes lead to external fragmentation, strategies for managing variable-size allocation, the trade-offs between segments and pages regarding size flexibility, and modern approaches to the fragmentation problem.
The variable size of segments isn't an arbitrary design choice—it emerges directly from segmentation's goal of matching memory organization to program structure. Programs don't come in uniform chunks, so why should their memory?
Program Components Have Diverse Sizes:
Consider a typical application's logical components:
| Component | Typical Size Range | Why This Size? |
|---|---|---|
| Small utility function | 50-200 bytes | Few instructions, simple logic |
| String constants table | 1-10 KB | Collection of error messages, labels |
| Main program module | 100 KB - 1 MB | Core application logic |
| Static data tables | 1 KB - 100 MB | Lookup tables, databases |
| Stack | 64 KB - 8 MB | Depends on recursion depth, local variables |
| Heap | 1 MB - many GB | Dynamic allocation, grows as needed |
| Shared library code | 100 KB - 10 MB | libc, GUI libraries, etc. |
| Memory-mapped file | bytes - GB | Depends entirely on file size |
The Mismatch with Fixed-Size Allocation:
If we were forced to use fixed-size units (like 4KB pages), these diverse components would suffer:
Small segments waste space: A 50-byte function occupies a full 4KB page. That's 98.8% waste—internal fragmentation.
Large segments get fragmented: A 10MB data table becomes 2,560 separate pages. The logical unity is lost; the pages might be scattered throughout physical memory.
Semantic boundaries are invisible: The system can't distinguish the page holding code from the page holding data. Protection must be applied per-page, not per-unit.
Variable sizing aligns with reality: A 50-byte segment takes 50 bytes. A 10MB segment takes 10MB. No artificial padding, no arbitrary splitting. The memory layout matches the program's structure.
Pages are sized for hardware efficiency—power of two, matching cache lines and TLB entries. Segments are sized for logical meaning—matching the actual size of program components. Neither is wrong; they serve different purposes. Modern systems often combine both: segments for logical organization, pages for physical memory management within segments.
Unlike pages (whose size is fixed by hardware), segment sizes are determined by the program's structure. Different segments get their sizes from different sources:
Compile-Time Determined Sizes:
Many segment sizes are fixed when the program is compiled and linked: the code segment (.text) is the total size of all compiled functions, the initialized-data segment (.data) is the total size of all initialized globals, and the zero-initialized segment (.bss) is the total size of uninitialized statics.
Runtime Determined Sizes:
Other segments have sizes that aren't known until the program runs: the stack depends on call depth and local variables, the heap depends on dynamic allocation, and memory-mapped regions depend on the sizes of the files mapped.
```c
// Example: Where segment sizes come from

// CODE SEGMENT SIZE = sum of all compiled functions
// Typically 10s of KB to MBs depending on program complexity

// DATA SEGMENT SIZE
int initialized_array[1000] = {1, 2, 3, /* ... */ };  // 4000 bytes
double coefficients[100] = {1.0, 2.0, /* ... */ };    // 800 bytes
char greeting[] = "Hello, World!";                    // 14 bytes
// .data size = 4814 bytes (plus padding/alignment)

// BSS SEGMENT SIZE
int uninitialized_buffer[10000];  // 40000 bytes (not in executable)
static double workspace[5000];    // 40000 bytes (not in executable)
// .bss size = 80000 bytes (stored as just the number in executable)

// STACK SEGMENT SIZE
// Initial: 8 MB (typical default)
// Grows as needed, up to ulimit -s maximum
void recursive_function(int depth) {
    char local_buffer[4096];  // 4KB per call on stack
    if (depth > 0) {
        recursive_function(depth - 1);  // Stack grows with each call
    }
}
// 1000-deep recursion = ~4MB stack usage

// HEAP SEGMENT SIZE
// Starts at 0, grows with allocations
void* data = malloc(1024 * 1024);  // Request 1MB
// Heap grows to at least 1MB (plus allocator overhead)
```

When you compile a program, the resulting executable file (ELF on Linux, PE on Windows) contains headers specifying each segment's size. For .text and .data, both file size and memory size are recorded. For .bss, only the memory size is stored—there's no point writing zeros to the file. Use 'readelf -l program' to see these sizes.
Managing variable-size segments isn't just an engineering challenge—it's a mathematical one. Understanding the mathematical properties of segment allocation helps predict system behavior and design better algorithms.
The Fragmentation Problem Formally:
Consider a memory of size M that has been subjected to a sequence of allocations and deallocations. At any moment, some total amount A is allocated and the remaining F = M − A is free, scattered across some number of holes.
Even when F is large, if it's scattered across many holes, we may fail to allocate a segment of size S < F because no single hole is large enough.
The 50-Percent Rule:
A remarkable result from Knuth's steady-state analysis (1973), known as the fifty-percent rule:
Under steady-state allocation/deallocation with First Fit, if N blocks are allocated, approximately N/2 holes exist.
If you have 1000 allocated segments, expect ~500 holes interspersed among them. A related corollary—the unused-memory rule—shows that under the same assumptions roughly one-third of memory can end up lost to holes. This wasted space is external fragmentation, unavoidable with variable-size allocation.
| State | Allocated Segments | Holes | Total Free | Largest Hole |
|---|---|---|---|---|
| Initial | 0 | 1 | 1000 KB | 1000 KB |
| After 10 allocations | 10 | ~5 | 700 KB | ~200 KB |
| After 50 alloc/dealloc cycles | 25 | ~12 | 500 KB | ~50 KB |
| After 200 cycles | 50 | ~25 | 400 KB | ~20 KB |
| Heavily fragmented | 100 | ~50 | 300 KB | ~5 KB |
Size Distribution Matters:
The distribution of segment sizes dramatically affects fragmentation:
Uniform sizes: If all segments are the same size, fragmentation is minimal. Each freed spot perfectly fits a new segment.
Wide size variance: If segments range from 10 bytes to 10 MB, fragmentation is severe. Large segments leave small unusable holes when freed; small segments can't fill large holes efficiently.
Power-law distribution: Real workloads often follow power-law distributions—many small segments, few large ones. This causes particular fragmentation patterns that allocation algorithms must handle.
Best vs. Worst Case:
Best case for fragmentation:
[ A ][ B ][ C ][ D ][ E ] // Contiguous, no holes
Worst case for fragmentation:
[ A ][ ][ B ][ ][ C ][ ][ D ][ ][ E ] // Alternating, maximum holes
Typical long-running system:
[ A ][ ][ B ][ ][C][ D ][ ][ E ] // Mixed, some holes
No allocation algorithm can completely prevent external fragmentation with variable-size segments. The 50-percent rule is a statistical reality. Systems must either: (1) tolerate fragmentation and its consequences, (2) periodically compact memory (expensive), or (3) use fixed-size allocation like paging. Pure segmentation always fights this battle.
When a new segment needs to be allocated, the system must find a suitable free region. With variable-size segments and potentially many holes, this decision requires a placement algorithm.
Classic Placement Algorithms:
| Algorithm | Description | Time Complexity | Fragmentation Tendency |
|---|---|---|---|
| First Fit | Use first hole large enough | O(n) worst case | Good—tends to leave large holes at end |
| Next Fit | Like First Fit, but start from last allocation | O(n) worst case | Moderate—spreads allocation evenly |
| Best Fit | Use smallest hole that fits | O(n) or O(log n) with tree | Poor—leaves many tiny unusable holes |
| Worst Fit | Use largest hole available | O(n) or O(1) with heap | Poor—fragments large holes quickly |
First Fit in Detail:
First Fit is often the best general-purpose choice. It searches from the beginning of memory and uses the first hole large enough to hold the segment.
Memory: [Used][ Free 100KB ][Used][ Free 50KB ][Used]
Allocate 30KB segment:
- Check first hole: 100KB ≥ 30KB ✓
- Allocate 30KB at start of hole
- Result: [Used][New][Free 70KB][Used][Free 50KB][Used]
Advantages: simple to implement, fast in practice (the search stops at the first match), and it tends to leave the large holes at the high end of memory intact.
Disadvantages: small leftover fragments accumulate near the start of memory, so searches slow down over time as they wade through holes that are too small to use.
Best Fit in Detail:
Best Fit searches all holes and uses the one closest in size to the request.
Memory: [Used][Free 100KB][Used][Free 50KB][Used][Free 35KB][Used]
Allocate 30KB segment:
- Check all holes: 100KB, 50KB, 35KB
- Best fit: 35KB (closest to 30KB)
- Result: [Used][Free 100KB][Used][Free 50KB][Used][New][Free 5KB][Used]
The 5KB leftover is likely too small to be useful—creating a tiny, unusable fragment. This is Best Fit's fatal flaw.
```c
// Conceptual implementation of segment allocation algorithms
#include <stddef.h>

typedef struct hole {
    size_t address;
    size_t size;
    struct hole* next;
} Hole;

Hole* free_list;  // Linked list of holes

void remove_hole(Hole* h);  // Unlinks a hole from free_list (not shown)

// First Fit: use first hole that's large enough
void* first_fit(size_t request_size) {
    Hole* h = free_list;
    while (h != NULL) {
        if (h->size >= request_size) {
            void* address = (void*)h->address;
            // Reduce hole size, or remove it if fully used
            h->address += request_size;
            h->size -= request_size;
            if (h->size == 0) {
                remove_hole(h);  // Remove empty hole from list
            }
            return address;
        }
        h = h->next;
    }
    return NULL;  // No suitable hole found
}

// Best Fit: use smallest hole that's large enough
void* best_fit(size_t request_size) {
    Hole* best = NULL;
    Hole* h = free_list;
    while (h != NULL) {
        if (h->size >= request_size) {
            if (best == NULL || h->size < best->size) {
                best = h;  // Found a better fit
            }
        }
        h = h->next;
    }
    if (best != NULL) {
        void* address = (void*)best->address;
        best->address += request_size;
        best->size -= request_size;
        if (best->size == 0) {
            remove_hole(best);
        }
        return address;
    }
    return NULL;  // No suitable hole found
}
```

Simulation studies consistently show First Fit outperforming Best Fit despite intuition suggesting otherwise. First Fit's speed advantage (stopping early) and tendency to preserve large holes typically result in better overall memory utilization than Best Fit's creation of tiny fragments.
Variable-size segments bring another challenge: some segments need to change size during execution. The stack grows and shrinks with function calls. The heap grows with allocations. How do we handle segments that don't stay put?
The Growth Challenge:
When a segment needs to grow, there are three possibilities:
The adjacent space is free: Simply extend the segment into the neighboring hole. Best case—cheap and efficient.
The adjacent space is occupied: The segment cannot grow in place. Options: relocate the entire segment to a larger hole elsewhere, move the neighboring segment out of the way, or deny the growth request.
No large enough hole exists anywhere: Memory exhaustion. Must free something, swap something out, or fail.
Strategies for Growable Segments:
1. Reserve Extra Space: Allocate more than initially needed, anticipating growth. The stack often starts with 8MB even if initially using only KB. The downside: wasted space if the segment doesn't grow.
2. Direction Convention: Stack grows down, heap grows up. By placing them at opposite ends of the address space, they can grow toward each other without conflict.
High ┌───────────────┐
│ Stack │ grows ↓
│ ↓ │
├───────────────┤
│ │
│ (free) │
│ │
├───────────────┤
│ ↑ │
│ Heap │ grows ↑
Low └───────────────┘
3. Virtual Address Space Buffer: Reserve a large virtual address range but don't allocate physical memory until needed. The segment can grow into the reserved range, and the OS maps physical pages on demand.
4. Segment Relocation: If a segment must move, update its base address in the segment table. All existing references remain valid because they're (segment, offset) pairs—the offset doesn't change.
Shrinkage:
Shrinking is easier than growing: the space given up at the end of the segment simply becomes a new hole (merged with any adjacent hole), and nothing has to move.
Modern systems largely avoid segment growth problems by using virtual memory. Each process has a huge virtual address space (128TB on x86-64). 'Growth' just means mapping more physical pages into already-reserved virtual addresses. There's no need to find contiguous physical memory or relocate anything.
When external fragmentation becomes severe—memory is 40% free but no hole exceeds 10KB—the most direct solution is compaction: moving all segments to one end of memory, consolidating all free space into one contiguous region.
The Compaction Process:
Before Compaction:
[A][ ][B][ ][ C ][ ][ D ][ ][ E ][ ]
After Compaction:
[A][B][ C ][ D ][ E ][ ]
All segments are now contiguous, leaving one large free block at the end.
The Cost of Compaction:
Compaction is expensive. Consider a system with 16GB of RAM, half containing segments: copying 8GB of data at a memory bandwidth on the order of 10 GB/s takes roughly a second of pure data movement; during that time the segments being moved cannot be used, and every relocated segment's base address must be updated in the segment tables.
When to Compact:
Given the cost, compaction should be infrequent: trigger it only when an allocation fails despite sufficient total free memory, when fragmentation crosses a threshold (say, the largest hole falls below some fraction of total free space), or during idle periods when the pause is least disruptive.
Alternatives to Full Compaction:
Partial Compaction: Move only some segments to free up enough space for a pending allocation. Less disruptive than full compaction.
Incremental Compaction: Move one segment at a time during idle periods. Gradually improves fragmentation without long freezes.
Eager Coalescing: When a segment is freed, immediately merge its space with adjacent holes. Prevents fragmentation buildup.
Paging eliminates the need for compaction because pages are fixed-size. Any free frame can hold any page—there's no 'wrong size' problem. When memory is fragmented into 1000 free frames scattered everywhere, that's fine; we can allocate 1000 pages. This is segmentation's primary disadvantage: variable sizes require compaction or suffer fragmentation.
The tension between variable-size segments (good for logical organization, bad for fragmentation) and fixed-size pages (bad for logical organization, good for fragmentation) led to a hybrid solution: segmented paging.
The Insight:
Why choose? Use segments for the programmer's view and pages for physical memory management: each segment is divided into fixed-size pages, and each segment gets its own page table mapping those pages to physical frames.
This gives: variable-size logical segments (no artificial splitting of program components), fixed-size physical allocation (no external fragmentation), and per-segment protection layered on per-page mapping.
How It Works:
A logical address has three components: (segment, page, offset)
Example:
Logical Address: (Segment 2, Page 5, Offset 0x100)
Segment Table:
[0] → Page Table at 0x5000 (for Code segment)
[1] → Page Table at 0x6000 (for Data segment)
[2] → Page Table at 0x7000 (for Stack segment) ← Use this
Stack Segment's Page Table (at 0x7000):
[0] → Frame 12
[1] → Frame 89
[2] → Frame 45
[3] → Frame 3
[4] → Frame 77
[5] → Frame 47 ← Use this
...
Physical Address: Frame 47 base + 0x100
= (47 × 4096) + 256
= 192512 + 256 = 192768
Intel x86 Protected Mode:
The Intel 80386 and its successors use exactly this scheme: the segmentation unit first translates a logical (segment, offset) address into a linear address, and the paging unit then translates the linear address into a physical one.
The segment provides protection attributes and a base address; paging handles physical memory allocation.
Segmented paging combines segment benefits (logical grouping, meaningful protection, variable logical sizes) with paging benefits (no external fragmentation, efficient physical memory use, easy swapping). The cost is more complex address translation—but hardware handles this efficiently with TLBs and cached translations.
Pure hardware segmentation with variable sizes has largely given way to paging, but the concept of variable-size logical regions persists in modern operating systems.
Software Segments: Virtual Memory Areas (VMAs):
Linux maintains a list of Virtual Memory Areas (VMAs) for each process. Each VMA describes a contiguous range of virtual addresses with consistent properties—essentially a software segment:
```c
// Linux kernel's vm_area_struct (simplified)
struct vm_area_struct {
    unsigned long vm_start;  // Start virtual address
    unsigned long vm_end;    // End virtual address (exclusive)

    // Protection and flags
    pgprot_t vm_page_prot;   // Page protection (RWX)
    unsigned long vm_flags;  // VM_READ, VM_WRITE, VM_EXEC, etc.

    // Backing store
    struct file* vm_file;    // Mapped file (if any)
    unsigned long vm_pgoff;  // Offset in file

    // Linkage
    struct vm_area_struct* vm_next;  // Next VMA in process
    struct rb_node vm_rb;            // For fast lookup
};

// Example: A process's VMAs (from /proc/pid/maps)
// Start        End          Perm  Description
// 00400000    -00452000     r-xp  Code segment (.text)
// 00651000    -00652000     r--p  Read-only data
// 00652000    -00653000     rw-p  Initialized data (.data)
// 00653000    -00674000     rw-p  Heap
// 7fff6000    -7fff8000     rw-p  Stack
// 7ffff7a00000-7ffff7bcd000 r-xp  libc.so code
```

Key Insight:
VMAs are variable-size regions—exactly like segments—but they're managed at the page level. The VMA defines the logical region; pages within it are allocated to frames as needed. This gives: variable-size logical regions with meaningful per-region permissions, page-granularity physical allocation with no external fragmentation, and demand paging within each region.
Huge Pages: Variable Physical Granularity:
Modern systems support huge pages (2MB or 1GB on x86-64) alongside standard 4KB pages. This introduces limited variability at the physical level:
| Page Size | TLB Coverage per Entry | Use Case |
|---|---|---|
| 4 KB | 4 KB | General purpose, fine-grained |
| 2 MB | 2 MB | Large data structures, databases |
| 1 GB | 1 GB | Very large mappings, VMs |
Huge pages reduce TLB misses for large, contiguous allocations—a nod to the insight that sometimes, bigger is better.
NUMA and Variable Allocation:
NUMA (Non-Uniform Memory Access) systems have another dimension of variability: location. Memory from different NUMA nodes has different access latencies. Allocation policies must consider not just size but placement across nodes.
On Linux, 'cat /proc/self/maps' shows your shell's VMAs. You'll see the code segment, libraries, stack, heap, and more—each with its start address, end address, permissions, and backing (file or anonymous). This is the modern manifestation of segmentation: logical regions with meaningful attributes, backed by pages.
This page has explored the defining characteristic of segments—their variable size—and its profound implications for memory management. To consolidate the key insights: segments are variable-size because program components are; variable sizing eliminates internal fragmentation but guarantees external fragmentation; the fifty-percent rule makes that fragmentation a statistical certainty; placement algorithms like First Fit manage it but cannot eliminate it; compaction cures it at substantial cost; and modern systems resolve the tension by layering variable-size logical regions (VMAs) on fixed-size pages.
What's Next:
We've explored the technical aspects of segments. The next page steps back to examine the programmer's view—how segmentation presents memory to the programmer, how it simplifies certain programming tasks, and how the logical structure of segments aligns with the way developers think about their programs.
You now understand why segments have variable sizes, the mathematical consequences of variable-size allocation, strategies for managing variable-size memory, and how modern systems blend variable logical regions with fixed-size physical pages. This knowledge is essential for understanding both historical segmentation and modern memory management.