Every program that executes on a computer, from the simplest "Hello, World" to the most complex operating system kernel, organizes its memory into three fundamental categories: code, data, and stack. These aren't arbitrary divisions—they reflect the deep structure of computation itself.
Code holds the instructions—the immutable recipe that defines what the program does. Data holds the ingredients—the values the program manipulates. Stack holds the context—the call history, local variables, and return addresses that track where the program is in its execution.
Understanding these three segment types is essential for anyone who wants to truly understand how programs run. Each has distinct characteristics that dictate how the operating system must manage it, what protections apply, and how it interacts with the hardware.
By the end of this page, you will understand:

- The complete characteristics of code (text) segments
- The varieties and purposes of data segments (initialized, uninitialized, read-only)
- Stack segment mechanics, including frame structure and growth
- How these segments interact during program execution
- Protection requirements for each segment type
- How modern systems implement these segment concepts
The code segment (historically called the text segment from early Unix terminology) contains the executable machine instructions that constitute the program. When a program runs, the CPU fetches instructions from this segment, decodes them, and executes them.
Fundamental Characteristics:

- Read and execute only; any attempt to write triggers a protection fault
- Fixed size, determined when the program is linked
- Loaded directly from the executable file, which stores the instructions verbatim
- Identical for every process running the same executable, making it shareable
Why 'Text'?
The term "text segment" comes from early Unix and assembly language conventions. In assembly, the .text directive indicates that the following content is program code. The term persists in executable file formats, system tools, and programmer vocabulary, even though "code segment" is more descriptive.
Structure of the Code Segment:
A typical code segment contains:
- Compiled machine code for every function in the program
- Runtime startup code that runs before and after main
- The program's entry point (_start or main)

The linker determines the layout. Optimizing linkers may reorder functions to improve cache locality—frequently called functions placed near each other minimize instruction cache misses.
```c
// Example: How C code becomes part of the text segment

// All of these functions become machine code in the text segment:

// Simple function: ~10-20 bytes of machine code
int add(int a, int b) {
    return a + b;
}

// Function with loop: ~30-50 bytes of machine code
int factorial(int n) {
    int result = 1;
    for (int i = 2; i <= n; i++) {
        result *= i;
    }
    return result;
}

// Function with conditionals: ~40-80 bytes depending on optimization
int classify(int x) {
    if (x < 0) return -1;
    if (x == 0) return 0;
    return 1;
}

// The entry point
int main() {
    int sum = add(5, 3);
    int fact = factorial(5);
    int class = classify(-10);
    return 0;
}

// Compiled code segment layout (conceptual):
//
// 0x0000: _start:       ; Runtime startup code
//         ...setup...
//         call main
//
// 0x0100: add:          ; Function add
//         mov eax, edi
//         add eax, esi
//         ret
//
// 0x0110: factorial:    ; Function factorial
//         ...loop code...
//         ret
//
// 0x0160: classify:     ; Function classify
//         ...conditional code...
//         ret
//
// 0x01A0: main:         ; Entry point
//         ...call sequence...
//         ret
```

Modern operating systems enforce a W^X (Write XOR Execute) policy: a memory region can be writable OR executable, but never both simultaneously. This prevents attackers from injecting malicious code into writable memory and executing it. The code segment is Execute+Read only; any attempt to write triggers a protection fault. This is a fundamental defense against code injection attacks.
One of the most significant benefits of having a distinct code segment is shareability. Since code is read-only and identical across all instances of a program, there's no need to keep multiple copies in physical memory.
The Sharing Mechanism:
When multiple processes run the same executable, the operating system recognizes that their code segments are identical. Instead of loading separate copies for each process:

- The code pages are loaded into physical memory once
- Each process's page tables map those same physical pages into its own virtual address space
- The mappings are marked read + execute, so no process can modify the shared code
Memory Savings from Sharing:
Consider a system with 100 users running bash shells, where bash's code segment is 1MB:

- Without sharing: 100 × 1MB = 100MB of RAM holding identical instructions
- With sharing: a single 1MB physical copy, mapped into all 100 address spaces
Now add shared libraries. If libc is 2MB and used by every program:

- Without sharing, each of hundreds of processes would carry its own 2MB copy
- With sharing, one 2MB copy of libc's code serves every process on the system
On a typical server with hundreds of processes, code sharing saves gigabytes of RAM.
Position-Independent Code (PIC):
For sharing to work when libraries load at different virtual addresses in different processes, code must be position-independent—it must work correctly regardless of where it's loaded.
PIC achieves this by:

- Using PC-relative (RIP-relative on x86-64) addressing, so code references nearby data by offset rather than by absolute address
- Routing accesses to global symbols through the Global Offset Table (GOT)
- Routing calls to external functions through the Procedure Linkage Table (PLT)
This small overhead (extra indirection) pays for itself many times over in memory savings.
Position-independent code has slightly higher runtime overhead due to indirect addressing. However, this is offset by: massive memory savings enabling more processes to run, better instruction cache utilization (shared code stays cached), and reduced page faults (shared pages are more likely to be resident). For all but the most performance-critical inner loops, PIC is the right choice.
While code defines what a program does, data segments hold the state the program manipulates. Unlike the code segment, data segments are typically writable—the whole point of data is that it changes as the program runs.
Data segments are more complex than code segments because there are multiple types of data with different characteristics:
The Data Segment Taxonomy:
| Segment | Contents | Initialization | Permissions | In Binary File? |
|---|---|---|---|---|
| .data | Initialized global/static variables | Values from binary | Read + Write | Yes (stores values) |
| .rodata | Read-only data (constants, strings) | Values from binary | Read only | Yes (stores values) |
| .bss | Uninitialized global/static variables | Zero-filled at load | Read + Write | No (just size) |
The .data Segment (Initialized Data):
This segment holds global and static variables that have explicit initial values:
```c
int global_counter = 100;           // Goes in .data
static char buffer[10] = "Hello";   // Goes in .data
float pi = 3.14159f;                // Goes in .data
```
Characteristics:

- Initial values are stored in the executable file, so this segment contributes its full size to the binary
- Mapped read + write; the program freely modifies these variables at runtime
- Size is fixed at compile/link time
The .rodata Segment (Read-Only Data):
This segment holds data that should never change:
```c
const int max_connections = 1024;                     // Goes in .rodata
const char* message = "Error!";                       // String in .rodata
static const float coefficients[] = {1.0, 2.0, 3.0};  // In .rodata
```
Characteristics:

- Values are stored in the executable and mapped read-only; any write attempt triggers a protection fault
- Shareable across processes, just like the code segment
- The compiler may pool identical string literals so only one copy exists
The .bss Segment (Uninitialized Data):
The name ".bss" comes from old assembly language: "Block Started by Symbol". This segment holds uninitialized global/static variables:
```c
int uninitialized_array[1000];   // Goes in .bss
static void* pointers[500];      // Goes in .bss
```
Characteristics:

- Occupies no space for contents in the executable file; the binary records only the required size
- Zero-filled by the loader before the program starts
- Mapped read + write at runtime, just like .data
Consider: int buffer[1000000]; This is 4MB of zeros. In .data, the executable would include 4MB of zeros. In .bss, the executable just stores '4MB of bss needed.' At load time, the OS allocates 4MB and zero-fills it (or uses zero-page mapping). This is why the size of an executable can be much smaller than its memory footprint.
Understanding how data segments behave throughout program execution is crucial for systems programmers and anyone diagnosing memory-related issues.
Lifecycle Stages:

- Compile/link time: initial values (.data, .rodata) and required sizes (.bss) are recorded in the binary
- Load time: .data and .rodata are mapped from the file; .bss is allocated and zero-filled
- Runtime: the program reads and writes .data and .bss; .rodata stays fixed
- Exit: all in-memory state is discarded; only data written elsewhere survives
```c
// Demonstrating data segment behavior

#include <stdio.h>
#include <string.h>

// .rodata - constant string literal
const char* PROGRAM_NAME = "DataDemo";

// .data - initialized global
int request_count = 0;

// .bss - uninitialized global (will be zero)
char receive_buffer[4096];

// .data - initialized static (file scope)
static int error_count = 0;

// .bss - uninitialized static
static FILE* log_file;

void process_request(const char* data) {
    // request_count is in .data, modified at runtime
    request_count++;

    // receive_buffer is in .bss, zero-initialized
    // First use sees zeros
    strncpy(receive_buffer, data, sizeof(receive_buffer) - 1);

    // PROGRAM_NAME is in .rodata
    // This would cause a protection fault if attempted:
    // PROGRAM_NAME[0] = 'X';  // CRASH! .rodata is read-only

    printf("[%s] Processed request #%d\n", PROGRAM_NAME, request_count);
}

int main() {
    // At this point:
    // - request_count is 0 (loaded from .data)
    // - receive_buffer is all zeros (zeroed .bss)
    // - PROGRAM_NAME points to "DataDemo" (in .rodata)

    process_request("First request");   // request_count becomes 1
    process_request("Second request");  // request_count becomes 2

    // The value 2 exists only in memory
    // When program exits, this state is lost

    return 0;
}
```

Access Patterns and Optimization:
Data segments exhibit characteristic access patterns that influence system performance:
Global Variables: Often accessed from many functions, leading to good cache utilization if the variable is small and frequently used. However, global state complicates concurrency—multiple threads accessing the same global require synchronization.
Static Variables: Limited scope means fewer functions access them, potentially better cache behavior in their "home" function. Static locals are particularly interesting—they're allocated in .data or .bss but have local scope.
Constants (.rodata): Highly shareable, often excellent cache behavior since they don't change. The compiler may inline small constants directly into code.
Large Arrays: Whether in .data or .bss, large arrays can cause cache pressure. Careful layout and access patterns (sequential vs. random) dramatically affect performance.
Copy-on-Write Optimization:
Modern systems often apply copy-on-write (COW) to .data segments when forking:

- After fork(), parent and child initially share the same physical .data pages, marked read-only
- The first write by either process triggers a page fault; the kernel copies just that page and makes the copy writable
- Pages neither process modifies remain shared indefinitely
On Linux, use 'size ./program' to see text, data, and bss sizes. Use 'readelf -S ./program' for detailed segment information. Use 'objdump -h ./program' for section headers. Watch a variable with 'nm ./program | grep variable_name' to see which section contains it.
The stack segment is fundamentally different from code and data segments. While those have fixed sizes determined at compile/load time, the stack is dynamic—it grows and shrinks as functions are called and return. The stack is the mechanism that makes function calls, recursion, and local variables possible.
What Lives on the Stack:

- Return addresses linking each call back to its caller
- Saved registers, including the caller's frame pointer
- Local (automatic) variables
- Function arguments that don't fit in registers
- Compiler temporaries spilled from registers
Stack Growth Direction:
On most modern architectures (x86, x86-64, ARM), the stack grows downward—toward lower addresses. The stack starts at a high address and each push decreases the stack pointer.
```
High Address ┌─────────────────────────┐
             │  Command-line args      │
             │  Environment variables  │
             ├─────────────────────────┤
             │  main()'s stack frame   │
             ├─────────────────────────┤
             │  func_a()'s frame       │
             ├─────────────────────────┤
             │  func_b()'s frame       │ ← Current stack pointer
             ├─────────────────────────┤
             │                         │
             │   (available stack)     │
             │           ↓             │  Stack grows down
             │                         │
             ├─────────────────────────┤
             │  Guard page (no access) │
Low Address  └─────────────────────────┘
```
This design places the stack and heap on opposite ends of the address space, with each growing toward the other. A guard page between them (or at the stack limit) generates a fault if the stack grows too far, preventing silent corruption.
The stack has a maximum size (often 8MB on Linux by default). Deep recursion or large local arrays can exhaust stack space, causing a stack overflow. Unlike heap exhaustion (which malloc signals by returning NULL), stack overflow typically causes an immediate crash—the stack pointer enters the guard page, triggering a segmentation fault.
Each function call creates a stack frame (also called an activation record)—a structured chunk of stack space containing everything that function needs. Understanding stack frames is essential for debugging, security analysis, and systems programming.
Canonical Stack Frame Layout (x86-64 System V ABI):
```
Higher addresses
┌─────────────────────────────┐
│ Arguments passed via stack  │ (if any beyond registers)
│ (arg7, arg8, ...)           │
├─────────────────────────────┤
│ Return address              │ ← Pushed by CALL instruction
├─────────────────────────────┤ ← Old RSP before CALL
│ Saved RBP (optional)        │ ← Frame pointer
├─────────────────────────────┤ ← New RBP (if used)
│ Local variables             │
│   - variable1               │
│   - variable2               │
│   - ...                     │
├─────────────────────────────┤
│ Saved callee-save registers │
│ (RBX, R12-R15 if used)      │
├─────────────────────────────┤
│ Spilled temporaries         │
├─────────────────────────────┤
│ Outgoing arguments (stack)  │ (for callee, if any)
├─────────────────────────────┤ ← RSP (stack pointer)
Lower addresses
```
```c
// Example: What happens on the stack during execution

int multiply(int x, int y) {   // x, y passed in registers (x86-64)
    int result = x * y;        // 'result' is on the stack
    return result;             // Return value in RAX register
}

int compute(int a, int b, int c) {
    int product;               // Local variable on stack
    int sum;                   // Local variable on stack

    product = multiply(a, b);  // Call creates new stack frame
    sum = product + c;
    return sum;
}

int main() {
    int answer;                // Local variable in main's frame
    answer = compute(3, 4, 5); // answer = (3*4) + 5 = 17
    return answer;
}

// Stack during multiply() call (conceptual):
//
// ┌──────────────────────────────────┐
// │ main's stack frame               │
// │   answer (uninitialized at first)│
// │   return address to CRT          │
// ├──────────────────────────────────┤
// │ compute's stack frame            │
// │   product                        │
// │   sum                            │
// │   saved RBP                      │
// │   return address to main         │
// ├──────────────────────────────────┤
// │ multiply's stack frame           │ ← CURRENT
// │   result                         │
// │   saved RBP                      │
// │   return address to compute      │
// └──────────────────────────────────┘ ← Stack pointer (RSP)
```

Frame Manipulation Operations:
Function Prologue (at function entry):
```asm
push rbp        ; Save caller's frame pointer
mov rbp, rsp    ; Set up new frame pointer
sub rsp, N      ; Allocate space for locals
                ; Save callee-save registers if needed
```
Function Epilogue (at function exit):
```asm
                ; Restore callee-save registers if saved
mov rsp, rbp    ; Deallocate locals
pop rbp         ; Restore caller's frame pointer
ret             ; Pop return address and jump
```
Stack Unwinding for Debugging:
When a debugger (or exception handler) needs to traverse the call stack, it follows the chain of saved frame pointers:

- The current RBP points at the saved RBP of the caller's frame
- The return address sits just above the saved RBP, identifying the calling function
- Loading the saved RBP moves the walk one frame up; repeat until the chain ends at the outermost frame
This enables stack traces, debugging, and structured exception handling.
Modern compilers often compile with -fomit-frame-pointer for optimization, freeing RBP as a general-purpose register. This works because compilers can track stack offsets statically, and DWARF debug information provides unwinding data. However, it makes manual debugging harder—you can't simply follow the RBP chain.
The stack's role in storing return addresses makes it a prime target for attackers. Classic buffer overflow attacks exploit the stack to redirect program execution. Understanding these attacks and their mitigations is essential for systems programmers.
The Classic Stack Buffer Overflow:
```c
// VULNERABLE CODE - DO NOT USE
void vulnerable(char* user_input) {
    char buffer[64];             // Fixed-size buffer on stack

    strcpy(buffer, user_input);  // DANGEROUS: no bounds checking!

    printf("You said: %s", buffer);
}

// Stack layout:
// ┌──────────────────────┐
// │ Return address       │ ← Target for overflow
// ├──────────────────────┤
// │ Saved RBP            │
// ├──────────────────────┤
// │ buffer[63]           │
// │ ...                  │ ← Overflow writes upward
// │ buffer[0]            │ ← Attacker's input starts here
// └──────────────────────┘

// If user_input is longer than 64 bytes, strcpy overwrites:
// 1. The rest of the buffer
// 2. The saved RBP
// 3. THE RETURN ADDRESS
//
// Attacker controls return address → Controls execution!
```

Modern Stack Protection Mechanisms:
```c
// How stack canaries work (conceptual)

void protected_function(char* input) {
    // Compiler inserts canary after prologue
    unsigned long canary = __stack_chk_guard;  // Random value

    char buffer[64];
    strcpy(buffer, input);  // Still vulnerable, but...

    // Compiler inserts check before epilogue
    if (canary != __stack_chk_guard) {
        // Canary was corrupted! Stack overflow detected
        __stack_chk_fail();  // Terminates program
    }
    // Only returns if canary is intact
}

// Stack with canary:
// ┌──────────────────────┐
// │ Return address       │
// ├──────────────────────┤
// │ Saved RBP            │
// ├──────────────────────┤
// │ 🐤 Stack Canary 🐤   │ ← Must be intact to return
// ├──────────────────────┤
// │ buffer[63]           │
// │ ...                  │ ← Overflow corrupts canary first
// │ buffer[0]            │
// └──────────────────────┘
```

Modern systems combine all these protections. An attacker must bypass stack canaries (random, checked before return), NX (can't execute injected code), ASLR (addresses randomized), and potentially shadow stacks (separate return address storage). Each layer makes exploitation harder; together, they make classic stack attacks impractical.
While we've focused on code, data, and stack as the canonical segments, a complete picture requires understanding the heap segment—the region for dynamic memory allocation. The heap isn't always modeled as a traditional segment, but it's essential to program memory.
Heap vs. Stack: Complementary Roles:

| Aspect | Stack | Heap |
|---|---|---|
| Allocation | Automatic, on function call | Explicit (malloc/new) |
| Deallocation | Automatic, on return | Explicit (free/delete) |
| Lifetime | Tied to the enclosing call (LIFO) | Arbitrary, until freed |
| Speed | Very fast (adjust stack pointer) | Slower (allocator bookkeeping) |
| Size | Limited (often 8MB) | Limited only by available memory |
Heap Characteristics:
Dynamic Growth — The heap grows upward (toward higher addresses) as allocations are made. When more space is needed, the program requests additional pages from the OS.
Non-Contiguous Internally — Unlike the stack's strictly ordered growth, heap allocations can be scattered. Free blocks exist between allocated blocks, leading to fragmentation.
Manual Management — In languages like C, programmers must explicitly free heap allocations. Memory leaks occur when allocations are never freed; use-after-free bugs when freed memory is accessed.
Allocator Complexity — The heap allocator (malloc/free implementation) is a sophisticated piece of software managing free lists, coalescing freed blocks, and balancing speed against fragmentation.
Heap in the Memory Map:
```
┌─────────────────────────────────┐  High addresses
│        Stack (grows down)       │
│               ↓                 │
├─────────────────────────────────┤
│                                 │
│        (unmapped region)        │
│                                 │
├─────────────────────────────────┤
│               ↑                 │
│         Heap (grows up)         │ ← Grows via brk/sbrk or mmap
├─────────────────────────────────┤
│               BSS               │
├─────────────────────────────────┤
│              Data               │
├─────────────────────────────────┤
│           Text (Code)           │
└─────────────────────────────────┘  Low addresses
```
Modern allocators like jemalloc, tcmalloc, and mimalloc don't rely solely on expanding a single heap segment via brk(). They use mmap() to obtain memory from anywhere in the address space, manage separate arenas for different threads, and employ sophisticated strategies to minimize fragmentation and lock contention. The 'heap' is now a logical concept more than a single segment.
This page has provided a comprehensive exploration of the three fundamental segment types and the heap. Let's consolidate the key insights:

- Code (text) is read + execute only, fixed in size, and shareable across every process running the same binary
- Data splits into .data (initialized, writable), .rodata (constants, read-only), and .bss (zero-filled, occupying no file space)
- The stack grows and shrinks with function calls; each call pushes a frame holding the return address, saved registers, and locals
- Protections such as W^X, stack canaries, ASLR, and guard pages each exploit a segment's distinct characteristics
- The heap complements the stack with explicitly managed, arbitrary-lifetime allocations
What's Next:
We've seen that segments have variable sizes—the code segment might be 1KB while the data segment is 100KB. The next page explores variable segment sizes in depth: why this flexibility is powerful, how it differs from fixed-size paging, and what challenges it creates for memory management.
You now understand the three fundamental segment types—code, data, and stack—along with the heap. You know what lives in each, how they're protected, how they interact, and why their distinct characteristics matter for systems design and security. This foundation prepares you for understanding variable segment sizes and the programmer's view of segmented memory.