Every time a program references a variable, calls a function, or pushes data onto its stack, the processor must translate a logical address specified by the program into a physical address that can be sent to the memory bus. In segmented memory architectures, this translation pivots on a critical data structure: the Segment Table Entry (STE).
The segment table entry is more than just a record—it is the contract between the operating system and the hardware that defines:
Where the segment resides in physical memory (its base address)
How large the segment is (its limit)
What operations are permitted on it (its type and protection bits)
Whether it is currently in memory (its present bit)
Who may access it (its privilege level)
Understanding segment table entries at a deep level is essential for anyone seeking to comprehend how operating systems enforce memory protection, enable code sharing, and provide the illusion of a large, contiguous address space to each process.
By the end of this page, you will have a complete understanding of segment table entry structure, including base and limit fields, protection bits, presence indicators, and privilege levels. You'll understand how hardware interprets these fields during address translation and what happens when access violations occur.
Before we dive into the structure of individual entries, we must understand the segment table as a whole. A segment table is a per-process data structure maintained by the operating system and interpreted by the hardware's Memory Management Unit (MMU).
Definition:
A segment table is an array (or more complex data structure) where each entry describes one segment of a process's logical address space. The index into this table is the segment number (or selector), and the value at that index is the segment table entry containing all metadata about that segment.
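To make the idea concrete, here is a minimal sketch of how an OS might represent a segment table in C. The layout is deliberately simplified and architecture-neutral; real hardware defines its own packed descriptor format, so these field names are purely illustrative.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One segment table entry (illustrative, not a real hardware layout). */
typedef struct {
    uint32_t base;        /* starting physical address of the segment         */
    uint32_t limit;       /* highest valid offset within the segment          */
    bool     present;     /* is the segment currently in physical memory?     */
    bool     writable;    /* data segments: may they be written?              */
    bool     executable;  /* code segments: may instructions be fetched?      */
    uint8_t  dpl;         /* descriptor privilege level (0 = most privileged) */
    bool     accessed;    /* set when the segment has been used               */
} segment_table_entry;

/* A per-process segment table: the segment number indexes `entries`. */
typedef struct {
    segment_table_entry *entries;
    size_t               count;   /* number of segment slots in this table */
} segment_table;
```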
Key Properties:
Per-Process: Each process typically has its own segment table, ensuring memory isolation between processes.
Hardware-Interpreted: The MMU reads segment table entries directly during address translation, making the format hardware-defined.
OS-Managed: The operating system creates, populates, and maintains these tables as processes are created, modified, and terminated.
Variable Size: Different processes may have different numbers of segments, so segment tables can vary in size.
| Segment Number | Segment Table Entry (STE) | Logical Purpose |
|---|---|---|
| 0 | STE₀: Base=0x10000, Limit=0x2000, R/X | Code Segment |
| 1 | STE₁: Base=0x30000, Limit=0x4000, R/W | Data Segment |
| 2 | STE₂: Base=0x80000, Limit=0x1000, R/W | Stack Segment |
| 3 | STE₃: Base=0x50000, Limit=0x800, R | Read-Only Data |
| 4 | STE₄: (Invalid/Not Present) | Unused |
In this example table, a process has five segment slots. Segments 0-3 are valid and describe code, data, stack, and read-only data regions. Segment 4 is marked invalid—any access to it will trigger a segmentation fault.
The Role of the Segment Number:
When a program generates a logical address, that address typically consists of two parts:
Logical Address = (Segment Number, Offset within Segment)
The segment number serves as an index into the segment table. The MMU retrieves the corresponding STE, extracts the base address and limit, validates the offset, applies protection checks, and finally computes the physical address if all checks pass.
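The translation just described can be expressed as a short routine. The sketch below reuses the illustrative segment_table_entry and segment_table types from the struct sketch above; the status codes are invented for the example and do not correspond to real hardware fault numbers.

```c
typedef enum {
    XLATE_OK,
    XLATE_BAD_SEGMENT,     /* segment number out of range or not present */
    XLATE_LIMIT_VIOLATION  /* offset exceeds the segment's limit         */
} xlate_status;

/* Translate (segment number, offset) to a physical address, mimicking
 * the checks the MMU performs in hardware. */
xlate_status translate(const segment_table *st,
                       uint32_t segment_number,
                       uint32_t offset,
                       uint32_t *physical_address)
{
    if (segment_number >= st->count)
        return XLATE_BAD_SEGMENT;

    const segment_table_entry *ste = &st->entries[segment_number];

    if (!ste->present)
        return XLATE_BAD_SEGMENT;       /* hardware would raise a not-present fault */

    if (offset > ste->limit)
        return XLATE_LIMIT_VIOLATION;   /* hardware would raise a protection fault  */

    *physical_address = ste->base + offset;
    return XLATE_OK;
}
```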
Don't confuse segment tables with page tables. While both are used for address translation, segment tables map variable-sized logical segments to physical memory, whereas page tables map fixed-size pages. Segmentation provides a logical view of memory aligned with program structure (code, data, stack), while paging provides uniform physical memory management. Modern systems often combine both approaches.
A segment table entry is a carefully packed data structure containing all the information the hardware needs to translate addresses and enforce protection for a single segment. While the exact format is architecture-specific, the conceptual components are universal.
Core Fields of an STE:
Base Address: the starting physical address of the segment in memory.
Limit: the size of the segment, defining the highest valid offset.
Type and Protection Bits: whether the segment holds code or data and which operations (read, write, execute) are allowed.
Present Bit: whether the segment is currently resident in physical memory.
Descriptor Privilege Level (DPL): the privilege required to access the segment.
Accessed Bit: set by the hardware when the segment is used.
Visualizing the STE Layout:
Consider a simplified 64-bit segment table entry (actual formats vary by architecture):
┌──────────────────────────────────────────────────────────────────┐
│ Segment Table Entry (64 bits) │
├──────────────────────────────────────────────────────────────────┤
│ Bits 63-32: Base Address (32 bits) │
│ Bits 31-12: Limit (20 bits) │
│ Bits 11-8: Type (4 bits: Code/Data, Conforming, Expand-Down) │
│ Bit 7: Present (P) │
│ Bits 6-5: Privilege Level (DPL, 2 bits) │
│ Bit 4: System/User descriptor │
│ Bit 3: Granularity (G) │
│ Bit 2: Default operation size (D/B) │
│ Bit 1: Long mode (L) - for 64-bit │
│ Bit 0: Accessed (A) │
└──────────────────────────────────────────────────────────────────┘
This layout illustrates how efficiently segment metadata is packed into a single machine word or double-word, allowing the MMU to extract all needed information in a single memory read.
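Extracting fields from a packed entry is just shifting and masking. The sketch below decodes the simplified 64-bit layout pictured above; remember that real architectures such as x86 use a different, non-contiguous layout.

```c
#include <stdint.h>

/* Decode the simplified 64-bit STE layout pictured above. */
static inline uint32_t ste_base(uint64_t e)        { return (uint32_t)(e >> 32); }             /* bits 63-32 */
static inline uint32_t ste_limit(uint64_t e)       { return (uint32_t)((e >> 12) & 0xFFFFF); } /* bits 31-12 */
static inline uint8_t  ste_type(uint64_t e)        { return (uint8_t)((e >> 8) & 0xF); }       /* bits 11-8  */
static inline int      ste_present(uint64_t e)     { return (int)((e >> 7) & 1); }             /* bit 7      */
static inline uint8_t  ste_dpl(uint64_t e)         { return (uint8_t)((e >> 5) & 3); }         /* bits 6-5   */
static inline int      ste_granularity(uint64_t e) { return (int)((e >> 3) & 1); }             /* bit 3      */
static inline int      ste_accessed(uint64_t e)    { return (int)(e & 1); }                    /* bit 0      */
```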
The exact bit layout of segment table entries differs by architecture. Intel x86 protected mode uses 8-byte segment descriptors with base split across non-contiguous fields for backward compatibility. ARM and other RISC architectures traditionally favor paging over segmentation. Understanding the conceptual fields prepares you for any architecture.
The base address is the most fundamental field in a segment table entry. It specifies the starting physical address of the segment in memory. Every memory reference within this segment is computed by adding the offset to this base.
Physical Address = Base Address + Offset
Key Characteristics of the Base Field:
It holds a physical address: the value is chosen by the operating system when it places the segment in memory, not by the program.
It is invisible to the program: code works only with offsets, and the MMU adds the base transparently on every access.
It can change at runtime: rewriting the base relocates the entire segment without recompiling or relinking the program.
Example: Base Address in Action
Suppose a process's data segment has:
Base Address: 0x00400000
Limit: 0x00010000 (64KB)

When the program accesses offset 0x1234 within this segment:
Logical Address: Data Segment : 0x1234
Base Address: 0x00400000
Physical Address: 0x00400000 + 0x1234 = 0x00401234
The MMU performs this addition in hardware, transparent to the executing program. From the program's perspective, it simply accesses address 0x1234 in its data segment—the physical location is invisible.
Dynamic Relocation via Base Modification:
If the OS needs to move this segment (perhaps due to memory compaction), it can:
1. Copy the segment's contents from 0x00400000 to a new location, say 0x00800000.
2. Update the STE's base field to 0x00800000.
3. Resume the process; every logical address the program uses continues to work unchanged.

This is the power of segmented addressing: location independence without recompilation.
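A minimal sketch of that relocation, assuming the illustrative segment_table_entry type from earlier, a kernel that can address physical memory directly, and hypothetical alloc_phys/free_phys helpers:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical physical-memory allocator interface (not a real API). */
void *alloc_phys(size_t size);
void  free_phys(void *addr, size_t size);

/* Move a segment to a new physical location; only the base field changes,
 * so every logical address the program uses keeps working. */
void relocate_segment(segment_table_entry *ste)
{
    size_t size = (size_t)ste->limit + 1;             /* limit is the last valid offset */
    void  *dest = alloc_phys(size);

    memcpy(dest, (void *)(uintptr_t)ste->base, size); /* copy the segment's contents    */
    free_phys((void *)(uintptr_t)ste->base, size);    /* release the old location       */

    ste->base = (uint32_t)(uintptr_t)dest;            /* point the STE at the new copy  */
}
```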
In Intel x86 protected mode, the 32-bit base address is split across three non-contiguous locations within the 8-byte segment descriptor for backward compatibility with 16-bit protected mode. This design quirk means parsing x86 segment descriptors requires bit manipulation to reassemble the base. Modern 64-bit mode simplifies this by largely ignoring segmentation for user-mode code.
The limit field defines the size of the segment, establishing the boundary beyond which access is forbidden. This is the mechanism by which segmentation provides memory protection—preventing buffer overflows and out-of-bounds accesses at the hardware level.
Protection Rule:
If (Offset > Limit) then TRAP to Operating System
Every memory access within a segment is checked against the limit. If the offset exceeds the permitted range, the CPU generates a segmentation fault (or general protection fault in x86 terminology), transferring control to the OS exception handler.
Interpretation of the Limit Value:
| Aspect | Description |
|---|---|
| Maximum Valid Offset | The limit typically represents the last valid byte offset. For a 64KB segment, limit = 0xFFFF (65535). |
| Granularity Scaling | With granularity flag=1, the limit is scaled (e.g., ×4096), allowing segments up to 4GB with a 20-bit limit field. |
| Expand-Up vs Expand-Down | Data segments can expand up (standard) or down (for stacks), changing how the limit is interpreted. |
| Zero Limit | A limit of 0 means only offset 0 (one byte) is valid, not that the segment is empty. |
The Granularity Flag:
In architectures like x86, the limit field is only 20 bits, which would restrict segments to 1MB maximum. The granularity (G) bit solves this:
G=0 (byte granularity): The limit is interpreted in bytes, allowing segments up to 1MB.
G=1 (page granularity): The limit is interpreted in 4KB units, allowing segments up to 4GB.
Example with Granularity:
Limit field value: 0xFFFFF (20 bits, all ones = 1,048,575)
With G=0: Maximum offset = 1,048,575 bytes ≈ 1MB
With G=1: Maximum offset = (1,048,575 + 1) × 4096 - 1 = 4,294,967,295 bytes = 4GB
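The same arithmetic in code form: a small sketch of computing the effective byte limit from a 20-bit limit field and the G bit.

```c
#include <stdint.h>

/* Effective last valid byte offset for a 20-bit limit field. */
uint32_t effective_limit(uint32_t limit_field, int g_bit)
{
    if (g_bit)
        return (limit_field << 12) | 0xFFF;  /* equals (limit + 1) * 4096 - 1 */
    return limit_field;                      /* byte granularity              */
}
```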
Expand-Down Segments for Stacks:
Stack segments present a unique challenge: they grow downward from high addresses to low addresses. An expand-down segment interprets the limit differently: valid offsets are those above the limit, from limit + 1 up to the maximum offset (0xFFFFFFFF in a 32-bit segment), rather than from 0 up to the limit.
For example, with limit = 0x7FFFF in a 32-bit segment:
Offsets 0x00000 through 0x7FFFF are invalid.
Offsets 0x80000 through 0xFFFFFFFF are valid, giving the stack room to grow downward from the top of the segment.
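A sketch of the bounds check for both segment directions, assuming a 32-bit maximum offset:

```c
#include <stdbool.h>
#include <stdint.h>

/* Is `offset` a legal offset for this segment? */
bool offset_is_valid(uint32_t offset, uint32_t limit, bool expand_down)
{
    if (expand_down)
        return offset > limit;   /* valid range: limit + 1 .. 0xFFFFFFFF */
    return offset <= limit;      /* valid range: 0 .. limit              */
}
```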
The beauty of limit checking is that it happens in hardware on every memory access—no software overhead. This makes segmentation a powerful tool for catching buffer overflows and array bounds violations at the point of access, rather than after corruption has occurred.
The type field in a segment table entry classifies the segment and defines behavioral attributes. This field determines what operations are valid on the segment and how the CPU interprets accesses to it.
Primary Classification: Code vs. Data
Segments are fundamentally divided into:
Code Segments — Contain executable instructions. The CPU fetches and executes instructions from these segments.
Data Segments — Contain data (variables, stacks, heaps). The CPU reads and writes data here but cannot execute from them (with proper protection).
Type Field Bit Breakdown (x86 Example):

When bit 3 = 0, the entry describes a data segment and the type bits are interpreted as:
| Bit | Name | Meaning when 0 | Meaning when 1 |
|---|---|---|---|
| Bit 3 | Descriptor Type | Data Segment | Code Segment |
| Bit 2 | Expand-Down (Data) | Expand-Up | Expand-Down (Stack) |
| Bit 1 | Write Enable (Data) | Read-Only | Read/Write |
| Bit 0 | Accessed | Not Accessed | Has Been Accessed |
When bit 3 = 1, the entry describes a code segment and the type bits are interpreted as:

| Bit | Name | Meaning when 0 | Meaning when 1 |
|---|---|---|---|
| Bit 3 | Descriptor Type | Data Segment | Code Segment |
| Bit 2 | Conforming | Non-Conforming | Conforming |
| Bit 1 | Readable | Execute-Only | Execute/Read |
| Bit 0 | Accessed | Not Accessed | Has Been Accessed |
Special Type Attributes:
Conforming Code Segments:
A conforming code segment can be called from less privileged code without changing the privilege level. This is used for shared library code that should run at the caller's privilege level rather than elevating privileges.
Non-Conforming Code Segments:
A non-conforming code segment requires exact privilege level matching (or use of call gates) for access. This is the typical behavior for kernel code that should only run at kernel privilege.
Readable Code Segments:
While all code segments are executable, not all are readable. An execute-only segment prevents instructions from reading their own code—useful for protecting proprietary algorithms or preventing certain forms of code analysis.
Writable Data Segments:
Data segments may be read-only or read-write. Constants and string literals would typically reside in read-only data segments, while variables and the stack need read-write access.
The combination of type field and protection bits enables powerful security policies at the hardware level. Code segments that are not readable can't be dumped for reverse engineering. Data segments that are not executable prevent code injection attacks. These protections happen before any OS or application code runs.
The present bit (or valid bit) indicates whether the segment is currently loaded in physical memory. This single bit enables powerful memory management capabilities, including swapping segments to disk and implementing a form of demand segmentation.
Present Bit Semantics:
P=1 (Present): The segment is in physical memory. The base and limit fields are valid, and the MMU can perform address translation.
P=0 (Not Present): The segment is not in physical memory (perhaps swapped to disk). Any access to this segment triggers a segment-not-present fault, transferring control to the OS.
Segment Fault Handling:
When a not-present segment is accessed:
1. The CPU raises a segment-not-present fault and saves the faulting context.
2. The OS fault handler determines which segment was referenced and where its contents live (in the executable image or in swap space).
3. The OS allocates physical memory for the segment, evicting another segment first if memory is tight.
4. The segment's contents are loaded from disk into the allocated memory.
5. The OS updates the STE with the new base, sets P=1, and restarts the faulting instruction, which now succeeds.
This is analogous to page faults in paging systems, but operates at the segment level. The handler sketch below illustrates these steps:
```c
void segment_not_present_handler(interrupt_frame_t* frame) {
    // Get the segment selector that caused the fault
    uint16_t selector = frame->error_code & 0xFFF8;

    // Look up segment in our segment metadata table
    segment_info_t* info = lookup_segment_info(current_process, selector);
    if (info == NULL) {
        // Invalid segment - kill the process
        terminate_process(current_process, SIGSEGV);
        return;
    }

    // Find the segment on disk
    if (!info->is_swapped && !info->is_demand_load) {
        // Segment should exist but doesn't - bug or corruption
        kernel_panic("Segment metadata inconsistency");
    }

    // Allocate physical memory for the segment
    void* phys_mem = allocate_physical_pages(info->size);
    if (phys_mem == NULL) {
        // Need to swap something else out first
        swap_out_segment(find_victim_segment());
        phys_mem = allocate_physical_pages(info->size);
    }

    // Load segment contents from disk
    if (info->is_swapped) {
        read_from_swap(info->swap_location, phys_mem, info->size);
    } else if (info->is_demand_load) {
        read_from_executable(info->file_offset, phys_mem, info->size);
    }

    // Update the segment descriptor in the GDT/LDT
    segment_descriptor_t* desc = get_descriptor(current_process, selector);
    desc->base = (uint32_t)phys_mem;
    desc->limit = info->size - 1;
    desc->present = 1;

    // Return - CPU will retry the faulting instruction
}
```

While segment swapping is conceptually clean, it has practical drawbacks. Segments vary in size—swapping a 1MB segment is far more expensive than swapping a 4KB page. This is one reason why modern systems favor paging over segmentation for virtual memory. Segments still help with logical organization, but paging handles physical memory management.
The Descriptor Privilege Level (DPL) is a 2-bit field that specifies the minimum privilege required to access the segment. This mechanism is central to protection rings and the separation between kernel mode and user mode.
Protection Rings Hierarchy:
┌───────────────────────┐
│ Ring 0 (DPL=0) │ ← Kernel (highest privilege)
│ Operating System │
└───────────────────────┘
┌───────────────────────┐
│ Ring 1 (DPL=1) │ ← Device Drivers (x86 historical)
│ System Services │
└───────────────────────┘
┌───────────────────────┐
│ Ring 2 (DPL=2) │ ← System Services (rarely used)
│ Privileged Utils │
└───────────────────────┘
┌───────────────────────┐
│ Ring 3 (DPL=3) │ ← User Applications (lowest privilege)
│ User Applications │
└───────────────────────┘
Privilege Level Checks:
When code attempts to access a segment, the CPU compares three values:
CPL (Current Privilege Level): The privilege level of the currently executing code, held in the low bits of the CS register.
RPL (Requested Privilege Level): The privilege level encoded in the low two bits of the segment selector used for the access.
DPL (Descriptor Privilege Level): The privilege level stored in the target segment's descriptor.
Access Rules:
| Segment Type | Access Rule | Effect |
|---|---|---|
| Data Segment | max(CPL, RPL) ≤ DPL | Higher privilege code can access lower privilege data |
| Non-Conforming Code | CPL = DPL and RPL ≤ DPL | Must match exactly for control transfers |
| Conforming Code | CPL ≥ DPL | Lower privilege code can call higher privilege conforming code |
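These rules are simple enough to express directly. The sketch below encodes the three checks from the table; cpl, rpl, and dpl are ring numbers where 0 is most privileged.

```c
#include <stdbool.h>
#include <stdint.h>

/* Data segment: the effective privilege max(CPL, RPL) must be numerically
 * no greater than (i.e., at least as privileged as) the segment's DPL. */
bool data_access_allowed(uint8_t cpl, uint8_t rpl, uint8_t dpl)
{
    uint8_t effective = (cpl > rpl) ? cpl : rpl;
    return effective <= dpl;
}

/* Non-conforming code: privilege levels must match exactly. */
bool nonconforming_transfer_allowed(uint8_t cpl, uint8_t rpl, uint8_t dpl)
{
    return cpl == dpl && rpl <= dpl;
}

/* Conforming code: less privileged callers may transfer in without a
 * privilege change. */
bool conforming_transfer_allowed(uint8_t cpl, uint8_t dpl)
{
    return cpl >= dpl;
}
```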
Practical Implications:
Kernel Segments (DPL=0):
Only code running at CPL=0 can touch them; any access from user mode (CPL=3) raises a general protection fault, keeping kernel code and data out of reach of applications.
User Segments (DPL=3):
Accessible from any privilege level, so both user code and the kernel can use them (subject to the segment's type, limit, and presence checks).
Mixed Privilege Scenarios:
When a system call occurs, the kernel often needs to access user buffers. The kernel runs at CPL=0 but may access segments with DPL=3. This is permitted because 0 ≤ 3.
However, the kernel must validate user requests carefully—a user can't trick the kernel by passing a kernel segment selector, because RPL is checked against DPL as well.
Most modern operating systems only use Ring 0 (kernel) and Ring 3 (user). Rings 1 and 2 are rarely used in practice. With virtualization, hypervisors sometimes run in Ring -1 (SMM or VMX root mode), adding another layer below the traditional ring structure.
The accessed bit (A) is automatically set by the CPU when a segment is accessed. This simple bit serves as a building block for sophisticated memory management algorithms.
Accessed Bit Behavior:
The CPU sets A=1 automatically when the segment is used; software never needs to set it.
The hardware never clears the bit on its own; the OS clears it when it wants to start a fresh observation interval.
Use Cases for the Accessed Bit:
Working-set estimation: by periodically clearing and re-reading the bit, the OS learns which segments a process is actively using.
Swap victim selection: segments whose accessed bit remains clear are good candidates for eviction when memory runs low.
Usage statistics: the bit provides a cheap, hardware-maintained record of whether a segment has been touched at all.
Comparison with Page Table Accessed Bits:
The page table also has accessed and dirty bits, serving similar purposes at the page level. In combined segmentation+paging systems, both levels of tracking coexist: the segment's accessed bit records whether the segment was used at all, while the per-page bits record which parts of it were used.
The granularity difference matters: segment access tracking is coarse-grained (entire segment) while page access tracking is fine-grained (4KB typically).
Atomic Access Bit Updates:
The CPU must set the accessed bit atomically to prevent race conditions in multi-processor systems. This is done as part of the memory access microcode—if the bit is 0, the CPU performs a locked read-modify-write operation to set it to 1 before the actual memory access completes.
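A sketch of that set-if-clear behavior using a C11 atomic compare-exchange; the real CPU performs the equivalent locked read-modify-write in microcode against the descriptor table, so this is only a software analogy.

```c
#include <stdatomic.h>
#include <stdint.h>

#define STE_ACCESSED_BIT 0x1u   /* bit 0 in the simplified layout used earlier */

void mark_accessed(_Atomic uint64_t *descriptor)
{
    uint64_t old = atomic_load(descriptor);
    while (!(old & STE_ACCESSED_BIT)) {
        /* Try to install the same descriptor with the accessed bit set;
         * on failure `old` is refreshed and the loop re-checks the bit. */
        if (atomic_compare_exchange_weak(descriptor, &old,
                                         old | STE_ACCESSED_BIT))
            break;
    }
}
```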
The accessed bit update is a memory write operation to the descriptor table, which can impact performance on frequently-accessed segments in tight loops. Once the bit is set, subsequent accesses don't need to update it. Some systems cache segment descriptors in registers (like segment register caches) to minimize descriptor table accesses.
Let's examine a complete segment table entry in context, seeing how all the fields work together during address translation and protection checking.
Scenario: Accessing a Data Variable
A user-mode program executes: x = array[100];
The compiler generated code that:
1. Loads the segment selector 0x23 (Data Segment, Ring 3) into a data segment register.
2. Produces the offset 0x8064 (where array[100] is located).
Segment Descriptor at Index 4 (selector 0x23 → index 4):
Base: 0x10000000
Limit: 0xFFFFF (with G=1 → 4GB effective limit)
Type: Data, Expand-Up, Writable
Present: 1
DPL: 3 (User mode)
Granularity: 1 (Page units)
Accessed: 1 (Previously accessed)
Step-by-Step Translation:
1. The selector 0x23 is split into an index (4) and an RPL (3); the MMU fetches the descriptor at index 4.
2. Present check: P=1, so the descriptor is valid and the base and limit fields can be used.
3. Privilege check: max(CPL=3, RPL=3) = 3 ≤ DPL=3, so user-mode access is allowed.
4. Type check: the segment is a writable, expand-up data segment, so a data read or write is permitted.
5. Limit check: with G=1 the effective limit spans the full 4GB range, so offset 0x8064 is within bounds.
6. Address computation: physical address = 0x10000000 + 0x8064 = 0x10008064.
Failure Scenarios:
If any check failed, a fault would occur:
| Failure | Fault | Handler Action |
|---|---|---|
| P=0 | Segment Not Present (#NP) | Load segment from disk, retry |
| DPL < max(CPL,RPL) | General Protection (#GP) | Terminate process |
| Offset > Limit | General Protection (#GP) | Terminate process |
| Write to read-only segment | General Protection (#GP) | Terminate process |
| Execute from non-code segment | General Protection (#GP) | Terminate process |
The hardware performs all these checks in a single memory access cycle (assuming descriptor is cached), making segmentation protection essentially free in terms of runtime overhead.
The elegance of segment table entries is that protection is enforced by hardware on every memory access. No software checks are needed—the CPU simply won't allow forbidden operations. This is why buffer overflows in properly segmented systems would be caught immediately, rather than causing silent corruption.
We've thoroughly dissected the segment table entry—the fundamental data structure underpinning segmented memory management. Let's consolidate the key concepts:
Base Address: where the segment starts in physical memory; the offset is added to it to form the physical address.
Limit: the segment's size; offsets beyond it trigger a protection fault.
Type Field: classifies the segment (code vs. data) and defines allowed operations and behaviors such as conforming and expand-down.
Present Bit: indicates whether the segment is resident in memory, enabling swapping and demand loading.
Privilege Level (DPL): the privilege required to access the segment, enforcing kernel/user separation.
Accessed Bit: a hardware-maintained usage indicator that supports memory management policies.
What's Next:
With a solid understanding of segment table entry structure, we'll now explore the base address in greater depth—examining how it enables dynamic relocation, memory sharing between processes, and efficient use of physical memory across the system.
The base address may seem like a simple field, but its implications for operating system design are profound.
You now understand the complete anatomy of a segment table entry—from its individual bit fields to its role in the hardware translation and protection process. This foundation prepares you to explore each STE component in depth in the following pages.