In a segmented memory system, every logical address tells a two-part story. The first part—the segment number—answers a fundamental question: Which logical unit of the program are we accessing? Is it the code segment containing executable instructions? The data segment holding global variables? The stack segment managing function calls? Or perhaps a dynamically allocated heap segment?
The segment number is not merely an index; it's the key that unlocks the metadata describing an entire region of a process's address space. Without correctly extracting and interpreting this component, the memory management unit cannot even begin the translation process. Understanding the segment number is therefore the essential first step in mastering segmented address translation.
By the end of this page, you will understand the precise role of the segment number in logical addresses, how it's extracted through bit manipulation, its function as an index into the segment table, the relationship between segment number width and maximum segment count, and architectural variations in segment number encoding across different systems.
A segment number (also called a segment selector or segment identifier) is the portion of a logical address that identifies which segment within a process's address space is being referenced.
More formally:
The segment number is an unsigned integer that serves as an index into the segment table, identifying the specific segment descriptor that contains the base address and limit for the referenced memory region.
This definition encapsulates several critical concepts that we must examine in detail.
The Logical Address Structure:
In a segmented system, a logical address is a tuple (s, d), where s is the segment number and d is the offset (displacement) within that segment.
This is fundamentally different from a linear address where the entire address is a single number. The segment number provides a level of indirection that enables the powerful features of segmentation: variable-sized regions, per-segment protection, and logical organization of program components.
```
Logical Address Structure in Segmentation:

┌─────────────────────────────────────────────────────────┐
│                     Logical Address                     │
├───────────────────────┬─────────────────────────────────┤
│    Segment Number     │           Offset (d)            │
│          (s)          │    (displacement within seg)    │
├───────────────────────┼─────────────────────────────────┤
│        k bits         │             m bits              │
└───────────────────────┴─────────────────────────────────┘

Example: 16-bit logical address with 4-bit segment number

Logical Address: 0x3A5F
Binary:          0011 1010 0101 1111
                 ├──┘ └────────────┤
                 │                 │
                 │                 └── Offset: 0xA5F (2655 in decimal)
                 └── Segment Number: 0x3 (segment 3)

This address references byte 2655 within segment 3.
```

Unlike page numbers, which are merely indices into a flat page table, segment numbers identify logically meaningful program units. A programmer consciously assigns data to segments based on its purpose, access patterns, or protection requirements. This semantic organization is a defining characteristic of segmentation that influences how segment numbers are assigned and used.
Extracting the segment number from a logical address is a fundamental hardware operation that occurs for every memory reference. The extraction must be fast—executing in a single clock cycle—and correct. Understanding the bit manipulation involved is essential for systems programmers and OS developers.
The Extraction Formula:
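For a logical address of n bits, with the segment number in the k high-order bits and the offset in the m low-order bits (n = k + m):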
The segment number is extracted using a right shift:
segment_number = logical_address >> m
Alternatively, using bit masking:
segment_number = (logical_address >> m) & ((1 << k) - 1)
The mask ensures we only get the k bits we need, though the right shift alone suffices when the address is unsigned and the shift brings zeros into the high bits.
```c
#include <stdio.h>
#include <stdint.h>

/*
 * Segment Number Extraction
 *
 * This demonstrates how hardware extracts the segment number
 * from a logical address in a segmented memory system.
 */

// Configuration for our example segmented architecture
#define LOGICAL_ADDR_BITS 16  // Total logical address width
#define SEGMENT_BITS      4   // Bits for segment number
#define OFFSET_BITS       12  // Bits for offset (16 - 4 = 12)

// Derived constants
#define MAX_SEGMENTS      (1 << SEGMENT_BITS)  // 16 segments
#define MAX_SEGMENT_SIZE  (1 << OFFSET_BITS)   // 4096 bytes per segment

// Extraction masks
#define OFFSET_MASK   ((1 << OFFSET_BITS) - 1)   // 0x0FFF
#define SEGMENT_MASK  ((1 << SEGMENT_BITS) - 1)  // 0x000F

/**
 * Extract segment number from logical address
 * This is what the MMU hardware does on every memory access
 */
uint16_t extract_segment_number(uint16_t logical_address) {
    return (logical_address >> OFFSET_BITS) & SEGMENT_MASK;
}

/**
 * Extract offset from logical address
 */
uint16_t extract_offset(uint16_t logical_address) {
    return logical_address & OFFSET_MASK;
}

/**
 * Compose a logical address from segment and offset
 * (Inverse operation - useful for understanding)
 */
uint16_t compose_logical_address(uint16_t segment, uint16_t offset) {
    return (uint16_t)((segment << OFFSET_BITS) | (offset & OFFSET_MASK));
}

void demonstrate_extraction(uint16_t logical_address) {
    uint16_t segment = extract_segment_number(logical_address);
    uint16_t offset = extract_offset(logical_address);

    printf("Logical Address: 0x%04X (binary: ", logical_address);
    for (int i = 15; i >= 0; i--) {
        printf("%d", (logical_address >> i) & 1);
        if (i == OFFSET_BITS) printf(" | ");
    }
    printf(")\n");

    printf("  Segment Number: %u (0x%X)\n", segment, segment);
    printf("  Offset: %u (0x%03X)\n", offset, offset);
    printf("  Interpretation: Byte %u within Segment %u\n\n", offset, segment);
}

int main(void) {
    printf("=== Segment Number Extraction Demo ===\n");
    printf("Architecture: %d-bit addresses, %d-bit segment, %d-bit offset\n",
           LOGICAL_ADDR_BITS, SEGMENT_BITS, OFFSET_BITS);
    printf("Maximum segments: %d, Maximum segment size: %d bytes\n\n",
           MAX_SEGMENTS, MAX_SEGMENT_SIZE);

    // Test various addresses
    demonstrate_extraction(0x0000);  // Segment 0, offset 0
    demonstrate_extraction(0x1000);  // Segment 1, offset 0
    demonstrate_extraction(0x3A5F);  // Segment 3, offset 2655
    demonstrate_extraction(0xF123);  // Segment 15, offset 291
    demonstrate_extraction(0xFFFF);  // Segment 15, offset 4095

    // Verify composition is inverse of extraction
    printf("=== Verification ===\n");
    uint16_t seg = 5, off = 1234;
    uint16_t addr = compose_logical_address(seg, off);
    printf("Composed (seg=%u, off=%u) -> 0x%04X\n", seg, off, addr);
    printf("Extracted back: seg=%u, off=%u\n",
           extract_segment_number(addr), extract_offset(addr));

    return 0;
}
```

| Logical Address | Binary (segment bits, offset bits) | Segment # | Offset |
|---|---|---|---|
| 0x0000 | 0000 000000000000 | 0 | 0 |
| 0x1000 | 0001 000000000000 | 1 | 0 |
| 0x2800 | 0010 100000000000 | 2 | 2048 |
| 0x3A5F | 0011 101001011111 | 3 | 2655 |
| 0xF123 | 1111 000100100011 | 15 | 291 |
| 0xFFFF | 1111 111111111111 | 15 | 4095 |
In actual hardware, segment number extraction is performed by hardwired shift logic that routes specific address lines to the segment table index inputs. No actual shifting computation occurs—the bits are simply connected to different destinations. This is why the extraction is effectively 'free' in terms of time cost.
The primary purpose of the segment number is to serve as an index into the segment table—a per-process data structure maintained by the operating system that describes the physical location and attributes of each segment.
The Indexing Operation:
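Given the Segment Table Base Register (STBR), which holds the physical address of the start of the segment table, and a fixed entry size of E bytes: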
The address of the segment table entry for segment number s is:
STE_address = STBR + (s × E)
This is a simple array indexing operation. If STBR = 0x80000, each entry is 8 bytes, and s = 3:
STE_address = 0x80000 + (3 × 8) = 0x80000 + 24 = 0x80018
The hardware then reads the segment table entry from physical address 0x80018.
```c
#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

/*
 * Segment Table Structure and Lookup
 *
 * Demonstrates how the segment number indexes into the segment table
 * to retrieve segment metadata for address translation.
 */

// Segment Table Entry structure
typedef struct {
    uint32_t base;        // Physical base address of segment
    uint32_t limit;       // Size of segment in bytes
    uint8_t  present;     // Is segment currently in memory?
    uint8_t  protection;  // Access permissions (R/W/X)
    uint8_t  accessed;    // Has segment been accessed?
    uint8_t  modified;    // Has segment been modified?
} SegmentTableEntry;

// Protection bit masks
#define PROT_READ    0x04
#define PROT_WRITE   0x02
#define PROT_EXECUTE 0x01

// Simulated segment table (typically one per process)
#define MAX_SEGMENTS 16
SegmentTableEntry segment_table[MAX_SEGMENTS];

// Segment Table Base Register (simulated). It is pointer-sized so that the
// host address of the table is not truncated on 64-bit builds.
uintptr_t STBR;
uint32_t  STLR;  // Segment Table Length Register

/**
 * Initialize segment table with sample segments
 */
void initialize_segment_table(void) {
    STBR = (uintptr_t)segment_table;  // Point to our table
    STLR = 5;                         // We have 5 valid segments

    // Segment 0: Code segment (Read + Execute)
    segment_table[0] = (SegmentTableEntry){
        .base = 0x00100000,
        .limit = 8192,
        .present = 1,
        .protection = PROT_READ | PROT_EXECUTE,
        .accessed = 0,
        .modified = 0
    };

    // Segment 1: Data segment (Read + Write)
    segment_table[1] = (SegmentTableEntry){
        .base = 0x00200000,
        .limit = 16384,
        .present = 1,
        .protection = PROT_READ | PROT_WRITE,
        .accessed = 0,
        .modified = 0
    };

    // Segment 2: Stack segment (Read + Write)
    segment_table[2] = (SegmentTableEntry){
        .base = 0x00300000,
        .limit = 4096,
        .present = 1,
        .protection = PROT_READ | PROT_WRITE,
        .accessed = 0,
        .modified = 0
    };

    // Segment 3: Shared library (Read + Execute)
    segment_table[3] = (SegmentTableEntry){
        .base = 0x00400000,
        .limit = 32768,
        .present = 1,
        .protection = PROT_READ | PROT_EXECUTE,
        .accessed = 0,
        .modified = 0
    };

    // Segment 4: Heap segment (Read + Write)
    segment_table[4] = (SegmentTableEntry){
        .base = 0x00500000,
        .limit = 65536,
        .present = 1,
        .protection = PROT_READ | PROT_WRITE,
        .accessed = 0,
        .modified = 0
    };
}

/**
 * Look up segment table entry using segment number
 * Returns pointer to entry, or NULL if invalid segment number
 */
SegmentTableEntry* lookup_segment(uint16_t segment_number) {
    // Check if segment number is within valid range
    if (segment_number >= STLR) {
        printf("ERROR: Segment %u exceeds STLR (%u)\n",
               (unsigned)segment_number, (unsigned)STLR);
        return NULL;
    }

    // Calculate entry address (simulated pointer arithmetic)
    // In hardware: STE_addr = STBR + (segment_number * sizeof(STE))
    SegmentTableEntry* base = (SegmentTableEntry*)STBR;
    return &base[segment_number];
}

void print_segment_entry(uint16_t seg_num, SegmentTableEntry* entry) {
    if (!entry) return;

    printf("Segment %u:\n", (unsigned)seg_num);
    printf("  Base Address: 0x%08X\n", (unsigned)entry->base);
    printf("  Limit: %u bytes (0x%X)\n", (unsigned)entry->limit, (unsigned)entry->limit);
    printf("  Present: %s\n", entry->present ? "Yes" : "No");
    printf("  Protection: %c%c%c\n",
           (entry->protection & PROT_READ) ? 'R' : '-',
           (entry->protection & PROT_WRITE) ? 'W' : '-',
           (entry->protection & PROT_EXECUTE) ? 'X' : '-');
    printf("\n");
}

int main(void) {
    printf("=== Segment Table Lookup Demo ===\n\n");

    initialize_segment_table();

    printf("Segment Table Base Register (STBR): %p\n", (void*)STBR);
    printf("Segment Table Length Register (STLR): %u\n\n", (unsigned)STLR);

    // Look up each valid segment
    for (uint16_t i = 0; i < STLR; i++) {
        SegmentTableEntry* entry = lookup_segment(i);
        print_segment_entry(i, entry);
    }

    // Attempt to access invalid segment
    printf("Attempting to access segment 10...\n");
    lookup_segment(10);  // Will print error

    return 0;
}
```

Before indexing, the hardware compares the segment number against the Segment Table Length Register (STLR). If segment_number ≥ STLR, a trap occurs, because the program is trying to access a segment that doesn't exist. This provides a first line of defense against invalid memory accesses, before any bounds checking within the segment itself.
The number of bits allocated to the segment number directly determines the maximum number of segments a process can have. This is a fundamental architectural decision with far-reaching implications for system design.
The Relationship:
Maximum Segments = 2^k
Where k is the number of bits in the segment number.
The Tradeoff:
With a fixed logical address width n, increasing k (segment bits) decreases m (offset bits):
m = n - k
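This creates a fundamental tradeoff: every bit moved from the offset to the segment number doubles the maximum number of segments and halves the maximum segment size. The first table below assumes a 16-bit logical address, the second a 32-bit one.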
| Segment Bits | Offset Bits | Max Segments | Max Segment Size |
|---|---|---|---|
| 2 | 14 | 4 | 16 KB |
| 4 | 12 | 16 | 4 KB |
| 6 | 10 | 64 | 1 KB |
| 8 | 8 | 256 | 256 bytes |

For a 32-bit logical address:

| Segment Bits | Offset Bits | Max Segments | Max Segment Size |
|---|---|---|---|
| 8 | 24 | 256 | 16 MB |
| 12 | 20 | 4,096 | 1 MB |
| 16 | 16 | 65,536 | 64 KB |
| 18 | 14 | 262,144 | 16 KB |
Architectural Considerations:
The choice of segment number width depends on the intended use case:
Few large segments: Suitable for traditional segmentation where each segment represents a major program component (code, data, stack). 4-8 segments may suffice.
Many small segments: Suitable for fine-grained protection or object-based systems where each data structure could be its own segment. Thousands of segments may be needed.
Hybrid approaches: Some architectures support variable segment counts through hierarchical segment tables or per-segment limits on offset width.
The choice of segment number width reflects the system's philosophy. Multics used many segments because it treated each file and data structure as a separate segment. Intel x86 exposes only six segment registers at a time but lets software define thousands of segments through descriptor tables (the GDT and LDT). Modern systems often flatten segmentation, using minimal segments and relying on paging for fine-grained memory management.
Not all segmented architectures include the segment number directly in every memory address. Two distinct approaches exist: explicit segmentation where the segment is part of the address, and implicit segmentation where the segment is determined by context.
Intel x86 Implicit Segmentation:
The Intel x86 architecture uses implicit segmentation through segment registers. Instead of encoding the segment in each address, the processor maintains segment registers (CS, DS, SS, ES, FS, GS) that are implicitly used based on the type of memory access:
```nasm
; Intel x86 Implicit Segmentation Example
;
; The segment used depends on the instruction and registers involved,
; NOT on explicit segment specification in the address

section .text
global _start

_start:
    ; Instruction fetch - implicitly uses CS (Code Segment)
    ; The CPU fetches this instruction from CS:EIP
    mov eax, 42

    ; Data access - implicitly uses DS (Data Segment)
    ; Effective address: DS:0x402000
    mov ebx, [0x402000]

    ; Stack access - implicitly uses SS (Stack Segment)
    ; PUSH first decrements ESP, then stores the value at SS:ESP
    push eax                ; Accesses SS:ESP

    ; Stack-relative access - implicitly uses SS
    ; EBP-relative accesses default to SS
    mov ecx, [ebp-4]        ; Actually SS:[EBP-4]

    ; Explicit segment override - programmer forces different segment
    ; This accesses ES:0x402000 instead of DS:0x402000
    mov edx, [es:0x402000]

    ; FS/GS often used for thread-local storage
    ; On Linux x86-64, FS points to thread control block
    ; (64-bit instruction, left as a comment because the rest of
    ; this listing is 32-bit code)
    ; mov rax, [fs:0x28]    ; Access thread-local canary value

; Key insight: The segment is NOT encoded in the address 0x402000
; It's determined by the instruction context or explicit override
```

| Reference Type | Default Segment | Can Override? |
|---|---|---|
| Instruction fetch | CS | No |
| Stack operations (PUSH/POP) | SS | No |
| String destination (ES:DI) | ES | No |
| BP or SP as base register | SS | Yes |
| Other data references | DS | Yes |
| String source (DS:SI) | DS | Yes |
In 64-bit mode (x86-64), segmentation is largely flattened. CS, DS, ES, and SS all have base = 0 and are effectively ignored for address calculation. Only FS and GS retain non-zero bases, typically used for thread-local storage. The segment number still exists in the architecture but has minimal impact on address translation, with paging handling virtually all memory management.
The segment number is the essential first component of address translation in segmented memory systems. It transforms a logical address from an abstract reference into a concrete path to physical memory.
What's Next:
With the segment number extracted and the segment table entry located, the next step in address translation focuses on the offset within segment. The offset specifies exactly which byte within the identified segment is being accessed. Combined with the segment's base address, the offset will help form the final physical address—but first, it must pass bounds checking against the segment's limit.
You now understand the segment number—the first half of every segmented address. You can extract it from logical addresses, use it to index segment tables, and appreciate the tradeoffs in segment number width. Next, we'll explore the offset component and how it pinpoints the exact memory location within a segment.