Every time your program reads a variable, calls a function, or accesses an array element, a remarkable transformation occurs—and it happens billions of times per second without you ever noticing. The CPU generates a logical address, but memory chips understand only physical addresses. Something must bridge this gap, translating every logical address into its corresponding physical location in real time.
This process—address translation—is the mechanism that makes modern multiprogramming, memory protection, and virtual memory possible. It's performed on every single memory access: instruction fetches, data reads, data writes, stack operations. At 3+ GHz with multiple memory accesses per clock, we're talking about tens of billions of translations per second. Understanding how this works is fundamental to understanding modern computer systems.
By the end of this page, you will understand the general principles of address translation, the evolution from simple relocation to modern paging, how translation enables memory protection and sharing, the critical performance requirements of translation hardware, and the foundational concepts that underpin page tables and the MMU.
Address translation is the process of converting a logical (or virtual) address generated by the CPU into a physical address that can be used to access actual memory hardware.
Formally:
Address translation is a function T: L → P ∪ {⊥} that maps logical addresses (L) to either physical addresses (P) or an error state (⊥) indicating an invalid or protected access.
The inclusion of the error state ⊥ is crucial—not every logical address maps to a valid physical location. Invalid mappings result in what we commonly call segmentation faults or access violations.
The Translation Guarantee:
A well-designed translation system provides crucial guarantees:
Isolation: Translations for different processes are independent. No logical address in Process A can accidentally map to physical memory owned by Process B.
Protection: Even within a process, different regions can have different permissions (read, write, execute). The translation layer enforces these.
Completeness: Every legal memory access by a valid program will be translated. There are no 'gaps' in the translation mechanism.
Determinism: Given the same translation tables and logical address, the result is always the same physical address (or same error).
When translation maps to ⊥ (error), the hardware raises an exception (fault). The OS handles this fault: it might be a true error (bad pointer), or it might be a demand-paging situation where the OS needs to load data from disk. The fault handler's response depends on why the translation failed.
Address translation has evolved through several stages, each adding capabilities while maintaining backward compatibility with simpler schemes. Understanding this evolution reveals why modern systems have the complexity they do.
| Era | Mechanism | Translation Formula | Capabilities Gained |
|---|---|---|---|
| Pre-1960s | None (Absolute) | Physical = Logical | Direct hardware access |
| 1960s | Base Register | Physical = Base + Logical | Relocation, simple multiprogramming |
| 1960s-70s | Base + Limit | Physical = Base + Logical (if Logical < Limit) | Protection added |
| 1970s | Segmentation | Physical = SegBase[s] + offset (if offset < SegLimit[s]) | Multiple regions per process |
| 1970s-present | Paging | Physical = FrameNumber × PageSize + Offset | No external fragmentation, efficient sharing |
| 1980s-present | Segmentation with Paging | Two-stage translation | Combined benefits (historically Intel x86) |
| Modern | Multi-Level Paging | Hierarchical page table walk | Huge address spaces, sparse allocation |
The Base Register Revolution:
The simplest useful translation is:
Physical Address = Base Register + Logical Address
This single addition, performed by hardware on every memory access, enabled:
- Relocation: a program could be loaded at any physical address without modification
- Multiprogramming: several programs could reside in memory simultaneously, each with its own base
- Fast context switching: the OS simply reloads the base register for the next process
The limitation: the entire program must occupy a contiguous physical region. As programs grew, finding large contiguous regions became difficult.
```c
/*
 * Base Register Translation
 * The simplest form of address translation
 */

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

// Hardware register - set by OS during context switch
uint64_t base_register = 0;

// Translation function
uint64_t translate_base(uint64_t logical_addr) {
    /*
     * Translation: Add base to logical address
     *
     * This is performed by hardware on every memory access.
     * The single addition introduces ~1 cycle of latency.
     */
    return base_register + logical_addr;
}

/*
 * Example execution:
 *
 * Process A loaded at physical address 0x100000:
 *   base_register = 0x100000
 *   Logical 0x0000 → Physical 0x100000
 *   Logical 0x1234 → Physical 0x101234
 *
 * Context switch to Process B at physical address 0x500000:
 *   base_register = 0x500000
 *   Logical 0x0000 → Physical 0x500000
 *   Logical 0x1234 → Physical 0x501234
 *
 * Same logical address, different physical location!
 */

/*
 * Limitation: No protection!
 * A program can access ANY physical address by using
 * a sufficiently large (or negative) logical address.
 *
 * Logical 0xFFFFFFFF (if base = 0x100000):
 *   Physical = 0x100000 + 0xFFFFFFFF = overflows!
 *   Or wraps to address outside program's region.
 *
 * This is why we need the LIMIT register.
 */
```

From Simple Translation to Paging:
Base register translation's fundamental limitation is the need for contiguous physical memory. As systems ran more programs and used more memory, external fragmentation made large contiguous allocations rare.
Paging solved this by:
- Dividing the logical address space into fixed-size pages (typically 4 KB)
- Dividing physical memory into frames of the same size
- Allowing any page to map to any frame, recorded in a per-process page table
Now a program's logical space appears contiguous, but its physical frames can be scattered throughout RAM. This is the translation mechanism that dominates modern systems.
Paging decouples the granularity of allocation (a whole page) from the granularity of contiguity (also a page). You need contiguous physical memory only within a single page—and pages are small enough that this is never a problem. The logical-to-physical mapping handles the rest.
Let's examine exactly how address translation works in a paging-based system—the dominant approach in modern operating systems.
Page-Based Translation Steps:
Extract Page Number and Offset: The logical address is split into two parts:
- Page number: the high-order bits, used to select a page table entry
- Offset: the low-order bits (12 bits for 4 KB pages), the byte's position within the page
Look Up Frame Number: Use the page number as an index into the page table to find the corresponding physical frame number.
Combine Frame Number and Offset: The physical address is formed by:
Physical Address = (Frame Number × Page Size) + Offset
The offset passes through unchanged; only the page number is translated.
Check Permissions: The page table entry also contains permission bits. The translation fails if access violates permissions.
```c
/*
 * Page-Based Address Translation
 *
 * This is the fundamental translation mechanism of modern systems.
 * The actual implementation is in hardware (MMU), but the logic is:
 */

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define PAGE_SIZE  4096   // 4 KB = 2^12 bytes
#define PAGE_SHIFT 12     // log2(PAGE_SIZE)
#define PAGE_MASK  0xFFF  // Mask for offset (12 bits)

#define PTE_PRESENT  (1 << 0)  // Page is in physical memory
#define PTE_WRITABLE (1 << 1)  // Page can be written
#define PTE_USER     (1 << 2)  // Page accessible from user mode
#define PTE_ACCESSED (1 << 5)  // Page has been read
#define PTE_DIRTY    (1 << 6)  // Page has been written

typedef uint64_t PageTableEntry;

// The page table - indexed by page number
// In reality, this is a hierarchical structure (multi-level paging)
PageTableEntry page_table[1048576];  // 2^20 entries for 32-bit address

// Extract page number from logical address
uint64_t get_page_number(uint64_t logical_addr) {
    return logical_addr >> PAGE_SHIFT;
}

// Extract page offset from logical address
uint64_t get_page_offset(uint64_t logical_addr) {
    return logical_addr & PAGE_MASK;
}

// Extract frame number from page table entry
uint64_t get_frame_number(PageTableEntry pte) {
    return pte >> PAGE_SHIFT;  // Frame number stored in upper bits
}

// Translate logical to physical address
typedef struct {
    bool valid;
    uint64_t physical_addr;
    const char* error;
} TranslationResult;

TranslationResult translate(uint64_t logical_addr, bool is_write, bool is_user) {
    TranslationResult result = {false, 0, NULL};

    // Step 1: Split logical address
    uint64_t page_num = get_page_number(logical_addr);
    uint64_t offset = get_page_offset(logical_addr);

    // Step 2: Look up page table entry
    PageTableEntry pte = page_table[page_num];

    // Step 3: Check if page is present
    if (!(pte & PTE_PRESENT)) {
        result.error = "Page fault: page not present";
        return result;  // Triggers page fault exception
    }

    // Step 4: Check permissions
    if (is_write && !(pte & PTE_WRITABLE)) {
        result.error = "Protection fault: page not writable";
        return result;  // Triggers protection fault
    }
    if (is_user && !(pte & PTE_USER)) {
        result.error = "Protection fault: kernel page from user mode";
        return result;  // Triggers protection fault
    }

    // Step 5: Extract frame number and form physical address
    uint64_t frame_num = get_frame_number(pte);
    uint64_t physical_addr = (frame_num << PAGE_SHIFT) | offset;

    // Step 6: Update accessed/dirty bits
    page_table[page_num] |= PTE_ACCESSED;
    if (is_write) {
        page_table[page_num] |= PTE_DIRTY;
    }

    result.valid = true;
    result.physical_addr = physical_addr;
    return result;
}

/*
 * Example walkthrough:
 *
 * Logical Address: 0x12345678
 * Page Size: 4 KB (4096 bytes)
 *
 * Step 1: Split address
 *   Page Number = 0x12345678 >> 12 = 0x12345
 *   Page Offset = 0x12345678 & 0xFFF = 0x678
 *
 * Step 2: Look up page_table[0x12345]
 *   Suppose entry contains: 0x7ABCD003
 *   (Frame 0x7ABCD, Present + Writable flags)
 *
 * Step 3: Check present bit (0x003 & 0x001 = 1) ✓
 *
 * Step 4: Check permissions ✓
 *
 * Step 5: Form physical address
 *   Frame Number = 0x7ABCD003 >> 12 = 0x7ABCD
 *   Physical Address = (0x7ABCD << 12) | 0x678 = 0x7ABCD678
 *
 * Translation: 0x12345678 → 0x7ABCD678
 */

void demonstrate_translation() {
    // Set up a sample page table entry
    // Logical page 0x12345 maps to physical frame 0x7ABCD
    page_table[0x12345] = (0x7ABCDULL << PAGE_SHIFT) |
                          PTE_PRESENT | PTE_WRITABLE | PTE_USER;

    uint64_t logical = 0x12345678;
    TranslationResult result = translate(logical, false, true);

    if (result.valid) {
        printf("Logical 0x%08llx -> Physical 0x%08llx\n",
               (unsigned long long)logical,
               (unsigned long long)result.physical_addr);
    } else {
        printf("Translation failed: %s\n", result.error);
    }
}
```

The translation logic shown above is implemented in hardware (the MMU). Software never performs this calculation during normal operation—it would be far too slow. The OS's role is to set up the page tables; the hardware consults them automatically on every memory access.
Address translation does far more than convert addresses—it's the enforcement mechanism for memory protection. Every page table entry contains permission bits that the MMU checks on every access.
| Bit | Name | Meaning if Set | Typical Use |
|---|---|---|---|
| P | Present | Page is in physical memory | Set for valid mappings |
| R/W | Read/Write | Page is writable | Clear for read-only code/data |
| U/S | User/Supervisor | Page accessible from user mode | Clear for kernel-only pages |
| PWT | Page Write-Through | Cache writes go directly to memory | Device memory, not normal RAM |
| PCD | Page Cache Disable | Do not cache this page | MMIO regions |
| A | Accessed | Page has been accessed (read/written) | OS uses for page replacement |
| D | Dirty | Page has been written | OS knows what needs writing to disk |
| NX/XD | No Execute | Page cannot be executed | Data pages, stack (security) |
Protection in Action:
When the MMU translates an address, it simultaneously checks:
- Present: is the page mapped at all?
- Read/Write: if this is a write, is the page writable?
- User/Supervisor: if the CPU is in user mode, is the page user-accessible?
- No-Execute: if this is an instruction fetch, is the page executable?
These checks happen in hardware, at no additional cost beyond the translation itself. This is how modern systems provide security—every memory access is validated before it occurs.
```c
/*
 * Memory Protection Scenarios
 *
 * The translation mechanism enforces all of these protections
 * in hardware, at full memory bus speed.
 */

/*
 * SCENARIO 1: Null Pointer Dereference
 *
 * The first page (addresses 0x0000 - 0x0FFF) is intentionally
 * not mapped in any process. Accessing it causes a fault.
 */
void null_pointer_example() {
    int *ptr = NULL;  // Address 0x0
    // int x = *ptr;  // Would trigger: page_table[0] has Present=0
    // Result: Page Fault, process terminated
}

/*
 * SCENARIO 2: Writing to Read-Only Memory (Code Segment)
 *
 * The text (code) segment is mapped as read-only + execute.
 * Attempting to write causes a protection fault.
 */
extern void some_function(void);

void modify_code_example() {
    // Function code is at some address, e.g., 0x401000
    // Page table entry for page 0x401 has: R/W = 0 (read-only)

    // unsigned char* code = (unsigned char*)some_function;
    // *code = 0x90;  // Would trigger: write to read-only page
    // Result: Protection Fault
}

/*
 * SCENARIO 3: Accessing Kernel Memory from User Space
 *
 * Kernel memory is mapped with U/S = 0 (supervisor only).
 * User-mode code cannot access it.
 */
void kernel_access_example() {
    // Kernel memory often at high addresses, e.g., 0xFFFF8000...
    // Page table entry has: U/S = 0 (supervisor only)

    // int* kernel_data = (int*)0xFFFF800000000000;
    // int x = *kernel_data;  // Would trigger: user access to supervisor page
    // Result: Protection Fault (general protection fault)
}

/*
 * SCENARIO 4: Executing Data (Buffer Overflow Defense)
 *
 * Stack and heap are mapped with NX (No Execute) bit set.
 * Jump to stack/heap causes execution fault.
 */
void execute_data_example() {
    // Stack is at high addresses, e.g., 0x7FFF...
    // Page table entry has: NX = 1 (no execute)

    unsigned char shellcode[] = { 0x90, 0x90, 0xC3 };  // NOP NOP RET
    // void (*func)(void) = (void(*)(void))shellcode;
    // func();  // Would trigger: execute on non-executable page
    // Result: Protection Fault (prevents code injection attacks)
}

/*
 * SCENARIO 5: Stack Guard Page
 *
 * A guard page is placed at the stack's maximum extent.
 * Stack overflow hits the guard page before corrupting memory.
 */
void stack_overflow_example() {
    // If function recurses too deeply, stack grows toward guard page
    // Guard page has: Present = 0 or no mapping

    // Accessing guard page triggers fault before stack
    // corrupts adjacent memory regions

    // stack_overflow_example();  // Eventually hits guard page
    // Result: Page Fault, SIGSEGV
}

/*
 * All these protections are enforced by the SAME mechanism:
 *
 * 1. CPU generates logical address
 * 2. MMU looks up page table entry
 * 3. MMU checks permissions against access type
 * 4. If check fails: raise exception, OS handles
 * 5. If check passes: physical access proceeds
 *
 * Zero overhead for valid accesses; instant detection of violations.
 */
```

The No-Execute bit, introduced on AMD64 and adopted by Intel as XD (Execute Disable), was a crucial security enhancement. Before NX, any memory page could be executed. Buffer overflow attacks could inject code into the stack or heap and jump to it. With NX, only pages explicitly marked executable can run code—most buffer overflow exploits fail immediately. This single bit in the page table entry has prevented millions of attacks.
While address translation provides isolation by default—each process has its own mappings—it also enables the opposite: controlled sharing of physical memory between processes. This is achieved by having multiple page table entries point to the same physical frame.
Types of Shared Memory:
1. Shared Libraries (Read-Only Sharing)
When multiple processes use the same library (libc, libm, GUI libraries), loading separate copies would waste physical memory. Instead:
- The library's code is loaded into physical memory once
- Each process's page table maps its own logical addresses to those same frames
- The pages are marked read-only (and executable), so no process can corrupt the shared copy
2. Inter-Process Communication (Read-Write Sharing)
Processes can explicitly share memory regions for fast communication:
- One process creates a shared region (e.g., POSIX shm_open + mmap, or System V shmget)
- Other processes map the same region into their own address spaces
- Both sets of page table entries point to the same physical frames, so a write by one process is immediately visible to the other—no copying, no system calls per message
3. Copy-on-Write (Lazy Sharing)
After fork(), parent and child share all memory:
- Both page tables initially point to the same frames, all marked read-only
- The first write by either process faults, and the OS copies just that one page
- Pages that are never written remain shared for the lifetime of both processes
```c
/*
 * Copy-on-Write: Fork Without Full Copy
 *
 * One of the most elegant uses of address translation
 */

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int main() {
    // Allocate 100 MB of data
    size_t size = 100 * 1024 * 1024;
    char* data = malloc(size);
    for (size_t i = 0; i < size; i++) {
        data[i] = 'A';  // Initialize all to 'A'
    }
    printf("Before fork: 100 MB allocated and initialized\n");

    /*
     * WHAT HAPPENS AT FORK:
     *
     * Without COW (hypothetically):
     * - OS would copy all 100 MB
     * - Both processes would have identical but separate copies
     * - 200 MB of physical memory used
     * - Fork takes time proportional to memory size
     *
     * With COW (reality):
     * - OS copies only page TABLE, not pages themselves
     * - Both processes' page tables point to SAME physical frames
     * - All pages marked READ-ONLY
     * - 100 MB of physical memory used
     * - Fork takes O(1) time (just duplicate small metadata)
     */
    pid_t pid = fork();

    if (pid == 0) {
        // Child process
        printf("Child: About to write to data[0]...\n");

        /*
         * WHEN CHILD WRITES:
         *
         * 1. Child tries to write to data[0] (first page)
         * 2. Page is marked read-only, so MMU triggers fault
         * 3. OS sees it's a COW page (has special flag in kernel)
         * 4. OS allocates NEW frame for this one page
         * 5. OS copies page content to new frame
         * 6. OS updates child's page table to point to new frame
         * 7. OS marks the page writable in child
         * 8. Child's write now succeeds
         *
         * Result: Only 4 KB copied, not 100 MB!
         */
        data[0] = 'B';
        printf("Child: data[0] = %c (modified)\n", data[0]);
        printf("Child: data[1000000] = %c (still shared with parent)\n",
               data[1000000]);
        _exit(0);
    } else {
        // Parent process
        wait(NULL);  // Wait for child

        /*
         * Parent's view:
         *
         * - data[0] is still 'A' (parent has original page)
         * - data[1000000] was never written, so it's still shared
         *
         * Physical memory usage: Original 100 MB + 4 KB (one copied page)
         * Without COW: Would be 200 MB
         */
        printf("Parent: data[0] = %c (unchanged)\n", data[0]);
        printf("Parent: data[1000000] = %c (was never written by child)\n",
               data[1000000]);
    }

    free(data);
    return 0;
}

/*
 * Sample output:
 *
 * Before fork: 100 MB allocated and initialized
 * Child: About to write to data[0]...
 * Child: data[0] = B (modified)
 * Child: data[1000000] = A (still shared with parent)
 * Parent: data[0] = A (unchanged)
 * Parent: data[1000000] = A (was never written by child)
 *
 * Key insight: Most forked processes (like shells spawning commands)
 * immediately exec() a new program, replacing all memory. With COW,
 * we never copy any pages that would just be discarded!
 */
```

On a Linux server running 100 processes, shared library mappings save gigabytes of RAM. The C library alone might be 2 MB; without sharing, that's 200 MB. With sharing, it's 2 MB total. This same principle applies to the kernel (mapped into every process), runtime frameworks, and common data files.
Translation happens on every memory access—if it adds significant latency, the entire system slows down. Understanding the performance implications is crucial for systems programmers and architects.
The Problem: Table Lookup
A simple page table lookup seems straightforward, but modern systems have multi-level page tables (typically 4 or 5 levels). Each level requires a memory access to read the table entry. This means a single address translation could require 4-5 memory accesses—each taking 50-100 nanoseconds!
If every memory access required 4 additional memory accesses for translation, systems would run at ~20% of their potential speed. This is unacceptable.
The Solution: Translation Lookaside Buffer (TLB)
The TLB is a small, fast hardware cache that stores recent translations. Instead of walking page tables for every access:
- Check the TLB first (in parallel with the start of the cache lookup)
- TLB hit: use the cached frame number immediately
- TLB miss: walk the page tables once, then insert the result into the TLB so the next access to that page hits
With TLB hit rates of 99%+, the average translation cost is close to zero.
| Property | Typical Value | Impact |
|---|---|---|
| TLB Size | 64-1536 entries | More entries = higher hit rate |
| TLB Hit Time | 0.5-1 cycles | Essentially free |
| TLB Miss Penalty | 10-100 cycles | Page table walk cost |
| TLB Hit Rate | 95-99%+ | Critical for performance |
| TLB Reach | Entries × Page Size | How much memory is 'cached' |
| TLB Flush Cost | Hundreds of cycles | Context switches hurt |
```c
/*
 * Calculating Effective Memory Access Time
 *
 * This is a key formula for understanding translation overhead
 */

#include <stdio.h>

/*
 * Variables:
 *   m = Memory access time (e.g., 100 ns to main memory)
 *   t = TLB lookup time (e.g., 1 ns)
 *   h = TLB hit rate (e.g., 0.99 for 99%)
 *
 * Without TLB (4-level paging):
 *   EAT = 4m + m = 5m (4 table lookups + 1 actual access)
 *       = 5 × 100 ns = 500 ns
 *
 * With TLB:
 *   EAT = h × (t + m) + (1-h) × (4m + m + t)
 *       = h × (t + m) + (1-h) × (5m + t)
 *
 * Example with 99% hit rate:
 *   EAT = 0.99 × (1 + 100) + 0.01 × (500 + 1)
 *       = 0.99 × 101 + 0.01 × 501
 *       = 99.99 + 5.01
 *       = 105 ns
 *
 * Overhead: 5 ns = 5% (vs 400% without TLB)
 */

void calculate_eat(double hit_rate, double memory_time,
                   double tlb_time, int page_table_levels) {
    // With TLB hit: TLB lookup + memory access
    double hit_cost = tlb_time + memory_time;

    // With TLB miss: TLB lookup + page table walk + memory access
    double miss_cost = tlb_time + (page_table_levels * memory_time) + memory_time;

    // Effective Access Time
    double eat = (hit_rate * hit_cost) + ((1.0 - hit_rate) * miss_cost);

    // Compare to direct access (no translation)
    double overhead = ((eat - memory_time) / memory_time) * 100;

    printf("TLB Hit Rate: %.1f%%\n", hit_rate * 100);
    printf("Memory Access Time: %.0f ns\n", memory_time);
    printf("TLB Lookup Time: %.0f ns\n", tlb_time);
    printf("Page Table Levels: %d\n\n", page_table_levels);
    printf("TLB Hit Cost: %.0f ns\n", hit_cost);
    printf("TLB Miss Cost: %.0f ns\n", miss_cost);
    printf("Effective Access: %.1f ns\n", eat);
    printf("Translation Overhead: %.1f%%\n", overhead);
}

int main() {
    printf("--- Scenario 1: Excellent TLB (99%% hit rate) ---\n");
    calculate_eat(0.99, 100.0, 1.0, 4);

    printf("--- Scenario 2: Poor TLB (80%% hit rate) ---\n");
    calculate_eat(0.80, 100.0, 1.0, 4);

    printf("--- Scenario 3: With Page Table Caching ---\n");
    // Page table entries often in L2/L3 cache, faster than main memory
    printf("If page table entries are cached (avg 20ns access):\n");
    double cached_eat = 0.99 * 101 + 0.01 * (1 + 4*20 + 100);
    printf("Effective Access: %.1f ns (even better!)\n", cached_eat);

    return 0;
}

/*
 * Output:
 *
 * --- Scenario 1: Excellent TLB (99% hit rate) ---
 * TLB Hit Rate: 99.0%
 * Memory Access Time: 100 ns
 * TLB Lookup Time: 1 ns
 * Page Table Levels: 4
 *
 * TLB Hit Cost: 101 ns
 * TLB Miss Cost: 501 ns
 * Effective Access: 105.0 ns
 * Translation Overhead: 5.0%
 *
 * --- Scenario 2: Poor TLB (80% hit rate) ---
 * TLB Hit Rate: 80.0%
 * Memory Access Time: 100 ns
 * TLB Lookup Time: 1 ns
 * Page Table Levels: 4
 *
 * TLB Hit Cost: 101 ns
 * TLB Miss Cost: 501 ns
 * Effective Access: 181.0 ns
 * Translation Overhead: 81.0%
 *
 * Key Insight: TLB hit rate is CRITICAL for performance!
 */
```

When switching between processes, the TLB typically must be flushed because translations are per-process. This means the next process starts with an empty TLB and must suffer misses until it warms up. Frequent context switches thus hurt memory performance significantly. Modern CPUs use Address Space IDs (ASIDs) or Process Context IDs (PCIDs) to allow some TLB entries to persist across switches.
Modern 64-bit systems use multi-level page tables (typically 4 or 5 levels) to support vast address spaces while keeping memory overhead manageable. This introduces complexity into the translation process.
Why Multi-Level?
A single-level page table for a 48-bit address space with 4 KB pages would need 2^36 entries (2^48 bytes / 2^12 bytes per page). At 8 bytes per entry, that is 512 GB of page table per process, regardless of how little memory the process actually uses. This is obviously infeasible.
By using multiple levels, unused regions don't need table entries at all. A sparse address space (common in reality) has most upper-level entries marked 'not present', and the corresponding lower-level tables don't exist.
x86-64 4-Level Paging:
The 48-bit virtual address is divided into four 9-bit table indices plus a 12-bit page offset:
- PML4 (Page Map Level 4): bits 47-39
- PDPT (Page Directory Pointer Table): bits 38-30
- PD (Page Directory): bits 29-21
- PT (Page Table): bits 20-12
- Page offset: bits 11-0
Each level adds flexibility: if an entire 1 GB region is unmapped, we don't need any of the lower tables for it.
```c
/*
 * Multi-Level Page Table Walk (x86-64 4-Level Paging)
 *
 * This shows the full translation process from virtual to physical address.
 */

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

#define PAGE_SIZE  4096
#define PAGE_SHIFT 12

// Each page table has 512 entries (9 bits of address)
#define ENTRIES_PER_TABLE 512
#define ENTRY_BITS 9

// Entry flags (same for all levels)
#define PTE_PRESENT  (1ULL << 0)
#define PTE_WRITABLE (1ULL << 1)
#define PTE_USER     (1ULL << 2)
#define PTE_PS       (1ULL << 7)  // Page Size (for huge pages)

typedef uint64_t PTE;

// Page table structure (one 4KB page of 512 8-byte entries)
typedef struct {
    PTE entries[ENTRIES_PER_TABLE];
} PageTable;

// Extract index for each level from virtual address
// Virtual address format for 4-level paging (48-bit canonical):
//   Bits 47-39: PML4 index (9 bits)
//   Bits 38-30: PDPT index (9 bits)
//   Bits 29-21: PD index (9 bits)
//   Bits 20-12: PT index (9 bits)
//   Bits 11-0:  Page offset (12 bits)

uint16_t get_pml4_index(uint64_t vaddr) {
    return (vaddr >> 39) & 0x1FF;  // Bits 47-39
}

uint16_t get_pdpt_index(uint64_t vaddr) {
    return (vaddr >> 30) & 0x1FF;  // Bits 38-30
}

uint16_t get_pd_index(uint64_t vaddr) {
    return (vaddr >> 21) & 0x1FF;  // Bits 29-21
}

uint16_t get_pt_index(uint64_t vaddr) {
    return (vaddr >> 12) & 0x1FF;  // Bits 20-12
}

uint16_t get_page_offset(uint64_t vaddr) {
    return vaddr & 0xFFF;  // Bits 11-0
}

// Get physical address of next-level table from a PTE
uint64_t get_table_address(PTE entry) {
    // Physical address is in bits 51-12 (40 bits), aligned to 4KB
    return entry & 0x000FFFFFFFFFF000ULL;
}

// Simulated memory read (in real hardware, this goes to memory bus)
PageTable* read_table_from_memory(uint64_t physical_address) {
    // In reality, this accesses physical memory
    // For simulation, return a pointer
    return (PageTable*)(uintptr_t)physical_address;
}

// Full 4-level translation
typedef struct {
    bool valid;
    uint64_t physical_address;
    const char* error;
} TranslationResult;

TranslationResult translate_4level(uint64_t cr3, uint64_t virtual_address) {
    TranslationResult result = {false, 0, NULL};

    printf("Translating virtual address: 0x%016llx\n",
           (unsigned long long)virtual_address);

    // CR3 holds the physical address of the PML4 table
    uint64_t pml4_base = cr3 & 0x000FFFFFFFFFF000ULL;
    printf("  CR3 (PML4 base): 0x%016llx\n", (unsigned long long)pml4_base);

    // Level 1: PML4
    uint16_t pml4_idx = get_pml4_index(virtual_address);
    printf("  PML4 index: %d\n", pml4_idx);

    // (Simplified - in reality would read from physical memory)
    // PageTable* pml4 = read_table_from_memory(pml4_base);
    // PTE pml4_entry = pml4->entries[pml4_idx];
    // For demonstration, assume entry is present

    // Continue through PDPT, PD, PT...
    // (Each level follows the same pattern)
    uint16_t pdpt_idx = get_pdpt_index(virtual_address);
    uint16_t pd_idx = get_pd_index(virtual_address);
    uint16_t pt_idx = get_pt_index(virtual_address);
    uint16_t offset = get_page_offset(virtual_address);

    printf("  PDPT index: %d\n", pdpt_idx);
    printf("  PD index: %d\n", pd_idx);
    printf("  PT index: %d\n", pt_idx);
    printf("  Offset: 0x%03x\n", offset);

    // Final translation (simulated):
    // physical_frame = PT[pt_idx] >> PAGE_SHIFT
    // physical_address = (physical_frame << PAGE_SHIFT) | offset

    result.valid = true;
    // result.physical_address = simulated_physical;
    return result;
}

/*
 * Example page table walk:
 *
 * Virtual Address: 0x00007FFFCBA98765
 *
 * 1. CR3 = 0x12345000 (PML4 at physical 0x12345000)
 * 2. PML4 index = (0x7FFFCBA98765 >> 39) & 0x1FF = 255
 *    Read PML4[255] from physical 0x12345000 + 255*8
 *    Suppose it contains 0x67890003 (present, writable, user)
 *    PDPT is at physical 0x67890000
 *
 * 3. PDPT index = (0x7FFFCBA98765 >> 30) & 0x1FF = 511
 *    Read PDPT[511] from physical 0x67890000 + 511*8
 *    Suppose it contains 0xABCDE003
 *    PD is at physical 0xABCDE000
 *
 * 4. PD index = (0x7FFFCBA98765 >> 21) & 0x1FF = 93
 *    Read PD[93] from physical 0xABCDE000 + 93*8
 *    Suppose it contains 0x11111003
 *    PT is at physical 0x11111000
 *
 * 5. PT index = (0x7FFFCBA98765 >> 12) & 0x1FF = 152
 *    Read PT[152] from physical 0x11111000 + 152*8
 *    Suppose it contains 0x22222003
 *    Frame is at physical 0x22222000
 *
 * 6. Offset = 0x7FFFCBA98765 & 0xFFF = 0x765
 *    Physical Address = 0x22222000 + 0x765 = 0x22222765
 *
 * Total: 4 memory accesses for translation + 1 for actual data = 5 accesses
 * (Without TLB, this would be devastating for performance!)
 */
```

Modern CPUs include dedicated hardware called the Page Table Walker (PTW) or Page Miss Handler (PMH). On a TLB miss, the PTW autonomously reads through the page table levels, often prefetching likely-needed entries. Some CPUs can have multiple outstanding page walks in parallel, hiding much of the latency. The PTW is one of the more complex pieces of memory hardware.
We've explored address translation—the mechanism that bridges logical and physical address spaces, happening billions of times per second in your computer right now.
What's Next:
We've described how translation works conceptually. Next, we'll examine the hardware that makes it all possible: the Memory Management Unit (MMU). This specialized processor component performs translations, checks permissions, handles TLB management, and raises faults—all at memory bus speed.
You now understand address translation as the bridge between logical and physical address spaces. This knowledge is essential for understanding how operating systems provide isolation, protection, and virtual memory. The next step is to examine the hardware—the MMU—that performs this translation in real time, enabling the performance that modern systems require.