When a process accesses memory address 0x7FFA4B20, how does the hardware know which physical memory location actually contains that data? The answer lies in one of the most elegant and performance-critical data structures in computer architecture: the page table.
Every memory access in a modern system—every instruction fetch, every variable read, every buffer write—involves a translation from virtual to physical addresses. This translation must happen fast (nanoseconds), correctly (no security holes), and efficiently (minimal memory overhead). The page table is the bridge that makes all of this possible.
By the end of this page, you will understand the fundamental structure of page tables, why they are designed the way they are, how they organize the virtual-to-physical mapping, and the key design trade-offs that shape their implementation in real operating systems and hardware.
To understand page table structure, we must first understand the problem they solve. In a paged virtual memory system:
- The virtual address space is divided into fixed-size pages (commonly 4KB)
- Physical memory is divided into frames of the same size
- Any virtual page can be placed in any free physical frame, so a process's pages need not be contiguous in physical memory
This non-contiguous allocation provides tremendous flexibility—eliminating external fragmentation and enabling memory sharing—but creates a fundamental challenge: how do we track which virtual page maps to which physical frame?
Consider a 64-bit virtual address space with 4KB pages. That's potentially 2^52 virtual pages per process. We need a structure that can map any of those pages to physical frames, support multiple processes simultaneously, and be accessed billions of times per second. The page table design must balance completeness, speed, and space efficiency.
The Conceptual Model:
At its simplest, a page table is an associative array (like a dictionary or map) where:
- The key is the virtual page number (VPN)
- The value is the physical frame number (PFN), along with protection and status bits
When the CPU generates a virtual address, the Memory Management Unit (MMU) extracts the virtual page number, looks it up in the page table, retrieves the corresponding frame number, and combines it with the offset to form the physical address.
The Engineering Challenge:
A naive implementation as a simple array indexed by VPN would be catastrophically wasteful. A 32-bit address space with 4KB pages has 2^20 pages, requiring over a million entries per process—before we even consider 64-bit systems. The page table structure must be clever enough to handle sparse address spaces efficiently.
The simplest page table structure is a linear array indexed directly by the virtual page number. This approach is conceptually straightforward and was used in early systems, but understanding its characteristics illuminates why more complex structures became necessary.
Structure:
Page Table = Array[0 ... (Virtual_Address_Space_Size / Page_Size) - 1]
page_table[VPN] = PTE (Page Table Entry)
Each entry in the array is a Page Table Entry (PTE) containing:
- The physical frame number (PFN) the page maps to
- A valid/present bit indicating whether the mapping exists
- Protection bits (read/write, user/kernel)
- Status bits such as accessed and dirty
| Address Space | Page Size | Number of PTEs | Table Size (4B entries) | Table Size (8B entries) |
|---|---|---|---|---|
| 32-bit (4GB) | 4KB | 1,048,576 (2²⁰) | 4 MB | 8 MB |
| 32-bit (4GB) | 4MB (huge) | 1,024 (2¹⁰) | 4 KB | 8 KB |
| 48-bit (256TB) | 4KB | 68,719,476,736 (2³⁶) | 256 GB | 512 GB |
| 64-bit (16EB) | 4KB | 4,503,599,627,370,496 (2⁵²) | 16 PB | 32 PB |
Most processes use only a tiny fraction of their virtual address space. A typical process might have: code (~1MB), heap (~10MB), stack (~1MB), shared libraries (~50MB)—totaling well under 100MB of actual mappings. Yet with a 48-bit virtual address space and 8-byte entries, a linear table would need 512GB just to hold the mappings! This is why linear page tables are impractical for modern systems.
Why Study Linear Tables?
Despite their impracticality for large address spaces, linear page tables illuminate key concepts:
- Constant-time lookup: the entry for any page sits at page_table_base + (VPN × entry_size), so a single index computation finds it directly
- Simple hardware: the MMU needs only a base register and an adder to locate any PTE

These properties—especially fast lookup—remain design goals for all page table structures. The challenge is achieving similar speed while handling sparse address spaces efficiently.
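To make the lookup concrete, here is a minimal sketch of linear-table translation in C (the names `page_table`, `PAGE_SHIFT`, and `PTE_PRESENT` are illustrative; real MMUs perform these steps in hardware):

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SHIFT   12                      /* 4KB pages: low 12 bits are the offset */
#define PAGE_SIZE    (1u << PAGE_SHIFT)
#define PTE_PRESENT  0x1u                    /* bit 0: mapping is valid and in memory */

/* One entry: frame number in the high bits, flag bits in the low bits. */
typedef uint32_t pte_t;

/* Translate a virtual address using a linear page table, or report a fault. */
bool translate(const pte_t *page_table, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn    = vaddr >> PAGE_SHIFT;        /* virtual page number */
    uint32_t offset = vaddr & (PAGE_SIZE - 1);    /* byte offset within the page */

    pte_t pte = page_table[vpn];                  /* single array index: O(1) lookup */
    if (!(pte & PTE_PRESENT))
        return false;                             /* page fault: not mapped */

    uint32_t frame = pte >> PAGE_SHIFT;           /* physical frame number */
    *paddr = (frame << PAGE_SHIFT) | offset;      /* frame base + offset */
    return true;
}
```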
A page table must support several fundamental operations efficiently:
Core Operations:
- Translate: given a VPN, return the PFN and permission bits (the hot path, exercised on every memory access)
- Map: install a new VPN → PFN mapping when memory is allocated or a file is mapped
- Unmap: remove a mapping and free the corresponding entries
- Protect: change the permission bits of an existing mapping (e.g., marking a page read-only)
The organization of the page table directly affects the performance of each operation.
Memory Layout Regions:
A typical process's virtual address space has distinct regions that influence page table organization:
+---------------------------+ 0xFFFFFFFF (or higher)
| Kernel Space | ← Shared across processes
+---------------------------+
| Stack ↓ | ← Grows downward
| ... |
| Heap ↑ | ← Grows upward
+---------------------------+
| Uninitialized Data | ← BSS segment
+---------------------------+
| Initialized Data | ← Data segment
+---------------------------+
| Code | ← Text segment
+---------------------------+ 0x00000000 (or near it)
Notice the large gap between heap and stack. This sparse region means most of the address space is unmapped. An efficient page table structure allocates entries only for the actually-mapped regions.
The solution to the sparsity problem is hierarchical (multi-level) page tables. Instead of one giant array, we use a tree structure where:
- The virtual page number is split into several index fields, one per level
- The top-level (root) table contains pointers to second-level tables, which may point to further levels
- Only the leaf-level tables contain the actual frame mappings (PTEs)
- A table at any level is allocated only if some address in its range is actually mapped
Key Insight:
If an entire region of the address space is unmapped, we simply don't allocate the corresponding subtable. A single null pointer in the root can eliminate millions of unused entries.
Two-Level Example (32-bit, 4KB pages):
A 32-bit virtual address has 20 bits for the page number (4GB / 4KB = 2²⁰ pages). We split this into:
- A 10-bit page directory index (selects one of 1,024 directory entries)
- A 10-bit page table index (selects one of 1,024 entries in that page table)
- The remaining 12 bits form the offset within the 4KB page
32-bit Virtual Address Breakdown (Two-Level Paging):

┌─────────────────┬─────────────────┬────────────────┐
│    Directory    │      Table      │     Offset     │
│   Index (10b)   │   Index (10b)   │   (12 bits)    │
├─────────────────┼─────────────────┼────────────────┤
│  31 ─────── 22  │  21 ─────── 12  │  11 ─────── 0  │
└─────────────────┴─────────────────┴────────────────┘

Translation Process:
1. CR3 register points to Page Directory base
2. Directory[bits 31-22] → Page Table base (or null)
3. PageTable[bits 21-12] → Frame Number (or null)
4. Physical Address = Frame Number + Offset

Space Savings Analysis:
Consider a process using only:
- A few megabytes of code near the bottom of the address space
- A few megabytes of heap
- A small stack near the top
Each region falls within a single 4MB range, so each needs only one second-level page table.
With a linear table: 4MB (1M entries × 4 bytes)
With two-level tables:
- One page directory: 4KB
- Three page tables (one per mapped region): 3 × 4KB = 12KB
- Total: 16KB—a 256× reduction
The remaining ~1021 page directory entries are null, representing unmapped regions without wasting memory.
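A software simulation of the two-level walk described above might look like the following sketch (C, with made-up types `pde_t`/`pte_t` and a simplified representation of directory entries; the hardware page-table walker performs the same steps):

```c
#include <stdint.h>
#include <stdbool.h>

#define ENTRY_PRESENT 0x1u

typedef uint32_t pde_t;   /* page directory entry */
typedef uint32_t pte_t;   /* page table entry */

/* Simulated two-level walk: directory index (bits 31-22), table index (bits 21-12). */
bool walk_two_level(const pde_t *page_directory, uint32_t vaddr, uint32_t *paddr)
{
    uint32_t dir_index   = (vaddr >> 22) & 0x3FF;   /* top 10 bits */
    uint32_t table_index = (vaddr >> 12) & 0x3FF;   /* next 10 bits */
    uint32_t offset      = vaddr & 0xFFF;           /* low 12 bits */

    pde_t pde = page_directory[dir_index];
    if (!(pde & ENTRY_PRESENT))
        return false;                               /* whole 4MB region unmapped */

    /* In this simulation we treat the PDE's address field as a pointer to the
       next-level table; real hardware reads it as a physical frame number. */
    const pte_t *page_table = (const pte_t *)(uintptr_t)(pde & ~0xFFFu);

    pte_t pte = page_table[table_index];
    if (!(pte & ENTRY_PRESENT))
        return false;                               /* individual page unmapped */

    *paddr = (pte & ~0xFFFu) | offset;              /* frame base + offset */
    return true;
}
```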
64-bit systems like x86-64 use 4-level page tables (PML4 → PDPT → PD → PT), with 9 bits per level plus 12-bit offset, addressing 48 bits of virtual address space. Intel's 5-level paging extends this to 57 bits. Each additional level allows handling larger address spaces while maintaining sparse efficiency.
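As a rough illustration of that 4-level split (helper names here are made up, not the architecture manual's), the indices fall out of simple shifts and masks over the 48-bit virtual address:

```c
#include <stdint.h>

/* x86-64 4-level paging: 9 bits per level, 12-bit page offset (48-bit VA). */
static inline uint64_t pml4_index(uint64_t va)  { return (va >> 39) & 0x1FF; }
static inline uint64_t pdpt_index(uint64_t va)  { return (va >> 30) & 0x1FF; }
static inline uint64_t pd_index(uint64_t va)    { return (va >> 21) & 0x1FF; }
static inline uint64_t pt_index(uint64_t va)    { return (va >> 12) & 0x1FF; }
static inline uint64_t page_offset(uint64_t va) { return va & 0xFFF; }
```

Each 9-bit index selects one of 512 eight-byte entries in a 4KB table; 5-level paging simply adds one more 9-bit level on top.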
Each entry in a page table—called a Page Table Entry (PTE)—contains far more than just the frame number. The PTE is a carefully designed bit field that controls translation, protection, and status tracking.
Anatomy of a Page Table Entry:
A typical PTE contains:
- The physical frame number (PFN) the virtual page maps to
- A present/valid bit indicating whether the translation can be used
- Protection bits: read/write and user/supervisor
- Status bits set by hardware: accessed and dirty
- Caching-control bits, plus a few bits left available to the OS
x86 32-bit Page Table Entry (4KB pages):

┌─────────────┬────────┬───┬─────┬───┬───┬─────┬─────┬─────┬─────┬───┐
│  31 ... 12  │ 11...9 │ 8 │  7  │ 6 │ 5 │  4  │  3  │  2  │  1  │ 0 │
├─────────────┼────────┼───┼─────┼───┼───┼─────┼─────┼─────┼─────┼───┤
│     PFN     │  Avl   │ G │ PAT │ D │ A │ PCD │ PWT │ U/S │ R/W │ P │
└─────────────┴────────┴───┴─────┴───┴───┴─────┴─────┴─────┴─────┴───┘

Bit Fields:
  P   (0)    : Present - page is in physical memory
  R/W (1)    : Read/Write - 0=read-only, 1=read-write
  U/S (2)    : User/Supervisor - 0=kernel only, 1=user accessible
  PWT (3)    : Page Write-Through - cache write policy
  PCD (4)    : Page Cache Disable - disable caching
  A   (5)    : Accessed - set by hardware on any access
  D   (6)    : Dirty - set by hardware on write
  PAT (7)    : Page Attribute Table index (extended caching)
  G   (8)    : Global - don't flush from TLB on CR3 switch
  Avl (9-11) : Available for OS use
  PFN (12-31): Physical Frame Number

Why Each Field Matters:
Present Bit (P): The most critical bit. When P=0, the entire rest of the entry can be used by the OS for any purpose (e.g., storing swap location). Hardware will fault on access, and the OS can handle it.
Read/Write and User/Supervisor: These form the core protection matrix. A page might be:
- User-accessible, read-only (program code)
- User-accessible, read-write (heap, stack, data)
- Kernel-only, read-write (kernel data structures)
- Kernel-only, read-only (kernel code)
Accessed and Dirty Bits: These are set automatically by hardware, enabling the OS to implement page replacement algorithms without trapping every memory access. The OS periodically clears these bits to track recent usage patterns.
Some PTE bits are managed by hardware (Accessed, Dirty), some by software (protection bits), and some by both (Present). Modern architectures also provide 'available' bits that the OS can use for any purpose—commonly used for tracking page states, copy-on-write markers, or swap location metadata.
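As a small illustration (the mask names and the `pte_describe` helper are made up; the bit positions follow the 32-bit x86 layout shown above), the OS-visible side of a PTE can be decoded with simple masks:

```c
#include <stdint.h>
#include <stdio.h>

/* Bit positions from the 32-bit x86 PTE layout shown above. */
#define PTE_P    (1u << 0)           /* present */
#define PTE_RW   (1u << 1)           /* 1 = writable */
#define PTE_US   (1u << 2)           /* 1 = user accessible */
#define PTE_A    (1u << 5)           /* accessed (set by hardware) */
#define PTE_D    (1u << 6)           /* dirty (set by hardware on write) */
#define PTE_PFN(pte) ((pte) >> 12)   /* physical frame number */

void pte_describe(uint32_t pte)
{
    if (!(pte & PTE_P)) {
        /* Not present: the remaining bits are free for the OS,
           e.g. a swap slot identifier. */
        printf("not present (remaining bits available to the OS: 0x%08x)\n", pte);
        return;
    }
    printf("frame %u, %s, %s%s%s\n",
           PTE_PFN(pte),
           (pte & PTE_RW) ? "read-write" : "read-only",
           (pte & PTE_US) ? "user"       : "kernel-only",
           (pte & PTE_A)  ? ", accessed" : "",
           (pte & PTE_D)  ? ", dirty"    : "");
}
```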
Page table size is a critical concern in system design. Tables consume physical memory, and in a system with many processes, this overhead can be substantial. Understanding these trade-offs is essential for both OS designers and systems programmers.
Factors Affecting Page Table Size:
- Virtual address space size and the number of paging levels
- Page size (larger pages mean fewer entries)
- How much of the address space is actually mapped, and how sparsely
- PTE size (typically 4 bytes on 32-bit systems, 8 bytes on 64-bit)
| Scenario | Configuration | Page Table Size | Notes |
|---|---|---|---|
| Simple 32-bit process | 4KB pages, 2-level, 20MB mapped | ~24KB | 1 PD + 5 PTs |
| Complex 32-bit process | 4KB pages, 2-level, 500MB mapped | ~516KB | 1 PD + 128 PTs |
| 64-bit server process | 4KB pages, 4-level, 4GB mapped | ~8MB | Spread across many PTs |
| Browser with many tabs | 64-bit, shared libs, 8GB | ~24MB | Includes shared mappings |
| Database server | 64-bit, huge pages, 256GB | ~2MB | Huge pages reduce entries |
Optimization Strategies:
Large/Huge Pages: Using 2MB or 1GB pages instead of 4KB dramatically reduces page table size (a rough calculation follows below):
- A 2MB page replaces 512 separate 4KB PTEs with a single entry at the page-directory level
- A 1GB page replaces 262,144 4KB PTEs—and the page tables that would hold them—with one entry
- Fewer entries also mean fewer TLB slots consumed per gigabyte of mapped memory
Trade-off: Larger pages increase internal fragmentation and reduce sharing granularity.
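As a back-of-the-envelope check (assuming 8-byte PTEs, as on x86-64), the leaf-level cost of mapping a 256GB region at different page sizes works out as follows:

```c
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint64_t region   = 256ull << 30;   /* 256 GB mapped region */
    uint64_t pte_size = 8;              /* 8-byte entries (x86-64) */

    uint64_t sizes[] = { 4096, 2ull << 20, 1ull << 30 };   /* 4KB, 2MB, 1GB pages */
    for (int i = 0; i < 3; i++) {
        uint64_t entries = region / sizes[i];
        printf("page size %10llu B -> %9llu leaf entries, %8llu KB of PTEs\n",
               (unsigned long long)sizes[i],
               (unsigned long long)entries,
               (unsigned long long)(entries * pte_size / 1024));
    }
    return 0;
}
```

With 4KB pages the leaf entries alone cost ~512MB; with 2MB pages, ~1MB; with 1GB pages, ~2KB—consistent with the database-server row in the table above.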
Page Table Sharing:
Read-only regions (like shared libraries) can share page table entries across processes. The kernel maps libc.so once and points multiple processes' PTEs to the same frames.
Lazy Allocation: Page tables themselves can be demand-allocated. A page directory entry stays null until a fault occurs in that region, at which point the OS allocates the needed page table.
In extreme memory pressure, even page tables themselves can be swapped out. This leads to severe performance degradation: accessing any page in a swapped-out region requires first paging in the page table, then handling the actual page fault. Some systems pin critical page tables in memory to avoid this scenario.
One of the most important functions of page tables is process isolation—ensuring that one process cannot access another's memory without explicit permission. This isolation is the foundation of multi-process operating systems and modern security models.
How Isolation Works:
Each process has its own page table, and on every context switch the OS points the page table base register at the current process's table. Even if two processes use the same virtual address (e.g., both have code at 0x400000), their separate tables map it to different physical frames.
Process Isolation Example:

  Process A's Page Table:          Process B's Page Table:
┌─────────────────────────┐      ┌─────────────────────────┐
│ VPN 0x400 → Frame 100   │      │ VPN 0x400 → Frame 250   │
│ VPN 0x401 → Frame 101   │      │ VPN 0x401 → Frame 251   │
│ VPN 0x500 → Frame 200   │      │ VPN 0x500 → Frame 300   │
│ VPN 0x600 → Invalid     │      │ VPN 0x600 → Frame 350   │
└─────────────────────────┘      └─────────────────────────┘

When Process A runs (CR3 → Process A's PT):
  Virtual 0x400000 → Physical 0x64000 (Frame 100)

When Process B runs (CR3 → Process B's PT):
  Virtual 0x400000 → Physical 0xFA000 (Frame 250)

Same virtual address, completely different physical memory!

Security Implications:
Because a process can only reach memory its own page table maps, it cannot read or modify another process's frames—no PTE points at them. And since only the kernel can modify page tables, isolation holds even against buggy or malicious user code.
Shared Regions:
Despite isolation, processes need to share some memory:
- Shared libraries (e.g., libc.so) are mapped read-only into many processes
- mmap() with MAP_SHARED creates explicitly shared mappings

These shared regions appear in multiple page tables pointing to the same physical frames, with appropriate protection bits.
Most operating systems map the kernel into the upper portion of every process's virtual address space (e.g., above 0xFFFF800000000000 on Linux x86-64). This allows system calls to execute without changing page tables, improving performance. The kernel pages are marked supervisor-only, so user code cannot access them despite them being 'present' in the page table.
Page tables require intimate cooperation between hardware and software. The CPU provides dedicated registers, the MMU implements the table walk algorithm, and the OS manages table contents. Understanding this division is crucial for systems programming.
Critical Hardware Components:
- A page table base register identifying the current process's root table
- The MMU's page-table walker, which reads the tables on a TLB miss
- The Translation Lookaside Buffer (TLB), which caches recent translations
Page Table Base Register:
This register—CR3 on x86, TTBR0/TTBR1 on ARM, satp on RISC-V—holds the physical address of the root page table. When the OS switches processes, it loads the new process's root table address into this register.
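On x86-64 the switch itself is a single privileged register write. A simplified kernel-side sketch is shown below (the function names and the pgd argument are illustrative, not any real kernel's interface):

```c
#include <stdint.h>

/* Load a new root page table. Writing CR3 also flushes non-global TLB
   entries, which is why the Global bit matters for kernel mappings. */
static inline void load_root_page_table(uint64_t pgd_phys_addr)
{
    __asm__ volatile("mov %0, %%cr3" : : "r"(pgd_phys_addr) : "memory");
}

/* Called during a context switch, after the scheduler picks the next task. */
void switch_address_space(uint64_t next_pgd_phys_addr)
{
    load_root_page_table(next_pgd_phys_addr);
}
```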
Software Responsibilities:
The operating system must:
- Allocate and initialize page tables when a process is created, and free them on exit
- Install and remove mappings as memory is allocated, mapped, or released
- Handle page faults raised by the MMU (demand paging, copy-on-write, protection violations)
- Keep the TLB consistent by flushing stale entries after changing mappings
- Load the correct root table address on every context switch
The TLB Factor:
In practice, the MMU caches recent translations in the Translation Lookaside Buffer (TLB). Most memory accesses hit the TLB and skip the table walk entirely. This caching is why page tables can be relatively slow to access—most translations never touch them.
A full 4-level page table walk requires 4 memory accesses before the actual data access—potentially 5 cache misses. At ~100 cycles per miss, this could be 500 cycles per memory access without the TLB. This is why TLB hit rates are so critical for system performance, typically exceeding 99% in well-behaved workloads.
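A rough effective-access-time estimate (using the illustrative cycle counts from the paragraph above, not measured values) shows why the hit rate dominates:

```c
#include <stdio.h>

int main(void)
{
    double hit_rate  = 0.99;    /* typical TLB hit rate */
    double tlb_hit   = 1.0;     /* translation cost on a hit, in cycles (illustrative) */
    double walk_cost = 500.0;   /* 4-level walk with ~100-cycle misses, as above */

    /* Average translation cost per memory access. */
    double avg = hit_rate * tlb_hit + (1.0 - hit_rate) * walk_cost;
    printf("average translation overhead: %.1f cycles per access\n", avg);
    /* ~6 cycles at a 99% hit rate, but ~51 cycles if it drops to 90%. */
    return 0;
}
```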
We've explored the fundamental architecture of page tables—the data structure that makes virtual memory possible. Let's consolidate the key insights:
- A page table maps virtual page numbers to physical frame numbers, one table per process
- Linear tables offer O(1) lookup but waste enormous space on sparse address spaces
- Multi-level (hierarchical) tables allocate subtables only for mapped regions, trading extra memory accesses for space efficiency
- Each PTE combines the frame number with present, protection, and status bits
- Separate per-process tables enforce isolation, while the TLB hides most of the translation cost
What's Next:
Now that we understand the overall structure of page tables, we'll dive deeper into the Page Table Entry (PTE)—examining each field in detail, understanding the hardware semantics, and seeing how the OS uses these bits to implement sophisticated memory management policies.
You now understand the fundamental structure of page tables—from linear arrays to multi-level hierarchies. This foundation prepares you to explore the details of page table entries, where protection, caching, and status tracking come together to enable modern virtual memory systems.