Memory sharing is fundamental to modern operating systems. Shared libraries save gigabytes of RAM across thousands of processes. Copy-on-write fork creates child processes without duplicating memory. Shared memory enables high-performance inter-process communication. Memory-mapped files allow multiple processes to access the same file data in memory.
All of these sharing mechanisms rely on a simple capability: multiple virtual pages mapping to the same physical frame. Process A's virtual page 0x1000 and Process B's virtual page 0x2000 can both map to physical frame 500, meaning both processes read and write the same physical memory.
With conventional per-process page tables, this is trivial—each process's table independently points to the same frame. But with inverted page tables, a fundamental constraint creates a paradox: each physical frame has exactly one entry in the table. If frame 500 has one entry, how can it record that BOTH Process A and Process B have mappings to it?
This sharing challenge is the most significant limitation of inverted page tables and drives complex solutions that partially negate their space advantages.
By the end of this page, you will understand why sharing conflicts with inverted table design, the various solutions developed to support sharing despite this limitation, the performance and complexity costs of these solutions, and how sharing requirements influence the choice between table architectures.
To understand the sharing problem clearly, let's contrast how sharing works in conventional versus inverted page tables.
Sharing in Conventional Page Tables:
Each process has its own page table, and entries are independent:
Process A's Page Table: Process B's Page Table:
┌─────────────────────────┐ ┌─────────────────────────┐
│ VPN 0x1000 → Frame 500 │ │ VPN 0x2000 → Frame 500 │
│ VPN 0x1001 → Frame 501 │ │ VPN 0x2001 → Frame 501 │
│ VPN 0x1002 → Frame 502 │ │ VPN 0x2002 → Frame 502 │
└─────────────────────────┘ └─────────────────────────┘
Shared Library (libc.so):
- Loaded into frames 500, 501, 502
- A maps it at VA 0x1000-0x1002
- B maps it at VA 0x2000-0x2002
- Both point to same physical frames
- No conflict - separate tables!
This is seamless. The library exists once in physical memory, but each process has its own virtual address for it (the addresses can even differ across processes, as ASLR requires).
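A minimal sketch makes the point concrete. The structures here (`PageTable`, `map_page`, `lookup`) are hypothetical illustrations, not a real MMU format: each process owns an independent table, so two tables can store the same frame number with no coordination at all.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_VPNS 16
#define UNMAPPED (-1)

// Hypothetical minimal per-process page table: indexed by VPN.
typedef struct {
    int32_t frame[NUM_VPNS];
} PageTable;

void map_page(PageTable* pt, uint32_t vpn, int32_t frame) {
    pt->frame[vpn] = frame;   // no coordination with any other process
}

int32_t lookup(const PageTable* pt, uint32_t vpn) {
    return pt->frame[vpn];
}
```

Process A can map VPN 1 and process B VPN 2 to the same frame 500; neither table knows or cares that the other exists.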
Sharing in Inverted Page Tables:
Now consider the same scenario with an inverted table:
Inverted Page Table (one entry per frame):
┌─────────────────────────────────────────┐
│ Frame 500: PID=?, VPN=? ← PROBLEM! │
│ Frame 501: ... │
│ Frame 502: ... │
└─────────────────────────────────────────┘
Frame 500 contains shared library code.
Process A wants: (A, 0x1000) → Frame 500
Process B wants: (B, 0x2000) → Frame 500
But IPT[500] can only store ONE (PID, VPN) pair!
If we store (A, 0x1000), B's translation fails.
If we store (B, 0x2000), A's translation fails.
We cannot store both in a single entry.
The Fundamental Conflict:
Inverted Table Invariant:
|entries| = |physical frames|
Each frame has exactly one entry
Sharing Requirement:
One frame may be mapped by multiple (process, VPN) pairs
Conflict:
Cannot represent N mappings with 1 entry
This is not a bug or implementation limitation—it's an inherent property of the inverted organization. The table is indexed by physical frame, and each frame has one slot. Sharing violates the assumption that each frame belongs to exactly one virtual page.
Modern systems share extensively. A typical Linux desktop shares: libc.so (100+ processes), libpthread.so (50+ processes), libX11.so (30+ processes), and many more. Without sharing support, each process would need private copies of all libraries, requiring orders of magnitude more memory.
Several approaches have been developed to enable sharing despite the one-entry-per-frame constraint. Each has different trade-offs between complexity, memory overhead, and lookup performance.
Approach 1: Chained Sharing List
Extend each IPT entry with a chain of additional (PID, VPN) pairs:
IPT Entry Structure (Extended):
┌─────────────────────────────────────────┐
│ Primary: PID_A, VPN_A, Control │
│ Share Chain Pointer → [PID_B, VPN_B] ─→ [PID_C, VPN_C] → NULL
└─────────────────────────────────────────┘
Frame 500 shared by A, B, C:
IPT[500].primary = (A, 0x1000)
IPT[500].share_chain = [(B, 0x2000), (C, 0x3000)]
Lookup Modification:
```
lookup(pid, vpn):
    frame = hash_lookup(pid, vpn)
    if frame != NOT_FOUND:
        return frame
    // Check if this VPN is a secondary sharer
    for each frame in all_frames:    // Or use auxiliary hash
        if (pid, vpn) in IPT[frame].share_chain:
            return frame
    return PAGE_FAULT
```
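The chained lookup above can be written in C as follows. This is a toy sketch under assumed structures (`IPTEntry`, `ShareNode`, a tiny 8-frame table); a real implementation would hash into the primary path rather than scan it linearly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define NUM_FRAMES 8
#define NOT_FOUND  (-1)

// Hypothetical share-chain node: one extra (pid, vpn) sharer.
typedef struct ShareNode {
    uint16_t pid;
    uint64_t vpn;
    struct ShareNode* next;
} ShareNode;

typedef struct {
    bool valid;
    uint16_t pid;        // Primary owner
    uint64_t vpn;
    ShareNode* chain;    // Secondary sharers
} IPTEntry;

static IPTEntry ipt[NUM_FRAMES];

// The primary path would normally be a hash lookup; a linear
// scan keeps the sketch small.
int lookup(uint16_t pid, uint64_t vpn) {
    for (int f = 0; f < NUM_FRAMES; f++)       // primary owners
        if (ipt[f].valid && ipt[f].pid == pid && ipt[f].vpn == vpn)
            return f;
    for (int f = 0; f < NUM_FRAMES; f++)       // secondary sharers
        for (ShareNode* n = ipt[f].chain; n; n = n->next)
            if (n->pid == pid && n->vpn == vpn)
                return f;
    return NOT_FOUND;
}

void add_sharer(int frame, uint16_t pid, uint64_t vpn) {
    ShareNode* n = malloc(sizeof *n);
    n->pid = pid;
    n->vpn = vpn;
    n->next = ipt[frame].chain;
    ipt[frame].chain = n;
}
```

Note how a secondary sharer's lookup falls through the primary scan entirely before the chains are searched; that fallback is exactly where the cost hides.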
Problems: entries become variable-sized, chain traversal lengthens every shared lookup, and a miss on the primary hash forces either a scan of every frame's chain or a second auxiliary hash table just to find secondary sharers.
Approach 2: Per-Process Sharing Tables
Maintain a separate small table per process that records shared page mappings:
Process A Sharing Table: Process B Sharing Table:
┌───────────────────────┐ ┌───────────────────────┐
│ VPN 0x1000 → Frame 500│ │ VPN 0x2000 → Frame 500│
│ VPN 0x1001 → Frame 501│ │ VPN 0x2001 → Frame 501│
└───────────────────────┘ └───────────────────────┘
Inverted Table (stores primary owner):
IPT[500]: (A, 0x1000) // A is "primary"
Lookup Algorithm:
1. Check process's sharing table first
2. If not found, check inverted table hash
Advantages: the main IPT keeps its fixed one-entry-per-frame layout, private-page lookups are untouched, and the per-process sharing tables stay small when sharing is modest.
Disadvantages: shared lookups now touch two structures, the sharing tables add back per-process memory overhead, and every map/unmap must keep both structures consistent.
```c
// Sharing solution with per-process sharing tables

typedef struct {
    uint64_t vpn;       // Virtual page number
    uint32_t frame;     // Physical frame
    uint32_t control;   // Protection bits (may differ per-process)
} ProcessShareEntry;

typedef struct {
    ProcessShareEntry* entries;   // Array/hash of shared mappings
    size_t count;                 // Number of shared pages
    size_t capacity;              // Allocated capacity
} ProcessSharingTable;

typedef struct {
    InvertedPageTable* ipt;              // Main inverted table
    ProcessSharingTable* share_tables;   // Per-process sharing tables
    size_t num_processes;
} HybridPageTable;

// Translation with sharing support
TranslationResult translate_with_sharing(HybridPageTable* hpt,
                                         uint16_t pid, uint64_t vpn) {
    TranslationResult result = {0};

    // Step 1: Check per-process sharing table first
    ProcessSharingTable* share_table = &hpt->share_tables[pid];
    for (size_t i = 0; i < share_table->count; i++) {
        if (share_table->entries[i].vpn == vpn) {
            // Found in sharing table!
            uint32_t frame = share_table->entries[i].frame;
            result.success = true;
            result.physical_address = ((uint64_t)frame << PAGE_SHIFT) |
                                      (vpn & PAGE_OFFSET_MASK);
            result.control = share_table->entries[i].control;
            return result;
        }
    }

    // Step 2: Fall back to main inverted page table
    return ipt_translate(hpt->ipt, pid, vpn << PAGE_SHIFT);
}

// Create shared mapping
bool create_shared_mapping(HybridPageTable* hpt, uint16_t pid,
                           uint64_t vpn, uint32_t frame, uint32_t control) {
    // Check if frame already has an owner in IPT
    IPTEntry* existing = &hpt->ipt->entries[frame];

    if (existing->control & IPT_VALID) {
        // Frame already owned - this is a new sharer
        // Add to per-process sharing table
        ProcessSharingTable* share_table = &hpt->share_tables[pid];

        if (share_table->count >= share_table->capacity) {
            // Grow the table
            resize_sharing_table(share_table);
        }

        // Add entry
        ProcessShareEntry* new_entry =
            &share_table->entries[share_table->count++];
        new_entry->vpn = vpn;
        new_entry->frame = frame;
        new_entry->control = control;

        // Also add to hash table for this process's VPN
        // (for efficient reverse lookup)
        return true;
    } else {
        // Frame has no owner - make this process the primary
        return ipt_insert(hpt->ipt, pid, vpn, frame, control);
    }
}
```

Approach 3: Hash Table per Shared Frame
For heavily shared frames (like libc pages shared by 100+ processes), maintain a separate hash table:
```
IPT Entry:
    if share_count > THRESHOLD:
        share_table_pointer → HashTable[(PID, VPN) → control]
    else:
        inline_share_list[4]    // Small inline array for the common case
```
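The inline-array-with-spill idea can be sketched as below. The names (`FrameSharers`, `add_sharer`) and the growable array standing in for the per-frame hash table are illustrative assumptions; the point is that the common case (a handful of sharers) never allocates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define INLINE_SHARERS 4

typedef struct { uint16_t pid; uint64_t vpn; } Sharer;

// Hypothetical per-frame sharer record: a small inline array for the
// common case, spilling to a growable structure (standing in for the
// per-frame hash table) only for heavily shared frames.
typedef struct {
    Sharer inline_list[INLINE_SHARERS];
    size_t inline_count;
    Sharer* overflow;
    size_t overflow_count, overflow_cap;
} FrameSharers;

void add_sharer(FrameSharers* fs, uint16_t pid, uint64_t vpn) {
    if (fs->inline_count < INLINE_SHARERS) {
        fs->inline_list[fs->inline_count++] = (Sharer){pid, vpn};
        return;
    }
    // Inline array full: spill to the larger structure
    if (fs->overflow_count == fs->overflow_cap) {
        fs->overflow_cap = fs->overflow_cap ? fs->overflow_cap * 2 : 8;
        fs->overflow = realloc(fs->overflow,
                               fs->overflow_cap * sizeof(Sharer));
    }
    fs->overflow[fs->overflow_count++] = (Sharer){pid, vpn};
}
```

A libc page shared by 100 processes pays the overflow cost once; a page shared by two processes never does.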
Approach 4: Frame Aliasing (PowerPC Approach)
Some systems use "virtual aliases" where the same physical frame can appear multiple times in the IPT through different hash paths:
Primary hash for (A, 0x1000): bucket 42 → IPT entry at frame 500
Primary hash for (B, 0x2000): bucket 87 → No entry
Alias mechanism: bucket 87 chain includes pointer to frame 500
Both lookups eventually find frame 500, even though
IPT[500] only stores one (PID, VPN).
This requires careful implementation and hash table awareness of sharing.
Most practical inverted page table implementations use a hybrid: the main IPT stores primary owners, while per-process supplementary tables track shared mappings. This preserves IPT's memory benefits for private pages while handling sharing through the supplementary tables.
Supporting sharing in inverted page tables adds significant implementation complexity compared to conventional tables. Understanding this complexity helps evaluate the true cost of the inverted approach.
Complexity Sources:
1. Multi-Structure Lookup:
Conventional Table Lookup:
1. Index into page table
2. Read entry
Done.
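The conventional side really is that short per level. A toy walk, under assumed layout (four levels, 9 index bits per level, 4 KiB pages, tables represented as plain arrays holding pointer values), fits in a few lines:

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical 4-level radix walk: 9 index bits per level, 4 KiB pages.
// Each table is an array of 512 entries holding the address of the
// next-level table; the leaf entry is the PTE itself.
uint64_t* walk(uint64_t* root, uint64_t va) {
    uint64_t* t = root;
    for (int level = 3; level > 0; level--) {
        uint64_t idx = (va >> (12 + 9 * level)) & 0x1FF;
        t = (uint64_t*)t[idx];              // descend one level
    }
    return &t[(va >> 12) & 0x1FF];          // leaf PTE
}
```

Four indexed loads, no hashing, no PID checks, no sharing structures anywhere in the path.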
Inverted Table with Sharing:
1. Check per-process sharing table (hash lookup + chain traverse)
2. If miss, check main IPT (hash lookup + chain traverse)
3. May need to check share chains if found in IPT
4. Verify PID matches at each step
Multiple data structures, multiple lookups.
2. Insert/Delete Coordination:
When a shared page is unmapped:
Unmap shared page from Process B:
1. Remove from B's sharing table
2. Check if frame has other sharers
   - If yes: update share list
   - If no: check if primary owner remains
     - If yes: done
     - If no: clear IPT entry (frame now free)
3. Invalidate B's TLB entries
4. May need to update reference count
Multiple structures must stay consistent.
3. Reference Counting:
Shared frames need reference counts:
Frame Reference Tracking:
- Primary IPT entry has ref_count field
- Each share adds 1, each unmap subtracts 1
- When ref_count hits 0, frame is free
- Must be atomic in multi-CPU systems
Adds: storage overhead, atomic operations, coordination logic
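The counting rule above reduces to two atomic operations. A minimal sketch using C11 atomics (the array `frame_refs` and the function names are illustrative):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical per-frame reference counts, one per physical frame.
static _Atomic uint32_t frame_refs[8];

void map_frame(uint32_t frame) {
    atomic_fetch_add(&frame_refs[frame], 1);   // each new mapping adds 1
}

// Returns true when this unmap dropped the last reference;
// the caller can then return the frame to the free list.
bool unmap_frame(uint32_t frame) {
    return atomic_fetch_sub(&frame_refs[frame], 1) == 1;
}
```

Using the fetch-and-subtract return value (rather than a separate load) is what makes the "last reference" decision race-free on multi-CPU systems.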
| Operation | Conventional O() | Inverted O() | Notes |
|---|---|---|---|
| Private page lookup | O(levels) | O(1 + chain) | Similar |
| Shared page lookup | O(levels) | O(1 + chain) + O(share_table) | IPT slower |
| Create sharing | O(1) | O(1) + share_table insert | IPT more complex |
| Remove sharing | O(1) | O(sharers) coordination | IPT much more complex |
| Frame → owners | O(all_processes) | O(1) + O(sharers) | IPT wins here |
```c
// Complex sharing coordination logic

typedef struct {
    _Atomic uint32_t ref_count;    // Number of mappers
    uint16_t primary_pid;          // Primary owner PID
    bool has_secondary_sharers;    // Quick check for sharing
} FrameSharingInfo;

// Safely remove a shared mapping
bool unmap_shared_page(HybridPageTable* hpt, uint16_t pid, uint64_t vpn) {
    // Step 1: Find the frame
    TranslationResult tr = translate_with_sharing(hpt, pid, vpn);
    if (!tr.success) {
        return false;  // Not mapped
    }

    uint32_t frame = tr.frame_number;
    FrameSharingInfo* info = &hpt->sharing_info[frame];

    // Step 2: Atomically decrement reference count
    uint32_t old_count = atomic_fetch_sub(&info->ref_count, 1);

    if (old_count == 1) {
        // This was the last reference - frame is now free
        // Clear IPT entry
        hpt->ipt->entries[frame].control &= ~IPT_VALID;

        // Remove from hash chains
        ipt_remove_from_hash(hpt->ipt, pid, vpn);

        // Add frame to free list
        add_to_free_list(hpt, frame);
        return true;
    }

    // Step 3: Not last reference - just remove this process's mapping
    if (info->primary_pid == pid) {
        // Removing primary owner - must promote a secondary
        promote_secondary_to_primary(hpt, frame);
    }

    // Remove from per-process sharing table
    ProcessSharingTable* share_table = &hpt->share_tables[pid];
    remove_from_sharing_table(share_table, vpn);

    // Remove from hash chain for this (pid, vpn)
    ipt_remove_from_hash(hpt->ipt, pid, vpn);

    // Invalidate TLB
    tlb_invalidate(pid, vpn);
    return true;
}

// Promote a secondary sharer to primary
void promote_secondary_to_primary(HybridPageTable* hpt, uint32_t frame) {
    IPTEntry* entry = &hpt->ipt->entries[frame];
    FrameSharingInfo* info = &hpt->sharing_info[frame];

    // Find a secondary sharer to become primary
    // This requires scanning share tables or maintaining a list
    for (size_t pid = 0; pid < hpt->num_processes; pid++) {
        ProcessSharingTable* table = &hpt->share_tables[pid];

        for (size_t i = 0; i < table->count; i++) {
            if (table->entries[i].frame == frame) {
                // Found a secondary sharer - promote them
                uint64_t new_primary_vpn = table->entries[i].vpn;
                uint16_t new_primary_pid = pid;

                // Update IPT entry
                entry->pid = new_primary_pid;
                entry->vpn = new_primary_vpn;
                info->primary_pid = new_primary_pid;

                // Remove from their share table (now in IPT)
                remove_from_sharing_table(table, new_primary_vpn);

                // Update hash chains
                // ... (complex re-hashing logic)
                return;
            }
        }
    }
    // No secondary found - this shouldn't happen if ref_count > 0
}
```

While inverted tables save page table memory, the sharing support infrastructure (per-process tables, reference counting, coordination logic, share chains) adds both memory overhead and code complexity. For sharing-heavy workloads, the net benefit may be smaller than initially apparent.
Sharing challenges affect several fundamental operating system operations. Understanding these impacts reveals the practical implications of inverted table design.
Fork and Copy-on-Write (CoW):
Fork creates a child process that shares most memory with the parent. CoW defers copying until writes occur:
Conventional Tables:
fork():
1. Create new page table for child
2. Copy parent's page table entries (not data)
3. Mark all pages read-only in BOTH tables
4. On write: fault → copy page → make writable
Simple: child table is independent, shares frames by pointing to them.
Inverted Tables:
fork():
1. For each of parent's pages:
- Frame already has entry: IPT[frame] = (parent, vpn)
- Must add child mapping somehow...
Options:
A. Add all child mappings to share structures (expensive)
B. Use "implicit sharing" where child inherits parent's ASID temporarily
C. Defer until child actually accesses memory
Much more complex.
Shared Libraries:
Shared libraries like libc.so are mapped into every process:
Conventional Tables: each process simply gets its own PTEs pointing at the library's frames; no extra machinery, and the library can sit at a different virtual address in every process.
Inverted Tables: only one process can be the primary owner of each library frame, so every other process needs share-structure entries for every library page it maps.
Memory-Mapped Files:
Multiple processes mapping the same file share physical pages:
Process A: mmap(file, 0x10000000, READ)
Process B: mmap(file, 0x20000000, READ)
Process C: mmap(file, 0x30000000, READ)
Conventional: Three page table entries, one physical page.
Inverted: One IPT entry + share table entries for B and C.
Shared Memory (IPC):
Direct memory sharing for inter-process communication:
Process A: shmget() + shmat(0x5000000)
Process B: shmget() + shmat(0x6000000) // Same shm segment
Conventional: A's table: 0x5000000 → Frame X
B's table: 0x6000000 → Frame X
Independent entries.
Inverted: IPT[X] = (A, 0x5000000) [primary]
B's share table: 0x6000000 → X
Must track that B also has access.
| Operation | Conventional Impact | Inverted Impact | Concern Level |
|---|---|---|---|
| fork() | Copy table, mark CoW | Complex sharing setup | High |
| exec() | Build new table | Clear process entries, build new | Medium |
| mmap(shared) | Add table entry | Sharing table update | High |
| munmap(shared) | Remove table entry | Sharing coordination | High |
| exit() | Free table | Remove from all shares | Medium |
| Library loading | Table entries | Share structures | High |
```c
// Fork implementation with inverted page table sharing

Process* fork_with_ipt(Process* parent) {
    Process* child = allocate_process();
    child->asid = allocate_asid();

    // Initialize empty sharing table for child
    child->share_table = create_sharing_table();

    // For each frame owned by parent
    for (uint32_t frame = 0; frame < num_frames; frame++) {
        IPTEntry* entry = &ipt->entries[frame];

        if ((entry->control & IPT_VALID) && entry->pid == parent->asid) {
            // Parent owns this frame - child needs to share it

            // Mark page as CoW (read-only in parent too)
            entry->control &= ~IPT_WRITE;
            entry->control |= IPT_COW;

            // Add child mapping to share structure
            // Child uses same VPN as parent (fork semantics)
            add_to_sharing_table(child->share_table, entry->vpn,
                                 frame, entry->control);

            // Increment frame reference count
            atomic_fetch_add(&frame_refs[frame], 1);
        }
    }

    // Also check parent's share table (parent might be sharer, not owner)
    for (size_t i = 0; i < parent->share_table->count; i++) {
        ProcessShareEntry* pse = &parent->share_table->entries[i];

        // Child also shares these pages
        add_to_sharing_table(child->share_table, pse->vpn, pse->frame,
                             (pse->control & ~IPT_WRITE) | IPT_COW);
        atomic_fetch_add(&frame_refs[pse->frame], 1);
    }

    // Invalidate parent's TLB entries (permissions changed to read-only)
    tlb_flush_asid(parent->asid);

    return child;
}

// CoW fault handler
void handle_cow_fault(uint16_t pid, uint64_t faulting_vpn) {
    // Find the shared frame
    TranslationResult tr = translate_with_sharing(get_hpt(), pid,
                                                  faulting_vpn);
    uint32_t old_frame = tr.frame_number;

    uint32_t ref_count = atomic_load(&frame_refs[old_frame]);

    if (ref_count == 1) {
        // Only one reference - just make it writable
        IPTEntry* entry = &ipt->entries[old_frame];
        if (entry->pid == pid) {
            entry->control &= ~IPT_COW;
            entry->control |= IPT_WRITE;
        } else {
            // It's in share table
            update_share_table_control(pid, faulting_vpn, IPT_WRITE);
        }
        return;
    }

    // Multiple references - must copy
    uint32_t new_frame = allocate_frame();

    // Copy data
    memcpy(frame_to_addr(new_frame), frame_to_addr(old_frame), PAGE_SIZE);

    // Update this process's mapping to new frame
    // Remove from sharing, add as sole owner of new frame
    remove_shared_mapping(get_hpt(), pid, faulting_vpn);
    ipt_insert(get_ipt(), pid, faulting_vpn, new_frame,
               IPT_WRITE | IPT_VALID);

    // Decrement old frame's ref count
    atomic_fetch_sub(&frame_refs[old_frame], 1);

    // Set new frame's ref count to 1
    frame_refs[new_frame] = 1;

    // Invalidate old TLB entry
    tlb_invalidate(pid, faulting_vpn);
}
```

Fork is particularly problematic because it creates sharing for potentially thousands of pages instantaneously. With conventional tables, this is just a page table copy. With inverted tables, every page needs sharing structure updates. Systems using inverted tables often implement fork differently (e.g., immediate copy, no CoW) or use special optimizations like shared-ASID periods.
The sharing overhead in inverted page tables has measurable performance implications across various dimensions.
Memory Overhead Analysis:
Let's quantify the sharing overhead:
Scenario: 100 processes, each mapping libc.so (1000 pages)
Conventional Tables:
- 1000 PTEs per process for libc
- 1000 × 8 bytes × 100 processes = 800 KB
- Plus 1000 physical frames for libc data
Inverted Table (Naive):
- 1000 IPT entries (one per libc frame)
- Plus 1000 × 99 share table entries (secondary sharers)
- = 1000 + 99,000 = 100,000 entries
- 100,000 × 12 bytes = 1.2 MB
- WORSE than conventional!
Inverted Table (Optimized):
- Use "library region" concept
- Store ranges instead of individual pages
- Overhead: ~10KB per process for shared libs
- Still ~1 MB total, marginal improvement
The sharing structures can consume as much or more memory than conventional tables for sharing-heavy workloads.
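The arithmetic in the scenario above is easy to check. The helper names and the assumed entry sizes (8-byte PTEs, 12-byte IPT/share entries) come from the scenario itself:

```c
#include <assert.h>
#include <stddef.h>

// Conventional: one PTE per process per shared page.
size_t conventional_bytes(size_t procs, size_t pages, size_t pte_size) {
    return procs * pages * pte_size;
}

// Naive inverted: one IPT entry per frame, plus one share-table
// entry for each of the (procs - 1) secondary sharers per page.
size_t inverted_naive_entries(size_t procs, size_t pages) {
    return pages + pages * (procs - 1);
}
```

For 100 processes and 1000 libc pages this gives 800 KB of conventional PTEs versus 100,000 inverted-side entries (1.2 MB at 12 bytes each), confirming that naive sharing support erases the space advantage.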
Lookup Latency:
Private Page Lookup:
Conventional: ~400 cycles (4-level walk)
Inverted: ~200 cycles (hash + chain)
Winner: Inverted
Shared Page Lookup (Page in Share Table):
Conventional: ~400 cycles (same as private)
Inverted: ~100 cycles (share table) + ~200 cycles (if miss, check IPT)
Winner: Depends on share table organization
Shared Page Lookup (Page in Long Share Chain):
Conventional: ~400 cycles (unchanged)
Inverted: ~500+ cycles (traverse chain)
Winner: Conventional
Fork Performance:
Fork with 1000 pages (all become shared):
Conventional:
- Copy page table: 1000 × copy_entry = ~50µs
- Mark CoW: Modify protection bits = ~20µs
Total: ~70µs
Inverted:
- Create share table entries: 1000 × insert = ~100µs
- Update ref counts: 1000 × atomic_inc = ~10µs
- Invalidate parent TLB: ~20µs
Total: ~130µs
Inverted is ~2x slower for fork.
| Workload | Conventional Time | Inverted Time | Ratio |
|---|---|---|---|
| 10K private page accesses | 40ms | 25ms | 0.63x (faster) |
| 10K shared page accesses | 40ms | 45ms | 1.13x (slower) |
| Fork + exec | 0.5ms | 1.1ms | 2.2x (slower) |
| 1000 process startup | 50ms | 120ms | 2.4x (slower) |
| Steady-state (mixed) | Base | +5-15% | Slight overhead |
TLB Behavior Differences:
Sharing affects TLB utilization similarly in both designs: each process maps the shared frame at its own virtual address, so with ASID-tagged TLBs a shared frame consumes one TLB entry per sharing process regardless of table organization. The difference appears on TLB misses, where an inverted table's refill may have to consult both the IPT hash chains and the share structures before installing the translation.
Page Replacement Advantage:
One area where inverted tables excel with sharing:
Page Replacement: Which process owns frame 1000?
Conventional:
- Must search all process page tables
- Or maintain reverse mapping (more overhead)
- O(num_processes) worst case
Inverted:
- IPT[1000] gives primary owner instantly
- Share structures give other sharers
- O(num_sharers) for that frame specifically
Inverted wins when many processes exist but few share each frame.
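The contrast can be sketched directly. Both functions below answer "who owns this frame?" over hypothetical toy structures; the conventional version must scan every process's table, while the inverted version is a single indexed read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_PROCS 4
#define NUM_VPNS  8

// Conventional: per-process tables, so frame → owner is a full scan.
static int32_t page_tables[NUM_PROCS][NUM_VPNS];   // -1 = unmapped

int find_owner_conventional(int32_t frame, int* out_vpn) {
    for (int pid = 0; pid < NUM_PROCS; pid++)
        for (int vpn = 0; vpn < NUM_VPNS; vpn++)
            if (page_tables[pid][vpn] == frame) {
                *out_vpn = vpn;
                return pid;
            }
    return -1;
}

// Inverted: the frame index IS the table index.
typedef struct { bool valid; int pid; int vpn; } IPTEntry;
static IPTEntry ipt[16];

int find_owner_inverted(int32_t frame, int* out_vpn) {
    if (!ipt[frame].valid) return -1;
    *out_vpn = ipt[frame].vpn;
    return ipt[frame].pid;
}
```

This is exactly the reverse-mapping query the page replacement code issues on every eviction, which is why it remains the inverted table's strongest card.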
The performance trade-off depends heavily on workload characteristics. Sharing-heavy workloads (desktops, multi-user systems) favor conventional tables. Sharing-light workloads (embedded systems, single-purpose servers) may benefit from inverted tables' simpler private page handling.
Based on the sharing challenges analysis, several design principles emerge for systems considering inverted page tables.
When to Choose Inverted Page Tables: they pay off when process counts are very high, virtual address spaces are sparse, physical memory is tight, sharing is minimal, and fork is rare; the criteria table below summarizes these conditions.
Hybrid Architectures:
Some systems use hybrid approaches to get benefits of both:
1. Inverted for User Pages, Conventional for Kernel: kernel mappings are identical in every process, so keeping them in a single conventional kernel table sidesteps the sharing problem for the busiest shared region.
2. Inverted for Private, Conventional for Shared: route known-shared regions (libraries, shared memory segments) through small per-process tables, and let the IPT handle only private pages, where it excels.
3. Software IPT with Hardware Conventional: keep an inverted structure as the OS's frame-tracking and reverse-mapping database while the hardware walks conventional per-process tables, much as Linux pairs its reverse-mapping machinery with conventional page tables.
Implementation Guidelines:
Optimize for the common case: If most pages are private, optimize private lookup and accept sharing overhead
Use compact sharing representations: Range-based sharing structures for contiguous shared regions (libraries)
Lazy sharing expansion: Don't create share entries until actually accessed (demand-based)
Pool shared library mappings: Treat common libraries specially with optimized data structures
Measure your workload: Profile actual sharing patterns before choosing architecture
| Criterion | Favor Inverted | Favor Conventional |
|---|---|---|
| Process count | Very high (1000+) | Moderate (<100) |
| Shared memory usage | Minimal | Extensive |
| Fork frequency | Rare | Common |
| VA space density | Very sparse | Moderately populated |
| Physical memory | Limited | Abundant |
| Application type | Embedded/specialized | General purpose |
Despite the theoretical appeal of inverted tables, modern mainstream systems (x86-64, ARM64, RISC-V) all use conventional multi-level page tables. The sharing challenges, combined with mature hardware support for multi-level walking and large TLBs, make conventional tables the pragmatic choice for general-purpose systems.
Memory sharing capability is fundamental to efficient modern operating systems, and the single-entry-per-frame constraint of inverted page tables creates significant design challenges that partially offset their memory advantages.
Module Complete:
You have now completed the comprehensive study of hash-based and inverted page tables.
These alternative page table architectures represent important design points in the memory management design space, illuminating trade-offs that inform all virtual memory system design.
You have mastered hash-based and inverted page tables—from hashing fundamentals through collision handling, inverted table structure, system-wide organization, and sharing challenges. This knowledge provides a complete picture of advanced page table architectures and their trade-offs relative to conventional multi-level designs.