In conventional memory management, each process maintains its own page table—a private directory mapping that process's virtual addresses to physical frames. When the OS switches between processes, it must also switch page tables, loading a new base register and potentially flushing cached translations.
Inverted page tables break this per-process paradigm. A single system-wide table serves all processes simultaneously. Every translation—regardless of which process initiated it—consults the same data structure. The table doesn't change when processes switch; only the process identifier used for lookup changes.
This architectural choice has profound implications for system design. It affects context switch performance, TLB management strategy, concurrent access patterns, memory allocation policies, and even security isolation. Understanding these implications is essential for system architects evaluating memory management approaches.
By the end of this page, you will understand how a single table supports multiple processes, the impact on context switching and TLB behavior, concurrency control requirements, process identifier management, and the system-wide coordination challenges that emerge from shared table organization.
Understanding the fundamental difference between per-process and system-wide table organizations clarifies why each approach exists and when each is appropriate.
Per-Process Tables (Conventional):
Each process has its own page table hierarchy:
System State:
┌─────────────────┐ ┌──────────────────┐
│ Process A (PID 1) │ │ Process A Tables │
│ CR3 → 0x1000 │────→│ L4 → L3 → L2 → L1│
└─────────────────┘ └──────────────────┘
┌─────────────────┐ ┌──────────────────┐
│ Process B (PID 2) │ │ Process B Tables │
│ CR3 → 0x5000 │────→│ L4 → L3 → L2 → L1│
└─────────────────┘ └──────────────────┘
┌─────────────────┐ ┌──────────────────┐
│ Process C (PID 3) │ │ Process C Tables │
│ CR3 → 0x9000 │────→│ L4 → L3 → L2 → L1│
└─────────────────┘ └──────────────────┘
System-Wide Table (Inverted):
One table serves all processes:
System State:
┌─────────────────┐
│ Process A (PID 1) │─────┐
│ ASID: 0x01 │ │
└─────────────────┘ │
│ ┌──────────────────────┐
┌─────────────────┐ ├───→│ Single Inverted │
│ Process B (PID 2) │─────┤ │ Page Table │
│ ASID: 0x02 │ │ │ │
└─────────────────┘ │ │ Frame 0: (A, 0x100) │
│ │ Frame 1: (B, 0x200) │
┌─────────────────┐ │ │ Frame 2: (C, 0x050) │
│ Process C (PID 3) │─────┘ │ Frame 3: (A, 0x101) │
│ ASID: 0x03 │ │ ... │
└─────────────────┘ └──────────────────────┘
The key properties of the two organizations are compared below:
| Aspect | Per-Process Tables | System-Wide Table |
|---|---|---|
| Isolation mechanism | Separate tables | PID in entries |
| Context switch | Change CR3/table base | Change current PID |
| Table switch overhead | May flush TLB | No table switch needed |
| Memory overhead | Per-process growth | Fixed by physical size |
| Entry content | No PID (implicit) | PID required (explicit) |
| Concurrent access | Per-process locks viable | Global coordination needed |
| TLB management | Per-process flush options | ASID-based invalidation |
With per-process tables, a translation inherently 'knows' which process it belongs to by which table it's in. With system-wide tables, each translation is an orphan that must be explicitly tagged with its owner. This difference ripples through the entire system design.
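That tagging requirement can be made concrete with a minimal sketch of the two entry formats. The structures and field names below are illustrative only, not taken from any particular architecture:

```c
#include <stdint.h>

/* Per-process table entry (sketch): the owning process is implied by
 * which table the entry lives in, so no identifier field is needed. */
typedef struct {
    uint64_t pfn;      // Physical frame number
    uint32_t flags;    // Present, writable, user, accessed, dirty, ...
} PerProcessPTE;

/* System-wide (inverted) table entry (sketch): the table is shared by
 * all processes, so each entry must record its owning address space. */
typedef struct {
    uint16_t asid;     // Owning address space - explicit and required
    uint64_t vpn;      // Which virtual page this physical frame holds
    uint32_t flags;    // Present, writable, accessed, dirty, ...
    int32_t  next;     // Hash chain link (frame index, or -1 for end)
} InvertedPTE;
```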
Context switching—transitioning CPU execution from one process to another—interacts differently with per-process versus system-wide page tables. This difference significantly affects switch performance and TLB behavior.
Per-Process Table Context Switch:
1. Save current process state (registers, etc.)
2. Load new process's page table base into CR3
├── Hardware automatically invalidates TLB entries
│ (all cached translations now invalid)
└── Or: Use ASID to tag TLB entries, selective invalidation
3. Restore new process state
4. Resume execution
└── First memory accesses cause TLB misses
(must rewalk page tables to reload TLB)
The TLB Flush Problem:
When CR3 changes, the TLB contains translations from the old process. If not invalidated, these could cause the new process to access wrong physical frames. Traditional x86 invalidated the entire TLB on CR3 change—very expensive.
Modern Mitigation: PCID/ASID
Modern CPUs (Intel with PCID, ARM with ASID) tag TLB entries with a process/address space identifier. This allows TLB entries from multiple processes to coexist:
TLB Entry: (ASID, VPN) → (PFN, permissions)
Switch from Process A (ASID=1) to Process B (ASID=2):
- Change current ASID register from 1 to 2
- No TLB flush needed!
- Process B's entries (ASID=2) are already cached
- Process A's entries (ASID=1) remain but won't match
System-Wide Table Context Switch:
With an inverted page table, there's no table base register to change:
1. Save current process state
2. Update current PID/ASID register
└── No page table base change!
(same table serves all processes)
3. Restore new process state
4. Resume execution
└── TLB entries from previous process may remain
but won't match (different PID in lookup)
Inherent ASID-like Behavior:
Inverted page tables effectively require ASID-style TLB organization: because every entry in the shared table is tagged with its owning process, cached translations must carry the same tag so that one process's lookups never match another's pages.
This means inverted page table systems 'get ASID for free'—the per-lookup PID serves the same purpose.
// Context switch comparison

// Per-process table context switch (simplified x86-64)
void context_switch_per_process(Process* old, Process* new) {
    // Save old process state
    save_registers(&old->registers);
    old->kernel_stack = get_current_stack();

    // Change page table base
    uint64_t new_cr3 = new->page_table_base;

    // With PCID support: preserve TLB entries
    if (pcid_supported()) {
        new_cr3 |= (uint64_t)new->pcid;
        new_cr3 |= (1ULL << 63);   // Bit 63: don't flush on write
    }
    // else: TLB fully flushed on CR3 write

    write_cr3(new_cr3);            // Switch page tables

    // Restore new process state
    set_current_stack(new->kernel_stack);
    restore_registers(&new->registers);
}

// System-wide table context switch
void context_switch_inverted(Process* old, Process* new) {
    // Save old process state
    save_registers(&old->registers);
    old->kernel_stack = get_current_stack();

    // No page table base change needed!
    // Just update the current process identifier
    set_current_asid(new->asid);

    // Restore new process state
    set_current_stack(new->kernel_stack);
    restore_registers(&new->registers);

    // TLB entries from 'old' remain but have a different ASID,
    // so they won't match lookups for 'new'
}

System-wide tables can offer faster context switches because there is no table base change and therefore no TLB flush triggered by one. However, modern per-process systems with PCID achieve similar benefits by preserving TLB entries across switches, so the performance difference is now minimal in practice.
The Translation Lookaside Buffer (TLB) is a critical performance component that caches recent address translations. With system-wide page tables, TLB management has unique characteristics.
TLB Entry Structure for Inverted Tables:
TLB entries must include the process identifier:
Standard TLB Entry:
Key: VPN
Value: PFN, permissions
Inverted Table TLB Entry:
Key: (PID/ASID, VPN)
Value: PFN, permissions
Matching Logic:
if TLB[i].asid == current_asid AND TLB[i].vpn == requested_vpn:
return TLB[i].pfn
TLB Lookup with ASID:
Every TLB lookup must check the process identifier:
Lookup(virtual_address):
vpn = extract_vpn(virtual_address)
current_asid = get_current_asid()
for each TLB_entry:
if TLB_entry.valid AND
TLB_entry.asid == current_asid AND
TLB_entry.vpn == vpn:
return HIT(TLB_entry.pfn)
return MISS // Must consult inverted page table
TLB Invalidation Strategies:
Invalidation must account for the global nature of the table:
1. Full TLB Flush:
flush_tlb_all():
// Invalidate every entry in TLB
// Expensive but simple
// Used when: ASID wrap-around, major table restructuring
2. ASID-Specific Flush:
flush_tlb_asid(asid):
// Invalidate only entries with this ASID
for each TLB_entry:
if TLB_entry.asid == asid:
TLB_entry.valid = false
// Used when: Process exits, ASID recycled
3. Single Entry Invalidation:
flush_tlb_entry(asid, vpn):
// Invalidate specific (ASID, VPN) pair
for each TLB_entry:
if TLB_entry.asid == asid AND TLB_entry.vpn == vpn:
TLB_entry.valid = false
// Used when: Single page unmapped, protection change
4. Range Invalidation:
flush_tlb_range(asid, vpn_start, vpn_end):
// Invalidate range of pages
for each TLB_entry:
if TLB_entry.asid == asid AND
TLB_entry.vpn >= vpn_start AND
TLB_entry.vpn <= vpn_end:
TLB_entry.valid = false
// Used when: munmap of multi-page region
| Event | Required Invalidation | Scope |
|---|---|---|
| Process exits | Flush ASID | Single process |
| munmap() single page | Invalidate (ASID, VPN) | Single entry |
| mprotect() region | Flush range (ASID, VPN range) | Range |
| ASID wraparound | Full flush | Entire TLB |
| Fork (CoW setup) | Invalidate parent entries for CoW pages | Specific pages |
| Shared page unmap | All ASIDs for that VPN* | Multi-process |
// TLB management for inverted page table systems

#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint16_t asid;
    bool     valid;
    bool     global;        // Kernel pages: match all ASIDs
    uint64_t vpn;
    uint64_t pfn;
    uint32_t permissions;
} TLBEntry;

// Software TLB miss handler (inverted page table system)
void tlb_miss_handler(uint16_t asid, uint64_t faulting_address) {
    uint64_t vpn = faulting_address >> PAGE_SHIFT;

    // Search inverted page table
    IPTTranslationResult result = ipt_translate(global_ipt, asid, faulting_address);

    if (!result.success) {
        // Not a TLB miss - it's a genuine page fault
        handle_page_fault(asid, faulting_address);
        return;
    }

    // Install translation in TLB
    TLBEntry new_entry = {
        .asid = asid,
        .valid = true,
        .global = false,
        .vpn = vpn,
        .pfn = result.frame_number,
        .permissions = result.control
    };

    // Find slot (implementation-dependent replacement policy)
    int slot = tlb_find_replacement_slot();
    hardware_write_tlb(slot, &new_entry);

    // Return to faulting instruction
}

// ASID management
#define MAX_ASID 256                    // 8-bit ASID

typedef struct {
    uint32_t asid_to_pid[MAX_ASID];     // What PID uses each ASID
    uint8_t  next_asid;                 // Next ASID to allocate (starts at 1)
    uint64_t generation;                // Incremented on wraparound
} ASIDManager;

uint16_t allocate_asid(ASIDManager* mgr, uint32_t pid) {
    uint16_t asid = mgr->next_asid++;

    if (mgr->next_asid == 0) {
        // Wraparound! Must flush all TLB entries
        mgr->generation++;
        tlb_flush_all();
        mgr->next_asid = 1;             // 0 often reserved for kernel
    }

    mgr->asid_to_pid[asid] = pid;
    return asid;
}

ASID space is limited (typically 8-16 bits, i.e., 256-65536 values). When ASIDs are exhausted and must be recycled, a full TLB flush is typically required to ensure old translations don't match new processes. Systems with many active processes may hit this limit, causing periodic TLB performance drops.
With a single table accessed by all processes (and potentially all CPUs simultaneously), concurrency control becomes significantly more complex than with per-process tables.
The Concurrency Challenge:
Per-Process Tables:
- CPU₀ running Process A → Accesses Table_A
- CPU₁ running Process B → Accesses Table_B
- Minimal contention (different tables)
System-Wide Table:
- CPU₀ running Process A → Accesses Global_IPT
- CPU₁ running Process B → Accesses Global_IPT
- Potential contention (same table)
Contention Sources:
Lookup Contention: Multiple CPUs reading hash chains simultaneously (typically safe with proper memory ordering)
Insert Contention: Adding new entries requires modifying hash chain heads (requires synchronization)
Delete Contention: Removing entries requires unlinking from chains (requires careful synchronization)
Update Contention: Modifying entry control bits (e.g., dirty bit) requires atomic operations
Synchronization Strategies:
1. Global Lock (Simple but Slow):
spinlock_t global_ipt_lock;
void ipt_operation(Operation op) {
spin_lock(&global_ipt_lock);
perform_operation(op);
spin_unlock(&global_ipt_lock);
}
Problem: Serializes all IPT accesses across all CPUs. Severe bottleneck.
2. Per-Bucket Locks (Better):
spinlock_t bucket_locks[NUM_BUCKETS];
void ipt_operation(uint64_t vpn, Operation op) {
uint32_t bucket = hash(vpn) % NUM_BUCKETS;
spin_lock(&bucket_locks[bucket]);
perform_operation(op);
spin_unlock(&bucket_locks[bucket]);
}
Better: Concurrent access to different buckets. Contention only on same bucket.
3. Read-Copy-Update (RCU) for Lookups:
uint64_t ipt_lookup(uint64_t vpn) {
    rcu_read_lock();   // No actual lock: marks a read-side critical section
    HPTEntry* entry = find_entry(vpn);   // Read-only traversal
    uint64_t result = entry ? entry->pfn : NOT_FOUND;
    rcu_read_unlock();
    return result;
}

void ipt_insert(uint64_t vpn, uint64_t pfn) {
    HPTEntry* new_entry = alloc_entry(vpn, pfn);
    uint32_t bucket = hash(vpn) % NUM_BUCKETS;
    spin_lock(&bucket_locks[bucket]);
    // Atomically publish at the head of the chain
    new_entry->next = bucket_heads[bucket];
    atomic_store(&bucket_heads[bucket], new_entry);
    spin_unlock(&bucket_locks[bucket]);
}
Excellent for read-heavy workloads (most memory accesses are lookups, not modifications).
// Concurrent inverted page table with RCU

#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <urcu.h>          // Userspace RCU library

typedef struct IPTEntry {
    uint16_t pid;
    uint64_t vpn;
    uint64_t pfn;
    _Atomic uint32_t control;      // Atomic for concurrent updates
    struct IPTEntry* next;
    struct rcu_head rcu_head;      // Needed for call_rcu() deferred free
} IPTEntry;

typedef struct {
    _Atomic(IPTEntry*) hash_heads[NUM_BUCKETS];   // Atomic pointers
    spinlock_t mod_locks[NUM_BUCKETS];            // For modifications
} ConcurrentIPT;

// Read-side: No locking, just RCU
IPTEntry* ipt_lookup_rcu(ConcurrentIPT* ipt, uint16_t pid, uint64_t vpn) {
    rcu_read_lock();

    uint32_t bucket = hash(pid, vpn);
    IPTEntry* entry = atomic_load(&ipt->hash_heads[bucket]);

    while (entry != NULL) {
        if (entry->pid == pid && entry->vpn == vpn) {
            // Found - update accessed bit atomically
            atomic_fetch_or(&entry->control, IPT_ACCESSED);
            rcu_read_unlock();
            return entry;
        }
        entry = entry->next;
    }

    rcu_read_unlock();
    return NULL;
}

// Write-side: Lock bucket, use RCU-safe updates
bool ipt_insert_rcu(ConcurrentIPT* ipt, uint16_t pid, uint64_t vpn,
                    uint64_t pfn, uint32_t control) {
    uint32_t bucket = hash(pid, vpn);

    // Allocate before locking
    IPTEntry* new_entry = malloc(sizeof(IPTEntry));
    new_entry->pid = pid;
    new_entry->vpn = vpn;
    new_entry->pfn = pfn;
    atomic_store(&new_entry->control, control);

    spin_lock(&ipt->mod_locks[bucket]);

    // Link at head (atomic pointer update)
    new_entry->next = atomic_load(&ipt->hash_heads[bucket]);
    atomic_store(&ipt->hash_heads[bucket], new_entry);

    spin_unlock(&ipt->mod_locks[bucket]);
    return true;
}

// Delete: RCU deferred free
bool ipt_delete_rcu(ConcurrentIPT* ipt, uint16_t pid, uint64_t vpn) {
    uint32_t bucket = hash(pid, vpn);

    spin_lock(&ipt->mod_locks[bucket]);

    IPTEntry* prev = NULL;
    IPTEntry* entry = atomic_load(&ipt->hash_heads[bucket]);

    while (entry != NULL) {
        if (entry->pid == pid && entry->vpn == vpn) {
            // Unlink from chain
            if (prev == NULL) {
                atomic_store(&ipt->hash_heads[bucket], entry->next);
            } else {
                prev->next = entry->next;
            }
            spin_unlock(&ipt->mod_locks[bucket]);

            // Wait for all readers to finish, then free
            // (rcu_free_entry is the callback that frees the containing entry)
            call_rcu(&entry->rcu_head, rcu_free_entry);
            return true;
        }
        prev = entry;
        entry = entry->next;
    }

    spin_unlock(&ipt->mod_locks[bucket]);
    return false;
}

On architectures with hardware page table walkers (like PowerPC with its HTAB), the hardware handles much of the synchronization: software only locks during modifications, and hardware reads are inherently safe. Software-walked TLB systems (some MIPS, older ARM) require more careful software synchronization.
Managing process identifiers in a system-wide table environment requires careful coordination between several system components. The identifier serves as a critical discriminator in all address translations.
Identifier Types:
1. Process ID (PID): The operating system's own identifier for the process. It uniquely names the owner but is often wider (32 bits or more) than hardware wants to store in every entry.
2. Address Space Identifier (ASID): A short, hardware-visible tag (typically 8-16 bits) assigned to an address space and used to qualify TLB entries and table lookups.
3. Virtual Segment ID (VSID): A per-segment identifier used in PowerPC-style hashed page tables, where each process's segments map to system-wide VSIDs that feed the hash function.
ASID Lifecycle Management:
Process Creation:
1. Allocate new process (get PID from OS)
2. Request ASID from ASID allocator
- If free ASID available: assign it
- If no free ASIDs: reclaim one (see below)
3. Record PID ↔ ASID mapping
4. Initialize page table entries with this ASID
Process Execution:
1. On context switch, load ASID into hardware register
2. All TLB lookups automatically filtered by ASID
3. Page table lookups include ASID in search key
Process Exit:
1. Invalidate all TLB entries for this ASID
2. Remove all page table entries with this PID
3. Release ASID back to free pool
4. Clear PID ↔ ASID mapping
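To tie the exit steps together, here is a minimal sketch of the teardown path. The helper names (tlb_flush_asid, ipt_remove_all_for_asid, asid_release, clear_pid_asid_mapping) are placeholders for illustration, not a real kernel API:

```c
#include <stdint.h>

// Assumed helpers - names are illustrative placeholders
void tlb_flush_asid(uint16_t asid);              // Step 1: drop cached translations
void ipt_remove_all_for_asid(uint16_t asid);     // Step 2: unlink this ASID's table entries
void asid_release(uint16_t asid);                // Step 3: return ASID to the free pool
void clear_pid_asid_mapping(uint32_t pid);       // Step 4: forget the PID <-> ASID link

// Teardown when a process exits, following the lifecycle steps above
void on_process_exit(uint32_t pid, uint16_t asid) {
    tlb_flush_asid(asid);            // No stale translation may outlive the ASID
    ipt_remove_all_for_asid(asid);   // Freed frames go back to the allocator
    asid_release(asid);              // Safe to recycle only after the flush above
    clear_pid_asid_mapping(pid);
}
```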
ASID Exhaustion and Recycling:
With only 256 ASIDs (8-bit) and potentially thousands of processes:
Scenario: 300 processes active, only 256 ASIDs
Solution 1: ASID Versioning (Generational)
- Maintain generation counter G
- Full ASID is (G * 256 + raw_asid)
- TLB entries tagged with full ASID
- When raw_asid wraps, increment G and flush TLB
- Processes with old-generation ASIDs get new ones on next run
Solution 2: ASID Stealing
- When out of ASIDs, pick a victim process
- Flush victim's TLB entries
- Reassign victim's ASID to new process
- Victim gets new ASID when it runs again
Solution 3: Lazy ASID Allocation
- Only assign ASID when process is about to run
- Release ASID if process is idle too long
- Active process set << total process set
// ASID Allocator with generation tracking

#include <stdint.h>
#include <stdbool.h>

#define ASID_BITS 8
#define NUM_ASIDS (1 << ASID_BITS)

typedef struct {
    uint64_t generation;                    // Current generation
    uint8_t  next_raw_asid;                 // Next raw ASID to allocate
    uint32_t asid_to_pid[NUM_ASIDS];        // Who owns each ASID
    uint64_t asid_generation[NUM_ASIDS];    // Generation when assigned
    spinlock_t lock;
} ASIDAllocator;

typedef struct {
    uint64_t full_asid;      // (generation << ASID_BITS) | raw_asid
    bool valid;              // Was allocation successful
} ASIDAllocation;

// Allocate ASID for process
ASIDAllocation allocate_asid(ASIDAllocator* allocator, uint32_t pid) {
    ASIDAllocation result = {0};

    spin_lock(&allocator->lock);

    uint8_t raw_asid = allocator->next_raw_asid++;

    // Check for wraparound
    if (allocator->next_raw_asid == 0) {
        // Generation rollover - must flush all TLBs
        allocator->generation++;
        tlb_flush_all();
        raw_asid = 1;                       // 0 often reserved
        allocator->next_raw_asid = 2;       // Next caller gets ASID 2
    }

    // Record allocation
    allocator->asid_to_pid[raw_asid] = pid;
    allocator->asid_generation[raw_asid] = allocator->generation;

    // Compute full ASID
    result.full_asid = (allocator->generation << ASID_BITS) | raw_asid;
    result.valid = true;

    spin_unlock(&allocator->lock);
    return result;
}

// Check if process's ASID is still valid
bool asid_valid(ASIDAllocator* allocator, uint32_t pid, uint64_t process_asid) {
    uint8_t raw_asid = process_asid & (NUM_ASIDS - 1);
    uint64_t asid_gen = process_asid >> ASID_BITS;

    spin_lock(&allocator->lock);
    bool valid = (allocator->asid_to_pid[raw_asid] == pid &&
                  allocator->asid_generation[raw_asid] == asid_gen);
    spin_unlock(&allocator->lock);

    return valid;
}

// Free ASID when process exits
void free_asid(ASIDAllocator* allocator, uint32_t pid, uint64_t asid) {
    uint8_t raw_asid = asid & (NUM_ASIDS - 1);

    spin_lock(&allocator->lock);

    if (allocator->asid_to_pid[raw_asid] == pid) {
        allocator->asid_to_pid[raw_asid] = 0;   // Mark as free
        // Flush TLB entries for this ASID
        tlb_flush_asid(raw_asid);
    }

    spin_unlock(&allocator->lock);
}

ARM64 supports 8-bit or 16-bit ASIDs (configurable). Intel PCID is 12 bits (4096 values). Larger ASID spaces reduce recycling frequency and TLB flush overhead, making system-wide table designs more practical for systems with many processes.
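As a usage sketch building on the allocator above, a scheduler could combine generation checking with lazy assignment (Solutions 1 and 3): on each context switch it revalidates the process's cached ASID and allocates a fresh one only if a rollover invalidated it. The Process fields and set_current_asid are assumptions for illustration:

```c
typedef struct {
    uint32_t pid;
    uint64_t cached_asid;   // Full ASID (generation | raw); 0 = never assigned
} Process;

// Called on every context switch into 'next'
void install_asid(ASIDAllocator* allocator, Process* next) {
    if (next->cached_asid == 0 ||
        !asid_valid(allocator, next->pid, next->cached_asid)) {
        // First run, or a generation rollover invalidated the old ASID
        ASIDAllocation a = allocate_asid(allocator, next->pid);
        next->cached_asid = a.full_asid;
    }
    // The hardware register holds only the raw bits
    set_current_asid((uint16_t)(next->cached_asid & (NUM_ASIDS - 1)));
}
```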
The choice of system-wide versus per-process page tables ripples through many aspects of operating system design. Understanding these implications helps evaluate when each approach is appropriate.
Memory Allocator Integration:
System-wide tables tightly couple with physical memory allocation:
Per-Process Tables:
- Process can have arbitrary VPN → PFN mappings
- No constraint between frame number and table location
- Allocator returns any free frame
Inverted Table:
- Entry for frame F is at IPT[F]
- When allocating frame F, must update IPT[F] in place
- Allocator must coordinate with table update
Implications: frame allocation and the corresponding IPT entry update must happen together, typically under the same lock or as a single atomic step, so the allocator and the page table code cannot be designed in isolation (a sketch follows below).
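A rough sketch of that coupling, assuming a simple bitmap frame allocator and an array-style inverted table where entry i describes frame i (all names and the locking scheme are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_FRAMES 65536

typedef struct {
    uint16_t asid;
    uint64_t vpn;
    uint32_t control;   // Valid bit, permissions, etc.
} IPTSlot;

extern IPTSlot    ipt[NUM_FRAMES];                 // Entry i describes frame i
extern uint64_t   frame_bitmap[NUM_FRAMES / 64];   // Bit set = frame in use
extern spinlock_t alloc_lock;

// Allocate a frame and fill in its fixed-position IPT entry in one step
int64_t allocate_frame_for(uint16_t asid, uint64_t vpn, uint32_t control) {
    spin_lock(&alloc_lock);
    for (uint64_t f = 0; f < NUM_FRAMES; f++) {
        if (!(frame_bitmap[f / 64] & (1ULL << (f % 64)))) {
            frame_bitmap[f / 64] |= (1ULL << (f % 64));   // Claim frame f
            ipt[f] = (IPTSlot){ .asid = asid, .vpn = vpn, .control = control };
            spin_unlock(&alloc_lock);
            return (int64_t)f;   // The entry's location was dictated by the frame number
        }
    }
    spin_unlock(&alloc_lock);
    return -1;  // Out of physical memory
}
```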
Kernel Mapping Strategies:
Kernel virtual addresses must work for all processes:
Per-Process (Linux/x86-64):
0x0000... - 0x7FFF...: User space
0x8000... - 0xFFFF...: Kernel space
Every process table has
identical kernel entries
System-Wide (Inverted):
IPT entries:
[Frame 0]: global, VPN: 0x8000
[Frame 1]: global, VPN: 0x8001
...
[Frame K]: ASID:5, VPN: 0x100
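A common way to make kernel mappings work under ASID tagging is a 'global' flag that matches every address space, as hinted at by the TLBEntry structure earlier. A minimal sketch of the matching rule (names are illustrative):

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool     valid;
    bool     global;     // Kernel page: visible to every address space
    uint16_t asid;
    uint64_t vpn;
    uint64_t pfn;
} CachedTranslation;

// An entry matches if the VPN matches and either the entry is global
// (kernel mapping) or it belongs to the currently running address space.
bool translation_matches(const CachedTranslation* e,
                         uint16_t current_asid, uint64_t vpn) {
    return e->valid &&
           e->vpn == vpn &&
           (e->global || e->asid == current_asid);
}
```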
Debugging and Introspection:
System-wide tables affect how debuggers and monitoring tools work:
Per-Process Tables:
- Tools such as /proc/PID/maps can show a process's virtual layout by walking that process's own table

Inverted Tables:
- Reconstructing a single process's virtual layout requires scanning the entire shared table and filtering entries by PID/ASID, as sketched below
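The scan below sketches what producing a /proc/PID/maps-style listing costs with an inverted table: every physical frame's entry must be examined, however small the process is. The array layout and names are assumptions for illustration:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT 12

typedef struct {
    bool     valid;
    uint16_t asid;
    uint64_t vpn;
} IPTViewEntry;

// Print every virtual page owned by 'asid'. Cost is O(physical frames),
// regardless of how few pages the process actually maps.
void dump_process_mappings(const IPTViewEntry* ipt, uint64_t num_frames,
                           uint16_t asid) {
    for (uint64_t frame = 0; frame < num_frames; frame++) {
        if (ipt[frame].valid && ipt[frame].asid == asid) {
            printf("vaddr 0x%llx -> frame %llu\n",
                   (unsigned long long)(ipt[frame].vpn << PAGE_SHIFT),
                   (unsigned long long)frame);
        }
    }
}
```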
Security Isolation Considerations:
Per-Process Tables:
- Isolation by construction: Different tables, no overlap possible
- Compromise of one table doesn't affect others
- KASLR: Each process can have different kernel randomization*
Inverted Tables:
- Isolation by PID check: Same table, access filtered by PID
- Corruption of IPT affects all processes
- Must ensure PID checks are unforgeable
*Note: In practice, KASLR typically uses same kernel layout for all processes for simplicity.
Modern commodity systems (x86-64, ARM64) predominantly use per-process multi-level tables with PCID/ASID support. Inverted tables remain important in specialized domains (PowerPC servers, some embedded systems) and as a conceptual alternative that illuminates different trade-offs in memory management design.
System-wide page tables represent a fundamentally different organization than per-process tables, with implications spanning context switching, TLB management, concurrency control, and overall system architecture.
What's Next:
The final page explores sharing challenges—the most significant limitation of inverted page tables. When multiple processes share physical frames (shared libraries, shared memory, copy-on-write after fork), the single-entry-per-frame constraint creates complexity that doesn't exist with per-process tables.
You now understand system-wide page table organization comprehensively—from the fundamental contrast with per-process tables through context switching, TLB management, concurrency control, and ASID management. This knowledge illuminates how table organization choices affect every aspect of memory management system design.