Virtual Memory Concepts

Level: Intermediate

Duration: 60 mins

Topic: Virtual Memory Concepts

Virtual Address Space

The Great Memory Illusion

Every program you run believes it owns the entire machine. When you write a C program that declares an array of a million integers, it doesn't negotiate with other programs for space. When a web browser opens dozens of tabs, each tab operates as if it has uncontested access to memory. This isn't naivety on the part of your programs—it's the result of one of the most elegant abstractions in computer science: the virtual address space.

The virtual address space is a fundamental illusion maintained by the operating system and hardware working in concert. Each process is given its own private, contiguous address space that appears to span from address 0 to some maximum value (for example, 2⁴⁸ - 1 on x86-64 systems with 48-bit virtual addresses). This space is independent of the physical memory layout, isolated from other processes, and under the control of the operating system.

Understanding virtual address spaces isn't just academic knowledge—it's essential for debugging memory issues, understanding security vulnerabilities, optimizing program performance, and designing systems that scale. This page provides a comprehensive exploration of this critical concept.

What You Will Learn

By the end of this page, you will understand what a virtual address space is, how it differs from physical memory, why every process has its own independent address space, and how this abstraction enables the sophisticated memory management capabilities of modern operating systems.

Defining Virtual Address Space

A virtual address space is the logical view of memory as seen by a running process. It is an abstraction that presents each process with its own private, contiguous range of addresses, completely independent of the actual physical memory layout or the presence of other processes in the system.

The key insight: From a process's perspective, it is the only entity using memory, and that memory forms a single, unbroken sequence of addresses starting from zero. This is fundamentally different from the physical reality, where:

Physical memory is shared among all processes
Physical memory may not be contiguous for any single process
Physical memory has a fixed, finite size
Other processes occupy portions of physical memory simultaneously

Virtual Address Space vs Physical Memory: Fundamental Differences
Characteristic	Virtual Address Space	Physical Memory
Ownership	Private to each process	Shared among all processes
Contiguity	Appears as one linear range (sparsely mapped)	A process's frames may be scattered
Size	Can exceed physical RAM	Fixed hardware limit
Starting Address	Begins at 0 (or near 0)	One physical range shared by all processes
Visibility	Only own addresses visible	Contains all process data
Lifetime	Exists while process runs	Persists across processes

The Programmer's Mental Model

When you write code, you naturally think in terms of virtual addresses. A pointer variable contains a virtual address. Array indexing computes virtual addresses. Stack frames and heap allocations exist within the virtual address space. Understanding this distinction is fundamental to systems programming.

Anatomy of a Virtual Address Space

A virtual address space has a well-defined structure, divided into distinct regions that serve specific purposes. Understanding this layout is crucial for debugging, security analysis, and performance optimization.

The canonical layout (from low addresses to high addresses):

The exact layout varies by operating system and CPU architecture, but the conceptual organization follows consistent principles across platforms.

Standard Virtual Address Space Layout (Low to High Addresses)
Region	Address Range (conceptual)	Purpose	Growth Direction
Text (Code)	Low addresses	Executable machine code	Fixed size
Data	After text	Initialized global/static variables	Fixed size
BSS	After data	Uninitialized global/static variables	Fixed size
Heap	After BSS	Dynamic allocations (malloc, new)	Grows upward ↑
[Unmapped Gap]	Middle range	Reserved for growth	—
Stack	Near top	Function call frames, local variables	Grows downward ↓
Kernel Space	Highest addresses	OS kernel (protected)	Fixed size

The heap-stack gap:

The large unmapped region between the heap and stack serves multiple purposes:

Growth accommodation — Both heap and stack can grow into this space without colliding
Security — Makes it harder to predict addresses for exploitation
Flexibility — Allows memory-mapped files to be placed in this region
ASLR support — Enables randomization of library load addresses

Why this layout matters:

The separation of code (text) from data isn't arbitrary—it enables crucial optimizations and security features:

Code sharing: Multiple instances of the same program can share one physical copy of the text segment
Write protection: Code pages can be marked read-only, so existing code cannot be overwritten by accident or by an attacker
Execution prevention: Data pages can be marked non-executable (DEP/NX bit), so injected data cannot be run as code
Copy-on-write: Forked processes initially share physical pages and receive private copies only when one of them writes

The Kernel Lives in Every Address Space

The kernel space at the top of every virtual address space maps to the same physical memory in all processes. This design allows efficient system calls: the process doesn't need to switch address spaces to enter the kernel, only privilege levels. However, vulnerabilities such as Meltdown prompted kernel page table isolation (KPTI), which removes most kernel mappings from the page tables used while running in user mode on affected systems.

Virtual Addresses in Practice

Let's examine how virtual addresses manifest in real programs. Understanding these concrete details bridges the gap between abstract concepts and practical debugging.

Examining a process's memory map:

On Linux systems, you can examine the virtual address space of a process (your own, or any other you have permission to inspect) by reading /proc/[pid]/maps. Here's an annotated example:

Process Memory Map Example (/proc/self/maps)
# Sample output from cat /proc/self/maps (annotated)
 
# Text segment (code) - note the 'r-x' permissions (read, execute)
00400000-0040c000 r-xp 00000000 08:01 262146    /bin/cat
 
# Read-only data 
0060b000-0060c000 r--p 0000b000 08:01 262146    /bin/cat
 
# Initialized data (read-write)
0060c000-0060d000 rw-p 0000c000 08:01 262146    /bin/cat
 
# Heap - grows upward from here
0060d000-0062e000 rw-p 00000000 00:00 0         [heap]
 
# Memory-mapped libraries (shared)
7f8e4a000000-7f8e4a1bc000 r-xp 00000000 08:01 524291  /lib/x86_64-linux-gnu/libc-2.27.so
 
# Thread local storage, vdso, etc.
7f8e4a5c4000-7f8e4a5c8000 rw-p 00000000 00:00 0
 
# Stack - grows downward toward lower addresses  
7ffd5c4e0000-7ffd5c501000 rw-p 00000000 00:00 0     [stack]
 
# vsyscall page: legacy kernel-provided mapping at a fixed address in the upper (kernel) half
ffffffffff600000-ffffffffff601000 r-xp ...       [vsyscall]
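
To work with this format programmatically, each line can be split into its fields. The following is a minimal sketch in Python, not an official parser: the `Mapping` type and the `parse_maps_line` helper are names invented here for illustration, and the sample line is copied from the listing above.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    """One region from a /proc/[pid]/maps-style listing (illustrative type)."""
    start: int
    end: int
    perms: str
    offset: int
    device: str
    inode: int
    path: str


def parse_maps_line(line: str) -> Mapping:
    """Parse one line in the format shown above."""
    # Five whitespace-separated fields, then an optional path or label.
    fields = line.split(maxsplit=5)
    addr_range, perms, offset, device, inode = fields[:5]
    path = fields[5].strip() if len(fields) > 5 else ""
    start_hex, end_hex = addr_range.split("-")
    return Mapping(int(start_hex, 16), int(end_hex, 16), perms,
                   int(offset, 16), device, int(inode), path)


sample = "0060d000-0062e000 rw-p 00000000 00:00 0         [heap]"
heap = parse_maps_line(sample)
size_kib = (heap.end - heap.start) // 1024
print(f"{heap.path}: {size_kib} KiB, perms={heap.perms}")  # [heap]: 132 KiB, perms=rw-p
```

On a Linux machine the same function could be applied to every line of `open("/proc/self/maps")` to list the regions of the running interpreter.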

Understanding the permission flags:

Each memory region has four permission characters:

Flag	Meaning	Security Implication
`r`	Readable	Can read data from this region
`w`	Writable	Can modify data in this region
`x`	Executable	CPU can execute instructions here
`p`/`s`	Private/Shared	Private (copy-on-write) or shared
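
The four-character permission string can be decoded mechanically. The sketch below is illustrative only: `decode_perms` and `is_writable_and_executable` are hypothetical helper names, and flagging regions that are both writable and executable is a common audit idea related to the NX/DEP protection mentioned earlier, not something the maps file does for you.

```python
def decode_perms(perms: str) -> dict:
    """Decode a permission string such as 'r-xp' into named booleans."""
    if len(perms) != 4:
        raise ValueError(f"expected 4 characters, got {perms!r}")
    return {
        "read": perms[0] == "r",
        "write": perms[1] == "w",
        "execute": perms[2] == "x",
        "shared": perms[3] == "s",  # 'p' means private (copy-on-write)
    }


def is_writable_and_executable(perms: str) -> bool:
    """True when a region could be both modified and executed."""
    flags = decode_perms(perms)
    return flags["write"] and flags["execute"]


for perms in ("r-xp", "rw-p", "rwxp"):
    warning = "  <- writable and executable" if is_writable_and_executable(perms) else ""
    print(perms, decode_perms(perms), warning)
```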

The virtual address space is sparse:

Notice the gaps between memory regions. Not all addresses in a virtual address space are valid—attempting to access an unmapped address triggers a segmentation fault (SIGSEGV on Unix systems). This sparsity is actually beneficial:

Memory efficiency — Only used regions consume physical memory or page table entries
Security — Invalid access is immediately detected
Flexibility — Regions can be added or resized dynamically

Virtual Addresses Are Not Physical Addresses

A critical insight for systems programmers: the virtual address 0x7ffd5c4e0000 has no inherent relationship to physical location 0x7ffd5c4e0000. The same virtual address in two different processes typically maps to completely different physical locations. Only the hardware MMU and operating system know the true mapping.

Address Space Size and Addressing

The size of a virtual address space is determined by the CPU's addressing capability—specifically, the number of bits in a virtual address.

32-bit vs 64-bit addressing:

Virtual Address Space Sizes by Architecture
Architecture	Address Bits	Virtual Space Size	Practical Limit
IA-32 (x86)	32 bits	4 GB (2³²)	~3 GB user space
x86-64 (canonical, 4-level paging)	48 bits	256 TB (2⁴⁸)	~128 TB user space
x86-64 (5-level paging)	57 bits	128 PB (2⁵⁷)	~64 PB user space
x86-64 (full)	64 bits	16 EB (2⁶⁴)	Not currently used
ARM64	48 bits	256 TB (2⁴⁸)	Configurable

Why x86-64 uses only 48 bits:

Although 64-bit registers can hold 64-bit addresses, most x86-64 implementations use only 48 bits for virtual addresses (newer CPUs with 5-level paging extend this to 57 bits). This isn't a fundamental hardware limitation but a practical design choice:

Page table size — Full 64-bit addressing would require enormous page tables
Physical constraints — No current system needs 16 exabytes of virtual space
Future expansion — The architecture allows future CPUs to extend to more bits
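
The sizes in the table follow directly from the number of address bits. As a quick check, here is a small Python sketch; `address_space_size` is a name made up for this example, and it prints binary units (GiB, TiB, EiB), which is what the table's GB, TB, and EB figures mean.

```python
def address_space_size(bits: int) -> str:
    """Return 2**bits bytes as a human-readable string in binary units."""
    size = 1 << bits
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]
    index = 0
    while size >= 1024 and index < len(units) - 1:
        size //= 1024
        index += 1
    return f"{size} {units[index]}"


for bits in (32, 48, 64):
    print(f"{bits}-bit addresses: {address_space_size(bits)}")
# 32-bit addresses: 4 GiB
# 48-bit addresses: 256 TiB
# 64-bit addresses: 16 EiB
```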

Canonical addresses:

In x86-64, valid addresses must be 'canonical'—bits 48-63 must all be copies of bit 47. This creates two valid ranges:

User space: 0x0000000000000000 to 0x00007fffffffffff (lower 128 TB)
Kernel space: 0xffff800000000000 to 0xffffffffffffffff (upper 128 TB)

The 'hole' in the middle (non-canonical addresses) is reserved for future expansion.
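
The canonical-address rule is easy to express as a bit test. The following sketch assumes 48-bit virtual addresses by default; `is_canonical` and its `va_bits` parameter are illustrative names, not part of any real API.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """Check whether a 64-bit address is canonical for the given address width."""
    if not 0 <= addr < (1 << 64):
        raise ValueError("address must fit in 64 bits")
    # Bits 63 down to (va_bits - 1) must be all zeros or all ones.
    upper = addr >> (va_bits - 1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones


for addr in (0x00007FFFFFFFFFFF, 0x0000800000000000,
             0xFFFF7FFFFFFFFFFF, 0xFFFF800000000000):
    print(f"{addr:#018x} canonical: {is_canonical(addr)}")
```

The first and last addresses are the edges of the user and kernel ranges listed above; the two in between fall inside the non-canonical hole.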

Canonical Address Visualization
Virtual Address Space Layout (x86-64)
═══════════════════════════════════════════════════════════════════
 
0xFFFFFFFFFFFFFFFF ┌─────────────────────────────────────────────┐
                   │                                             │
                   │            KERNEL SPACE                     │
                   │         (Upper 128 TB)                      │
                   │     0xFFFF800000000000 - 0xFFFFFFFFFFFFFFFF │
                   │                                             │
0xFFFF800000000000 ├─────────────────────────────────────────────┤
                   │                                             │
                   │          NON-CANONICAL HOLE                 │
                   │                                             │
                   │      (Invalid - reserved for future)        │
                   │                                             │
0x00007FFFFFFFFFFF ├─────────────────────────────────────────────┤
                   │                                             │
                   │             USER SPACE                      │
                   │          (Lower 128 TB)                     │
                   │     0x0000000000000000 - 0x00007FFFFFFFFFFF │
                   │                                             │
0x0000000000000000 └─────────────────────────────────────────────┘

Why 128 TB Is Still Enormous

128 TB of virtual address space seems excessive, but remember: modern applications can memory-map huge files, use sparse data structures, and benefit from address space randomization. Having abundant address space enables these techniques without the fragmentation problems that plagued 32-bit systems.

Per-Process Address Space Independence

One of the most powerful properties of virtual address spaces is isolation: each process has its own independent address space. This isolation is absolute—there is no direct way for one process to access another's memory through virtual addresses.

The implications of independence:

Benefits of Address Space Independence

•Security: A malicious or buggy process cannot directly corrupt another process's memory. There's no address it can compute that will access another process's data.
•Stability: A crashed process affects only itself. Its corrupted pointers and wild writes are contained within its own address space.
•Simplicity: Programmers don't need to coordinate memory usage between processes. Each process can use addresses 0x1000, 0x2000, etc., without conflict.
•Relocation: Programs don't need to know where they'll be loaded in physical memory. They are linked against virtual addresses only (fixed ones, or randomized ones when ASLR is in use), and the OS is free to back them with any physical frames.
•Flexibility: The OS can rearrange physical memory behind the scenes without processes knowing or caring.

Same virtual address, different physical locations:

Consider two processes, A and B, running simultaneously. Both might have code loaded at virtual address 0x400000 and stack at 0x7fff00000000. These identical virtual addresses refer to completely different physical memory locations.
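
This effect can be observed directly on a Unix-like system. The sketch below is one possible demonstration, assuming `os.fork` is available (Linux, macOS): after a fork, parent and child hold the same virtual address for a variable, yet a write in the child lands in the child's private copy of the page. The function name is invented for this example.

```python
import ctypes
import os


def same_address_different_contents():
    """Fork, change a value in the child, and report what each process sees."""
    value = ctypes.c_int(1)
    address = ctypes.addressof(value)
    read_end, write_end = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: same virtual address, but writes go to its own copy-on-write page.
        os.close(read_end)
        value.value = 2
        report = f"{ctypes.addressof(value):#x} {value.value}"
        os.write(write_end, report.encode())
        os._exit(0)

    # Parent: collect the child's report, then read its own (unchanged) value.
    os.close(write_end)
    child_report = os.read(read_end, 64).decode()
    os.close(read_end)
    os.waitpid(pid, 0)
    child_addr, child_value = child_report.split()
    return address, int(child_addr, 16), value.value, int(child_value)


parent_addr, child_addr, parent_value, child_value = same_address_different_contents()
print(f"parent: {parent_addr:#x} holds {parent_value}")
print(f"child:  {child_addr:#x} holds {child_value}")
```

Both lines show the same address, but the parent still sees 1 while the child sees 2.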

The translation mechanism:

The operating system maintains a separate set of page tables for each process, and the CPU's Memory Management Unit (MMU) walks whichever set is currently active. When Process A is running, the MMU uses Process A's page tables to translate virtual addresses to physical addresses. When the OS context-switches to Process B, it switches the active page table, and the same virtual addresses now translate to Process B's physical pages.
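
A toy model makes the role of the active page table concrete. This is a deliberately simplified sketch: real page tables are multi-level structures walked by hardware, while here each "page table" is a flat Python dict from virtual page number to frame number. `ToyMMU`, `table_a`, `table_b`, and the frame numbers are all hypothetical.

```python
PAGE_SHIFT = 12  # 4 KB pages


class ToyMMU:
    """Toy model: translates using whichever page table is currently active."""

    def __init__(self):
        self.active_table = None  # plays the role of the page table base register

    def context_switch(self, page_table: dict) -> None:
        self.active_table = page_table

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        frame = self.active_table[vpn]  # a KeyError stands in for a page fault
        return (frame << PAGE_SHIFT) | offset


# Both processes use virtual page 0x400, backed by different frames.
table_a = {0x400: 0x100}
table_b = {0x400: 0x2A0}

mmu = ToyMMU()
mmu.context_switch(table_a)
addr_a = mmu.translate(0x400123)
mmu.context_switch(table_b)
addr_b = mmu.translate(0x400123)
print(hex(addr_a), hex(addr_b))  # 0x100123 0x2a0123
```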

Context switch and address space:

Address Space Isolation Visualization
Process A's View          Process B's View           Physical Memory
════════════════          ════════════════           ═══════════════
    
0xFFFF... ┌───────┐       0xFFFF... ┌───────┐       ┌───────┐ 0x00000000
          │Kernel │                 │Kernel │       │ OS    │
          ├───────┤                 ├───────┤       ├───────┤ 0x00100000
          │ Stack │                 │ Stack │       │ P_A   │ ─── A's Code
          ├───────┤                 ├───────┤       │ code  │
          │       │                 │       │       ├───────┤ 0x00200000
          │       │                 │       │       │ P_B   │ ─── B's Stack
          ├───────┤                 ├───────┤       │ stack │
          │ Heap  │                 │ Heap  │       ├───────┤ 0x00300000
          ├───────┤                 ├───────┤       │ P_A   │ ─── A's Heap
          │ Data  │                 │ Data  │       │ heap  │
          ├───────┤                 ├───────┤       ├───────┤ 0x00400000
          │ Code  │                 │ Code  │       │ P_B   │ ─── B's Code
0x0000... └───────┘       0x0000... └───────┘       └───────┘
                                                    │ ... etc
 
Same virtual layout, but each process's regions are backed by different physical pages!

Shared Memory Is Opt-In

While address spaces are isolated by default, processes can explicitly request shared memory regions for inter-process communication. This is an exception that proves the rule—sharing requires explicit coordination through the operating system, not accidental address collisions.

Virtual Address Composition

A virtual address is not just a single number—it's a structured value composed of multiple fields, each serving a specific purpose in the address translation process. Understanding this structure is essential for grasping how virtual memory works at the hardware level.

The fundamental split: page number and offset

Every virtual address is divided into at least two parts:

Virtual Page Number (VPN) — Identifies which page of the virtual address space
Page Offset — Identifies the specific byte within that page

For a system with 4 KB pages (2¹² bytes):

Virtual Address Structure (4 KB pages)
Virtual Address (32-bit system with 4 KB pages):
─────────────────────────────────────────────────────
 
│ 31                           12 │ 11           0 │
├─────────────────────────────────┼────────────────┤
│          VPN (20 bits)          │ Offset (12 b)  │
└─────────────────────────────────┴────────────────┘
        ↓                                  ↓
   Identifies the              Identifies the byte
   specific page               within that page
   (2^20 = 1M pages)           (2^12 = 4096 bytes)
 
 
Example: Virtual Address 0x00403500
═══════════════════════════════════
 
Hex:    0x00403500
Binary: 0000 0000 0100 0000 0011 0101 0000 0000
 
Split at bit 12:
  VPN:    0x00403 (which page)
  Offset: 0x500   (position within page = byte 1280)
 
This address refers to byte 1280 of virtual page 0x403.
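
The same split can be reproduced with a shift and a mask. A minimal sketch, assuming 4 KB pages as in the example above (`split_address` is an illustrative name):

```python
PAGE_SIZE = 4096
OFFSET_BITS = PAGE_SIZE.bit_length() - 1  # 12 for 4 KB pages


def split_address(vaddr: int) -> tuple:
    """Return (virtual page number, offset within page)."""
    return vaddr >> OFFSET_BITS, vaddr & (PAGE_SIZE - 1)


vpn, offset = split_address(0x00403500)
print(f"VPN={vpn:#x}, offset={offset:#x} ({offset})")  # VPN=0x403, offset=0x500 (1280)
```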

Multi-level page tables add more fields:

Modern systems use hierarchical page tables to avoid having a single enormous page table. This means the VPN is further subdivided:

Virtual Address Fields by Architecture
Architecture	Page Size	Level 5	Level 4	Level 3	Level 2	Level 1	Offset
IA-32 (2-level)	4 KB	—	—	—	10 bits	10 bits	12 bits
x86-64 (4-level)	4 KB	—	9 bits	9 bits	9 bits	9 bits	12 bits
x86-64 (5-level)	4 KB	9 bits	9 bits	9 bits	9 bits	9 bits	12 bits
ARM64 (4-level)	4 KB	—	9 bits	9 bits	9 bits	9 bits	12 bits

x86-64 Virtual Address Structure (4-level paging)
x86-64 Virtual Address with 4-Level Paging:
══════════════════════════════════════════════
 
│ 63    48 │ 47   39 │ 38   30 │ 29   21 │ 20   12 │ 11    0 │
├──────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Sign Ext │  PML4   │  PDPT   │   PD    │   PT    │ Offset  │
│ (16 bits)│(9 bits) │(9 bits) │(9 bits) │(9 bits) │(12 bits)│
└──────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
              ↓         ↓         ↓         ↓         ↓
           Page       Page       Page      Page      Byte
           Map       Directory  Directory  Table    within
           Level 4    Pointer   (Level2)  (Level1)   Page
 
Translation path:
CR3 → PML4[index] → PDPT[index] → PD[index] → PT[index] → Physical Page + Offset
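
The field boundaries in the diagram translate into shifts and 9-bit masks. The sketch below extracts the four table indices and the offset from an address; `split_x86_64` is a name chosen for this example, and the sample address is built from known indices so the result is easy to verify by eye.

```python
def split_x86_64(vaddr: int) -> dict:
    """Split a 48-bit x86-64 virtual address into its 4-level paging fields."""
    index_mask = 0x1FF  # 9 bits per level
    return {
        "pml4": (vaddr >> 39) & index_mask,
        "pdpt": (vaddr >> 30) & index_mask,
        "pd": (vaddr >> 21) & index_mask,
        "pt": (vaddr >> 12) & index_mask,
        "offset": vaddr & 0xFFF,
    }


# Construct an address whose indices are 1, 2, 3, 4 and whose offset is 5.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
fields = split_x86_64(vaddr)
print(fields)  # {'pml4': 1, 'pdpt': 2, 'pd': 3, 'pt': 4, 'offset': 5}
```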

The Offset Is Preserved

A crucial property: the page offset bits are never translated—they're copied directly from the virtual address to the physical address. This is because pages (and frames) are the same size, so the byte position within a page is the same whether viewing it virtually or physically.

Virtual vs Logical vs Linear Addresses

The terminology around memory addresses can be confusing, as different architectures and textbooks use varying terms. Let's clarify the distinctions:

In most modern systems, these terms are effectively synonymous:

Virtual Address — The address a program uses; undergoes translation to physical
Logical Address — Often used interchangeably with virtual address
Linear Address — In x86 terminology, the address after segmentation but before paging

The x86 historical context:

In legacy x86 systems with segmentation, there was a distinction:

Logical Address = Segment:Offset (what the program specifies)
Linear Address = After segment translation (segment base + offset)
Physical Address = After page translation (what appears on memory bus)

In modern x86-64 long mode, segmentation is essentially disabled (flat memory model), so logical ≈ linear ≈ virtual.

Address Translation Chain (Historical x86 vs Modern)
Historical x86 Protected Mode (Segmentation + Paging):
═══════════════════════════════════════════════════════
 
Logical Address (selector:offset)
        │
        ▼ Segment Translation
        │  (add segment base from descriptor)
        │
Linear Address (32 bits)
        │
        ▼ Page Translation
        │  (page table walk)
        │
Physical Address
 
 
Modern x86-64 Long Mode (Flat Model):
═════════════════════════════════════
 
Virtual Address (64 bits, 48 used)
        │
        │ Segmentation: FS/GS only (for TLS)
        │ All other segments have base 0, limit max
        │
Linear Address ≈ Virtual Address
        │
        ▼ Page Translation (4-level page tables)
        │
Physical Address

Terminology in This Course

Throughout this course, we use 'virtual address' to mean the address used by programs that requires translation to a physical address. This aligns with modern usage and avoids the complexity of legacy segmentation schemes that are rarely relevant today.

Hardware Support for Virtual Address Spaces

Virtual address spaces are not purely a software abstraction—they require intimate hardware support. The CPU must translate every memory access from virtual to physical addresses, and it must do so incredibly fast (every memory operation, many times per instruction).

Key hardware components:

Hardware Supporting Virtual Address Spaces
Component	Location	Function	Performance Impact
MMU	Inside CPU	Performs address translation	Every memory access
TLB	Inside CPU	Caches recent translations	Critical for performance
Page Table Base Register	CPU register (CR3)	Points to current page table	Changed on context switch
Page Table Walker	Inside MMU	Traverses page table hierarchy	On TLB miss
Page Fault Exception	CPU exception logic	Raised on an invalid or not-present access	Traps to the OS kernel's page fault handler

The critical path—every memory access:

CPU generates a virtual address (instruction fetch, load, store)
TLB is checked for a cached translation
If TLB hit: physical address is immediately available
If TLB miss: page table walker traverses the page table hierarchy
Translation is loaded into TLB for future use
Physical address is sent to memory controller
Data returns from memory
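
The steps above can be sketched as a small simulation. This is a toy model under loud assumptions: a fully associative TLB with LRU replacement, and a flat dict standing in for the page table walk. Real TLBs are set-associative hardware structures, and `ToyTLB` and the sample mappings are invented for illustration.

```python
from collections import OrderedDict

PAGE_SHIFT = 12  # 4 KB pages


class ToyTLB:
    """Tiny fully associative TLB with LRU replacement (a simplification)."""

    def __init__(self, capacity: int, page_table: dict):
        self.capacity = capacity
        self.page_table = page_table  # dict VPN -> frame; stands in for the walk
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr: int) -> int:
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        if vpn in self.entries:              # TLB hit: translation is cached
            self.hits += 1
            self.entries.move_to_end(vpn)
            frame = self.entries[vpn]
        else:                                # TLB miss: "walk" the table, then fill
            self.misses += 1
            frame = self.page_table[vpn]
            self.entries[vpn] = frame
            if len(self.entries) > self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
        return (frame << PAGE_SHIFT) | offset


page_table = {vpn: vpn + 0x100 for vpn in range(8)}
tlb = ToyTLB(capacity=2, page_table=page_table)
for vaddr in (0x0000, 0x0004, 0x1000, 0x0008, 0x2000, 0x1004):
    tlb.translate(vaddr)
print(f"hits={tlb.hits} misses={tlb.misses}")  # hits=2 misses=4
```

With only two entries, touching a third page evicts an older translation, which is why the final access to page 1 misses again.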

Why hardware support is essential:

Software-only translation would be impossibly slow. Consider that a CPU might execute billions of instructions per second, each potentially accessing memory multiple times. The translation overhead must be near-zero for typical cases, which requires dedicated silicon.

The TLB's crucial role:

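The Translation Lookaside Buffer is perhaps the most performance-critical cache in the entire system. A TLB miss forces a page table walk, which can cost anywhere from tens of cycles (when the page table entries are already in the data caches) to hundreds of cycles (when a 4-level walk has to go to DRAM), so high TLB hit rates are essential. Modern CPUs have: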

L1 TLB: Fastest, smallest (dozens of entries), split I-TLB and D-TLB
L2 TLB: Larger (hundreds to thousands of entries), unified
TLB for different page sizes (4 KB, 2 MB, 1 GB pages)
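
To see why the hit rate matters so much, the average translation cost can be estimated as a weighted sum of the hit and miss costs. The cycle counts below are assumptions chosen for illustration (1 cycle for a hit, 100 cycles for a miss), not measurements of any particular CPU, and `average_translation_cost` is a hypothetical helper.

```python
def average_translation_cost(hit_rate: float, hit_cycles: float, miss_cycles: float) -> float:
    """Expected cycles per translation for a given TLB hit rate."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


# Assumed costs: 1 cycle on a hit, 100 cycles on a miss.
for hit_rate in (0.90, 0.99, 0.999):
    cost = average_translation_cost(hit_rate, hit_cycles=1, miss_cycles=100)
    print(f"hit rate {hit_rate:.1%}: about {cost:.2f} cycles per translation")
```

Under these assumed numbers, moving from a 90% to a 99% hit rate cuts the average cost by roughly a factor of five.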

Context Switches Invalidate TLB Entries

When the OS switches between processes, it typically changes the page table base register (CR3 on x86). Without further support, this invalidates the old process's non-global TLB entries, causing TLB misses until the new process's working set is cached. Modern x86 CPUs support 'PCID' (Process-Context Identifiers) to tag TLB entries with an address-space identifier and avoid full flushes.

Summary: The Foundation of Virtual Memory

The virtual address space is the foundational abstraction that enables modern virtual memory systems. We've covered its essential aspects in depth:

Key Takeaways

•Virtual address space is a private abstraction — Each process sees its own independent, contiguous address space, isolated from all other processes.
•Address space has well-defined regions — Code, data, heap, stack, and kernel space each serve specific purposes with distinct permissions.
•Virtual addresses are structured — They contain page numbers and offsets, enabling efficient translation through hierarchical page tables.
•Address space size depends on architecture — 32-bit systems offer 4 GB, while 64-bit systems offer 128 TB or more of user space.
•Isolation provides security and stability — No process can accidentally or maliciously access another's memory through normal address computations.
•Hardware support is essential — The MMU, TLB, and page table walker enable address translation at speeds that keep up with modern CPUs.

What's next:

Now that we understand the virtual address space abstraction, the next page explores one of its most remarkable properties: virtual address spaces can be larger than physical memory. This capability is what transforms virtual memory from a mere protection mechanism into a fundamental enabler of modern computing.

Page Complete

You now understand the virtual address space—the fundamental abstraction that gives each process its own private, isolated view of memory. This concept is the cornerstone of all virtual memory techniques we'll explore throughout this module.

1 / 5

Loading learning content...

Operating SystemsVirtual Memory Concepts

Virtual Memory Concepts

LevelIntermediate

Duration60 mins

TopicVirtual Memory Concepts

1 / 5

Virtual Address Space

The Great Memory Illusion

What You Will Learn

Defining Virtual Address Space

Physical memory is shared among all processes
Physical memory may not be contiguous for any single process
Physical memory has a fixed, finite size
Other processes occupy portions of physical memory simultaneously

Virtual Address Space vs Physical Memory: Fundamental Differences
Characteristic	Virtual Address Space	Physical Memory
Ownership	Private to each process	Shared among all processes
Contiguity	Always appears contiguous	Fragmented per process
Size	Can exceed physical RAM	Fixed hardware limit
Starting Address	Begins at 0 (or near 0)	Shared address range
Visibility	Only own addresses visible	Contains all process data
Lifetime	Exists while process runs	Persists across processes

The Programmer's Mental Model

Anatomy of a Virtual Address Space

The canonical layout (from low addresses to high addresses):

The exact layout varies by operating system and CPU architecture, but the conceptual organization follows consistent principles across platforms.

Standard Virtual Address Space Layout (Low to High Addresses)
Region	Address Range (conceptual)	Purpose	Growth Direction
Text (Code)	Low addresses	Executable machine code	Fixed size
Data	After text	Initialized global/static variables	Fixed size
BSS	After data	Uninitialized global/static variables	Fixed size
Heap	After BSS	Dynamic allocations (malloc, new)	Grows upward ↑
[Unmapped Gap]	Middle range	Reserved for growth	—
Stack	Near top	Function call frames, local variables	Grows downward ↓
Kernel Space	Highest addresses	OS kernel (protected)	Fixed size

The heap-stack gap:

The large unmapped region between the heap and stack serves multiple purposes:

Growth accommodation — Both heap and stack can grow into this space without colliding
Security — Makes it harder to predict addresses for exploitation
Flexibility — Allows memory-mapped files to be placed in this region
ASLR support — Enables randomization of library load addresses

Why this layout matters:

The separation of code (text) from data isn't arbitrary—it enables crucial optimizations and security features:

Code sharing — Multiple instances of the same program can share the text segment
Write protection — Code pages can be marked read-only, preventing code injection
Execution prevention — Data pages can be marked non-executable (DEP/NX bit)
Copy-on-write — Forked processes initially share all segments until modification

The Kernel Lives in Every Address Space

Virtual Addresses in Practice

Let's examine how virtual addresses manifest in real programs. Understanding these concrete details bridges the gap between abstract concepts and practical debugging.

Examining a process's memory map:

On Linux systems, you can examine the virtual address space of any process by reading /proc/[pid]/maps. Here's an annotated example:

Process Memory Map Example (/proc/self/maps)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
# Sample output from cat /proc/self/maps (annotated)
 
# Text segment (code) - note the 'r-x' permissions (read, execute)
00400000-0040c000 r-xp 00000000 08:01 262146    /bin/cat
 
# Read-only data 
0060b000-0060c000 r--p 0000b000 08:01 262146    /bin/cat
 
# Initialized data (read-write)
0060c000-0060d000 rw-p 0000c000 08:01 262146    /bin/cat
 
# Heap - grows upward from here
0060d000-0062e000 rw-p 00000000 00:00 0         [heap]
 
# Memory-mapped libraries (shared)
7f8e4a000000-7f8e4a1bc000 r-xp 00000000 08:01 524291  /lib/x86_64-linux-gnu/libc-2.27.so
 
# Thread local storage, vdso, etc.
7f8e4a5c4000-7f8e4a5c8000 rw-p 00000000 00:00 0
 
# Stack - grows downward toward lower addresses  
7ffd5c4e0000-7ffd5c501000 rw-p 00000000 00:00 0     [stack]
 
# Kernel mappings (not accessible from user mode)
ffffffffff600000-ffffffffff601000 r-xp ...       [vsyscall]

Understanding the permission flags:

Each memory region has four permission characters:

Flag	Meaning	Security Implication
`r`	Readable	Can read data from this region
`w`	Writable	Can modify data in this region
`x`	Executable	CPU can execute instructions here
`p`/`s`	Private/Shared	Private (copy-on-write) or shared

The virtual address space is sparse:

Memory efficiency — Only used regions consume physical memory or page table entries
Security — Invalid access is immediately detected
Flexibility — Regions can be added or resized dynamically

Virtual Addresses Are Not Physical Addresses

Address Space Size and Addressing

The size of a virtual address space is determined by the CPU's addressing capability—specifically, the number of bits in a virtual address.

32-bit vs 64-bit addressing:

Virtual Address Space Sizes by Architecture
Architecture	Address Bits	Virtual Space Size	Practical Limit
IA-32 (x86)	32 bits	4 GB (2³²)	~3 GB user space
x86-64 (canonical)	48 bits*	256 TB (2⁴⁸)	~128 TB user space
x86-64 (full)	64 bits	16 EB (2⁶⁴)	Not currently used
ARM64	48 bits	256 TB (2⁴⁸)	Configurable

Why x86-64 uses only 48 bits:

Although 64-bit registers can hold 64-bit addresses, current x86-64 implementations use only 48 bits for virtual addresses. This isn't hardware limitation but practical design:

Page table size — Full 64-bit addressing would require enormous page tables
Physical constraints — No current system needs 16 exabytes of virtual space
Future expansion — The architecture allows future CPUs to extend to more bits

Canonical addresses:

In x86-64, valid addresses must be 'canonical'—bits 48-63 must all be copies of bit 47. This creates two valid ranges:

User space: 0x0000000000000000 to 0x00007fffffffffff (lower 128 TB)
Kernel space: 0xffff800000000000 to 0xffffffffffffffff (upper 128 TB)

The 'hole' in the middle (non-canonical addresses) is reserved for future expansion.

Canonical Address Visualization
Virtual Address Space Layout (x86-64)
═══════════════════════════════════════════════════════════════════
 
0xFFFFFFFFFFFFFFFF ┌─────────────────────────────────────────────┐
                   │                                             │
                   │            KERNEL SPACE                     │
                   │         (Upper 128 TB)                      │
                   │     0xFFFF800000000000 - 0xFFFFFFFFFFFFFFFF │
                   │                                             │
0xFFFF800000000000 ├─────────────────────────────────────────────┤
                   │                                             │
                   │          NON-CANONICAL HOLE                 │
                   │                                             │
                   │      (Invalid - reserved for future)        │
                   │                                             │
0x00007FFFFFFFFFFF ├─────────────────────────────────────────────┤
                   │                                             │
                   │             USER SPACE                      │
                   │          (Lower 128 TB)                     │
                   │     0x0000000000000000 - 0x00007FFFFFFFFFFF │
                   │                                             │
0x0000000000000000 └─────────────────────────────────────────────┘

Why 128 TB Is Still Enormous

Per-Process Address Space Independence

The implications of independence:

Benefits of Address Space Independence

•Security — A malicious or buggy process cannot directly corrupt another process's memory. There's no address it can compute that will access another process's data.
•Stability — A crashed process affects only itself. Its corrupted pointers and wild writes are contained within its own address space.
•Simplicity — Programmers don't need to coordinate memory usage between processes. Each process can use addresses 0x1000, 0x2000, etc., without conflict.
•Relocation — Programs don't need to know where they'll be loaded in physical memory. They're always loaded at the same virtual addresses.
•Flexibility — The OS can rearrange physical memory behind the scenes without processes knowing or caring.

Same virtual address, different physical locations:

The translation mechanism:

Context switch and address space:

Address Space Isolation Visualization
Process A's View          Process B's View           Physical Memory
════════════════          ════════════════           ═══════════════
    
0xFFFF... ┌───────┐       0xFFFF... ┌───────┐       ┌───────┐ 0x00000000
          │Kernel │                 │Kernel │       │ OS    │
          ├───────┤                 ├───────┤       ├───────┤ 0x00100000
          │ Stack │                 │ Stack │       │ P_A   │ ─── A's Code
          ├───────┤                 ├───────┤       │ code  │
          │       │                 │       │       ├───────┤ 0x00200000
          │       │                 │       │       │ P_B   │ ─── B's Stack
          ├───────┤                 ├───────┤       │ stack │
          │ Heap  │                 │ Heap  │       ├───────┤ 0x00300000
          ├───────┤                 ├───────┤       │ P_A   │ ─── A's Heap
          │ Data  │                 │ Data  │       │ heap  │
          ├───────┤                 ├───────┤       ├───────┤ 0x00400000
          │ Code  │                 │ Code  │       │ P_B   │ ─── B's Code
0x0000... └───────┘       0x0000... └───────┘       │ code  │
                                                    │ ...   │
                                                    └───────┘
 
Same virtual layout in both processes, but each region translates to a different physical page!
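To make the isolation concrete, here is a small sketch in which two processes use the same virtual address but reach different physical locations. The page tables are plain dictionaries, the virtual page numbers are made up for illustration, and the physical base addresses are borrowed from the diagram above (A's code at 0x00100000, B's code at 0x00400000). A real page table is a hardware-defined structure, not a dictionary.

```python
PAGE_SIZE = 4096  # 4 KB pages

# Hypothetical per-process page tables: virtual page number -> physical page base.
page_table_a = {0x08048: 0x00100000,   # A's code
                0x0A000: 0x00300000}   # A's heap
page_table_b = {0x08048: 0x00400000,   # B's code
                0xBFFFF: 0x00200000}   # B's stack


def translate(page_table, vaddr):
    """Translate a virtual address using one process's page table."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn not in page_table:
        # No mapping exists, so this process simply cannot name that memory.
        raise LookupError(f"page fault at {vaddr:#x}")
    return page_table[vpn] + offset


if __name__ == "__main__":
    vaddr = 0x08048010
    print(f"A: {vaddr:#x} -> {translate(page_table_a, vaddr):#x}")
    print(f"B: {vaddr:#x} -> {translate(page_table_b, vaddr):#x}")
```

Process A has no entry that leads to B's stack page at 0x00200000, so no virtual address A computes can reach it.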

Shared Memory Is Opt-In

Virtual Address Composition

The fundamental split: page number and offset

Every virtual address is divided into at least two parts:

Virtual Page Number (VPN) — Identifies which page of the virtual address space
Page Offset — Identifies the specific byte within that page

For a system with 4 KB pages (2¹² bytes):

Virtual Address Structure (4 KB pages)
Virtual Address (32-bit system with 4 KB pages):
─────────────────────────────────────────────────────
 
│ 31                           12 │ 11           0 │
├─────────────────────────────────┼────────────────┤
│          VPN (20 bits)          │ Offset (12 b)  │
└─────────────────────────────────┴────────────────┘
        ↓                                  ↓
   Identifies the              Identifies the byte
   specific page               within that page
   (2^20 = 1M pages)           (2^12 = 4096 bytes)
 
 
Example: Virtual Address 0x00403500
═══════════════════════════════════
 
Hex:    0x00403500
Binary: 0000 0000 0100 0000 0011 0101 0000 0000
 
Split at bit 12:
  VPN:    0x00403 (which page)
  Offset: 0x500   (position within page = byte 1280)
 
This address refers to byte 1280 of virtual page 0x403.
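The same split can be written as a shift and a mask. This is a minimal sketch assuming 4 KB pages; `split_address` is an illustrative name, not an OS or library function.

```python
PAGE_SHIFT = 12                       # 4 KB pages: 2**12 bytes
PAGE_MASK = (1 << PAGE_SHIFT) - 1     # 0xFFF, selects the low 12 bits


def split_address(vaddr):
    """Return (virtual page number, page offset) for a virtual address."""
    return vaddr >> PAGE_SHIFT, vaddr & PAGE_MASK


if __name__ == "__main__":
    vpn, offset = split_address(0x00403500)
    print(f"VPN={vpn:#x} offset={offset:#x} ({offset})")
    # prints: VPN=0x403 offset=0x500 (1280)
```

Because the page size is a power of two, no division is needed; the hardware effectively just routes different bit ranges to different places.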

Multi-level page tables add more fields:

Modern systems use hierarchical page tables to avoid having a single enormous page table. This means the VPN is further subdivided:

Virtual Address Fields by Architecture
Architecture	Page Size	Level 5	Level 4	Level 3	Level 2	Level 1	Offset
IA-32 (2-level)	4 KB	—	—	—	10 bits	10 bits	12 bits
x86-64 (4-level)	4 KB	—	9 bits	9 bits	9 bits	9 bits	12 bits
x86-64 (5-level)	4 KB	9 bits	9 bits	9 bits	9 bits	9 bits	12 bits
ARM64 (4-level)	4 KB	—	9 bits	9 bits	9 bits	9 bits	12 bits
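A quick arithmetic check ties these field widths to the size of the virtual address: the index widths plus the offset width give the number of translated address bits. The sketch below assumes the standard layouts, including the fact that 5-level paging on x86-64 adds a fifth 9-bit index; the `LAYOUTS` table and `address_bits` helper are illustrative.

```python
# (index field widths from top level to bottom level, offset width)
LAYOUTS = {
    "IA-32 (2-level)": ([10, 10], 12),
    "x86-64 (4-level)": ([9, 9, 9, 9], 12),
    "x86-64 (5-level)": ([9, 9, 9, 9, 9], 12),
    "ARM64 (4-level)": ([9, 9, 9, 9], 12),
}


def address_bits(index_widths, offset_bits):
    """Number of virtual address bits covered by the translation."""
    return sum(index_widths) + offset_bits


if __name__ == "__main__":
    for name, (indexes, offset) in LAYOUTS.items():
        bits = address_bits(indexes, offset)
        print(f"{name}: {bits} bits, {2**bits} bytes addressable")
```

A 9-bit index selects one of 512 entries, and 512 eight-byte entries fill exactly one 4 KB page, which is why 9 bits per level recurs in the 64-bit designs.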

x86-64 Virtual Address Structure (4-level paging)
x86-64 Virtual Address with 4-Level Paging:
══════════════════════════════════════════════
 
│ 63    48 │ 47   39 │ 38   30 │ 29   21 │ 20   12 │ 11    0 │
├──────────┼─────────┼─────────┼─────────┼─────────┼─────────┤
│ Sign Ext │  PML4   │  PDPT   │   PD    │   PT    │ Offset  │
│ (16 bits)│(9 bits) │(9 bits) │(9 bits) │(9 bits) │(12 bits)│
└──────────┴─────────┴─────────┴─────────┴─────────┴─────────┘
              ↓         ↓         ↓         ↓         ↓
           Page       Page       Page      Page      Byte
           Map       Directory  Directory  Table    within
           Level 4    Pointer   (Level2)  (Level1)   Page
 
Translation path:
CR3 → PML4[index] → PDPT[index] → PD[index] → PT[index] → Physical Page + Offset
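The field boundaries in the diagram translate directly into shifts and masks. This sketch only extracts the four table indexes and the offset; it does not model the page tables themselves, and `split_x86_64` is an illustrative name.

```python
INDEX_MASK = 0x1FF    # 9 bits per level
OFFSET_MASK = 0xFFF   # 12-bit page offset


def split_x86_64(vaddr):
    """Break a 48-bit x86-64 virtual address into its 4-level paging fields."""
    return {
        "pml4": (vaddr >> 39) & INDEX_MASK,    # bits 47..39
        "pdpt": (vaddr >> 30) & INDEX_MASK,    # bits 38..30
        "pd": (vaddr >> 21) & INDEX_MASK,      # bits 29..21
        "pt": (vaddr >> 12) & INDEX_MASK,      # bits 20..12
        "offset": vaddr & OFFSET_MASK,         # bits 11..0
    }


if __name__ == "__main__":
    for addr in (0x0000_0000_0040_3500, 0x0000_7FFF_FFFF_FFFF):
        print(f"{addr:#018x} -> {split_x86_64(addr)}")
```

For the highest user address the PML4 index is 255, not 511, because bit 47 is zero in the lower half; the upper 256 PML4 entries belong to the kernel half.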

The Offset Is Preserved

Virtual vs Logical vs Linear Addresses

The terminology around memory addresses can be confusing, as different architectures and textbooks use varying terms. Let's clarify the distinctions:

In most modern systems, these terms are effectively synonymous:

Virtual Address — The address a program uses; undergoes translation to physical
Logical Address — Often used interchangeably with virtual address
Linear Address — In x86 terminology, the address after segmentation but before paging

The x86 historical context:

In legacy x86 systems with segmentation, there was a distinction:

Logical Address = Segment:Offset (what the program specifies)
Linear Address = After segment translation (segment base + offset)
Physical Address = After page translation (what appears on memory bus)

In modern x86-64 long mode, segmentation is essentially disabled (flat memory model), so logical ≈ linear ≈ virtual.

Address Translation Chain (Historical x86 vs Modern)
Historical x86 Protected Mode (Segmentation + Paging):
═══════════════════════════════════════════════════════
 
Logical Address (selector:offset)
        │
        ▼ Segment Translation
        │  (add segment base from descriptor)
        │
Linear Address (32 bits)
        │
        ▼ Page Translation
        │  (page table walk)
        │
Physical Address
 
 
Modern x86-64 Long Mode (Flat Model):
═════════════════════════════════════
 
Virtual Address (64 bits, 48 used)
        │
        │ Segmentation: FS/GS only (for TLS)
        │ All other segments have base 0, limit max
        │
Linear Address ≈ Virtual Address
        │
        ▼ Page Translation (4-level page tables)
        │
Physical Address
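The segmentation step in the first chain is simple arithmetic, which also shows why it disappears under a flat model. The sketch below is a deliberate simplification of 32-bit protected mode: it treats the segment as a (base, limit) pair, ignores descriptor lookup, privilege checks, and expand-down segments, and uses the hypothetical name `logical_to_linear`.

```python
def logical_to_linear(segment_base, offset, limit):
    """Segment translation: linear address = segment base + offset.

    `limit` is the highest valid offset within the segment (simplified).
    """
    if offset > limit:
        # Real hardware raises a protection fault here.
        raise ValueError("offset beyond segment limit")
    return (segment_base + offset) & 0xFFFFFFFF   # 32-bit linear address


if __name__ == "__main__":
    # Segmented: a nonzero base shifts every offset.
    print(hex(logical_to_linear(0x00010000, 0x2345, 0xFFFFF)))
    # Flat model: base 0 and maximum limit, so the offset passes through unchanged.
    print(hex(logical_to_linear(0x00000000, 0x00403500, 0xFFFFFFFF)))
```

With every base set to zero, the linear address equals the offset the program supplied, so only the paging step does real work.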

Terminology in This Course

Hardware Support for Virtual Address Spaces

Key hardware components:

Hardware Supporting Virtual Address Spaces
Component	Location	Function	Performance Impact
MMU	Inside CPU	Performs address translation	Every memory access
TLB	Inside CPU	Caches recent translations	Critical for performance
Page Table Base Register	CPU register (CR3)	Points to current page table	Changed on context switch
Page Table Walker	Inside MMU	Traverses page table hierarchy	On TLB miss
Page Fault Exception	Raised by the MMU	Signals an invalid or non-resident access	Traps to the OS kernel's page fault handler

The critical path—every memory access:

1. CPU generates a virtual address (instruction fetch, load, store)
2. TLB is checked for a cached translation
3. If TLB hit: physical address is immediately available
4. If TLB miss: page table walker traverses the page table hierarchy
5. Translation is loaded into TLB for future use
6. Physical address is sent to memory controller
7. Data returns from memory
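The steps above can be sketched as a toy model. This is an illustration under strong simplifying assumptions: the page table is a single flat dictionary rather than a hierarchy, the TLB has unlimited capacity and no eviction, and the `MMU` class with its `translate` method is a hypothetical name, not a real API.

```python
PAGE_SHIFT = 12
OFFSET_MASK = (1 << PAGE_SHIFT) - 1


class MMU:
    """Toy MMU: a TLB in front of a (flattened) page table."""

    def __init__(self, page_table):
        self.page_table = page_table   # vpn -> physical frame number
        self.tlb = {}                  # cached vpn -> frame translations
        self.hits = 0
        self.misses = 0

    def translate(self, vaddr):
        vpn, offset = vaddr >> PAGE_SHIFT, vaddr & OFFSET_MASK
        if vpn in self.tlb:                      # TLB hit: translation is ready
            self.hits += 1
            frame = self.tlb[vpn]
        else:                                    # TLB miss: consult the page table
            self.misses += 1
            if vpn not in self.page_table:
                raise LookupError(f"page fault at {vaddr:#x}")
            frame = self.page_table[vpn]
            self.tlb[vpn] = frame                # cache for future accesses
        return (frame << PAGE_SHIFT) | offset    # physical address sent to memory


if __name__ == "__main__":
    mmu = MMU({0x403: 0x1A2})
    print(hex(mmu.translate(0x00403500)))   # first access misses the TLB
    print(hex(mmu.translate(0x00403504)))   # same page, now a hit
    print("hits:", mmu.hits, "misses:", mmu.misses)
```

Even in this toy form, the second access to the same page skips the table lookup entirely, which is the effect the TLB exists to provide.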

Why hardware support is essential:

The TLB's crucial role:

L1 TLB: Fastest, smallest (dozens of entries), split I-TLB and D-TLB
L2 TLB: Larger (hundreds to thousands of entries), unified
Page-size-specific TLBs: separate entries or structures for 4 KB, 2 MB, and 1 GB pages

Context Switches Invalidate TLB Entries

Summary: The Foundation of Virtual Memory

The virtual address space is the foundational abstraction that enables modern virtual memory systems. We've covered its essential aspects in depth:

Key Takeaways

•Virtual address space is a private abstraction — Each process sees its own independent, contiguous address space, isolated from all other processes.
•Address space has well-defined regions — Code, data, heap, stack, and kernel space each serve specific purposes with distinct permissions.
•Virtual addresses are structured — They contain page numbers and offsets, enabling efficient translation through hierarchical page tables.
•Address space size depends on architecture — 32-bit systems offer 4 GB in total, while x86-64 with 48-bit addresses offers 128 TB of user space, and more with 5-level paging.
•Isolation provides security and stability — No process can accidentally or maliciously access another's memory through normal address computations.
•Hardware support is essential — The MMU, TLB, and page table walker enable address translation at speeds that keep up with modern CPUs.

What's next:
