Every process in Linux operates within the illusion of having exclusive access to an enormous, contiguous memory space. This illusion—the virtual address space—is one of the most elegant abstractions in operating system design. It enables process isolation, memory protection, efficient memory sharing, and forms the foundation upon which higher-level memory management facilities are built.
Understanding the virtual address space layout is essential for systems programmers, kernel developers, security researchers, and performance engineers. It explains phenomena ranging from why certain memory accesses cause segmentation faults, to how exploits like return-to-libc attacks work, to why 32-bit systems can't address more than 4GB of RAM without special configurations.
By the end of this page, you will understand: (1) the canonical virtual address space layout in Linux for both 32-bit and 64-bit architectures, (2) how user space and kernel space are divided, (3) the purpose and location of each memory region (text, data, BSS, heap, stack, memory-mapped regions), (4) security features like ASLR and their impact on layout, and (5) how to examine a process's address space using Linux tools.
When a process runs on Linux, it believes it has access to a massive, continuous block of memory—ranging from address 0 to some maximum address determined by the architecture. This belief is an illusion carefully maintained by the kernel and the CPU's Memory Management Unit (MMU).
Why create this illusion?
The virtual address space abstraction provides several critical benefits:
Process Isolation: Each process operates in its own address space, unable to access another process's memory directly. A bug in one process cannot corrupt another.
Simplified Programming Model: Programmers don't need to coordinate memory usage with other processes or know physical memory layouts. Every process can use the same virtual addresses.
Efficient Memory Use: Physical memory can be fragmented, but virtual memory appears contiguous. The OS maps virtual pages to physical frames wherever available.
Memory Protection: Different regions can have different permissions (read/write/execute), enforced by hardware.
Demand Paging: Only the portions of the address space actually in use need to reside in physical memory. The rest can live on disk.
A critical insight: virtual addresses bear no direct relationship to physical addresses. Two processes might both use virtual address 0x00400000 for their code, but the MMU maps each to completely different physical memory locations. This mapping is managed through page tables, which we'll explore in the next page.
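To see this decoupling in action, here is a minimal sketch (our own illustration, not from the text above): after fork(), parent and child print the same virtual address for a global variable, yet the child's write is invisible to the parent, because the MMU backs that shared virtual address with different physical pages once copy-on-write kicks in.

```c
/* Same virtual address, different physical pages (copy-on-write demo). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int shared_value = 1;                      /* lives in .data of both processes */

int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }

    if (pid == 0) {                        /* child */
        shared_value = 99;                 /* write triggers copy-on-write */
        printf("child : &shared_value=%p value=%d\n",
               (void *)&shared_value, shared_value);
        return EXIT_SUCCESS;
    }

    wait(NULL);                            /* parent: let the child finish first */
    printf("parent: &shared_value=%p value=%d\n",
           (void *)&shared_value, shared_value);   /* still 1 */
    return EXIT_SUCCESS;
}
```

Both lines print the same virtual address but different values: the address is per-process, the physical backing is not.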
The size of the virtual address space is determined by the CPU architecture—specifically, by the number of bits used for addressing.
32-bit Architectures (x86, ARM32)
With 32 address bits, the theoretical maximum address space is 2³² = 4 GB. However, this 4 GB must be shared between user space and kernel space. The traditional Linux split gives the lower 3 GB (0x00000000-0xBFFFFFFF) to user space and the upper 1 GB (0xC0000000-0xFFFFFFFF) to the kernel.
This 3 GB/1 GB split is the default, though alternative splits such as 2 GB/2 GB or even 1 GB/3 GB can be configured for specific use cases (a larger kernel portion lets more physical memory be direct-mapped, reducing reliance on HIGHMEM).
64-bit Architectures (x86_64, ARM64)
With 64 address bits, the theoretical maximum is 2⁶⁴ = 16 exabytes—far more than any existing or foreseeable system could use. In practice, current x86_64 processors implement only 48-bit virtual addressing (256 TB), with recent extensions supporting 57-bit addressing (128 PB).
The 48-bit address space (256 TB) is split evenly: the lower 128 TB (0x0000000000000000-0x00007FFFFFFFFFFF) belongs to user space and the upper 128 TB (0xFFFF800000000000-0xFFFFFFFFFFFFFFFF) to the kernel, as summarized below:
| Architecture | Address Bits | Total Size | User Space | Kernel Space |
|---|---|---|---|---|
| x86 (32-bit) | 32 | 4 GB | 3 GB | 1 GB |
| x86 with PAE | 32 virtual / 36 physical | 4 GB virtual | 3 GB | 1 GB |
| x86_64 (48-bit) | 48 | 256 TB | 128 TB | 128 TB |
| x86_64 (57-bit LA57) | 57 | 128 PB | 64 PB | 64 PB |
| ARM64 | 48 or 52 | 256 TB or 4 PB | Half | Half |
On x86_64, addresses between user and kernel space are "non-canonical"—any access to them triggers a general protection fault. This creates a natural gap that helps detect pointer corruption and provides some separation between user and kernel regions.
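"Canonical" here means that bits 63 through 48 must all be copies of bit 47, i.e. the address is sign-extended from its 48th bit. The helper below is an illustrative user-space check of that rule for 48-bit addressing; the function name is our own, not a kernel or libc API.

```c
/* Illustrative x86_64 48-bit canonical-address check (not a kernel API). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool is_canonical_48(uint64_t addr) {
    /* Sign-extend bits 0..47 to 64 bits and compare with the original. */
    int64_t extended = ((int64_t)(addr << 16)) >> 16;
    return (uint64_t)extended == addr;
}

int main(void) {
    uint64_t samples[] = {
        0x00007fffffffe000ULL,   /* near the top of user space: canonical */
        0xffff888000000000ULL,   /* start of the kernel direct map: canonical */
        0x0000800000000000ULL,   /* just above user space: NOT canonical */
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        printf("%#018llx -> %s\n", (unsigned long long)samples[i],
               is_canonical_48(samples[i]) ? "canonical" : "non-canonical");
    return 0;
}
```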
Within a process's user-space portion of the virtual address space, memory is organized into several distinct regions, each serving a specific purpose. Understanding this layout is fundamental to understanding how programs execute, how debuggers work, and how exploits are constructed.
The Canonical Regions (Low to High Address):
NULL Guard Page (address 0): The very bottom of the address space is intentionally unmapped. Any attempt to dereference a NULL pointer causes a segmentation fault rather than reading garbage (a short demonstration follows this region walkthrough).
Text Segment (.text): Contains the executable machine code of the program. This region is typically marked read-only and executable (r-x).
Read-Only Data (.rodata): Contains constant data—string literals, const variables. Marked read-only (r--).
Initialized Data (.data): Contains global and static variables that have explicit initial values. Marked read-write (rw-).
Uninitialized Data (.bss): Contains global and static variables without explicit initializers (or initialized to zero). The kernel zeroes this region but doesn't store it in the executable—saving disk space. Marked read-write (rw-).
Heap: The dynamically allocated memory region. Grows upward (toward higher addresses) as programs call malloc(), new, or brk()/sbrk(). Marked read-write (rw-).
Memory-Mapped Regions: Located between heap and stack, this area contains memory-mapped files, shared libraries, anonymous mappings, and mmap() allocations. These regions are created and destroyed dynamically.
Stack: The primary stack for the process, growing downward (toward lower addresses). Contains function call frames, local variables, and return addresses. Marked read-write (rw-) and, with modern toolchains, non-executable by default (no-exec stack).
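As promised above, a small sketch of the NULL guard page in action: it installs a SIGSEGV handler and then dereferences a NULL pointer, assuming a typical glibc/Linux environment. The handler is our own simplification (printf from a signal handler is not strictly async-signal-safe); the point is that the fault address reported by the kernel is 0, the unmapped page.

```c
/* Demonstrating the NULL guard page: dereferencing NULL raises SIGSEGV. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    /* printf is not async-signal-safe; acceptable only for a demo. */
    printf("SIGSEGV at address %p (the unmapped NULL guard page)\n", info->si_addr);
    _exit(EXIT_FAILURE);
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *null_ptr = NULL;
    int value = *null_ptr;          /* faults: page 0 is intentionally unmapped */
    printf("never reached: %d\n", value);
    return 0;
}
```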
Why This Layout?
The layout is not arbitrary—it reflects decades of engineering tradeoffs:
NULL at bottom: Catches the most common programming error (NULL pointer dereference) at the hardware level.
Text before data: Code and constants are read-only and can be shared across processes running the same executable.
Heap grows up, stack grows down: These are the two unbounded-growth regions. By growing toward each other from opposite ends, they can each use as much space as available without a fixed boundary (see the sketch after this list).
Libraries in the middle: Shared libraries are loaded via mmap() into the gap between heap and stack, where they don't interfere with either.
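The opposing growth directions are easy to observe. The sketch below (our own, assuming glibc's malloc serving small requests from the brk heap and a downward-growing stack, compiled without optimization so recursive frames are not collapsed) prints successive heap allocations and the addresses of nested stack frames; heap addresses trend upward, stack addresses downward, and the exact values vary run to run under ASLR.

```c
/* Observing heap-up / stack-down growth. Compile with -O0. */
#include <stdio.h>
#include <stdlib.h>

static void descend(int depth) {
    int frame_marker;                          /* lives in this call's stack frame */
    printf("stack frame at depth %d: %p\n", depth, (void *)&frame_marker);
    if (depth < 3)
        descend(depth + 1);                    /* deeper call -> lower address */
}

int main(void) {
    for (int i = 0; i < 4; i++) {
        void *p = malloc(32);                  /* small chunks come from the brk heap */
        printf("heap allocation %d: %p\n", i, p);
    }
    descend(0);                                /* stack addresses decrease going deeper */
    return 0;
}
```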
Linux provides several interfaces to examine a process's virtual address space. Understanding these tools is essential for debugging, performance analysis, and security research.
/proc/[pid]/maps — Text file listing all memory mappings for a process, with addresses, permissions, and backing files.
/proc/[pid]/smaps — Extended version with detailed memory statistics (RSS, PSS, swap usage) per mapping.
pmap — Command-line tool that presents /proc/[pid]/maps information in a readable format.
readelf — Shows segment information from ELF executables, revealing intended load addresses.
gdb — The GNU Debugger can examine memory regions with info proc mappings.
```bash
# View memory mappings for current process
cat /proc/self/maps

# Example output (simplified):
# 00400000-00452000 r-xp 00000000 08:01 123456   /bin/bash        [.text]
# 00651000-00652000 r--p 00051000 08:01 123456   /bin/bash        [.rodata]
# 00652000-0065b000 rw-p 00052000 08:01 123456   /bin/bash        [.data/.bss]
# 00e2a000-00f1f000 rw-p 00000000 00:00 0        [heap]
# 7f7b2c000000-7f7b2c1e4000 r-xp 00000000 08:01 789012  /lib/libc.so.6
# ...
# 7ffc3bff3000-7ffc3c014000 rw-p 00000000 00:00 0        [stack]
# 7ffc3c1fe000-7ffc3c200000 r--p 00000000 00:00 0        [vvar]
# 7ffc3c200000-7ffc3c202000 r-xp 00000000 00:00 0        [vdso]
# ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

# View detailed memory statistics
cat /proc/self/smaps | head -50

# Use pmap for a cleaner view
pmap -x $$

# Column meanings in /proc/[pid]/maps:
# address            permissions offset   device inode  pathname
# 00400000-00452000  r-xp        00000000 08:01  123456 /bin/bash
#
# Permissions: r=read, w=write, x=execute, p=private, s=shared
```
Reading the Maps File:
Each line in /proc/[pid]/maps describes a Virtual Memory Area (VMA). The format is:
address_start-address_end permissions offset device inode pathname
Anonymous mappings have no backing pathname; instead, special regions are labeled with pseudo-names such as [heap], [stack], and [vdso].
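Because the format is line oriented, it is straightforward to parse programmatically. The sketch below reads /proc/self/maps with sscanf and prints each mapping's permissions and size; the field widths and the handling of the optional pathname are our own simplifications, not a library API.

```c
/* Minimal /proc/self/maps parser (illustrative simplification). */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) {
        perror("fopen");
        return 1;
    }

    char line[512];
    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end, offset, inode;
        char perms[5], dev[12];
        char path[256] = "";

        /* address_start-address_end perms offset device inode [pathname] */
        int n = sscanf(line, "%lx-%lx %4s %lx %11s %lu %255s",
                       &start, &end, perms, &offset, dev, &inode, path);
        if (n < 6)
            continue;                               /* skip malformed lines */

        printf("%-30s %c%c%c %8lu KiB\n",
               path[0] ? path : "[anonymous]",
               perms[0], perms[1], perms[2],
               (end - start) / 1024);
    }
    fclose(maps);
    return 0;
}
```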
```c
/* Program to demonstrate address space layout */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Global initialized variable -> .data section */
int global_initialized = 42;

/* Global uninitialized variable -> .bss section */
int global_uninitialized;

/* Constant string -> .rodata section */
const char* readonly_string = "This is in .rodata";

void show_addresses() {
    /* Local variable -> stack */
    int stack_var = 100;

    /* Dynamic allocation -> heap */
    char* heap_mem = malloc(1024);

    /* Memory mapped region */
    void* mmap_region = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    printf("Address Space Layout Analysis\n");
    printf("========================================\n\n");

    printf("Code Section (.text):\n");
    printf("  show_addresses(): %p\n\n", (void*)show_addresses);

    printf("Read-Only Data (.rodata):\n");
    printf("  readonly_string: %p\n\n", (void*)readonly_string);

    printf("Initialized Data (.data):\n");
    printf("  global_initialized: %p\n\n", (void*)&global_initialized);

    printf("Uninitialized Data (.bss):\n");
    printf("  global_uninitialized: %p\n\n", (void*)&global_uninitialized);

    printf("Heap:\n");
    printf("  malloc'd memory: %p\n\n", (void*)heap_mem);

    printf("Memory-Mapped Region:\n");
    printf("  mmap'd region: %p\n\n", mmap_region);

    printf("Stack:\n");
    printf("  stack_var: %p\n\n", (void*)&stack_var);

    printf("\nProcess ID: %d\n", getpid());
    printf("Check /proc/%d/maps for detailed layout\n", getpid());

    /* Pause to allow inspection */
    printf("\nPress Enter to continue...\n");
    getchar();

    free(heap_mem);
    munmap(mmap_region, 4096);
}

int main() {
    show_addresses();
    return 0;
}
```
The division between kernel space and user space is one of the most critical security boundaries in the operating system. Understanding how this boundary works—and the ongoing efforts to strengthen it—is essential for both kernel developers and security researchers.
The Fundamental Division:
In every process's virtual address space, the upper portion is reserved for the kernel. This region contains the kernel's own code and data, the direct mapping of physical memory, the vmalloc area, and loadable kernel modules (detailed in the table below).
Why Include Kernel Space in Every Process?
When a process makes a system call, execution transitions from user mode to kernel mode. Having the kernel already mapped into the process's address space makes this transition efficient—no page table switch is required. The MMU simply changes its privilege level, and the kernel code is immediately accessible.
| Region | Address Range | Size | Purpose |
|---|---|---|---|
| Direct Map | 0xFFFF888000000000 - 0xFFFFC87FFFFFFFFF | 64 TB | Linear mapping of all physical memory |
| vmalloc/ioremap | 0xFFFFC90000000000 - 0xFFFFE8FFFFFFFFFF | 32 TB | Virtually contiguous allocations |
| Virtual Memory Map | 0xFFFFEA0000000000 - 0xFFFFEB0000000000 | 1 TB | struct page array |
| Kernel Text | 0xFFFFFFFF80000000 - 0xFFFFFFFF9FFFFFFF | 512 MB | Kernel code |
| Modules | 0xFFFFFFFFA0000000 - 0xFFFFFFFFFEFFFFFF | 1.5 GB | Loadable kernel modules |
Modern Linux kernels implement KPTI (formerly KAISER) in response to the Meltdown vulnerability. With KPTI enabled, user-mode page tables contain only a minimal kernel mapping—just enough to handle system call entry. The full kernel is mapped only in kernel-mode page tables. This adds overhead but prevents Meltdown-style attacks from reading kernel memory.
Protection Mechanisms:
Several mechanisms enforce the kernel-user boundary:
CPU Privilege Rings: The x86 architecture has 4 privilege levels (rings 0-3). Linux uses ring 0 for kernel and ring 3 for user space. Memory accesses check the current privilege level against page table permissions.
Supervisor Mode Access Prevention (SMAP): Prevents the kernel from accidentally reading/writing user-space memory. This catches a class of kernel vulnerabilities where attackers trick the kernel into using user-controlled pointers.
Supervisor Mode Execution Prevention (SMEP): Prevents the kernel from executing code in user-space pages. Defeats classic exploits that redirect kernel execution to attacker-controlled user-space code.
NX Bit (No-Execute): Individual pages can be marked non-executable. Combined with SMEP, this significantly raises the bar for code injection attacks.
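The page-permission machinery behind these protections is visible from user space through mprotect(). The sketch below (our own minimal illustration, not a hardening recipe) maps an anonymous read-write page, deliberately without PROT_EXEC, then drops write permission; any write after that point would raise SIGSEGV, and /proc/self/maps reflects the change.

```c
/* Changing page permissions with mprotect(). */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,   /* note: no PROT_EXEC */
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    strcpy(p, "writable for now");                       /* allowed: page is rw- */
    printf("wrote \"%s\" at %p\n", p, (void *)p);

    if (mprotect(p, page, PROT_READ) != 0) {             /* page becomes r-- */
        perror("mprotect");
        return 1;
    }
    printf("page is now read-only; a write here would raise SIGSEGV\n");
    /* p[0] = 'X'; */  /* uncommenting this line crashes the process */

    munmap(p, page);
    return 0;
}
```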
Address Space Layout Randomization (ASLR) is a security technique that randomizes the base addresses of key memory regions each time a program runs. This makes exploitation significantly harder by removing the attacker's ability to predict where code and data will be located.
What ASLR Randomizes:
Levels of ASLR in Linux:
Linux provides three ASLR modes, controlled by /proc/sys/kernel/randomize_va_space:
Most distributions run with level 2.
```bash
# Check current ASLR setting
cat /proc/sys/kernel/randomize_va_space
# Output: 2 (full randomization)

# Observe randomization across runs
for i in {1..5}; do
    cat /proc/self/maps | grep '\[stack\]'
done
# Output shows different addresses each time:
# 7fff8b332000-7fff8b353000 rw-p 00000000 00:00 0   [stack]
# 7ffc4a1d7000-7ffc4a1f8000 rw-p 00000000 00:00 0   [stack]
# 7ffd2e3bc000-7ffd2e3dd000 rw-p 00000000 00:00 0   [stack]
# ...

# Show library base address changes
ldd /bin/ls 2>/dev/null | grep libc
ldd /bin/ls 2>/dev/null | grep libc
# Addresses differ between invocations

# Temporarily disable ASLR for a single process
setarch $(uname -m) -R /bin/bash -c 'cat /proc/self/maps | grep stack'
setarch $(uname -m) -R /bin/bash -c 'cat /proc/self/maps | grep stack'
# Same addresses now

# Check if a binary is compiled with PIE (Position Independent Executable)
file /bin/ls
# Should show "pie executable" for full ASLR protection

# Check with readelf
readelf -h /bin/ls | grep Type
# Output: Type: DYN (Shared object file)   <- PIE enabled
# vs:     Type: EXEC (Executable file)     <- No PIE
```
The effectiveness of ASLR depends on entropy—how many bits of randomization are applied. On 64-bit systems, there's ample address space for high entropy. On 32-bit systems, limited address space means lower entropy, making brute-force attacks feasible. This is one reason 64-bit systems are more secure.
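A process can also opt out of ASLR for what it execs next, which is essentially what setarch -R does under the hood: set the ADDR_NO_RANDOMIZE persona with personality() and then exec. The sketch below assumes a glibc system with <sys/personality.h>; run it twice and the printed mappings no longer change between runs.

```c
/* Disabling ASLR for the next exec'd image via personality(). */
#include <stdio.h>
#include <sys/personality.h>
#include <unistd.h>

int main(void) {
    unsigned long persona = personality(0xffffffff);       /* read current persona */
    if (personality(persona | ADDR_NO_RANDOMIZE) == -1) {
        perror("personality");
        return 1;
    }

    /* The already-running image keeps its randomized layout; the new flag
     * applies to whatever this process execs next. */
    execlp("cat", "cat", "/proc/self/maps", (char *)NULL);
    perror("execlp");                                       /* only reached on failure */
    return 1;
}
```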
ASLR Bypass Techniques:
While ASLR is a powerful defense, it's not absolute. Researchers have developed various bypass techniques:
Information Leaks: If an attacker can leak a single address (through format string vulnerabilities, side channels, etc.), they can calculate the base address of an entire region.
Heap Spraying: Flooding the heap with controlled data increases the probability that a guess lands in attacker-controlled memory.
JIT (Just-In-Time) Spraying: In browsers, JavaScript JIT compilers can be coerced into placing predictable gadgets in executable memory.
Side Channels: Timing attacks, cache attacks, and speculative execution can leak address information.
Defense in depth is essential—ASLR should be combined with stack canaries, DEP/NX, and other protections.
Beyond the standard text/data/heap/stack regions, Linux maps several special regions into every process's address space for performance and functionality reasons.
The most important of these is the vDSO (virtual Dynamic Shared Object), a small kernel-provided library mapped into every process that implements selected system calls entirely in user space (most notably gettimeofday() and clock_gettime()), alongside the [vvar] data page it reads from and the legacy [vsyscall] region visible in the maps output earlier. By calling these in user space, programs avoid the overhead of a full system call for time-sensitive operations.
```c
/* Demonstrating vDSO performance benefits */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <unistd.h>

#define ITERATIONS 10000000

int main() {
    struct timespec ts;
    struct timeval tv;
    clock_t start, end;

    /* gettimeofday() - uses vDSO, extremely fast */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        gettimeofday(&tv, NULL);
    }
    end = clock();
    printf("gettimeofday (vDSO):   %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* clock_gettime() - also uses vDSO */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        clock_gettime(CLOCK_REALTIME, &ts);
    }
    end = clock();
    printf("clock_gettime (vDSO):  %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* Compare with an actual syscall - getpid always goes to kernel */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        getpid();
    }
    end = clock();
    printf("getpid (real syscall): %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* Note: getpid() is actually cached by glibc, so the difference
     * may be less dramatic. Use syscall(SYS_getpid) for a true syscall. */

    return 0;
}

/* Typical output:
 * gettimeofday (vDSO):   0.150 seconds
 * clock_gettime (vDSO):  0.170 seconds
 * getpid (real syscall): 0.450 seconds
 *
 * vDSO calls are ~3x faster than real syscalls!
 */
```
The vDSO is compiled into the kernel and exposed to user space as if it were a shared library. You can extract it from a running process and disassemble it: dd if=/proc/self/mem bs=1 skip=$((0x7fff...)) count=8192 2>/dev/null | objdump -d -. This reveals the actual vDSO implementation for your kernel version.
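Another way to locate the vDSO, without parsing /proc/self/maps, is the auxiliary vector: the kernel passes the vDSO's load address as AT_SYSINFO_EHDR, which glibc exposes through getauxval(). A short sketch:

```c
/* Finding the vDSO base address via the auxiliary vector. */
#include <elf.h>
#include <stdio.h>
#include <sys/auxv.h>

int main(void) {
    unsigned long vdso = getauxval(AT_SYSINFO_EHDR);
    if (vdso == 0) {
        fprintf(stderr, "no vDSO reported in the auxiliary vector\n");
        return 1;
    }

    const unsigned char *ehdr = (const unsigned char *)vdso;
    printf("vDSO mapped at %p\n", (void *)vdso);
    printf("first bytes: %02x %c%c%c (an ELF header)\n",
           ehdr[0], ehdr[1], ehdr[2], ehdr[3]);   /* 7f 'E' 'L' 'F' */
    return 0;
}
```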
The kernel maintains sophisticated data structures to track each process's virtual address space. Understanding these structures is essential for kernel development and provides insight into how the mapping abstraction is implemented.
The mm_struct Structure:
Every process with a virtual address space (i.e., all user-space processes) has an associated mm_struct. This structure is the master record for the process's memory state.
```c
/* Simplified view of mm_struct (from include/linux/mm_types.h) */
struct mm_struct {
    /* VMA list and tree for fast lookup */
    struct vm_area_struct *mmap;        /* List of VMAs */
    struct rb_root mm_rb;               /* Red-black tree root */

    /* Page table pointer */
    pgd_t *pgd;                         /* Page Global Directory */

    /* Reference counting */
    atomic_t mm_users;                  /* How many users (threads) */
    atomic_t mm_count;                  /* Reference count */

    /* Memory statistics */
    atomic_long_t nr_ptes;              /* Page table pages */
    atomic_long_t nr_pmds;              /* PMD pages */
    unsigned long total_vm;             /* Total mapped pages */
    unsigned long locked_vm;            /* Locked (non-swappable) pages */
    unsigned long data_vm;              /* Data segment pages */
    unsigned long exec_vm;              /* Executable pages */
    unsigned long stack_vm;             /* Stack pages */

    /* Key address boundaries */
    unsigned long start_code, end_code; /* Code segment bounds */
    unsigned long start_data, end_data; /* Data segment bounds */
    unsigned long start_brk, brk;       /* Heap bounds */
    unsigned long start_stack;          /* Stack start */
    unsigned long arg_start, arg_end;   /* Command line arguments */
    unsigned long env_start, env_end;   /* Environment variables */

    /* mmap state */
    unsigned long mmap_base;            /* Base for mmap allocations */
    unsigned long task_size;            /* Size of user space */
    unsigned long highest_vm_end;       /* Highest VMA end address */

    /* Flags */
    unsigned long flags;                /* Memory flags */

    /* Lock for VMA operations */
    struct rw_semaphore mmap_lock;

    /* ... many more fields ... */
};
```
Virtual Memory Areas (VMAs):
Each contiguous region of the virtual address space is represented by a vm_area_struct. VMAs are organized in two data structures for efficient access: a linked list (the mmap field), kept sorted by address for sequential traversal, and a red-black tree (mm_rb) for fast lookup of the VMA covering a given address.
When a page fault occurs, the kernel uses the red-black tree to quickly find the VMA containing the faulting address.
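As a rough illustration of that lookup, the user-space sketch below mimics a find_vma-style search over a handful of hard-coded ranges. The real kernel code walks the red-black tree under mmap_lock; here a sorted array and binary search stand in for the tree, preserving the kernel convention that the returned VMA is the first one ending above the address, so the caller still checks vm_start.

```c
/* Simplified, user-space sketch of a find_vma-style lookup. */
#include <stdio.h>

struct vma {
    unsigned long start;   /* inclusive */
    unsigned long end;     /* exclusive */
};

/* Return the first VMA whose end is above addr, or NULL. */
static const struct vma *find_vma(const struct vma *vmas, int n,
                                  unsigned long addr) {
    const struct vma *result = NULL;
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (vmas[mid].end > addr) {
            result = &vmas[mid];        /* candidate; look for an earlier one */
            hi = mid - 1;
        } else {
            lo = mid + 1;
        }
    }
    return result;
}

int main(void) {
    const struct vma vmas[] = {
        { 0x400000, 0x452000 },         /* text */
        { 0x652000, 0x65b000 },         /* data */
        { 0xe2a000, 0xf1f000 },         /* heap */
    };
    unsigned long addr = 0x653123;
    const struct vma *v = find_vma(vmas, 3, addr);
    if (v && addr >= v->start)
        printf("%#lx falls in VMA [%#lx, %#lx)\n", addr, v->start, v->end);
    else
        printf("%#lx is not mapped (would fault)\n", addr);
    return 0;
}
```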
```c
/* Virtual Memory Area structure (simplified) */
struct vm_area_struct {
    /* VMA boundaries */
    unsigned long vm_start;             /* Start address (inclusive) */
    unsigned long vm_end;               /* End address (exclusive) */

    /* Linkage */
    struct vm_area_struct *vm_next;     /* Next VMA in list */
    struct vm_area_struct *vm_prev;     /* Previous VMA */
    struct rb_node vm_rb;               /* Node in red-black tree */

    /* The owning mm_struct */
    struct mm_struct *vm_mm;

    /* Access permissions */
    pgprot_t vm_page_prot;              /* Page protection flags */
    unsigned long vm_flags;             /* Flags: VM_READ, VM_WRITE, etc. */

    /* Backing storage */
    struct file *vm_file;               /* File for file-backed mapping */
    unsigned long vm_pgoff;             /* Offset in file (pages) */
    void *vm_private_data;              /* Private data */

    /* Operations for this VMA type */
    const struct vm_operations_struct *vm_ops;

    /* Anonymous memory management */
    struct anon_vma *anon_vma;          /* Anonymous VMA linkage */

    /* ... additional fields ... */
};

/* Common vm_flags values */
#define VM_READ      0x00000001  /* Pages can be read */
#define VM_WRITE     0x00000002  /* Pages can be written */
#define VM_EXEC      0x00000004  /* Pages can be executed */
#define VM_SHARED    0x00000008  /* Pages are shared */
#define VM_MAYREAD   0x00000010  /* VM_READ can be set */
#define VM_MAYWRITE  0x00000020  /* VM_WRITE can be set */
#define VM_MAYEXEC   0x00000040  /* VM_EXEC can be set */
#define VM_GROWSDOWN 0x00000100  /* Stack-like growth */
#define VM_LOCKED    0x00002000  /* Pages are locked in memory */
#define VM_IO        0x00004000  /* Memory-mapped I/O */
#define VM_DENYWRITE 0x00000800  /* ETXTBSY on write attempts */

/* VMA operations structure - defines how to handle page faults, etc. */
struct vm_operations_struct {
    void (*open)(struct vm_area_struct *area);
    void (*close)(struct vm_area_struct *area);
    vm_fault_t (*fault)(struct vm_fault *vmf);
    vm_fault_t (*huge_fault)(struct vm_fault *vmf, unsigned int order);
    /* ... */
};
```
The kernel aggressively merges adjacent VMAs with identical properties. When you mmap two adjacent regions with the same permissions and backing, the kernel may merge them into a single VMA to reduce memory overhead. This is why /proc/pid/maps may show fewer regions than expected.
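Splitting and merging can be observed from user space. The sketch below (our own, counting lines of /proc/self/maps as a crude proxy for the number of VMAs) maps three anonymous pages, changes the protection of the middle page so the kernel must split the VMA, then restores it so the pieces can merge back into one.

```c
/* Watching a VMA split and merge via mprotect(). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static void count_vmas(const char *label) {
    char cmd[64];
    snprintf(cmd, sizeof cmd, "wc -l < /proc/%d/maps", getpid());
    printf("%-40s", label);
    fflush(stdout);                       /* keep output ordered before system() */
    system(cmd);                          /* prints the number of map entries */
}

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    count_vmas("after mmap (one merged VMA):");
    mprotect(p + page, page, PROT_READ);              /* middle page rw- -> r-- */
    count_vmas("after mprotect (VMA split in three):");
    mprotect(p + page, page, PROT_READ | PROT_WRITE); /* restore rw- */
    count_vmas("after restore (merged again):");

    munmap(p, 3 * page);
    return 0;
}
```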
We have comprehensively examined how Linux organizes virtual memory for each process: the user/kernel split across architectures, the canonical region layout from the NULL guard page up to the stack, ASLR and the other protections layered on top, special mappings such as the vDSO, and the mm_struct and VMA structures the kernel uses to track it all.
What's Next:
In the next page, we'll dive deeper into how these virtual addresses are translated to physical addresses through page table management. We'll examine the multi-level page table architecture, TLB operation, and how Linux optimizes these critical data structures for performance and memory efficiency.
You now have a comprehensive understanding of Linux virtual address space layout—the foundation upon which all memory management is built. This knowledge is essential for systems programming, kernel development, and security analysis.