Every process in Linux operates within the illusion of having exclusive access to an enormous, contiguous memory space. This illusion—the virtual address space—is one of the most elegant abstractions in operating system design. It enables process isolation, memory protection, efficient memory sharing, and forms the foundation upon which higher-level memory management facilities are built.
Understanding the virtual address space layout is essential for systems programmers, kernel developers, security researchers, and performance engineers. It explains phenomena ranging from why certain memory accesses cause segmentation faults, to how exploits like return-to-libc attacks work, to why 32-bit systems can't address more than 4GB of RAM without special configurations.
By the end of this page, you will understand: (1) the canonical virtual address space layout in Linux for both 32-bit and 64-bit architectures, (2) how user space and kernel space are divided, (3) the purpose and location of each memory region (text, data, BSS, heap, stack, memory-mapped regions), (4) security features like ASLR and their impact on layout, and (5) how to examine a process's address space using Linux tools.
When a process runs on Linux, it believes it has access to a massive, continuous block of memory—ranging from address 0 to some maximum address determined by the architecture. This belief is an illusion carefully maintained by the kernel and the CPU's Memory Management Unit (MMU).
Why create this illusion?
The virtual address space abstraction provides several critical benefits:
Process Isolation: Each process operates in its own address space, unable to access another process's memory directly. A bug in one process cannot corrupt another.
Simplified Programming Model: Programmers don't need to coordinate memory usage with other processes or know physical memory layouts. Every process can use the same virtual addresses.
Efficient Memory Use: Physical memory can be fragmented, but virtual memory appears contiguous. The OS maps virtual pages to physical frames wherever available.
Memory Protection: Different regions can have different permissions (read/write/execute), enforced by hardware.
Demand Paging: Only the portions of the address space actually in use need to reside in physical memory. The rest can live on disk.
A critical insight: virtual addresses bear no direct relationship to physical addresses. Two processes might both use virtual address 0x00400000 for their code, but the MMU maps each to completely different physical memory locations. This mapping is managed through page tables, which we'll explore in the next page.
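To see this decoupling in action, here is a minimal sketch (our own illustration, not from the text above): after fork(), parent and child print the same virtual address for a global variable, yet the child's write is invisible to the parent, because the MMU backs that shared virtual address with different physical pages once copy-on-write kicks in.

```c
/* Same virtual address, different physical pages (copy-on-write demo). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int shared_value = 1;                      /* lives in .data of both processes */

int main(void) {
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return EXIT_FAILURE;
    }

    if (pid == 0) {                        /* child */
        shared_value = 99;                 /* write triggers copy-on-write */
        printf("child : &shared_value=%p value=%d\n",
               (void *)&shared_value, shared_value);
        return EXIT_SUCCESS;
    }

    wait(NULL);                            /* parent: let the child finish first */
    printf("parent: &shared_value=%p value=%d\n",
           (void *)&shared_value, shared_value);   /* still 1 */
    return EXIT_SUCCESS;
}
```

Both lines print the same virtual address but different values: the address is per-process, the physical backing is not.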
The size of the virtual address space is determined by the CPU architecture—specifically, by the number of bits used for addressing.
32-bit Architectures (x86, ARM32)
With 32 address bits, the theoretical maximum address space is 2³² = 4 GB. However, this 4 GB must be shared between user space and kernel space. The traditional Linux split gives the lower 3 GB (0x00000000-0xBFFFFFFF) to user space and the upper 1 GB (0xC0000000-0xFFFFFFFF) to the kernel.
This 3 GB/1 GB split is the default, though alternative splits such as 2 GB/2 GB or even 1 GB/3 GB can be configured for specific use cases (a larger kernel portion lets more physical memory be direct-mapped, reducing reliance on HIGHMEM).
64-bit Architectures (x86_64, ARM64)
With 64 address bits, the theoretical maximum is 2⁶⁴ = 16 exabytes—far more than any existing or foreseeable system could use. In practice, current x86_64 processors implement only 48-bit virtual addressing (256 TB), with recent extensions supporting 57-bit addressing (128 PB).
The 48-bit address space (256 TB) is split evenly: the lower 128 TB (0x0000000000000000-0x00007FFFFFFFFFFF) belongs to user space and the upper 128 TB (0xFFFF800000000000-0xFFFFFFFFFFFFFFFF) to the kernel, as summarized below:
| Architecture | Address Bits | Total Size | User Space | Kernel Space |
|---|---|---|---|---|
| x86 (32-bit) | 32 | 4 GB | 3 GB | 1 GB |
| x86 with PAE | 32 virtual / 36 physical | 4 GB virtual | 3 GB | 1 GB |
| x86_64 (48-bit) | 48 | 256 TB | 128 TB | 128 TB |
| x86_64 (57-bit LA57) | 57 | 128 PB | 64 PB | 64 PB |
| ARM64 | 48 or 52 | 256 TB or 4 PB | Half | Half |
On x86_64, addresses between user and kernel space are "non-canonical"—any access to them triggers a general protection fault. This creates a natural gap that helps detect pointer corruption and provides some separation between user and kernel regions.
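"Canonical" here means that bits 63 through 48 must all be copies of bit 47, i.e. the address is sign-extended from its 48th bit. The helper below is an illustrative user-space check of that rule for 48-bit addressing; the function name is our own, not a kernel or libc API.

```c
/* Illustrative x86_64 48-bit canonical-address check (not a kernel API). */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool is_canonical_48(uint64_t addr) {
    /* Sign-extend bits 0..47 to 64 bits and compare with the original. */
    int64_t extended = ((int64_t)(addr << 16)) >> 16;
    return (uint64_t)extended == addr;
}

int main(void) {
    uint64_t samples[] = {
        0x00007fffffffe000ULL,   /* near the top of user space: canonical */
        0xffff888000000000ULL,   /* start of the kernel direct map: canonical */
        0x0000800000000000ULL,   /* just above user space: NOT canonical */
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        printf("%#018llx -> %s\n", (unsigned long long)samples[i],
               is_canonical_48(samples[i]) ? "canonical" : "non-canonical");
    return 0;
}
```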
Within a process's user-space portion of the virtual address space, memory is organized into several distinct regions, each serving a specific purpose. Understanding this layout is fundamental to understanding how programs execute, how debuggers work, and how exploits are constructed.
The Canonical Regions (Low to High Address):
NULL Guard Page (address 0): The very bottom of the address space is intentionally unmapped. Any attempt to dereference a NULL pointer causes a segmentation fault rather than reading garbage (a short demonstration follows this region walkthrough).
Text Segment (.text): Contains the executable machine code of the program. This region is typically marked read-only and executable (r-x).
Read-Only Data (.rodata): Contains constant data—string literals, const variables. Marked read-only (r--).
Initialized Data (.data): Contains global and static variables that have explicit initial values. Marked read-write (rw-).
Uninitialized Data (.bss): Contains global and static variables without explicit initializers (or initialized to zero). The kernel zeroes this region but doesn't store it in the executable—saving disk space. Marked read-write (rw-).
Heap: The dynamically allocated memory region. Grows upward (toward higher addresses) as programs call malloc(), new, or brk()/sbrk(). Marked read-write (rw-).
Memory-Mapped Regions: Located between heap and stack, this area contains memory-mapped files, shared libraries, anonymous mappings, and mmap() allocations. These regions are created and destroyed dynamically.
Stack: The primary stack for the process, growing downward (toward lower addresses). Contains function call frames, local variables, and return addresses. Marked read-write (rw-) and, with modern toolchains, non-executable by default (no-exec stack).
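As promised above, a small sketch of the NULL guard page in action: it installs a SIGSEGV handler and then dereferences a NULL pointer, assuming a typical glibc/Linux environment. The handler is our own simplification (printf from a signal handler is not strictly async-signal-safe); the point is that the fault address reported by the kernel is 0, the unmapped page.

```c
/* Demonstrating the NULL guard page: dereferencing NULL raises SIGSEGV. */
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static void on_segv(int sig, siginfo_t *info, void *ctx) {
    (void)sig;
    (void)ctx;
    /* printf is not async-signal-safe; acceptable only for a demo. */
    printf("SIGSEGV at address %p (the unmapped NULL guard page)\n", info->si_addr);
    _exit(EXIT_FAILURE);
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_segv;
    sa.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &sa, NULL);

    volatile int *null_ptr = NULL;
    int value = *null_ptr;          /* faults: page 0 is intentionally unmapped */
    printf("never reached: %d\n", value);
    return 0;
}
```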
Why This Layout?
The layout is not arbitrary—it reflects decades of engineering tradeoffs:
NULL at bottom: Catches the most common programming error (NULL pointer dereference) at the hardware level.
Text before data: Code and constants are read-only and can be shared across processes running the same executable.
Heap grows up, stack grows down: These are the two unbounded-growth regions. By growing toward each other from opposite ends, they can each use as much space as available without a fixed boundary (see the sketch after this list).
Libraries in the middle: Shared libraries are loaded via mmap() into the gap between heap and stack, where they don't interfere with either.
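The opposing growth directions are easy to observe. The sketch below (our own, assuming glibc's malloc serving small requests from the brk heap and a downward-growing stack, compiled without optimization so recursive frames are not collapsed) prints successive heap allocations and the addresses of nested stack frames; heap addresses trend upward, stack addresses downward, and the exact values vary run to run under ASLR.

```c
/* Observing heap-up / stack-down growth. Compile with -O0. */
#include <stdio.h>
#include <stdlib.h>

static void descend(int depth) {
    int frame_marker;                          /* lives in this call's stack frame */
    printf("stack frame at depth %d: %p\n", depth, (void *)&frame_marker);
    if (depth < 3)
        descend(depth + 1);                    /* deeper call -> lower address */
}

int main(void) {
    for (int i = 0; i < 4; i++) {
        void *p = malloc(32);                  /* small chunks come from the brk heap */
        printf("heap allocation %d: %p\n", i, p);
    }
    descend(0);                                /* stack addresses decrease going deeper */
    return 0;
}
```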
Linux provides several interfaces to examine a process's virtual address space. Understanding these tools is essential for debugging, performance analysis, and security research.
/proc/[pid]/maps — Text file listing all memory mappings for a process, with addresses, permissions, and backing files.
/proc/[pid]/smaps — Extended version with detailed memory statistics (RSS, PSS, swap usage) per mapping.
pmap — Command-line tool that presents /proc/[pid]/maps information in a readable format.
readelf — Shows segment information from ELF executables, revealing intended load addresses.
gdb — The GNU Debugger can examine memory regions with info proc mappings.
```bash
# View memory mappings for current process
cat /proc/self/maps

# Example output (simplified):
# 00400000-00452000 r-xp 00000000 08:01 123456   /bin/bash        [.text]
# 00651000-00652000 r--p 00051000 08:01 123456   /bin/bash        [.rodata]
# 00652000-0065b000 rw-p 00052000 08:01 123456   /bin/bash        [.data/.bss]
# 00e2a000-00f1f000 rw-p 00000000 00:00 0        [heap]
# 7f7b2c000000-7f7b2c1e4000 r-xp 00000000 08:01 789012  /lib/libc.so.6
# ...
# 7ffc3bff3000-7ffc3c014000 rw-p 00000000 00:00 0        [stack]
# 7ffc3c1fe000-7ffc3c200000 r--p 00000000 00:00 0        [vvar]
# 7ffc3c200000-7ffc3c202000 r-xp 00000000 00:00 0        [vdso]
# ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

# View detailed memory statistics
cat /proc/self/smaps | head -50

# Use pmap for a cleaner view
pmap -x $$

# Column meanings in /proc/[pid]/maps:
# address            permissions offset   device inode  pathname
# 00400000-00452000  r-xp        00000000 08:01  123456 /bin/bash
#
# Permissions: r=read, w=write, x=execute, p=private, s=shared
```
Reading the Maps File:
Each line in /proc/[pid]/maps describes a Virtual Memory Area (VMA). The format is:
address_start-address_end permissions offset device inode pathname
Anonymous mappings have no backing pathname; instead, special regions are labeled with pseudo-names such as [heap], [stack], and [vdso].
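Because the format is line oriented, it is straightforward to parse programmatically. The sketch below reads /proc/self/maps with sscanf and prints each mapping's permissions and size; the field widths and the handling of the optional pathname are our own simplifications, not a library API.

```c
/* Minimal /proc/self/maps parser (illustrative simplification). */
#include <stdio.h>
#include <string.h>

int main(void) {
    FILE *maps = fopen("/proc/self/maps", "r");
    if (!maps) {
        perror("fopen");
        return 1;
    }

    char line[512];
    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end, offset, inode;
        char perms[5], dev[12];
        char path[256] = "";

        /* address_start-address_end perms offset device inode [pathname] */
        int n = sscanf(line, "%lx-%lx %4s %lx %11s %lu %255s",
                       &start, &end, perms, &offset, dev, &inode, path);
        if (n < 6)
            continue;                               /* skip malformed lines */

        printf("%-30s %c%c%c %8lu KiB\n",
               path[0] ? path : "[anonymous]",
               perms[0], perms[1], perms[2],
               (end - start) / 1024);
    }
    fclose(maps);
    return 0;
}
```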
```c
/* Program to demonstrate address space layout */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Global initialized variable -> .data section */
int global_initialized = 42;

/* Global uninitialized variable -> .bss section */
int global_uninitialized;

/* Constant string -> .rodata section */
const char* readonly_string = "This is in .rodata";

void show_addresses() {
    /* Local variable -> stack */
    int stack_var = 100;

    /* Dynamic allocation -> heap */
    char* heap_mem = malloc(1024);

    /* Memory mapped region */
    void* mmap_region = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    printf("Address Space Layout Analysis\n");
    printf("========================================\n\n");

    printf("Code Section (.text):\n");
    printf("  show_addresses(): %p\n\n", (void*)show_addresses);

    printf("Read-Only Data (.rodata):\n");
    printf("  readonly_string: %p\n\n", (void*)readonly_string);

    printf("Initialized Data (.data):\n");
    printf("  global_initialized: %p\n\n", (void*)&global_initialized);

    printf("Uninitialized Data (.bss):\n");
    printf("  global_uninitialized: %p\n\n", (void*)&global_uninitialized);

    printf("Heap:\n");
    printf("  malloc'd memory: %p\n\n", (void*)heap_mem);

    printf("Memory-Mapped Region:\n");
    printf("  mmap'd region: %p\n\n", mmap_region);

    printf("Stack:\n");
    printf("  stack_var: %p\n\n", (void*)&stack_var);

    printf("\nProcess ID: %d\n", getpid());
    printf("Check /proc/%d/maps for detailed layout\n", getpid());

    /* Pause to allow inspection */
    printf("\nPress Enter to continue...\n");
    getchar();

    free(heap_mem);
    munmap(mmap_region, 4096);
}

int main() {
    show_addresses();
    return 0;
}
```
The division between kernel space and user space is one of the most critical security boundaries in the operating system. Understanding how this boundary works—and the ongoing efforts to strengthen it—is essential for both kernel developers and security researchers.
The Fundamental Division:
In every process's virtual address space, the upper portion is reserved for the kernel. This region contains the kernel's own code and data, the direct mapping of physical memory, the vmalloc area, and loadable kernel modules (detailed in the table below).
Why Include Kernel Space in Every Process?
When a process makes a system call, execution transitions from user mode to kernel mode. Having the kernel already mapped into the process's address space makes this transition efficient—no page table switch is required. The MMU simply changes its privilege level, and the kernel code is immediately accessible.
| Region | Address Range | Size | Purpose |
|---|---|---|---|
| Direct Map | 0xFFFF888000000000 - 0xFFFFC87FFFFFFFFF | 64 TB | Linear mapping of all physical memory |
| vmalloc/ioremap | 0xFFFFC90000000000 - 0xFFFFE8FFFFFFFFFF | 32 TB | Virtually contiguous allocations |
| Virtual Memory Map | 0xFFFFEA0000000000 - 0xFFFFEB0000000000 | 1 TB | struct page array |
| Kernel Text | 0xFFFFFFFF80000000 - 0xFFFFFFFF9FFFFFFF | 512 MB | Kernel code |
| Modules | 0xFFFFFFFFA0000000 - 0xFFFFFFFFFEFFFFFF | 1.5 GB | Loadable kernel modules |
Modern Linux kernels implement KPTI (formerly KAISER) in response to the Meltdown vulnerability. With KPTI enabled, user-mode page tables contain only a minimal kernel mapping—just enough to handle system call entry. The full kernel is mapped only in kernel-mode page tables. This adds overhead but prevents Meltdown-style attacks from reading kernel memory.
Protection Mechanisms:
Several mechanisms enforce the kernel-user boundary:
CPU Privilege Rings: The x86 architecture has 4 privilege levels (rings 0-3). Linux uses ring 0 for kernel and ring 3 for user space. Memory accesses check the current privilege level against page table permissions.
Supervisor Mode Access Prevention (SMAP): Prevents the kernel from accidentally reading/writing user-space memory. This catches a class of kernel vulnerabilities where attackers trick the kernel into using user-controlled pointers.
Supervisor Mode Execution Prevention (SMEP): Prevents the kernel from executing code in user-space pages. Defeats classic exploits that redirect kernel execution to attacker-controlled user-space code.
NX Bit (No-Execute): Individual pages can be marked non-executable. Combined with SMEP, this significantly raises the bar for code injection attacks.
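The page-permission machinery behind these protections is visible from user space through mprotect(). The sketch below (our own minimal illustration, not a hardening recipe) maps an anonymous read-write page, deliberately without PROT_EXEC, then drops write permission; any write after that point would raise SIGSEGV, and /proc/self/maps reflects the change.

```c
/* Changing page permissions with mprotect(). */
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, page, PROT_READ | PROT_WRITE,   /* note: no PROT_EXEC */
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    strcpy(p, "writable for now");                       /* allowed: page is rw- */
    printf("wrote \"%s\" at %p\n", p, (void *)p);

    if (mprotect(p, page, PROT_READ) != 0) {             /* page becomes r-- */
        perror("mprotect");
        return 1;
    }
    printf("page is now read-only; a write here would raise SIGSEGV\n");
    /* p[0] = 'X'; */  /* uncommenting this line crashes the process */

    munmap(p, page);
    return 0;
}
```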
Address Space Layout Randomization (ASLR) is a security technique that randomizes the base addresses of key memory regions each time a program runs. This makes exploitation significantly harder by removing the attacker's ability to predict where code and data will be located.
What ASLR Randomizes:
Levels of ASLR in Linux:
Linux provides three ASLR modes, controlled by /proc/sys/kernel/randomize_va_space:
Most distributions run with level 2.
```bash
# Check current ASLR setting
cat /proc/sys/kernel/randomize_va_space
# Output: 2 (full randomization)

# Observe randomization across runs
for i in {1..5}; do
    cat /proc/self/maps | grep '\[stack\]'
done
# Output shows different addresses each time:
# 7fff8b332000-7fff8b353000 rw-p 00000000 00:00 0   [stack]
# 7ffc4a1d7000-7ffc4a1f8000 rw-p 00000000 00:00 0   [stack]
# 7ffd2e3bc000-7ffd2e3dd000 rw-p 00000000 00:00 0   [stack]
# ...

# Show library base address changes
ldd /bin/ls 2>/dev/null | grep libc
ldd /bin/ls 2>/dev/null | grep libc
# Addresses differ between invocations

# Temporarily disable ASLR for a single process
setarch $(uname -m) -R /bin/bash -c 'cat /proc/self/maps | grep stack'
setarch $(uname -m) -R /bin/bash -c 'cat /proc/self/maps | grep stack'
# Same addresses now

# Check if a binary is compiled with PIE (Position Independent Executable)
file /bin/ls
# Should show "pie executable" for full ASLR protection

# Check with readelf
readelf -h /bin/ls | grep Type
# Output: Type: DYN (Shared object file)   <- PIE enabled
# vs:     Type: EXEC (Executable file)     <- No PIE
```
The effectiveness of ASLR depends on entropy—how many bits of randomization are applied. On 64-bit systems, there's ample address space for high entropy. On 32-bit systems, limited address space means lower entropy, making brute-force attacks feasible. This is one reason 64-bit systems are more secure.
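A process can also opt out of ASLR for what it execs next, which is essentially what setarch -R does under the hood: set the ADDR_NO_RANDOMIZE persona with personality() and then exec. The sketch below assumes a glibc system with <sys/personality.h>; run it twice and the printed mappings no longer change between runs.

```c
/* Disabling ASLR for the next exec'd image via personality(). */
#include <stdio.h>
#include <sys/personality.h>
#include <unistd.h>

int main(void) {
    unsigned long persona = personality(0xffffffff);       /* read current persona */
    if (personality(persona | ADDR_NO_RANDOMIZE) == -1) {
        perror("personality");
        return 1;
    }

    /* The already-running image keeps its randomized layout; the new flag
     * applies to whatever this process execs next. */
    execlp("cat", "cat", "/proc/self/maps", (char *)NULL);
    perror("execlp");                                       /* only reached on failure */
    return 1;
}
```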
ASLR Bypass Techniques:
While ASLR is a powerful defense, it's not absolute. Researchers have developed various bypass techniques:
Information Leaks: If an attacker can leak a single address (through format string vulnerabilities, side channels, etc.), they can calculate the base address of an entire region.
Heap Spraying: Flooding the heap with controlled data increases the probability that a guess lands in attacker-controlled memory.
JIT (Just-In-Time) Spraying: In browsers, JavaScript JIT compilers can be coerced into placing predictable gadgets in executable memory.
Side Channels: Timing attacks, cache attacks, and speculative execution can leak address information.
Defense in depth is essential—ASLR should be combined with stack canaries, DEP/NX, and other protections.
Beyond the standard text/data/heap/stack regions, Linux maps several special regions into every process's address space for performance and functionality reasons.
The most important of these is the vDSO (virtual Dynamic Shared Object), a small kernel-provided library mapped into every process that implements selected system calls entirely in user space (most notably gettimeofday() and clock_gettime()), alongside the [vvar] data page it reads from and the legacy [vsyscall] region visible in the maps output earlier. By calling these in user space, programs avoid the overhead of a full system call for time-sensitive operations.
```c
/* Demonstrating vDSO performance benefits */
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
#include <unistd.h>

#define ITERATIONS 10000000

int main() {
    struct timespec ts;
    struct timeval tv;
    clock_t start, end;

    /* gettimeofday() - uses vDSO, extremely fast */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        gettimeofday(&tv, NULL);
    }
    end = clock();
    printf("gettimeofday (vDSO):   %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* clock_gettime() - also uses vDSO */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        clock_gettime(CLOCK_REALTIME, &ts);
    }
    end = clock();
    printf("clock_gettime (vDSO):  %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* Compare with an actual syscall - getpid always goes to kernel */
    start = clock();
    for (int i = 0; i < ITERATIONS; i++) {
        getpid();
    }
    end = clock();
    printf("getpid (real syscall): %.3f seconds\n",
           (double)(end - start) / CLOCKS_PER_SEC);

    /* Note: getpid() is actually cached by glibc, so the difference
     * may be less dramatic. Use syscall(SYS_getpid) for a true syscall. */

    return 0;
}

/* Typical output:
 * gettimeofday (vDSO):   0.150 seconds
 * clock_gettime (vDSO):  0.170 seconds
 * getpid (real syscall): 0.450 seconds
 *
 * vDSO calls are ~3x faster than real syscalls!
 */
```
The vDSO is compiled into the kernel and exposed to user space as if it were a shared library. You can extract it from a running process and disassemble it: dd if=/proc/self/mem bs=1 skip=$((0x7fff...)) count=8192 2>/dev/null | objdump -d -. This reveals the actual vDSO implementation for your kernel version.
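Another way to locate the vDSO, without parsing /proc/self/maps, is the auxiliary vector: the kernel passes the vDSO's load address as AT_SYSINFO_EHDR, which glibc exposes through getauxval(). A short sketch:

```c
/* Finding the vDSO base address via the auxiliary vector. */
#include <elf.h>
#include <stdio.h>
#include <sys/auxv.h>

int main(void) {
    unsigned long vdso = getauxval(AT_SYSINFO_EHDR);
    if (vdso == 0) {
        fprintf(stderr, "no vDSO reported in the auxiliary vector\n");
        return 1;
    }

    const unsigned char *ehdr = (const unsigned char *)vdso;
    printf("vDSO mapped at %p\n", (void *)vdso);
    printf("first bytes: %02x %c%c%c (an ELF header)\n",
           ehdr[0], ehdr[1], ehdr[2], ehdr[3]);   /* 7f 'E' 'L' 'F' */
    return 0;
}
```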
The kernel maintains sophisticated data structures to track each process's virtual address space. Understanding these structures is essential for kernel development and provides insight into how the mapping abstraction is implemented.
The mm_struct Structure:
Every process with a virtual address space (i.e., all user-space processes) has an associated mm_struct. This structure is the master record for the process's memory state.
```c
/* Simplified view of mm_struct (from include/linux/mm_types.h) */
struct mm_struct {
    /* VMA list and tree for fast lookup */
    struct vm_area_struct *mmap;        /* List of VMAs */
    struct rb_root mm_rb;               /* Red-black tree root */

    /* Page table pointer */
    pgd_t *pgd;                         /* Page Global Directory */

    /* Reference counting */
    atomic_t mm_users;                  /* How many users (threads) */
    atomic_t mm_count;                  /* Reference count */

    /* Memory statistics */
    atomic_long_t nr_ptes;              /* Page table pages */
    atomic_long_t nr_pmds;              /* PMD pages */
    unsigned long total_vm;             /* Total mapped pages */
    unsigned long locked_vm;            /* Locked (non-swappable) pages */
    unsigned long data_vm;              /* Data segment pages */
    unsigned long exec_vm;              /* Executable pages */
    unsigned long stack_vm;             /* Stack pages */

    /* Key address boundaries */
    unsigned long start_code, end_code; /* Code segment bounds */
    unsigned long start_data, end_data; /* Data segment bounds */
    unsigned long start_brk, brk;       /* Heap bounds */
    unsigned long start_stack;          /* Stack start */
    unsigned long arg_start, arg_end;   /* Command line arguments */
    unsigned long env_start, env_end;   /* Environment variables */

    /* mmap state */
    unsigned long mmap_base;            /* Base for mmap allocations */
    unsigned long task_size;            /* Size of user space */
    unsigned long highest_vm_end;       /* Highest VMA end address */

    /* Flags */
    unsigned long flags;                /* Memory flags */

    /* Lock for VMA operations */
    struct rw_semaphore mmap_lock;

    /* ... many more fields ... */
};
```
Virtual Memory Areas (VMAs):
Each contiguous region of the virtual address space is represented by a vm_area_struct. VMAs are organized in two data structures for efficient access: a linked list (the mmap field), kept sorted by address for sequential traversal, and a red-black tree (mm_rb) for fast lookup of the VMA covering a given address.
When a page fault occurs, the kernel uses the red-black tree to quickly find the VMA containing the faulting address.
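As a rough illustration of that lookup, the user-space sketch below mimics a find_vma-style search over a handful of hard-coded ranges. The real kernel code walks the red-black tree under mmap_lock; here a sorted array and binary search stand in for the tree, preserving the kernel convention that the returned VMA is the first one ending above the address, so the caller still checks vm_start.

```c
/* Simplified, user-space sketch of a find_vma-style lookup. */
#include <stdio.h>

struct vma {
    unsigned long start;   /* inclusive */
    unsigned long end;     /* exclusive */
};

/* Return the first VMA whose end is above addr, or NULL. */
static const struct vma *find_vma(const struct vma *vmas, int n,
                                  unsigned long addr) {
    const struct vma *result = NULL;
    int lo = 0, hi = n - 1;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        if (vmas[mid].end > addr) {
            result = &vmas[mid];        /* candidate; look for an earlier one */
            hi = mid - 1;
        } else {
            lo = mid + 1;
        }
    }
    return result;
}

int main(void) {
    const struct vma vmas[] = {
        { 0x400000, 0x452000 },         /* text */
        { 0x652000, 0x65b000 },         /* data */
        { 0xe2a000, 0xf1f000 },         /* heap */
    };
    unsigned long addr = 0x653123;
    const struct vma *v = find_vma(vmas, 3, addr);
    if (v && addr >= v->start)
        printf("%#lx falls in VMA [%#lx, %#lx)\n", addr, v->start, v->end);
    else
        printf("%#lx is not mapped (would fault)\n", addr);
    return 0;
}
```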
```c
/* Virtual Memory Area structure (simplified) */
struct vm_area_struct {
    /* VMA boundaries */
    unsigned long vm_start;             /* Start address (inclusive) */
    unsigned long vm_end;               /* End address (exclusive) */

    /* Linkage */
    struct vm_area_struct *vm_next;     /* Next VMA in list */
    struct vm_area_struct *vm_prev;     /* Previous VMA */
    struct rb_node vm_rb;               /* Node in red-black tree */

    /* The owning mm_struct */
    struct mm_struct *vm_mm;

    /* Access permissions */
    pgprot_t vm_page_prot;              /* Page protection flags */
    unsigned long vm_flags;             /* Flags: VM_READ, VM_WRITE, etc. */

    /* Backing storage */
    struct file *vm_file;               /* File for file-backed mapping */
    unsigned long vm_pgoff;             /* Offset in file (pages) */
    void *vm_private_data;              /* Private data */

    /* Operations for this VMA type */
    const struct vm_operations_struct *vm_ops;

    /* Anonymous memory management */
    struct anon_vma *anon_vma;          /* Anonymous VMA linkage */

    /* ... additional fields ... */
};

/* Common vm_flags values */
#define VM_READ      0x00000001  /* Pages can be read */
#define VM_WRITE     0x00000002  /* Pages can be written */
#define VM_EXEC      0x00000004  /* Pages can be executed */
#define VM_SHARED    0x00000008  /* Pages are shared */
#define VM_MAYREAD   0x00000010  /* VM_READ can be set */
#define VM_MAYWRITE  0x00000020  /* VM_WRITE can be set */
#define VM_MAYEXEC   0x00000040  /* VM_EXEC can be set */
#define VM_GROWSDOWN 0x00000100  /* Stack-like growth */
#define VM_LOCKED    0x00002000  /* Pages are locked in memory */
#define VM_IO        0x00004000  /* Memory-mapped I/O */
#define VM_DENYWRITE 0x00000800  /* ETXTBSY on write attempts */

/* VMA operations structure - defines how to handle page faults, etc. */
struct vm_operations_struct {
    void (*open)(struct vm_area_struct *area);
    void (*close)(struct vm_area_struct *area);
    vm_fault_t (*fault)(struct vm_fault *vmf);
    vm_fault_t (*huge_fault)(struct vm_fault *vmf, unsigned int order);
    /* ... */
};
```
The kernel aggressively merges adjacent VMAs with identical properties. When you mmap two adjacent regions with the same permissions and backing, the kernel may merge them into a single VMA to reduce memory overhead. This is why /proc/pid/maps may show fewer regions than expected.
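Splitting and merging can be observed from user space. The sketch below (our own, counting lines of /proc/self/maps as a crude proxy for the number of VMAs) maps three anonymous pages, changes the protection of the middle page so the kernel must split the VMA, then restores it so the pieces can merge back into one.

```c
/* Watching a VMA split and merge via mprotect(). */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

static void count_vmas(const char *label) {
    char cmd[64];
    snprintf(cmd, sizeof cmd, "wc -l < /proc/%d/maps", getpid());
    printf("%-40s", label);
    fflush(stdout);                       /* keep output ordered before system() */
    system(cmd);                          /* prints the number of map entries */
}

int main(void) {
    long page = sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    count_vmas("after mmap (one merged VMA):");
    mprotect(p + page, page, PROT_READ);              /* middle page rw- -> r-- */
    count_vmas("after mprotect (VMA split in three):");
    mprotect(p + page, page, PROT_READ | PROT_WRITE); /* restore rw- */
    count_vmas("after restore (merged again):");

    munmap(p, 3 * page);
    return 0;
}
```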
We have comprehensively examined how Linux organizes virtual memory for each process: the user/kernel split across architectures, the canonical region layout from the NULL guard page up to the stack, ASLR and the other protections layered on top, special mappings such as the vDSO, and the mm_struct and VMA structures the kernel uses to track it all.
What's Next:
In the next page, we'll dive deeper into how these virtual addresses are translated to physical addresses through page table management. We'll examine the multi-level page table architecture, TLB operation, and how Linux optimizes these critical data structures for performance and memory efficiency.
You now have a comprehensive understanding of Linux virtual address space layout—the foundation upon which all memory management is built. This knowledge is essential for systems programming, kernel development, and security analysis.