From a process's perspective, it has exclusive access to a vast, contiguous memory space—typically spanning 128 TB or more of virtual addresses on a 64-bit system. It can allocate memory, map files, and execute code as if no other process existed. This illusion of isolation and abundance is created by the operating system's memory management system.
But this illusion requires meticulous bookkeeping. The kernel must track:

- Which regions of the virtual address space are valid, and with what permissions
- Where each virtual page actually lives: a physical frame, a file on disk, or swap
- How much memory the process is consuming, for accounting and monitoring
- What resource limits apply, and whether an allocation would exceed them
All of this information lives in the PCB—specifically, in the memory management information structures that describe the process's address space.
By the end of this page, you will understand how the PCB represents a process's memory: the memory descriptor structure, virtual memory areas (VMAs), page table pointers, memory statistics, and resource limits. You'll see how fork() creates shared mappings, how mmap() adds regions, and how the kernel enforces memory protection.
The memory descriptor (called mm_struct in Linux, the VAD tree rooted in the EPROCESS structure on Windows, or vm_map in macOS) is the top-level structure that describes a process's entire virtual address space. It's referenced from the PCB and contains or points to everything the kernel needs for memory management.
What the Memory Descriptor Contains:
```c
// Linux Memory Descriptor (simplified from include/linux/mm_types.h)

struct mm_struct {
    // Virtual Memory Areas
    struct vm_area_struct *mmap;        // List of VMAs
    struct rb_root mm_rb;               // Red-black tree of VMAs for fast lookup
    struct vm_area_struct *mmap_cache;  // Last accessed VMA (cache)

    // Address Space Layout
    unsigned long mmap_base;            // Base address for mmap() allocations
    unsigned long task_size;            // Size of user address space
    unsigned long highest_vm_end;       // Highest VMA end address

    // Page Tables
    pgd_t *pgd;                         // Pointer to top-level page table (PGD/PML4)

    // Reference Counting
    atomic_t mm_users;                  // Number of users (thread count)
    atomic_t mm_count;                  // Number of references (includes kernel)

    // Memory Statistics
    unsigned long total_vm;             // Total pages mapped (virtual)
    unsigned long locked_vm;            // Pages locked in memory
    unsigned long pinned_vm;            // Pages pinned (can't be swapped)
    unsigned long data_vm;              // Data + stack pages
    unsigned long exec_vm;              // Executable pages
    unsigned long stack_vm;             // Stack pages

    // Limits
    unsigned long def_stack_guard_gap;  // Gap between stack and mmap

    // Code and Data Boundaries
    unsigned long start_code, end_code; // Text segment
    unsigned long start_data, end_data; // Data segment
    unsigned long start_brk, brk;       // Heap (start and current end)
    unsigned long start_stack;          // Stack start
    unsigned long arg_start, arg_end;   // Command line arguments
    unsigned long env_start, env_end;   // Environment variables

    // Synchronization
    struct rw_semaphore mmap_sem;       // Protects VMA list/tree
    spinlock_t page_table_lock;         // Protects page tables

    // Architecture-Specific
    mm_context_t context;               // CPU-specific context (ASID, etc.)

    // ... many more fields in actual kernel
};

// In task_struct (PCB):
struct task_struct {
    // ... other fields
    struct mm_struct *mm;        // Memory descriptor (user address space)
    struct mm_struct *active_mm; // Currently active address space
    // Kernel threads have mm = NULL but need active_mm for context switch
    // User processes: mm == active_mm
    // ... other fields
};
```

Threads within a process share the same mm_struct—they have the same address space. The mm_users count tracks how many threads reference this mm_struct. When a thread is created with CLONE_VM, it doesn't get a new mm_struct; it shares the parent's. This is why threads see each other's memory changes instantly.
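You can see this sharing from user space. Here's a minimal sketch (not kernel code) using POSIX threads: both threads run inside one mm_struct, so a plain store by one thread is visible to the other with no copying or message passing.

```c
// Sketch: two POSIX threads share one mm_struct, so a store by one
// thread is immediately visible to the other with no copying or IPC.
// Build with: gcc -pthread demo.c
#include <pthread.h>
#include <stdio.h>

static int shared_value = 0;  // one copy, in the shared data segment

static void *writer(void *arg) {
    shared_value = 42;        // same address space: no message passing
    return NULL;
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, writer, NULL);
    pthread_join(t, NULL);
    printf("main thread sees shared_value = %d\n", shared_value);
    return 0;
}
```

Contrast this with fork(), covered below, where the child gets its own mm_struct and a write like this would stay private.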
A process's address space isn't uniformly mapped—it consists of discrete regions with different properties. Each region is represented by a Virtual Memory Area (VMA) structure.
What a VMA Represents:
A VMA describes a contiguous region of virtual addresses with uniform characteristics:
```c
// Linux VMA Structure (simplified from include/linux/mm_types.h)

struct vm_area_struct {
    // Address Range
    unsigned long vm_start;    // Start virtual address (inclusive)
    unsigned long vm_end;      // End virtual address (exclusive)

    // Linkage
    struct vm_area_struct *vm_next, *vm_prev; // Sorted list by address
    struct rb_node vm_rb;      // Red-black tree node (for fast lookup)

    // Owning mm_struct
    struct mm_struct *vm_mm;   // Address space this VMA belongs to

    // Protection and Flags
    pgprot_t vm_page_prot;     // Page table protection bits
    unsigned long vm_flags;    // VMA flags (see below)

    // Backing Store
    struct file *vm_file;      // File being mapped (NULL for anon)
    unsigned long vm_pgoff;    // Offset in file (in pages)

    // Operations
    const struct vm_operations_struct *vm_ops; // VMA-specific handlers

    // Anonymous Memory
    struct anon_vma *anon_vma; // For copy-on-write handling

    // ... more fields for special cases
};

// VMA Flags (vm_flags) - from include/linux/mm.h
#define VM_READ       0x00000001  // Readable
#define VM_WRITE      0x00000002  // Writable
#define VM_EXEC       0x00000004  // Executable
#define VM_SHARED     0x00000008  // Shared (vs. private/COW)
#define VM_GROWSDOWN  0x00000100  // Stack: can grow toward lower addr
#define VM_GROWSUP    0x00000200  // Can grow toward higher addr
#define VM_DENYWRITE  0x00000800  // Deny write to file
#define VM_LOCKED     0x00002000  // Locked in memory (mlock)
#define VM_IO         0x00004000  // Memory-mapped I/O
#define VM_DONTCOPY   0x00020000  // Don't copy on fork
#define VM_DONTEXPAND 0x00040000  // Cannot expand (mremap)
#define VM_HUGETLB    0x00400000  // Huge TLB pages

// Example VMA for text (code) segment:
//   vm_start = 0x400000
//   vm_end   = 0x401000
//   vm_flags = VM_READ | VM_EXEC | VM_DENYWRITE
//   vm_file  = /path/to/executable
//   vm_pgoff = 0 (starts at beginning of file)
```

Viewing VMAs: /proc/[pid]/maps
Linux exposes VMAs through the /proc filesystem. Each line represents one VMA:
```
$ cat /proc/self/maps
# Address range             Perms Offset   Dev   Inode  Pathname
00400000-00452000           r-xp  00000000 08:01 123456 /usr/bin/cat
00651000-00652000           r--p  00051000 08:01 123456 /usr/bin/cat
00652000-00653000           rw-p  00052000 08:01 123456 /usr/bin/cat
00e54000-00e75000           rw-p  00000000 00:00 0      [heap]
7f6c88000000-7f6c88021000   rw-p  00000000 00:00 0
7f6c8c000000-7f6c8c1c0000   r-xp  00000000 08:01 789012 /lib/x86_64-linux-gnu/libc.so.6
...
7ffc8ba00000-7ffc8ba21000   rw-p  00000000 00:00 0      [stack]
7ffc8bb0c000-7ffc8bb10000   r--p  00000000 00:00 0      [vvar]
7ffc8bb10000-7ffc8bb11000   r-xp  00000000 00:00 0      [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]

# Columns explained:
# 1. Address range (start-end in hex)
# 2. Permissions: r=read, w=write, x=execute, p=private, s=shared
# 3. File offset (for file-backed mappings)
# 4. Device (major:minor)
# 5. Inode
# 6. Pathname (or [heap], [stack], [vdso], empty for anon mmap)
```

| VMA Type | Permissions | Backing | Purpose |
|---|---|---|---|
| Text (Code) | r-xp | Executable file | Program instructions (read-only, executable) |
| Data | rw-p | Executable file | Initialized global/static variables |
| BSS | rw-p | Anonymous | Uninitialized global/static variables |
| Heap | rw-p | Anonymous | Dynamic allocation (malloc/new) |
| Stack | rw-p | Anonymous | Function call stack |
| Shared Library Code | r-xp | Library file | Shared code (read-only, shared) |
| Shared Library Data | rw-p | Library file | Per-process library data |
| mmap (file) | varies | File | Memory-mapped file |
| mmap (anon) | varies | Anonymous | Anonymous memory allocation |
| vDSO | r-xp | Kernel | Fast system call interface |
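The table above maps directly onto what mmap() produces. The following sketch (assuming a Linux system with /proc mounted) creates a four-page anonymous mapping, then dumps /proc/self/maps so you can find the new anonymous rw-p VMA containing the printed address:

```c
// Sketch: create an anonymous mapping and observe the resulting VMA.
// Assumes Linux with /proc mounted.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 4 * 4096;  // four pages
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("new VMA starts at %p\n", p);

    // Ask the kernel for its view of our address space; the address
    // above falls inside one of the anonymous "rw-p ... 00:00 0" lines.
    char cmd[64];
    snprintf(cmd, sizeof(cmd), "cat /proc/%d/maps", (int)getpid());
    system(cmd);

    munmap(p, len);
    return 0;
}
```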
The page tables translate virtual addresses to physical addresses. Each process has its own page table hierarchy, and the pointer to the top-level table is stored in the memory descriptor (and loaded into a CPU register like CR3 during context switches).
Multi-Level Page Tables (x86-64):
On x86-64 with 4-level paging, a 48-bit virtual address is translated using:

- Bits 47-39: index into the PML4 (Page Map Level 4), pointed to by CR3
- Bits 38-30: index into the PDPT (Page Directory Pointer Table)
- Bits 29-21: index into the PD (Page Directory)
- Bits 20-12: index into the PT (Page Table)
- Bits 11-0: byte offset within the 4 KB page
```c
// x86-64 Page Table Entry Format

/*
 * A 64-bit Page Table Entry (PTE):
 *
 * Bit  63:    NX (No Execute) - if 1, page is not executable
 * Bits 62-52: Available for OS use
 * Bits 51-12: Physical frame number (40 bits -> 52-bit physical address)
 * Bits 11-9:  Available for OS use
 * Bit  8:     G (Global) - TLB not flushed on CR3 change
 * Bit  7:     PAT (Page Attribute Table)
 * Bit  6:     D (Dirty) - page has been written
 * Bit  5:     A (Accessed) - page has been read or written
 * Bit  4:     PCD (Page Cache Disable)
 * Bit  3:     PWT (Page Write Through)
 * Bit  2:     U/S (User/Supervisor) - if 1, accessible from user mode
 * Bit  1:     R/W (Read/Write) - if 1, page is writable
 * Bit  0:     P (Present) - if 0, page fault on access
 */

typedef unsigned long pte_t;

// Macros to extract/check PTE fields
#define PTE_PRESENT    (1UL << 0)
#define PTE_RW         (1UL << 1)
#define PTE_USER       (1UL << 2)
#define PTE_ACCESSED   (1UL << 5)
#define PTE_DIRTY      (1UL << 6)
#define PTE_NX         (1UL << 63)
#define PTE_FRAME_MASK 0x000FFFFFFFFFF000UL

static inline bool pte_present(pte_t pte) {
    return pte & PTE_PRESENT;
}

static inline unsigned long pte_pfn(pte_t pte) {
    return (pte & PTE_FRAME_MASK) >> 12;
}

static inline bool pte_write(pte_t pte) {
    return pte & PTE_RW;
}

// The kernel uses these to check permissions:
// - If PTE_PRESENT is 0: Page fault (page not in memory)
// - If PTE_RW is 0 and write attempted: Protection fault
// - If PTE_USER is 0 and accessed from user mode: Protection fault
// - If PTE_NX is 1 and instruction fetch: Protection fault
```

A full page table hierarchy for a 48-bit address space could require gigabytes of memory. In practice, most entries are not present—the hierarchy is sparse and only populated on demand (demand paging). Huge pages (2 MB or 1 GB) reduce table overhead by using fewer levels.
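To make the four-level walk concrete, here's a small standalone sketch (the example address is arbitrary) that slices a 48-bit virtual address into the four 9-bit table indices and the 12-bit page offset, exactly the decomposition the MMU performs at each level:

```c
// Sketch: slice a 48-bit x86-64 virtual address into the four 9-bit
// table indices and the 12-bit page offset used during translation.
#include <stdio.h>

int main(void) {
    unsigned long va = 0x00007f6c8c1b0a48UL;      // arbitrary user address

    unsigned long offset = va & 0xFFFUL;          // bits 11:0
    unsigned long pt     = (va >> 12) & 0x1FFUL;  // bits 20:12 -> PT index
    unsigned long pd     = (va >> 21) & 0x1FFUL;  // bits 29:21 -> PD index
    unsigned long pdpt   = (va >> 30) & 0x1FFUL;  // bits 38:30 -> PDPT index
    unsigned long pml4   = (va >> 39) & 0x1FFUL;  // bits 47:39 -> PML4 index

    printf("PML4=%lu PDPT=%lu PD=%lu PT=%lu offset=0x%lx\n",
           pml4, pdpt, pd, pt, offset);
    return 0;
}
```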
The memory descriptor tracks various statistics about memory usage. These are used for resource accounting, limits enforcement, and monitoring tools like top and ps.
| Statistic | Field | Meaning | Where Visible |
|---|---|---|---|
| Virtual Size | total_vm | Total pages in address space (mapped, not necessarily resident) | VSZ/VIRT in ps/top |
| Resident Set Size | rss | Pages actually in physical memory now | RSS/RES in ps/top |
| Shared Memory | shared_vm | Pages shared with other processes | SHR in top |
| Locked Memory | locked_vm | Pages locked (not swappable) | mlock() accounting |
| Code Size | exec_vm | Executable pages (text) | CODE in some tools |
| Data Size | data_vm | Data + stack pages | DATA in some tools |
| Stack Size | stack_vm | Stack pages | Stack limit tracking |
| Swap Usage | (external) | Pages on swap device | SWAP in top |
```
# Viewing process memory statistics

# Using ps
ps -o pid,vsz,rss,sz,command -p $$
#   PID   VSZ  RSS   SZ COMMAND
# 12345 25000 5000 6250 bash

# Using /proc
cat /proc/$$/status | grep -E "^(Vm|Rss)"
# VmPeak:   26000 kB   # Peak virtual memory size
# VmSize:   25000 kB   # Current virtual memory size
# VmLck:        0 kB   # Locked memory
# VmPin:        0 kB   # Pinned memory
# VmHWM:     5500 kB   # Peak resident set size
# VmRSS:     5000 kB   # Resident set size
# RssAnon:   2000 kB   # Anonymous RSS
# RssFile:   3000 kB   # File-backed RSS
# RssShmem:     0 kB   # Shared memory RSS
# VmData:    1500 kB   # Data segment size
# VmStk:      136 kB   # Stack size
# VmExe:      900 kB   # Text (code) size
# VmLib:     2000 kB   # Shared library size
# VmPTE:       64 kB   # Page table size
# VmSwap:       0 kB   # Swap usage

# Memory maps summary
cat /proc/$$/smaps_rollup
# Shows aggregated memory info without per-VMA detail

# Detailed per-VMA memory info
cat /proc/$$/smaps | head -30
# Shows RSS, PSS, Shared/Private pages per VMA
```

VSZ (Virtual Size) includes all mapped memory—even pages never accessed or swapped out. RSS (Resident Set Size) is the actual physical memory currently used. A process might have 1 GB VSZ but only 50 MB RSS if most pages aren't touched. For capacity planning, RSS (or PSS, which divides shared pages proportionally among the processes using them) is more meaningful.
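The VSZ-versus-RSS distinction is easy to demonstrate with the standard getrusage() call (on Linux, ru_maxrss is reported in kilobytes). In this sketch, a large malloc() barely moves peak RSS because the pages are mapped but untouched; writing to them is what makes them resident:

```c
// Sketch: peak RSS responds to touching pages, not to mapping them.
// On Linux, ru_maxrss is reported in kilobytes.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/resource.h>

static long peak_rss_kb(void) {
    struct rusage ru;
    getrusage(RUSAGE_SELF, &ru);
    return ru.ru_maxrss;
}

int main(void) {
    size_t len = 64 * 1024 * 1024;  // 64 MB
    printf("start:          %ld kB\n", peak_rss_kb());

    char *buf = malloc(len);        // grows VSZ; pages not yet resident
    if (!buf) return 1;
    printf("after malloc:   %ld kB\n", peak_rss_kb());

    memset(buf, 1, len);            // faults every page in: RSS grows
    printf("after touching: %ld kB\n", peak_rss_kb());

    free(buf);
    return 0;
}
```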
Process creation and program execution fundamentally involve memory management. Understanding how fork() and exec() manipulate the memory descriptor is crucial.
fork(): Creating a Copy of the Address Space
When a process calls fork(), the child gets a "copy" of the parent's address space. But physical copying would be slow and wasteful (many pages are never modified). Instead, the kernel uses Copy-on-Write (COW):
- The child gets its own mm_struct and page tables; the mm_struct is a new structure, but its VMAs initially point to the same physical pages as the parent's
- Writable private pages are marked read-only in both parent and child, and each shared page's reference count is incremented
- When either process writes to such a page, the CPU faults, and the kernel copies the page so the writer gets its own private copy

This makes fork() fast—only metadata is copied, not actual memory pages.
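Before looking at the kernel side, here's a user-space sketch of COW in action: the child's write to a shared global triggers a page fault, the kernel copies the page, and the parent's value is untouched.

```c
// Sketch: copy-on-write after fork(). Parent and child initially share
// this page; the child's store forces the kernel to copy it.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>

int value = 100;  // sits in a private, writable data page

int main(void) {
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {          // child
        value = 200;         // write fault -> kernel copies the page
        printf("child  sees %d\n", value);
        exit(0);
    }

    waitpid(pid, NULL, 0);
    printf("parent sees %d\n", value);  // still 100: pages diverged
    return 0;
}
```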
```c
// Simplified fork() memory handling

struct mm_struct *dup_mm(struct task_struct *parent) {
    struct mm_struct *mm;

    // Allocate new memory descriptor
    mm = allocate_mm();
    if (!mm)
        return NULL;

    // Copy basic mm_struct fields
    memcpy(mm, parent->mm, sizeof(*mm));

    // Allocate new page table root (PGD/PML4)
    mm->pgd = pgd_alloc(mm);
    if (!mm->pgd) {
        free_mm(mm);
        return NULL;
    }

    // Set reference counts
    atomic_set(&mm->mm_users, 1);
    atomic_set(&mm->mm_count, 1);

    // Initialize synchronization
    init_rwsem(&mm->mmap_sem);

    // Copy all VMAs with COW semantics
    if (dup_mmap(mm, parent->mm) < 0) {
        free_pgtables(mm);
        free_mm(mm);
        return NULL;
    }

    return mm;
}

int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm) {
    struct vm_area_struct *mpnt, *tmp;

    // Iterate through parent's VMAs
    for (mpnt = oldmm->mmap; mpnt; mpnt = mpnt->vm_next) {
        // Allocate new VMA structure
        tmp = vm_area_dup(mpnt);
        if (!tmp)
            return -ENOMEM;

        // Link into new mm
        tmp->vm_mm = mm;
        insert_vm_struct(mm, tmp);

        // Copy page table entries with COW marking
        copy_page_range(mm, oldmm, tmp);
    }
    return 0;
}

int copy_page_range(...) {
    // For each PTE in the range:
    // 1. If writable and not shared:
    //    - Clear write bit in parent's PTE
    //    - Copy PTE to child's page table
    //    - Increment page reference count
    //    - Both PTEs now read-only (COW)
    // 2. If read-only or shared:
    //    - Just copy PTE (share the page)
}
```

exec(): Replacing the Address Space
When a process calls exec(), its entire address space is replaced:

- All existing VMAs are unmapped and their page tables torn down
- New VMAs are created for the new program's text, data, and BSS segments, backed by the executable file
- A fresh stack is set up containing the new argv and environment
- The saved instruction pointer is set to the new program's entry point
The mm_struct is either reused (after clearing) or replaced entirely.
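The replacement is easy to observe from user space. In this sketch (using /bin/echo as the new program), the code after execv() runs only if the call fails, because on success the text segment containing it has already been unmapped:

```c
// Sketch: execv() replaces every VMA of this process with /bin/echo's.
#include <stdio.h>
#include <unistd.h>

int main(void) {
    printf("pid %d: about to replace my address space\n", (int)getpid());

    char *argv[] = { "echo", "same pid, brand new address space", NULL };
    execv("/bin/echo", argv);

    perror("execv");  // reached only if exec failed
    return 1;
}
```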
The operating system enforces resource limits to prevent any process from consuming excessive memory. These limits are stored with the process and checked during memory allocation operations.
| Resource | Constant | Default | Purpose |
|---|---|---|---|
| Virtual Memory | RLIMIT_AS | Unlimited | Maximum address space size |
| Locked Memory | RLIMIT_MEMLOCK | 64 KB | Max memory lockable via mlock() |
| Stack Size | RLIMIT_STACK | 8 MB | Maximum stack segment size |
| Data Segment | RLIMIT_DATA | Unlimited | Maximum data segment size |
| Core Dump | RLIMIT_CORE | 0 or Unlimited | Maximum core file size |
| Resident Set | RLIMIT_RSS | Unlimited | Max resident set size (advisory) |
```
# Viewing and modifying resource limits

# Show all limits for current shell
ulimit -a
# -t: cpu time (seconds)         unlimited
# -f: file size (blocks)         unlimited
# -d: data seg size (kbytes)     unlimited
# -s: stack size (kbytes)        8192
# -c: core file size (blocks)    0
# -m: resident set size (kbytes) unlimited
# -u: processes                  63636
# -n: file descriptors           1024
# -l: locked memory (kbytes)     64
# -v: virtual memory (kbytes)    unlimited

# Show specific limit (stack size)
ulimit -s
# 8192

# Set stack limit (soft limit)
ulimit -s 16384

# The kernel tracks limits in struct rlimit:
# struct rlimit {
#     rlim_t rlim_cur;  // Soft limit (current)
#     rlim_t rlim_max;  // Hard limit (ceiling)
# };

# View limits via /proc
cat /proc/self/limits | head -10
# Limit                Soft Limit  Hard Limit  Units
# Max cpu time         unlimited   unlimited   seconds
# Max file size        unlimited   unlimited   bytes
# Max data size        unlimited   unlimited   bytes
# Max stack size       8388608     unlimited   bytes
# Max core file size   0           unlimited   bytes
# Max resident set     unlimited   unlimited   bytes
# Max processes        63636       63636       processes
# Max open files       1024        1048576     files
# Max locked memory    65536       65536       bytes
# Max address space    unlimited   unlimited   bytes
```

When a process exceeds a memory limit: mmap() fails with ENOMEM, malloc() returns NULL (after the underlying mmap/brk fails), or the process receives SIGSEGV (on stack overflow). Container orchestrators like Kubernetes use cgroups for more sophisticated memory limits with OOM-killer integration.
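This failure mode can be reproduced directly with the getrlimit()/setrlimit() API. The following sketch (the 256 MB cap and 1 GB request are arbitrary values) lowers the soft RLIMIT_AS limit, then shows a large mmap() failing with ENOMEM:

```c
// Sketch: lower the soft RLIMIT_AS cap, then watch a large mmap()
// fail with ENOMEM.
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;
    getrlimit(RLIMIT_AS, &rl);
    printf("RLIMIT_AS soft=%llu hard=%llu\n",
           (unsigned long long)rl.rlim_cur,
           (unsigned long long)rl.rlim_max);

    rl.rlim_cur = 256UL * 1024 * 1024;  // soft limit: 256 MB
    if (setrlimit(RLIMIT_AS, &rl) != 0) { perror("setrlimit"); return 1; }

    void *p = mmap(NULL, 1UL << 30, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);  // 1 GB request
    if (p == MAP_FAILED)
        printf("1 GB mmap failed as expected: %s\n", strerror(errno));
    else
        munmap(p, 1UL << 30);
    return 0;
}
```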
Different operating systems structure memory management information differently, though the concepts are similar.
Windows Memory Management Structures:

- Each process's EPROCESS block points to a tree of Virtual Address Descriptors (VADs), the Windows analogue of Linux's VMAs
- VADs are kept in a balanced tree keyed by virtual page number for fast lookup, as the structure below shows
- Per-process working sets play the role that the resident set plays on Linux
Key Differences:

- Windows keeps VADs only in a self-balancing tree; Linux maintains both a sorted list and a red-black tree of VMAs
- Windows distinguishes reserved address space from committed memory (MEM_RESERVE vs. MEM_COMMIT); Linux mappings are committed lazily through demand paging
- User code inspects regions with VirtualQueryEx() rather than by reading /proc/[pid]/maps
```c
// Windows VAD structure (conceptual)

typedef struct _MMVAD {
    union {
        LONG_PTR Balance : 2;
        struct _MMVAD *Parent;
    };
    struct _MMVAD *LeftChild;
    struct _MMVAD *RightChild;

    ULONG_PTR StartingVpn;     // Starting virtual page number
    ULONG_PTR EndingVpn;       // Ending virtual page number

    union {
        ULONG LongFlags;
        MMVAD_FLAGS VadFlags;  // Protection, state, type
    };

    // For file-backed VADs
    PCONTROL_AREA ControlArea;
    PFILE_OBJECT FileObject;

    // ... more fields
} MMVAD, *PMMVAD;

// Windows API for memory inspection
MEMORY_BASIC_INFORMATION mbi;
VirtualQueryEx(hProcess, address, &mbi, sizeof(mbi));
// Returns: BaseAddress, AllocationBase, AllocationProtect,
//          RegionSize, State (MEM_COMMIT/RESERVE/FREE),
//          Protect, Type (MEM_IMAGE/MAPPED/PRIVATE)
```

We've explored how the PCB tracks each process's virtual address space. From the memory descriptor to VMAs to page tables, these structures enable the illusion of isolated, abundant memory for every process. Let's consolidate the key insights:

- The memory descriptor (mm_struct) is the top-level view of a process's address space, referenced from the PCB
- VMAs describe contiguous regions with uniform permissions and backing
- Page tables, rooted at the pointer stored in the descriptor, translate virtual addresses to physical ones
- Statistics and rlimits stored with the process enable accounting and enforcement
- fork() duplicates the descriptor with copy-on-write; exec() replaces it outright
Module Complete:
This concludes our deep dive into the Process Control Block. We've examined:

- Process identification and credentials
- Process state and the transitions of the process lifecycle
- CPU context saved and restored at every context switch
- Memory management: the memory descriptor, VMAs, page tables, statistics, and limits
The PCB is the kernel's fundamental representation of a process. Every scheduling decision, every resource allocation, every context switch—all rely on the information stored in this critical data structure. Understanding the PCB provides a foundation for understanding all of process management.
You now have a comprehensive understanding of the Process Control Block—the kernel data structure that gives each process its identity. From identification to state, from CPU context to memory mapping, you understand how operating systems represent and manage the fundamental unit of execution.