Throughout this module, we've explored memory management goals that relate to abstraction—allocation assigns memory to processes, protection isolates them, sharing enables controlled access, and logical organization matches programmer expectations. These goals operate largely in the realm of virtual memory, creating useful illusions.
But beneath all these abstractions lies a physical reality: actual RAM chips, with finite capacity, varying access times, and hardware-imposed constraints. Physical organization is the memory management goal that deals with this reality—managing the physical memory system's characteristics, organizing it for efficiency, and bridging the gap between memory as an abstraction and memory as hardware.
This is where operating systems confront the messy details: memory hierarchies, NUMA topologies, DMA constraints, and the relentless drive for performance.
By the end of this page, you will understand: what physical organization means in the context of memory management, the memory hierarchy and why it matters for OS design, how the OS manages physical page frames, NUMA (Non-Uniform Memory Access) and its implications, memory zones and allocation constraints, and the relationship between physical and virtual memory.
Physical organization refers to how the operating system manages the actual physical memory hardware. While logical organization concerns the programmer's view, physical organization concerns the hardware's reality.
Key Aspects of Physical Organization:
- Managing the memory hierarchy (keeping hot data in RAM, cold data on disk)
- Tracking and allocating physical page frames
- Respecting hardware constraints through memory zones (DMA, addressing limits)
- Exploiting NUMA topology for locality
- Putting spare RAM to work as the page cache
The Physical-Virtual Dichotomy:
Virtual Address Space (per process) Physical Address Space (shared)
┌─────────────────────────────────┐ ┌─────────────────────────────────┐
│ Process A sees contiguous │ │ Actual RAM Organization │
│ memory 0x0000 to 0xFFFF... │ │ │
│ │ │ 0x0000: Reserved (BIOS, etc.) │
│ Process B sees the same range │ │ 0x10000: DMA-capable zone │
│ (different mappings) │ │ 0x100000: Normal zone │
│ │ │ 0x40000000: High memory │
│ Both processes think they │ │ │
│ have all memory to themselves │ │ Multiple NUMA nodes │
└─────────────────────────────────┘ │ with different latencies │
└─────────────────────────────────┘
Logical org: How virtual memory is structured
Physical org: How the real RAM is managed
Virtual memory hides physical complexity from processes. But the OS must manage that complexity—deciding which physical frame to allocate for each virtual page, respecting hardware constraints, and optimizing for performance.
A naïve OS that ignores physical organization may still run, but badly: it allocates memory far from the requesting CPU, ignores cache behavior, and can violate DMA constraints, causing device failures. Physical organization is where systems engineering meets hardware reality.
Modern computer systems have a memory hierarchy—multiple levels of storage with different capacity, speed, and cost characteristics. Understanding this hierarchy is fundamental to physical organization.
The Hierarchy Levels:
| Level | Technology | Typical Size | Latency | Bandwidth | Managed By |
|---|---|---|---|---|---|
| CPU Registers | SRAM | ~1 KB | ~0.3 ns | TB/s | Compiler/Hardware |
| L1 Cache | SRAM | 32-128 KB/core | ~1 ns | ~500 GB/s | Hardware |
| L2 Cache | SRAM | 256 KB-1 MB/core | ~4 ns | ~200 GB/s | Hardware |
| L3 Cache | SRAM | 8-64 MB shared | ~12 ns | ~100 GB/s | Hardware |
| Main Memory (RAM) | DRAM | 16-512 GB | ~80 ns | ~50 GB/s | OS |
| SSD (swap) | Flash | 256 GB-8 TB | ~50 μs | ~5 GB/s | OS |
| HDD (swap) | Magnetic | 1-20 TB | ~10 ms | ~200 MB/s | OS |
Key Observations:
Speed/Size Trade-off: Faster memory is smaller. By the table's figures, L1 cache is roughly 80x faster than RAM but smaller by a factor of about a million.
Latency Gap: The gap between RAM and disk is enormous—on the order of 100,000x for HDD. A page fault serviced from SSD adds ~50 microseconds; from HDD, ~10 milliseconds. These stalls dominate performance when memory is overcommitted.
OS Role: The OS primarily manages RAM (and swap). Caches are managed by hardware. But OS decisions (page placement, working set management) profoundly affect cache effectiveness.
The OS and the Hierarchy:
┌─────────────────────────────────────────────────────────────────┐
│ Hardware Managed │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │Registers│ → │L1 Cache │ → │L2 Cache │ → │L3 Cache │ │
│ └─────────┘ └─────────┘ └─────────┘ └─────────┘ │
├─────────────────────────────────────────────────────────────────┤
│ OS Managed │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Main Memory (RAM) │ │
│ │ - Page frame allocation │ │
│ │ - Page replacement decisions │ │
│ │ - Memory zone management │ │
│ │ - NUMA locality │ │
│ └───────────────────────────────────────────────────────────┘ │
│ ↕ │
│ ┌───────────────────────────────────────────────────────────┐ │
│ │ Swap Space (Disk) │ │
│ │ - Page-out for memory pressure │ │
│ │ - Page-in on demand │ │
│ └───────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
The OS's goal is to keep "hot" pages (frequently accessed) in RAM and push "cold" pages to disk, maximizing the effective use of the memory hierarchy.
Programs exhibit locality—they tend to access the same memory locations repeatedly (temporal locality) and nearby locations sequentially (spatial locality). The entire memory hierarchy relies on locality. Without it, cache miss rates would approach 100%, and virtual memory would thrash. OS memory management exploits locality to work correctly.
The OS divides physical memory into fixed-size page frames (matching the page size, typically 4KB). Managing these frames—tracking their state, allocating them efficiently, and reclaiming them when needed—is a core physical organization task.
Frame States:

A frame may be free, in use for a process page (anonymous memory or a file mapping), or in use by the kernel itself (for example, as slab allocator memory). The kernel records this state in a per-frame descriptor:

```c
// Linux-style page frame descriptor
struct page {
    unsigned long flags;       // Page state flags
    atomic_t _refcount;        // Reference count
    atomic_t _mapcount;        // How many page tables map this frame
    union {
        struct {               // Fields for process/file pages
            struct list_head lru;           // LRU list linkage
            struct address_space *mapping;  // Owner (file or anon_vma)
            unsigned long index;            // Offset within mapping
        };
        struct {               // Slab allocator fields when used for slab
            struct kmem_cache *slab_cache;
            void *freelist;
        };
    };
    // Every physical frame has one 'struct page'
    // For 16GB RAM with 4KB pages: 4 million pages
    // struct page ~= 64 bytes, so ~256MB just for tracking!
};

// Frame allocation (simplified)
struct page *alloc_page(gfp_t flags)
{
    struct page *page;

    // Try to get from per-CPU cache (fast path)
    page = get_from_percpu_cache(flags);
    if (page)
        return page;

    // Fall back to zone allocator (buddy system)
    page = __alloc_pages(flags, 0);  // order 0 = single page

    if (!page && !(flags & GFP_ATOMIC)) {
        // No memory available - try to reclaim
        reclaim_pages();
        page = __alloc_pages(flags, 0);
    }

    return page;  // NULL if allocation failed
}
```

The Page Frame Array:
The kernel maintains a struct page for every physical page frame, organized in a large array. Given a physical address, the kernel can quickly find the corresponding struct page:
Physical Frame Number (PFN) = Physical Address / PAGE_SIZE
struct page *pg = &mem_map[PFN]
Reverse Mappings:
A crucial physical organization feature is reverse mapping—given a physical frame, finding all virtual addresses that map to it. This is needed for:
- Page replacement: evicting a frame requires invalidating every page table entry that maps it
- Swapping shared pages: all sharers' page tables must be updated consistently
- Page migration: NUMA balancing and memory compaction move frames and must re-point their mappings
Linux uses rmap (reverse mapping) structures to maintain these backward references efficiently.
Every physical page needs a descriptor. On a 1TB RAM server with 4KB pages, that's 256 million pages. At 64 bytes per struct page, that's 16GB just for page descriptors—1.6% of RAM. This overhead is unavoidable but must be kept minimal.
Not all physical memory is created equal. Hardware limitations and architectural constraints divide memory into zones—regions with different properties and use cases. The OS must allocate from appropriate zones based on the request.
Common Memory Zones (Linux x86-64):
| Zone | Address Range | Purpose | Typical Size |
|---|---|---|---|
| ZONE_DMA | 0 - 16 MB | ISA DMA (legacy devices) | 16 MB |
| ZONE_DMA32 | 0 - 4 GB | 32-bit DMA capable devices | ~4 GB |
| ZONE_NORMAL | 4 GB+ | Regular allocations | Most of RAM |
| ZONE_HIGHMEM | Varies | Memory not permanently mapped (32-bit only) | N/A on 64-bit |
| ZONE_MOVABLE | Configurable | Hotplug-able memory, easily migrated | Varies |
Why Zones Exist:
DMA Constraints: Some devices (especially legacy ISA devices) can only perform Direct Memory Access to low physical addresses. Memory for their buffers must come from ZONE_DMA.
32-bit Limitations: 32-bit CPUs can't directly address memory above 4GB. ZONE_HIGHMEM (on 32-bit kernels) holds such memory, requiring special handling.
Address Width Restrictions: Some devices have 32-bit DMA engines—they can DMA to addresses 0-4GB but not higher. ZONE_DMA32 satisfies these.
Hotplug and Migration: ZONE_MOVABLE contains only movable pages, enabling memory hotplug (adding/removing RAM while running).
Zone Fallback:
When a zone is exhausted, allocations can fall back to other zones following a hierarchy:
Allocation Request: GFP_KERNEL (prefer ZONE_NORMAL)
│
▼
ZONE_NORMAL has free memory? ─Yes─▶ Return page from ZONE_NORMAL
│ No
▼
ZONE_DMA32 has free memory? ─Yes─▶ Return page from ZONE_DMA32
│ No
▼
ZONE_DMA has free memory? ─Yes─▶ Return page from ZONE_DMA
│ No
▼
Allocation fails (or triggers reclaim)
This fallback is necessary but undesirable—allocating from ZONE_DMA for non-DMA purposes wastes precious DMA-capable memory.
```c
// Linux GFP (Get Free Pages) flags control zone selection

// Regular kernel allocation - uses ZONE_NORMAL, can sleep
void *ptr = kmalloc(size, GFP_KERNEL);

// Atomic allocation - uses ZONE_NORMAL, cannot sleep
void *ptr = kmalloc(size, GFP_ATOMIC);

// DMA allocation - must use ZONE_DMA
void *ptr = kmalloc(size, GFP_DMA);

// 32-bit DMA - must use ZONE_DMA32 or below
void *ptr = kmalloc(size, GFP_DMA32);

// User allocation - normal zone, may trigger reclaim
struct page *p = alloc_page(GFP_USER);

// Movable allocation - for user pages, can be migrated
struct page *p = alloc_page(GFP_HIGHUSER_MOVABLE);
```

A common problem: ZONE_DMA is small (16MB). If non-DMA allocations spill into ZONE_DMA (via fallback), it can be exhausted. Then a device driver needing DMA memory fails, even though gigabytes of RAM are free in other zones. This is called "zone imbalance" and requires careful memory management to avoid.
In small systems, all memory is equally accessible—any CPU can access any memory location with the same latency. But as systems scale to multiple processors, this becomes impossible to maintain. NUMA (Non-Uniform Memory Access) architectures have memory distributed across nodes, with varying access times depending on which CPU accesses which memory.
NUMA Architecture:
┌─────────────────────────────────────────────────────────────────────┐
│ NUMA System Example │
├─────────────────────────────────────────────────────────────────────┤
│ │
│ Node 0 Node 1 │
│ ┌─────────────────┐ ┌─────────────────┐ │
│ │ CPU 0 │ CPU 1 │ │ CPU 2 │ CPU 3 │ │
│ └────────┴────────┘ └────────┴────────┘ │
│ │ │ │
│ ┌──────▼──────┐ ┌───────▼──────┐ │
│ │ Local RAM │◄────Interconnect────►│ Local RAM │ │
│ │ 128 GB │ (slower) │ 128 GB │ │
│ │ (fast access│ │ (fast access│ │
│ │ ~80 ns) │ │ ~80 ns) │ │
│ └─────────────┘ └─────────────┘ │
│ │
│ CPU 0 → Node 0 RAM: 80 ns (local) │
│ CPU 0 → Node 1 RAM: 140 ns (remote, ~1.75x slower) │
│ │
└─────────────────────────────────────────────────────────────────────┘
NUMA Implications for Physical Organization:
Locality Preference: Allocate memory from the same node as the requesting CPU. A process running on CPU 0 should get memory from Node 0.
Memory Migration: If a process moves to a different CPU/node, consider migrating its memory for better locality.
Node Balancing: Don't exhaust one node while another is empty—spread allocations while maintaining locality.
Affinity Awareness: Know which CPUs belong to which nodes, and which memory belongs where.
```
// NUMA-aware allocation (conceptual)
function alloc_page_numa_aware(flags, preferred_node):
    // First, try the preferred (local) node
    page = alloc_from_node(preferred_node, flags)
    if page:
        return page

    // Try nodes in order of increasing distance
    for node in nodes_by_distance(preferred_node):
        page = alloc_from_node(node, flags)
        if page:
            return page

    // Last resort: trigger reclaim and retry
    reclaim_memory()
    return alloc_from_any_node(flags)

// Linux numactl usage examples:

// Run process with memory from node 0 only:
// $ numactl --membind=0 ./my_program

// Run on node 0's CPUs, prefer local memory:
// $ numactl --cpunodebind=0 --localalloc ./my_program

// Interleave memory across all nodes (useful for shared data):
// $ numactl --interleave=all ./my_program
```

NUMA locality can affect performance by 1.5-3x. A memory-intensive application running with all memory on a remote node performs dramatically worse than one with local memory. Database servers, scientific applications, and virtual machines are particularly sensitive. Physical organization that ignores NUMA throws away significant performance.
The Page Cache is a key physical organization feature—it uses "spare" RAM to cache file data, dramatically improving I/O performance. Rather than reading files from disk repeatedly, the OS keeps recently-accessed file pages in memory.
Page Cache Operation:
- On a read, the kernel checks the cache first; a hit avoids disk I/O entirely
- On a miss, the data is read from disk into a free frame and kept for future accesses
- Writes go to cached pages and are written back to disk later
Why Page Cache is Part of Physical Organization:
The page cache occupies physical memory frames—the same frames that processes might want. Physical organization therefore includes deciding how much RAM to devote to caching versus process pages, and reclaiming cache pages when processes need the memory.
Observing Page Cache:
$ free -h
total used free shared buff/cache available
Mem: 62Gi 8.5Gi 12Gi 1.2Gi 42Gi 52Gi
Swap: 8.0Gi 0B 8.0Gi
# Note: 42 GB in buff/cache!
# This is mostly page cache - file data kept in memory
# "available" (52 GB) includes reclaimable cache
The system above has 42GB in cache—file data that's immediately available if re-read. If processes need that memory, the kernel can reclaim cache pages. This is why "available" (52GB) is much higher than "free" (12GB).
A healthy Linux system often shows near-zero "free" memory—and that's good! Unused RAM is wasted RAM. The page cache puts spare memory to work, improving I/O performance. Memory is only truly exhausted when both free AND reclaimable cache are gone.
Physical and virtual memory management are deeply intertwined. Virtual memory provides abstraction; physical memory provides reality. Understanding their relationship completes the picture of memory management.
The Connection Points:
- Page tables bridge both worlds: each page table entry maps a virtual page to a physical frame
- Demand paging connects page faults to frame allocation: a fault on a non-resident page triggers a physical frame request
- Page replacement involves both: choosing a victim is a virtual-memory decision; reusing its frame is a physical one
- Sharing connects multiple virtual pages to one physical frame
```
// Complete flow: virtual access to physical reality
function handle_memory_access(process, virtual_addr, access_type):
    // Step 1: MMU looks up page table (hardware)
    pte = walk_page_table(process.pgd, virtual_addr)

    if pte.present:
        // Fast path: page is in RAM
        physical_addr = (pte.frame_number << PAGE_SHIFT) |
                        (virtual_addr & PAGE_OFFSET_MASK)
        // MMU caches the translation in the TLB for future accesses
        return access_memory(physical_addr)
    else:
        // Page fault: virtual page not in physical memory
        raise PAGE_FAULT

// --- In the page fault handler ---
function handle_page_fault(process, virtual_addr, pte):
    // Physical: allocate a frame
    frame = physical_allocate_frame(GFP_USER)
    if frame is NULL:
        // Physical: not enough RAM - need to reclaim
        victim_pte = choose_page_to_evict()      // Virtual decision
        victim_frame = victim_pte.frame_number
        if victim_pte.dirty:
            write_to_swap(victim_frame)          // Physical I/O
        invalidate_pte(victim_pte)               // Virtual update
        frame = victim_frame                     // Physical: reuse frame

    // Load content into the physical frame
    if page_is_file_backed(virtual_addr):
        read_from_file(frame, file_offset)
    else:
        read_from_swap_or_zero(frame)

    // Virtual: update page table
    pte.frame_number = frame
    pte.present = true
    pte.permissions = calculate_permissions()

    // Physical: update frame metadata
    frame.mapcount += 1

    // Resume the faulting instruction
```

Virtual memory can be thought of as a "promise"—processes are promised memory at certain addresses. Physical memory is the "reality"—actual RAM where data lives. The OS's job is to fulfill virtual promises with physical reality, on demand, while making the best use of limited physical resources.
This page explored physical organization as the fifth and final fundamental goal of memory management. Let's consolidate what we've learned:
Physical organization is the goal that manages memory as hardware: the OS navigates the memory hierarchy, divides RAM into zones, honors NUMA topology, tracks every frame with a descriptor (struct page), and maintains reverse mappings for page replacement.

Module Complete: The Five Goals
With this page, we've completed our exploration of all five fundamental memory management goals:
- Allocation: assigning memory to processes
- Protection: isolating processes from one another
- Sharing: enabling controlled access to common memory
- Logical organization: matching the programmer's view of memory
- Physical organization: managing the actual memory hardware
These goals form the foundation for everything else in memory management. Subsequent modules will build on this foundation—exploring address binding, paging, segmentation, virtual memory, and more.
Congratulations! You now understand the five fundamental goals that drive memory management in operating systems. These concepts—allocation, protection, sharing, logical organization, and physical organization—underpin every topic in the chapters ahead. You have the conceptual foundation to understand why memory management works the way it does.