What if addresses could be resolved on every single memory access, not just at compile time or load time? What if a process could run with one mapping, be moved to completely different physical locations, and continue executing as if nothing happened? What if multiple processes could share the exact same physical memory while believing they each have exclusive access?
Execution-time binding (also called runtime binding or dynamic binding) delivers all of this and more. By deferring address resolution to the moment of each memory access—hundreds of millions of times per second—execution-time binding enables the sophisticated memory management that defines modern computing.
This mechanism is the foundation of virtual memory, memory protection, demand paging, shared libraries, and security features like Address Space Layout Randomization (ASLR). Understanding execution-time binding is understanding how modern operating systems actually work.
By the end of this page, you will understand how execution-time binding works at the hardware level (MMU), the translation from virtual to physical addresses, why this constant translation is worthwhile despite its overhead, and the revolutionary capabilities it enables. You'll grasp the core of modern memory management.
Load-time binding enabled multiprogramming, but its "bind once and never change" model created fundamental limitations:
Problems with load-time binding:
The solution insight:
All these problems stem from the same root: addresses are fixed after loading. The solution is equally fundamental: never fix the addresses at all. Translate every address, every time, from a program's logical address to the actual physical address.
This sounds expensive—and it is. Every MOV, every ADD, every CALL that accesses memory requires translation. With billions of memory accesses per second in a modern system, the translation must be extraordinarily fast. This is why execution-time binding requires specialized hardware: the Memory Management Unit (MMU).
Execution-time binding introduces the powerful concept of virtual memory: each process operates in its own virtual address space, completely independent of physical memory. The process sees addresses 0 through N; the MMU transparently maps these to wherever the data actually resides in physical RAM. The process never knows—or needs to know—its physical location.
Execution-time binding introduces a fundamental distinction: the addresses a program uses (virtual or logical addresses) are different from the addresses used to access actual RAM chips (physical addresses).
Virtual Address Space:
Physical Address Space:
    VIRTUAL ADDRESS SPACES (each process sees its own)

      Process A                        Process B
      0x00000000: [code]               0x00000000: [code]
      0x00001000: [data]               0x00001000: [data]
      0x00002000: [heap]               0x00002000: [heap]
      ...                              ...
      0x7FFF0000: [stack]              0x7FFF0000: [stack]

    PHYSICAL ADDRESS SPACE (actual hardware memory)

      0x00000000  [Kernel code]
      ...
      0x00100000  [Process A code]     <── Process A, virtual 0x00000000
      0x00101000  [Process A data]
      ...
      0x00200000  [Process B code]     <── Process B, virtual 0x00000000
      0x00201000  [Process B stack]
      ...
      0x00300000  [Shared library]
      ...
      0x003FF000  [Free]

KEY INSIGHT: Both processes use virtual address 0x00000000 for their code, but these map to DIFFERENT physical addresses (0x00100000 vs 0x00200000). The MMU maintains a separate mapping table for each process.

| Aspect | Virtual Address | Physical Address |
|---|---|---|
| Generated by | CPU instruction operands | Memory controller/bus |
| Seen by | Process (programs) | Hardware (RAM chips) |
| Address space | Per-process (isolated) | System-wide (shared) |
| Range | Defined by architecture (e.g., 48-bit) | Defined by installed RAM |
| Can exceed physical memory? | Yes (with paging to disk) | No |
| Same address different data? | Yes (in different processes) | No (each address unique) |
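The "same address, different data" row can be illustrated with a minimal sketch. The `page_tables` dictionary and the `translate` helper are hypothetical names chosen for illustration; the frame numbers mirror the diagram (Process A's code at physical 0x00100000, Process B's at 0x00200000), and the 4 KB page size is the one used later in the text.

```python
PAGE_SIZE = 4096  # 4 KB pages (assumed here; the text introduces this size later)

# Hypothetical per-process mapping tables: virtual page number -> frame number.
page_tables = {
    "A": {0x00000: 0x00100},  # A's code page lives in the frame at 0x00100000
    "B": {0x00000: 0x00200},  # B's code page lives in the frame at 0x00200000
}


def translate(process: str, virtual_address: int) -> int:
    """Translate a virtual address using the given process's own table."""
    vpn, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_tables[process][vpn]  # a KeyError stands in for a page fault
    return frame * PAGE_SIZE + offset


# The same virtual address resolves to a different physical address per process.
for name in ("A", "B"):
    print(name, hex(translate(name, 0x00000010)))
```

Because each process is translated through its own table, neither can name the other's frames at all, which is the isolation property the table describes.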
The Memory Management Unit (MMU) is the hardware component that makes execution-time binding possible. It sits between the CPU and the memory bus, intercepting every memory access and translating virtual addresses to physical addresses in real-time.
Where the MMU fits in the system:
    CPU CHIP
      CPU CORE
        Registers ───> ALU ───> Control Unit
                                    │
                                    ▼ Virtual Address
        L1 CACHE
                                    │
                                    ▼ Virtual Address
      MEMORY MANAGEMENT UNIT (MMU)
        TLB (Cache)
          Virtual Page → Physical Frame translation cache
        Page Table Walker
          (Consults page tables in memory on TLB miss)

        Input:  Virtual Address (from CPU)
        Output: Physical Address (to memory bus)

        Also generates:
          - Page fault (if page not in memory)
          - Protection fault (if access violates permissions)
                                    │
                                    ▼ Physical Address
    MEMORY BUS / CONTROLLER
                                    │
                                    ▼
    PHYSICAL RAM
      DIMM 0 | DIMM 1 | DIMM 2 | DIMM 3

MMU responsibilities:
| Function | Description |
|---|---|
| Address Translation | Convert virtual addresses to physical addresses using page tables |
| TLB Management | Cache recent translations for speed |
| Protection Checking | Enforce read/write/execute permissions and prevent cross-process access |
| Page Fault Generation | Signal the OS when accessing unmapped or swapped-out pages |
| Cacheability Control | Apply per-page caching attributes (cache disable, write-through) taken from the page table entry |
On modern processors (x86-64, ARM, RISC-V), the MMU is integrated into the CPU chip itself. It can translate a virtual address to a physical address in a single clock cycle when the translation is in the TLB. This speed is essential—without it, every memory access would require multiple memory lookups, slowing the system to a crawl.
The MMU translates virtual addresses to physical addresses using page tables maintained by the operating system. Memory is divided into fixed-size blocks called pages (virtual) and frames (physical), typically 4 KB each.
Breaking down a virtual address:
    VIRTUAL ADDRESS STRUCTURE (assuming 4 KB pages, 32-bit address)

    Virtual Address: 0x12345678
    Binary:          0001 0010 0011 0100 0101 0110 0111 1000

    ┌─────────────────────────────┬─────────────────────────────┐
    │ PAGE NUMBER                 │ OFFSET                      │
    │ (identifies which page)     │ (position within the page)  │
    ├─────────────────────────────┼─────────────────────────────┤
    │ Bits 31-12 (20 bits)        │ Bits 11-0 (12 bits)         │
    │ 0x12345 (page number)       │ 0x678 (offset = 1656 bytes) │
    └─────────────────────────────┴─────────────────────────────┘

    12 bits for offset      → 2^12 = 4096 bytes = 4 KB page size
    20 bits for page number → 2^20 = 1,048,576 possible pages

    TRANSLATION PROCESS:

    1. Extract page number: 0x12345
    2. Look up page number in page table
    3. Page table returns: frame number 0xABCDE
    4. Combine frame number with offset:
       Physical Address = (Frame Number << 12) | Offset
                        = (0xABCDE << 12) | 0x678
                        = 0xABCDE000 | 0x678
                        = 0xABCDE678

    Virtual: 0x12345678  →  Physical: 0xABCDE678
      Page   0x12345          Frame  0xABCDE
      Offset 0x678            Offset 0x678 (unchanged!)

KEY INSIGHT: The offset never changes! Only the page→frame mapping changes. Pages and frames are the same size, so offsets align perfectly.

The translation process step-by-step: for an instruction that accesses memory (e.g., MOV EAX, [0x12345678]), the MMU follows the four steps listed above. It extracts the page number, looks it up in the page table, obtains the frame number, and combines that frame number with the unchanged offset.

Page tables are data structures maintained by the operating system that store virtual-to-physical page mappings. Each process has its own page table (or set of page tables), ensuring address space isolation.
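The page-number/offset arithmetic from the worked example can be sketched as follows. This is a toy single-level model under stated assumptions: `page_table` is a plain dictionary standing in for the OS-maintained structure, and `split` and `translate` are hypothetical helper names, not part of any real API.

```python
PAGE_SHIFT = 12                       # 4 KB pages -> 12 offset bits
OFFSET_MASK = (1 << PAGE_SHIFT) - 1   # 0xFFF

# Toy single-level page table: virtual page number -> physical frame number.
page_table = {0x12345: 0xABCDE}


def split(virtual_address: int) -> tuple[int, int]:
    """Return (page number, offset) for a 32-bit virtual address."""
    return virtual_address >> PAGE_SHIFT, virtual_address & OFFSET_MASK


def translate(virtual_address: int, table: dict[int, int]) -> int:
    """Combine the mapped frame number with the unchanged offset."""
    vpn, offset = split(virtual_address)
    if vpn not in table:
        # In hardware this would raise a page fault for the OS to handle.
        raise LookupError(f"page fault: virtual page {vpn:#x} is not mapped")
    return (table[vpn] << PAGE_SHIFT) | offset


print(hex(translate(0x12345678, page_table)))  # 0xabcde678
```

Shifts and masks are used instead of division because the page size is a power of two, which is also why hardware can perform the split with no arithmetic at all.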
Page Table Entry (PTE) contents:
PAGE TABLE ENTRY (PTE) - Typical 64-bit Format (x86-64)

| Bit(s) | Field | Meaning |
|---|---|---|
| 63 | NX | No Execute: 1 = cannot execute code from this page (security!) |
| 62-52 | Avail | Available |
| 51-12 | Frame Number (40 bits) | Physical frame containing this page's data; the core of the translation |
| 11-9 | AVL | Available for OS |
| 8 | G | Global |
| 7 | PS | Page Size |
| 6 | D | Dirty: set by hardware when the page is written |
| 5 | A | Accessed: set by hardware when the page is read or written |
| 4 | PCD | Cache Disable |
| 3 | PWT | Write-Through |
| 2 | U/S | User/Supervisor: 1 = user accessible; 0 = kernel only |
| 1 | R/W | Read/Write: 1 = writable; 0 = read-only |
| 0 | P | Present: 1 = page is in memory; 0 = page fault if accessed |

Multi-level page tables:
A 48-bit virtual address space with 4 KB pages contains 2^36 pages. A flat page table needs one entry per page, and 2^36 entries at 8 bytes each come to 512 GB just for the page table! To avoid this, modern systems use multi-level (hierarchical) page tables that allocate entries only for the portions of the address space actually in use.
    x86-64 4-LEVEL PAGE TABLE HIERARCHY

    Virtual Address (48 bits used):
    ┌──────────┬──────────┬──────────┬──────────┬─────────────┐
    │ PML4     │ PDPT     │ PD       │ PT       │ Offset      │
    │ (9 bits) │ (9 bits) │ (9 bits) │ (9 bits) │ (12 bits)   │
    └──────────┴──────────┴──────────┴──────────┴─────────────┘
      Index      Index      Index      Index      Within Page

    Translation Walk:

    CR3 Register (Page Map L4 base address)
        │
        ▼
    PML4 (Page Map Level 4), entries 0 ... 511
        │  PML4[index] → PDPT base address
        ▼
    PDPT (Page Directory Pointer Table), entries 0 ... 511
        │  PDPT[index] → PD base address
        ▼
    PD (Page Directory), entries 0 ... 511
        │  PD[index] → PT base address
        ▼
    PT (Page Table), entries 0 ... 511
        │  PT[index] → Frame number
        ▼
    Physical Frame + Offset → PHYSICAL ADDRESS

Multi-level page tables are sparse: table entries are allocated only for address regions actually used. A process using 100 MB of virtual space doesn't need page table entries for the unused 127 TB. Upper-level entries simply have the Present bit cleared, so the lower-level tables beneath them never need to be allocated.
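The walk and its sparseness can be sketched in a few lines. This is a simplified model, not how any particular kernel implements it: `ToyMemory`, `map_page`, and `walk` are hypothetical names, table frames are handed out from a simple counter, and only the Present bit (bit 0) and the frame field (bits 51-12) of each entry are modelled.

```python
PRESENT = 1 << 0                          # P bit of an entry
FRAME_MASK = ((1 << 52) - 1) & ~0xFFF     # bits 51-12 hold the frame number


def indices(virtual_address: int) -> list[int]:
    """Split a 48-bit address into its PML4, PDPT, PD and PT indices."""
    return [(virtual_address >> shift) & 0x1FF for shift in (39, 30, 21, 12)]


class ToyMemory:
    """Hypothetical physical memory: frame number -> 512-entry table."""

    def __init__(self):
        self.tables = {}
        self.next_frame = 1

    def alloc_table(self) -> int:
        frame = self.next_frame
        self.next_frame += 1
        self.tables[frame] = [0] * 512    # every entry starts not-present
        return frame


def map_page(mem: ToyMemory, root: int, virtual_address: int, frame: int) -> None:
    """Install a mapping, allocating lower-level tables only when needed."""
    *upper, last = indices(virtual_address)
    table = root
    for index in upper:
        entry = mem.tables[table][index]
        if not entry & PRESENT:
            entry = (mem.alloc_table() << 12) | PRESENT
            mem.tables[table][index] = entry
        table = (entry & FRAME_MASK) >> 12
    mem.tables[table][last] = (frame << 12) | PRESENT


def walk(mem: ToyMemory, root: int, virtual_address: int) -> int:
    """Follow the four levels from the root table to a physical address."""
    frame = root
    for index in indices(virtual_address):
        entry = mem.tables[frame][index]
        if not entry & PRESENT:
            raise LookupError("page fault: entry not present")
        frame = (entry & FRAME_MASK) >> 12
    return (frame << 12) | (virtual_address & 0xFFF)


mem = ToyMemory()
root = mem.alloc_table()                  # plays the role of CR3
map_page(mem, root, 0x7FFF12345678, 0xAB678)
print(hex(walk(mem, root, 0x7FFF12345678)), len(mem.tables), "tables allocated")
```

Mapping one page allocates only four tables (one per level) rather than a flat array of 2^36 entries, and a second page in the same 2 MB region reuses all four.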
Walking a 4-level page table requires 4 memory accesses just to translate one address—before even accessing the target data! This is unacceptable for performance. The Translation Lookaside Buffer (TLB) solves this by caching recent translations.
TLB characteristics:
| Property | L1 DTLB (Data) | L1 ITLB (Instructions) | L2 TLB (Unified) |
|---|---|---|---|
| Entries | 64-128 | 64-128 | 1024-2048 |
| Associativity | 4-8 way | 4-8 way | 4-12 way |
| Lookup latency | 1 cycle | 1 cycle | 6-7 cycles |
| Page sizes | 4 KB, 2 MB | 4 KB, 2 MB | 4 KB, 2 MB, 1 GB |
    TLB LOOKUP PROCESS

    Virtual Address: 0x7FFF12345678
        │
        ▼
    TLB LOOKUP: search for Virtual Page 0x7FFF12345...

    TLB Entry Structure:
    ┌──────────────┬────────────────┬──────────────────────┐
    │ Virtual Page │ Physical Frame │ Flags (R/W,U/S,etc)  │
    ├──────────────┼────────────────┼──────────────────────┤
    │ 0x7FFF12345  │ 0x000AB678     │ U, R/W, P            │
    │ 0x00012340   │ 0x000CD123     │ K, R, P              │
    │ 0x7FFFF234   │ 0x000EF456     │ U, R/W, P            │
    │ ...          │ ...            │ ...                  │
    └──────────────┴────────────────┴──────────────────────┘

    TLB HIT:
      Frame found in 1 cycle!
      Continue to memory access

    TLB MISS:
      1. Walk page tables in memory
      2. Find frame number
      3. Insert into TLB
      4. Evict old entry if full
      5. Retry access (now TLB hit)
      Cost: ~10-100 cycles

    Physical Address: 0x000AB678678
    (Frame 0x000AB678 + Offset 0x678)

    EFFECTIVE ACCESS TIME CALCULATION:

    Given:
      - TLB hit rate: 99%
      - TLB lookup: 1 cycle
      - Page table walk: 100 cycles (4 memory accesses × 25 cycles each)
      - Memory access: 25 cycles

    Effective Access Time = TLB_hit_rate × (TLB + Memory) + TLB_miss_rate × (TLB + Walk + Memory)
                          = 0.99 × (1 + 25) + 0.01 × (1 + 100 + 25)
                          = 0.99 × 26 + 0.01 × 126
                          = 25.74 + 1.26
                          = 27 cycles

    Without TLB: Every access = 100 + 25 = 125 cycles
    With TLB:    Effective = 27 cycles (4.6× faster!)

TLB invalidation:
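When the OS changes page mappings (context switch, map new memory, unmap pages), it must invalidate the affected TLB entries, for example with a single-page invalidation instruction (invlpg on x86). Otherwise the TLB would keep serving stale translations. TLB invalidation is a significant performance consideration in OS design.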
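The hit, miss, eviction, and invalidation behavior described in this section can be sketched with a small model. Everything here is an illustrative assumption: `ToyTLB` is a hypothetical class, a dictionary lookup stands in for the page table walk, and least-recently-used eviction is chosen for simplicity (real TLBs use hardware-specific replacement policies). The `effective_access_time` helper reproduces the calculation above with the same cycle counts as defaults.

```python
from collections import OrderedDict


class ToyTLB:
    """Tiny translation cache with LRU eviction (an assumed policy)."""

    def __init__(self, capacity: int, page_table: dict[int, int]):
        self.capacity = capacity
        self.page_table = page_table      # stands in for the in-memory tables
        self.entries = OrderedDict()      # virtual page -> frame
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int) -> int:
        if vpn in self.entries:
            self.hits += 1
            self.entries.move_to_end(vpn)         # mark as most recently used
            return self.entries[vpn]
        self.misses += 1
        frame = self.page_table[vpn]              # the "page table walk"
        if len(self.entries) >= self.capacity:
            self.entries.popitem(last=False)      # evict least recently used
        self.entries[vpn] = frame
        return frame

    def invalidate(self, vpn: int) -> None:
        """Drop one cached translation (the role invlpg plays on x86)."""
        self.entries.pop(vpn, None)


def effective_access_time(hit_rate, tlb=1, walk=100, memory=25):
    """Average cycles per access, using the formula from the text."""
    return hit_rate * (tlb + memory) + (1 - hit_rate) * (tlb + walk + memory)


tlb = ToyTLB(capacity=2, page_table={0: 100, 1: 101, 2: 102})
tlb.lookup(0)                 # miss: walks the table and caches the result
tlb.lookup(0)                 # hit
tlb.page_table[0] = 555       # the OS remaps page 0 ...
stale = tlb.lookup(0)         # ... but the TLB still returns the old frame
tlb.invalidate(0)
fresh = tlb.lookup(0)         # miss again, now picks up the new mapping
print(stale, fresh, round(effective_access_time(0.99), 2))
```

The stale result before `invalidate` is the reason the OS must invalidate entries whenever it changes a mapping.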
Execution-time binding's per-access translation enables capabilities impossible with earlier binding methods. These capabilities define modern operating system architecture:
libc (C standard library) is used by nearly every process. With execution-time binding, one copy resides in physical memory but is mapped into thousands of processes—each seeing it at potentially different virtual addresses. This saves gigabytes of RAM on a typical system.
With ASLR, the stack, heap, libraries, and executable load at randomized virtual addresses on each execution, so an attacker cannot predict the address of a return gadget or variable. Many modern exploit mitigations depend on this, and disabling ASLR makes memory-corruption bugs considerably easier to exploit.
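Two of these capabilities, sharing one physical copy between processes and moving data without the process noticing, can be sketched with a toy model. All names here (`physical_memory`, `Process`, `move_frame`) are hypothetical, frame 0x300 is borrowed from the shared-library slot in the earlier diagram, and the write is only there to make the sharing visible (real shared library code is normally mapped read-only).

```python
PAGE_SIZE = 4096  # 4 KB pages, as in the text

# Toy physical memory: frame number -> page-sized buffer.
physical_memory = {0x300: bytearray(PAGE_SIZE), 0x3FF: bytearray(PAGE_SIZE)}


class Process:
    """A process reduced to its page table: virtual page -> frame."""

    def __init__(self, page_table):
        self.page_table = dict(page_table)

    def _locate(self, virtual_address):
        vpn, offset = divmod(virtual_address, PAGE_SIZE)
        return physical_memory[self.page_table[vpn]], offset

    def read(self, virtual_address):
        frame, offset = self._locate(virtual_address)
        return frame[offset]

    def write(self, virtual_address, value):
        frame, offset = self._locate(virtual_address)
        frame[offset] = value


def move_frame(old, new, processes):
    """Copy a frame's contents elsewhere and repoint every mapping to it."""
    physical_memory[new][:] = physical_memory[old]
    for process in processes:
        for vpn, frame in process.page_table.items():
            if frame == old:
                process.page_table[vpn] = new


# Both processes map the same frame, each at a different virtual page.
proc_a = Process({0x7F000: 0x300})
proc_b = Process({0x7E500: 0x300})

proc_a.write(0x7F000010, 42)
print(proc_b.read(0x7E500010))    # B sees the byte A wrote: one physical copy

move_frame(0x300, 0x3FF, [proc_a, proc_b])
print(proc_a.read(0x7F000010))    # same virtual address, new physical location
```

Neither process's virtual addresses change when the frame moves; only the mapping does, which is what makes relocation invisible to running code.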
Execution-time binding trades performance for flexibility. Every memory access includes translation overhead. Understanding this overhead and mitigation strategies is essential for systems programming.
| Overhead Source | Cost | Mitigation |
|---|---|---|
| TLB miss (page walk) | ~100 cycles | Larger TLB, huge pages (2 MB/1 GB), locality-aware allocation |
| TLB flush on context switch | Thousands of cycles | ASIDs (tag entries per-process), PCID on x86 |
| Page fault (not in memory) | ~10M cycles (disk) | Working set management, prefetching, SSD |
| Multi-level walk | 4 memory accesses | Page walk cache, MMU optimizations |
| Cross-CPU TLB shootdown | IPI + TLB flush | Minimize shared memory changes, batching |
Huge pages optimization:
Default 4 KB pages mean more TLB entries needed to cover working memory. With 2 MB pages, one TLB entry covers 512× more memory:
| Page Size | Entries to cover 1 GB | TLB Pressure |
|---|---|---|
| 4 KB | 262,144 entries | High |
| 2 MB | 512 entries | Low |
| 1 GB | 1 entry | Minimal |
Database systems, VMs, and high-performance applications often use huge pages to reduce TLB pressure.
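The figures in the table follow from simple division, as this short sketch shows. The helper names are hypothetical, and the 64-entry TLB used for the reach calculation is an assumption taken from the low end of the L1 TLB range quoted earlier.

```python
KB, MB, GB = 1024, 1024**2, 1024**3
PAGE_SIZES = {"4 KB": 4 * KB, "2 MB": 2 * MB, "1 GB": 1 * GB}


def entries_to_cover(region_bytes: int, page_bytes: int) -> int:
    """TLB entries needed to map a region, rounding up to whole pages."""
    return -(-region_bytes // page_bytes)


def tlb_reach(entries: int, page_bytes: int) -> int:
    """Bytes of memory a TLB with this many entries can map at once."""
    return entries * page_bytes


for label, size in PAGE_SIZES.items():
    needed = entries_to_cover(1 * GB, size)
    reach_mb = tlb_reach(64, size) / MB   # assumed 64-entry TLB
    print(f"{label}: {needed} entries for 1 GB, reach {reach_mb} MB")
```

With 4 KB pages a 64-entry TLB reaches only 256 KB, so a working set larger than that will miss regularly; the same TLB with 2 MB pages reaches 128 MB.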
TLB misses are among the most significant performance costs in modern systems. A program with poor locality (random access patterns across large memory) can spend more CPU time on page walks than actual computation. Designing for TLB-friendliness is a key optimization strategy.
We've explored execution-time binding—the foundation of modern memory management. Let's consolidate the key concepts:
What's Next:
With the three binding times understood, we'll examine relocatable code in depth—the specific code generation techniques that enable load-time and execution-time binding. You'll understand position-independent code (PIC), the Global Offset Table (GOT), and how compilers generate code that can run at any address.
You now understand execution-time binding—the mechanism underlying virtual memory and modern operating systems. You can explain how the MMU translates addresses, why page tables and TLBs are necessary, and what capabilities this architecture enables. This knowledge is fundamental to systems programming and OS design.