Address Binding: From Source to Execution

Level: Intermediate

Duration: 75 mins

Topic: Address Binding

Load-time Binding

The Flexibility Revolution

Compile-time binding's rigid requirement—that programs must load at predetermined addresses—became untenable as computing evolved. The desire to run multiple programs simultaneously, to share computing resources efficiently, and to develop programs without coordinating memory layouts demanded a more flexible approach.

Load-time binding represents the first major evolution beyond absolute addressing. In this model, the compiler generates relocatable code—code containing offsets rather than absolute addresses—and defers final address resolution until the program loads into memory. The operating system's loader examines available memory, chooses a suitable location, and adjusts all address references to match the chosen load point.

This seemingly simple change—deferring binding from compile time to load time—enabled multiprogramming and transformed computing from a one-program-at-a-time batch system to the multitasking environments we take for granted today.

What You Will Learn

By the end of this page, you will understand how load-time binding works, the structure of relocatable executables, the relocation process, and why this mechanism was pivotal for multiprogramming. You'll grasp the role of base registers and relocation tables, and understand both the capabilities and limitations of load-time binding.

The Multiprogramming Imperative

Before examining load-time binding mechanics, we must understand why it was necessary. The answer lies in multiprogramming—the ability to run multiple programs concurrently in memory.

The CPU utilization problem:

Early computers ran one program at a time. When a program waited for I/O (reading a tape, waiting for operator input), the expensive CPU sat idle. Consider a typical program's execution timeline:

Single-Program Execution

Single-Program Execution (Compile-time Binding):
 
Time →  0    10   20   30   40   50   60   70   80   90   100ms
        ├────┴────┴────┴────┴────┴────┴────┴────┴────┴────┤
CPU:    [COMPUTE][  I/O WAIT  ][COMPUTE][   I/O WAIT    ][COMPUTE]
        ████████░░░░░░░░░░░░░░░████████░░░░░░░░░░░░░░░░░░░████████
 
CPU Utilization: ~30% (actively computing only 30% of total time)
                The CPU idles during I/O operations.
 
════════════════════════════════════════════════════════════════════
 
Multiprogrammed Execution (Load-time Binding enables this):
 
Time →  0    10   20   30   40   50   60   70   80   90   100ms
        ├────┴────┴────┴────┴────┴────┴────┴────┴────┴────┤
CPU:    [Program A][  B   ][  A  ][  C  ][  B  ][  A  ][  C  ]
        ███████████████████████████████████████████████████████████
 
CPU Utilization: ~90%+ (while A waits for I/O, run B or C)
                Multiple programs share the CPU efficiently.

The memory problem:

To achieve multiprogramming, multiple programs must reside in memory simultaneously. But with compile-time binding, all programs are compiled for the same fixed addresses. Program A starts at 0x1000. Program B also starts at 0x1000. They cannot coexist.

Solutions considered:

•Fixed memory partitions: Divide memory into regions and compile each program for a specific region.
  Problem: Wastes memory, limits program sizes, and requires pre-planning.
•Swapping: Load one program, run it, swap it out, then load another.
  Problem: Slow due to disk I/O, and only one program is resident at a time, so it doesn't truly enable multiprogramming.
•Relocatable code with load-time binding: Let programs run from any address.
  Result: Flexible, efficient, and enables true multiprogramming.

Load-time binding emerged as the principled solution to the multiprogramming memory problem.

Economic Pressure

Early computers cost millions of dollars. A machine whose CPU sat idle 70% of the time represented massive waste. The economic pressure to maximize CPU utilization drove the development of multiprogramming, which in turn demanded flexible address binding. Load-time binding was an economic necessity, not just a technical improvement.

The Concept of Relocatable Code

Relocatable code is compiled code that uses offsets relative to a base address rather than absolute addresses. The final addresses are computed by adding the base address to each offset when the program loads.

The key insight:

Instead of embedding MOV EAX, [0x1040] (absolute address), the compiler generates MOV EAX, [base + 0x40] conceptually. The value 0x40 is an offset from the program's start. If the program loads at base address 0x1000, the final address is 0x1000 + 0x40 = 0x1040. If it loads at 0x5000, the address is 0x5000 + 0x40 = 0x5040.
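The arithmetic above can be sketched in a few lines of Python. The function name `bind_address` is purely illustrative and does not correspond to any real loader API.

```python
def bind_address(load_base: int, offset: int) -> int:
    """Return the final address for an offset once the load base is known."""
    return load_base + offset

# The same offset yields different final addresses for different load bases.
for base in (0x1000, 0x5000):
    print(f"base {base:#06x} + offset 0x40 -> {bind_address(base, 0x40):#06x}")
```

The offset stays the same in the file; only the base, which is unknown until load time, changes the result.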

Creating relocatable code:

Absolute vs. Relocatable Address Representation
Aspect	Compile-time Binding (Absolute)	Load-time Binding (Relocatable)
Address in executable	0x00401040 (final physical)	0x00000040 (offset from base)
Contains relocation info?	No	Yes — relocation table
Load address known	At compile time	At load time
Can load at different addresses?	No	Yes
Code in executable	MOV EAX, [0x00401040]	MOV EAX, [0x00000040] + relocation entry

Compiler's role in load-time binding:

•Generates code assuming the program starts at address 0 (or some canonical base)
•Uses offsets from this base for all addresses
•Creates a relocation table listing every location in the code that contains an address needing adjustment
•Outputs an object file with the code, data, symbol information, and relocation table

Relocatable Object File Structure

┌─────────────────────────────────────────────────────────────────┐
│                    RELOCATABLE OBJECT FILE                       │
├─────────────────────────────────────────────────────────────────┤
│ HEADER                                                          │
│   - Magic number identifying file format                        │
│   - Section sizes and offsets                                   │
│   - Entry point (as offset from base)                           │
├─────────────────────────────────────────────────────────────────┤
│ TEXT SECTION (Code)                                             │
│   - Machine instructions                                        │
│   - Addresses expressed as offsets (e.g., 0x0040 not 0x401040)  │
│   0x0000: PUSH EBP                                              │
│   0x0001: MOV EBP, ESP                                          │
│   0x0003: MOV EAX, [0x0040]    ← offset, needs relocation       │
│   ...                                                           │
├─────────────────────────────────────────────────────────────────┤
│ DATA SECTION                                                    │
│   - Initialized global variables                                │
│   0x0040: .long 100            (data = 100)                     │
├─────────────────────────────────────────────────────────────────┤
│ BSS SECTION                                                     │
│   - Uninitialized globals (just size, no content)               │
│   size: 4 bytes                                                 │
├─────────────────────────────────────────────────────────────────┤
│ SYMBOL TABLE                                                    │
│   main:    offset 0x0000 (function)                             │
│   process: offset 0x0020 (function)                             │
│   data:    offset 0x0040 (variable, data section)               │
│   result:  offset 0x0044 (variable, bss section)                │
├─────────────────────────────────────────────────────────────────┤
│ RELOCATION TABLE                                                │
│   Entry 1: offset 0x0004, type: absolute32, symbol: data        │
│   Entry 2: offset 0x000C, type: absolute32, symbol: result      │
│   Entry 3: offset 0x0011, type: relative32, symbol: process     │
│   ...                                                           │
│   (Lists EVERY address that needs adjustment at load time)      │
└─────────────────────────────────────────────────────────────────┘
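As a rough illustration, the layout in the diagram can be modelled with two small data classes. The class and field names (`ObjectFile`, `Relocation`, `entry_offset`, and so on) are assumptions made for this sketch, not any real file format's field names, and only the first relocation entry is modelled.

```python
from dataclasses import dataclass, field

@dataclass
class Relocation:
    offset: int      # where in the image the address field lives
    kind: str        # e.g. "absolute32" or "relative32"
    symbol: str      # symbol whose address the field refers to

@dataclass
class ObjectFile:
    entry_offset: int
    text: bytes
    data: bytes
    bss_size: int
    symbols: dict[str, int] = field(default_factory=dict)
    relocations: list[Relocation] = field(default_factory=list)

obj = ObjectFile(
    entry_offset=0x0000,
    # PUSH EBP; MOV EBP, ESP; MOV EAX, [0x40]
    text=bytes.fromhex("55 89E5 A140000000"),
    data=(100).to_bytes(4, "little"),
    bss_size=4,
    symbols={"main": 0x0000, "process": 0x0020, "data": 0x0040, "result": 0x0044},
    relocations=[Relocation(0x0004, "absolute32", "data")],
)

# Each relocation points at a 4-byte field that holds the symbol's offset.
for r in obj.relocations:
    stored = int.from_bytes(obj.text[r.offset:r.offset + 4], "little")
    print(f"{r.symbol}: field at {r.offset:#06x} holds {stored:#06x}")
```

The loop shows the link between the two tables: the bytes at the relocation's offset contain exactly the value the symbol table records for `data`.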

The Relocation Table: Heart of Load-time Binding

The relocation table (or relocation section) is the critical data structure that enables load-time binding. It tells the loader exactly which bytes in the executable contain addresses that need adjustment.

Structure of a relocation entry:

Each entry in the relocation table typically contains:

Relocation Entry Components

•Offset: Location within the section (or file) containing the address to be fixed up
•Type: The kind of relocation (absolute, PC-relative, etc.) indicating how to apply the adjustment
•Symbol: The symbol this address refers to (for external references) or section (for internal references)
•Addend: An additional constant to add (in some formats, embedded in the instruction)
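For a concrete encoding, 32-bit ELF stores each entry without an explicit addend as two little-endian words: the offset, and an info word that packs the symbol index above the type. The sketch below encodes and decodes that layout. The helper names and the symbol index 5 are arbitrary choices for illustration.

```python
import struct

R_386_32, R_386_PC32 = 1, 2   # relocation type numbers from the i386 ELF ABI

def pack_rel(offset: int, sym_index: int, rel_type: int) -> bytes:
    """Encode an Elf32_Rel entry: r_offset, then r_info = (sym << 8) | type."""
    return struct.pack("<II", offset, (sym_index << 8) | rel_type)

def unpack_rel(raw: bytes) -> tuple[int, int, int]:
    """Decode an Elf32_Rel entry into (offset, symbol index, type)."""
    offset, info = struct.unpack("<II", raw)
    return offset, info >> 8, info & 0xFF

entry = pack_rel(0x0004, sym_index=5, rel_type=R_386_32)
print(entry.hex(" "))
print(unpack_rel(entry))
```

In this layout the addend is not part of the entry; it is whatever value already sits in the field being patched.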

Relocation Table Example

Example Program's Relocation Table:
 
Code section (showing addresses that need relocation):
───────────────────────────────────────────────────────
0x0000: 55              ; PUSH EBP
0x0001: 89 E5           ; MOV EBP, ESP
0x0003: A1 40 00 00 00  ; MOV EAX, [0x00000040]  ← Address at offset 0x0004
        ^^ ^^^^^^^^^^^^
        opcode  address needs relocation!
0x0008: 83 C0 0A        ; ADD EAX, 10
0x000B: A3 44 00 00 00  ; MOV [0x00000044], EAX  ← Address at offset 0x000C
           ^^^^^^^^^^^^
           address needs relocation!
0x0010: E8 0B 00 00 00  ; CALL process (+0x0B)   ← Relative, may need adjustment
           ^^^^^^^^^^^^
           relative offset (PC-relative calls often don't need relocation)
 
Relocation Table:
───────────────────────────────────────────────────────
╔══════════╦══════════════════╦═════════════╦═════════════════════════╗
║ Offset   ║ Type             ║ Symbol      ║ Description             ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x0004   ║ R_386_32         ║ .data       ║ Absolute 32-bit address ║
║          ║ (absolute)       ║             ║ → Add base to this word ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x000C   ║ R_386_32         ║ .bss        ║ Absolute 32-bit address ║
║          ║ (absolute)       ║             ║ → Add base to this word ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x0011   ║ R_386_PC32       ║ process     ║ PC-relative call        ║
║          ║ (PC-relative)    ║             ║ → Adjust if external    ║
╚══════════╩══════════════════╩═════════════╩═════════════════════════╝

Why Mark Every Address?

Without the relocation table, the loader cannot distinguish address bytes from data bytes. The hex value 40 00 00 00 could be an address (0x00000040) or the integer 64. Only the relocation table reveals which interpretation is correct. This metadata is what makes relocation possible.
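A short sketch makes the ambiguity concrete. The set `relocation_offsets` stands in for a relocation table and is a made-up example, as is the helper `is_address`.

```python
raw = bytes.fromhex("40000000")

as_integer = int.from_bytes(raw, "little")   # read as plain data
as_address = int.from_bytes(raw, "little")   # read as an address field

# The bytes decode identically; nothing in them says which meaning applies.
print(as_integer, hex(as_address))

# Hypothetical relocation table: only the field at offset 0x0004 is an address.
relocation_offsets = {0x0004}

def is_address(offset: int) -> bool:
    """Only the table can tell the loader whether to patch a field."""
    return offset in relocation_offsets
```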

Types of relocations:

Different instruction encodings require different relocation types:

Relocation Type	Description	When Used	Adjustment Formula
R_386_32 (absolute)	32-bit absolute address	Direct memory access	S + A + B (symbol + addend + base)
R_386_PC32 (PC-relative)	32-bit PC-relative	Near calls, near jumps	S + A - P (symbol + addend - location)
R_386_GOT32	GOT entry	Position-independent	Points into Global Offset Table
R_386_PLT32	PLT entry	External function calls	Points into Procedure Linkage Table

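Where: S = symbol value (the symbol's offset in the unrelocated image), A = addend, B = base address chosen at load time, P = place being relocated. In the PC-relative case S and P shift by the same B, so the base cancels out. The ELF specification writes the absolute case as S + A because it defines S as the symbol's final address, which already includes the base.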
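The two main formulas can be checked with a small sketch. It assumes S is the symbol's offset in the unrelocated image, and the function names are illustrative only.

```python
MASK32 = 0xFFFFFFFF  # results are stored in 32-bit fields

def reloc_absolute(S: int, A: int, B: int) -> int:
    """Absolute reference: symbol offset plus addend plus load base."""
    return (S + A + B) & MASK32

def reloc_pc_relative(S: int, A: int, P: int) -> int:
    """PC-relative reference: distance from the patched field to the target."""
    return (S + A - P) & MASK32

B = 0x00010000

# Absolute reference to a variable at offset 0x40.
print(hex(reloc_absolute(S=0x40, A=0, B=B)))

# Call whose operand sits at 0x11 and targets 0x20. A is -4 because the CPU
# measures the displacement from the end of the 4-byte operand.
before = reloc_pc_relative(S=0x20, A=-4, P=0x11)
after = reloc_pc_relative(S=0x20 + B, A=-4, P=0x11 + B)
print(hex(before), hex(after))
```

The second result is the same before and after rebasing, which is why a PC-relative call within one image usually needs no load-time fixup.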

The Loading Process in Detail

The loader is the operating system component responsible for loading programs into memory and performing relocation. Let's trace through the complete loading process for a program using load-time binding.

Loading Process Steps

•Read executable header — The loader reads the executable file header to determine section sizes, entry point offset, and relocation table location.
•Allocate memory — The loader requests memory from the operating system. The OS finds a suitable free region and returns the base address (e.g., 0x10000).
•Load sections into memory — Code (.text) and initialized data (.data) are copied to contiguous addresses starting from the base, and space for the BSS section is reserved after them.
•Process relocation table — For each relocation entry, the loader calculates the final address and patches the corresponding bytes in memory.
•Resolve external symbols — With static linking, the linker has already resolved library references before the loader runs. With dynamic linking, the loader resolves them now or marks them for resolution at run time.
•Initialize BSS — The BSS section (uninitialized data) is zeroed or handled per platform convention.
•Transfer control — The loader jumps to the entry point (base address + entry offset) to begin program execution.
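The steps above can be sketched as a toy loader that works on a `bytearray` standing in for physical memory. Everything here is a simplification: `Image` is a hypothetical format, the caller picks the base instead of an allocator, and external symbol resolution is left out.

```python
from dataclasses import dataclass

@dataclass
class Image:
    """A hypothetical relocatable image, linked as if loaded at address 0."""
    contents: bytes            # text followed by initialized data
    bss_size: int              # bytes of zero-initialized storage after contents
    entry_offset: int          # entry point as an offset from the base
    reloc_offsets: list[int]   # offsets of 32-bit absolute address fields

def load(image: Image, memory: bytearray, base: int) -> int:
    """Copy the image to `base`, apply relocations, zero BSS, return the entry address."""
    end = base + len(image.contents)
    memory[base:end] = image.contents                  # load sections
    for off in image.reloc_offsets:                    # process relocation table
        pos = base + off
        old = int.from_bytes(memory[pos:pos + 4], "little")
        memory[pos:pos + 4] = ((old + base) & 0xFFFFFFFF).to_bytes(4, "little")
    memory[end:end + image.bss_size] = bytes(image.bss_size)   # initialize BSS
    return base + image.entry_offset                   # where control is transferred

# PUSH EBP; MOV EBP, ESP; MOV EAX, [0x08]; followed by one data word (100) at offset 8.
image = Image(
    contents=bytes.fromhex("5589E5A108000000") + (100).to_bytes(4, "little"),
    bss_size=4,
    entry_offset=0,
    reloc_offsets=[4],
)
memory = bytearray(0x200)
entry = load(image, memory, base=0x100)
print(hex(entry), memory[0x104:0x108].hex(" "))
```

After loading at 0x100, the address field that held 0x08 holds 0x108, which is where the data word now lives.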

Relocation Process Example

RELOCATION PROCESS WALKTHROUGH
═══════════════════════════════════════════════════════════════
 
Given:
  - Executable compiled with base address 0x00000000 (relocatable)
  - OS allocates memory starting at 0x00010000 (actual load address)
  - Relocation table entry: offset 0x0004, type: absolute32
 
Before Relocation (as stored in file):
  0x0003: A1 40 00 00 00    ; MOV EAX, [0x00000040]
              ^^^^^^^^^^^^
              Address 0x00000040 (relative to base 0)
 
Relocation Calculation:
  ┌─────────────────────────────────────────────────────────┐
  │  New Address = Old Address + Load Base                   │
  │                                                          │
  │  New Address = 0x00000040 + 0x00010000                   │
  │              = 0x00010040                                 │
  └─────────────────────────────────────────────────────────┘
 
After Relocation (in memory):
  0x10003: A1 40 00 01 00   ; MOV EAX, [0x00010040]
               ^^^^^^^^^^^^
               Address patched to 0x00010040 (little-endian: 40 00 01 00)
 
The instruction now correctly references the data at its loaded location!
 
═══════════════════════════════════════════════════════════════
COMPLETE EXAMPLE - Multiple Relocations:
 
Original file (offsets 0x0000-based):
  0x0000: [header]
  0x0100: A1 40 01 00 00    ; MOV EAX, [0x0140]    <- reloc #1
  0x0105: 03 05 44 01 00 00 ; ADD EAX, [0x0144]    <- reloc #2
  0x010B: A3 48 01 00 00    ; MOV [0x0148], EAX    <- reloc #3
  0x0140: 64 00 00 00       ; data = 100
  0x0144: 0A 00 00 00       ; increment = 10
  0x0148: 00 00 00 00       ; result = 0
 
Loaded at base 0x00400000:
  0x400100: A1 40 01 40 00    ; MOV EAX, [0x00400140]  ✓
  0x400105: 03 05 44 01 40 00 ; ADD EAX, [0x00400144]  ✓
  0x40010B: A3 48 01 40 00    ; MOV [0x00400148], EAX  ✓
  0x400140: 64 00 00 00       ; data (no relocation - it's data)
  0x400144: 0A 00 00 00       ; increment (no relocation)
  0x400148: 00 00 00 00       ; result (no relocation)
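The multiple-relocation example can be reproduced mechanically. This sketch builds the file image from the listing, then adds the base 0x00400000 to each of the three address fields. The list `reloc_offsets` is an assumption derived from the instruction encodings: each field starts after the opcode bytes.

```python
BASE = 0x00400000

# File image as listed, linked as if loaded at address 0.
image = bytearray(0x14C)
image[0x100:0x105] = bytes.fromhex("A140010000")      # MOV EAX, [0x0140]
image[0x105:0x10B] = bytes.fromhex("030544010000")    # ADD EAX, [0x0144]
image[0x10B:0x110] = bytes.fromhex("A348010000")      # MOV [0x0148], EAX
image[0x140:0x144] = (100).to_bytes(4, "little")      # data
image[0x144:0x148] = (10).to_bytes(4, "little")       # increment

# Offsets of the three 32-bit address fields (opcode bytes are skipped).
reloc_offsets = [0x101, 0x107, 0x10C]

for off in reloc_offsets:
    old = int.from_bytes(image[off:off + 4], "little")
    image[off:off + 4] = (old + BASE).to_bytes(4, "little")
    print(f"{off:#06x}: {old:#06x} -> {old + BASE:#010x}")
```

Each field gains exactly the base, so 0x0140 becomes 0x00400140, stored little-endian as 40 01 40 00. The data words are untouched because no relocation entry points at them.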

Hardware Support: Base and Relocation Registers

While software-only relocation (patching addresses at load time) works, some systems use hardware assistance to simplify the process. The base register (or relocation register) provides a more elegant solution for certain architectures.

Software-only relocation:

Loader modifies actual instruction bytes in memory
Once loaded, the code contains final absolute addresses
Simple but requires write access to code pages

Hardware-assisted relocation:

Program runs with original relative addresses unchanged
CPU adds base register value to every memory reference automatically
Code in memory stays identical to the file contents

Software Relocation:

Memory:
┌─────────────────────────┐
│ MOV EAX, [0x00410040]   │ ← Patched
│ ... (modified code)     │
└─────────────────────────┘
Base register: not used

CPU fetches instruction, uses
address 0x00410040 directly.

Addresses are modified once at load time.

Hardware-Assisted Relocation:

Memory:
┌─────────────────────────┐
│ MOV EAX, [0x00000040]   │ ← Original
│ ... (unmodified code)   │
└─────────────────────────┘
Base register: 0x00410000

CPU adds base register:
Effective = 0x00000040 + 0x00410000
         = 0x00410040

Addresses translated on every access.

The base register mechanism:

Systems with base register support (like early IBM mainframes and some microprocessors) include a special register that holds the program's load address. Every memory reference in the program is automatically added to this register value by the hardware.
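A toy model of the mechanism, assuming a single base register and no bounds checking: the class `BaseRegisterCPU` and its `load32` method are invented for this sketch and do not model any specific machine.

```python
class BaseRegisterCPU:
    """Toy model: every memory reference is offset by a base register."""

    def __init__(self, memory: bytearray):
        self.memory = memory
        self.base = 0

    def load32(self, logical_addr: int) -> int:
        physical = self.base + logical_addr      # the hardware adds the base
        return int.from_bytes(self.memory[physical:physical + 4], "little")

memory = bytearray(0x300)
memory[0x140:0x144] = (100).to_bytes(4, "little")   # process A's variable
memory[0x240:0x244] = (200).to_bytes(4, "little")   # process B's variable

cpu = BaseRegisterCPU(memory)
cpu.base = 0x100
a = cpu.load32(0x40)    # process A reads logical address 0x40
cpu.base = 0x200
b = cpu.load32(0x40)    # process B reads the same logical address
print(a, b)
```

Both processes issue the same logical address, and only the register value decides which physical word is read.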

Advantages of hardware-assisted relocation:

No code modification — The executable in memory remains identical to the file, enabling read-only code pages
Faster loading — The loader only sets the base register; no patching required
Context switch integration — The OS reloads the base register on each context switch, so each process runs at its own location without any change to its code

Disadvantages:

Runtime overhead — Every memory access requires an addition (though often pipelined)
Complexity — Requires hardware support not universally available
Limited flexibility — All addresses must be base-relative; can't easily support discontiguous memory regions

Historical Hardware: IBM System/360

The IBM System/360 (1964) used base registers to make relocation practical. Programs used base+displacement addressing, in which a general-purpose register designated as the base held an address derived from the load address. This hardware design was crucial for OS/360's multiprogramming capabilities. A similar base-plus-offset idea appears in x86 segment registers, and it is a conceptual ancestor of modern virtual memory.

Static Linking and Load-time Binding

Load-time binding works in conjunction with the linker to create complete executables. Understanding the relationship between linking and loading clarifies the full lifecycle of relocatable code.

The linker's role:

Multiple independently compiled object files (each with their own relocation tables) are combined by the linker:

Merge sections — Combine all .text sections, all .data sections, etc.
Resolve symbols — Match references in one file to definitions in another
Update relocation entries — Adjust offsets to reflect merged positions
Produce executable — Create a single file ready for the loader

Linking Multiple Object Files

STATIC LINKING PROCESS WITH LOAD-TIME BINDING
═══════════════════════════════════════════════════════════════
 
Object File A (main.o):                Object File B (math.o):
┌────────────────────────────┐        ┌────────────────────────────┐
│ .text:                     │        │ .text:                     │
│   0x0000: main()           │        │   0x0000: add()            │
│   0x0020: call add ← UNDEF │        │   0x0010: multiply()       │
│ .data:                     │        │ .data:                     │
│   0x0000: x = 10           │        │   0x0000: pi = 3.14        │
│ Relocation:                │        │ Relocation:                │
│   0x0021: needs 'add' addr │        │   (internal refs only)     │
│ Symbols:                   │        │ Symbols:                   │
│   main: defined            │        │   add: defined             │
│   add:  undefined          │        │   multiply: defined        │
└────────────────────────────┘        └────────────────────────────┘
            │                                      │
            └──────────────┬───────────────────────┘
                           ▼
                    LINKER STEP
            ┌─────────────────────────┐
            │ 1. Merge .text sections │
            │ 2. Merge .data sections │
            │ 3. Resolve 'add' symbol │
            │ 4. Update relocations   │
            │ 5. Output executable    │
            └─────────────────────────┘
                           ▼
            LINKED EXECUTABLE (a.out):
┌───────────────────────────────────────────────────────────────┐
│ .text (merged):                                               │
│   0x0000: main()               (from A)                       │
│   0x0020: call 0x0040         (resolved! add is at 0x0040)   │
│   0x0040: add()               (from B)                        │
│   0x0050: multiply()          (from B)                        │
│ .data (merged):                                               │
│   0x0060: x = 10              (from A)                        │
│   0x0064: pi = 3.14           (from B)                        │
│ Relocation Table:                                             │
│   (entries for all remaining absolute addresses)              │
│   0x0004: absolute ref to .data                               │
│   ...                                                         │
└───────────────────────────────────────────────────────────────┘
                           │
                           ▼ LOAD TIME
┌───────────────────────────────────────────────────────────────┐
│ Loader allocates at 0x00100000, applies relocations           │
│ All offsets become: offset + 0x00100000                       │
│ Program executes from 0x00100000                              │
└───────────────────────────────────────────────────────────────┘
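The merge-and-resolve part of the diagram can be sketched as a two-pass toy linker over .text sections only. The `Obj` class is hypothetical, and the section sizes 0x40 and 0x20 are assumptions chosen so that `add` lands at 0x0040 as in the diagram.

```python
from dataclasses import dataclass, field

@dataclass
class Obj:
    text_size: int
    defined: dict[str, int]                 # symbol -> offset within this file's .text
    call_sites: dict[int, str] = field(default_factory=dict)  # operand offset -> symbol

def link(objs: list[Obj]) -> tuple[dict[str, int], dict[int, int]]:
    """Merge .text sections in order; return (symbol table, resolved call targets)."""
    symbols: dict[str, int] = {}
    starts: list[int] = []
    cursor = 0
    for obj in objs:                        # pass 1: place sections, define symbols
        starts.append(cursor)
        for name, off in obj.defined.items():
            symbols[name] = cursor + off
        cursor += obj.text_size
    resolved: dict[int, int] = {}
    for obj, start in zip(objs, starts):    # pass 2: resolve references
        for operand_off, name in obj.call_sites.items():
            if name not in symbols:
                raise KeyError(f"undefined symbol: {name}")
            resolved[start + operand_off] = symbols[name]
    return symbols, resolved

main_o = Obj(text_size=0x40, defined={"main": 0x00}, call_sites={0x21: "add"})
math_o = Obj(text_size=0x20, defined={"add": 0x00, "multiply": 0x10})
symbols, resolved = link([main_o, math_o])
print({k: hex(v) for k, v in symbols.items()})
print({hex(k): hex(v) for k, v in resolved.items()})
```

Two passes are needed because a reference in an earlier file may name a symbol that is only defined in a later one.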

Static linking characteristics:

Aspect	Description
Library inclusion	Referenced library code copied into executable
Executable size	Larger (contains all library code)
Relocation	Symbol references resolved at link time; remaining address fixups applied at load time
Dependency	Self-contained, no external dependencies
Update libraries	Requires relinking executable
Load time	Faster (no library searching needed)

Limitations of Load-time Binding

Load-time binding significantly improved upon compile-time binding, but it still has important limitations that drove the development of execution-time binding.

Limitations of Load-time Binding

•Fixed once loaded — After loading, addresses cannot change. If the OS needs to move a process (for compaction or memory management), all addresses become invalid. The process would need to be terminated and reloaded.
•Memory fragmentation — As processes load and terminate, memory becomes fragmented. New processes may not find contiguous space, even if enough total memory exists. Compaction requires moving processes, which load-time binding doesn't support.
•No code sharing — Each process loads its own copy of libraries. Ten processes using the same library need ten copies in memory. This wastes significant memory and cache space.
•Relocation overhead at load time — Processing the relocation table adds latency to program startup. For large programs with many relocations, this delay is noticeable.
•Limited security — While better than compile-time (different processes load at different addresses), the address doesn't change during execution. Attackers still have a window to discover addresses after loading.
•Cannot exceed physical memory — A program must fit entirely in physical memory. Virtual memory's ability to exceed physical memory limits requires execution-time binding.

The fragmentation problem visualized:

Initial: Three processes loaded
 ┌──────────┬──────────┬──────────┬─────────────────────┐
 │ Process A│ Process B│ Process C│      Free           │
 │ (20 KB)  │ (30 KB)  │ (25 KB)  │     (25 KB)         │
 └──────────┴──────────┴──────────┴─────────────────────┘

After B terminates:
 ┌──────────┬──────────┬──────────┬─────────────────────┐
 │ Process A│   Free   │ Process C│      Free           │
 │ (20 KB)  │  (30 KB) │ (25 KB)  │     (25 KB)         │
 └──────────┴──────────┴──────────┴─────────────────────┘
              ↑
              Hole!

Problem: A new 40 KB process cannot load!
         Total free = 55 KB, but largest contiguous = 30 KB

With load-time binding: Cannot move C to create larger contiguous space
                        (C's addresses are fixed)

With execution-time binding: Can move C, update address translation
                             (addresses resolved on every access)
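The numbers in the diagram can be checked with a short calculation. The function name `free_stats` is illustrative, and the hole sizes come from the figure.

```python
def free_stats(holes_kb: list[int]) -> tuple[int, int]:
    """Return (total free, largest contiguous hole) for hole sizes given in KB."""
    return sum(holes_kb), max(holes_kb, default=0)

holes = [30, 25]            # the hole left by B, and the free region at the end
total, largest = free_stats(holes)
request = 40

print(f"total free = {total} KB, largest hole = {largest} KB")
print("fits without compaction:", request <= largest)
print("fits after compaction:  ", request <= total)
```

The request fails against the largest hole but would succeed against the total, and that gap is what compaction would close if loaded processes could be moved.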

The Fundamental Constraint

The core limitation of load-time binding is that binding happens exactly once — at load time. Any subsequent change in memory layout invalidates the bound addresses. Execution-time binding removes this constraint by resolving addresses continuously, on every memory access.

Real-World Examples of Load-time Binding

Load-time binding has been implemented in various systems throughout computing history and continues to exist in certain contexts today.

Historical and Modern Examples

•MS-DOS .EXE format — Unlike the simpler .COM format, DOS .EXE files include a relocation table. The DOS loader processes this table to load executables at available memory addresses. This enabled DOS to run larger programs and multiple TSR (Terminate and Stay Resident) programs.
•Windows PE (without ASLR) — Pre-Vista Windows PE executables without ASLR used primarily load-time binding. The executable had a preferred base address, and relocation occurred only if that address was unavailable. System DLLs were often 'rebased' to avoid conflicts.
•Classic Mac OS (68K) — Early Macintosh executables used segmented code with load-time relocation. The Segment Loader handled loading code segments and resolving addresses.
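•Static executables — A statically linked executable carries all of its library code. If it is built as a relocatable or position-independent image, the remaining relocations are applied once at startup. On systems with virtual memory it is often linked for a fixed virtual address instead and needs none.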
•Simple embedded systems with RTOS — Some real-time operating systems use load-time binding for application loading. Tasks are loaded at available addresses and relocated once.
•Object files (.o, .obj) — All compiled object files use relocatable addressing. The linker performs what could be called 'link-time binding,' and any remaining relocations occur at load time.

Examining Relocations (Linux)
Shell
# View relocation entries in an object file
$ readelf --relocs main.o
 
Relocation section '.rela.text' at offset 0x240 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000004  000500000002 R_X86_64_PC32     0000000000000000 global_var - 4
00000000000b  000600000002 R_X86_64_PC32     0000000000000000 helper_func - 4
000000000015  000700000004 R_X86_64_PLT32    0000000000000000 printf - 4
 
# View relocations that remain in a linked executable (output depends on how it was built)
$ readelf --relocs ./a.out
 
# Non-PIE static executables typically show few or no relocations here.
# Dynamically linked and PIE executables list relocations that the dynamic
# loader applies at startup or, with lazy binding, on the first call through the PLT.

The transition to execution-time binding:

Modern systems predominantly use execution-time binding with virtual memory and MMUs. However, load-time binding concepts persist:

Static executables still use load-time binding
The initial relocation of dynamically-linked executables happens at load time
Embedded systems without MMUs rely on load-time binding
Understanding load-time binding is essential for understanding the evolution to virtual memory

Summary: Load-time Binding

We've thoroughly explored load-time binding—the mechanism that freed programs from fixed memory addresses and enabled multiprogramming. Let's consolidate the key concepts:

Key Takeaways

•Load-time binding defers address resolution to program loading — Instead of embedding absolute addresses at compile time, the compiler generates relocatable code with offsets.
•The relocation table is essential — It identifies every address in the executable that needs adjustment, enabling the loader to patch addresses based on the actual load location.
•Load-time binding enabled multiprogramming — Programs could load at any available address, allowing multiple programs to coexist in memory for the first time.
•Hardware support (base registers) can optimize relocation — Some architectures add the base address on every memory access, eliminating the need to modify code.
•Static linking works with load-time binding — Multiple object files are merged, and remaining relocations are processed at load time.
•Addresses are fixed after loading — Unlike execution-time binding, load-time binding cannot accommodate memory reorganization during execution.
•Fragmentation remains a problem — Without the ability to move loaded programs, memory fragments over time, limiting effective utilization.

What's Next:

Load-time binding solved the multiprogramming problem but introduced new limitations around memory management and security. Execution-time binding represents the next evolution—addresses are translated on every memory access, enabling virtual memory, process relocation, memory sharing, and modern security features like ASLR. We'll explore this powerful mechanism next.

Page Complete

You now understand load-time binding—how relocatable code works, the role of the relocation table and loader, and why this mechanism was pivotal for multiprogramming. You can trace the loading process and understand both the capabilities and limitations that drove the evolution to execution-time binding.

│   Entry 2: offset 0x0010, type: absolute32, symbol: result      │
│   Entry 3: offset 0x0025, type: relative32, symbol: process     │
│   ...                                                           │
│   (Lists EVERY address that needs adjustment at load time)      │
└─────────────────────────────────────────────────────────────────┘

The Relocation Table: Heart of Load-time Binding

Structure of a relocation entry:

Each entry in the relocation table typically contains:

Relocation Entry Components

•Offset: Location within the section (or file) containing the address to be fixed up
•Type: The kind of relocation (absolute, PC-relative, etc.) indicating how to apply the adjustment
•Symbol: The symbol this address refers to (for external references) or section (for internal references)
•Addend: An additional constant to add (in some formats, embedded in the instruction)
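
One possible in-memory representation of such an entry is sketched below. The class and field names are hypothetical and simply mirror the four components listed above; real formats pack the same information into fixed-width binary records. The addend of -4 on the call entry is an assumption that follows the common x86 convention for PC-relative fields.

```python
from dataclasses import dataclass
from enum import Enum

class RelocType(Enum):
    ABSOLUTE32 = "absolute32"      # patch with a full 32-bit address
    PC_RELATIVE32 = "relative32"   # patch with a distance from the patched location

@dataclass(frozen=True)
class RelocationEntry:
    offset: int        # where the address word sits in the section
    type: RelocType    # how the adjustment is applied
    symbol: str        # symbol or section the address refers to
    addend: int = 0    # extra constant folded into the result

relocation_table = [
    RelocationEntry(0x0004, RelocType.ABSOLUTE32, ".data"),
    RelocationEntry(0x000C, RelocType.ABSOLUTE32, ".bss"),
    RelocationEntry(0x0011, RelocType.PC_RELATIVE32, "process", addend=-4),  # assumed addend
]

for entry in relocation_table:
    print(f"{entry.offset:#06x} {entry.type.value:<10} {entry.symbol}")
```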

Relocation Table Example


Example Program's Relocation Table:
 
Code section (showing addresses that need relocation):
───────────────────────────────────────────────────────
0x0000: 55              ; PUSH EBP
0x0001: 89 E5           ; MOV EBP, ESP
0x0003: A1 40 00 00 00  ; MOV EAX, [0x00000040]  ← Address at offset 0x0004
        ^^ ^^^^^^^^^^^^
        opcode  address needs relocation!
0x0008: 83 C0 0A        ; ADD EAX, 10
0x000B: A3 44 00 00 00  ; MOV [0x00000044], EAX  ← Address at offset 0x000C
           ^^^^^^^^^^^^
           address needs relocation!
0x0010: E8 10 00 00 00  ; CALL 0x0025 (rel +0x10) ← Relative to next instruction, may need adjustment
           ^^^^^^^^^^^^
           relative offset (PC-relative calls often don't need relocation)
 
Relocation Table:
───────────────────────────────────────────────────────
╔══════════╦══════════════════╦═════════════╦═════════════════════════╗
║ Offset   ║ Type             ║ Symbol      ║ Description             ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x0004   ║ R_386_32         ║ .data       ║ Absolute 32-bit address ║
║          ║ (absolute)       ║             ║ → Add base to this word ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x000C   ║ R_386_32         ║ .bss        ║ Absolute 32-bit address ║
║          ║ (absolute)       ║             ║ → Add base to this word ║
╠══════════╬══════════════════╬═════════════╬═════════════════════════╣
║ 0x0011   ║ R_386_PC32       ║ process     ║ PC-relative call        ║
║          ║ (PC-relative)    ║             ║ → Adjust if external    ║
╚══════════╩══════════════════╩═════════════╩═════════════════════════╝

Why Mark Every Address?

Types of relocations:

Different instruction encodings require different relocation types:

Relocation Type	Description	When Used	Adjustment Formula
R_386_32 (absolute)	32-bit absolute address	Direct memory access	S + A (symbol + addend; S is the symbol's run-time address, i.e. its offset plus the load base B)
R_386_PC32 (PC-relative)	32-bit PC-relative	Near calls, near jumps	S + A - P (symbol + addend - location)
R_386_GOT32	GOT entry	Position-independent	Points into Global Offset Table
R_386_PLT32	PLT entry	External function calls	Points into Procedure Linkage Table

Where: S = symbol value, A = addend, B = base address, P = place being relocated
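
The arithmetic behind the first two rows can be tried out with a small sketch. The helper names are hypothetical. Here `sym_offset` and `place_offset` are offsets within the image, and adding the load base to them gives the run-time values that play the roles of S and P. The call target 0x25 is the one implied by the `E8 10 00 00 00` encoding in the example above (next instruction at 0x15 plus displacement 0x10), and the addend of -4 is the usual x86 assumption for a 4-byte PC-relative field.

```python
MASK32 = 0xFFFFFFFF  # results are stored in a 32-bit field

def absolute32(sym_offset, addend, base):
    """Absolute reference: the result moves with the load base."""
    return (base + sym_offset + addend) & MASK32

def pc_relative32(sym_offset, addend, place_offset, base):
    """S + A - P: S and P both include the base, so the base cancels out."""
    S = base + sym_offset
    P = base + place_offset
    return (S + addend - P) & MASK32

for base in (0x00010000, 0x00400000):
    print(hex(base),
          hex(absolute32(0x40, 0, base)),
          hex(pc_relative32(0x25, -4, 0x11, base)))
```

The absolute result changes with the base while the PC-relative result does not, which is why calls within one module often need no load-time fixup.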

The Loading Process in Detail

Loading Process Steps

•Read executable header — The loader reads the executable file header to determine section sizes, entry point offset, and relocation table location.
•Allocate memory — The loader requests memory from the operating system. The OS finds a suitable free region and returns the base address (e.g., 0x10000).
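•Load sections into memory — Code (.text) and initialized data (.data) are copied to contiguous addresses starting from the base, and space for the BSS section is reserved after them. BSS has no contents in the file, so there is nothing to copy for it.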
•Process relocation table — For each relocation entry, the loader calculates the final address and patches the corresponding bytes in memory.
•Resolve external symbols — With static linking, references to library code were already resolved by the linker, so nothing remains to be done here. With dynamic linking, the loader resolves library addresses now or marks them for resolution at run time.
•Initialize BSS — The BSS section (uninitialized data) is zeroed or handled per platform convention.
•Transfer control — The loader jumps to the entry point (base address + entry offset) to begin program execution.
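
The steps above can be sketched as a toy loader that works on a `bytearray` standing in for physical memory. This is a minimal sketch under stated assumptions: `load_image` and its parameters are hypothetical names, the base address is chosen by the caller rather than by an allocator, only absolute 32-bit relocations are handled, and external symbol resolution is left out.

```python
import struct

def load_image(memory, base, text, data, bss_size, abs_relocs, entry_offset):
    """Copy sections to `base`, patch absolute addresses, zero BSS."""
    image = text + data
    end = base + len(image) + bss_size
    if end > len(memory):
        raise MemoryError("region does not fit in memory")
    memory[base:base + len(image)] = image                   # load sections
    for offset in abs_relocs:                                # process relocation table
        (value,) = struct.unpack_from("<I", memory, base + offset)
        struct.pack_into("<I", memory, base + offset, (value + base) & 0xFFFFFFFF)
    memory[base + len(image):end] = bytes(bss_size)          # initialize BSS
    return base + entry_offset                               # address to jump to

memory = bytearray(b"\xFF" * 0x200)                          # 0xFF marks untouched bytes
text = bytes.fromhex("55 89 E5 A1 40 00 00 00").ljust(0x40, b"\x90")  # padded with NOPs
data = struct.pack("<I", 100)                                # lands at offset 0x40
entry = load_image(memory, 0x100, text, data, bss_size=4,
                   abs_relocs=[0x04], entry_offset=0)
print(hex(entry), memory[0x103:0x108].hex(" "))
```

A real loader would jump to the returned address; the sketch only returns it so the result can be inspected.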

Relocation Process Example


RELOCATION PROCESS WALKTHROUGH
═══════════════════════════════════════════════════════════════
 
Given:
  - Executable compiled with base address 0x00000000 (relocatable)
  - OS allocates memory starting at 0x00010000 (actual load address)
  - Relocation table entry: offset 0x0004, type: absolute32
 
Before Relocation (as stored in file):
  0x0003: A1 40 00 00 00    ; MOV EAX, [0x00000040]
              ^^^^^^^^^^^^
              Address 0x00000040 (relative to base 0)
 
Relocation Calculation:
  ┌─────────────────────────────────────────────────────────┐
  │  New Address = Old Address + Load Base                   │
  │                                                          │
  │  New Address = 0x00000040 + 0x00010000                   │
  │              = 0x00010040                                 │
  └─────────────────────────────────────────────────────────┘
 
After Relocation (in memory):
  0x10003: A1 40 00 01 00   ; MOV EAX, [0x00010040]
               ^^^^^^^^^^^^
               Address patched to 0x00010040 (little-endian: 40 00 01 00)
 
The instruction now correctly references the data at its loaded location!
 
═══════════════════════════════════════════════════════════════
COMPLETE EXAMPLE - Multiple Relocations:
 
Original file (offsets 0x0000-based):
  0x0000: [header]
  0x0100: A1 40 01 00 00    ; MOV EAX, [0x0140]    <- reloc #1
  0x0105: 03 05 44 01 00 00 ; ADD EAX, [0x0144]    <- reloc #2
  0x010B: A3 48 01 00 00    ; MOV [0x0148], EAX    <- reloc #3
  0x0140: 64 00 00 00       ; data = 100
  0x0144: 0A 00 00 00       ; increment = 10
  0x0148: 00 00 00 00       ; result = 0
 
Loaded at base 0x00400000:
  0x400100: A1 40 01 40 00    ; MOV EAX, [0x00400140]  ✓
  0x400105: 03 05 44 01 40 00 ; ADD EAX, [0x00400144]  ✓
  0x40010B: A3 48 01 40 00    ; MOV [0x00400148], EAX  ✓
  0x400140: 64 00 00 00       ; data (no relocation - it's data)
  0x400144: 0A 00 00 00       ; increment (no relocation)
  0x400148: 00 00 00 00       ; result (no relocation)
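
The byte-level arithmetic in this walkthrough can be checked with a few lines of Python. The sketch rebuilds the three instructions and the data words of the complete example, then adds the load base to each 32-bit little-endian address word. The function name `relocate` and the list of relocation offsets (each one byte or two past the start of its instruction, where the address word begins) are assumptions made for this illustration.

```python
import struct

def relocate(image, reloc_offsets, base):
    """Return a copy of `image` with `base` added to each listed 32-bit word."""
    patched = bytearray(image)
    for offset in reloc_offsets:
        (old,) = struct.unpack_from("<I", patched, offset)
        struct.pack_into("<I", patched, offset, (old + base) & 0xFFFFFFFF)
    return patched

BASE = 0x00400000
image = bytearray(0x14C)
image[0x100:0x105] = bytes.fromhex("A1 40 01 00 00")     # MOV EAX, [0x0140]
image[0x105:0x10B] = bytes.fromhex("03 05 44 01 00 00")  # ADD EAX, [0x0144]
image[0x10B:0x110] = bytes.fromhex("A3 48 01 00 00")     # MOV [0x0148], EAX
image[0x140:0x14C] = struct.pack("<III", 100, 10, 0)     # data, increment, result

patched = relocate(image, [0x101, 0x107, 0x10C], BASE)
for start, length in ((0x100, 5), (0x105, 6), (0x10B, 5)):
    print(f"0x{BASE + start:06X}:", patched[start:start + length].hex(" ").upper())
```

Only the words named in the relocation list change; the data values at 0x140 to 0x14B are copied through untouched because they are not addresses.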

Hardware Support: Base and Relocation Registers

Software-only relocation:

Loader modifies actual instruction bytes in memory
Once loaded, the code contains final absolute addresses
Simple but requires write access to code pages

Hardware-assisted relocation:

Program runs with original relative addresses unchanged
CPU adds base register value to every memory reference automatically
Code remains position-independent in memory

Software Relocation:

Memory:
┌─────────────────────────┐
│ MOV EAX, [0x00410040]   │ ← Patched
│ ... (modified code)     │
└─────────────────────────┘
Base register: not used

CPU fetches instruction, uses
address 0x00410040 directly.

Addresses are modified once at load time.

Hardware-Assisted Relocation:

Memory:
┌─────────────────────────┐
│ MOV EAX, [0x00000040]   │ ← Original
│ ... (unmodified code)   │
└─────────────────────────┘
Base register: 0x00410000

CPU adds base register:
Effective = 0x00000040 + 0x00410000
         = 0x00410040

Addresses translated on every access.
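
The hardware-assisted variant can be simulated in a few lines. In the sketch below, `BaseRegisterCPU` is a hypothetical class, and the two bases (0x1000 and 0x3000) are small values chosen only to keep the demo memory tiny. The program always asks for logical address 0x40, and the simulated CPU adds whatever the base register currently holds.

```python
class BaseRegisterCPU:
    """Toy CPU that adds a base register to every memory reference."""

    def __init__(self, physical_memory):
        self.memory = physical_memory
        self.base = 0

    def load32(self, logical_address):
        physical = self.base + logical_address   # translation on every access
        return int.from_bytes(self.memory[physical:physical + 4], "little")

memory = bytearray(0x4000)
memory[0x1040:0x1044] = (100).to_bytes(4, "little")  # process A's data word
memory[0x3040:0x3044] = (200).to_bytes(4, "little")  # process B's data word

cpu = BaseRegisterCPU(memory)
cpu.base = 0x1000                  # run process A
value_a = cpu.load32(0x40)
cpu.base = 0x3000                  # switch to process B: only the register changes
value_b = cpu.load32(0x40)
print(value_a, value_b)
```

The same logical address reaches different physical locations depending on the register, and no instruction bytes were modified to achieve that.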

The base register mechanism:

Advantages of hardware-assisted relocation:

No code modification — The executable in memory remains identical to the file, enabling read-only code pages
Faster loading — The loader only sets the base register; no patching required
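Context switch integration — On a context switch the OS simply reloads the base register with the next process's base. Because the code in memory is never patched, one unmodified copy can also be shared by several processes on hardware that relocates code and data through separate base registers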

Disadvantages:

Runtime overhead — Every memory access requires an addition (though often pipelined)
Complexity — Requires hardware support not universally available
Limited flexibility — All addresses must be base-relative; can't easily support discontiguous memory regions

Historical Hardware: IBM System/360

Static Linking and Load-time Binding

Load-time binding works in conjunction with the linker to create complete executables. Understanding the relationship between linking and loading clarifies the full lifecycle of relocatable code.

The linker's role:

The linker combines multiple independently compiled object files, each with its own relocation table:

Merge sections — Combine all .text sections, all .data sections, etc.
Resolve symbols — Match references in one file to definitions in another
Update relocation entries — Adjust offsets to reflect merged positions
Produce executable — Create a single file ready for the loader

Linking Multiple Object Files


STATIC LINKING PROCESS WITH LOAD-TIME BINDING
═══════════════════════════════════════════════════════════════
 
Object File A (main.o):                Object File B (math.o):
┌────────────────────────────┐        ┌────────────────────────────┐
│ .text:                     │        │ .text:                     │
│   0x0000: main()           │        │   0x0000: add()            │
│   0x0020: call add ← UNDEF │        │   0x0010: multiply()       │
│ .data:                     │        │ .data:                     │
│   0x0000: x = 10           │        │   0x0000: pi = 3.14        │
│ Relocation:                │        │ Relocation:                │
│   0x0021: needs 'add' addr │        │   (internal refs only)     │
│ Symbols:                   │        │ Symbols:                   │
│   main: defined            │        │   add: defined             │
│   add:  undefined          │        │   multiply: defined        │
└────────────────────────────┘        └────────────────────────────┘
            │                                      │
            └──────────────┬───────────────────────┘
                           ▼
                    LINKER STEP
            ┌─────────────────────────┐
            │ 1. Merge .text sections │
            │ 2. Merge .data sections │
            │ 3. Resolve 'add' symbol │
            │ 4. Update relocations   │
            │ 5. Output executable    │
            └─────────────────────────┘
                           ▼
            LINKED EXECUTABLE (a.out):
┌───────────────────────────────────────────────────────────────┐
│ .text (merged):                                               │
│   0x0000: main()               (from A)                       │
│   0x0020: call 0x0040         (resolved! add is at 0x0040)   │
│   0x0040: add()               (from B)                        │
│   0x0050: multiply()          (from B)                        │
│ .data (merged):                                               │
│   0x0060: x = 10              (from A)                        │
│   0x0064: pi = 3.14           (from B)                        │
│ Relocation Table:                                             │
│   (entries for all remaining absolute addresses)              │
│   0x0004: absolute ref to .data                               │
│   ...                                                         │
└───────────────────────────────────────────────────────────────┘
                           │
                           ▼ LOAD TIME
┌───────────────────────────────────────────────────────────────┐
│ Loader allocates at 0x00100000, applies relocations           │
│ All offsets become: offset + 0x00100000                       │
│ Program executes from 0x00100000                              │
└───────────────────────────────────────────────────────────────┘
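
The merge-and-resolve part of this diagram can be sketched as follows. Everything here is a simplification under stated assumptions: object files are plain dictionaries, only .text sections and PC-relative call relocations are handled, and the `link` function, the dictionary keys, and the addend of -4 are hypothetical choices for the illustration.

```python
import struct

def link(objects):
    """Merge .text sections, build a global symbol table, patch PC-relative calls."""
    merged = bytearray()
    symbols = {}    # name -> offset in merged .text
    pending = []    # (offset in merged .text, symbol name, addend)
    for obj in objects:
        start = len(merged)                    # where this object's .text lands
        merged += obj["text"]
        for name, offset in obj["symbols"].items():
            symbols[name] = start + offset
        for offset, name, addend in obj["relocs"]:
            pending.append((start + offset, name, addend))
    for place, name, addend in pending:
        if name not in symbols:
            raise KeyError(f"undefined symbol: {name}")
        rel32 = (symbols[name] + addend - place) & 0xFFFFFFFF   # S + A - P
        struct.pack_into("<I", merged, place, rel32)
    return bytes(merged), symbols

main_o = {
    # 0x40 bytes of NOPs with a CALL placeholder (E8 + 4 zero bytes) at 0x20
    "text": b"\x90" * 0x20 + b"\xE8\x00\x00\x00\x00" + b"\x90" * 0x1B,
    "symbols": {"main": 0x00},
    "relocs": [(0x21, "add", -4)],
}
math_o = {
    "text": b"\xC3" * 0x20,                    # RET bytes standing in for real code
    "symbols": {"add": 0x00, "multiply": 0x10},
    "relocs": [],
}
executable, symbols = link([main_o, math_o])
print({name: hex(offset) for name, offset in symbols.items()})
print(executable[0x20:0x25].hex(" "))
```

After linking, `add` sits at 0x40 and `multiply` at 0x50, as in the diagram, and the call's displacement is measured from the instruction that follows it at 0x25.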

Static linking characteristics:

Aspect	Description
Library inclusion	Library code the program references is copied into the executable
Executable size	Larger (contains the library code it uses)
Relocation	Symbol references resolved by the linker; remaining address fixups applied at load time
Dependency	Self-contained, no external dependencies
Update libraries	Requires relinking executable
Load time	Faster (no library searching needed)

Limitations of Load-time Binding

Load-time binding significantly improved upon compile-time binding, but it still has important limitations that drove the development of execution-time binding.

Limitations of Load-time Binding

•Fixed once loaded — After loading, addresses cannot change. If the OS needs to move a process (for compaction or memory management), all addresses become invalid. The process would need to be terminated and reloaded.
•Memory fragmentation — As processes load and terminate, memory becomes fragmented. New processes may not find contiguous space, even if enough total memory exists. Compaction requires moving processes, which load-time binding doesn't support.
•No code sharing — Each process loads its own copy of libraries. Ten processes using the same library need ten copies in memory. This wastes significant memory and cache space.
•Relocation overhead at load time — Processing the relocation table adds latency to program startup. For large programs with many relocations, this delay is noticeable.
•Limited security — While better than compile-time (different processes load at different addresses), the address doesn't change during execution. Attackers still have a window to discover addresses after loading.
•Cannot exceed physical memory — A program must fit entirely in physical memory. Virtual memory's ability to exceed physical memory limits requires execution-time binding.

The fragmentation problem visualized:

Initial: Three processes loaded
 ┌──────────┬──────────┬──────────┬─────────────────────┐
 │ Process A│ Process B│ Process C│      Free           │
 │ (20 KB)  │ (30 KB)  │ (25 KB)  │     (25 KB)         │
 └──────────┴──────────┴──────────┴─────────────────────┘

After B terminates:
 ┌──────────┬──────────┬──────────┬─────────────────────┐
 │ Process A│   Free   │ Process C│      Free           │
 │ (20 KB)  │  (30 KB) │ (25 KB)  │     (25 KB)         │
 └──────────┴──────────┴──────────┴─────────────────────┘
              ↑
              Hole!

Problem: A new 40 KB process cannot load!
         Total free = 55 KB, but largest contiguous = 30 KB

With load-time binding: Cannot move C to create larger contiguous space
                        (C's addresses are fixed)

With execution-time binding: Can move C, update address translation
                             (addresses resolved on every access)
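
The numbers in this scenario are easy to verify. The sketch below assumes contiguous allocation, where a request must fit in a single hole, and models compaction as merging all holes into one; the function name `can_load` is hypothetical.

```python
def can_load(holes_kb, request_kb):
    """Contiguous allocation: the request must fit inside one hole."""
    return any(hole >= request_kb for hole in holes_kb)

holes = [30, 25]        # free regions after B terminates, in KB
request = 40            # size of the new process, in KB

total_free = sum(holes)
largest_hole = max(holes)
fits_now = can_load(holes, request)

# Compaction (possible only if loaded processes can be moved) merges the holes.
fits_after_compaction = can_load([total_free], request)

print(total_free, largest_hole, fits_now, fits_after_compaction)
```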

The Fundamental Constraint

Real-World Examples of Load-time Binding

Load-time binding has been implemented in various systems throughout computing history and continues to exist in certain contexts today.

Historical and Modern Examples

•MS-DOS .EXE format — Unlike the simpler .COM format, DOS .EXE files include a relocation table. The DOS loader processes this table to load executables at available memory addresses. This enabled DOS to run larger programs and multiple TSR (Terminate and Stay Resident) programs.
•Windows PE (without ASLR) — Pre-Vista Windows PE executables without ASLR used primarily load-time binding. The executable had a preferred base address, and relocation occurred only if that address was unavailable. System DLLs were often 'rebased' to avoid conflicts.
•Classic Mac OS (68K) — Early Macintosh executables used segmented code with load-time relocation. The Segment Loader handled loading code segments and resolving addresses.
•Static executables — Statically linked executables are the closest modern analogue of load-time binding. All the library code they need is included in the file, and the loader applies any relocations that remain once, at startup.
•Simple embedded systems with RTOS — Some real-time operating systems use load-time binding for application loading. Tasks are loaded at available addresses and relocated once.
•Object files (.o, .obj) — All compiled object files use relocatable addressing. The linker performs what could be called 'link-time binding,' and any remaining relocations occur at load time.

Examining Relocations (Linux)
# View relocation entries in an object file
$ readelf --relocs main.o
 
Relocation section '.rela.text' at offset 0x240 contains 3 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000000004  000500000002 R_X86_64_PC32     0000000000000000 global_var - 4
00000000000b  000600000002 R_X86_64_PC32     0000000000000000 helper_func - 4
000000000015  000700000004 R_X86_64_PLT32    0000000000000000 printf - 4
 
# View relocations in an executable (if not fully position-independent)
$ readelf --relocs ./a.out
 
# For statically linked executables, you'll see load-time relocations
# For dynamically linked PIE (position-independent) executables, most 
# relocations are runtime (execution-time binding)

The transition to execution-time binding:

Modern systems predominantly use execution-time binding with virtual memory and MMUs. However, load-time binding concepts persist:

Static executables still use load-time binding
The initial relocation of dynamically-linked executables happens at load time
Embedded systems without MMUs rely on load-time binding
Understanding load-time binding is essential for understanding the evolution to virtual memory

Summary: Load-time Binding

We've thoroughly explored load-time binding—the mechanism that freed programs from fixed memory addresses and enabled multiprogramming. Let's consolidate the key concepts:

Key Takeaways

•Load-time binding defers address resolution to program loading — Instead of embedding absolute addresses at compile time, the compiler generates relocatable code with offsets.
•The relocation table is essential — It identifies every address in the executable that needs adjustment, enabling the loader to patch addresses based on the actual load location.
•Load-time binding enabled multiprogramming — Programs could load at any available address, allowing multiple programs to coexist in memory for the first time.
•Hardware support (base registers) can optimize relocation — Some architectures add the base address on every memory access, eliminating the need to modify code.
•Static linking works with load-time binding — Multiple object files are merged, and remaining relocations are processed at load time.
•Addresses are fixed after loading — Unlike execution-time binding, load-time binding cannot accommodate memory reorganization during execution.
•Fragmentation remains a problem — Without the ability to move loaded programs, memory fragments over time, limiting effective utilization.

What's Next:
