Linkers And Loaders - Learning Module

Linking (Static and Dynamic)

The Art of Assembling the Pieces

Object files are incomplete—they contain machine code with "holes" where external references need to be filled in. The linker is the sophisticated tool that combines multiple object files and libraries into a complete, executable program. Without the linker, separate compilation would be impossible, and every program would need to be written as a single monolithic source file.

Linking comes in two fundamental flavors:

Static linking: Combines all code into a single executable at build time
Dynamic linking: Defers linking of shared libraries until runtime

The choice between these approaches involves tradeoffs in file size, memory usage, load time, update flexibility, and security. Understanding both mechanisms is essential for operating systems internals, performance optimization, and debugging library issues.

What You Will Learn

By the end of this page, you will understand how the linker resolves symbols, performs relocations, and produces executables. You'll master both static linking (archive libraries) and dynamic linking (shared objects), including the PLT/GOT mechanism that enables position-independent code.

What the Linker Does

The linker (ld on Unix, link.exe on Windows) performs several critical operations to transform a collection of object files into an executable:

1. Symbol Resolution

The linker builds a global symbol table from all input files, matching every undefined symbol reference to exactly one definition. This process must handle:

Strong symbols (functions, initialized globals): Exactly one definition allowed
Weak symbols: Can be overridden by strong symbols
Common symbols: Uninitialized globals that can be merged

2. Section Merging

The linker combines same-named sections from all object files:

All .text sections → single executable code section
All .data sections → single initialized data section
All .bss sections → single zero-initialized section

3. Relocation

After determining final addresses, the linker processes every relocation entry, computing and patching the correct addresses into the code.
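As a rough illustration of what "patching" means here, the sketch below applies one PC-relative relocation in the style of x86-64's R_X86_64_PC32, whose value is S + A - P (symbol address, plus addend, minus the address of the patched field). The function name apply_pc32 and the byte-buffer layout are assumptions made for this example, not a real linker API, and the field is written in host byte order, which matches x86-64 only on a little-endian host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: patch a 32-bit PC-relative field inside a section.
 *   section      - bytes of the merged output section
 *   section_len  - size of that buffer
 *   section_addr - final virtual address assigned to section[0]
 *   offset       - where the 4-byte field sits inside the section
 *   sym_addr     - final address of the referenced symbol (S)
 *   addend       - constant from the relocation entry (A)
 * Returns the value that was written into the field. */
int32_t apply_pc32(uint8_t *section, size_t section_len,
                   uint64_t section_addr, size_t offset,
                   uint64_t sym_addr, int64_t addend)
{
    assert(offset + 4 <= section_len);

    uint64_t place = section_addr + offset;                 /* P */
    int64_t value = (int64_t)(sym_addr + (uint64_t)addend - place);

    /* A PC32 field can only hold a signed 32-bit displacement. */
    assert(value >= INT32_MIN && value <= INT32_MAX);

    int32_t field = (int32_t)value;
    memcpy(section + offset, &field, sizeof field);         /* host byte order */
    return field;
}
```

For a call opcode at address 0x401000 whose operand sits at offset 1, a target at 0x401100 and an addend of -4 give 0x401100 - 4 - 0x401001 = 0xfb, which is the displacement from the end of the 5-byte call instruction to the target.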

4. Output Generation

Finally, the linker produces either:

An executable (fully linked, ready to run)
A shared library (dynamically linkable at runtime)
A relocatable object (partially linked, for further linking)

Basic Linking Commands
# Link object files into executable
gcc main.o utils.o -o program
 
# Verbose linking (see what's happening)
gcc -v main.o utils.o -o program
 
# Direct linker invocation (not recommended - too many options)
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
   /usr/lib/x86_64-linux-gnu/crt1.o \
   /usr/lib/x86_64-linux-gnu/crti.o \
   main.o utils.o \
   -lc \
   /usr/lib/x86_64-linux-gnu/crtn.o \
   -o program

Linker Error Patterns

Two common linker errors: 'undefined reference' means a symbol is used but never defined anywhere. 'multiple definition' means the same symbol is defined in multiple files. Understanding these helps diagnose linking problems quickly.

Symbol Resolution in Depth

Symbol resolution is surprisingly nuanced. The linker must follow specific rules to handle the complex scenarios that arise in real software:

Strong vs. Weak Symbols

Strong symbols are:

Function definitions
Initialized global variables

Weak symbols are:

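Uninitialized global variables, when the compiler emits them as common symbols (GCC with -fcommon; GCC 10 and later default to -fno-common, which makes them strong)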
Explicitly marked weak with __attribute__((weak))
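One common use of an explicitly weak symbol is a default implementation that another object file may override. The sketch below assumes GCC or Clang; the names default_log_level and effective_log_level are hypothetical.

```c
#include <assert.h>

/* Weak default: used only when no other object file in the link provides
 * a strong definition of default_log_level(). (Hypothetical name.) */
__attribute__((weak)) int default_log_level(void)
{
    return 1;
}

/* Callers just use the symbol; the linker decides which definition wins. */
int effective_log_level(void)
{
    int level = default_log_level();
    assert(level >= 0);   /* sanity check on whichever definition was linked */
    return level;
}
```

If a second file defined int default_log_level(void) without the attribute, the linker would bind every call to that strong definition and would not report a multiple-definition error.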

Symbol Resolution Rules
// Rule 1: Multiple strong symbols → ERROR
// file1.c
int counter = 0;    // Strong
 
// file2.c  
int counter = 0;    // Strong → "multiple definition of 'counter'" error
 
// Rule 2: Strong + Weak → Strong wins
// file1.c
int value = 42;     // Strong
 
// file2.c
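int value;          // Weak (uninitialized, emitted as a common symbol) → ignored, value = 42
 
// Rule 3: Multiple weak symbols → Linker picks one (arbitrary)
// file1.c
int x;              // Weak (common)
 
// file2.c
int x;              // Weak (common) → merged into one object; mismatched sizes are dangerous
// Note: GCC 10+ defaults to -fno-common, so Rules 2 and 3 as written give "multiple definition" errors there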

The Danger of Weak Symbol Merging

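When multiple weak (common) definitions exist, the linker merges them into a single object and does not check that their types agree. If the sizes differ (e.g., int x in one file, double x in another), both files read and write the same storage with different layouts, and when a smaller strong definition wins, stores made through the larger type overwrite neighboring data. The result is subtle memory corruption. GCC 10 and later default to -fno-common, which reports such duplicates as "multiple definition" errors instead. Always initialize global variables or use static for file-local data.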
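The three rules can be condensed into a small decision function. This is a simplified model, not how any particular linker is implemented: the enum names are invented for the example, and "keep the first weak definition" is just one possible policy for Rule 3.

```c
#include <assert.h>

/* Binding of a symbol table entry (simplified). */
enum binding { UNDEFINED, WEAK, STRONG };

/* What the linker does when a new definition meets the existing entry. */
enum outcome { KEEP_EXISTING, TAKE_INCOMING, DUPLICATE_ERROR };

enum outcome resolve(enum binding existing, enum binding incoming)
{
    assert(incoming != UNDEFINED);   /* only definitions are resolved here */

    if (existing == UNDEFINED)
        return TAKE_INCOMING;        /* first definition seen */
    if (existing == STRONG && incoming == STRONG)
        return DUPLICATE_ERROR;      /* Rule 1: two strong definitions */
    if (existing == WEAK && incoming == STRONG)
        return TAKE_INCOMING;        /* Rule 2: strong replaces weak */
    return KEEP_EXISTING;            /* Rule 2 (strong stays) or Rule 3 (first weak stays) */
}
```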

Processing Order Matters

The linker processes object files and libraries in command-line order. This has important implications for library linking:

When processing an object file, the linker adds all symbols to its tables
When processing a library, the linker only extracts members that resolve undefined symbols
Once a library is processed, it's not revisited

This means library order matters:

Library Order Demonstration
# WRONG: library before the object file that uses it
gcc -lm main.o -o program
# Error: undefined reference to 'sin'
# Because: -lm was processed first, nothing was undefined yet,
#          so nothing was extracted from libm
# (seen with static archives, or with shared libraries under --as-needed)
 
# CORRECT: object files first, then libraries
gcc main.o -lm -o program
# Works: main.o processed first, 'sin' is undefined,
#        then libm is searched, sin definition extracted
 
# For circular dependencies: list libraries multiple times
gcc main.o -lA -lB -lA
# If libA needs something from libB and vice versa
# (GNU ld alternative: -Wl,--start-group -lA -lB -Wl,--end-group)

Static Linking: Complete Self-Containment

Static linking combines all necessary code into a single executable at build time. The resulting executable is self-contained—it doesn't depend on external library files at runtime.

Static Libraries (Archives)

Static libraries (.a on Unix, .lib on Windows) are simply archives of object files. The ar tool creates and manages them:

Creating and Using Static Libraries
# Create object files
gcc -c math_utils.c -o math_utils.o
gcc -c string_utils.c -o string_utils.o
 
# Create static library (archive)
ar rcs libmyutils.a math_utils.o string_utils.o
# r = insert/replace, c = create, s = add symbol index
 
# View archive contents
ar -t libmyutils.a
# math_utils.o
# string_utils.o
 
# View symbols in archive
nm libmyutils.a
 
# Link against static library
gcc main.c -L. -lmyutils -o program
# -L. adds current directory to library search path
# -lmyutils searches each directory for libmyutils.so first, then libmyutils.a
 
# Force static linking (even if .so exists)
gcc main.c -L. -Wl,-Bstatic -lmyutils -Wl,-Bdynamic -o program

How the Linker Processes Archives

The linker treats archives differently from object files:

Scan the archive's symbol index (created by ar s)
Check for needed symbols that resolve undefined references
Extract only the required object files from the archive
Move on — the archive is not revisited

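This selective extraction is why archives can be large (containing many functions) without bloating every executable that uses them. Extraction works per member, not per function: only the object files that define a needed symbol get linked in, and everything else in those object files comes along with them.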
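The left-to-right behavior described above can be modeled in a few dozen lines. The sketch below is a deliberately simplified simulation, not linker source: the struct layouts, the fixed array sizes, and the function link_in_order are all assumptions for illustration. Objects always contribute their symbols; archive members are extracted only while they resolve something currently undefined, and an archive is never revisited unless it appears again in the input list.

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMS 32
#define MAX_MEMBERS 16

/* One object file: symbols it defines and symbols it references
 * (each list holds up to 4 names and may be NULL-terminated early). */
struct object {
    const char *defines[4];
    const char *needs[4];
};

/* One command-line input: a plain object (is_archive == 0, count == 1)
 * or an archive with `count` member objects. */
struct input {
    int is_archive;
    int count;
    const struct object *members;
};

struct symtab {
    const char *defined[MAX_SYMS];
    int ndefined;
    const char *undefined[MAX_SYMS];
    int nundefined;
};

static int find(const char **set, int n, const char *name)
{
    for (int i = 0; i < n; i++)
        if (strcmp(set[i], name) == 0)
            return i;
    return -1;
}

/* Add every symbol of an object to the tables. */
static void add_object(struct symtab *t, const struct object *o)
{
    for (int i = 0; i < 4 && o->defines[i]; i++) {
        int u = find(t->undefined, t->nundefined, o->defines[i]);
        if (u >= 0)   /* this definition satisfies a pending reference */
            t->undefined[u] = t->undefined[--t->nundefined];
        if (find(t->defined, t->ndefined, o->defines[i]) < 0) {
            assert(t->ndefined < MAX_SYMS);
            t->defined[t->ndefined++] = o->defines[i];
        }
    }
    for (int i = 0; i < 4 && o->needs[i]; i++) {
        if (find(t->defined, t->ndefined, o->needs[i]) < 0 &&
            find(t->undefined, t->nundefined, o->needs[i]) < 0) {
            assert(t->nundefined < MAX_SYMS);
            t->undefined[t->nundefined++] = o->needs[i];
        }
    }
}

/* Does this archive member define anything that is currently undefined? */
static int resolves_something(struct symtab *t, const struct object *o)
{
    for (int i = 0; i < 4 && o->defines[i]; i++)
        if (find(t->undefined, t->nundefined, o->defines[i]) >= 0)
            return 1;
    return 0;
}

/* Process inputs left to right; return how many symbols stay undefined. */
int link_in_order(const struct input *inputs, int ninputs)
{
    struct symtab t = {0};
    for (int i = 0; i < ninputs; i++) {
        const struct input *in = &inputs[i];
        if (!in->is_archive) {
            add_object(&t, &in->members[0]);
            continue;
        }
        char taken[MAX_MEMBERS] = {0};
        assert(in->count <= MAX_MEMBERS);
        /* Rescan this archive until no further member is useful. */
        for (int progress = 1; progress; ) {
            progress = 0;
            for (int m = 0; m < in->count; m++) {
                if (!taken[m] && resolves_something(&t, &in->members[m])) {
                    add_object(&t, &in->members[m]);
                    taken[m] = 1;
                    progress = 1;
                }
            }
        }
    }
    return t.nundefined;
}
```

Feeding the model an archive before the object that needs it leaves the reference undefined, while the reverse order resolves it; listing an archive a second time lets a later input's references reach back into it.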

Static Linking Advantages

• Self-contained: No external dependencies at runtime
• Faster startup: No dynamic linking overhead
• Simpler deployment: Single file to distribute
• Version isolation: Library updates don't affect existing executables
• Predictable behavior: No DLL hell or shared library conflicts

Static Linking Disadvantages

• Larger executables: Each program contains its own copy of library code
• Memory waste: Same library code duplicated across processes
• Update difficulty: Bug fixes require relinking all affected executables
• Security concerns: Vulnerability patches require rebuilding
• Licensing issues: GPL libraries may require source distribution

When to Use Static Linking

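Static linking is ideal for: embedded systems, portable executables, security-critical applications (minimizing attack surface), Go binaries (statically linked by default when cgo is not used), Rust binaries (which statically link their Rust dependencies by default but usually still link the C library dynamically), and containers where minimal dependencies are desired.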

Dynamic Linking: Shared Libraries

Dynamic linking defers the resolution of shared library symbols until the program is loaded or even later, during execution. Shared libraries (.so on Linux, .dll on Windows, .dylib on macOS) are mapped into every process that uses them, and their read-only code pages are held in physical memory once and shared across those processes.

Creating Shared Libraries

Shared libraries require position-independent code (PIC) — code that works correctly regardless of where it's loaded in memory:

Creating and Using Shared Libraries
# Compile with PIC (position-independent code)
gcc -c -fPIC math_utils.c -o math_utils.o
gcc -c -fPIC string_utils.c -o string_utils.o
 
# Create shared library (simple, unversioned)
gcc -shared -o libmyutils.so math_utils.o string_utils.o
 
# Alternative: set SONAME for versioning (recommended)
gcc -shared -Wl,-soname,libmyutils.so.1 \
    -o libmyutils.so.1.0.0 math_utils.o string_utils.o
 
# Create symlinks for versioning (-f replaces the unversioned file built above)
ln -sf libmyutils.so.1.0.0 libmyutils.so.1    # SONAME link, used at runtime
ln -sf libmyutils.so.1 libmyutils.so          # Linker link, used by -lmyutils
 
# Link program against shared library
gcc main.c -L. -lmyutils -o program
 
# Run (library must be findable)
LD_LIBRARY_PATH=. ./program
# Or install to standard location: /usr/lib, /usr/local/lib

Position-Independent Code (PIC)

PIC is crucial because shared libraries may be loaded at different virtual addresses in different processes. PIC uses relative addressing instead of absolute addresses:

PC-relative addressing: Code references use offsets from the instruction pointer
Global Offset Table (GOT): A per-process table containing absolute addresses of global symbols
Procedure Linkage Table (PLT): Enables lazy binding of function calls

Non-PIC vs PIC Code Generation
# C code: extern int global_var;
# int get_value() { return global_var; }
 
# WITHOUT PIC (absolute address):
# This hardcodes the address, breaking if the code is loaded elsewhere
movl   0x601040, %eax                   # Load global_var from a fixed address
 
# WITH PIC (GOT-relative):
# Works regardless of load address
mov    global_var@GOTPCREL(%rip), %rax  # Load address of global_var from its GOT entry
mov    (%rax), %eax                     # Load the value through that address
 
# The GOT entry contains the actual runtime address of global_var
# The dynamic linker fills in GOT entries at load time

PIC Overhead

PIC adds a small runtime overhead (extra indirection through GOT/PLT). On x86-64, this is minimal due to RIP-relative addressing. On 32-bit x86, PIC required using a register (typically EBX) as a base pointer, causing more significant overhead.

GOT and PLT: The Dynamic Linking Machinery

The Global Offset Table (GOT) and Procedure Linkage Table (PLT) are the core mechanisms enabling dynamic linking. Understanding them is essential for security research, debugging, and low-level optimization.

Global Offset Table (GOT)

The GOT is a table of pointers, one for each global symbol accessed by the program. Key properties:

Each process has its own GOT (even if sharing code)
Contains absolute addresses of global data and functions
Filled in by the dynamic linker at load time
Located in writable memory (security implications)

Procedure Linkage Table (PLT)

The PLT enables lazy binding — function addresses are resolved only when first called, reducing startup time. Here's how it works:

First call to a function: PLT stub invokes the dynamic linker
Dynamic linker resolves address: Finds the function in the shared library
GOT updated: The resolved address is written to GOT
Subsequent calls: Go directly through GOT to the function

PLT/GOT Mechanism Explained
# Calling printf via PLT
 
# In main code:
call   printf@plt              # Call PLT stub
 
# PLT stub (printf@plt):
jmp    *printf@GOTPLT(%rip)    # Jump through GOT entry
pushq  $0                      # Push relocation index
jmp    .plt                    # Jump to PLT[0] (resolver)
 
# First call: GOT contains address of "pushq" instruction
# → Falls through to resolver
# → Dynamic linker finds printf, updates GOT
# → Jumps to actual printf
 
# Subsequent calls: GOT contains actual printf address
# → jmp goes directly to printf (no resolver needed)
 
# PLT[0] (resolver trampoline; GOT[1] and GOT[2] are schematic names for reserved GOT slots):
pushq  GOT[1](%rip)            # Push link_map pointer
jmp    *GOT[2](%rip)           # Jump to _dl_runtime_resolve
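The same control flow can be imitated in portable C with a function pointer standing in for one GOT slot. This is only an analogy under stated assumptions: real PLT stubs are machine code emitted by the linker, and the names got_slot, resolver_stub, and call_square are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*func_t)(int);

/* The "library" function the slot should eventually point to. */
static int real_square(int x) { return x * x; }

static int resolver_stub(int x);   /* forward declaration */

/* A one-entry stand-in for the GOT: it starts out pointing at the resolver. */
static func_t got_slot = resolver_stub;
static int resolver_calls = 0;

/* Plays the role of PLT[0] plus _dl_runtime_resolve: find the target,
 * patch the slot, then forward the original call. */
static int resolver_stub(int x)
{
    resolver_calls++;
    got_slot = real_square;        /* GOT updated with the resolved address */
    return got_slot(x);
}

/* Plays the role of the per-function PLT stub: always an indirect call
 * through the slot, whatever it currently holds. */
int call_square(int x)
{
    assert(got_slot != NULL);
    return got_slot(x);
}

/* How many times the slow path has run so far. */
int resolver_invocations(void)
{
    return resolver_calls;
}
```

The first call_square goes through resolver_stub and rewrites the slot; every later call reaches real_square directly, so resolver_invocations() stays at 1.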

Viewing PLT/GOT (objdump)
$ objdump -d program | grep -A3 'printf@plt'
0000000000401030 <printf@plt>:
  401030:       ff 25 e2 2f 00 00       jmp    *0x2fe2(%rip)  # 404018
  401036:       68 00 00 00 00          push   $0x0
  40103b:       e9 e0 ff ff ff          jmp    401020 <_init+0x20>
 
$ readelf -r program | grep printf
0000000000404018  R_X86_64_JUMP_SLOT  printf@GLIBC_2.2.5
 
# 404018 is the GOT entry for printf
# JUMP_SLOT relocation: the dynamic linker fills this slot, on first call under lazy binding

Security: GOT Overwrites

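The GOT is writable, making it a target for attackers. GOT overwrite attacks replace function pointers with attacker-controlled addresses. The main mitigation is RELRO (Relocation Read-Only). Partial RELRO (-Wl,-z,relro) makes the non-PLT part of the GOT read-only after loading but leaves the lazily bound PLT slots writable. Full RELRO (-Wl,-z,relro,-z,now) resolves every symbol at load time, which disables lazy binding, and then marks the entire GOT read-only.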

The Dynamic Linker (ld.so)

The dynamic linker (also called the runtime linker or ld.so) is a special shared library that the kernel loads along with your program. It's responsible for:

Loading shared libraries listed in the ELF's dynamic section
Performing relocations for shared library symbols
Handling lazy binding through PLT
Managing library dependencies (loading libraries needed by other libraries)
Symbol versioning for ABI compatibility

Dynamic Linker Identification
# See which dynamic linker an executable needs
$ readelf -l program | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
 
# List shared library dependencies
$ ldd program
    linux-vdso.so.1 (0x00007fff...)
    libmyutils.so => ./libmyutils.so (0x00007f...)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)
    /lib64/ld-linux-x86-64.so.2 (0x00007f...)
 
# Show detailed library loading
$ LD_DEBUG=libs ./program 2>&1 | head -20
# Shows search paths, found libraries, load order
 
# Show symbol resolution
$ LD_DEBUG=symbols ./program 2>&1 | head -20

Library Search Order

The dynamic linker searches for libraries in this order:

DT_RPATH in executable (deprecated; ignored if DT_RUNPATH is present)
LD_LIBRARY_PATH environment variable (ignored in secure-execution mode, e.g. setuid programs)
DT_RUNPATH in executable (modern replacement for RPATH)
ldconfig cache (/etc/ld.so.cache)
Default paths (/lib, /usr/lib, etc.)
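The ordering above can be written down as a small function that assembles the search list. This is a sketch of the documented glibc ordering, not code from ld.so: build_search_order and its parameters are hypothetical, each path argument is treated as one opaque string rather than a colon-separated list, and the default directories are reduced to /lib and /usr/lib.

```c
#include <assert.h>
#include <stddef.h>

/* Fill `out` with search locations in priority order.
 *   rpath, ld_library_path, runpath - may be NULL when not set
 *   secure_mode - nonzero for setuid-style execution, where
 *                 LD_LIBRARY_PATH is ignored
 * Returns the number of entries written (at most 6). */
size_t build_search_order(const char *rpath, const char *ld_library_path,
                          const char *runpath, int secure_mode,
                          const char **out, size_t cap)
{
    size_t n = 0;
    assert(cap >= 6);

    if (rpath && !runpath)               /* DT_RPATH only counts without DT_RUNPATH */
        out[n++] = rpath;
    if (ld_library_path && !secure_mode)
        out[n++] = ld_library_path;
    if (runpath)
        out[n++] = runpath;
    out[n++] = "/etc/ld.so.cache";       /* ldconfig cache */
    out[n++] = "/lib";                   /* default paths */
    out[n++] = "/usr/lib";
    return n;
}
```

Returning the list instead of opening files keeps the ordering logic easy to inspect; a real loader would try each location in turn and stop at the first match.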

Library Path Configuration
# Temporary: Set library path for single run
LD_LIBRARY_PATH=/custom/path ./program
 
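# Embed a run-time search path in the executable at link time
gcc main.c -L. -lmyutils -Wl,-rpath,'$ORIGIN/lib' -o program
# $ORIGIN expands to directory containing the executable
# -rpath is stored as DT_RUNPATH when the linker defaults to new dtags, otherwise as DT_RPATH;
# add -Wl,--enable-new-dtags to request DT_RUNPATH explicitly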
 
# System-wide: Add to ldconfig
echo "/custom/path" | sudo tee /etc/ld.so.conf.d/mylibs.conf
sudo ldconfig    # Rebuild cache
 
# View ldconfig cache
ldconfig -p | grep myutils

Debugging Library Issues

When programs fail with 'library not found', use ldd to see missing dependencies. Use LD_DEBUG=libs to trace the search process. Common fixes: set LD_LIBRARY_PATH, embed RUNPATH, or install libraries to standard locations.

Static vs Dynamic: Making the Choice

Choosing between static and dynamic linking involves weighing multiple factors. Let's analyze the tradeoffs systematically:

Static vs Dynamic Linking Comparison
Aspect	Static Linking	Dynamic Linking
Executable Size	Larger (includes all library code)	Smaller (libraries separate)
Memory Usage	No sharing between processes	Code pages shared across processes
Load Time	Faster (no runtime linking)	Slower (must load and link libraries)
Deployment	Single file, no dependencies	Must ensure libraries are present
Updates	Relink or rebuild needed for library fixes	Library update fixes all programs
Security Patches	Slow rollout (each program rebuilt)	Fast rollout (one library update)
Symbol Resolution	All at build time	Mix of build-time and run-time

When to Choose Static Linking

Embedded systems with no shared library infrastructure
Single-binary deployments (containers, serverless)
Security-critical applications where minimizing dependencies is paramount
Performance-critical hotpaths where PLT overhead matters
Portable executables that must run on multiple Linux distributions

When to Choose Dynamic Linking

System utilities that leverage system libraries
Desktop applications where memory sharing benefits multiple users
Plugin architectures requiring runtime extensibility
License compliance (LGPL libraries can be dynamically linked without source release)
Large deployments where security updates must propagate quickly

Memory Comparison Example
# Compare memory usage: static vs dynamic
 
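# Static: 100 different programs, each embedding up to 2MB of libc code
# Memory used (worst case): 100 × 2MB = 200MB
# (processes running the same static binary still share its code pages)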
 
# Dynamic: 100 processes, shared 2MB libc
# Code sections shared: ~2MB shared
# Data sections per-process: 100 × small
# Memory used: ~2MB + overhead
 
# Check shared memory with pmap
$ pmap <pid> | grep libc
00007f... 2032K r-x-- libc-2.31.so  ← Shared code (marked 'x')
00007f...  20K r---- libc-2.31.so  ← Shared readonly
00007f...   8K rw--- libc-2.31.so  ← Per-process writable
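The arithmetic in this comparison is simple enough to capture in two helper functions. They are a back-of-the-envelope model only: the function names are hypothetical, sizes are in KiB, and effects such as page granularity, copy-on-write, and partial extraction from static archives are ignored.

```c
#include <assert.h>

/* Worst case for static linking: every distinct program carries its own
 * copy of the library code. */
unsigned long static_total_kib(unsigned long programs,
                               unsigned long lib_code_kib)
{
    return programs * lib_code_kib;
}

/* Dynamic linking: shared (read-only) pages are counted once, writable
 * pages once per process. */
unsigned long dynamic_total_kib(unsigned long processes,
                                unsigned long shared_kib,
                                unsigned long private_kib_per_process)
{
    assert(processes > 0);
    return shared_kib + processes * private_kib_per_process;
}
```

With the pmap figures above (2032K + 20K shared, 8K writable per process), 100 processes come to 2052 + 100 × 8 = 2852 KiB, against 100 × 2048 = 204800 KiB for the static worst case.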

Hybrid Approaches

You can mix static and dynamic linking. Critical libraries can be linked statically for performance/reliability, while less critical ones are dynamically linked. Use -Wl,-Bstatic -lfoo -Wl,-Bdynamic to control linking mode per library.

Summary: Mastering Linking

Linking transforms incomplete object files into executable programs. Whether static or dynamic, linking resolves symbols, merges sections, and produces the final binary that the operating system can load and execute.

Key Takeaways

• The linker's core jobs are symbol resolution, section merging, and relocation, which together transform incomplete object files into complete executables.
• Static linking embeds all library code into the executable, creating self-contained binaries at the cost of size and update flexibility.
• Dynamic linking defers library loading until runtime, enabling code sharing and easy updates at the cost of deployment complexity.
• GOT and PLT enable position-independent code, with lazy binding deferring function resolution until first call.
• The dynamic linker (ld.so) loads shared libraries, performs runtime relocations, and manages the entire dynamic linking process.
• Symbol resolution rules matter: strong symbols, weak symbols, and library ordering can cause subtle bugs if misunderstood.

What's next:

With linking understood, we now turn to loading—the process by which the operating system takes a linked executable and creates a running process. The next page explores how the kernel sets up the process address space, maps the executable into memory, and transfers control to the program.

Page Complete

You now understand both static and dynamic linking—from archive libraries to shared objects, from symbol resolution to GOT/PLT mechanics. This knowledge is essential for debugging linking errors, optimizing deployments, and understanding how modern software is assembled.
