Object files are incomplete—they contain machine code with "holes" where external references need to be filled in. The linker is the sophisticated tool that combines multiple object files and libraries into a complete, executable program. Without the linker, separate compilation would be impossible, and every program would need to be written as a single monolithic source file.
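Linking comes in two fundamental flavors:
- Static linking: library code is copied into the executable at build time.
- Dynamic linking: references to shared libraries are resolved when the program is loaded, or later during execution.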
The choice between these approaches involves tradeoffs in file size, memory usage, load time, update flexibility, and security. Understanding both mechanisms is essential for operating systems internals, performance optimization, and debugging library issues.
By the end of this page, you will understand how the linker resolves symbols, performs relocations, and produces executables. You'll master both static linking (archive libraries) and dynamic linking (shared objects), including the PLT/GOT mechanism that enables position-independent code.
The linker (ld on Unix, link.exe on Windows) performs several critical operations to transform a collection of object files into an executable:
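The linker builds a global symbol table from all input files, matching every undefined symbol reference to exactly one definition. This process must handle duplicate definitions, the distinction between strong and weak symbols, and symbols that are defined only inside libraries.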
The linker combines same-named sections from all object files:
- .text sections → single executable code section
- .data sections → single initialized data section
- .bss sections → single zero-initialized section

After determining final addresses, the linker processes every relocation entry, computing and patching the correct addresses into the code.
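The merge-then-patch sequence can be sketched in a few lines of C. This is a minimal illustration, not how any particular linker is implemented: the helper names (align_up, layout_sections, apply_pc32) are hypothetical, only one relocation kind is shown (a 32-bit PC-relative field, computed as S + A - P in the style of R_X86_64_PC32), and a little-endian host is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round offset up to a power-of-two alignment. */
static uint64_t align_up(uint64_t offset, uint64_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (offset + align - 1) & ~(align - 1);
}

/* Place count input sections back to back inside one output section.
 * sizes[i] and aligns[i] describe input i; out_addrs[i] receives its
 * final address. Returns the first address past the merged section. */
static uint64_t layout_sections(uint64_t base, const uint64_t *sizes,
                                const uint64_t *aligns, uint64_t *out_addrs,
                                size_t count) {
    uint64_t cursor = base;
    for (size_t i = 0; i < count; i++) {
        cursor = align_up(cursor, aligns[i]);
        out_addrs[i] = cursor;
        cursor += sizes[i];
    }
    return cursor;
}

/* Patch a 32-bit PC-relative field: stored value = S + A - P, where
 * S is the symbol address, A the addend, and P the address of the
 * field being patched. */
static void apply_pc32(uint8_t *section, uint64_t section_addr,
                       uint64_t field_offset, uint64_t sym_addr,
                       int64_t addend) {
    int64_t place = (int64_t)(section_addr + field_offset);
    int64_t delta = (int64_t)sym_addr + addend - place;
    /* A real linker reports an error when the value does not fit. */
    assert(delta >= INT32_MIN && delta <= INT32_MAX);
    int32_t value = (int32_t)delta;
    memcpy(section + field_offset, &value, sizeof value); /* little-endian host assumed */
}
```

For example, two 16-byte-aligned code sections of 0x15 and 0x20 bytes based at 0x401000 end up at 0x401000 and 0x401020. A call instruction at the start of the first section that targets the start of the second (field at offset 1, addend -4) is patched with the displacement 0x1b.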
Finally, the linker produces either an executable file or a shared library.
```bash
# Link object files into executable
gcc main.o utils.o -o program

# Verbose linking (see what's happening)
gcc -v main.o utils.o -o program

# Direct linker invocation (not recommended - too many options)
ld -dynamic-linker /lib64/ld-linux-x86-64.so.2 \
    /usr/lib/x86_64-linux-gnu/crt1.o \
    /usr/lib/x86_64-linux-gnu/crti.o \
    main.o utils.o \
    -lc \
    /usr/lib/x86_64-linux-gnu/crtn.o \
    -o program
```

Two common linker errors: 'undefined reference' means a symbol is used but never defined anywhere. 'multiple definition' means the same symbol is defined in multiple files. Understanding these helps diagnose linking problems quickly.
Symbol resolution is surprisingly nuanced. The linker must follow specific rules to handle the complex scenarios that arise in real software:
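Strong symbols are:
- Function definitions
- Initialized global variables

Weak symbols are:
- Uninitialized global variables (on toolchains that treat them as common symbols)
- Symbols explicitly declared with __attribute__((weak))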
```c
// Rule 1: Multiple strong symbols → ERROR
// file1.c
int counter = 0;  // Strong

// file2.c
int counter = 0;  // Strong → "multiple definition of 'counter'" error

// Rule 2: Strong + Weak → Strong wins
// file1.c
int value = 42;  // Strong

// file2.c
int value;  // Weak (uninitialized) → ignored, value = 42

// Rule 3: Multiple weak symbols → Linker picks one (arbitrary)
// file1.c
int x;  // Weak

// file2.c
int x;  // Weak → linker picks one, size may vary! (dangerous)
```

When multiple weak symbols exist, the linker picks one arbitrarily. If they have different sizes (e.g., int x in one file, double x in another), one definition will be used everywhere, causing subtle memory corruption bugs. Always initialize global variables or use static for file-local data. Note that Rules 2 and 3 apply to uninitialized globals only when they are compiled as common symbols (-fcommon); GCC 10 and later default to -fno-common, under which these cases fail with a 'multiple definition' error instead.
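The three rules can be written down as a small merge function. This is a sketch under stated assumptions: the types and the add_definition name are hypothetical, and for Rule 3 it simply keeps the first weak definition it sees, which is one possible choice and not something a program should rely on.

```c
#include <assert.h>
#include <string.h>

enum binding { UNDEFINED, WEAK, STRONG };
enum result { RESOLVED, ERR_MULTIPLE_DEFINITION };

struct symbol {
    const char *name;
    enum binding binding;
    const char *defined_in;   /* file providing the chosen definition */
};

/* Merge one new definition into the table entry for the same name. */
static enum result add_definition(struct symbol *entry, enum binding incoming,
                                  const char *file) {
    assert(incoming == WEAK || incoming == STRONG);
    if (entry->binding == STRONG && incoming == STRONG)
        return ERR_MULTIPLE_DEFINITION;        /* Rule 1 */
    if (entry->binding == STRONG)
        return RESOLVED;                       /* Rule 2: existing strong wins */
    if (incoming == STRONG || entry->binding == UNDEFINED) {
        entry->binding = incoming;             /* Rule 2, or first definition seen */
        entry->defined_in = file;
    }
    /* Rule 3: weak meets weak. This sketch keeps the first one seen. */
    return RESOLVED;
}
```

Feeding it a weak definition of value from file2.c followed by a strong one from file1.c leaves defined_in pointing at file1.c; a second strong definition then returns ERR_MULTIPLE_DEFINITION.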
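The linker processes object files and libraries in command-line order, left to right. This has important implications for library linking:
- An object file is always linked in: its definitions enter the symbol table, and its unresolved references join the set of undefined symbols.
- A library archive is searched only for members that define a symbol that is undefined at that point; all other members are skipped.
- Any symbol still undefined after the last input is reported as an 'undefined reference'.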
This means library order matters:
```bash
# WRONG: library before object file that uses it
gcc -lm main.o
# Error: undefined reference to 'sin'
# Because: -lm was processed first, nothing was undefined yet,
# so nothing was extracted from libm

# CORRECT: object files first, then libraries
gcc main.o -lm
# Works: main.o processed first, 'sin' is undefined,
# then libm is searched, sin definition extracted

# For circular dependencies: list libraries multiple times
gcc main.o -lA -lB -lA
# If libA needs something from libB and vice versa
```

Static linking combines all necessary code into a single executable at build time. The resulting executable is self-contained: it doesn't depend on external library files at runtime.
Static libraries (.a on Unix, .lib on Windows) are simply archives of object files. The ar tool creates and manages them:
```bash
# Create object files
gcc -c math_utils.c -o math_utils.o
gcc -c string_utils.c -o string_utils.o

# Create static library (archive)
ar rcs libmyutils.a math_utils.o string_utils.o
# r = insert/replace, c = create, s = add symbol index

# View archive contents
ar -t libmyutils.a
# math_utils.o
# string_utils.o

# View symbols in archive
nm libmyutils.a

# Link against static library
gcc main.c -L. -lmyutils -o program
# -L. adds current directory to library search path
# -lmyutils searches for libmyutils.a (or .so)

# Force static linking (even if .so exists)
gcc main.c -L. -Wl,-Bstatic -lmyutils -Wl,-Bdynamic -o program
```

The linker treats archives differently from object files:
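- Object files named on the command line are always linked in whole.
- Archive members are extracted only if they define a symbol that is still undefined.
- The linker locates those members through the archive's symbol index (created by ar s).

This selective extraction is why archives can be large (containing many functions) without bloating every executable that uses them. Only the members that define needed symbols get linked in. The unit of extraction is the whole object file, not the individual function, so an archive built from many small object files links more selectively than one built from a few large ones.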
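The left-to-right scan and the selective extraction of archive members can be simulated in plain C. This is a simplified model with hypothetical types and function names (object, input, link_state, link_inputs): symbols are just strings, each object lists at most four definitions and four references, and duplicate definitions are not checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_SYMS 32

struct object {                 /* one .o file or one archive member */
    const char *defines[4];     /* unused slots are NULL */
    const char *needs[4];       /* unused slots are NULL */
};

struct input {                  /* one command-line argument */
    bool is_archive;
    const struct object *members;   /* a plain object has exactly one */
    size_t member_count;
};

struct link_state {
    const char *defined[MAX_SYMS];
    size_t ndefined;
    const char *undefined[MAX_SYMS];
    size_t nundefined;
};

static bool contains(const char *const *set, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], name) == 0) return true;
    return false;
}

/* Link one object in: record its definitions, then its open references. */
static void link_object(struct link_state *st, const struct object *obj) {
    for (size_t i = 0; i < 4 && obj->defines[i]; i++) {
        assert(st->ndefined < MAX_SYMS);
        st->defined[st->ndefined++] = obj->defines[i];
        for (size_t j = 0; j < st->nundefined; j++) {
            if (strcmp(st->undefined[j], obj->defines[i]) == 0) {
                /* Now defined: remove it from the undefined set. */
                st->undefined[j] = st->undefined[--st->nundefined];
                break;
            }
        }
    }
    for (size_t i = 0; i < 4 && obj->needs[i]; i++) {
        if (!contains(st->defined, st->ndefined, obj->needs[i]) &&
            !contains(st->undefined, st->nundefined, obj->needs[i])) {
            assert(st->nundefined < MAX_SYMS);
            st->undefined[st->nundefined++] = obj->needs[i];
        }
    }
}

/* Process inputs left to right; returns how many symbols stay undefined. */
static size_t link_inputs(struct link_state *st, const struct input *inputs,
                          size_t count) {
    for (size_t i = 0; i < count; i++) {
        const struct input *in = &inputs[i];
        if (!in->is_archive) {
            link_object(st, &in->members[0]);
            continue;
        }
        /* Archive: keep pulling members while one of them satisfies an
         * undefined symbol (a pulled member may itself need more). */
        bool pulled[16] = {false};
        bool progress = true;
        assert(in->member_count <= 16);
        while (progress) {
            progress = false;
            for (size_t m = 0; m < in->member_count; m++) {
                if (pulled[m]) continue;
                for (size_t d = 0; d < 4 && in->members[m].defines[d]; d++) {
                    if (contains(st->undefined, st->nundefined,
                                 in->members[m].defines[d])) {
                        link_object(st, &in->members[m]);
                        pulled[m] = progress = true;
                        break;
                    }
                }
            }
        }
    }
    return st->nundefined;
}
```

With an object that references sin and an archive whose members define sin and cos, listing the archive first leaves one symbol undefined, while listing the object first resolves everything and pulls in only the sin member.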
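Static linking is ideal for: embedded systems, portable executables, security-critical applications (minimizing attack surface), Go and Rust binaries (Go links statically by default when cgo is not used; Rust links its crates statically by default, though on glibc targets it still links libc dynamically), and containers where minimal dependencies are desired.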
Dynamic linking defers the resolution of shared library symbols until the program is loaded or even later, during execution. Shared libraries (.so on Linux, .dll on Windows, .dylib on macOS) are loaded into memory once and shared across all processes that use them.
Shared libraries require position-independent code (PIC) — code that works correctly regardless of where it's loaded in memory:
```bash
# Compile with PIC (position-independent code)
gcc -c -fPIC math_utils.c -o math_utils.o
gcc -c -fPIC string_utils.c -o string_utils.o

# Create shared library
gcc -shared -o libmyutils.so math_utils.o string_utils.o

# Set SONAME for versioning (recommended)
gcc -shared -Wl,-soname,libmyutils.so.1 \
    -o libmyutils.so.1.0.0 math_utils.o string_utils.o

# Create symlinks for versioning (-f replaces the unversioned file built above)
ln -sf libmyutils.so.1.0.0 libmyutils.so.1  # SONAME link
ln -sf libmyutils.so.1 libmyutils.so        # Linker link

# Link program against shared library
gcc main.c -L. -lmyutils -o program

# Run (library must be findable)
LD_LIBRARY_PATH=. ./program
# Or install to standard location: /usr/lib, /usr/local/lib
```

PIC is crucial because shared libraries may be loaded at different virtual addresses in different processes. PIC uses relative addressing instead of absolute addresses:
```asm
# C code: extern int global_var;
#         int get_value() { return global_var; }

# WITHOUT PIC (absolute address):
# This would hardcode the address, breaking if loaded elsewhere
mov $0x601040, %rax                     # Load the fixed address
mov (%rax), %eax                        # Dereference

# WITH PIC (GOT-relative):
# Works regardless of load address
mov global_var@GOTPCREL(%rip), %rax     # Load global_var's address from its GOT entry
mov (%rax), %eax                        # Load value through that address

# The GOT entry contains the actual runtime address of global_var
# The dynamic linker fills in GOT entries at load time
```

PIC adds a small runtime overhead (extra indirection through GOT/PLT). On x86-64, this is minimal due to RIP-relative addressing. On 32-bit x86, PIC required dedicating a register (typically EBX) to hold the GOT base address, causing more significant overhead.
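The same indirection can be imitated in portable C, which may make the idea easier to see than the assembly. In this sketch the got array, the slot name GOT_GLOBAL_VAR, and bind_global_var are all hypothetical stand-ins: the array plays the role of the GOT, and bind_global_var plays the role of the dynamic linker filling in an entry at load time.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for a GOT: one slot per external variable the module uses. */
enum { GOT_GLOBAL_VAR, GOT_SLOTS };
static int *got[GOT_SLOTS];

/* "Library code": never mentions an absolute address, only a GOT slot. */
static int get_value(void) {
    assert(got[GOT_GLOBAL_VAR] != NULL);   /* the loader must have run first */
    return *got[GOT_GLOBAL_VAR];
}

/* "Dynamic linker": records where the variable ended up in this process. */
static void bind_global_var(int *runtime_address) {
    got[GOT_GLOBAL_VAR] = runtime_address;
}
```

Calling bind_global_var with the addresses of two different variables and then get_value each time returns each variable's value in turn, without get_value itself ever changing. That is the property that lets one copy of the code serve processes that place the data at different addresses.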
The Global Offset Table (GOT) and Procedure Linkage Table (PLT) are the core mechanisms enabling dynamic linking. Understanding them is essential for security research, debugging, and low-level optimization.
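The GOT is a table of pointers, one for each external symbol the module references. Key properties:
- It lives in the data segment, so it is writable and each process has its own copy.
- The dynamic linker fills in its entries with runtime addresses: data entries at load time, function entries either at load time or lazily on first call.
- Code reaches its entries through PC-relative addressing, so the code itself needs no patching when the load address changes.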
The PLT enables lazy binding — function addresses are resolved only when first called, reducing startup time. Here's how it works:
```asm
# Calling printf via PLT

# In main code:
call printf@plt              # Call PLT stub

# PLT stub (printf@plt):
jmp *printf@GOTPLT(%rip)     # Jump through GOT entry
pushq $0                     # Push relocation index
jmp .plt                     # Jump to PLT[0] (resolver)

# First call: GOT contains address of "pushq" instruction
# → Falls through to resolver
# → Dynamic linker finds printf, updates GOT
# → Jumps to actual printf

# Subsequent calls: GOT contains actual printf address
# → jmp goes directly to printf (no resolver needed)

# PLT[0] (resolver trampoline):
pushq GOT[1](%rip)           # Push link_map pointer
jmp *GOT[2](%rip)            # Call _dl_runtime_resolve
```
```
$ objdump -d program | grep -A3 'printf@plt'
0000000000401030 <printf@plt>:
  401030: ff 25 e2 2f 00 00    jmp    *0x2fe2(%rip)   # 404018
  401036: 68 00 00 00 00       push   $0x0
  40103b: e9 e0 ff ff ff       jmp    401020 <_init+0x20>

$ readelf -r program | grep printf
0000000000404018  R_X86_64_JUMP_SLOT  printf@GLIBC_2.2.5

# 404018 is the GOT entry for printf
# JUMP_SLOT relocation means lazy binding will fill it
```

The GOT is writable, making it a target for attackers. GOT overwrite attacks replace function pointers with attacker-controlled addresses. Mitigations include RELRO (Relocation Read-Only), which marks the GOT read-only once relocations have been applied. With partial RELRO the PLT-related slots stay writable so lazy binding can still update them. Full RELRO (-z relro -z now) resolves every symbol at startup so the entire GOT can be made read-only, which disables lazy binding entirely.
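Lazy binding is essentially a self-patching function pointer, which can be modeled in C. The sketch below is an analogy, not the real mechanism: got_square stands in for a GOT slot, square_stub for the push-and-jump half of a PLT entry, resolve_square for _dl_runtime_resolve, and real_square for the library function. All of these names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*fn_t)(int);

static int resolver_calls;                  /* counts trips through the slow path */

static int real_square(int x) { return x * x; }   /* stands in for the library function */

static int square_stub(int x);              /* plays the "pushq; jmp PLT[0]" path */

/* GOT slot for the function: initially points back into the stub. */
static fn_t got_square = square_stub;

/* Stand-in for the resolver: look the symbol up, patch the slot. */
static fn_t resolve_square(void) {
    resolver_calls++;
    got_square = real_square;
    return got_square;
}

static int square_stub(int x) {
    return resolve_square()(x);             /* resolve, then continue into the target */
}

/* What a call through the PLT amounts to: an indirect jump through the slot. */
static int call_square_via_plt(int x) {
    assert(got_square != NULL);
    return got_square(x);
}
```

The first call to call_square_via_plt goes through the stub and the resolver; every later call jumps straight to real_square, so resolver_calls stays at 1. Because the slot must be writable for this patching to work, the same sketch also shows why an attacker who can overwrite the slot redirects every later call.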
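The dynamic linker (also called the runtime linker or ld.so) is a special shared library that the kernel loads along with your program. It's responsible for:
- Loading the shared libraries the program depends on
- Resolving symbols and applying relocations, including filling in GOT entries
- Resolving PLT calls on first use when lazy binding is enabled
- Transferring control to the program's entry point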
```
# See which dynamic linker an executable needs
$ readelf -l program | grep interpreter
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]

# List shared library dependencies
$ ldd program
    linux-vdso.so.1 (0x00007fff...)
    libmyutils.so => ./libmyutils.so (0x00007f...)
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f...)
    /lib64/ld-linux-x86-64.so.2 (0x00007f...)

# Show detailed library loading
$ LD_DEBUG=libs ./program 2>&1 | head -20
# Shows search paths, found libraries, load order

# Show symbol resolution
$ LD_DEBUG=symbols ./program 2>&1 | head -20
```

The dynamic linker searches for libraries in this order:
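1. DT_RPATH embedded in the executable (only if no DT_RUNPATH is present)
2. The LD_LIBRARY_PATH environment variable
3. DT_RUNPATH embedded in the executable
4. The ldconfig cache (/etc/ld.so.cache)
5. The default system directories (/lib, /usr/lib, etc.)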
```bash
# Temporary: Set library path for single run
LD_LIBRARY_PATH=/custom/path ./program

# Embed RUNPATH in executable at link time
gcc main.c -L. -lmyutils -Wl,-rpath,'$ORIGIN/lib' -o program
# $ORIGIN expands to directory containing the executable

# System-wide: Add to ldconfig
echo "/custom/path" | sudo tee /etc/ld.so.conf.d/mylibs.conf
sudo ldconfig  # Rebuild cache

# View ldconfig cache
ldconfig -p | grep myutils
```

When programs fail with 'library not found', use ldd to see missing dependencies. Use LD_DEBUG=libs to trace the search process. Common fixes: set LD_LIBRARY_PATH, embed RUNPATH, or install libraries to standard locations.
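The search order can be captured as a short function, which is handy when reasoning about why a particular copy of a library was picked. This is a simplified sketch based on the order documented for glibc's ld.so: the struct and function names are hypothetical, paths are treated as opaque strings, and special cases such as secure-execution mode (where LD_LIBRARY_PATH is ignored) are left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 5

/* Hypothetical inputs: NULL means "not set". */
struct search_inputs {
    const char *rpath;            /* DT_RPATH from the executable */
    const char *ld_library_path;  /* LD_LIBRARY_PATH from the environment */
    const char *runpath;          /* DT_RUNPATH from the executable */
};

/* Fill steps with the places to look, in order; returns how many. */
static size_t build_search_order(const struct search_inputs *in,
                                 const char *steps[MAX_STEPS]) {
    size_t n = 0;
    assert(in != NULL && steps != NULL);
    if (in->rpath && !in->runpath)       /* DT_RPATH is ignored when DT_RUNPATH exists */
        steps[n++] = in->rpath;
    if (in->ld_library_path)
        steps[n++] = in->ld_library_path;
    if (in->runpath)
        steps[n++] = in->runpath;
    steps[n++] = "/etc/ld.so.cache";     /* lookup in the ldconfig cache */
    steps[n++] = "/lib:/usr/lib";        /* default directories */
    return n;
}
```

With LD_LIBRARY_PATH set to /custom/path and a RUNPATH of $ORIGIN/lib, the function yields four steps with the environment variable first, which matches the common surprise that LD_LIBRARY_PATH overrides a RUNPATH baked into the binary.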
Choosing between static and dynamic linking involves weighing multiple factors. Let's analyze the tradeoffs systematically:
| Aspect | Static Linking | Dynamic Linking |
|---|---|---|
| Executable Size | Larger (includes all library code) | Smaller (libraries separate) |
| Memory Usage | No sharing between processes | Code pages shared across processes |
| Load Time | Faster (no runtime linking) | Slower (must load and link libraries) |
| Deployment | Single file, no dependencies | Must ensure libraries are present |
| Updates | Recompile needed for library fixes | Library update fixes all programs |
| Security Patches | Slow rollout (each program rebuilt) | Fast rollout (one library update) |
| Symbol Resolution | All at build time | Mix of build-time and run-time |
```
# Compare memory usage: static vs dynamic

# Static: 100 different programs, each embedding up to 2MB of libc code
# Memory used: up to 100 × 2MB = 200MB

# Dynamic: 100 processes, shared 2MB libc
# Code sections shared: ~2MB shared
# Data sections per-process: 100 × small
# Memory used: ~2MB + overhead

# Check shared memory with pmap
$ pmap <pid> | grep libc
00007f...   2032K r-x-- libc-2.31.so   ← Shared code (marked 'x')
00007f...     20K r---- libc-2.31.so   ← Shared readonly
00007f...      8K rw--- libc-2.31.so   ← Per-process writable
```

You can mix static and dynamic linking. Critical libraries can be linked statically for performance/reliability, while less critical ones are dynamically linked. Use -Wl,-Bstatic -lfoo -Wl,-Bdynamic to control linking mode per library.
Linking transforms incomplete object files into executable programs. Whether static or dynamic, linking resolves symbols, merges sections, and produces the final binary that the operating system can load and execute.
What's next:
With linking understood, we now turn to loading—the process by which the operating system takes a linked executable and creates a running process. The next page explores how the kernel sets up the process address space, maps the executable into memory, and transfers control to the program.
You now understand both static and dynamic linking—from archive libraries to shared objects, from symbol resolution to GOT/PLT mechanics. This knowledge is essential for debugging linking errors, optimizing deployments, and understanding how modern software is assembled.