Throughout this module, we've discussed library functions and system calls abstractly. Now it's time to see them in action. How do you verify that your program is making the syscalls you expect? How do you debug a mysterious I/O bug? How do you understand what an unfamiliar program is actually doing?
The answer is strace—a powerful Linux utility that traces every system call a program makes. With it you can see which files a program opens, what it reads and writes, which calls fail and why, and how long each one takes.
strace transforms system calls from an abstract concept into concrete, observable reality. After this page, you'll have a powerful debugging skill that most developers lack.
By the end of this page, you will master strace's essential features, understand how to interpret its output, know techniques for debugging common I/O issues, and be able to use strace to verify the concepts learned throughout this module.
strace (System TRACE) is a Linux diagnostic utility that intercepts and records every system call made by a process. It uses the ptrace() system call to attach to the target process and receive notifications of syscall entry and exit.
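To make that mechanism concrete, here is a minimal sketch of a tracer built on ptrace() (x86-64 Linux, error handling omitted). It is not how strace is implemented internally, but it shows the same stop-at-every-syscall loop:

/* toy_trace.c - stop the child at every syscall and print its number.
 * Minimal sketch only: real strace decodes arguments and return values,
 * and distinguishes syscall entry from exit (this loop prints both). */
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        fprintf(stderr, "usage: %s program [args...]\n", argv[0]);
        return 1;
    }

    pid_t child = fork();
    if (child == 0) {
        ptrace(PTRACE_TRACEME, 0, NULL, NULL);   /* ask to be traced */
        execvp(argv[1], &argv[1]);               /* child stops on exec */
        perror("execvp");
        return 1;
    }

    int status;
    waitpid(child, &status, 0);                  /* initial exec stop */
    while (!WIFEXITED(status)) {
        ptrace(PTRACE_SYSCALL, child, NULL, NULL);   /* run to next syscall stop */
        waitpid(child, &status, 0);
        if (WIFEXITED(status))
            break;
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, child, NULL, &regs);
        printf("syscall %lld\n", (long long)regs.orig_rax);  /* number only */
    }
    return 0;
}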
# Run a program under strace
strace ./myprogram
# Attach to a running process
strace -p 12345
# Follow child processes (fork/exec)
strace -f ./myprogram
# Write output to a file
strace -o trace.log ./myprogram
For each syscall, strace outputs:
syscall_name(arguments...) = return_value
For example:
open("/etc/passwd", O_RDONLY) = 3
read(3, "root:x:0:0:root:/root:/bin/bash\n"..., 4096) = 2462
close(3) = 0
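A trace like this comes from code that calls the syscall wrappers directly; a minimal sketch that would produce essentially the same sequence (it reads /etc/passwd and discards the data):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    char buf[4096];

    /* Appears in the trace as open(...) = 3, or on modern glibc as
     * openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3 */
    int fd = open("/etc/passwd", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Each loop iteration is exactly one read() line in the trace;
     * the final read() returns 0 at end of file. */
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        ;  /* data discarded: only the syscalls matter here */

    close(fd);  /* close(3) = 0 */
    return 0;
}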
$ cat hello.c
#include <stdio.h>
int main() {
    printf("Hello, World!\n");
    return 0;
}

$ gcc -o hello hello.c
$ strace ./hello

# Key excerpts from output:
execve("./hello", ["./hello"], 0x7ffd... /* 54 vars */) = 0
brk(NULL) = 0x5555557a7000
mmap(NULL, 8192, PROT_READ|PROT_WRITE, ...) = 0x7ffff7fc6000
# ... dynamic linker activity ...
openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
# ... libc loading ...

# The actual printf operation:
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0), ...}) = 0
brk(0x5555557c8000) = 0x5555557c8000
write(1, "Hello, World!\n", 14) = 14
exit_group(0) = ?
+++ exited with 0 +++

# Observations:
# 1. printf("Hello, World!\n") becomes write(1, "Hello, World!\n", 14)
# 2. fd 1 is stdout (verified by fstat check)
# 3. 14 bytes = 13 chars + 1 newline
# 4. Only ONE write syscall, not per-character

This simple trace reveals profound truths:
A single printf() ends up as a single write() syscall. Before main() even runs, the dynamic linker has already loaded libc with its own sequence of syscalls. strace makes the invisible visible.
strace is Linux-specific. Equivalents on other systems:
• macOS/BSD: dtruss (requires root), or dtrace
• Windows: Process Monitor (procmon), API Monitor
• Solaris: truss
The concepts transfer, though syntax and capabilities differ.
strace has dozens of options. Here are the most important ones for day-to-day debugging:
# Only show file-related syscalls
strace -e trace=file ./program
# Only show network syscalls
strace -e trace=network ./program
# Only show process management syscalls
strace -e trace=process ./program
# Only show specific syscalls
strace -e trace=open,read,write,close ./program
# Exclude specific syscalls (noisy ones)
strace -e trace=\!brk,mmap,mprotect ./program
Trace categories:
file — open, stat, chmod, unlink, etc.
process — fork, exec, wait, exit, etc.
network — socket, connect, send, recv, etc.
signal — signal, sigaction, kill, etc.
ipc — shmget, semop, msgget, etc.
desc — read, write, dup, select, poll, etc.
memory — mmap, brk, mlock, etc.

# Show string arguments up to 256 chars (default is 32)
strace -s 256 ./program
# Show full pathname resolution for file opens
strace -y ./program
# Output: read(3</etc/passwd>, ...) instead of read(3, ...)
# Show syscall timing (relative to previous)
strace -r ./program
# Show absolute timestamps
strace -t ./program # HH:MM:SS
strace -tt ./program # HH:MM:SS.microseconds
strace -ttt ./program # Epoch.microseconds
# Show time spent in each syscall
strace -T ./program
# Output: write(1, "hello", 5) = 5 <0.000015>
| Option | Purpose | Example |
|---|---|---|
| -e trace=X | Filter by syscall type | strace -e trace=file |
| -p PID | Attach to running process | strace -p 12345 |
| -f | Follow child processes | strace -f ./multiproc |
| -o FILE | Write to file instead of stderr | strace -o out.log ./prog |
| -c | Summary statistics only | strace -c ./prog |
| -s NUM | Max string length to print | strace -s 1000 |
| -y | Show file paths with descriptors | strace -y |
| -T | Show syscall duration | strace -T |
| -tt | Show microsecond timestamps | strace -tt |
| -k | Show stack trace for each syscall | strace -k |
# The -c option provides a summary without individual syscall output
$ strace -c ls /usr
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 33.24    0.000295          36         8           mmap
 13.51    0.000120          30         4           openat
 12.84    0.000114          28         4           mprotect
 10.14    0.000090          14         6           fstat
  7.21    0.000064          12         5           close
  5.97    0.000053          17         3           read
  5.40    0.000048          24         2           getdents64
  4.73    0.000042          21         2           pread64
  2.59    0.000023          23         1           munmap
  1.80    0.000016          16         1           write
  1.35    0.000012          12         1           statfs
  0.90    0.000008           8         1           arch_prctl
  0.34    0.000003           3         1           set_tid_address
------ ----------- ----------- --------- --------- ----------------
100.00    0.000888                     39           total

# This reveals:
# - Most time spent in mmap (dynamic linking)
# - 4 file opens, all successful (no errors column)
# - Only 1 write call for all output (buffered)

strace is invaluable for diagnosing I/O problems. Here are common debugging scenarios:
$ ./myprogram
Error: Cannot open configuration file
$ strace ./myprogram 2>&1 | grep open
openat(AT_FDCWD, "/home/user/myprogram.conf", O_RDONLY) = -1 ENOENT (No such file)
openat(AT_FDCWD, "/etc/myprogram.conf", O_RDONLY) = -1 ENOENT (No such file)
openat(AT_FDCWD, "./config/myprogram.conf", O_RDONLY) = -1 ENOENT (No such file)
Solution: The program is looking in specific paths. Create the file where it expects it, or check the search order logic.
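That kind of trace is the signature of a search-order loop. A sketch of what such logic typically looks like, using the paths from the trace above (the function and file names are otherwise illustrative):

#include <fcntl.h>
#include <stddef.h>
#include <stdio.h>
#include <unistd.h>

/* Each path that does not exist shows up in strace as one
 * openat(...) = -1 ENOENT line, in exactly this order. */
static const char *candidates[] = {
    "/home/user/myprogram.conf",
    "/etc/myprogram.conf",
    "./config/myprogram.conf",
};

static int open_config(void) {
    for (size_t i = 0; i < sizeof candidates / sizeof candidates[0]; i++) {
        int fd = open(candidates[i], O_RDONLY);
        if (fd >= 0)
            return fd;  /* first path that exists wins */
    }
    return -1;
}

int main(void) {
    int fd = open_config();
    if (fd < 0) {
        fprintf(stderr, "Error: Cannot open configuration file\n");
        return 1;
    }
    close(fd);
    return 0;
}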
$ # Program is stuck - what's it waiting for?
$ strace -p $(pgrep myprogram)
Process 12345 attached
read(0, <unfinished ...> # <-- Waiting for input on stdin!
$ # Or network case:
$ strace -p $(pgrep myservice)
recvfrom(5, ^C # <-- Waiting for network data
$ # Or file lock:
$ strace -p $(pgrep dbprocess)
flock(3, LOCK_EX # <-- Waiting for file lock
Diagnosis: The syscall name and arguments reveal exactly what the program is blocked on—stdin read, network receive, or lock acquisition.
$ # Time each syscall to find the slow ones
$ strace -T ./slowprogram 2>&1 | head -20
read(3, "...data...\n", 4096) = 156 <0.000018>
read(3, "...data...\n", 4096) = 156 <0.000012>
read(3, "...data...\n", 4096) = 156 <0.000011>
fsync(4) = 0 <0.523417> # <- 0.5 SECONDS!
read(3, "...data...\n", 4096) = 156 <0.000015>
Diagnosis: The fsync() is taking half a second—synchronizing to disk. Solutions: batch more work before fsync, use fdatasync, or accept async durability.
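One common fix is to batch many records per fsync() instead of syncing after every record. A rough sketch of the pattern (file name, record contents, and batch size are arbitrary):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write `count` records but call fsync() only once per `batch` records.
 * Under strace -T you would see far fewer fsync() lines, each one
 * amortized over many cheap write() calls. */
static int write_records(int fd, int count, int batch) {
    static const char record[] = "...data...\n";
    for (int i = 0; i < count; i++) {
        if (write(fd, record, sizeof record - 1) < 0)
            return -1;
        if ((i + 1) % batch == 0 && fsync(fd) < 0)  /* one sync per batch */
            return -1;
    }
    return fsync(fd);  /* final sync for any partial batch */
}

int main(void) {
    int fd = open("out.log", O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }
    if (write_records(fd, 1000, 100) < 0)
        perror("write_records");
    close(fd);
    return 0;
}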
# Problem: Output appears in wrong order or is lost

$ cat problem.c
#include <stdio.h>
int main() {
    printf("Before crash");   // No newline!
    int *p = NULL;
    *p = 42;                  // Crash
    printf("After crash\n");
    return 0;
}

$ ./problem
Segmentation fault
# Note: "Before crash" never appeared!

$ strace ./problem 2>&1 | tail -10
...
--- SIGSEGV {si_signo=SIGSEGV, si_code=SEGV_MAPERR, ...} ---
+++ killed by SIGSEGV +++

# Notice: NO write() syscall for "Before crash"
# The printf was buffered but never flushed before crash!

# Fix: flush before the risky code
$ cat fixed.c
#include <stdio.h>
int main() {
    printf("Before crash\n");  // Newline flushes when stdout is a terminal
    fflush(stdout);            // Explicit flush works even when stdout is piped
    int *p = NULL;
    *p = 42;
}

$ strace ./fixed 2>&1 | grep write
write(1, "Before crash\n", 13) = 13
# Now we see the output before the crash!

If a program dies from SIGKILL (kill -9) or SIGTERM, buffered output is lost. strace helps verify this by showing that write() was never called. When debugging crash output, always ensure you're flushing or using unbuffered stderr for diagnostics.
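For diagnostics that must survive a crash, the simplest habit is to write them to stderr, which is unbuffered by default. A tiny illustration (the DEBUG macro is just a local convenience, not a library feature):

#include <stdio.h>

/* stderr is unbuffered, so each message reaches the kernel immediately
 * (one write() per call under strace) and survives a crash on the next line. */
#define DEBUG(...) fprintf(stderr, __VA_ARGS__)

int main(void) {
    DEBUG("about to dereference p\n");  /* visible even though we crash next */
    int *p = NULL;
    *p = 42;
    return 0;
}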
strace is the definitive tool for understanding how library buffering affects syscall behavior. Let's verify the concepts from earlier in this module.
$ cat line_test.c
#include <stdio.h>
int main() {
    printf("one ");
    printf("two ");
    printf("three\n");   // Newline
    printf("four ");
    printf("five\n");    // Newline
    return 0;
}

# Running to terminal (line-buffered stdout):
$ strace -e write ./line_test 2>&1
write(1, "one two three\n", 14) = 14
write(1, "four five\n", 10) = 10
+++ exited with 0 +++

# Only 2 write() calls! Each at newline.

# Running piped through cat (fully-buffered stdout):
$ strace -e write ./line_test 2>&1 | cat
write(1, "one two three\nfour five\n", 24) = 24
+++ exited with 0 +++

# Only 1 write() call - everything buffered until exit!

# This proves stdout is line-buffered to terminals,
# fully-buffered when piped.
$ cat compare_io.c
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc > 1 && argv[1][0] == 'u') {
        // Unbuffered: direct write() per character
        for (int i = 0; i < 100; i++) {
            write(STDOUT_FILENO, "x", 1);
        }
    } else {
        // Buffered: multiple putchar -> one write
        for (int i = 0; i < 100; i++) {
            putchar('x');
        }
        printf("\n");
    }
    return 0;
}

# Buffered (default) - count syscalls:
$ strace -e write ./compare_io 2>&1 | wc -l
2    # (one write + "exited" message)

$ strace -e write ./compare_io 2>&1 | grep write
write(1, "xxxx...xxxx\n", 101) = 101

# Unbuffered - count syscalls:
$ strace -e write ./compare_io u 2>&1 | wc -l
101    # 100 writes + exited message!

$ strace -e write ./compare_io u 2>&1 | head -5
write(1, "x", 1) = 1
write(1, "x", 1) = 1
write(1, "x", 1) = 1
write(1, "x", 1) = 1
write(1, "x", 1) = 1

# Dramatic difference: 1 syscall vs 100 syscalls!
$ cat flush_demo.c
#include <stdio.h>
#include <unistd.h>

int main() {
    // Each printf adds to buffer
    printf("Writing to buffer ");
    sleep(1);                          // 1 second - buffer NOT flushed
    printf("still buffering ");
    sleep(1);
    printf("now adding newline\n");    // Flush!
    sleep(1);
    printf("more output without newline");
    fflush(stdout);                    // Explicit flush
    sleep(1);
    return 0;                          // Exit flushes remaining
}

# With timestamps (-tt) we see WHEN writes occur:
$ strace -tt -e write ./flush_demo 2>&1

# First 2 seconds: NO write syscall (buffered)
# At ~2 seconds (newline):
14:35:12.003421 write(1, "Writing to buffer still buffering now adding newline\n", 53) = 53

# At ~3 seconds (fflush):
14:35:13.007892 write(1, "more output without newline", 27) = 27

# At ~4 seconds (exit):
# (no additional write - buffer already flushed)

# This PROVES buffering behavior we discussed!

Be aware that strace itself adds overhead (it traps every syscall), so timing measurements are affected. For accurate performance analysis, use the -c option for summary statistics, or detach after collecting enough data. For production analysis, consider eBPF-based tools like bpftrace.
Beyond basic tracing, strace offers powerful capabilities for complex debugging scenarios.
# Show call stack for each syscall (requires debug symbols)
$ strace -k ./program 2>&1 | head -30
write(1, "hello\n", 6) = 6
> /lib/x86_64-linux-gnu/libc.so.6(__write+0x14) [0x114a94]
> /lib/x86_64-linux-gnu/libc.so.6(_IO_file_write+0x2d) [0x836ed]
> /lib/x86_64-linux-gnu/libc.so.6(_IO_do_write+0xb9) [0x85099]
> /lib/x86_64-linux-gnu/libc.so.6(_IO_file_overflow+0x101) [0x850f1]
> /lib/x86_64-linux-gnu/libc.so.6(__overflow+0x44) [0x86884]
> /lib/x86_64-linux-gnu/libc.so.6(puts+0x122) [0x74f92]
> ./program(main+0x15) [0x1155]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf3) [0x27083]
This reveals the full call chain from application through libc to the syscall—invaluable for understanding complex programs.
# Make all read() calls fail with ENOSPC
$ strace -e fault=read:error=ENOSPC ./program
read(3, 0x7ffe..., 1024) = -1 ENOSPC (No space left on device) (INJECTED)
# Fail only the 5th open() call
$ strace -e fault=openat:when=5:error=EACCES ./program
# Delay syscalls (simulate slow I/O)
$ strace -e inject=read:delay_enter=100ms ./program
Fault injection lets you test error handling paths that are hard to trigger naturally.
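Fault injection is only useful if the program has error-handling paths to exercise. A sketch of the kind of code you would test this way (the retry policy is illustrative):

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Write the whole buffer, handling short writes and transient errors.
 * Running under `strace -e fault=write:error=ENOSPC` exercises the
 * ENOSPC branch even though the disk is not actually full. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;  /* interrupted: retry */
            fprintf(stderr, "write failed: %s\n", strerror(errno));
            return -1;     /* ENOSPC, EIO, ...: report and give up */
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

int main(void) {
    static const char msg[] = "hello, injected faults\n";
    return write_all(STDOUT_FILENO, msg, sizeof msg - 1) == 0 ? 0 : 1;
}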
# Trace write() calls and dump the full data written to fd 1 (stdout)
$ strace -e trace=write -e write=1 ./program
# Show all syscalls that touch a specific path
$ strace -P /etc/passwd ./program
openat(AT_FDCWD, "/etc/passwd", O_RDONLY) = 3
read(3, "root:x:0:0:...", 4096) = 2462
close(3) = 0
# Show only successful or only failed syscalls
$ strace -z ./program    # -z = print only if successful
$ strace -Z ./program    # -Z = print only if failed
# ===== Find which library is causing an error =====
$ strace -k -e openat ./mystery_program 2>&1 | grep -A5 "ENOENT"
openat(AT_FDCWD, "/missing/config.yml", O_RDONLY) = -1 ENOENT
 > /lib/x86_64-linux-gnu/libc.so.6(open64+0x50)
 > /usr/lib/libsomelib.so(load_config+0x42)   # <-- HERE
 > ./mystery_program(initialize+0x1a)
 > ./mystery_program(main+0x22)

# ===== Capture syscalls to replay later (debugging tool) =====
$ strace -o trace.log -f ./program
$ grep -E "^[0-9]+ +write" trace.log | head
12345 write(1, "output line 1\n", 14) = 14
12345 write(1, "output line 2\n", 14) = 14

# ===== Follow multi-process programs =====
$ strace -ff -o trace ./multiproc_server
# Creates trace.12345, trace.12346, etc. for each process

$ ls trace.*
trace.12345  trace.12346  trace.12347

# ===== Measure syscall distribution =====
$ strace -c -S calls ./program
% time     calls  syscall
------ --------- --------
  45.0     10000  read
  35.0      8000  write
  15.0      3000  fstat
   5.0      1000  close

# Sort by time spent: -S time (default)
# Sort by call count: -S calls
# Sort by errors: -S errors

Using strace in Docker containers requires --cap-add=SYS_PTRACE or running in --privileged mode. For Kubernetes, add the capability to your security context. Without ptrace capability, strace will fail with 'operation not permitted'.
strace isn't the only tool for understanding system behavior. Here's a toolkit for comprehensive debugging:
# ltrace shows library function calls, not syscalls
$ ltrace ./program 2>&1 | head
__libc_start_main(0x401126, 1, 0x7ffe..., ...)
printf("Hello %s\n", "World") = 12
puts("Done") = 5
+++ exited (status 0) +++
# Combine with strace for full picture:
$ ltrace -e printf -S ./program
printf("Hello %s\n", "World") = 12
SYS_write(1, "Hello World\n", 12) = 12
ltrace shows you the library layer (printf, malloc), while strace shows the kernel layer (write, brk). Together, they reveal the complete path.
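To see the two layers side by side, trace one small program with both tools. A sketch (the exact calls shown will vary with the libc version):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    /* ltrace shows the library layer: malloc(64), strcpy(...), printf(...).
     * strace shows only the kernel layer: brk()/mmap() for heap setup
     * and a single write(1, ...) for the output. */
    char *msg = malloc(64);
    if (!msg)
        return 1;
    strcpy(msg, "library layer vs kernel layer\n");
    printf("%s", msg);
    free(msg);
    return 0;
}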
# See all files open by a process
$ lsof -p $(pgrep myserver)
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
myserver 1234 user cwd DIR 254,1 4096 123 /app
myserver 1234 user 0u CHR 1,3 0t0 125 /dev/null
myserver 1234 user 1w REG 254,1 10240 456 /var/log/app.log
myserver 1234 user 3u IPv4 98765 0t0 TCP *:8080 (LISTEN)
myserver 1234 user 4u IPv4 98766 0t0 TCP 10.0.0.1:42840 (ESTABLISHED)
# See what has a file open
$ lsof /var/log/messages
# See network connections
$ lsof -i :8080
lsof complements strace by showing the current state of file descriptors, not just the syscalls.
| Tool | Shows | Best For |
|---|---|---|
| strace | System calls | I/O debugging, syscall tracing |
| ltrace | Library calls | Understanding library behavior |
| lsof | Open files/sockets | What resources a process holds |
| perf | Performance counters | CPU profiling, cache misses |
| bpftrace | Kernel events (eBPF) | Production tracing, low overhead |
| gdb | Everything (debugger) | Interactive debugging |
| tcpdump | Network packets | Protocol-level network debugging |
| ss/netstat | Network connections | Connection state |
| dmesg | Kernel messages | Hardware/driver issues |
The /proc filesystem provides real-time information about running processes:
# File descriptors of process 1234
$ ls -la /proc/1234/fd/
lrwx------ 1 user user 64 ... 0 -> /dev/pts/2
lrwx------ 1 user user 64 ... 1 -> /dev/pts/2
lrwx------ 1 user user 64 ... 2 -> /dev/pts/2
lrwx------ 1 user user 64 ... 3 -> /path/to/opened/file.txt
# Memory map
$ cat /proc/1234/maps | head
00400000-00401000 r-xp 00000000 254:01 1234567 /app/program
...
# Environment variables
$ cat /proc/1234/environ | tr '\0' '\n' | head
PATH=/usr/bin:/bin
HOME=/home/user
...
# I/O stats
$ cat /proc/1234/io
rchar: 1234567
wchar: 7654321
read_bytes: 1234000
write_bytes: 7654000
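The same information can be read programmatically, which is handy for self-diagnostics. A small Linux-only sketch that lists a process's own open descriptors through /proc:

#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <unistd.h>

/* Walk /proc/self/fd and resolve each symlink - the same data that
 * `ls -la /proc/PID/fd/` shows for a running process. */
int main(void) {
    DIR *dir = opendir("/proc/self/fd");
    if (!dir) {
        perror("opendir");
        return 1;
    }

    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL) {
        if (ent->d_name[0] == '.')
            continue;  /* skip "." and ".." */

        char link[PATH_MAX], target[PATH_MAX];
        snprintf(link, sizeof link, "/proc/self/fd/%s", ent->d_name);

        ssize_t n = readlink(link, target, sizeof target - 1);
        if (n < 0)
            continue;  /* fd may already be gone (e.g. the DIR's own fd) */
        target[n] = '\0';

        printf("fd %s -> %s\n", ent->d_name, target);
    }
    closedir(dir);
    return 0;
}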
eBPF (extended Berkeley Packet Filter) enables efficient, safe tracing with minimal overhead—even in production. Tools like bpftrace, bcc, and perf use eBPF for advanced observability. While strace is great for development debugging, eBPF-based tools are preferred for production analysis.
Let's put everything together with a practical debugging workflow for I/O issues.
1. Program hangs? Attach with strace -p PID to see what it's waiting for.
2. Output missing or out of order? Check buffering with strace -e write ./program.
3. Files not found? Trace file operations with strace -e trace=file ./program.
4. Performance problems? Start with summary statistics: strace -c ./program.

The script below automates these steps and collects the results in one place.
#!/bin/bash
# Comprehensive trace for debugging

PROGRAM="./myprogram"
OUTPUT_DIR="debug_traces"
mkdir -p "$OUTPUT_DIR"

# 1. Full trace with timing
echo "Running full trace..."
strace -o "$OUTPUT_DIR/full.trace" -tt -T -f -s 256 "$PROGRAM" 2>&1

# 2. Summary statistics
echo "Generating summary..."
strace -c -o "$OUTPUT_DIR/summary.txt" "$PROGRAM" 2>&1

# 3. File operations only
echo "Tracing file operations..."
strace -o "$OUTPUT_DIR/files.trace" -e trace=file -y "$PROGRAM" 2>&1

# 4. Extract key information
echo "Analyzing traces..."

# Failed syscalls
echo "=== FAILED SYSCALLS ===" > "$OUTPUT_DIR/analysis.txt"
grep " = -1 " "$OUTPUT_DIR/full.trace" >> "$OUTPUT_DIR/analysis.txt"

# Slowest syscalls (> 10ms)
echo -e "\n=== SLOW SYSCALLS (>10ms) ===" >> "$OUTPUT_DIR/analysis.txt"
grep -E "<[0-9]{1,}\.[0-9]{2,}" "$OUTPUT_DIR/full.trace" | awk -F'<|>' '{if ($2 > 0.01) print}' >> "$OUTPUT_DIR/analysis.txt"

# Write patterns
echo -e "\n=== WRITE PATTERNS ===" >> "$OUTPUT_DIR/analysis.txt"
grep "^write\|^writev\|^pwrite" "$OUTPUT_DIR/full.trace" | head -20 >> "$OUTPUT_DIR/analysis.txt"

echo "Done. Check $OUTPUT_DIR/ for results."

Look for these red flags in your traces:
| Pattern | Indicates | Solution |
|---|---|---|
| Many small write() calls | Missing buffering | Use stdio or a custom buffer (see sketch below) |
| ENOENT on expected files | Wrong path or working dir | Print getcwd(), fix path |
| EAGAIN loops | Non-blocking socket busy-looping | Add proper select/poll |
| Long fsync() times | Slow disk or excessive syncing | Batch syncs, use async |
| Repeated open/close | Inefficient file handling | Keep file descriptor open |
| SIGPIPE | Writing to closed pipe | Handle SIGPIPE, check reader |
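For the first red flag, many tiny write() calls, the fix is to coalesce them in user space, either by switching to stdio or with a small hand-rolled buffer like this sketch (buffer size and names are arbitrary):

#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Accumulate small writes in a user-space buffer and issue one large
 * write() when it fills - the same idea stdio buffering implements. */
#define BUF_CAP 4096

static char   buf[BUF_CAP];
static size_t buf_len;

static void buf_flush(int fd) {
    if (buf_len > 0) {
        write(fd, buf, buf_len);  /* one syscall covers many records */
        buf_len = 0;
    }
}

static void buf_write(int fd, const char *data, size_t len) {
    if (len >= BUF_CAP) {         /* oversized record: bypass the buffer */
        buf_flush(fd);
        write(fd, data, len);
        return;
    }
    if (buf_len + len > BUF_CAP)
        buf_flush(fd);
    memcpy(buf + buf_len, data, len);
    buf_len += len;
}

int main(void) {
    for (int i = 0; i < 1000; i++)
        buf_write(STDOUT_FILENO, "record\n", 7);  /* 1000 writes if unbuffered */
    buf_flush(STDOUT_FILENO);                     /* a handful of writes instead */
    return 0;
}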
After implementing a fix, re-run strace to confirm:
# Before fix: many small writes
$ strace -c ./program_before 2>&1 | grep write
  0.12    0.002000          20       100         0 write
# After fix: few large writes
$ strace -c ./program_after 2>&1 | grep write
  0.01    0.000200         100         2         0 write
strace transforms system calls from abstract concepts into observable reality. With this tool, you can verify buffering behavior, diagnose I/O issues, understand unfamiliar programs, and validate your understanding of library vs syscall behavior.
Module Complete:
Congratulations! You've completed the comprehensive study of Library Functions vs System Calls. You now understand the boundary between user-space libraries and the kernel, how stdio buffering turns many small I/O requests into few syscalls, why syscall overhead matters for performance, and how to observe and verify all of it with strace.
This knowledge distinguishes engineers who truly understand their systems from those who merely write code. You can now reason about I/O behavior from first principles, make informed optimization decisions, and debug issues that mystify others.
You've mastered the critical boundary between user-space libraries and kernel system calls. This foundation will serve you throughout your career—whether debugging production issues, optimizing performance, or simply understanding what your code actually does beneath the abstractions.