eBPF programs are unlike any other software you'll write. They execute directly within the Linux kernel, triggered by specific events—a network packet arriving, a system call being invoked, a function being entered. They run with kernel privileges yet are constrained by the verifier to guarantee safety. They can observe everything happening in the system while maintaining near-zero overhead.
Understanding how to write, structure, and deploy eBPF programs is essential for leveraging this powerful technology. This page takes you through the complete lifecycle of an eBPF program: from C source code to verified bytecode running in kernel space.
By the end of this page, you will understand how eBPF programs are structured, the role of sections and attributes, how helper functions provide kernel access, how maps enable data persistence and communication, and the complete compilation-to-execution workflow. You'll be equipped to read, understand, and begin writing eBPF programs.
An eBPF program is written in a restricted subset of C (or Rust with Aya), compiled to eBPF bytecode, and loaded into the kernel. Let's examine the essential components that make up an eBPF program.
Core Components
Every eBPF program contains these fundamental elements:
```c
// ============================================
// INCLUDES
// ============================================
// vmlinux.h: Generated type definitions from kernel BTF
// Contains all kernel structures (task_struct, sk_buff, etc.)
#include "vmlinux.h"

// BPF helper definitions and macros
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#include <bpf/bpf_tracing.h>

// ============================================
// LICENSE DECLARATION (Required)
// ============================================
// Must be GPL-compatible for many helper functions
// Options: "GPL", "GPL v2", "GPL and additional rights",
// "Dual BSD/GPL", "Dual MIT/GPL", "Dual MPL/GPL"
char LICENSE[] SEC("license") = "GPL";

// ============================================
// MAP DEFINITIONS
// ============================================
// BPF maps persist data across program invocations
// and allow communication with user space

// Hash map: key-value storage
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);    // PID as key
    __type(value, u64);  // Count as value
} pid_count SEC(".maps");

// Per-CPU array for statistics
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, u32);
    __type(value, u64);
} total_count SEC(".maps");

// ============================================
// HELPER STRUCTURES
// ============================================
// Custom structures for ring buffer events, etc.
struct event {
    u32 pid;
    u32 tid;
    u64 ts;
    char comm[16];
};

// ============================================
// BPF PROGRAM (Main Entry Point)
// ============================================
// SEC("...") defines program type and attachment point
// The section name determines:
// - Which program type (kprobe, tracepoint, xdp, etc.)
// - Where it attaches (function name, tracepoint path, etc.)

SEC("kprobe/do_sys_openat2")
int trace_openat(struct pt_regs *ctx) {
    // Get current PID/TID
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 pid = pid_tgid >> 32;
    u32 tid = (u32)pid_tgid;

    // Look up existing count for this PID
    u64 *count = bpf_map_lookup_elem(&pid_count, &pid);
    u64 new_count = count ? *count + 1 : 1;

    // Update the map
    bpf_map_update_elem(&pid_count, &pid, &new_count, BPF_ANY);

    // Update total count (per-CPU, no locking needed)
    u32 zero = 0;
    u64 *total = bpf_map_lookup_elem(&total_count, &zero);
    if (total)
        __sync_fetch_and_add(total, 1);

    return 0;
}

// Multiple programs can exist in one file
SEC("kprobe/do_sys_close")
int trace_close(struct pt_regs *ctx) {
    bpf_printk("close() called by PID %d\n",
               bpf_get_current_pid_tgid() >> 32);
    return 0;
}
```

Section Names and Program Types
The SEC() macro places the function in a specific ELF section, which tells the loader the program type and attachment point. The section name follows conventions understood by libbpf:
| Section Pattern | Program Type | Example |
|---|---|---|
| kprobe/<func> | BPF_PROG_TYPE_KPROBE | SEC("kprobe/vfs_read") |
| kretprobe/<func> | BPF_PROG_TYPE_KPROBE (return) | SEC("kretprobe/vfs_read") |
| tracepoint/<cat>/<name> | BPF_PROG_TYPE_TRACEPOINT | SEC("tracepoint/syscalls/sys_enter_open") |
| raw_tracepoint/<name> | BPF_PROG_TYPE_RAW_TRACEPOINT | SEC("raw_tracepoint/sys_enter") |
| xdp | BPF_PROG_TYPE_XDP | SEC("xdp") |
| tc | BPF_PROG_TYPE_SCHED_CLS | SEC("tc") |
| lsm/<hook> | BPF_PROG_TYPE_LSM | SEC("lsm/bprm_check_security") |
| fentry/<func> | BPF_PROG_TYPE_TRACING | SEC("fentry/do_sys_open") |
| fexit/<func> | BPF_PROG_TYPE_TRACING | SEC("fexit/do_sys_open") |
kprobe/kretprobe work on any kernel function but are unstable (function signatures can change). Tracepoints are stable but limited to predefined points. fentry/fexit are the modern alternative—they're faster than kprobes (no int3 trap) and include BTF information for type-safe access. For new programs, prefer tracepoints for stability or fentry/fexit for performance with BTF-enabled kernels.
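For comparison, a minimal fentry program might look like the sketch below. Unlike a kprobe, the arguments arrive with BTF type information and are unpacked by the BPF_PROG macro from bpf_tracing.h; the exact signature of do_sys_openat2 shown here is an assumption and should be checked against your kernel's BTF.

```c
// Hedged sketch: fentry with BTF-typed arguments (no PT_REGS_PARM* needed).
// Assumes do_sys_openat2(int dfd, const char *filename, ...) per kernel BTF.
SEC("fentry/do_sys_openat2")
int BPF_PROG(trace_openat_fentry, int dfd, const char *filename)
{
    bpf_printk("openat entered: dfd=%d\n", dfd);
    return 0;
}
```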
eBPF programs execute in a sandboxed environment and cannot directly call arbitrary kernel functions. Instead, they interact with the kernel through helper functions—a well-defined, stable API that the kernel exposes to eBPF programs.
Helper functions are essential because they give programs a stable, kernel-maintained interface to functionality they cannot reach directly, let the verifier reason about exactly what a program can touch, and keep programs portable across kernel versions. There are over 200 helper functions in modern kernels, categorized by purpose:
| Category | Helper Functions | Purpose |
|---|---|---|
| Map Operations | bpf_map_lookup_elem, bpf_map_update_elem, bpf_map_delete_elem | Read/write BPF map data |
| Current Task Info | bpf_get_current_pid_tgid, bpf_get_current_uid_gid, bpf_get_current_comm | Get info about current process |
| Memory Access | bpf_probe_read_kernel, bpf_probe_read_user, bpf_copy_from_user | Safely read kernel/user memory |
| Time | bpf_ktime_get_ns, bpf_ktime_get_boot_ns, bpf_ktime_get_coarse_ns | Get timestamps |
| Output/Logging | bpf_trace_printk, bpf_printk, bpf_perf_event_output | Debug output, send events |
| Networking | bpf_skb_load_bytes, bpf_redirect, bpf_clone_redirect | Packet manipulation |
| Random | bpf_get_prandom_u32 | Pseudo-random number generation |
| Tail Calls | bpf_tail_call | Chain to another BPF program |
| Ring Buffer | bpf_ringbuf_reserve, bpf_ringbuf_submit, bpf_ringbuf_output | Efficient event streaming |
| Spinlocks | bpf_spin_lock, bpf_spin_unlock | Synchronize map access |
```c
// ============================================
// TASK/PROCESS INFORMATION
// ============================================
SEC("kprobe/do_sys_openat2")
int get_process_info(struct pt_regs *ctx) {
    // Get PID (high 32 bits) and TID (low 32 bits)
    u64 pid_tgid = bpf_get_current_pid_tgid();
    u32 pid = pid_tgid >> 32;
    u32 tid = (u32)pid_tgid;

    // Get UID (low 32 bits) and GID (high 32 bits)
    u64 uid_gid = bpf_get_current_uid_gid();
    u32 uid = (u32)uid_gid;
    u32 gid = uid_gid >> 32;

    // Get process name (comm)
    char comm[16];
    bpf_get_current_comm(&comm, sizeof(comm));

    // Get current cgroup ID
    u64 cgroup_id = bpf_get_current_cgroup_id();

    return 0;
}

// ============================================
// MEMORY ACCESS
// ============================================
SEC("kprobe/vfs_read")
int trace_read(struct pt_regs *ctx) {
    // Get the file* argument (first argument on x86-64)
    struct file *f = (struct file *)PT_REGS_PARM1(ctx);

    // Read the filename from kernel memory
    // MUST use bpf_probe_read_* for kernel pointers
    char filename[256];
    struct dentry *dentry;

    // CO-RE safe read of nested structures
    bpf_probe_read_kernel(&dentry, sizeof(dentry), &f->f_path.dentry);

    // Read the name from dentry: first fetch the name pointer,
    // then copy the string it points to
    if (dentry) {
        const unsigned char *name;
        bpf_probe_read_kernel(&name, sizeof(name), &dentry->d_name.name);
        bpf_probe_read_kernel_str(filename, sizeof(filename), name);
    }

    // For user-space memory (e.g., syscall arguments)
    char user_buf[64];
    void *user_ptr = (void *)PT_REGS_PARM2(ctx);
    // This can fail if the user page is not resident (paged out);
    // bpf_probe_read_user never faults pages in
    // long ret = bpf_probe_read_user(user_buf, sizeof(user_buf), user_ptr);

    return 0;
}

// ============================================
// TIME-BASED OPERATIONS
// ============================================
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);
    __type(value, u64);
} start_times SEC(".maps");

// Track function latency
SEC("kprobe/vfs_read")
int trace_read_entry(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 ts = bpf_ktime_get_ns();  // Monotonic nanoseconds
    bpf_map_update_elem(&start_times, &pid, &ts, BPF_ANY);
    return 0;
}

SEC("kretprobe/vfs_read")
int trace_read_exit(struct pt_regs *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    u64 *start_ts = bpf_map_lookup_elem(&start_times, &pid);
    if (start_ts) {
        u64 duration = bpf_ktime_get_ns() - *start_ts;
        bpf_printk("vfs_read latency: %llu ns\n", duration);
        bpf_map_delete_elem(&start_times, &pid);
    }
    return 0;
}

// ============================================
// SENDING EVENTS TO USER SPACE (Ring Buffer)
// ============================================
struct event {
    u32 pid;
    u64 timestamp;
    char comm[16];
    char filename[256];
};

struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);  // 256 KB buffer
} events SEC(".maps");

SEC("tracepoint/syscalls/sys_enter_openat")
int trace_openat_rb(struct trace_event_raw_sys_enter *ctx) {
    // Reserve space in ring buffer
    struct event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (!e)
        return 0;  // Buffer full, drop event

    // Fill in the event
    e->pid = bpf_get_current_pid_tgid() >> 32;
    e->timestamp = bpf_ktime_get_ns();
    bpf_get_current_comm(&e->comm, sizeof(e->comm));

    // Read filename from syscall args
    const char *pathname = (const char *)ctx->args[1];
    bpf_probe_read_user_str(&e->filename, sizeof(e->filename), pathname);

    // Submit to ring buffer (makes it visible to user space)
    bpf_ringbuf_submit(e, 0);

    return 0;
}
```

Some helper functions are only available to GPL-licensed eBPF programs. These include bpf_probe_read_kernel, bpf_probe_write_user, and most tracing helpers.
If your program declares a non-GPL license, the verifier will reject calls to these helpers. When in doubt, use GPL for the license declaration.
BPF maps are the primary mechanism for persisting data across program invocations, sharing state between eBPF programs, and communicating with user space. Maps are kernel-side data structures with well-defined semantics and performance characteristics. The kernel provides numerous map types, each optimized for specific use cases.
| Map Type | Description | Use Case |
|---|---|---|
| BPF_MAP_TYPE_HASH | Hash table with arbitrary keys | Keyed lookups (PID -> data) |
| BPF_MAP_TYPE_ARRAY | Array with integer indices | Direct indexed access, counters |
| BPF_MAP_TYPE_PERCPU_HASH | Per-CPU hash table | High-contention hash without locking |
| BPF_MAP_TYPE_PERCPU_ARRAY | Per-CPU array | Per-CPU statistics |
| BPF_MAP_TYPE_LRU_HASH | LRU evicting hash table | Bounded caches |
| BPF_MAP_TYPE_LRU_PERCPU_HASH | Per-CPU LRU hash | Per-CPU bounded caches |
| BPF_MAP_TYPE_RINGBUF | Single producer ring buffer | Efficient event streaming to user space |
| BPF_MAP_TYPE_PERF_EVENT_ARRAY | Per-CPU ring buffers | Legacy event streaming (prefer ringbuf) |
| BPF_MAP_TYPE_PROG_ARRAY | Array of BPF program FDs | Tail calls (program chaining) |
| BPF_MAP_TYPE_STACK_TRACE | Stack trace storage | Profiling, stack unwinding |
| BPF_MAP_TYPE_CGROUP_ARRAY | Array of cgroup FDs | cgroup-based filtering |
| BPF_MAP_TYPE_SOCKMAP | Socket storage | Socket-level proxying |
| BPF_MAP_TYPE_BLOOM_FILTER | Probabilistic set membership | Efficient existence checks |
```c
// ============================================
// MODERN MAP DEFINITIONS (BTF-defined maps)
// ============================================
// This is the recommended syntax for modern libbpf

// Hash map: arbitrary key-value storage
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 10240);
    __type(key, u32);    // Key type: PID
    __type(value, u64);  // Value type (can also be a custom struct)
} my_hash SEC(".maps");

// Array: indexed by u32, O(1) access
struct {
    __uint(type, BPF_MAP_TYPE_ARRAY);
    __uint(max_entries, 256);
    __type(key, u32);
    __type(value, u64);
} my_array SEC(".maps");

// Per-CPU array: no locking, each CPU has its own copy
// Total values = max_entries * num_cpus
struct {
    __uint(type, BPF_MAP_TYPE_PERCPU_ARRAY);
    __uint(max_entries, 1);
    __type(key, u32);
    __type(value, u64);
} cpu_stats SEC(".maps");

// LRU hash: automatically evicts least-recently-used entries
struct {
    __uint(type, BPF_MAP_TYPE_LRU_HASH);
    __uint(max_entries, 1024);
    __type(key, struct flow_key);
    __type(value, struct flow_stats);
} flow_cache SEC(".maps");

// Ring buffer: efficient event streaming
struct {
    __uint(type, BPF_MAP_TYPE_RINGBUF);
    __uint(max_entries, 256 * 1024);  // 256 KB
} events SEC(".maps");

// ============================================
// MAP OPERATIONS IN BPF PROGRAMS
// ============================================
SEC("tracepoint/syscalls/sys_enter_read")
int trace_read(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;

    // === LOOKUP ===
    // Returns pointer to value or NULL if not found
    u64 *count = bpf_map_lookup_elem(&my_hash, &pid);

    // === UPDATE ===
    // Flags: BPF_ANY     (insert or update)
    //        BPF_NOEXIST (insert only, fail if exists)
    //        BPF_EXIST   (update only, fail if not exists)
    u64 new_count = count ? *count + 1 : 1;
    bpf_map_update_elem(&my_hash, &pid, &new_count, BPF_ANY);

    // === DELETE ===
    // bpf_map_delete_elem(&my_hash, &pid);

    // === PER-CPU ACCESS ===
    // No locking needed - each CPU sees its own value
    u32 zero = 0;
    u64 *cpu_count = bpf_map_lookup_elem(&cpu_stats, &zero);
    if (cpu_count)
        __sync_fetch_and_add(cpu_count, 1);

    // === RING BUFFER ===
    struct event *e = bpf_ringbuf_reserve(&events, sizeof(*e), 0);
    if (e) {
        e->pid = pid;
        e->timestamp = bpf_ktime_get_ns();
        bpf_ringbuf_submit(e, 0);
    }

    return 0;
}

// ============================================
// INNER MAP (Map-in-Map) for Dynamic Structures
// ============================================
// Outer map holds file descriptors to inner maps
struct inner_map {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 1024);
    __type(key, u32);
    __type(value, u64);
} inner_map SEC(".maps");

struct {
    __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
    __uint(max_entries, 4);
    __uint(key_size, sizeof(u32));
    __array(values, struct inner_map);
} outer_map SEC(".maps");

SEC("kprobe/some_func")
int use_inner_map(struct pt_regs *ctx) {
    u32 outer_key = 0;
    void *inner = bpf_map_lookup_elem(&outer_map, &outer_key);
    if (!inner)
        return 0;

    u32 inner_key = 42;
    u64 *val = bpf_map_lookup_elem(inner, &inner_key);
    if (val)
        bpf_printk("Found: %llu\n", *val);

    return 0;
}
```

By default, BPF maps are destroyed when the loading program exits. To persist maps beyond the loader's lifetime, pin them to the BPF filesystem (bpffs). This creates a file at /sys/fs/bpf/<name> that holds a reference to the map. Other programs can then access the map by opening this path. This enables map sharing across processes and program hot-reloading.
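Pinning can also be requested declaratively in the map definition. A minimal sketch, assuming a libbpf version that supports LIBBPF_PIN_BY_NAME: libbpf will create (or reuse) a pin under /sys/fs/bpf with the map's name when the object is loaded.

```c
// Hedged sketch: a map pinned by name. libbpf creates /sys/fs/bpf/shared_counts
// at load time, or reuses an existing pin with a compatible definition.
struct {
    __uint(type, BPF_MAP_TYPE_HASH);
    __uint(max_entries, 4096);
    __type(key, u32);
    __type(value, u64);
    __uint(pinning, LIBBPF_PIN_BY_NAME);
} shared_counts SEC(".maps");
```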
Transforming C source code into running eBPF programs involves multiple stages. Understanding this pipeline is essential for debugging compilation issues and optimizing program size.
The Complete Pipeline
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ C Source │────▶│ Clang/LLVM │────▶│ ELF Object │────▶│ Loader │
│ (.bpf.c) │ │ -target bpf │ │ (.bpf.o) │ │ (libbpf) │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
│
▼
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Native Code │◀────│ JIT │◀────│ Verifier │◀────│ bpf() │
│ (in kernel) │ │ Compiler │ │ (safety) │ │ syscall │
└─────────────┘ └─────────────┘ └─────────────┘ └─────────────┘
Stage 1: Clang Compilation
Clang with LLVM is the only production-ready compiler for eBPF. The compilation produces an ELF object file containing the eBPF bytecode (one section per program), map definitions, BTF type information, and relocation records.
```bash
# Basic compilation
clang -O2 -g -target bpf -c program.bpf.c -o program.bpf.o

# Full production compilation:
#   -O2                         Optimization level (always use -O2)
#   -g                          Generate debug info (needed for BTF)
#   -target bpf                 Target: BPF bytecode
#   -D__TARGET_ARCH_x86         Target arch for vmlinux.h macros
#   -I/path/to/libbpf/include   libbpf headers (bpf_helpers.h, etc.)
#   -I.                         Local headers (vmlinux.h)
clang -O2 -g -target bpf -D__TARGET_ARCH_x86 \
    -I/path/to/libbpf/include -I. \
    -c program.bpf.c -o program.bpf.o

# Generate vmlinux.h from kernel BTF
bpftool btf dump file /sys/kernel/btf/vmlinux format c > vmlinux.h

# Inspect the compiled program
llvm-objdump -d program.bpf.o                     # Disassemble bytecode
bpftool prog dump xlated pinned /sys/fs/bpf/prog  # See translated bytecode
bpftool prog dump jited pinned /sys/fs/bpf/prog   # See JIT'd machine code

# Check BTF information
bpftool btf dump file program.bpf.o
```

Stage 2: The ELF Object File
The compiled .bpf.o file is a standard ELF object with special sections:
| Section | Purpose |
|---|---|
| .text | Default code section |
| kprobe/..., tracepoint/... | Named program sections |
| .maps | Map definitions |
| .rodata | Read-only data (const globals) |
| .data | Read-write global data |
| .bss | Zero-initialized data |
| .BTF | Type information |
| .BTF.ext | Extended BTF (line info, CO-RE relocations) |
| .rel* | Relocations for the corresponding section |
Stage 3: Loading with libbpf
libbpf is the canonical library for loading eBPF programs. It handles parsing the ELF object, applying CO-RE relocations, creating maps, loading and verifying programs through the bpf() syscall, and attaching programs to their hooks. Modern libbpf development uses skeleton generation for type-safe access:
```bash
# 1. Compile BPF program
clang -O2 -g -target bpf -c program.bpf.c -o program.bpf.o

# 2. Generate skeleton header
bpftool gen skeleton program.bpf.o > program.skel.h

# The skeleton provides:
# - struct program_bpf:       holds all BPF objects
# - program_bpf__open():      parse ELF, prepare for loading
# - program_bpf__load():      create maps, load programs
# - program_bpf__attach():    attach to hooks
# - program_bpf__destroy():   cleanup
```

When the verifier rejects your program, set LIBBPF_LOG_LEVEL=debug or use libbpf_set_print() to see the full verifier output. Common issues include: unbounded loops (add explicit bounds), uninitialized register access (check all paths initialize variables), and out-of-bounds memory access (add explicit bounds checks before access).
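To make the open/load/attach flow concrete, a minimal user-space loader built on the generated skeleton might look like the sketch below. The program_bpf names come from the skeleton generated above; error handling is trimmed to the essentials.

```c
// Hedged sketch of a skeleton-based loader (assumes program.skel.h from above)
#include <stdio.h>
#include <unistd.h>
#include "program.skel.h"

int main(void)
{
    struct program_bpf *skel;
    int err;

    skel = program_bpf__open();        // Parse ELF, prepare objects in memory
    if (!skel)
        return 1;

    err = program_bpf__load(skel);     // Create maps, verify and load programs
    if (err)
        goto cleanup;

    err = program_bpf__attach(skel);   // Attach every program to its hook
    if (err)
        goto cleanup;

    printf("Programs attached; press Ctrl-C to exit\n");
    pause();                           // Keep references alive while tracing

cleanup:
    program_bpf__destroy(skel);        // Detach, unload, free everything
    return err != 0;
}
```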
After loading and verification, eBPF programs must be attached to their execution points. The attachment mechanism differs by program type, and understanding the lifecycle is crucial for reliable operation.
Attachment Methods by Program Type
| Program Type | Attachment Method | Kernel Interface |
|---|---|---|
| kprobe/kretprobe | perf_event or link | kprobe_register / perf_event_open |
| tracepoint | perf_event or link | perf_event_open with tracepoint ID |
| raw_tracepoint | bpf(BPF_RAW_TRACEPOINT_OPEN) | Direct syscall |
| fentry/fexit | bpf_link | bpf(BPF_LINK_CREATE) |
| XDP | netlink or bpf_link | IFLA_XDP or bpf(BPF_LINK_CREATE) |
| TC (sched_cls) | netlink (tc) | RTM_NEWTFILTER |
| cgroup/* | bpf(BPF_PROG_ATTACH) or link | File descriptor to cgroup dir |
| LSM | bpf_link | bpf(BPF_LINK_CREATE) |
| socket filter | setsockopt(SO_ATTACH_BPF) | Socket file descriptor |
BPF Links: The Modern Attachment API
Traditional attachment methods had issues: ownership of the attachment was unclear, some attachments silently outlived (or died with) the loading process, and replacing a program meant a detach/attach window with nothing running.
BPF links (introduced in kernel 5.7) solve these problems: a link is a kernel object, represented by a file descriptor, that owns the attachment. It is detached automatically when its last reference goes away, it can be pinned to bpffs to outlive the loader, and the attached program can be swapped atomically with bpf_link_update().
```c
// ============================================
// KPROBE ATTACHMENT
// ============================================
// Using libbpf skeleton (handles attachment automatically)
struct program_bpf *skel = program_bpf__open_and_load();
struct bpf_link *link = bpf_program__attach(skel->progs.trace_openat);
if (!link) {
    fprintf(stderr, "Failed to attach kprobe\n");
}

// Manual kprobe attachment with explicit function
struct bpf_link *kprobe_link = bpf_program__attach_kprobe(
    skel->progs.my_kprobe,
    false,        // retprobe? false = entry, true = return
    "vfs_read");  // function name

// ============================================
// XDP ATTACHMENT
// ============================================
int ifindex = if_nametoindex("eth0");

// Attach with flags
// XDP_FLAGS_DRV_MODE:   Native driver mode (best performance)
// XDP_FLAGS_SKB_MODE:   Generic mode (works everywhere)
// XDP_FLAGS_HW_OFFLOAD: Hardware offload (NIC executes BPF)
int err = bpf_xdp_attach(
    ifindex,
    bpf_program__fd(skel->progs.xdp_prog),
    XDP_FLAGS_DRV_MODE,
    NULL);

// Or using bpf_link for better lifecycle management
struct bpf_link *xdp_link = bpf_program__attach_xdp(
    skel->progs.xdp_prog, ifindex);

// ============================================
// CGROUP ATTACHMENT
// ============================================
int cgroup_fd = open("/sys/fs/cgroup/my_cgroup", O_RDONLY);

struct bpf_link *cgroup_link = bpf_program__attach_cgroup(
    skel->progs.cgroup_skb_prog, cgroup_fd);

// ============================================
// LINK PINNING FOR PERSISTENCE
// ============================================
// Pin link to bpffs - survives process exit
err = bpf_link__pin(link, "/sys/fs/bpf/my_link");

// Later, from another process, reopen the link
struct bpf_link *reopened = bpf_link__open("/sys/fs/bpf/my_link");

// Atomic program replacement
err = bpf_link__update_program(reopened, new_skel->progs.updated_prog);

// Cleanup
bpf_link__unpin(link);
bpf_link__destroy(link);
```

Program Lifecycle States
┌──────────┐ ┌────────┐ ┌──────────┐ ┌──────────────┐
│ Compiled │────▶│ Loaded │────▶│ Attached │────▶│ Detached/ │
│ (.bpf.o) │ │ │ │ │ │ Destroyed │
└──────────┘ └────────┘ └──────────┘ └──────────────┘
│ │ ▲
│ │ │
│ └───────────────────┘
│ (unpin/destroy link)
│
└─────────────────────────────────▶
(close FD without attachment = destroyed)
Key Lifecycle Points:
BPF programs and maps are reference-counted. They're destroyed when the last reference is closed. References include: open file descriptors, pinned paths on bpffs, active attachments (links), and maps referencing programs (prog_array for tail calls). To keep programs alive after your process exits, pin them to bpffs or use systemd to manage the loader process.
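For example, a program can be loaded and pinned with bpftool so that it survives the shell that loaded it; the paths and program name below are illustrative and assume the trace_openat example from earlier.

```bash
# Load all programs from the object and pin them under a bpffs directory
# (loading pins the programs but does not attach them)
bpftool prog loadall program.bpf.o /sys/fs/bpf/myapp

# The programs stay loaded even though bpftool has exited
bpftool prog show pinned /sys/fs/bpf/myapp/trace_openat

# Removing the pin drops that reference; with no other references,
# the program is unloaded
rm /sys/fs/bpf/myapp/trace_openat
```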
eBPF programs have strict size limits (1 million verified instructions), and the verifier limits complexity to prevent infinite loops. Tail calls provide a mechanism to work around these limits by chaining multiple programs together.
What is a Tail Call?
A tail call transfers execution from one eBPF program to another, replacing the current program entirely (like execve() for processes). The key properties are that execution never returns to the caller, the target program must have the same program type as the caller, the target is selected at runtime from a BPF_MAP_TYPE_PROG_ARRAY map, and chains are limited to 33 calls.
```c
// ============================================
// TAIL CALL MAP (prog_array)
// ============================================
// Holds file descriptors to BPF programs, indexed by u32
struct {
    __uint(type, BPF_MAP_TYPE_PROG_ARRAY);
    __uint(max_entries, 8);
    __type(key, u32);
    __type(value, u32);  // Actually holds prog_fd, managed by loader
} jump_table SEC(".maps");

// Program indices (used as keys in the jump table)
#define PROG_PARSER 0
#define PROG_TCP    1
#define PROG_UDP    2
#define PROG_ICMP   3

// ============================================
// DISPATCHER PROGRAM (Entry Point)
// ============================================
SEC("xdp")
int xdp_dispatcher(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    struct ethhdr *eth = data;
    if ((void *)(eth + 1) > data_end)
        return XDP_DROP;

    // Tail call to parser program
    bpf_tail_call(ctx, &jump_table, PROG_PARSER);

    // If tail call fails (program not in map), continue here
    return XDP_PASS;
}

// ============================================
// PARSER PROGRAM
// ============================================
SEC("xdp")
int xdp_parser(struct xdp_md *ctx) {
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;

    struct ethhdr *eth = data;
    struct iphdr *ip = (void *)(eth + 1);
    if ((void *)(ip + 1) > data_end)
        return XDP_DROP;

    // Dispatch to protocol-specific handler
    switch (ip->protocol) {
    case IPPROTO_TCP:
        bpf_tail_call(ctx, &jump_table, PROG_TCP);
        break;
    case IPPROTO_UDP:
        bpf_tail_call(ctx, &jump_table, PROG_UDP);
        break;
    case IPPROTO_ICMP:
        bpf_tail_call(ctx, &jump_table, PROG_ICMP);
        break;
    }

    // Unknown protocol or tail call failed
    return XDP_PASS;
}

// ============================================
// PROTOCOL-SPECIFIC HANDLERS
// ============================================
SEC("xdp")
int xdp_tcp_handler(struct xdp_md *ctx) {
    // TCP-specific processing
    bpf_printk("Processing TCP packet\n");
    return XDP_PASS;
}

SEC("xdp")
int xdp_udp_handler(struct xdp_md *ctx) {
    // UDP-specific processing
    bpf_printk("Processing UDP packet\n");
    return XDP_PASS;
}

// ============================================
// USER-SPACE: Loading programs into jump table
// ============================================
/*
// After loading all programs:
int parser_fd = bpf_program__fd(skel->progs.xdp_parser);
int tcp_fd = bpf_program__fd(skel->progs.xdp_tcp_handler);
int udp_fd = bpf_program__fd(skel->progs.xdp_udp_handler);

int jump_table_fd = bpf_map__fd(skel->maps.jump_table);

// Populate the jump table
u32 key;
key = PROG_PARSER;
bpf_map_update_elem(jump_table_fd, &key, &parser_fd, BPF_ANY);

key = PROG_TCP;
bpf_map_update_elem(jump_table_fd, &key, &tcp_fd, BPF_ANY);

key = PROG_UDP;
bpf_map_update_elem(jump_table_fd, &key, &udp_fd, BPF_ANY);
*/
```

Tail calls have limitations: max 33 in a chain (prevents infinite loops), all programs must have the same type, and tail-called programs don't inherit verifier state (they may need to re-validate pointers). For function reuse within a single program, prefer BPF-to-BPF function calls (static functions with __always_inline, or real BPF function calls on kernels 4.16+), which share verifier state.
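For contrast, here is a hedged sketch of a BPF-to-BPF call, assuming a 4.16+ kernel and the same includes as the earlier examples (plus bpf_endian.h): parse_port is a real function the verifier analyzes at each call site, and execution returns to the caller, unlike a tail call.

```c
// Assumes #include <bpf/bpf_endian.h> for bpf_ntohs()

// __noinline forces an actual BPF-to-BPF call instead of inlining
static __noinline int parse_port(void *data, void *data_end, u16 *dport)
{
    struct ethhdr *eth = data;
    struct iphdr *ip = (void *)(eth + 1);
    struct tcphdr *tcp;

    // Bounds check before reading the IP header
    if ((void *)(ip + 1) > data_end || ip->protocol != IPPROTO_TCP)
        return -1;

    // Bounds check again after the variable-length IP header
    tcp = (void *)ip + ip->ihl * 4;
    if ((void *)(tcp + 1) > data_end)
        return -1;

    *dport = bpf_ntohs(tcp->dest);
    return 0;
}

SEC("xdp")
int xdp_port_logger(struct xdp_md *ctx)
{
    void *data_end = (void *)(long)ctx->data_end;
    void *data = (void *)(long)ctx->data;
    u16 dport = 0;

    // Unlike bpf_tail_call(), execution returns here after the call
    if (parse_port(data, data_end, &dport) == 0)
        bpf_printk("TCP dport: %d\n", dport);

    return XDP_PASS;
}
```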
eBPF programs can use global variables for configuration and shared state. The kernel and libbpf provide mechanisms to initialize these variables from user space and even modify them at runtime.
Types of Global Variables
| Type | Section | Modifiable at Runtime | Use Case |
|---|---|---|---|
| const volatile | .rodata | No (before load only) | Configuration, feature flags |
| Regular global | .data | Yes (via map) | Runtime-modifiable state |
| Static | .bss | Yes (via map) | Zero-initialized state |
Key Insight: Global variables are implemented as implicit BPF maps. libbpf automatically creates array maps for .rodata, .data, and .bss sections, allowing user space to read and (for .data/.bss) modify them.
```c
// ============================================
// CONSTANT CONFIGURATION (set before load)
// ============================================
// 'const volatile' tells the compiler:
// - const: Value doesn't change during program execution
// - volatile: Don't optimize away reads (value set externally)

const volatile u32 filter_pid = 0;       // Target PID to trace
const volatile bool debug_mode = false;  // Enable debug output
const volatile u64 sample_rate = 100;    // Sample 1 in N events

// ============================================
// MUTABLE GLOBAL STATE
// ============================================
// Regular globals can be modified at runtime via the .data map
u64 event_count = 0;
u32 last_pid = 0;

// ============================================
// USING GLOBALS IN BPF PROGRAMS
// ============================================
SEC("tracepoint/syscalls/sys_enter_openat")
int trace_openat(void *ctx) {
    u32 pid = bpf_get_current_pid_tgid() >> 32;

    // Read-only config: filter by PID if set
    if (filter_pid && pid != filter_pid)
        return 0;

    // Increment global counter
    __sync_fetch_and_add(&event_count, 1);
    last_pid = pid;

    // Conditional debug output
    if (debug_mode)
        bpf_printk("openat from PID %d, total events: %llu\n",
                   pid, event_count);

    // Sampling
    if (event_count % sample_rate != 0)
        return 0;

    // Process sampled event...

    return 0;
}
```

Use const volatile for configuration that doesn't change after loading. The verifier can sometimes use known values of const volatile variables to eliminate dead code paths. For example, if filter_pid is set to 0, the verifier knows the condition if (filter_pid && ...) is always false, potentially eliminating that code path entirely.
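On the user-space side, the skeleton exposes these sections directly. A sketch, assuming the program_bpf skeleton type used earlier (note that event_count is zero-initialized, so libbpf places it in the .bss map):

```c
// Hedged sketch: configuring globals from user space via the skeleton
#include <stdio.h>
#include "program.skel.h"   // hypothetical skeleton from "bpftool gen skeleton"

static int configure_globals(void)
{
    struct program_bpf *skel = program_bpf__open();
    if (!skel)
        return -1;

    // .rodata (const volatile): writable only between open() and load();
    // frozen once the object is loaded
    skel->rodata->filter_pid = 1234;
    skel->rodata->sample_rate = 10;

    if (program_bpf__load(skel) || program_bpf__attach(skel)) {
        program_bpf__destroy(skel);
        return -1;
    }

    // .data/.bss globals stay readable and writable while programs run,
    // backed by memory-mapped array maps
    printf("events so far: %llu\n",
           (unsigned long long)skel->bss->event_count);

    program_bpf__destroy(skel);
    return 0;
}
```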
We've covered the complete anatomy of eBPF programs: program structure and SEC() annotations, helper functions, maps, the compile-load-verify-attach pipeline, program lifecycle and pinning, tail calls, and global variables.
What's Next:
Now that you understand how eBPF programs are structured, loaded, and attached, the next page explores tracing and observability—one of the most powerful applications of eBPF. You'll learn how to instrument the kernel, capture performance data, trace system calls, and build the foundation for tools like bpftrace, Falco, and production observability platforms.