Sandboxing - Learning Module

Process Sandboxing

Confining Processes to Their Own Universe

In the previous page, we established the conceptual foundations of sandboxing. Now we descend from theory to practice: how do operating systems actually confine processes?

A process, from the operating system's perspective, is an abstraction that bundles code execution with system resource access. Processes have memory mappings, file descriptors, network connections, and permissions. A sandboxed process is one where the operating system systematically restricts this access—creating a confined environment where the process can execute but cannot interact freely with the rest of the system.

Process sandboxing represents the most common and practical form of sandboxing in modern systems. It operates at the kernel level, providing strong guarantees with relatively low overhead. Understanding these mechanisms is essential for anyone building secure systems or trying to understand how browsers, containers, and security-critical applications protect themselves.

What You Will Learn

By the end of this page, you will understand the key operating system primitives for process sandboxing: namespaces, chroot, pivot_root, resource isolation, privilege dropping, and the overall architecture of a sandboxed process. You will be able to design and reason about process-level sandboxing strategies.

The Process as an Attack Surface

Before we can sandbox a process, we must understand what resources a process can access and what attacks become possible through each resource. A process's attack surface consists of every channel through which it can interact with the system or other processes.

The Anatomy of a Process:

A Unix/Linux process consists of several components, each representing potential attack vectors:

Process Components and Attack Vectors
Component	Description	Potential Attack	Sandboxing Goal
Address Space	Virtual memory containing code and data	Memory corruption exploits, ROP	Prevent access to other processes' memory
File Descriptors	Handles to files, sockets, pipes	Read/write sensitive files, exfiltrate data	Restrict to minimal required descriptors
Credentials	UID, GID, supplementary groups	Access files/resources as privileged user	Drop to unprivileged credentials
Capabilities	Fine-grained privilege tokens	Escalate privileges, perform admin operations	Remove all unnecessary capabilities
Namespace Memberships	Views of system resources (PIDs, network)	Interact with other processes, access network	Create isolated namespaces
System Call Interface	Gateway to kernel functionality	Exploit kernel vulnerabilities	Filter to minimal syscall set
Environment Variables	Process configuration data	Inject malicious configuration	Sanitize or restrict environment
Signal Handlers	Asynchronous notification mechanism	Interrupt execution, trigger handlers	Limit signal delivery

The Principle of Least Privilege:

Process sandboxing is the practical application of the principle of least privilege: every process should have only the minimum privileges necessary to perform its function. This minimizes the damage possible if the process is compromised.

Consider a web browser's renderer process. Its function is to:

Receive HTML/CSS/JavaScript from the browser process
Parse and execute this content
Render visual output
Send user input back to the browser process

For this function, the renderer process does not need:

Access to the file system
Direct network access
Ability to spawn new processes
Access to audio, video capture, or clipboard
Knowledge of other processes in the system

A properly sandboxed renderer strips all these unnecessary capabilities, so even if an attacker exploits a vulnerability in the JavaScript engine, they remain confined.

Sandbox from Privilege, Not to Privilege

The correct approach is default-deny. A sandbox launcher starts with the privileges it needs for setup, then strips everything and retains only what the workload requires. Do NOT try to sandbox by denying specific things; you'll miss something. Enumerate what is allowed, not what is forbidden.

File System Isolation

One of the oldest and most fundamental sandboxing techniques is file system isolation—restricting what parts of the file system a process can see and access. On Unix systems, this is achieved through chroot and its more powerful successor, pivot_root.

The chroot System Call:

The chroot(path) system call changes the root directory of the calling process to path. After a chroot, all file path resolution starts from the new root. The process cannot access files outside the new root using normal path traversal.

// Change root directory to /sandbox
if (chroot("/sandbox") != 0) {
    perror("chroot failed");
    exit(1);
}
// Change to the new root
if (chdir("/") != 0) {
    perror("chdir failed");
    exit(1);
}
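
As a minimal sketch of the same idea, the snippet below wraps the sequence in a helper that moves the working directory into the jail before calling chroot(), so the working directory never points outside the new root. The function name enter_jail is hypothetical and not part of any library; the caller needs CAP_SYS_CHROOT for the call to succeed.

```c
#include <stdio.h>
#include <unistd.h>

// Hypothetical helper: enter a chroot jail rooted at `path`.
// Returns 0 on success, -1 on failure (the reason is printed via perror).
int enter_jail(const char *path) {
    // Move into the jail first, so the cwd is already inside the new root
    if (chdir(path) != 0) {
        perror("chdir to jail");
        return -1;
    }
    // Make the current directory the new root (requires CAP_SYS_CHROOT)
    if (chroot(".") != 0) {
        perror("chroot");
        return -1;
    }
    // Re-anchor the working directory at the new root
    if (chdir("/") != 0) {
        perror("chdir to new root");
        return -1;
    }
    return 0;
}
```

A caller would treat any non-zero return as fatal and exit rather than continue running outside the jail.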

Limitations of chroot:

While chroot provides file system isolation, it has significant limitations:

chroot Escape Techniques

•Double chroot escape: chroot() does not change the working directory. A root user inside the jail can chroot() into a subdirectory while the working directory stays outside the new root, then walk upward and re-root there: mkdir("x"); chroot("x"); chdir("../../.."); chroot(".");
•File descriptor escape: If the process holds an open file descriptor for a directory outside the jail (opened before chroot), it can fchdir() to that descriptor and escape.
•Mounting file systems: A root user inside chroot can mount file systems that provide access to the outside world.
•/proc and /sys access: If these special file systems are mounted inside the jail, they provide escape routes.
•Unix sockets: Pre-existing Unix sockets can communicate with processes outside the jail.
•Ptrace: A process with CAP_SYS_PTRACE can attach to and manipulate processes outside the jail.

pivot_root: A Stronger Alternative:

The pivot_root(new_root, put_old) system call is more robust than chroot. Instead of just changing the root reference, it actually moves the old root to a subdirectory and makes the new root the actual system root. This allows the old root to be unmounted entirely, making escape much harder.

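// Error checks omitted for brevity; every call below can fail.
// Setup mount namespace first (required for pivot_root)
unshare(CLONE_NEWNS);
// Make all mounts private so changes do not propagate back to the host
mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL);

// Setup new root: bind-mount it onto itself, because pivot_root
// requires new_root to be a mount point
mount("/sandbox", "/sandbox", NULL, MS_BIND | MS_REC, NULL);

// Pivot to new root (glibc has no pivot_root wrapper, so use syscall();
// needs <sys/syscall.h> and <unistd.h>)
mkdir("/sandbox/old_root", 0755);
syscall(SYS_pivot_root, "/sandbox", "/sandbox/old_root");
chdir("/");

// Unmount old root
umount2("/old_root", MNT_DETACH);
rmdir("/old_root");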

Mount Namespaces Required

pivot_root changes the root mount of the caller's mount namespace. Called in the initial (host) mount namespace, it would change the file system view of every process sharing that namespace. This is why container runtimes create a private mount namespace before setting up their root file system.

Best Practices for File System Isolation:

Robust File System Sandboxing

•Use mount namespaces: Create a private mount namespace before any filesystem manipulation.
•Use pivot_root over chroot: pivot_root allows complete unmounting of the old root.
•Mount minimal file systems: Only mount /proc, /sys, /dev if absolutely necessary, and with restrictions.
•Use read-only mounts: Mount file systems as read-only whenever possible, for example with mount -o ro,remount /.
•Use bind mounts for specific paths: Instead of exposing whole directories, bind-mount only the specific files needed.
•Drop privileges after chroot: chroot() itself requires CAP_SYS_CHROOT, so call it first and drop root immediately afterwards. A process that is still root inside the jail can escape.
•Close inherited file descriptors: Enumerate and close all FDs except stdin/stdout/stderr before sandboxing.
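
The last item, closing inherited descriptors, can be sketched as follows. This is one possible approach, assuming /proc is mounted: it enumerates /proc/self/fd and closes every descriptor at or above a chosen number. The name close_fds_from is hypothetical.

```c
#include <dirent.h>
#include <stdlib.h>
#include <unistd.h>

// Hypothetical helper: close every open descriptor numbered >= first.
// Returns the number of descriptors closed, or -1 if /proc/self/fd
// cannot be read.
int close_fds_from(int first) {
    DIR *dir = opendir("/proc/self/fd");
    if (dir == NULL) return -1;

    int dir_fd = dirfd(dir);   // descriptor used by the directory stream itself
    int closed = 0;
    struct dirent *entry;

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long fd = strtol(entry->d_name, &end, 10);
        // Skip "." and ".." and anything that is not a plain number
        if (end == entry->d_name || *end != '\0') continue;
        // Keep low descriptors and the one we are iterating with
        if (fd < first || fd == dir_fd) continue;
        if (close((int)fd) == 0) closed++;
    }
    closedir(dir);
    return closed;
}
```

Calling close_fds_from(3) just before execve() would keep only stdin, stdout, and stderr.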

Linux Namespaces: Virtualizing System Resources

Linux namespaces are kernel features that partition system resources so that different sets of processes see different views of those resources. Namespaces are the foundation of container technologies and provide powerful, fine-grained isolation.

Available Namespace Types:

Linux provides several namespace types, each isolating a different aspect of the system:

Linux Namespace Types
Namespace	Clone Flag	Introduced	Isolates
Mount	CLONE_NEWNS	Linux 2.4.19 (2002)	Mount points, filesystem view
UTS	CLONE_NEWUTS	Linux 2.6.19 (2006)	Hostname and domain name
IPC	CLONE_NEWIPC	Linux 2.6.19 (2006)	System V IPC, POSIX message queues
PID	CLONE_NEWPID	Linux 2.6.24 (2008)	Process IDs
Network	CLONE_NEWNET	Linux 2.6.29 (2009)	Network devices, stacks, ports
User	CLONE_NEWUSER	Linux 3.8 (2013)	User and group IDs, capabilities
Cgroup	CLONE_NEWCGROUP	Linux 4.6 (2016)	Cgroup root directory
Time	CLONE_NEWTIME	Linux 5.6 (2020)	Boot-time and monotonic clock offsets (CLOCK_BOOTTIME, CLOCK_MONOTONIC)

Creating and Entering Namespaces:

Namespaces can be created and entered using several mechanisms:

// Method 1: clone() - create new process in new namespaces
int flags = CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWNET | SIGCHLD;
pid_t pid = clone(child_func, stack, flags, arg);

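// Method 2: unshare() - move current process to new namespaces
// (exception: with CLONE_NEWPID the caller stays in its old PID namespace;
// only children forked afterwards enter the new one)
unshare(CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWNET);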

// Method 3: setns() - enter existing namespace via fd
int ns_fd = open("/proc/1234/ns/net", O_RDONLY);
setns(ns_fd, CLONE_NEWNET);

PID Namespace Deep Dive:

The PID namespace deserves special attention because it profoundly affects how processes perceive the system. In a new PID namespace:

The first process becomes PID 1 (like init)
PIDs inside the namespace are separate from outside PIDs
Processes inside cannot see or signal processes outside
When PID 1 (namespace init) dies, all processes in the namespace are killed

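pid_t pid = fork();
if (pid == 0) {
    // Child: create new PID namespace (needs CAP_SYS_ADMIN or a user namespace).
    // The caller stays in the old namespace; only its children enter the new one.
    if (unshare(CLONE_NEWPID) != 0) {
        perror("unshare");
        exit(1);
    }

    pid_t inner_pid = fork();
    if (inner_pid == 0) {
        // This process is PID 1 in the new namespace
        printf("My PID: %d\n", getpid());  // Prints: My PID: 1

        // Cannot see other system processes
        // Cannot send signals to processes outside namespace

        // Act as init: reap zombies until no children remain
        int status;
        while (wait(&status) > 0 || errno == EINTR) {
        }
        exit(0);
    }
    // Wait for the namespace's init before exiting
    waitpid(inner_pid, NULL, 0);
    exit(0);
}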

PID Namespace Init Responsibility

PID 1 in a namespace has special responsibilities: it must reap zombie processes and handle signals appropriately. If PID 1 exits or crashes, all processes in the namespace are killed with SIGKILL. Proper PID 1 handling is critical for container stability.
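
The reaping duty can be sketched in isolation. The code below is a simplified illustration, not a complete init: it blocks until every child has been collected and retries when a signal interrupts the wait. The names spawn_exiting_child and reap_all_children are hypothetical.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Fork a child that exits immediately with `code`.
// Returns the child's PID in the parent, or -1 if fork failed.
pid_t spawn_exiting_child(int code) {
    pid_t pid = fork();
    if (pid == 0) _exit(code);
    return pid;
}

// Block until every child of the caller has been reaped.
// Returns the number of children collected.
int reap_all_children(void) {
    int reaped = 0;
    for (;;) {
        int status;
        pid_t pid = waitpid(-1, &status, 0);
        if (pid > 0) {
            reaped++;              // one zombie collected
            continue;
        }
        if (pid < 0 && errno == EINTR) continue;   // interrupted, retry
        break;                     // ECHILD: no children left
    }
    return reaped;
}
```

A real namespace init would typically run this kind of loop alongside signal forwarding, since orphaned processes in the namespace are re-parented to PID 1.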

Network Namespace Deep Dive:

The network namespace creates a completely isolated network stack:

Separate network interfaces (only loopback initially)
Separate routing tables
Separate iptables/nftables rules
Separate ports (process in ns can bind to port 80 without conflict)

Connecting a network namespace to the outside world requires explicit configuration using virtual ethernet pairs (veth), bridges, or NAT rules.

# Create new network namespace
ip netns add sandbox

# Create veth pair connecting namespace to host
ip link add veth0 type veth peer name veth1
ip link set veth1 netns sandbox

# Configure host side
ip addr add 10.0.0.1/24 dev veth0
ip link set veth0 up

# Configure namespace side (run in namespace)
ip netns exec sandbox ip addr add 10.0.0.2/24 dev veth1
ip netns exec sandbox ip link set veth1 up
ip netns exec sandbox ip link set lo up

# Now processes in 'sandbox' can reach 10.0.0.1

User Namespaces and Unprivileged Sandboxing

User namespaces are perhaps the most powerful namespace type because they enable unprivileged sandboxing. Before user namespaces, creating namespaces and sandboxes required root privileges—which meant you needed privileges to drop privileges. User namespaces solve this paradox.

How User Namespaces Work:

A user namespace provides a separate mapping of user and group IDs. A process can be root (UID 0) inside the namespace while being an ordinary unprivileged user outside:

// Unprivileged user creates user namespace
if (unshare(CLONE_NEWUSER) != 0) {
    perror("unshare");
    exit(1);
}

// The process now holds a full capability set inside the new namespace,
// but is still unprivileged outside it. Until a UID mapping is written,
// getuid() reports the overflow UID rather than 0.
printf("UID inside namespace: %d\n", getuid());  // 65534 (nobody) before mapping

UID/GID Mapping:

User namespaces require explicit UID/GID mappings to be configured. These mappings are written to /proc/[pid]/uid_map and /proc/[pid]/gid_map:

# Format: <id-inside-ns> <id-outside-ns> <range>
# Map UID 0 inside to UID 1000 outside (count=1)
echo "0 1000 1" > /proc/self/uid_map

# Map GID 0 inside to GID 1000 outside
# Note: must write 'deny' to /proc/self/setgroups first
echo "deny" > /proc/self/setgroups
echo "0 1000 1" > /proc/self/gid_map
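
The same mapping can be written from C. The sketch below runs the experiment in a forked child so the caller's own namespaces are untouched, and reports the outcome as an exit code. The function names and result codes are hypothetical; unshare(2) is invoked through syscall() only so the example builds without feature-test macros. Many systems restrict unprivileged user namespaces, so a refusal is an expected outcome rather than an error.

```c
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWUSER
#define CLONE_NEWUSER 0x10000000   // Linux UAPI value, used if <sched.h> hides it
#endif

// Write a short string to an existing file. Returns 0 on success, -1 on failure.
static int write_file(const char *path, const char *text) {
    int fd = open(path, O_WRONLY);
    if (fd < 0) return -1;
    ssize_t n = write(fd, text, strlen(text));
    close(fd);
    return n == (ssize_t)strlen(text) ? 0 : -1;
}

// Result codes (for this sketch only):
//   0 = child became UID 0 inside a new user namespace
//   1 = kernel or policy refused to create the namespace
//   2 = namespace created, but writing the mappings failed
//   3 = mappings written, but getuid() did not report 0
//  -1 = fork or wait failure
int try_become_ns_root(void) {
    uid_t outer_uid = geteuid();   // read before unshare; afterwards they
    gid_t outer_gid = getegid();   // would show the overflow ID
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        char map[64];
        if (syscall(SYS_unshare, CLONE_NEWUSER) != 0) _exit(1);
        snprintf(map, sizeof map, "0 %ld 1", (long)outer_uid);
        if (write_file("/proc/self/uid_map", map) != 0) _exit(2);
        // setgroups must be denied before an unprivileged gid_map write
        if (write_file("/proc/self/setgroups", "deny") != 0) _exit(2);
        snprintf(map, sizeof map, "0 %ld 1", (long)outer_gid);
        if (write_file("/proc/self/gid_map", map) != 0) _exit(2);
        _exit(getuid() == 0 ? 0 : 3);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

A return value of 0 means the child saw itself as UID 0 while the host still treats it as the original unprivileged user.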

Implications:

With user namespaces, an unprivileged user can:

Create all other namespace types (mount, PID, network, etc.)
Have "root" capabilities inside the user namespace
Mount certain filesystem types (proc, sysfs, tmpfs)
Create nested user namespaces with further mappings

However, this "root" is fake root—the kernel still enforces that operations against host resources use the mapped (unprivileged) UID.

User Namespace Capabilities

•Create nested namespaces
•Mount tmpfs, procfs
•Change hostname (UTS ns)
•Bind to low ports inside ns
•Create network config inside ns
•Access files owned by mapped UID

User Namespace Limitations

•Cannot access files owned by unmapped UIDs
•Cannot mount most filesystem types
•Cannot load kernel modules
•Cannot use raw sockets on the host's network stack (requires CAP_NET_RAW in the user namespace that owns it)
•Cannot mknod for most device types
•Capabilities don't extend to init ns

Rootless Containers

User namespaces enable 'rootless containers'—Docker, Podman, and other container runtimes can run without any privileges. The container engine runs as an unprivileged user, creates a user namespace, and inside that namespace creates all other namespaces. This dramatically improves security by eliminating the privileged container runtime.

Linux Capabilities: Fine-Grained Privilege Control

Traditional Unix has a binary privilege model: either you're root (UID 0) with full privileges, or you're not root with limited privileges. This is problematic because many programs need only a single privileged operation (e.g., binding to port 80) but receive all root privileges.

Linux capabilities break root privileges into distinct units that can be granted independently. Instead of giving a process full root access, you grant only the specific capabilities it needs.

Capability Sets:

Each process has several capability sets:

Permitted (P) — Maximum capabilities the process can have
Effective (E) — Capabilities currently in effect (used for permission checks)
Inheritable (I) — Capabilities that can be inherited across execve()
Bounding — Upper limit on capabilities that can be added
Ambient — Capabilities preserved across execve() for non-privileged programs
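
These sets can be inspected without any library, because the kernel prints them as hexadecimal bitmasks in /proc/self/status (fields CapInh, CapPrm, CapEff, CapBnd, CapAmb). The sketch below parses one field and tests a single bit; read_cap_set and cap_in_mask are hypothetical names, and the code assumes /proc is mounted.

```c
#include <stdio.h>
#include <string.h>

// Read one capability set of the calling process from /proc/self/status.
// `field` is one of "CapInh", "CapPrm", "CapEff", "CapBnd", "CapAmb".
// Returns 0 and stores the bitmask in *mask on success, -1 on failure.
int read_cap_set(const char *field, unsigned long long *mask) {
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL) return -1;

    char line[256];
    size_t flen = strlen(field);
    int rc = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        // Matching lines look like "CapEff:<tab>0000000000000000"
        if (strncmp(line, field, flen) == 0 && line[flen] == ':') {
            if (sscanf(line + flen + 1, "%llx", mask) == 1) rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

// Returns 1 if capability number `cap` is set in `mask`, otherwise 0.
int cap_in_mask(unsigned long long mask, int cap) {
    return (int)((mask >> cap) & 1ULL);
}
```

For example, CAP_NET_BIND_SERVICE is capability number 10, so cap_in_mask(eff, 10) reports whether the process may currently bind to ports below 1024.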

Selected Linux Capabilities
Capability	Allows	Sandboxing Notes
CAP_NET_BIND_SERVICE	Bind to ports < 1024	Often the only capability needed by web servers
CAP_NET_RAW	Use raw sockets	Needed for ping, packet capture; dangerous
CAP_SYS_ADMIN	Many privileged operations	The 'new root'; avoid granting
CAP_SYS_PTRACE	Use ptrace()	Allows debugging any process; escape risk
CAP_DAC_OVERRIDE	Bypass file permission checks	Read/write any file; avoid
CAP_CHOWN	Change file ownership	Can chown any file to any user
CAP_SETUID	Set UID	Can become any user; avoid
CAP_SYS_CHROOT	Use chroot()	Needed for chroot; escape risk
CAP_NET_ADMIN	Network configuration	IP config, routing, firewall rules
CAP_KILL	Send signals to any process	Can kill processes of other users

Dropping Capabilities for Sandboxing:

The key sandboxing operation is dropping capabilities. A process should start with necessary capabilities and drop them before handling untrusted input:

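#include <stdio.h>
#include <stdlib.h>
#include <sys/capability.h>   // libcap; link with -lcap
#include <sys/prctl.h>

void drop_capabilities(void) {
    // Shrink the bounding set first: PR_CAPBSET_DROP needs CAP_SETPCAP in the
    // effective set, so it must run before the capability sets are cleared.
    // Skip any capability you intend to keep across execve().
    for (int cap = 0; cap <= CAP_LAST_CAP; cap++) {
        // EINVAL for capabilities unknown to the running kernel is harmless
        prctl(PR_CAPBSET_DROP, cap, 0, 0, 0);
    }

    // Get current capabilities
    cap_t caps = cap_get_proc();
    if (caps == NULL) {
        perror("cap_get_proc");
        exit(1);
    }

    // Clear all capabilities
    cap_clear(caps);

    // Optionally keep specific capabilities
    // cap_value_t keep[] = { CAP_NET_BIND_SERVICE };
    // cap_set_flag(caps, CAP_PERMITTED, 1, keep, CAP_SET);
    // cap_set_flag(caps, CAP_EFFECTIVE, 1, keep, CAP_SET);

    // Apply
    if (cap_set_proc(caps) != 0) {
        perror("cap_set_proc");
        exit(1);
    }
    cap_free(caps);

    // Prevent regaining capabilities through execve() of setuid or
    // file-capability binaries
    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
}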

CAP_SYS_ADMIN: The Dangerous Capability

CAP_SYS_ADMIN has become a catch-all for privileged operations that don't fit elsewhere. It allows mounting filesystems, creating namespaces, many ioctl operations, and more. A process with CAP_SYS_ADMIN has almost as much power as root. Well-designed sandboxes never grant it.

NO_NEW_PRIVS: Sealing the Sandbox:

The PR_SET_NO_NEW_PRIVS prctl prevents a process from gaining new privileges through execve(). Without this, a sandboxed process could execute a setuid binary and escape the sandbox:

// Enable no_new_privs - cannot be disabled once set
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);

// Now execve() of setuid binary runs without privilege elevation
// This prevents sandbox escape via setuid binaries

This flag is inherited by children and cannot be cleared, making it a critical part of the sandbox.
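
Both properties, that the flag cannot be cleared and that children inherit it, can be checked with a short experiment. The sketch below runs in a forked child so the caller itself is not affected; check_no_new_privs and wait_exit_code are hypothetical names, and the numbered exit codes only identify which step failed.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Wait for `pid` and return its exit code, or -1 if it did not exit normally.
static int wait_exit_code(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}

// Returns 0 if no_new_privs behaves as described, the number of the failing
// step otherwise, or -1 if the experiment could not be run.
int check_no_new_privs(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        // Step 1: set the flag
        if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0) _exit(1);
        // Step 2: the flag reads back as 1
        if (prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) != 1) _exit(2);
        // Step 3: an attempt to clear it must be rejected
        if (prctl(PR_SET_NO_NEW_PRIVS, 0, 0, 0, 0) == 0) _exit(3);
        // Step 4: a child forked afterwards inherits the flag
        pid_t grandchild = fork();
        if (grandchild < 0) _exit(5);
        if (grandchild == 0)
            _exit(prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0) == 1 ? 0 : 1);
        _exit(wait_exit_code(grandchild) == 0 ? 0 : 4);
    }
    return wait_exit_code(pid);
}
```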

Ambient Capabilities:

Ambient capabilities (since Linux 4.3) address a usability issue: how to run a non-setuid program with specific privileges. They allow capabilities to be preserved across execve() for programs that don't have file capabilities set, enabling capability-based privilege without setuid.

Credential and Identity Sandboxing

Beyond capabilities, process credentials (UID, GID) provide another layer of isolation. By switching to dedicated sandboxed identities, processes are isolated by traditional Unix permission mechanisms.

Dedicated Sandbox Users:

A common pattern is to create dedicated user accounts for sandboxed services:

# Create sandbox user with restricted shell
useradd --system --shell /usr/sbin/nologin \
        --home-dir /var/empty --no-create-home \
        sandbox_user

# Service starts as root, then drops to sandbox_user

This approach provides:

Separation from other users' files
Audit trail (who did what)
Traditional Unix permission enforcement
Resource accounting via cgroups

Dropping Privileges:

The privilege drop sequence must be performed carefully to avoid race conditions and ensure complete privilege separation:

void drop_to_sandbox_user(uid_t uid, gid_t gid) {
    // 1. Clear supplementary groups first
    if (setgroups(0, NULL) != 0) {
        perror("setgroups");
        exit(1);
    }
    
    // 2. Set GID before UID (can't change GID after dropping root UID)
    if (setresgid(gid, gid, gid) != 0) {
        perror("setresgid");
        exit(1);
    }
    
    // 3. Set UID (this drops root)
    if (setresuid(uid, uid, uid) != 0) {
        perror("setresuid");
        exit(1);
    }
    
    // 4. Verify privilege drop succeeded
    if (getuid() != uid || geteuid() != uid) {
        fprintf(stderr, "UID drop failed\n");
        exit(1);
    }
    if (getgid() != gid || getegid() != gid) {
        fprintf(stderr, "GID drop failed\n");
        exit(1);
    }
}

Order Matters: GID Before UID

Always set GID before UID. Changing GID requires CAP_SETGID, which the process loses when it drops the root UID, so a later setresgid() fails with EPERM. Getting the order wrong leaves the sandbox with the original (often privileged) group.

Nobody User Anti-Pattern:

Historically, services would drop to the "nobody" user (typically UID 65534). This is now considered an anti-pattern:

Multiple services as nobody can access each other's files
No isolation between unrelated services
Audit logs show 'nobody' for all sandboxed operations
Resource accounting is shared

Modern best practice is one dedicated user per service, providing true isolation.

Dynamic User Allocation:

Systemd provides dynamic user allocation via DynamicUser=yes in unit files. At service start, systemd allocates a unique UID/GID from a pool, runs the service as that identity, and reclaims the ID when the service stops. This provides per-service identity isolation without explicit user management.

Resource Limits and Control Groups

Sandboxing isn't just about preventing unauthorized access—it's also about preventing resource abuse. A sandboxed process should not be able to consume unlimited CPU, memory, disk I/O, or network bandwidth, causing denial-of-service for the host system.

Traditional Resource Limits (rlimits):

The setrlimit() system call provides per-process resource limits:

#include <sys/resource.h>

void set_resource_limits() {
    struct rlimit rl;
    
    // Limit address space to 256MB
    rl.rlim_cur = rl.rlim_max = 256 * 1024 * 1024;
    setrlimit(RLIMIT_AS, &rl);
    
    // Limit maximum file size to 10MB
    rl.rlim_cur = rl.rlim_max = 10 * 1024 * 1024;
    setrlimit(RLIMIT_FSIZE, &rl);
    
    // Limit number of open files to 64
    rl.rlim_cur = rl.rlim_max = 64;
    setrlimit(RLIMIT_NOFILE, &rl);
    
    // Limit number of processes to 1 (prevent fork bombs). RLIMIT_NPROC counts
    // every process owned by the real UID and is not enforced for root.
    rl.rlim_cur = rl.rlim_max = 1;
    setrlimit(RLIMIT_NPROC, &rl);
    
    // No core dumps
    rl.rlim_cur = rl.rlim_max = 0;
    setrlimit(RLIMIT_CORE, &rl);
}
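
To see a limit being enforced, the sketch below lowers RLIMIT_NOFILE in a forked child and then opens descriptors until the kernel refuses with EMFILE. The function name check_nofile_limit and its exit codes are hypothetical; the code assumes /dev/null exists and that the requested limit does not exceed the current hard limit.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns 0 if the kernel enforced the descriptor limit, a positive code
// naming the failing step otherwise, or -1 if the experiment could not run.
int check_nofile_limit(rlim_t max_fds) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        struct rlimit rl = { .rlim_cur = max_fds, .rlim_max = max_fds };
        if (setrlimit(RLIMIT_NOFILE, &rl) != 0) _exit(1);

        // Open descriptors until the kernel refuses
        for (;;) {
            int fd = open("/dev/null", O_RDONLY);
            if (fd < 0) _exit(errno == EMFILE ? 0 : 2);   // EMFILE: limit hit
            if ((rlim_t)fd >= max_fds) _exit(3);          // fd beyond the limit
        }
    }
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
    return WEXITSTATUS(status);
}
```

Because both the soft and hard limits are lowered, the child cannot raise them again without CAP_SYS_RESOURCE.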

Control Groups (cgroups):

cgroups provide more sophisticated resource control at the process group level. Unlike rlimits, cgroups can:

Limit aggregate resources for a group of processes
Prioritize resources between groups
Account for resource usage
Freeze/unfreeze process groups

cgroup v2 Controllers:

cgroup v2 Resource Controllers
Controller	Resource	Key Settings
cpu	CPU time	cpu.max (quota period), cpu.weight (priority)
memory	Memory usage	memory.max, memory.high, memory.swap.max
io	Block I/O	io.max (BPS/IOPS limits), io.weight
pids	Process count	pids.max (fork bomb protection)
cpuset	CPU/memory node affinity	cpuset.cpus, cpuset.mems
rdma	RDMA resources	rdma.max (HCA handles, objects)

Setting cgroup Limits:

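# Enable the controllers for child cgroups (may already be enabled; requires root)
echo "+cpu +memory +pids" > /sys/fs/cgroup/cgroup.subtree_control

# Create cgroup for sandbox
mkdir /sys/fs/cgroup/sandbox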

# Set memory limit to 256MB
echo "268435456" > /sys/fs/cgroup/sandbox/memory.max

# Set CPU limit to 50% of one core
echo "50000 100000" > /sys/fs/cgroup/sandbox/cpu.max

# Set max processes to 10
echo "10" > /sys/fs/cgroup/sandbox/pids.max

# Add current process to cgroup
echo $$ > /sys/fs/cgroup/sandbox/cgroup.procs
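
The cpu.max value is a quota and a period in microseconds, so "50% of one core" with a 100000 microsecond period works out to 50 × 100000 / 100 = 50000. The sketch below does that arithmetic and writes a value into a cgroup file. All three function names are hypothetical, and writing requires permission on the cgroup directory.

```c
#include <stdio.h>

// Quota in microseconds for `cpu_percent` of one CPU (50 = half a core,
// 200 = two cores) over `period_us`. Returns -1 for invalid input.
long long cpu_quota_us(unsigned cpu_percent, unsigned period_us) {
    if (cpu_percent == 0 || period_us == 0) return -1;
    return (long long)cpu_percent * period_us / 100;
}

// Format "<quota> <period>" as expected by cgroup v2 cpu.max.
// Returns the string length, or -1 on invalid input or a too-small buffer.
int format_cpu_max(unsigned cpu_percent, unsigned period_us,
                   char *out, size_t out_len) {
    long long quota = cpu_quota_us(cpu_percent, period_us);
    if (quota <= 0) return -1;
    int n = snprintf(out, out_len, "%lld %u", quota, period_us);
    return (n < 0 || (size_t)n >= out_len) ? -1 : n;
}

// Write `value` to <cgroup_dir>/<file>. Returns 0 on success, -1 on failure.
int write_cgroup_value(const char *cgroup_dir, const char *file,
                       const char *value) {
    char path[512];
    int n = snprintf(path, sizeof path, "%s/%s", cgroup_dir, file);
    if (n < 0 || (size_t)n >= sizeof path) return -1;

    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0) ok = 0;   // write errors can surface at flush time
    return ok ? 0 : -1;
}
```

With the cgroup from the shell example, format_cpu_max(50, 100000, buf, sizeof buf) followed by write_cgroup_value("/sys/fs/cgroup/sandbox", "cpu.max", buf) would apply the same limit.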

Memory Limit Behavior:

The memory controller offers several thresholds, each with different enforcement behavior:

memory.max — Hard limit; process allocation fails or OOM killer triggered
memory.high — Throttle point; process is throttled but not killed
memory.low — Best-effort protection; memory not reclaimed if possible
memory.min — Guaranteed minimum; absolute protection from reclaim

cgroup Namespace for Sandbox Privacy

The cgroup namespace (CLONE_NEWCGROUP) allows a sandboxed process to see its cgroup as the root cgroup. This prevents the sandbox from observing host cgroup structure or manipulating host cgroups. Combine with appropriate cgroup placement for defense in depth.

Putting It All Together: A Complete Sandbox

Let's examine how these components combine to create a robust process sandbox. This example demonstrates a layered approach that's conceptually similar to what browsers and containers use.

Sandbox Setup Sequence:

The order of operations matters critically. Here's the typical sequence:

Sandbox Initialization Order

•Create namespaces — unshare(CLONE_NEWUSER | CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWNET | CLONE_NEWIPC | CLONE_NEWUTS)
•Setup namespace contents — UID mappings, mount namespace filesystem, network configuration
•Change root — pivot_root to minimal filesystem, unmount old root
•Set resource limits — Configure cgroups and rlimits
•Drop supplementary groups — setgroups(0, NULL)
•Drop GID — setresgid() to sandbox GID
•Drop UID — setresuid() to sandbox UID
•Drop capabilities — Clear capability sets, drop bounding set
•Set no_new_privs — prctl(PR_SET_NO_NEW_PRIVS, 1)
•Install seccomp filter — Restrict system calls (covered next page)
•Close file descriptors — Close all FDs except those explicitly needed
•Execute sandboxed program — execve() the target program

Conceptual Code Outline:

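void setup_sandbox_and_exec(const char *program, char *const program_args[]) {
    // Phase 1: Namespace creation
    if (unshare(CLONE_NEWUSER | CLONE_NEWNS | CLONE_NEWPID |
                CLONE_NEWNET | CLONE_NEWIPC | CLONE_NEWUTS) != 0) {
        perror("unshare");
        exit(1);
    }

    // Need to fork after CLONE_NEWPID: only the child enters the new PID namespace
    pid_t child = fork();
    if (child < 0) { perror("fork"); exit(1); }
    if (child != 0) exit(0);  // Parent exits, child continues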
    
    // Phase 2: Namespace configuration
    setup_uid_gid_mappings();
    setup_mount_namespace("/sandbox/rootfs");
    setup_network_namespace();
    
    // Phase 3: Privilege restriction
    chdir("/");
    drop_supplementary_groups();
    drop_gid(SANDBOX_GID);
    drop_uid(SANDBOX_UID);
    drop_capabilities();
    prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0);
    
    // Phase 4: System call filtering
    install_seccomp_filter();
    
    // Phase 5: File descriptor cleanup
    close_fds_except(STDIN_FILENO, STDOUT_FILENO, STDERR_FILENO);
    
    // Phase 6: Execute target
    execv(program, program_args);
    _exit(127);  // execv failed
}

Defense in Depth

Each layer provides independent protection. If an attacker bypasses seccomp, they still face namespace isolation. If they escape the namespace, capabilities are still restricted. If they somehow regain capabilities, rlimits still constrain resource use. This layering makes a complete sandbox escape much harder, because an attacker must defeat several independent mechanisms rather than one.

Summary: Process Sandboxing

We have explored the operating system mechanisms for confining processes within controlled environments. Let's consolidate the key insights:

Key Takeaways

•Processes have multiple attack surfaces — Memory, files, network, credentials, capabilities, syscalls all require restriction.
•File system isolation via chroot/pivot_root — Restrict the visible filesystem to a minimal set; prefer pivot_root with mount namespace.
•Linux namespaces virtualize system resources — Mount, PID, network, user, IPC, UTS, cgroup, and time namespaces create isolated views.
•User namespaces enable unprivileged sandboxing — Ordinary users can create sandboxes without root privileges.
•Capabilities provide fine-grained privilege — Drop all capabilities except those strictly required.
•Credentials enforce identity separation — Use dedicated users per sandboxed service; drop groups and UID properly.
•cgroups and rlimits prevent resource abuse — Limit memory, CPU, processes, and I/O to prevent denial-of-service.
•Order of operations matters — Setup sequence must be correct; dropping privileges is irreversible.

What's Next:

Process sandboxing restricts the process's environment, but the process still has access to the system call interface—the gateway to kernel functionality. The next page explores system call filtering: how to restrict not just what resources a process can see, but what operations it can perform.

You now understand the mechanisms for sandboxing processes at the operating system level: namespaces for resource virtualization, chroot/pivot_root for filesystem isolation, capabilities for privilege control, credentials for identity separation, and cgroups for resource limiting. The next page will cover system call filtering with seccomp.