Every access control mechanism we've examined so far shares a common approach: permissions are stored with the object. The file knows who can access it; the kernel checks this list at access time.
But there's another way. What if instead of objects knowing their permissions, subjects possessed unforgeable tokens that granted access? You don't ask "does Alice have permission to read this file?" but rather "does Alice possess a valid capability for reading this file?"
This is capability-based security—a fundamentally different access control paradigm that influences everything from file descriptors in Unix to modern container isolation and the least-privilege designs in production systems.
By the end of this page, you will understand: the conceptual difference between ACLs and capabilities, how capabilities represent the row-major view of the access matrix, why file descriptors are capabilities (partially), capability properties—unforgeability and confinement, how Linux capabilities divide root privileges, and the tradeoffs between capability and ACL models.
A capability is an unforgeable token that both identifies an object and grants specific rights to that object. Think of it as a key that simultaneously specifies which lock it opens and what actions it permits.
Real-world analogy:
Consider a concert ticket. It:
You don't show ID at the door and ask "do I have permission to enter?" You present the ticket—a capability that proves you have the right.
Contrast with ACLs:
| Aspect | ACL Model | Capability Model |
|---|---|---|
| Where permissions live | With the object | With the subject |
| Access matrix view | Column-major (per object) | Row-major (per subject) |
| "Can Alice read X?" | Check X's ACL for Alice | Check Alice's capabilities for X |
| "Who can access X?" | Easy (read X's ACL) | Hard (check all subjects) |
| "What can Alice access?" | Hard (check all objects) | Easy (list Alice's capabilities) |
| Granting access | Modify object's permission list | Transfer a capability to subject |
| Revoking access | Modify object's permission list | Complex (must revoke capability) |
| Name implies access? | No (name and check are separate) | Yes (capability IS the name) |
In the ACL model, you first NAME an object (pathname) and then ACCESS is checked separately. This separation enables confused deputy attacks. In capability systems, naming and authority are unified: you cannot even refer to an object without holding a capability that grants access. No capability, no reference.
Recall the access control matrix from Page 1. It shows subjects (rows) and objects (columns), with permissions in cells. There are two ways to implement this matrix:
Column-major (ACLs): Store each column with its object. The file /home/alice/secret.txt stores a list: [(alice, rw), (bob, r), (auditors, r)].
Row-major (Capabilities): Store each row with its subject. Alice holds a list of capabilities: [(secret.txt, rw), (project/, rx), (/bin/ls, x)].
```
Access Control Matrix
═══════════════════════

              Objects →
            ┌─────────────┬─────────────┬─────────────┐
 Subjects   │ /etc/passwd │ alice/secret│ /var/log    │
    ↓       ├─────────────┼─────────────┼─────────────┤
  alice     │      r      │     rw      │      -      │
  bob       │      r      │      r      │      r      │
  root      │     rw      │     rw      │     rw      │
            └─────────────┴─────────────┴─────────────┘

Column-Major (ACL) Storage:
───────────────────────────
/etc/passwd  stores: [(alice,r), (bob,r), (root,rw)]
alice/secret stores: [(alice,rw), (bob,r), (root,rw)]
/var/log     stores: [(bob,r), (root,rw)]

"Each OBJECT knows its permitted SUBJECTS"

Row-Major (Capability) Storage:
───────────────────────────────
alice holds caps: [(/etc/passwd,r), (alice/secret,rw)]
bob   holds caps: [(/etc/passwd,r), (alice/secret,r), (/var/log,r)]
root  holds caps: [(/etc/passwd,rw), (alice/secret,rw), (/var/log,rw)]

"Each SUBJECT knows its accessible OBJECTS"
```

Implications of each view:
ACLs make it easy to answer: "Who can access this file?" (Just read the ACL.) But hard to answer: "What can this user access?" (Must scan all ACLs.)
Capabilities invert this: easy to answer "What can this user do?" (List their capabilities.) But hard to answer: "Who has access to this file?" (Must check all users' capability lists.)
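The two views can be made concrete with a small C sketch over the example matrix. The names `matrix`, `acl_for_object`, and `caps_for_subject`, and the bit encoding of rights, are illustrative assumptions rather than any real API. The point is only that an ACL lookup scans a column while a capability-list lookup scans a row.

```c
#include <stdio.h>

/* Rights as bit flags (illustrative encoding, not from any real API). */
enum { R = 1, W = 2 };

#define NSUBJ 3
#define NOBJ  3

static const char *subjects[NSUBJ] = { "alice", "bob", "root" };
static const char *objects[NOBJ]   = { "/etc/passwd", "alice/secret", "/var/log" };

/* The example access matrix: rows are subjects, columns are objects. */
static const int matrix[NSUBJ][NOBJ] = {
    /* alice */ { R,     R | W, 0     },
    /* bob   */ { R,     R,     R     },
    /* root  */ { R | W, R | W, R | W },
};

/* Column view (ACL): print every subject holding some right on one object.
   Returns the number of ACL entries. */
int acl_for_object(int obj) {
    int entries = 0;
    for (int s = 0; s < NSUBJ; s++) {
        int rights = matrix[s][obj];
        if (rights != 0) {
            printf("%s allows (%s,%s%s)\n", objects[obj], subjects[s],
                   (rights & R) ? "r" : "", (rights & W) ? "w" : "");
            entries++;
        }
    }
    return entries;
}

/* Row view (capability list): print every object one subject can reach.
   Returns the number of capabilities held. */
int caps_for_subject(int subj) {
    int held = 0;
    for (int o = 0; o < NOBJ; o++) {
        int rights = matrix[subj][o];
        if (rights != 0) {
            printf("%s holds (%s,%s%s)\n", subjects[subj], objects[o],
                   (rights & R) ? "r" : "", (rights & W) ? "w" : "");
            held++;
        }
    }
    return held;
}
```

Calling `acl_for_object(2)` answers "who can access /var/log?" in one column scan, while `caps_for_subject(0)` answers "what can alice access?" in one row scan. A real system stores only one of the two views, which is why the other question becomes expensive.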
Why this matters:
Security audits and access reviews often need both questions answered. Pure ACL systems struggle with per-user audits; pure capability systems struggle with per-object audits. Most real systems are hybrids.
ACL revocation is simple: edit the ACL. Capability revocation is hard: you must somehow revoke or invalidate a token that the subject already possesses. Solutions include: (1) indirection through a revocable reference, (2) expiring capabilities, (3) capability lists maintained by the kernel (not truly held by subjects). This is why hybrid approaches dominate.
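Solution (1), indirection through a revocable reference, can be sketched in C as follows. The types and functions (`struct revocable_cap`, `cap_init`, `cap_write`, `cap_read`, `cap_revoke`) are hypothetical names chosen for illustration. In a real system the forwarder would live in the kernel or another protection domain, because in a single C address space nothing stops the holder from dereferencing the pointer directly.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The protected object. */
struct document {
    char text[64];
};

/* A revocable forwarder: the issuer keeps control of this struct and hands
   out a pointer to it, never a pointer to the document itself. */
struct revocable_cap {
    struct document *target;  /* NULL once revoked */
    bool can_write;
};

/* Issue a capability for doc, optionally including write authority. */
void cap_init(struct revocable_cap *cap, struct document *doc, bool can_write) {
    cap->target = doc;
    cap->can_write = can_write;
}

/* Write through the capability. Returns 0 on success,
   -1 if the capability was revoked or lacks write authority. */
int cap_write(struct revocable_cap *cap, const char *text) {
    if (cap->target == NULL || !cap->can_write) {
        return -1;
    }
    strncpy(cap->target->text, text, sizeof(cap->target->text) - 1);
    cap->target->text[sizeof(cap->target->text) - 1] = '\0';
    return 0;
}

/* Read through the capability. Returns NULL if it was revoked. */
const char *cap_read(const struct revocable_cap *cap) {
    return cap->target != NULL ? cap->target->text : NULL;
}

/* Revocation is a single store by the issuer; every holder of the
   forwarder loses access at once. */
void cap_revoke(struct revocable_cap *cap) {
    cap->target = NULL;
}
```

The design trades one extra pointer hop per access for the ability to cut off all holders with one assignment, without having to find them.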
If you've programmed in C or any Unix-like environment, you've already used capabilities. File descriptors are capabilities—or at least, they embody capability principles.
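When you call open("/path/to/file", O_RDONLY), the kernel resolves the pathname, checks the file's permissions against the process's credentials, records the requested access mode in a new open file description, and returns a small integer (the file descriptor) that refers to it.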
Subsequent read() and write() calls don't re-check the pathname's permissions. They use the file descriptor—a capability that proves you passed the access check.
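The rights attached to a descriptor can be observed directly. The sketch below uses the standard `fcntl(F_GETFL)` call to read back the access mode recorded at open time and shows that a write through a read-only descriptor is rejected by the descriptor itself. The helper names `fd_access_mode` and `try_write` are made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Returns the access mode recorded when fd was opened:
   O_RDONLY, O_WRONLY, or O_RDWR. Returns -1 if fd is not open. */
int fd_access_mode(int fd) {
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) {
        return -1;
    }
    return flags & O_ACCMODE;
}

/* Tries to write one byte through fd.
   Returns 0 on success, otherwise the errno value from write(). */
int try_write(int fd) {
    char byte = 'x';
    if (write(fd, &byte, 1) == 1) {
        return 0;
    }
    return errno;
}
```

On a descriptor that was not opened for writing, `try_write` is expected to return `EBADF`: the kernel consults the rights stored with the descriptor and never looks at a pathname or a permission list.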
```c
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    // ACL check happens here: the ambient authority check
    int fd = open("/etc/passwd", O_RDONLY);
    if (fd < 0) {
        return 1;  // Permission denied or file not found
    }

    // fd is now a capability: unforgeable proof of read access
    char buffer[1024];

    // These calls use the capability only; no ACL check
    ssize_t bytes = read(fd, buffer, sizeof(buffer));
    (void)bytes;

    // Even if someone changes /etc/passwd permissions right now,
    // this process keeps its access through the fd.

    // fd can be inherited by child processes (capability passing)
    pid_t pid = fork();
    if (pid == 0) {
        // Child also has the fd capability
        read(fd, buffer, sizeof(buffer));  // Works!
        _exit(0);
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);  // Reap the child
    }

    // fd can be passed to unrelated processes via Unix sockets
    // (SCM_RIGHTS mechanism): capability transfer

    close(fd);  // Drop the capability in this process
    return 0;
}
```

File descriptor as capability checklist:
✅ Designates an object: The fd refers to a specific open file (the kernel maintains the mapping)
✅ Grants authority: The fd specifies read, write, or both (depending on open flags)
✅ Unforgeable: Processes cannot create arbitrary fds; the kernel assigns them
✅ Transferable: Can be inherited (fork) or sent (SCM_RIGHTS)
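The "transferable" property can be illustrated with the standard `sendmsg`/`recvmsg` ancillary-data interface. The following is a minimal sketch; the helper names `send_fd` and `recv_fd` are hypothetical, and error handling is reduced to a -1 return. It assumes `sock` is a connected Unix-domain socket.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send fd_to_send over the Unix-domain socket sock. Returns 0 or -1. */
int send_fd(int sock, int fd_to_send) {
    char dummy = 'F';  /* at least one byte of ordinary data must travel along */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };

    union {  /* union guarantees correct alignment for the control buffer */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    /* One control message of type SCM_RIGHTS carrying a single descriptor. */
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock. Returns the new fd or -1. */
int recv_fd(int sock) {
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };

    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    memset(&ctrl, 0, sizeof(ctrl));

    struct msghdr msg;
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) <= 0) {
        return -1;
    }

    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }

    int fd;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver gets a new descriptor number that refers to the same open file as the sender's, with the same access mode. The receiver never needs permission to open the underlying path; the sender has delegated its capability.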
But Unix isn't a pure capability system:
Because Unix separates naming (pathname) from authority (UID check), a privileged program can be tricked into accessing the wrong file. Example: A setuid compiler accepts an output path from the user. If the user specifies /etc/passwd, the compiler—running as root—overwrites it. The compiler (the 'deputy') is 'confused' about whose authority to use. Pure capability systems avoid this: the user would have to pass a capability to /etc/passwd, which they don't have.
Confusingly, Linux has a feature called "capabilities" that is not the same as capability-based security. Linux capabilities divide the monolithic root privilege into discrete, assignable units.
Traditionally, you're either root (UID 0, can do anything) or you're not. This violates least privilege—a program needing only to bind port 80 gets full system access if run as root.
Linux capabilities split root into ~40 discrete privileges:
| Capability | Grants Permission To | Example Use Case |
|---|---|---|
| CAP_NET_BIND_SERVICE | Bind to ports < 1024 | Web server on port 80 |
| CAP_NET_ADMIN | Configure network interfaces | VPN software, network tools |
| CAP_NET_RAW | Use raw sockets | Ping, network diagnostics |
| CAP_SYS_ADMIN | Many admin operations | Mount, sethostname, etc. |
| CAP_DAC_OVERRIDE | Bypass file permission checks | Backup software |
| CAP_CHOWN | Change file ownership arbitrarily | Archive extraction |
| CAP_KILL | Send signals to any process | Process managers |
| CAP_SETUID | Set UID arbitrarily | Login services, su/sudo |
| CAP_SYS_PTRACE | Trace any process (debugging) | Debuggers, strace |
| CAP_SYS_TIME | Set system time | NTP daemon |
```shell
# View capabilities of a running process
$ cat /proc/$$/status | grep Cap
CapInh: 0000000000000000   # Inheritable (passed to children)
CapPrm: 0000000000000000   # Permitted (max available)
CapEff: 0000000000000000   # Effective (currently active)
CapBnd: 000001ffffffffff   # Bounding set (absolute limit)
CapAmb: 0000000000000000   # Ambient (inherited across exec)

# Decode capability bits
$ capsh --decode=000001ffffffffff
0x000001ffffffffff=cap_chown,cap_dac_override,...

# View capabilities of a file (file capabilities)
$ getcap /usr/bin/ping
/usr/bin/ping cap_net_raw=ep

# e = effective (use when executed)
# p = permitted (allowed to use)
# i = inheritable (passed to child programs)

# Set file capabilities (requires CAP_SETFCAP)
$ sudo setcap cap_net_bind_service=+ep /usr/bin/my-server

# Now my-server can bind to port 80 without running as root!

# Run a program with specific capabilities only
$ capsh --drop=all --caps="cap_net_bind_service+eip" -- -c "/my-server"

# Show the Inheritable/Ambient/Bounding (IAB) summary for the current shell
$ capsh --print | grep "Current IAB"
```

Capability sets:
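Each process has multiple capability sets: inheritable (CapInh), permitted (CapPrm), effective (CapEff), bounding (CapBnd), and ambient (CapAmb), matching the five masks reported in /proc/<pid>/status.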
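Each of these sets is a bitmask with one bit per capability, which is what the hex values in /proc/<pid>/status encode. The sketch below decodes such a mask by hand. The `CAPBIT_*` constants repeat bit numbers from `<linux/capability.h>` so that the example compiles without kernel headers, and the function names are illustrative.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Bit numbers as defined in <linux/capability.h>, repeated here so the
   sketch does not depend on kernel headers. */
enum {
    CAPBIT_CHOWN            = 0,
    CAPBIT_NET_BIND_SERVICE = 10,
    CAPBIT_NET_RAW          = 13,
    CAPBIT_SYS_ADMIN        = 21,
};

/* Parse a hex mask as printed in /proc/<pid>/status,
   for example "000001ffffffffff". */
uint64_t parse_cap_mask(const char *hex) {
    return (uint64_t)strtoull(hex, NULL, 16);
}

/* True if capability number cap_bit (0..63) is present in mask. */
bool cap_in_mask(uint64_t mask, int cap_bit) {
    return ((mask >> cap_bit) & 1u) != 0;
}

/* Number of capabilities present in mask. */
int cap_count(uint64_t mask) {
    int count = 0;
    while (mask != 0) {
        count += (int)(mask & 1u);
        mask >>= 1;
    }
    return count;
}
```

For instance, a file granted only cap_net_raw corresponds to the mask 0x2000 (bit 13), so `cap_in_mask(0x2000, CAPBIT_NET_RAW)` holds while `cap_in_mask(0x2000, CAPBIT_SYS_ADMIN)` does not. In practice `capsh --decode` or libcap does this decoding for you.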
Modern Linux can eliminate many setuid programs:
Before: ping is setuid root (gains full root on execution)
After: ping has cap_net_raw=ep (gains only raw socket capability)
This drastically reduces the attack surface. A vulnerability in ping no longer grants full root—only raw socket access.
Pure capability systems enforce least privilege by construction. A process starts with no capabilities and must receive them from a parent or be explicitly granted them. This is radically different from Unix's ambient authority where any process can attempt to open any file.
The principle of capability discipline:
```typescript
// TypeScript-style pseudocode: compile, writeFile, startProcess and the
// capability helpers are illustrative, not a real API.

// Traditional Unix: ambient authority
function processUserInputAmbient(input: string) {
  // This process can access ANYTHING its UID allows
  // The compiler function might write anywhere!
  let output = compile(input);
  let path = getUserSpecifiedPath();  // User controls this
  writeFile(path, output);            // DANGER: confused deputy
}

// Capability discipline: explicit authority
function processUserInputWithCapability(input: string, outputCapability: WriteCapability) {
  // This process can ONLY write to what it has a capability for
  let output = compile(input);
  // outputCapability was provided by caller - can only write there
  outputCapability.write(output);     // SAFE: uses provided capability

  // Even if malicious, cannot write elsewhere - no capability!
  // writeFile("/etc/passwd", "hacked"); // Would need capability!
}

// The caller provides exactly the capabilities needed:
let sandboxedCompiler = startProcess(compileCode);
let outputCap = createWriteCapability(tempDir);  // Write to temp only
sandboxedCompiler.invoke(untrustedCode, outputCap);
// compileCode cannot access network, cannot read $HOME, etc.
```

Sandboxing through capability restriction:
Modern security measures like seccomp, Capsicum (FreeBSD), and Landlock (Linux) implement capability-style confinement:
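seccomp and Landlock do not hand out capabilities as such; they achieve similar confinement by removing ambient authority. Capsicum goes a step further and treats file descriptors as capabilities with per-descriptor rights.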
Docker and container runtimes drop Linux capabilities by default. A container might run as 'root' (UID 0) but without CAP_SYS_ADMIN, CAP_NET_ADMIN, etc. This is capability-based thinking applied to containerization—minimize privilege even for processes that appear privileged.
While mainstream OSes aren't pure capability systems, capability ideas appear throughout computing, and several research/specialized systems fully embrace the model.
| System | Type | Capability Aspect |
|---|---|---|
| seL4 | Microkernel | All system resources accessed via capabilities; formally verified |
| Capsicum (FreeBSD) | Capability mode | Sandbox mode with no global namespace access |
| Google Fuchsia | Operating system | Objects accessed via capabilities (handles) |
| E Language | Programming language | Object-capabilities; unforgeable object references |
| WebAssembly (WASI) | Runtime | Host-provided capabilities for syscall access |
| File Descriptors | Unix mechanism | Unforgeable references post-open() |
| iOS/Android Permissions | Mobile OS | Apps declare and are granted capabilities (camera, location, etc.) |
| CloudABI | ABI sandbox | POSIX variant using capabilities; no ambient authority |
Capsicum: A deep example:
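FreeBSD's Capsicum allows processes to enter "capability mode", where access to global namespaces (such as opening files by path) is denied and the process can act only through the file descriptors it already holds, each limited to the rights granted on it.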
```c
// Capsicum sandboxing example (FreeBSD)
#include <sys/capsicum.h>
#include <fcntl.h>
#include <unistd.h>

// Application-specific work, defined elsewhere
void process_data(int input_fd, int output_fd);

int main(int argc, char *argv[]) {
    if (argc < 3) {
        return 1;
    }

    // Open necessary files BEFORE entering capability mode
    int input_fd = open(argv[1], O_RDONLY);
    int output_fd = open(argv[2], O_WRONLY | O_CREAT, 0644);
    if (input_fd < 0 || output_fd < 0) {
        return 1;
    }

    // Limit rights on descriptors (attenuation)
    cap_rights_t rights;
    cap_rights_init(&rights, CAP_READ);   // input: read-only
    cap_rights_limit(input_fd, &rights);

    cap_rights_init(&rights, CAP_WRITE);  // output: write-only
    cap_rights_limit(output_fd, &rights);

    // Enter capability mode - lose all ambient authority!
    if (cap_enter() < 0) {
        return 1;
    }

    // Now in sandbox:
    // - open("/etc/passwd", ...) → FAILS (no global namespace)
    // - socket(), connect()      → FAIL (no network capability)
    // - read(input_fd, ...)      → WORKS (have this capability)
    // - write(output_fd, ...)    → WORKS (have this capability)

    // Even if this code has a vulnerability, attacker can only:
    // - Read from input_fd
    // - Write to output_fd
    // Cannot access network, other files, etc.

    process_data(input_fd, output_fd);  // Sandboxed processing
    return 0;
}
```

Even if not using formal capability systems, modern security practice applies the same ideas: explicit grants instead of ambient authority, minimal privilege, sandboxing by default. Understanding capabilities helps you design secure systems even in traditional environments.
Neither model is universally superior. Each has strengths suited to different security requirements:
When to prefer each:
ACLs work well for:
Capabilities work well for:
Unix uses ACLs for acquiring file access (open() checks ACL) but capabilities for using it (fd is a capability). Modern systems layer capability-based sandboxing (seccomp, Landlock) over ACL-based filesystems. Pure models are conceptually clean but hybrid approaches are practical.
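The split between "ACL at open, capability afterwards" can be observed with a short experiment: open a file, strip all of its permission bits, and keep reading through the descriptor. This is a sketch for illustration only; `read_after_chmod` is a made-up helper, and it deliberately leaves the file with mode 000.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Opens path read-only (the permission check happens here), removes every
   permission bit from the file, then reads through the already-open fd.
   Returns the number of bytes read after the chmod, or -1 on error. */
ssize_t read_after_chmod(const char *path, char *buf, size_t len) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    /* From now on the pathname grants nothing to ordinary users. */
    if (chmod(path, 0) != 0) {
        close(fd);
        return -1;
    }

    /* The descriptor still works: read() consults the fd, not the mode bits. */
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

After the call, a fresh `open(path, O_RDONLY)` by an unprivileged user would be expected to fail with `EACCES`, because that request goes back through the name and the permission check. The descriptor obtained earlier was unaffected, which is also why revoking access that has already been handed out is the hard part of the capability side.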
Even without a pure capability OS, you can apply capability principles to improve security:
processFile(fd) is safer than processFile("/path") because the caller controls what resource is accessed.
Use setcap to grant specific privileges instead of making programs setuid root.
```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

// BAD: Ambient authority pattern
void processUserDataByName(const char *username) {
    // Function reaches into filesystem based on username
    char path[256];
    snprintf(path, sizeof(path), "/data/users/%s/profile.json", username);
    int fd = open(path, O_RDONLY);
    // What if username is "../../../etc/passwd"?
    // ... vulnerable to path traversal
    if (fd >= 0) {
        close(fd);
    }
}

// GOOD: Capability pattern
void processUserDataByFd(int profile_fd) {
    // Caller provides the fd; this function never names other files
    // Even if logic is buggy, there is no path here to misuse
    char buffer[4096];
    read(profile_fd, buffer, sizeof(buffer));
    // ...
}

// Caller opens the file with proper validation:
int main(void) {
    // Directory capability for the users directory
    int users_dir_fd = open("/data/users", O_RDONLY | O_DIRECTORY);
    // Placeholder: a name that has already been validated
    const char *validated_username = "alice";
    int fd = openat(users_dir_fd, validated_username, O_RDONLY);
    if (fd >= 0) {
        processUserDataByFd(fd);  // Pass capability, not name
        close(fd);
    }
    close(users_dir_fd);
    return 0;
}
```

The *at() family of syscalls (openat, mkdirat, readlinkat, etc.) enables capability-style programming. Instead of resolving paths in the global namespace, you resolve them relative to a directory fd. Provided the relative path is validated (no leading / and no .. components) or the kernel enforces the restriction (for example openat2() with RESOLVE_BENEATH on Linux), this limits which directories a function can access to those you explicitly provide.
What's next:
The final page of this module explores Permission Models—bringing together ACLs, capabilities, and extensions like Role-Based Access Control (RBAC), Mandatory Access Control (MAC), and attribute-based policies. We'll see how these models compose to form the layered security frameworks of modern operating systems.
You now understand capability-based security—its theoretical basis, its manifestation in file descriptors, Linux capabilities, and modern sandboxing, and how to apply capability thinking even in non-capability systems. This perspective is essential for secure system design.