We have established that capabilities are unforgeable tokens encoding access rights. But where do these tokens live? How does a process keep track of the capabilities it holds? How does the operating system organize and protect this critical security information?

The answer lies in Capability Lists, often abbreviated as C-lists—the data structures that store a subject's complete set of capabilities. If the access matrix is the abstract model of all access rights in a system, then a capability list is a concrete implementation of one row of that matrix: all the objects a particular subject can access and the associated rights.

Understanding capability lists takes us from theory to implementation. We will examine how C-lists are structured, where they are stored, how they are protected, and how different systems have approached these design challenges.
By the end of this page, you will understand the structure and organization of capability lists, how they are protected from unauthorized access and modification, different implementation approaches (kernel-managed vs. process-managed), capability addressing modes, and the performance and security trade-offs in C-list design.
A Capability List (C-list) is an ordered collection of capabilities associated with a particular subject (typically a process or protection domain). It represents the complete authority of that subject—everything it can access and how.

Formal Definition

Mathematically, if we have an access matrix A where A[s,o] represents the rights that subject s has to object o, then the capability list for subject s is:

C-list(s) = { (o, A[s,o]) | o is an object and A[s,o] ≠ ∅ }

In other words, the C-list contains all (object, rights) pairs where the subject has non-empty rights.

Key Characteristics
```c
// Conceptual structure of a Capability List
// In real systems, this is managed by the kernel

#define MAX_CAPABILITIES 256  // Typical limit per process

typedef struct {
    bool valid;           // Is this slot occupied?
    uint32_t rights;      // Access rights bitmask
    ObjectRef object;     // Reference to the protected object
    uint64_t generation;  // For revocation checking
    uint32_t flags;       // Additional metadata
} CapabilitySlot;

typedef struct {
    uint32_t owner_pid;        // Process that owns this C-list
    uint32_t num_capabilities; // Current count of valid capabilities
    uint32_t max_capabilities; // Maximum allowed
    CapabilitySlot slots[MAX_CAPABILITIES];

    // Capability list metadata
    bool sealed;               // Can new capabilities be added?
    uint64_t created_time;
    uint64_t last_modified;
} CapabilityList;

// User processes never see CapabilityList directly
// They only hold indices (capability handles)
typedef uint32_t CapHandle; // Actually just an index into the C-list

// When a process wants to access an object:
// 1. It provides a CapHandle (index)
// 2. Kernel looks up C-list[index]
// 3. Kernel checks if slot is valid and has required rights
// 4. Kernel performs the operation on behalf of the process
```

Relationship to the Access Matrix

Recall that the access matrix can be viewed in two ways:

- Column-wise (ACLs): For each object, list all subjects that can access it
- Row-wise (C-lists): For each subject, list all objects it can access

C-lists are the row-wise decomposition of the access matrix. This structural choice has profound implications:

- Easy to determine all objects a subject can access: Simply enumerate the C-list
- Hard to determine all subjects that can access an object: Must search all C-lists in the system
- Natural support for least privilege: A subject's authority is explicit and bounded
- Delegation is straightforward: Copy capabilities between C-lists
The security of a capability system fundamentally depends on protecting capability lists from unauthorized access and modification. If a process could directly manipulate its C-list, it could grant itself arbitrary access rights, defeating the entire purpose of the capability model.

The Core Protection Problem

C-lists must satisfy two seemingly contradictory requirements:

1. Accessibility: The system must be able to quickly locate and read a process's capabilities during access control checks
2. Unmodifiability: Processes must not be able to directly add, modify, or forge capabilities in their C-lists

Different systems resolve this tension through different architectural choices.
C-list protection is not optional security hardening—it is essential to capability semantics. A system where processes can modify their own C-lists is not a capability system at all; it is an honor system with no security guarantees. Every C-list implementation must provide absolute protection against unauthorized modification.
Kernel-Space Storage

The most common approach stores C-lists entirely in kernel memory, inaccessible to user processes:

- C-lists reside in kernel address space
- Processes hold only opaque handles (indices)
- All capability operations go through system calls
- Kernel validates every capability use

This approach provides strong protection but requires kernel involvement for every capability operation.
```c
// Kernel-managed capability list implementation

// === KERNEL SIDE ===
// C-lists stored in protected kernel memory

struct clist_table {
    struct capability_list* lists[MAX_PROCESSES];
    spinlock_t lock;
};

static struct clist_table global_clist_table;

// Create capability for a process (kernel internal)
static int kernel_add_capability(pid_t pid, struct object* obj, uint32_t rights) {
    struct capability_list* clist = global_clist_table.lists[pid];

    spin_lock(&clist->lock);

    // Find empty slot
    for (int i = 0; i < clist->max_slots; i++) {
        if (!clist->slots[i].valid) {
            clist->slots[i].valid = true;
            clist->slots[i].object = obj;
            clist->slots[i].rights = rights;
            clist->slots[i].generation = obj->generation;
            spin_unlock(&clist->lock);
            return i; // Return slot index as handle
        }
    }

    spin_unlock(&clist->lock);
    return -ENOMEM; // No free slots
}

// Validate capability use (called on every capability operation)
static int validate_capability(pid_t pid, int cap_handle, uint32_t required_rights) {
    struct capability_list* clist = global_clist_table.lists[pid];

    // Bounds check
    if (cap_handle < 0 || cap_handle >= clist->max_slots)
        return -EINVAL;

    struct capability_slot* slot = &clist->slots[cap_handle];

    // Validity check
    if (!slot->valid)
        return -EBADF; // Invalid capability

    // Revocation check (generational)
    if (slot->generation != slot->object->generation)
        return -ESTALE; // Capability has been revoked

    // Rights check
    if ((slot->rights & required_rights) != required_rights)
        return -EACCES; // Insufficient rights

    return 0; // Success
}

// === USER SIDE ===
// User processes only see opaque handles

// System call to read through a capability
ssize_t cap_read(int cap_handle, void* buffer, size_t count) {
    // Kernel:
    // 1. Validate cap_handle in current process's C-list
    // 2. Check CAP_READ right
    // 3. Perform read on the object
    // 4. Return result to user
    return syscall(SYS_cap_read, cap_handle, buffer, count);
}
```

Tagged Memory Storage

Hardware capability systems can store C-lists in user-accessible memory by using tagged memory—each memory location has an associated tag bit that indicates whether the location contains a capability.

- Capabilities can be stored anywhere in process memory
- Tag bits are maintained by hardware, not accessible via normal loads and stores
- Only special capability instructions can set tag bits
- The tag bit is cleared when a location is written by a non-capability instruction ("capability tainting")

This approach provides better performance (no kernel trap for capability operations) while maintaining unforgeability through hardware protection.
| Approach | Storage Location | Protection Mechanism | Trade-offs |
|---|---|---|---|
| Kernel Tables | Kernel memory only | Address space isolation | System call overhead, strong isolation |
| Tagged Memory | User memory with tags | Hardware tag bits | Fast access, requires special hardware |
| Encrypted Caps | User memory | Cryptographic MACs | Portable, computational overhead |
| Segmented Memory | Protected segments | Segment descriptors | Historic approach, complex addressing |
| Typed Objects | Language runtime | Type system enforcement | Language-specific, garbage collection |
```c
// Tagged Memory Capability Storage (CHERI-style)

// In hardware capability systems, capabilities ARE pointers
// They can be stored anywhere but have special hardware properties

// A CHERI capability is 128 or 256 bits wide, containing:
// - Virtual address (pointer value)
// - Base address (lower bound of valid access)
// - Length (size of valid region)
// - Permissions (read, write, execute, etc.)
// - Object type (for sealing)
// - Tag bit (set by hardware, indicates validity)

// User code can have arrays of capabilities
void* __capability my_capabilities[100]; // Legal!

// But user code CANNOT forge a capability
void* __capability attempt_forge(void) {
    // This creates an integer, not a capability
    uintptr_t addr = 0x12345678;

    // This does NOT create a valid capability
    void* __capability forged = (void* __capability)addr;
    // The tag bit is NOT set
    // Any attempt to dereference 'forged' will cause a hardware trap

    // The only ways to get capabilities:
    // 1. Derive from an existing capability (bounds can only shrink)
    // 2. Receive from another holder (IPC, memory sharing)
    // 3. Get from kernel (system calls that return capabilities)
    return forged; // Tag is clear, so this value is unusable
}

// Capability derivation - parent to child relationship
void* __capability derive_subset(void* __capability parent,
                                 size_t offset, size_t len) {
    // Create capability to a subset of parent's region
    // Hardware ensures: derived.base >= parent.base
    //                   derived.base + derived.len <= parent.base + parent.len
    void* __capability child = cheri_offset_set(parent, offset);
    child = cheri_bounds_set(child, len);

    // Child can access  [parent.base+offset, parent.base+offset+len)
    // Parent still sees [parent.base, parent.base+parent.len)
    return child;
}

// Storing capabilities - tag bits follow the data
void capability_array_example(void) {
    void* __capability caps[10];

    // Store a capability (hardware sets tag bit in memory)
    caps[0] = get_some_capability(); // Tag bit = 1

    // Overwrite with non-capability data (hardware clears tag bit)
    *(uintptr_t*)&caps[0] = 0; // Tag bit = 0 (tainted)

    // caps[0] is now invalid; dereferencing it will trap
}
```

When a process wants to exercise a capability, it must identify which capability to use. The mechanism by which processes refer to their capabilities is called capability addressing. Different addressing schemes have different performance, security, and usability characteristics.

Indexed Addressing (Capability Slots)

The most common approach assigns each capability a slot number within the C-list. Processes refer to capabilities by their slot indices—small integers that serve as opaque handles.
```c
// Different capability addressing approaches

// === APPROACH 1: Simple Indexed Addressing ===
// Used by: Unix file descriptors, Windows handles

typedef int CapIndex; // Simple slot number

#define MAX_CAPS 256

struct simple_clist {
    struct capability slots[MAX_CAPS];
    uint64_t valid_bitmap[MAX_CAPS / 64]; // Which slots are in use
};

// Fast O(1) lookup
struct capability* lookup_indexed(struct simple_clist* clist, CapIndex idx) {
    if (idx < 0 || idx >= MAX_CAPS)
        return NULL;
    if (!(clist->valid_bitmap[idx / 64] & (1ULL << (idx % 64))))
        return NULL;
    return &clist->slots[idx];
}

// === APPROACH 2: Hierarchical Addressing (CSpace) ===
// Used by: seL4

// Capabilities organized in a tree structure
// Address is a path through the tree: (guard, depth, slot_bits...)

typedef struct {
    uint64_t guard;      // Prefix bits that must match
    uint32_t guard_size; // How many bits in guard
    uint32_t radix;      // Branching factor (2^radix children per node)
    uint32_t depth;      // Current depth in tree
    uint64_t path;       // Path bits used for navigation
} CSpaceAddress;

// seL4 CNode structure (capability node in the tree)
struct cnode {
    struct cnode_slot {
        bool is_cnode; // true = points to another CNode, false = leaf cap
        union {
            struct cnode* next_level;
            struct capability cap;
        };
    } slots[/* 2^radix */];
};

// Lookup: walk the tree following path bits
struct capability* lookup_hierarchical(struct cnode* root, CSpaceAddress* addr) {
    struct cnode* current = root;

    while (addr->depth > 0) {
        // Extract next slot index from path
        uint32_t slot_idx = (addr->path >> (addr->depth - addr->radix))
                            & ((1 << addr->radix) - 1);

        struct cnode_slot* slot = &current->slots[slot_idx];

        if (slot->is_cnode) {
            current = slot->next_level;
            addr->depth -= addr->radix;
        } else {
            return &slot->cap; // Found leaf capability
        }
    }
    return NULL; // Not found
}

// === APPROACH 3: Sparse Table with Hashing ===
// Used when capability counts vary widely

struct sparse_clist {
    struct hash_table ht; // Object ID -> capability mapping
};

struct capability* lookup_sparse(struct sparse_clist* clist, uint64_t object_id) {
    return hash_table_lookup(&clist->ht, object_id);
}
```

seL4's Capability Space (CSpace)

seL4 uses a sophisticated hierarchical capability addressing scheme called CSpace. The CSpace is a tree structure where:

- CNodes are internal nodes containing fixed-size arrays of capability slots
- Leaf capabilities are stored in CNode slots
- Addresses are bit strings that encode the path through the tree
- Guards allow efficient representation of sparse address spaces

This design enables:

- Flexible capability organization (more CNodes can be added dynamically)
- Efficient sparse representation (most slots empty)
- Delegation of subtrees (transfer an entire CNode)
- Per-address-space capability lookup isolation
Hierarchical addressing solves the problem of scaling. With flat tables, the kernel must pre-allocate slots for every process. With hierarchical structures, capability storage grows on demand. A process needing 10 capabilities uses ~10 slots; a process needing 10,000 capabilities extends its CSpace to accommodate them. This is analogous to page tables for virtual memory—sparse structures with demand allocation.
Direct Capability References (Hardware Systems)

In hardware capability systems like CHERI, capabilities are often referenced directly as values rather than through indices:

- A capability is a fat pointer (128-256 bits)
- The capability value itself contains all information
- No table lookup needed—just use the capability directly
- Capabilities can be stored in registers or memory

This provides the best performance but requires hardware support and larger pointer sizes.
| Method | Lookup Cost | Space Efficiency | Max Capabilities | Best For |
|---|---|---|---|---|
| Simple Index | O(1) | Fixed table size | Fixed limit | Small, known sets (file descriptors) |
| Hierarchical (CSpace) | O(depth) | Demand-allocated | Very large | Systems needing flexibility |
| Hash Table | O(1) amortized | Proportional to count | Memory-limited | Sparse, dynamic sets |
| Direct (Hardware) | O(1) | Per-capability (large) | Memory-limited | Hardware capability systems |
C-lists are not static—they change as processes acquire, transfer, duplicate, and release capabilities. The operating system provides well-defined operations on C-lists that maintain security invariants while enabling flexible capability management.

Fundamental C-List Operations
```c
// Capability List Operations - Kernel Implementation

// Duplicate: Copy capability within same C-list
int cap_dup(pid_t pid, CapHandle src, CapHandle* dest) {
    struct clist* cl = get_clist(pid);
    struct cap_slot* src_slot = get_slot(cl, src);

    if (!src_slot->valid)
        return -EBADF;

    // Find free slot
    CapHandle new_slot = find_free_slot(cl);
    if (new_slot < 0)
        return -ENOMEM;

    // Copy capability (same rights, same object)
    cl->slots[new_slot] = *src_slot;
    *dest = new_slot;
    return 0;
}

// Attenuate: Create capability with reduced rights
int cap_attenuate(pid_t pid, CapHandle src, uint32_t new_rights, CapHandle* dest) {
    struct clist* cl = get_clist(pid);
    struct cap_slot* src_slot = get_slot(cl, src);

    if (!src_slot->valid)
        return -EBADF;

    // CRITICAL: Can only REMOVE rights, never ADD
    if ((new_rights & ~src_slot->rights) != 0)
        return -EACCES; // Attempt to add rights - DENIED

    CapHandle new_slot = find_free_slot(cl);
    if (new_slot < 0)
        return -ENOMEM;

    // Copy with reduced rights
    cl->slots[new_slot] = *src_slot;
    cl->slots[new_slot].rights = new_rights; // Subset of original
    *dest = new_slot;
    return 0;
}

// Grant: Transfer capability copy to another process
int cap_grant(pid_t sender, CapHandle cap, pid_t receiver, CapHandle* recv_slot) {
    struct clist* sender_cl = get_clist(sender);
    struct cap_slot* src = get_slot(sender_cl, cap);

    if (!src->valid)
        return -EBADF;

    // Check if sender has GRANT right
    if (!(src->rights & CAP_RIGHT_GRANT))
        return -EACCES; // No permission to grant

    struct clist* recv_cl = get_clist(receiver);
    CapHandle new_slot = find_free_slot(recv_cl);
    if (new_slot < 0)
        return -ENOMEM;

    // Copy capability to receiver (sender keeps original)
    recv_cl->slots[new_slot] = *src;
    // Note: Receiver gets same rights as sender had (minus GRANT if not inheritable)
    *recv_slot = new_slot;
    return 0;
}

// Transfer: Move capability to another process (sender loses it)
int cap_transfer(pid_t sender, CapHandle cap, pid_t receiver, CapHandle* recv_slot) {
    int ret = cap_grant(sender, cap, receiver, recv_slot);
    if (ret < 0)
        return ret;

    // Remove from sender
    cap_delete(sender, cap);
    return 0;
}

// Delete: Remove capability from C-list
int cap_delete(pid_t pid, CapHandle cap) {
    struct clist* cl = get_clist(pid);
    struct cap_slot* slot = get_slot(cl, cap);

    if (!slot->valid)
        return -EBADF;

    // Mark slot as invalid
    slot->valid = false;

    // May trigger revocation tracking updates
    notify_revocation_system(slot->object, slot);
    return 0;
}
```

The Monotonicity Principle

A crucial invariant in capability systems is rights monotonicity: capabilities can only be weakened (attenuated), never strengthened. This means:

- If you have read-only access, you cannot derive read-write
- If you have access to a subregion, you cannot derive access to the full region
- If your capability expires at time T, you cannot derive one that expires later

This principle ensures that the capability graph forms a partial order: authority can only decrease as capabilities are derived, never increase. Combined with the unforgeability requirement, this means processes cannot escalate their privileges through capability manipulation.
Standard capability operations never increase rights. However, some systems intentionally provide controlled 'rights amplification' for specific purposes—like the setuid bit in Unix, or Hydra's amplification templates. These exceptions are carefully designed and tightly controlled, never available to arbitrary code.
Capability Invocation

The most common C-list operation is invocation—using a capability to perform an operation on its object. The invocation process typically involves:

1. Addressing: Process specifies which capability to use (handle, index, or direct reference)
2. Validation: Kernel retrieves capability and checks validity
3. Authorization: Kernel verifies the capability has the required rights
4. Execution: Kernel performs the operation on the object
5. Result: Operation result returned to the process

In high-performance systems, steps 2-4 must be extremely fast, as they occur on nearly every I/O operation.
The operating system kernel bears ultimate responsibility for managing capability lists. This includes creating C-lists for new processes, populating them with initial capabilities, maintaining their integrity, and cleaning up when processes terminate.

Process Creation and Initial Capabilities

When a process is created, it needs an initial set of capabilities to do useful work. Where do these come from?
```c
// Process creation with capability inheritance

struct new_process_caps {
    CapHandle caps[MAX_INHERITED_CAPS];
    int count;
    uint32_t rights_mask[MAX_INHERITED_CAPS]; // May be attenuated
};

pid_t spawn_process(const char* program, struct new_process_caps* init_caps) {
    // 1. Create new process structure
    struct process* child = create_process();
    child->clist = allocate_clist();

    // 2. Populate C-list with granted capabilities
    for (int i = 0; i < init_caps->count; i++) {
        struct cap_slot* parent_cap = get_current_cap(init_caps->caps[i]);

        if (!parent_cap->valid)
            continue;

        // Check if we have grant rights
        if (!(parent_cap->rights & CAP_RIGHT_GRANT))
            continue;

        // Copy to child with optional attenuation
        struct cap_slot* child_cap = &child->clist->slots[i];
        child_cap->valid = true;
        child_cap->object = parent_cap->object;
        child_cap->rights = parent_cap->rights & init_caps->rights_mask[i];
        child_cap->generation = parent_cap->generation;
    }

    // 3. Kernel may add essential capabilities
    add_essential_caps(child); // e.g., IPC endpoint to parent

    // 4. Start execution
    schedule_process(child);
    return child->pid;
}

// Essential capabilities typically include:
void add_essential_caps(struct process* proc) {
    // Capability to communicate with logging service
    grant_cap(proc, get_log_endpoint_cap(), CAP_SEND);

    // Capability to its own memory allocator pool (for malloc)
    grant_cap(proc, get_memory_pool_cap(), CAP_WRITE);

    // Capability to its own TCB (for thread control)
    grant_cap(proc, proc->tcb_cap, CAP_OWNER);

    // In pure capability systems, there's NO ambient file system access
    // Any file access must be through explicit capabilities
}
```

C-List Cleanup on Process Termination

When a process terminates, its C-list must be properly cleaned up. This involves:

1. Capability Release: Decrement reference counts on held objects
2. Revocation Tracking: Update any revocation metadata
3. Resource Reclamation: Free C-list memory
4. Notification: Inform relevant subsystems of capability loss

In systems with capability-based lifetime management (like EROS/CapROS), releasing the last capability to an object may trigger garbage collection of the object itself.
C-lists naturally track what resources a process is using. When the last capability to an object is deleted, the object may become garbage collectible. This unifies access control with resource management—a powerful property that simplifies kernel design.
C-List Quota Management

Processes typically have limits on how many capabilities they can hold (C-list size limits). This prevents:

- Denial-of-service through C-list exhaustion
- Unbounded kernel memory consumption
- Resource hoarding

Quotas can be static (a fixed maximum) or dynamic (allocated from a pool shared with other resources). seL4, for example, uses explicit memory objects as the backing store for CNodes, so C-list growth consumes visible resources.

C-List Persistence

Some capability systems (KeyKOS, EROS) support persistent capabilities—capabilities that survive system shutdown and restart. This requires:

- Serializing C-lists to stable storage
- Maintaining object identity across restarts
- Handling revocation state persistence
- Checkpointing capability graphs

Persistent capabilities enable systems that can resume exactly where they left off, including all security relationships.
Examining real-world C-list implementations reveals the practical trade-offs and design decisions that shape capability systems. Let us explore how several influential systems organize their capability lists.

Unix File Descriptor Tables

While Unix is not a pure capability system, its file descriptor mechanism exhibits many C-list properties:
```c
// Unix file descriptor table - a limited capability list

// Per-process file descriptor table (simplified)
struct files_struct {
    atomic_t count;       // Reference count
    struct fdtable* fdt;  // Pointer to fd table
    spinlock_t file_lock;
    int next_fd;          // Next likely free slot (optimization)

    // The actual table
    struct file* fd_array[NR_OPEN_DEFAULT]; // Typically 64
    // Expanded dynamically if more fds needed
};

struct file {
    struct path f_path;                 // File path (for display)
    const struct file_operations* f_op; // Operations
    void* private_data;
    fmode_t f_mode;                     // Access mode: READ, WRITE, etc.
    // ... more fields
};

// File descriptor properties making them capability-like:
// 1. Unforgeable: fds are indices into a kernel-protected table
// 2. Access rights: f_mode encodes permitted operations
// 3. Object reference: points to a specific file structure
// 4. Transferable: can be passed via SCM_RIGHTS

// But NOT pure capabilities because:
// 1. Ambient authority: can open("/etc/passwd") by name
// 2. Identity-based: permission checks use process UID
// 3. No fine-grained delegation without sendmsg complexity
```

seL4 CSpace Implementation

seL4's CSpace is a carefully designed hierarchical C-list structure optimized for embedded and high-assurance systems. Key characteristics:

- CNodes are fixed-size arrays of capability slots (2^n entries for configurable n)
- Guards allow skipping levels in sparse spaces
- Memory-backed: CNode memory is explicitly allocated from untyped memory
- Formally verified: the CSpace implementation is mathematically proven correct
```c
// seL4 Capability Space structure (simplified)

// CTE: Capability Table Entry
typedef struct cte {
    cap_t cap;             // The capability itself
    mdb_node_t cteMDBNode; // For revocation: doubly-linked list
} cte_t;

// cap_t: The capability value
typedef struct cap {
    uint64_t words[2]; // 128-bit capability encoding:
                       // - Type tag (endpoint, memory frame, cnode, etc.)
                       // - Object pointer
                       // - Rights bitmask
                       // - Type-specific fields
} cap_t;

// CNode: A node in the CSpace tree
// Contains 2^radix capability slots
// Addressed by 'radix' bits of the capability address

// Example CSpace structure:
//
// Root CNode (radix=4, 16 slots)
//  |
//  +--[0] -> Endpoint cap (to IPC partner)
//  +--[1] -> Frame cap (some memory page)
//  +--[2] -> CNode cap (to nested CNode) ---> Child CNode (16 more slots)
//  +--[3] -> ...                               |
//  ...                                         +--[0] -> Frame cap
//                                              +--[1] -> Endpoint cap
//                                              ...

// Lookup: resolve capability path
// Address: 0x21 = 0b0010_0001
//   bits [7:4] = 2 -> slot 2 in root (CNode cap)
//   bits [3:0] = 1 -> slot 1 in child (Endpoint cap)

cte_t* resolve_address_bits(cte_t* root, word_t cap_ptr, word_t n_bits) {
    cte_t* slot = root + (cap_ptr >> (n_bits - root->cap.radix));
    n_bits -= root->cap.radix;

    if (n_bits == 0 || !isCNodeCap(slot->cap))
        return slot; // Found it, or can't continue

    // Recurse into nested CNode
    return resolve_address_bits(getCNodeSlots(slot->cap), cap_ptr, n_bits);
}
```

CHERI: Capabilities in Memory

CHERI takes a radically different approach: capabilities are stored throughout memory, protected by hardware rather than kernel isolation:
```c
// CHERI: Capabilities stored in regular memory with hardware protection

// A CHERI capability (simplified 128-bit format)
typedef struct {
    uint64_t cursor; // Current pointer value (virtual address)

    // Compressed bounds (base + length reconstructed from this)
    unsigned int E:6;  // Exponent for bounds
    unsigned int B:14; // Base adjustment
    unsigned int T:14; // Top (limit) adjustment

    // Permissions and metadata
    unsigned int perms:16; // LOAD, STORE, EXECUTE, etc.
    unsigned int otype:18; // Object type (for sealing)
    unsigned int flags:2;  // Special flags
    unsigned int reserved:6;

    // Tag bit: 1 bit stored separately by hardware
    // NOT in this struct - maintained in hidden tag memory
} cheri_capability_t;

// Key insight: This IS the C-list
// Every pointer a process can use is a capability
// There's no separate C-list structure in the kernel
// The set of valid capabilities a process holds =
//   all tagged values in its registers and memory

// Memory layout example:
// Address    Tag    Content
// 0x1000     0      Integer data
// 0x1008     0      Integer data
// 0x1010     1      Capability to buffer   <- Valid capability!
// 0x1018     0      Integer data
// 0x1020     1      Capability to function <- Valid capability!

// Process's authority = all words with tag=1 it can reach

// Stack frame with capabilities:
struct stack_frame {
    uint64_t saved_regs[10];         // Plain integers (tag=0)
    void* __capability return_cap;   // Return capability (tag=1)
    void* __capability caller_frame; // Caller's frame (tag=1)
    void* __capability local_buffer; // Buffer cap (tag=1)
};
```

| Aspect | Unix FD Table | seL4 CSpace | CHERI Memory |
|---|---|---|---|
| Storage | Kernel memory | Kernel CNodes | Tagged user memory |
| Addressing | Integer index | Hierarchical path | Direct pointer |
| Max Capabilities | ~1024 per process | Bounded by CNodes | Bounded by memory |
| Lookup Cost | O(1) array | O(depth) tree walk | O(1) direct |
| Protection | Kernel isolation | Kernel isolation | Hardware tags |
| Delegation | SCM_RIGHTS | Cap grant syscall | Memory sharing |
C-list design has significant performance implications. Since capability lookups occur on nearly every I/O operation and object interaction, even small inefficiencies multiply across millions of operations.

Lookup Performance

The most critical operation is looking up a capability given its address. This happens on every capability invocation, so it must be fast.
Memory Overhead

C-lists consume memory: each capability requires storage for the object reference, rights, and metadata. For systems with many processes and many capabilities per process, this adds up.

Per-Capability Overhead:

- Unix fd table entry: ~8-16 bytes
- seL4 capability slot: ~16 bytes
- CHERI capability: 16-32 bytes (but replaces pointers)

Mitigation strategies:

- Compress rights and metadata into the capability value
- Share common capabilities across processes (copy-on-write)
- Use sparse data structures for processes with few capabilities
- Hardware: CHERI encodes bounds efficiently using a floating-point-like representation
CHERI doubles pointer size (64→128 or 128→256 bits), increasing memory footprint for pointer-heavy data structures. However, this replaces separate capability tables entirely, and CHERI enables memory safety benefits that often justify the overhead. Measurement shows 10-50% memory increase depending on workload—significant but manageable.
System Call Overhead

In kernel-managed C-lists, every capability operation requires a system call—a context switch into kernel mode, which is expensive (hundreds to thousands of cycles). Hardware capability systems avoid this:

- CHERI capability operations execute in user mode
- Bounds checks happen in hardware, no trap required
- Invalid capability dereference traps, but valid operations don't

This performance difference can be 100x or more for tight loops with many pointer dereferences (and thus bounds checks).
| Metric | Kernel Tables | Hardware Capabilities | Difference |
|---|---|---|---|
| Capability Lookup | ~100-500 cycles (syscall) | ~1-5 cycles (inline) | 50-500x faster |
| Bounds Check | Kernel validation | Hardware inline | Similar to lookup |
| Memory per Cap | 8-32 bytes | 16-32 bytes | Similar or slightly larger |
| Creation Cost | Syscall + allocation | Derive from parent | Faster derivation |
| Transfer Cost | IPC message | Memory copy | Depends on mechanism |
| Revocation Cost | Table scan or indirect | Per-object or table scan | Similar complexity |
We have explored capability lists in depth—from their formal definition as rows of the access matrix to their practical implementation in real operating systems. Let us consolidate the key concepts:
What's Next

Now that we understand how capabilities are stored and organized, we need to examine how they move between subjects. The next page explores Capability Delegation—how capabilities are transferred, shared, and propagated through a system while maintaining security invariants.
You now understand capability lists—how they organize a subject's authority, how they are protected, addressed, and managed, and how real systems implement them. Continue to the next page to learn how capabilities are delegated between subjects.