Loading content...
The Access Matrix is a beautiful theoretical model—a simple 2D array capturing the complete protection state of a system. But theory and practice diverge dramatically when we attempt to implement this matrix in real systems. A modern operating system might have thousands of users and millions of files. A cloud environment might have millions of principals and billions of resources. Storing a complete matrix would require astronomical amounts of memory.
Fortunately, real access matrices are extraordinarily sparse. Most subjects don't have rights to most objects—the vast majority of cells are empty. This sparsity enables efficient implementation strategies that store only the non-empty cells. The two fundamental approaches are Access Control Lists (ACLs), which store the matrix by columns, and Capability Lists (C-Lists), which store it by rows.
By the end of this page, you will understand the two fundamental implementation strategies for access matrices, their algorithmic properties, the profound trade-offs between them, and how real systems combine techniques. You'll be equipped to analyze and design access control implementations for any scale.
Before examining specific implementations, let's understand the scale of the challenge:
Scale Calculation:
Consider a modest system:
A naive full matrix implementation:
For a large enterprise:
For cloud-scale:
The full matrix is clearly infeasible. The saving grace: sparsity.
| System | Subjects | Objects | Typical Non-Empty Cells | Sparsity |
|---|---|---|---|---|
| Personal laptop | 5 | 50,000 | ~50,000 | 99.98% empty |
| Workgroup server | 100 | 500,000 | ~500,000 | 99.999% empty |
| Enterprise | 10,000 | 10,000,000 | ~10,000,000 | 99.99999% empty |
| Cloud (single account) | 1,000 | 1,000,000 | ~1,000,000 | 99.9999% empty |
| Cloud (global) | 10,000,000 | 10,000,000,000 | ~10,000,000,000 | 99.9999999% empty |
The Sparsity Insight:
Why is the matrix so sparse? In typical systems:
Sparse matrix representation exploits this by storing only non-empty cells. The two fundamental approaches differ in how they organize this sparse data:
ACLs optimize for the question 'Who can access this object?' while Capabilities optimize for 'What can this subject access?' Different operations are efficient under different representations. This trade-off shapes system architecture deeply.
An Access Control List (ACL) stores the access matrix by columns. Each object has an associated list of (subject, rights) pairs, representing all subjects that have non-empty rights for that object.
Conceptual Structure:
Object: /home/alice/document.txt
ACL:
- (alice, {read, write, own})
- (bob, {read})
- (editors_group, {read, write})
This is equivalent to a column of the access matrix:
| Subject | Rights |
|---|---|
| alice | read, write, own |
| bob | read |
| editors_group | read, write |
| (all others) | (empty - not stored) |
Why Column-wise Storage?
ACLs are natural for file systems because:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142
/** * Access Control List (ACL) data structures and operations * * This represents how ACLs are typically implemented in file systems. */ #include <stdint.h>#include <stdbool.h> // An ACL Entry (ACE)typedef struct acl_entry { uint32_t tag; // ACL_USER, ACL_GROUP, ACL_USER_OBJ, etc. uint32_t qualifier; // UID or GID (for ACL_USER or ACL_GROUP) uint32_t permissions; // Bitmask of permissions} acl_entry_t; // Entry tag types (POSIX ACL)#define ACL_USER_OBJ 0x01 // File owner#define ACL_USER 0x02 // Named user#define ACL_GROUP_OBJ 0x04 // File group#define ACL_GROUP 0x08 // Named group#define ACL_MASK 0x10 // Maximum permissions for users/groups#define ACL_OTHER 0x20 // Everyone else // Full ACL structuretypedef struct acl { uint32_t count; // Number of entries acl_entry_t entries[]; // Variable-length array of entries} acl_t; // File structure includes pointer to ACLtypedef struct inode { // ... other inode fields ... uid_t owner; gid_t group; mode_t mode; // Traditional Unix permissions acl_t *acl; // Extended ACL (may be NULL)} inode_t; /** * Check if subject can perform operation on file * * This is the ACL checking algorithm used in Linux. */int acl_check_access( inode_t *inode, uid_t uid, gid_t gid, gid_t *groups, int ngroups, int requested_permission) { // Step 1: Check if owner (always use owner permissions) if (uid == inode->owner) { if ((inode->mode >> 6) & requested_permission) return 0; // ALLOWED return -EACCES; // DENIED } // Step 2: If ACL exists, use ACL checking if (inode->acl != NULL) { return acl_extended_check( inode->acl, uid, gid, groups, ngroups, requested_permission ); } // Step 3: Fall back to traditional Unix check return unix_permission_check( inode->mode, inode->group, uid, gid, groups, ngroups, requested_permission );} /** * POSIX ACL extended checking algorithm */int acl_extended_check( acl_t *acl, uid_t uid, gid_t gid, gid_t *groups, int ngroups, int requested) { int found_group = 0; int group_permissions = 0; acl_entry_t *mask = NULL; // First pass: find relevant entries for (int i = 0; i < acl->count; i++) { acl_entry_t *e = &acl->entries[i]; switch (e->tag) { case ACL_USER: // Named user entry if (e->qualifier == uid) { // Apply mask and check int effective = e->permissions; if (mask) effective &= mask->permissions; if (effective & requested) return 0; // ALLOWED return -EACCES; // DENIED } break; case ACL_GROUP: case ACL_GROUP_OBJ: // Check if user is in this group gid_t check_gid = (e->tag == ACL_GROUP_OBJ) ? gid : e->qualifier; if (user_in_group(uid, gid, groups, ngroups, check_gid)) { found_group = 1; group_permissions |= e->permissions; } break; case ACL_MASK: mask = e; break; case ACL_OTHER: // Will use this if no group match break; } } // Check accumulated group permissions if (found_group) { int effective = group_permissions; if (mask) effective &= mask->permissions; if (effective & requested) return 0; // ALLOWED return -EACCES; // DENIED } // Fall through to ACL_OTHER for (int i = 0; i < acl->count; i++) { if (acl->entries[i].tag == ACL_OTHER) { if (acl->entries[i].permissions & requested) return 0; // ALLOWED break; } } return -EACCES; // DENIED}ACL Evaluation Order:
ACL checking follows a specific order:
The mask entry limits the maximum permissions for named users and groups—important for compatibility with traditional chmod operations.
Storage Strategies:
Where is the ACL physically stored?
POSIX has two ACLs per directory: the access ACL (for the directory itself) and the default ACL (inherited by new files created in the directory). Windows uses inheritance flags per ACE. Inheritance significantly reduces administrative burden but creates complex effective permission calculations.
A Capability List (C-List) stores the access matrix by rows. Each subject has a list of (object, rights) pairs—their "capabilities"—representing all objects they can access.
Conceptual Structure:
Subject: Alice
Capability List:
- (file:/home/alice/doc.txt, {read, write, own})
- (file:/shared/report.pdf, {read})
- (device:/dev/tty1, {read, write})
This is equivalent to a row of the access matrix:
| Object | Rights |
|---|---|
| /home/alice/doc.txt | read, write, own |
| /shared/report.pdf | read |
| /dev/tty1 | read, write |
| (all others) | (empty - not stored) |
Why Row-wise Storage?
Capabilities are natural for distributed systems and process isolation because:
Capability Properties:
A capability is more than just a (object, rights) pair. It's an unforgeable token of authority:
Types of Capabilities:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118
/** * Capability-based access control implementation * * Unix file descriptors are a form of capability! */ #include <stdint.h>#include <stdbool.h> // A capability (as used in capability-based systems)typedef struct capability { uint64_t object_id; // Reference to the object uint32_t rights; // Permitted operations uint32_t flags; // CAP_INHERITABLE, CAP_DELEGABLE, etc. uint64_t cryptographic_check; // For unforgettability verification} capability_t; // A process's capability list (like a file descriptor table)typedef struct c_list { uint32_t count; uint32_t capacity; capability_t *capabilities;} c_list_t; // Process structure includes capability listtypedef struct process { // ... other fields ... c_list_t c_list; // Capabilities this process holds} process_t; /** * Capability-based access check * * Instead of "does subject have right on object?", * we ask "does this capability grant this right?" */int capability_check( capability_t *cap, uint64_t target_object, uint32_t requested_permission) { // Verify capability refers to correct object if (cap->object_id != target_object) { return -ENOENT; // Wrong object } // Verify capability grants requested permission if ((cap->rights & requested_permission) != requested_permission) { return -EACCES; // Insufficient rights } // Optionally verify cryptographic integrity if (!verify_capability_signature(cap)) { return -EINVAL; // Forged or corrupted capability } return 0; // Access granted} /** * File descriptors as capabilities: Unix open() and access */ // open() creates a capability (file descriptor)int sys_open(const char *path, int flags) { process_t *proc = current_process(); // Resolve path to inode inode_t *inode = namei(path); if (!inode) return -ENOENT; // Check if we can create this capability (ACL check at open time!) int access_mode = flags_to_access_mode(flags); if (check_permission(inode, proc->euid, proc->egid, access_mode) != 0) { return -EACCES; // Cannot create capability } // Create file object and capability file_t *file = create_file(inode, flags); capability_t cap = { .object_id = (uint64_t)file, .rights = access_mode_to_rights(access_mode), .flags = 0, }; // Add to process's capability list and return index (fd) int fd = clist_add(&proc->c_list, cap); return fd;} // read() uses the capability (fd) — no further permission check!ssize_t sys_read(int fd, void *buf, size_t count) { process_t *proc = current_process(); // Get capability from process's C-list capability_t *cap = clist_get(&proc->c_list, fd); if (!cap) return -EBADF; // No such capability // Check capability grants read right if (!(cap->rights & READ_RIGHT)) { return -EBADF; // Capability doesn't grant read } // Perform read — no ACL check needed! file_t *file = (file_t *)cap->object_id; return do_read(file, buf, count);} /** * Key insight: Unix file descriptors ARE capabilities! * * - open() checks permissions and creates capability (fd) * - read()/write() only check capability, not ACL * - Capabilities can be delegated (via fork(), sendmsg()) * - Capabilities can be restricted (fcntl() to reduce rights) * - Revocation is tricky (closing fd doesn't revoke delegated copies) */Unix file descriptors are often not recognized as capabilities, but they are exactly that. open() performs ACL check and creates capability (fd). Subsequent operations check only the fd, not the ACL. FDs can be delegated (fork, sendmsg). This is why you can continue reading a file after its permissions are revoked—you still hold the capability!
The choice between ACLs and Capabilities has profound implications for system design. Each approach excels in different scenarios:
Query Efficiency:
| Query | ACL | Capability |
|---|---|---|
| "Who can access object O?" | O(entries in O's ACL) | O(all subjects' C-lists) |
| "What can subject S access?" | O(all objects' ACLs) | O(entries in S's C-list) |
| "Can S access O with right R?" | O(entries in O's ACL) | O(entries in S's C-list) |
Storage Location:
| Aspect | ACL | Capability |
|---|---|---|
| Data stored with | Object | Subject |
| Backup captures | Permissions with data | Permissions separate |
| Subject creation | No permission migration | Must grant initial caps |
| Object creation | Set initial ACL | Grant caps to subjects |
The Revocation Problem:
Capabilities have a fundamental challenge: revocation. Once a capability is granted and potentially delegated, how do you take it back?
Delegation Comparison:
Capabilities enable spontaneous delegation without central authority, which is powerful for distributed systems but concerning for centralized control.
Ambient vs. Explicit Authority:
Explicit authority prevents the confused deputy problem but requires more careful programming.
With ACLs, a privileged program ("deputy") must carefully track which operations are on behalf of which user to avoid using its own privileges inappropriately. With capabilities, the deputy only uses capabilities explicitly passed to it for that operation, naturally avoiding the confusion. This is why capability-based security is favored for sandboxing and principle of least privilege enforcement.
Real systems rarely use pure ACLs or pure Capabilities. Instead, they combine techniques to get the benefits of both:
Unix: ACL-Based with Capability Overlay
Unix file systems use ACLs (permission bits stored in inodes), but file descriptors function as capabilities:
Windows: Full ACL with Handle Capabilities
Windows uses comprehensive ACLs (security descriptors), but again handles function as capabilities:
Capsicum: Capability Mode for Sandboxing
FreeBSD's Capsicum explicitly combines models:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384
/** * Capsicum capability-based sandboxing example * * Combines ACL-based file opening with pure capability mode */ #include <sys/capsicum.h>#include <unistd.h>#include <fcntl.h> int main(int argc, char *argv[]) { // Phase 1: ACL-based authorization // Open all files we'll need while we have ambient authority int input_fd = open("/data/input.txt", O_RDONLY); if (input_fd < 0) { perror("Cannot open input"); return 1; } int output_fd = open("/data/output.txt", O_WRONLY | O_CREAT, 0644); if (output_fd < 0) { perror("Cannot open output"); return 1; } int log_fd = open("/var/log/myapp.log", O_WRONLY | O_APPEND); // log_fd might fail — that's okay // Phase 2: Limit capabilities on file descriptors // Reduce rights to minimum necessary cap_rights_t rights_read; cap_rights_init(&rights_read, CAP_READ, CAP_SEEK); cap_rights_limit(input_fd, &rights_read); cap_rights_t rights_write; cap_rights_init(&rights_write, CAP_WRITE); // No seek, no read cap_rights_limit(output_fd, &rights_write); // Phase 3: Enter capability mode // After this, we can ONLY use our existing capabilities (fds) if (cap_enter() < 0) { perror("Cannot enter capability mode"); return 1; } // NOW WE ARE SANDBOXED // - Cannot open() any new files // - Cannot access network (no socket fds) // - Cannot exec() other programs // - Cannot access filesystem via paths // - Can ONLY use: input_fd (read/seek), output_fd (write) // Even if this code has vulnerabilities, attacker is contained // Phase 4: Process data using only our capabilities char buffer[4096]; ssize_t n; while ((n = read(input_fd, buffer, sizeof(buffer))) > 0) { // Process buffer (potentially vulnerable code) process_data(buffer, n); // Write to output write(output_fd, buffer, n); } // Attempting to break out of sandbox will fail: // open("/etc/passwd", O_RDONLY); // Returns ECAPMODE // socket(AF_INET, SOCK_STREAM, 0); // Returns ECAPMODE return 0;} /** * Capsicum demonstrates the power of hybrid approach: * * - ACLs control initial access to files * - Capabilities (fds) carry authority into sandbox * - Capability mode removes ambient authority * - Result: powerful sandboxing with minimal API changes */| System | ACL Component | Capability Component | Integration |
|---|---|---|---|
| Unix/Linux | File permissions, POSIX ACLs | File descriptors, capabilities(7) | open() checks ACL, fd is capability |
| Windows | Security descriptors, DACL | Object handles | Access check at open, handle carries rights |
| Capsicum (FreeBSD) | Standard Unix ACLs | cap_rights limited fds | cap_enter() for capability-only mode |
| seL4 microkernel | None (pure capability) | Kernel-managed capabilities | Everything is a capability |
| Fuchsia | Namespace capabilities | Kernel handles (zx_handle_t) | Namespace scopes capabilities |
| Android | Linux ACLs + SELinux | Binder file descriptors | SELinux MAC + capability passing |
Use ACLs for: persistent stored objects, administrator control, auditing who can access what, systems where revocation must be immediate. Use Capabilities for: ephemeral access, delegation scenarios, sandboxing, distributed systems, minimal authority patterns. Best results often come from using both together.
Efficient access control requires carefully designed data structures. Let's examine the internal representations used in real systems:
Linux Inode Permissions:
The traditional Unix permissions are stored directly in the inode:
struct inode {
kuid_t i_uid; // Owner UID
kgid_t i_gid; // Group GID
umode_t i_mode; // Permission bits (12 bits)
// ...
};
i_mode encoding:
POSIX ACL Storage:
Extended ACLs are stored as extended attributes (xattr):
// Stored in xattr: system.posix_acl_access
struct posix_acl_xattr_entry {
__le16 e_tag; // ACL_USER, ACL_GROUP, etc.
__le16 e_perm; // Permission bits
__le32 e_id; // UID or GID (for named entries)
};
struct posix_acl_xattr_header {
__le32 a_version; // ACL version
struct posix_acl_xattr_entry a_entries[]; // Variable length
};
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980818283848586878889909192939495
/** * Efficient ACL implementation strategies */ // Strategy 1: Sorted array for binary search// Good for small to medium ACLs (< 100 entries)typedef struct { uint32_t count; struct { uint32_t subject_id; // Sorted by this field uint32_t permissions; } entries[];} sorted_acl_t; // O(log n) lookupint sorted_acl_lookup(sorted_acl_t *acl, uint32_t subject_id) { int lo = 0, hi = acl->count - 1; while (lo <= hi) { int mid = (lo + hi) / 2; if (acl->entries[mid].subject_id == subject_id) { return acl->entries[mid].permissions; } else if (acl->entries[mid].subject_id < subject_id) { lo = mid + 1; } else { hi = mid - 1; } } return 0; // Not found} // Strategy 2: Hash table for large ACLs// Good for very large ACLs (1000+ entries)typedef struct { uint32_t bucket_count; uint32_t entry_count; struct acl_entry { uint32_t subject_id; uint32_t permissions; struct acl_entry *next; } **buckets;} hash_acl_t; // O(1) average lookupint hash_acl_lookup(hash_acl_t *acl, uint32_t subject_id) { uint32_t bucket = subject_id % acl->bucket_count; struct acl_entry *e = acl->buckets[bucket]; while (e) { if (e->subject_id == subject_id) { return e->permissions; } e = e->next; } return 0;} // Strategy 3: Cached/memoized results// For frequently accessed objects with expensive ACL checkstypedef struct { uint64_t subject_id; uint64_t object_id; uint32_t permissions; time_t cached_at; time_t ttl;} acl_cache_entry_t; #define ACL_CACHE_SIZE 1024acl_cache_entry_t acl_cache[ACL_CACHE_SIZE]; int cached_acl_check(uint64_t subject, uint64_t object, uint32_t requested) { // Compute cache key uint32_t key = (subject ^ object) % ACL_CACHE_SIZE; acl_cache_entry_t *e = &acl_cache[key]; // Check for cache hit if (e->subject_id == subject && e->object_id == object && time(NULL) - e->cached_at < e->ttl) { // Cache hit! return (e->permissions & requested) == requested ? 0 : -EACCES; } // Cache miss - perform full ACL check int permissions = full_acl_check(subject, object); // Update cache e->subject_id = subject; e->object_id = object; e->permissions = permissions; e->cached_at = time(NULL); e->ttl = 60; // 60 second TTL return (permissions & requested) == requested ? 0 : -EACCES;}Capability Table Implementations:
Process capability lists (like file descriptor tables) use different strategies:
Fixed-size array (traditional Unix):
struct files_struct {
struct file *fd_array[NR_OPEN_DEFAULT]; // 64 entries
};
Growable sparse array (Linux):
struct fdtable {
unsigned int max_fds; // Current capacity
struct file **fd; // Pointer to fd array
unsigned long *close_on_exec; // Bitmap
unsigned long *open_fds; // Bitmap of used fds
};
Object handle table (Windows):
// Simplified - actual implementation uses multi-level tables
struct handle_table {
uint32_t count;
struct handle_entry {
void *object; // Kernel object pointer
uint32_t access_mask; // Granted rights
uint32_t attributes; // INHERIT, PROTECT, etc.
} entries[HANDLE_COUNT];
};
Caching Considerations:
Caching authorization decisions is attractive for performance but creates invalidation challenges. When an ACL changes, all cached decisions for that object must be invalidated. When group membership changes, decisions for that subject on all objects must be reconsidered. Distributed systems with caching face even harder coherence problems.
When access control spans multiple machines, implementation becomes significantly more complex:
The Distribution Problem:
In a single-machine system:
In a distributed system:
Network File System (NFS) Approach:
NFS traditionally uses a credential-based approach:
This is vulnerable: a compromised client can claim any UID. NFSv4 addresses this with Kerberos authentication.
Token-Based Authorization:
Modern distributed systems use signed tokens:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103
"""JWT-based distributed authorization example This demonstrates how tokens carry authorization informationacross network boundaries.""" import jwtimport timefrom datetime import datetime, timedelta # Authorization server's secret (in practice, use asymmetric keys)AUTH_SERVER_SECRET = "super-secret-key-dont-use-in-production" def issue_access_token(user_id: str, groups: list, permissions: dict) -> str: """ Authorization server issues token after authentication Token contains: - Subject identity - Group memberships - Granted permissions (capability-like) - Expiration time """ now = datetime.utcnow() payload = { "sub": user_id, # Subject "iat": now, # Issued at "exp": now + timedelta(hours=1), # Expires in 1 hour "groups": groups, # Group memberships "permissions": permissions, # Object:rights mappings "iss": "auth.example.com", # Issuer } token = jwt.encode(payload, AUTH_SERVER_SECRET, algorithm="HS256") return token def validate_and_check_access( token: str, requested_object: str, requested_permission: str) -> tuple[bool, str]: """ Resource server validates token and checks authorization This is the distributed equivalent of checking ACL + capability """ try: # Step 1: Validate token signature and expiration payload = jwt.decode( token, AUTH_SERVER_SECRET, algorithms=["HS256"] ) # Step 2: Extract subject identity (for logging/audit) subject = payload["sub"] groups = payload.get("groups", []) # Step 3: Check permissions in token (capability-style) permissions = payload.get("permissions", {}) if requested_object in permissions: allowed = permissions[requested_object] if requested_permission in allowed: return True, f"Access granted to {subject}" # Step 4: Check group-based permissions (ACL-style) # Would typically lookup object's ACL and check against groups # Simplified here: if "admin" in groups: return True, f"Admin access for {subject}" return False, f"Access denied for {subject}" except jwt.ExpiredSignatureError: return False, "Token expired" except jwt.InvalidTokenError as e: return False, f"Invalid token: {e}" # Example usage:# 1. User authenticates, gets token with embedded permissionstoken = issue_access_token( user_id="alice@example.com", groups=["developers", "project-x"], permissions={ "file:project-x/data.csv": ["read", "write"], "file:shared/docs/*": ["read"], }) print(f"Token: {token[:50]}...") # 2. User presents token to resource serverallowed, reason = validate_and_check_access( token, "file:project-x/data.csv", "read")print(f"Access to data.csv: {allowed} - {reason}") # 3. Token carries authorization - no backend lookup needed!Modern systems use OAuth 2.0 + OpenID Connect: Authentication produces ID token (who you are). Authorization produces access token (what you can do). Access tokens are capability-like, carrying granted permissions. Resource servers validate tokens and enforce access. This separates authentication, authorization, and resource concerns across services.
We've explored how the theoretical Access Matrix becomes practical implementation. Let's consolidate the key concepts:
What's Next:
Now that we understand implementation strategies, we'll examine the critical challenge of matrix size in real systems. The next page explores how systems cope with the explosive growth of subjects, objects, and rights—through grouping, role-based access control, attribute-based policies, and hierarchical compression.
You now understand how the elegant Access Matrix theory translates into practical implementations. Whether analyzing kernel permission checks, designing distributed authorization, or evaluating security architectures, you can reason about the ACL vs. capability trade-offs and their implications for your system.