Loading learning content...
When you type cat /home/alice/documents/report.pdf, the shell must transform that human-readable path into something the file system can use—specifically, the inode number (or equivalent) of the file. This transformation, called pathname resolution or name lookup, is one of the most frequently executed operations in any operating system.
Consider the magnitude: every open(), stat(), access(), exec(), and hundreds of other system calls begin with name lookup. A single compilation of a moderately complex program might invoke name lookup tens of thousands of times. The efficiency of this operation directly impacts system responsiveness.
This page provides comprehensive coverage of name lookup: the step-by-step algorithm that traverses directory hierarchies, caching mechanisms that avoid repeated disk access, handling of symbolic links and mount points, permission checking at each step, and optimizations that make modern file systems fast.
By the end of this page, you will understand: (1) The complete pathname resolution algorithm, (2) How the directory entry cache (dcache) accelerates lookups, (3) Symbolic link handling and loop detection, (4) Mount point traversal mechanics, (5) Permission checking during path traversal, and (6) Case sensitivity and normalization considerations.
Pathname resolution is fundamentally iterative: starting from a known directory, look up each component of the path in sequence until reaching the final target.
The Basic Algorithm:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667
// Simplified pathname resolution// Returns inode of target file, or error struct inode* resolve_path(const char* path, int flags) { struct inode* current; const char* component; int component_len; // Step 1: Determine starting point if (path[0] == '/') { // Absolute path: start from root current = get_root_inode(); path++; // Skip leading '/' } else { // Relative path: start from current working directory current = get_cwd_inode(); } // Step 2: Process each path component while ((component_len = next_component(path, &component)) > 0) { // Step 2a: Verify current is a directory if (!S_ISDIR(current->i_mode)) { return ERR_PTR(-ENOTDIR); } // Step 2b: Check execute permission (search permission for dirs) if (!may_exec(current)) { return ERR_PTR(-EACCES); } // Step 2c: Look up component in current directory struct inode* next = dir_lookup(current, component, component_len); if (IS_ERR(next)) { return next; // Component not found } // Step 2d: Handle symbolic links if (S_ISLNK(next->i_mode) && should_follow_link(flags, path)) { next = follow_symlink(next, flags); if (IS_ERR(next)) { return next; } } // Step 2e: Handle mount points if (is_mounted(next)) { next = get_mount_root(next); } // Move to next component iput(current); // Release reference to old inode current = next; path = skip_slashes(component + component_len); } return current;} // Helper: Extract next path componentint next_component(const char* path, const char** out) { *out = path; int len = 0; while (path[len] && path[len] != '/') { len++; } return len;}Concrete Example:
Resolving /home/alice/report.pdf:
12345678910111213141516171819202122232425262728293031
Path: /home/alice/report.pdf Step 1: Start at root directory (inode 2) Path: "home/alice/report.pdf" Step 2: Look up "home" in root directory → Scan root directory entries → Find entry: name="home", inode=100 → Current = inode 100 Path: "alice/report.pdf" Step 3: Look up "alice" in /home (inode 100) → Check execute permission on inode 100 ✓ → Scan /home directory entries → Find entry: name="alice", inode=500 → Current = inode 500 Path: "report.pdf" Step 4: Look up "report.pdf" in /home/alice (inode 500) → Check execute permission on inode 500 ✓ → Scan /home/alice directory entries → Find entry: name="report.pdf", inode=1234 → No more components Result: inode 1234 Statistics: - Directory lookups: 3 - Permission checks: 3 - Disk reads (cold cache): 3 directories + 3 inodes = 6 I/O ops - With hot dcache: 0 I/O ops (all in memory)Directories require 'execute' permission (x bit) for traversal, not 'read'. Read permission lets you list contents; execute lets you traverse through. You can have 'x' without 'r'—allowing access to files whose names you know, but not listing.
Without caching, every path resolution would require disk I/O. The directory entry cache (dcache) stores recent (parent directory, filename) → inode mappings in memory.
dcache Architecture:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950
// Linux dentry structure (simplified)struct dentry { // Identity struct qstr d_name; // Filename with precomputed hash struct dentry *d_parent; // Parent dentry struct inode *d_inode; // Associated inode (NULL if negative) // Cache management struct hlist_node d_hash; // Hash table linkage struct list_head d_lru; // LRU list for reclaim struct list_head d_child; // Child list for parent struct list_head d_subdirs; // List of subdirectories // Reference counting atomic_t d_count; // Reference count unsigned int d_flags; // DCACHE_* flags // Operations const struct dentry_operations *d_op;}; // qstr: Quick string with precomputed hashstruct qstr { unsigned int hash; // Hash for fast comparison unsigned int len; // String length const unsigned char *name; // The actual name}; // Lookup in dcache: O(1) averagestruct dentry* d_lookup(struct dentry *parent, struct qstr *name) { // Compute hash bucket unsigned int hash = full_name_hash(parent, name->name, name->len); // Search in hash table struct hlist_head *head = d_hash(hash); struct dentry *dentry; hlist_for_each_entry(dentry, head, d_hash) { if (dentry->d_parent == parent && dentry->d_name.hash == name->hash && dentry->d_name.len == name->len && memcmp(dentry->d_name.name, name->name, name->len) == 0) { // Found! Increment reference and return dget(dentry); return dentry; } } return NULL; // Not in cache}Negative Entries:
The dcache also caches failed lookups (negative entries). If you access a nonexistent file, the dcache remembers "this name doesn't exist"—preventing repeated disk scans:
123456789101112131415161718192021222324252627
// Negative dentry: d_inode is NULLstruct dentry* lookup_result; if (found_in_directory) { // Positive entry: inode exists lookup_result->d_inode = found_inode;} else { // Negative entry: file doesn't exist lookup_result->d_inode = NULL; // Cache this negative result!} // Next lookup for same name:dentry = d_lookup(parent, name);if (dentry != NULL) { if (dentry->d_inode == NULL) { // Negative cache hit: "definitely doesn't exist" return -ENOENT; // No disk I/O needed! } // Positive cache hit: return the inode} // Why negative caching matters:// - Linkers probe for many nonexistent paths// - Shell PATH search tries many directories// - Build systems stat() thousands of potential files// Without negative caching, each probe hits disk| Scenario | Without dcache | With dcache | Speedup |
|---|---|---|---|
| Same file opened 1000 times | 1000 × disk lookups | 1 disk + 999 memory | ~1000x |
| walk /usr/lib/* (1000 files) | 1000 directory reads | 1 directory read | ~1000x |
| Stat nonexistent file repeatedly | Scan directory each time | Immediate ENOENT | infinite |
| Compile project (50K files) | Hours of I/O | Seconds | ~100x |
The dcache must stay synchronized with on-disk directories. When files are created/deleted, dentries are added/removed. Network file systems (NFS) face extra challenges—remote changes may not be immediately visible. This is why NFS has 'attribute caching timeout' settings.
Symbolic links add complexity to path resolution. When the kernel encounters a symlink, it must read the link's target and continue resolution from there—potentially recursively.
Symlink Resolution Algorithm:
1234567891011121314151617181920212223242526272829303132333435363738394041424344
// Symlink resolution with loop detection#define MAX_SYMLINK_DEPTH 40 // Linux limit struct inode* follow_symlink( struct inode* link, int flags, int* depth // Current recursion depth) { char target[PATH_MAX]; int len; // Step 1: Check recursion depth (prevent infinite loops) if (*depth >= MAX_SYMLINK_DEPTH) { return ERR_PTR(-ELOOP); // Too many levels of symbolic links } (*depth)++; // Step 2: Read symlink target len = link->i_op->readlink(link, target, PATH_MAX); if (len < 0) { return ERR_PTR(len); } target[len] = '\0'; // Step 3: Resolve the target path // If absolute, start from root // If relative, start from symlink's parent directory if (target[0] == '/') { return resolve_path_internal(target, flags, depth); } else { struct dentry *parent = link->i_dentry->d_parent; return resolve_path_from(parent, target, flags, depth); }} // Example symlink chains:// /usr/lib -> /usr/lib64 (simple redirect)// /var/run -> /run (backward compat)// a -> b -> c -> a (LOOP! Returns -ELOOP) // Special handling for different operations:// - open(O_NOFOLLOW): Don't follow final symlink// - lstat(): Return symlink info, don't follow// - unlink() on symlink: Remove the link, not targetSymlink Complexity Examples:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647
Example 1: Simple symlink──────────────────────────Path: /lib/libc.so.6/lib is symlink -> /usr/lib Resolution: 1. "/" -> inode 2 (root) 2. "lib" in / -> inode 50, is symlink! Read symlink: "/usr/lib" Restart from root with "/usr/lib" 3. "/" -> inode 2 4. "usr" in / -> inode 10 5. "lib" in /usr -> inode 20 6. Continue: "libc.so.6" in /usr/lib -> inode 789 Total lookups: 6 (more than path components due to symlink) Example 2: Relative symlink───────────────────────────/home/alice/latest -> reports/2024/final.pdf Resolution: 1-3. Resolve to /home/alice, find "latest" is symlink 4. Read symlink: "reports/2024/final.pdf" (relative!) 5. Resolve from /home/alice: "reports" -> inode 100 6. "2024" in reports -> inode 101 7. "final.pdf" in 2024 -> inode 500 Example 3: Symlink loop───────────────────────/tmp/a -> b/tmp/b -> c /tmp/c -> a Path: /tmp/a Resolution: 1. "/" -> inode 2 2. "tmp" -> inode 5 3. "a" -> symlink! Read: "b" 4. "b" in /tmp -> symlink! Read: "c", depth=2 5. "c" in /tmp -> symlink! Read: "a", depth=3 6. "a" in /tmp -> symlink! Read: "b", depth=4 ... continues until depth=40 ... 41. Return -ELOOPSymlinks create Time-Of-Check-Time-Of-Use (TOCTOU) security risks. Between checking 'is this file safe?' and 'open this file', an attacker might change a symlink to point elsewhere. This is why security-critical code uses O_NOFOLLOW and fstat() on open file descriptors.
When a filesystem is mounted on a directory, that directory becomes a gateway to a different filesystem. Path resolution must detect mount points and switch to the mounted filesystem's root.
Mount Architecture:
1234567891011121314151617181920212223242526272829303132333435363738
// Simplified mount structurestruct mount { struct vfsmount mnt; // VFS mount info struct dentry *mnt_root; // Root of mounted filesystem struct dentry *mnt_mountpoint; // Where it's mounted struct mount *mnt_parent; // Parent mount // ...}; // During path resolution:struct inode* lookup_with_mounts( struct dentry *parent, struct qstr *name) { struct dentry *dentry = d_lookup(parent, name); if (!dentry) { // Not in cache, read from disk dentry = real_lookup(parent, name); } // Check if this dentry is a mount point while (d_mountpoint(dentry)) { struct mount *mnt = lookup_mnt(dentry); if (!mnt) break; // Switch to mounted filesystem's root dentry = mnt->mnt_root; // Continue checking: mounts can stack! } return dentry->d_inode;} // Mount stacking example:// /mnt/usb <- ext4 USB drive mounted here// /mnt/usb/encrypted <- LUKS container from USB mounted here// /mnt/usb/encrypted/secret <- another mount// Path resolution traverses all mount boundaries transparentlyMount Namespace Isolation:
Modern Linux supports mount namespaces—different processes can see different mount hierarchies:
123456789101112131415161718192021222324
Mount Namespace Example (containers):═══════════════════════════════════════════════════════════════════ Host namespace: / ├── home/ ├── var/ └── mnt/ └── docker/ └── overlay/ <- Container layered filesystem Container namespace (isolated view): / <- Actually docker/overlay/merged ├── home/ <- Only container's /home ├── var/ └── etc/ Same path "/home" resolves to different directoriesdepending on which namespace the process is in! Path resolution must consider:1. Which mount namespace is the process in?2. What mounts exist in that namespace?3. Which filesystems can this process access?chroot() changes a process's view of '/'. Path resolution starts from the chroot directory instead of true root. But escapes are possible if the process can access file descriptors or mount points outside the jail. Modern containers use mount namespaces for stronger isolation.
Path resolution is so frequent that even small optimizations have outsized impact. Modern kernels employ numerous techniques.
RCU-Based dcache Lookup:
1234567891011121314151617181920212223242526272829
// Linux uses RCU (Read-Copy-Update) for lock-free dcache reads// Most lookups require NO LOCKS at all! struct dentry* d_lookup_rcu(struct dentry *parent, struct qstr *name) { struct dentry *dentry; unsigned seq; rcu_read_lock(); // Lightweight: no actual locking retry: seq = read_seqcount_begin(&parent->d_seq); // Walk hash chain without locks dentry = __d_lookup_rcu(parent, name); if (dentry) { if (read_seqcount_retry(&parent->d_seq, seq)) { goto retry; // Parent changed, retry } } rcu_read_unlock(); return dentry;} // Performance impact:// - Typical lookup: ~50-100 CPU cycles// - No cache line bouncing between CPUs// - Scales linearly with coresShort-Circuit Optimizations:
123456789101112131415161718192021222324252627
// Special cases handled without full resolution // Case 1: Empty path or just slashesif (path[0] == '\0' || (path[0] == '/' && path[1] == '\0')) { return get_cwd(); // No resolution needed} // Case 2: "." (current directory)if (path[0] == '.' && (path[1] == '\0' || path[1] == '/')) { // Skip this component entirely} // Case 3: ".." (parent directory) - cached in dentrydentry = current->d_parent; // O(1), no lookup // Case 4: AT_EMPTY_PATH with fd// openat(fd, "", AT_EMPTY_PATH) operates on fd itself// No path resolution at all // Case 5: Cached working directory// cwd dentry kept in task_struct// No resolution of /home/user/project each time // Case 6: O_PATH opens// open("/some/path", O_PATH) just returns fd// Actual file content not accessed// Used for fstatat(), fchmod() etc.Hash Optimization:
1234567891011121314151617181920212223242526272829303132
// Linux uses a specialized hash for path components// full_name_hash() includes parent's hash for uniqueness unsigned int full_name_hash( const struct dentry *parent, const unsigned char *name, unsigned int len) { unsigned int hash = parent->d_name.hash; // Rolling hash: very fast, good distribution while (len--) { hash = partial_name_hash(*name++, hash); } return end_name_hash(hash);} // partial_name_hash: single byte additionstatic inline unsigned int partial_name_hash( unsigned long c, unsigned long prevhash) { // Multiply and add: excellent avalanche properties return (prevhash + (c << 4) + (c >> 4)) * 11;} // Benefits:// 1. Incremental: can compute while parsing path// 2. Parent-aware: "foo" in different dirs has different hash// 3. Fast: few instructions per character// 4. Good distribution: minimizes hash collisions| Factor | Impact | Optimization |
|---|---|---|
| dcache hit rate | Major (10-100x) | Large cache, smart eviction |
| Lock contention | Moderate (2-5x on many cores) | RCU, per-directory locks |
| Hash quality | Minor (1.1-1.5x) | Good hash function, sizing |
| Symlink depth | Can be major (40x max) | Limit depth, cache resolutions |
| Mount complexity | Usually minor | Mount cache, lazy unmount |
Use 'perf' to analyze lookup performance: 'perf stat -e dentry:d_lookup' counts cache hits. High dcache hit rates (>90%) indicate healthy caching. Low rates may indicate memory pressure or unusual access patterns.
Different file systems handle case differently, which affects path resolution significantly.
Case Handling Modes:
| File System | Case Sensitive | Case Preserving | Notes |
|---|---|---|---|
| ext4 (Linux) | Yes | Yes | 'File.txt' ≠ 'file.txt' |
| NTFS (Windows) | No* | Yes | Registry controls sensitivity |
| APFS (macOS) | No* | Yes | Per-volume option |
| FAT32 | No | Limited (8.3 uppercase) | LFN preserves case |
| HFS+ | No | Yes | Mac default |
12345678910111213141516171819202122232425262728
// Case-insensitive comparison (NTFS style)int compare_names_ci(const char *a, const char *b, size_t len) { for (size_t i = 0; i < len; i++) { // Convert to uppercase for comparison char ca = toupper((unsigned char)a[i]); char cb = toupper((unsigned char)b[i]); if (ca != cb) return ca - cb; } return 0;} // But it's more complex with Unicode:// - "straße" should equal "STRASSE" (German sharp S)// - Turkish: "i" has two uppercase forms: "I" and "İ"// - Unicode has multiple normalization forms // Windows solution: Upcase table in $UpCase file// Maps every Unicode character to uppercase equivalent// Lookup uses: upcase_table[char1] == upcase_table[char2] // macOS HFS+ uses Unicode decomposition:// "café" stored as "cafe\u0301" (e + combining accent)// Comparisons normalize before matching // Implications for dcache:// - Case-sensitive: exact hash of name// - Case-insensitive: hash of uppercased/normalized name// - Need separate caches for each mode!Git repositories created on case-insensitive systems (Mac, Windows) may contain 'file.txt' and 'File.txt' as the same file. On Linux, these become two files. This causes subtle bugs when checking out on different platforms. Use 'git config core.ignorecase' to control behavior.
Name lookup is the gateway between human-readable paths and file system internals. Its efficiency determines overall system responsiveness.
Module Complete:
You've now completed the comprehensive coverage of directory implementation—from the data structures (linear lists, hash tables, B-trees) through the entries they contain, to the lookup algorithms that traverse them. This knowledge provides the foundation for understanding how file systems organize and access files efficiently.
Congratulations! You now understand directory implementation comprehensively: linear lists, hash tables, B-trees, directory entry structures, and name lookup algorithms. This knowledge explains how file systems scale from tiny directories to those containing millions of files, and why operations like 'ls' and 'open' can be so fast.