When you delete a file in Unix, you might expect the data to be immediately erased. But here's a surprising reality: deleting files doesn't always delete data. The file's name disappears from the directory, yet the actual bytes might persist on disk—sometimes indefinitely.
The key to understanding this behavior is the link count, a small counter stored in every inode (16 bits wide on many file systems) that determines when a file truly ceases to exist. This reference counter is the heartbeat of file lifecycle management, and mastering it reveals deep truths about how Unix file systems actually work.
By the end of this page, you will understand how link count governs file existence, how various operations affect it, why files can survive 'deletion,' and how utilities inspect and leverage link count for correct operation.
The link count (also called nlink or i_links_count) is a field in the inode that tracks the number of directory entries (hard links) pointing to that inode. It serves as a reference counter for the file's existence.
The fundamental rule:
When link count drops to zero and no process has the file open, the file system reclaims the inode and its data blocks.
This rule has two critical conditions, not one. A file can have zero hard links but still exist if a process holds an open file descriptor to it. Conversely, a file with non-zero link count is never reclaimed, regardless of whether anyone is using it.
| Link Count | Open File Descriptors | File Status | Disk Space |
|---|---|---|---|
| ≥ 1 | Any | Normal file, accessible via directory | Allocated |
| 0 | ≥ 1 | Orphaned but alive; accessible via FD only | Allocated |
| 0 | 0 | Deleted; scheduled for reclamation | Pending free |
```shell
# Observe link count in action
touch newfile.txt

# Check link count with stat
stat newfile.txt | grep Links
# Output: Links: 1

# Using ls -l, the second column shows link count
ls -l newfile.txt
# Output: -rw-r--r-- 1 user group 0 Jan 16 10:00 newfile.txt
#                    ^
#                    link count = 1

# Create a hard link
ln newfile.txt hardlink.txt
stat newfile.txt | grep Links
# Output: Links: 2

# Create another
ln newfile.txt another_link.txt
stat newfile.txt | grep Links
# Output: Links: 3

# Delete original - link count decrements but file persists
rm newfile.txt
stat hardlink.txt | grep Links
# Output: Links: 2

# Data is still fully accessible
cat hardlink.txt  # Works fine
```

The stat command provides the most detailed view: Links: N. The ls -l command shows it as the second column after permissions. Programming interfaces access it via the st_nlink field in the struct stat returned by the stat() and fstat() system calls.
Several file system operations modify the link count. Understanding which operations increment, decrement, or leave it unchanged is crucial for predicting file behavior.
| Operation | Effect on Link Count | Details |
|---|---|---|
creat() / open(O_CREAT) | +1 (new file) | New inode allocated with link count 1 |
link(old, new) | +1 | Adds directory entry for existing inode |
unlink(path) | -1 | Removes directory entry; may trigger deletion |
remove(path) on file | -1 | Equivalent to unlink for regular files |
rename(old, new) | 0 (net) | -1 for old path, +1 for new path, -1 if new existed |
mkdir(path) | +2 (new dir) | New dir inode created with links from parent (entry) and self (.) |
rmdir(path) | -1 | Removes directory entry; directory must be empty |
open() / close() | 0 | Affects reference count in kernel, not on-disk link count |
read() / write() | 0 | No effect on link count |
The unlink operation in detail:
The unlink() system call is the true deletion primitive. Despite its name suggesting 'remove a link,' its behavior is precisely this: it removes the named directory entry and decrements the inode's link count; the inode and its data blocks are freed only if that count has reached zero and no process still holds the file open.
The term 'unlink' rather than 'delete' reflects this reality—you're removing a link, not necessarily destroying data.
```c
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/stat.h>

int main() {
    // Create a file and write some data
    int fd = open("temp_file.txt", O_CREAT | O_RDWR | O_TRUNC, 0644);
    write(fd, "Persistent data", 15);

    // Check link count before unlink
    struct stat st;
    fstat(fd, &st);
    printf("Link count before unlink: %ld\n", (long)st.st_nlink);
    // Output: 1

    // Unlink the file while it's still open
    unlink("temp_file.txt");

    // Directory entry is gone, but file descriptor still works
    fstat(fd, &st);
    printf("Link count after unlink: %ld\n", (long)st.st_nlink);
    // Output: 0

    // We can still read and write!
    char buffer[16];
    lseek(fd, 0, SEEK_SET);
    read(fd, buffer, 15);
    buffer[15] = '\0';
    printf("Data read: %s\n", buffer);
    // Output: Persistent data

    // The file truly disappears only when we close the FD
    close(fd);
    // Now the inode and blocks are freed
    return 0;
}
```

Files with link count 0 but open file descriptors still consume disk space. If a process keeps a deleted file open indefinitely (common with log files), the space is never reclaimed. Use lsof +L1 to find such orphaned files—they show link count 0 but still have open FDs.
Directories have special link count behavior that differs from regular files. Understanding this reveals how the . and .. entries work and provides a technique for counting subdirectories.
Link count of a new directory:
When you create a directory with mkdir(), its initial link count is 2, not 1:

1. The parent directory's entry for the new directory
2. The '.' entry inside the new directory

The parent directory's link count also increases by 1 because the child's '..' entry points back to the parent. So creating a subdirectory increments both:

- The new directory starts at 2 (parent's entry, plus its own '.')
- The parent gains 1 (the child's '..')
```shell
# Create a test directory structure
mkdir -p /tmp/linktest
cd /tmp/linktest

# Check link count of an empty directory
stat . | grep Links
# Output: Links: 2
# (Parent's entry for 'linktest' + own '.')

# Create some subdirectories
mkdir subdir1 subdir2 subdir3

# Check parent link count now
stat . | grep Links
# Output: Links: 5
# (2 original + 3 from subdirectories' '..' entries)

# Formula: directory link count = 2 + number_of_subdirectories

# This is why 'find' can use link count to optimize directory traversal!
# A directory with link count 2 has no subdirectories, so 'find'
# doesn't need to descend into it looking for more directories.

# Check a subdirectory (empty, no children)
stat subdir1 | grep Links
# Output: Links: 2

# Add a subdirectory to subdir1
mkdir subdir1/nested

stat subdir1 | grep Links
# Output: Links: 3
```

| Scenario | Link Count | Reason |
|---|---|---|
| Empty directory | 2 | Parent entry + self '.' |
| Directory with only files | 2 | Files don't add to directory link count |
| Directory with N subdirectories | 2 + N | Each subdirectory's '..' adds 1 |
| Root directory '/' | Varies (often 20+) | Many system subdirectories |
To count subdirectories without listing: stat -c%h directory returns the link count. Subtract 2 to get the number of immediate subdirectories. This is O(1) regardless of how many files or subdirectories exist!
The find optimization:
The find command uses directory link counts to optimize directory traversal when searching only for directories (not for files with -type f). When find sees a directory with link count 2, it knows there are no subdirectories to descend into, saving stat calls and I/O.
However, some file systems (btrfs, some network file systems) don't maintain accurate directory link counts for performance reasons, always reporting link count 1. In these cases, find falls back to listing directory contents. This is controlled by the -noleaf option:
```shell
# For file systems with unreliable link counts
find /mnt/btrfs -noleaf -type d
```
The interaction between link count and open file descriptors is one of the most subtle aspects of Unix file systems. Files can exist in a 'zombie' state—technically deleted but still consuming resources and accessible to processes that have them open.
The kernel maintains two reference counts:
On-disk link count (st_nlink) — Directory entries pointing to the inode. Stored in the inode on disk.
In-kernel reference count — Open file descriptors + kernel internal references. Stored in the in-memory vnode/inode structure.
The file is reclaimed only when both reach zero: the last directory entry has been removed and the last file descriptor has been closed.
This separation enables powerful patterns like atomic file replacement and temporary file handling.
```shell
# Finding orphaned files (deleted but still open)

# lsof can find files where link count is 0
lsof +L1
# Example output:
# COMMAND  PID  USER  FD   TYPE DEVICE SIZE/OFF  NLINK NODE   NAME
# apache2  1234 www   12w  REG  8,1    1048576   0     98765  /var/log/apache2/access.log (deleted)
# mysql    5678 mysql 15w  REG  8,1    524288000 0     112233 /var/lib/mysql/ib_logfile0 (deleted)

# These files consume disk space but have no directory entries
# The '(deleted)' marker indicates link count 0

# Check disk space consumed by deleted files
lsof +L1 | awk '{sum += $7} END {print "Orphaned space: " sum " bytes"}'

# Common causes:
# 1. Log files deleted while app still writes to them
# 2. Temp files whose creator crashed before cleanup
# 3. Files replaced via rename() while old version still open

# Solution: restart the process holding the file open, or:
# For some apps, send SIGHUP to trigger log rotation/reopen

# Investigating a specific process
lsof -p <pid> +L1

# To reclaim space without restart (if the app supports it):
# truncate the file via /proc/<pid>/fd/<fd>
: > /proc/1234/fd/12
# The redirection truncates the deleted file, releasing its blocks
```

If df shows a filesystem at 100% but du accounts for less space, orphaned files are likely the cause. This commonly happens after log rotation when services don't reopen their log files. Always check lsof +L1 when investigating disk space discrepancies.
The temporary file pattern:
A classic Unix idiom exploits this behavior for creating temporary files that are automatically cleaned up:
```c
// Create temp file (mkstemp replaces the XXXXXX template with a
// unique suffix and returns an already-open descriptor)
char name[] = "/tmp/scratch.XXXXXX";
int fd = mkstemp(name);
// Immediately unlink - link count becomes 0
unlink(name);
// File is now invisible in directory listings
// but we can still read/write via fd
write(fd, data, len);
lseek(fd, 0, SEEK_SET);
read(fd, buffer, len);
// When we close or the process exits, the file is automatically freed
close(fd); // Cleanup happens here, guaranteed
```
This pattern ensures cleanup even if the program crashes—the kernel closes all file descriptors on process termination, triggering reclamation of zero-link files.
Every file system has a maximum link count, and specific scenarios can lead to unexpected behavior. Understanding these limits prevents surprising failures in production environments.
| File System | Max Link Count | Limiting Factor |
|---|---|---|
| ext2/ext3 | 32,000 for directories | i_links_count is 16-bit; dir limit is lower |
| ext4 | 65,000 (default) | Configurable per-filesystem |
| ext4 (dir_nlink) | Unlimited for directories | Feature flag enables arbitrary subdirectory count |
| XFS | ~4 billion | 32-bit link count field |
| btrfs | 65,535 | 16-bit field in btrfs_inode_item |
| NTFS | 1,024 | MFT hard link limit |
| HFS+ | 32,767 | Signed 16-bit integer |
| APFS | Unlimited | 64-bit counter |
The ext3 subdirectory limit:
Historically, ext3's 32,000 subdirectory limit caused real problems. Build systems creating many output directories (one per component in large projects) could hit this limit. Solutions included:

- Hashing entries into multiple levels of nested subdirectories
- Migrating to a file system with a higher limit, such as XFS
- Moving to ext4 and enabling its dir_nlink feature

Directory link count overflow:
On ext3, attempting to create a subdirectory beyond the limit simply fails with EMLINK ('Too many links'). In ext4 with dir_nlink enabled (now the default), the directory can keep growing: once the count would overflow, the kernel sets the directory's link count to 1, a sentinel value meaning 'unknown.' This signals to applications, find included, that the link count is no longer reliable and must not be used to optimize traversal.
```shell
# Check if ext4 has dir_nlink enabled (unlimited subdirectories)
tune2fs -l /dev/sda1 | grep dir_nlink
# If listed in filesystem features, unlimited subdirectories are supported

# Check current link count of a directory
stat -c "%h" /path/to/directory

# To test if you're hitting limits, attempt to create a hard link
ln existing_file.txt new_link.txt
# If EMLINK is returned: "Too many links"

# Monitor for link count issues in logs
dmesg | grep -i "link count"

# For directories hitting the old 32000 limit:
# Link count will show as 1 (special overflow indicator)
stat /var/spool/some_large_cache | grep Links
# Links: 1 suggests the directory exceeded limits
```

NFS and other network file systems may return link count 1 for all directories (or inaccurate values) because querying accurate counts across the network is expensive. Tools that depend on link count accuracy should handle this gracefully or use -noleaf mode.
Understanding link counts programmatically enables building robust file handling logic, detecting hard links, implementing safe deletion, and more.
```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>
#include <stdlib.h>

// Function to safely delete a file only if it's the last link
// Returns: 0 on success, -1 on error, 1 if skipped (not last link)
int safe_delete_if_last_link(const char *path) {
    struct stat st;

    if (stat(path, &st) == -1) {
        perror("stat");
        return -1;
    }

    if (st.st_nlink > 1) {
        fprintf(stderr, "Warning: %s has %ld hard links, skipping delete\n",
                path, (long)st.st_nlink);
        return 1;  // File has other links, don't delete
    }

    // This is the last link, safe to delete
    if (unlink(path) == -1) {
        perror("unlink");
        return -1;
    }
    return 0;
}

// Function to detect if two paths refer to the same file
int are_same_file(const char *path1, const char *path2) {
    struct stat st1, st2;

    if (stat(path1, &st1) == -1 || stat(path2, &st2) == -1) {
        return 0;  // Can't determine, assume different
    }

    // Same file if device and inode match
    return (st1.st_dev == st2.st_dev && st1.st_ino == st2.st_ino);
}

// Function to find all hard links in a directory tree
// (Simplified: just identifies files with nlink > 1)
void find_hardlinked_files(const char *path) {
    struct stat st;

    if (stat(path, &st) == -1) {
        return;
    }

    if (S_ISREG(st.st_mode) && st.st_nlink > 1) {
        printf("%s: inode %ld, %ld links\n",
               path, (long)st.st_ino, (long)st.st_nlink);
    }
}

// Demonstration
int main() {
    struct stat st;

    // Check link count of a file
    if (stat("/etc/passwd", &st) == 0) {
        printf("/etc/passwd:\n");
        printf("  Inode: %ld\n", (long)st.st_ino);
        printf("  Link count: %ld\n", (long)st.st_nlink);
        printf("  Device: %ld\n", (long)st.st_dev);
    }

    // Check if stat returns link count for directories
    if (stat("/tmp", &st) == 0) {
        printf("/tmp:\n");
        printf("  Link count: %ld\n", (long)st.st_nlink);
        printf("  Estimated subdirectories: %ld\n", (long)(st.st_nlink - 2));
    }

    return 0;
}
```

When implementing backup or copy utilities, track (device, inode) pairs to detect hard links. Otherwise, you'll copy the same data multiple times, wasting space and breaking the hard link relationship on restore.
Tools like tar, rsync, and cp use this technique.
We've explored the link count from its fundamental role in file lifecycle management to its practical applications in programming and system administration. Here are the essential insights:
- The link count is the number of directory entries (hard links) pointing to an inode.
- A file is reclaimed only when its link count reaches zero and no process holds it open.
- unlink() removes a name, not necessarily the data; files can outlive their 'deletion.'
- A new directory starts with link count 2 (parent's entry plus its own .). Each subdirectory adds 1 via its .. entry.
- Use lsof +L1 to find deleted files still held open.

What's next:
With hard links and link counts understood, we'll explore symbolic links (soft links)—a fundamentally different linking mechanism that overcomes many hard link limitations at the cost of introducing new complexities.
You now understand link count mechanics deeply—how operations affect it, why files can outlive their names, and how to leverage this knowledge in both scripting and programming. Next, we'll contrast this with the symbolic link approach.