Imagine having the same document accessible from three different folders—your home directory, your projects folder, and a shared team directory—without consuming triple the disk space. This isn't a copy operation, and it's not a shortcut in the traditional sense. Each location provides genuine, first-class access to the same underlying data, and any modification through one path is instantly visible through all others.
This powerful capability is provided by hard links, one of the most fundamental yet often misunderstood features of Unix-like file systems. Understanding hard links requires a paradigm shift in how we conceptualize files themselves—moving from the intuitive notion of 'files stored in folders' to the deeper reality of inodes, directory entries, and reference counting.
By the end of this page, you will understand the true nature of hard links—how they work at the inode level, why they behave differently from copies, their inherent limitations, and their practical applications in system administration and software development.
To understand hard links, we must first deconstruct our everyday notion of a 'file.' When users interact with files, they typically think:
A file is something stored in a folder, with a name, content, and properties.
This mental model conflates three distinct entities that file systems carefully separate:

- The **name** — a directory entry that maps a human-readable string to an inode number
- The **inode** — the on-disk structure that holds the file's metadata and pointers to its data blocks
- The **data blocks** — the regions on disk that store the actual content
In Unix-like file systems, the inode holds both the data and metadata, while directory entries (names) are merely references to inodes. This separation is the foundation of hard links.
A filename is not the file itself—it's a pointer to the file. Just as multiple variables in a program can reference the same object, multiple filenames can reference the same inode. Each such reference is a hard link.
The inode architecture:
An inode (index node) is a data structure on disk that stores:

- File type and permissions
- Owner and group IDs
- File size in bytes
- Timestamps (access, modification, inode change)
- The hard link count
- Pointers to the data blocks holding the content
Notice what the inode does not contain: the filename. The filename lives in a directory entry that maps the name string to the inode number.
| Layer | What It Contains | Where It Lives | Uniqueness |
|---|---|---|---|
| Inode | Metadata + block pointers | Inode table on disk | Unique per file system |
| Data blocks | Actual file content | Data region on disk | Shared via inode |
| Directory entry | Name → inode mapping | Parent directory | Can have multiple per inode |
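Because names are just mappings to inode numbers, you can enumerate every directory entry that references a given inode. A small sketch, assuming GNU `stat` and `find`; the filenames are illustrative:

```shell
# List every name referencing one inode, within one filesystem
tmp=$(mktemp -d)
echo "hello" > "$tmp/a"
ln "$tmp/a" "$tmp/b"                # second name, same inode

ino=$(stat -c %i "$tmp/a")          # the shared inode number
find "$tmp" -xdev -inum "$ino"      # prints both names: $tmp/a and $tmp/b

rm -r "$tmp"
```

The `-xdev` flag keeps the search on one filesystem, which matters because inode numbers are only unique per filesystem.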
A hard link is a directory entry that points directly to an inode. When you create a file, the file system:

1. Allocates a new inode and records the file's metadata in it
2. Allocates data blocks for the content and records their locations in the inode
3. Adds a directory entry in the parent directory mapping the name to the inode number, with the link count set to 1

When you create a hard link, the file system:

1. Looks up the inode number of the existing name
2. Adds a new directory entry mapping the new name to that same inode number
3. Increments the inode's link count
No data is copied. No new inode is created. The only change is an additional directory entry and an incremented counter.
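The "no data is copied" claim is easy to verify with `du`, which counts each inode once per run. A quick sketch:

```shell
# Show that a hard link adds no data blocks, only a directory entry
tmp=$(mktemp -d)
dd if=/dev/zero of="$tmp/big" bs=1M count=10 status=none

du -sh "$tmp"               # about 10M
ln "$tmp/big" "$tmp/big2"
du -sh "$tmp"               # still about 10M: only a directory entry was added
ls -i "$tmp"                # both names display the same inode number

rm -r "$tmp"
```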
```shell
# Create an original file
echo "Important data that exists exactly once on disk" > original.txt

# Verify the inode number and link count
ls -li original.txt
# Output: 1234567 -rw-r--r-- 1 user group 48 Jan 16 10:00 original.txt
#         ^^^^^^^            ^
#         inode number       link count = 1

# Create a hard link using the 'ln' command (no -s flag)
ln original.txt linked.txt

# Verify both names point to the same inode
ls -li original.txt linked.txt
# Output:
# 1234567 -rw-r--r-- 2 user group 48 Jan 16 10:00 original.txt
# 1234567 -rw-r--r-- 2 user group 48 Jan 16 10:00 linked.txt
# ^^^^^^^            ^
# same inode         link count = 2

# Both names have identical inode numbers
# Both names show a link count of 2
# This is the same file accessed through two different names
```

System call mechanics:
At the kernel level, creating a hard link involves the link() system call:
int link(const char *oldpath, const char *newpath);
The kernel implementation:

1. Resolves both paths and locates the source inode
2. Verifies that both paths are on the same file system, that the source is not a directory, and that the caller may modify the destination directory
3. Writes the new directory entry and increments the inode's link count
The link count increment and directory entry creation are atomic. Either both succeed or neither does. This atomicity is crucial for file system consistency—otherwise, an inode could be orphaned (link count says 1 but no directory entries exist) or leaked (link count says 0 but a directory entry still points to it).
Hard links exhibit several distinctive characteristics that stem from their nature as multiple directory entries pointing to the same inode. Understanding these characteristics is essential for effective use.
- **Same inode number** — every hard link to a file shows the same inode number in `ls -i` output. This is the definitive test for hard links.
- **Equal status** — no name is the "original"; each directory entry is a first-class reference to the inode.
- **Shared metadata** — permissions, ownership, size, and timestamps live in the inode, so they appear identical through every name.
- **Link count semantics** — the count reflects every directory entry referencing the inode; for directories this includes the entries the filesystem itself maintains (`.` and `..`).
```shell
# Demonstrating equality: there is no "original"
echo "Original content" > file1.txt
ln file1.txt file2.txt
ln file1.txt file3.txt

# All three names are equal references
stat file1.txt file2.txt file3.txt
# All show: Links: 3, Inode: 1234567 (same number)

# Delete the "original" - file2 and file3 are unaffected
rm file1.txt
cat file2.txt
# Output: Original content

# Modify through file2 - file3 sees the change instantly
echo "Modified content" > file2.txt
cat file3.txt
# Output: Modified content

# Check link count dropped to 2
stat file2.txt
# Links: 2
```

The equality principle in practice:
This equality has profound implications. When you 'delete' a file, you're actually:

1. Removing one directory entry (the name)
2. Decrementing the inode's link count
3. Freeing the inode and its data blocks only if the count reaches zero and no process holds the file open
This means:

- `rm` removes a name, not necessarily the data
- The data survives as long as at least one hard link (or open file descriptor) remains
- Disk space is reclaimed only when the last link is removed and the last descriptor is closed
When a process opens a file, the kernel increments an internal reference count (distinct from the on-disk link count). Even if all hard links are deleted while a file is open, the inode and data persist until the process closes the file descriptor. This is how Unix handles log rotation: rename old log, create new log, and the old log is cleaned up when logging processes restart.
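You can watch this behavior from the shell by holding a file descriptor open across an `rm`. A minimal sketch using bash's redirection syntax:

```shell
# Demonstrate that an open descriptor keeps an unlinked inode alive
tmp=$(mktemp -d)
echo "still alive" > "$tmp/log"

exec 3< "$tmp/log"    # open a read descriptor on the inode
rm "$tmp/log"         # last directory entry gone; on-disk link count is now 0

cat <&3               # prints: still alive — the kernel kept the inode
exec 3<&-             # closing the descriptor lets the kernel free the inode

rm -r "$tmp"
```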
Understanding the on-disk structures involved in hard links illuminates why they behave as they do. Let's trace through the data structures in a typical ext4 file system.
Directory entry structure:
In ext4, a directory is itself a file containing a sequence of directory entries. Each entry contains:
struct ext4_dir_entry_2 {
__le32 inode; /* Inode number (4 bytes) */
__le16 rec_len; /* Directory entry length (2 bytes) */
__u8 name_len; /* Name length (1 byte) */
__u8 file_type; /* File type (1 byte) */
char name[]; /* File name (variable, up to 255 bytes) */
};
When you create a hard link, the file system adds a new directory entry with:

- `inode` set to the existing file's inode number
- `name` and `name_len` set to the new link's name
- `file_type` copied from the source inode's type
- `rec_len` sized to hold the entry
Inode structure (simplified):
struct ext4_inode {
__le16 i_mode; /* File type and permissions */
__le16 i_uid; /* Owner user ID (low 16 bits) */
__le32 i_size_lo; /* File size in bytes */
__le32 i_atime; /* Access time */
__le32 i_ctime; /* Inode change time */
__le32 i_mtime; /* Modification time */
__le32 i_dtime; /* Deletion time */
__le16 i_gid; /* Group ID (low 16 bits) */
__le16 i_links_count; /* Hard link count */
__le32 i_blocks_lo; /* Block count */
__le32 i_flags; /* File flags */
/* ... block pointers and extended attributes ... */
};
The i_links_count field is a 16-bit unsigned integer, meaning a single inode can have up to 65,535 hard links in ext4 (though practical limits may be lower).
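The effective limit varies by filesystem and can be queried at runtime through `pathconf(3)`, exposed by the standard `getconf` utility:

```shell
# Maximum hard links per inode on the filesystem holding the current directory
getconf LINK_MAX .
# ext4 typically reports 65000, somewhat below the 16-bit field's ceiling
```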
| Step | Data Structure Modified | Change Made |
|---|---|---|
| 1. Resolve paths | VFS dentry cache | Locate parent directory inode |
| 2. Validate | Source inode, filesystem superblock | Check same FS, not directory, permissions |
| 3. Create entry | Destination directory data blocks | Insert new entry with source inode number |
| 4. Increment count | Source inode | i_links_count++ |
| 5. Update timestamps | Source inode, destination directory inode | Update ctime and mtime respectively |
| 6. Journal commit | Journal area | Commit all changes atomically |
Modern file systems like ext4 use journaling to ensure link operations are atomic. If a crash occurs mid-operation, the journal replay either completes the operation or rolls it back, preventing inconsistent link counts.
Hard links come with several fundamental restrictions that arise from their design. Understanding these limitations helps you choose when hard links are appropriate.
- **Same file system only** — an inode number is meaningful only within its own file system; inode 1234567 on /dev/sda1 is meaningless on /dev/sda2. Attempting to create a cross-filesystem hard link fails with EXDEV: Invalid cross-device link.
- **No hard links to directories** — the `.` and `..` entries are special cases created by the filesystem itself. User-created cycles would break utilities that traverse directories recursively (find, rm -r, du) and could cause infinite loops.
- **The target must exist** — a hard link is a reference to an existing inode, so linking to a non-existent name fails.
```shell
# Attempting cross-filesystem hard link
echo "test" > /home/user/file.txt
ln /home/user/file.txt /mnt/usb/file_link.txt
# Error: ln: failed to create hard link '/mnt/usb/file_link.txt': Invalid cross-device link

# Attempting to hard link a directory
mkdir my_directory
ln my_directory my_dir_link
# Error: ln: my_directory: hard link not allowed for directory

# Attempting to link a non-existent file
ln does_not_exist.txt new_link.txt
# Error: ln: failed to access 'does_not_exist.txt': No such file or directory

# These restrictions are fundamental, not configuration issues
```

Why directories cannot have hard links:
The prohibition on directory hard links prevents several dangerous scenarios:
Infinite traversal loops — A hard link from /a/b/c to /a would create a cycle that find, ls -R, du, and rm -r would never escape.
Ambiguous parent references — If a directory has multiple parents via hard links, what does .. resolve to? The answer is undefined and breaks fundamental navigation assumptions.
File system damage — Many file system repair tools (fsck) assume the directory structure is a tree. Cycles violate this invariant and could cause data loss during repair.
Inconsistent semantics — Deleting a directory should free all its contents, but what if another hard link still references that directory from elsewhere?
The . and .. entries are created automatically by the filesystem and are carefully managed to maintain tree structure invariants.
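When a directory does need to be reachable from several places, the sanctioned tool is a symbolic link — a separate inode whose content is a path, so it may point at directories and cross filesystems. A short sketch with illustrative paths:

```shell
# Symlinks provide the directory aliasing that hard links forbid
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/data"

ln -s "$tmp/proj/data" "$tmp/data_link"   # -s creates a symlink, not a hard link
ls -ld "$tmp/data_link"                   # shows: data_link -> .../proj/data

touch "$tmp/data_link/file.txt"           # writes land in proj/data
ls "$tmp/proj/data"                       # prints: file.txt

rm -r "$tmp"
```

The trade-off: a symlink breaks if its target is renamed or removed, whereas hard links remain valid as long as the inode lives.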
Despite their limitations, hard links are valuable in several real-world scenarios. Their key advantages are space efficiency and perfect synchronization—changes through one link are immediately visible through all others.
| Use Case | How Hard Links Help | Example |
|---|---|---|
| Incremental backups | Unchanged files are hard linked to previous backup, saving space | rsync --link-dest, Time Machine |
| Build systems | Share object files between build directories without copying | Bazel, distributed builds |
| Package management | Multiple packages referencing the same library file | dpkg, RPM deduplication |
| Safe file replacement | Create new version, then atomically rename over old version | Configuration updates |
| Multi-location access | Same file accessible from multiple directory contexts | Shared data directories |
Deep dive: Incremental backups with hard links
The most significant application of hard links is in incremental backup systems. Consider backing up a 100 GB home directory daily for a year: with full copies, 365 snapshots would consume roughly 36.5 TB.

If only 1 GB changes daily on average, hard link-based backups use about 100 GB for the first full snapshot plus roughly 1 GB per day thereafter — on the order of 460 GB for the entire year, nearly two orders of magnitude less.
The rsync --link-dest option implements this:
# First full backup
rsync -av /home/user/ /backup/2024-01-01/
# Subsequent incremental backups
rsync -av --link-dest=/backup/2024-01-01/ \
    /home/user/ /backup/2024-01-02/
Unchanged files are hard linked to the previous day's backup. Each backup directory appears complete (you can ls and see all files), but unchanged files don't consume additional space. Apple's Time Machine uses this exact technique.
A server with 1 TB of data and 2% daily churn, backed up daily for 90 days, would need 90 TB with full copies but only ~3 TB with hard link-based incrementals. This makes frequent backups practical where they would otherwise be prohibitively expensive.
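The arithmetic behind that estimate fits in a few lines of shell (sizes in GB, integer math):

```shell
# Back-of-envelope for the 90-day, 1 TB, 2% daily churn scenario
days=90
full=$((days * 1024))                # 90 full copies of a 1 TB data set
churn=$((1024 * 2 / 100))            # 2% daily churn = ~20 GB/day
incr=$((1024 + (days - 1) * churn))  # one full copy + daily deltas

echo "full copies: ${full} GB"       # prints: full copies: 92160 GB (~90 TB)
echo "incrementals: ${incr} GB"      # prints: incrementals: 2804 GB (~2.7 TB)
```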
```shell
# Safe atomic file replacement pattern
# Used for configuration files, databases, etc.

# Original config file
cat /etc/myapp/config.json
# { "version": 1, "setting": "old" }

# Create new version in the same directory
cat > /etc/myapp/config.json.new << 'EOF'
{ "version": 2, "setting": "new" }
EOF

# Atomically replace old with new
# mv on the same filesystem is atomic
mv /etc/myapp/config.json.new /etc/myapp/config.json

# If the system crashes between write and mv:
# - The old config is intact
# - The .new file may be incomplete but is ignored
# This ensures config is always in a valid state

# Hard links enable an even safer pattern with rollback:
ln /etc/myapp/config.json /etc/myapp/config.json.bak
# Now we have a backup that doesn't consume extra space
# (a later atomic mv installs a new inode under config.json,
#  so config.json.bak keeps referencing the old version)
```

We've explored hard links from their conceptual foundation to their practical applications. Let's consolidate the key insights:

- A filename is a directory entry; the file itself is an inode plus data blocks
- A hard link is simply another directory entry for the same inode — no data is copied
- All hard links are equal; deleting one name only decrements the link count
- Data is freed only when the link count reaches zero and no process holds the file open
- Hard links cannot cross file systems or point to directories
- They power space-efficient incremental backups and atomic file replacement
What's next:
Now that we understand how hard links work through inode references, we'll examine the link count in greater detail. Understanding link count behavior is crucial for predicting when files are actually freed, and for seeing how utilities like rm and unlink affect an inode's nlink count.
You now understand hard links at both conceptual and implementation levels. You can explain why hard links behave as they do, what restrictions apply, and when to use them effectively. Next, we'll explore the critical role of link count in file lifecycle management.