Hard And Soft Links - Learning Module

Soft (Symbolic) Links

Pointers Within the File System Namespace

While hard links are powerful, they carry fundamental limitations: they cannot cross filesystem boundaries, cannot link to directories, and provide no visible distinction from the 'original' file. Symbolic links (also called soft links or symlinks) solve these problems through a radically different mechanism.

Instead of sharing an inode, a symbolic link is a separate file that contains a text path pointing to another location. When the operating system encounters a symlink during path resolution, it reads the stored path and continues resolution from there—transparently redirecting access to the target.

This indirection layer unlocks capabilities impossible with hard links, but it also introduces new failure modes and behavioral subtleties that every systems programmer must understand.

What You Will Learn

By the end of this page, you will understand how symbolic links work at the file system level, their path resolution mechanics, key behavioral differences from hard links, and their role in modern system administration.

Symbolic Link Fundamentals

A symbolic link is a special file type whose content is a text string representing a path to another file or directory. This path string is called the symlink target or referent.

Key distinctions from hard links:

Aspect	Hard Link	Symbolic Link
Nature	Directory entry pointing to inode	Separate file containing path text
Inode	Shares target's inode	Has its own distinct inode
Target types	Files only (no directories)	Files, directories, or any other path
Filesystem scope	Same filesystem only	Can cross filesystem boundaries
Target existence	Target must exist at creation	Target can be nonexistent
Link count effect	Increments target's link count	No effect on target's link count
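
The inode and link-count rows of the table can be checked directly. The following is a minimal sketch, assuming GNU coreutils `stat` (for the `-c` format option) and a scratch directory created with `mktemp -d`; the file names are illustrative only.

```shell
# Scratch directory so the demo touches nothing else (names are illustrative)
demo=$(mktemp -d)
cd "$demo" || exit 1

echo "data" > original.txt
ln original.txt hard.txt        # hard link: a second name for the same inode
ln -s original.txt soft.txt     # symlink: a separate file holding the path

# %i = inode number, %h = link count (GNU stat does not follow symlinks by default)
orig_inode=$(stat -c %i original.txt)
hard_inode=$(stat -c %i hard.txt)
soft_inode=$(stat -c %i soft.txt)
orig_links=$(stat -c %h original.txt)

echo "original: inode=$orig_inode links=$orig_links"
echo "hard:     inode=$hard_inode"
echo "soft:     inode=$soft_inode"
```

The hard link reports the same inode as the original and raises its link count to 2, while the symlink has an inode of its own and leaves the count alone.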

How symbolic links work:

When you create a symlink with ln -s target linkname:
- A new inode is allocated for the symlink itself
- The inode's file type is set to 'symbolic link' (S_IFLNK)
- The target path string is stored (either in the inode or in data blocks)
- A directory entry is created mapping linkname to this new inode

When a process accesses a symlink:
- The kernel reads the symlink's content (the target path)
- Path resolution continues from the target path
- This happens automatically for most operations (transparent resolution)
- Special operations like lstat() and readlink() operate on the symlink itself

creating_symlinks.sh
Bash
# Create a regular file
echo "Original content" > original.txt
 
# Create a symbolic link using ln -s (the -s is CRITICAL)
ln -s original.txt symlink.txt
 
# Compare inodes - they're DIFFERENT (unlike hard links)
ls -li original.txt symlink.txt
# Output:
# 1234567 -rw-r--r-- 1 user group 17 Jan 16 10:00 original.txt
# 1234999 lrwxrwxrwx 1 user group 12 Jan 16 10:00 symlink.txt -> original.txt
#          ^                         ^^
#   'l' = symlink        size = length of target path
 
# Notice:
# 1. Different inode numbers (1234567 vs 1234999)
# 2. File type 'l' at start of permissions
# 3. Size is 12 bytes = length of "original.txt"
# 4. The " -> original.txt" shows the symlink target
 
# Link count of original is still 1 (symlinks don't affect it)
stat original.txt | grep Links
# Output: Links: 1
 
# Read through the symlink - works transparently
cat symlink.txt
# Output: Original content
 
# Modify through symlink - affects the original
echo "Modified content" > symlink.txt
cat original.txt
# Output: Modified content

The Critical -s Flag

Forgetting the -s flag when creating symlinks is a common error. ln target link creates a hard link; ln -s target link creates a symbolic link. This distinction is critical—they behave completely differently!

Storage and Implementation

The implementation of symbolic links varies across file systems, with optimizations for the common case of short target paths.

Fast symlinks (inline storage):

Most modern file systems store short symlink targets directly in the inode, avoiding the need to allocate and read a separate data block. This is called a 'fast symlink' or 'inline symlink.'

In ext4:

The inode's block pointer area provides 60 bytes that can hold a target inline
Targets shorter than 60 bytes use this inline storage (one byte is reserved for the terminating NUL)
No data block is allocated, so resolving the symlink needs no extra disk read

Slow symlinks (block storage):

Longer target paths require allocating a data block:

The inode's block pointers point to a data block
The data block contains the target path string
Additional disk I/O is required to read the target

Symlink Storage by File System
File System	Inline Limit	Max Target Length	Storage Method
ext2/ext3/ext4	60 bytes	PATH_MAX (4096)	Inode inline or data block
XFS	~156 bytes	1024 bytes	Inode fork or extent block
btrfs	~253 bytes	PATH_MAX	Inline in tree item or extent
NTFS	Varies	32,767 chars	Reparse point data
APFS	Varies	Large	Extended attribute or extent

symlink_storage_analysis.sh
Bash
# Analyze symlink storage characteristics
 
# Create short symlink (fast/inline)
ln -s short.txt short_symlink
stat short_symlink
# Size: 9 (length of "short.txt")
# Blocks: 0 (no data blocks allocated - inline storage!)
 
# Create long symlink target
long_target=$(printf 'a%.0s' {1..100})  # 100 character path
ln -s "$long_target" long_symlink
stat long_symlink
# Size: 100
# Blocks: 0 or 8 depending on file system
# May need a block if exceeding inline limit
 
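# Verify inline vs block storage with debugfs (ext4)
# /dev/sda1 is a placeholder: substitute the device that holds this directory
sudo debugfs -R "stat <$(stat -c %i short_symlink)>" /dev/sda1
# A "Fast link dest:" line indicates inline storage
# A BLOCKS or EXTENTS list indicates a separately allocated block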
 
# Read symlink target
readlink short_symlink
# Output: short.txt
 
readlink -f short_symlink  # -f gives absolute resolved path
# Output: /home/user/short.txt (full resolved path)
 
# Size = target string length, NOT the size of the target file
head -c 1000 /dev/zero > big_file.txt  # create a file of exactly 1000 bytes
ln -s big_file.txt big_symlink
stat big_symlink
# Size: 12 (length of "big_file.txt", not 1000)

Symlink Size Field

The size field of a symlink's stat structure is the length of the target path string in bytes—not the size of the target file. A symlink to a 10 GB file has size ~20 bytes (the path length). This is crucial when calculating disk usage or implementing backup tools.
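
A backup or disk-usage script sees very different numbers depending on whether it follows the link. This is a small sketch of that difference, assuming GNU `stat` (where `-L` dereferences the link) and GNU `head -c`; the file names are made up for the demo.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

head -c 1000 /dev/zero > big_file.txt   # a file of exactly 1000 bytes
ln -s big_file.txt big_symlink

# %s is the size field; -L makes stat follow the symlink to its target
link_size=$(stat -c %s big_symlink)       # length of the string "big_file.txt"
target_size=$(stat -L -c %s big_symlink)  # size of the file the link points to

echo "symlink itself: $link_size bytes"
echo "target file:    $target_size bytes"
```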

Path Resolution and Indirection

When the kernel encounters a symbolic link during path resolution, it performs symlink expansion—reading the target path and continuing resolution from there. This behavior is automatic and transparent for most operations, but understanding the details is essential for predicting symlink behavior.

Path resolution algorithm with symlinks:

Start with the pathname components (e.g., /home/user/link/subdir/file)
For each component, look it up in the current directory
If the component is a symlink:
- Read the symlink target
- If target starts with /: restart from root
- If target is relative: insert it in place of the symlink component
- Increment symlink counter
- If counter exceeds limit (typically 40): return ELOOP
Continue with next component
Final component handling depends on the operation
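
The loop at the heart of this algorithm can be approximated in shell. The sketch below is a simplification, not the kernel's implementation: the hypothetical helper `resolve_final` only follows symlinks on the final component of a path, and the limit of 40 is hard-coded to mirror the typical value mentioned above.

```shell
# Hypothetical helper: follow a chain of symlinks on the FINAL component only
resolve_final() {
    path=$1
    hops=0
    while [ -L "$path" ]; do
        hops=$((hops + 1))
        if [ "$hops" -gt 40 ]; then
            echo "ELOOP: too many levels of symbolic links" >&2
            return 1
        fi
        target=$(readlink "$path")
        case $target in
            /*) path=$target ;;                     # absolute target: restart from root
            *)  path=$(dirname "$path")/$target ;;  # relative target: splice into the link's directory
        esac
    done
    printf '%s\n' "$path"
}

# Demo in a scratch directory (all names are illustrative)
demo=$(mktemp -d)
echo "Target file" > "$demo/target.txt"
ln -s target.txt "$demo/chain1"
ln -s chain1 "$demo/chain2"
ln -s loop2 "$demo/loop1"
ln -s loop1 "$demo/loop2"

resolved=$(resolve_final "$demo/chain2")
echo "chain2 resolves to: $resolved"
resolve_final "$demo/loop1" 2>/dev/null || echo "loop1: gave up after 40 links"
```

The counter is what turns a circular chain into an error instead of an endless loop. A real resolver applies the same step to every component of the path, not just the last one.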

symlink_resolution.sh
Bash
# Setup for path resolution examples
mkdir -p /tmp/demo/nested/deep
echo "Target file" > /tmp/demo/target.txt
 
# Absolute symlink - target starts with /
cd /home/user
ln -s /tmp/demo/target.txt abs_link
cat abs_link
# Resolution: /home/user/abs_link 
#          -> /tmp/demo/target.txt (absolute, restart from /)
 
# Relative symlink - target is relative to symlink location
ln -s ../../tmp/demo/target.txt rel_link
cat rel_link
# Resolution: /home/user/rel_link
#          -> ../../tmp/demo/target.txt
#          -> /tmp/demo/target.txt (relative to the symlink's directory, /home/user)
 
# IMPORTANT: Relative symlinks are relative to the SYMLINK's location,
# not the current working directory!
 
# Symlink to directory
ln -s /tmp/demo linked_dir
ls linked_dir/
# Works! Lists contents of /tmp/demo
 
ls linked_dir/nested/deep
# Also works - resolution continues through the directory
 
# Chained symlinks
ln -s abs_link chain1
ln -s chain1 chain2
cat chain2
# Resolution: chain2 -> chain1 -> abs_link -> /tmp/demo/target.txt
 
# Too many symlinks causes ELOOP
# Create a circular chain
ln -s loop2 loop1
ln -s loop1 loop2
cat loop1
# Error: Too many levels of symbolic links (ELOOP)
 
# Query the advertised symlink limit
getconf SYMLOOP_MAX
# May print "undefined" (for example with glibc); the Linux kernel itself stops after 40 links

Relative Symlink Pitfall

Relative symlinks are resolved relative to the symlink's directory, NOT your current working directory. If you move a relative symlink to a different directory, it will break. Always use absolute paths for symlinks that might be moved, or use ln -sr (relative flag) to auto-compute correct relative paths.
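
The pitfall is easy to reproduce. This sketch, run in a scratch directory with illustrative names, moves one relative and one absolute symlink into a subdirectory and checks which of them still resolves.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir sub
echo "hello" > target.txt

ln -s target.txt rel_link            # relative: resolved from the link's own directory
ln -s "$demo/target.txt" abs_link    # absolute: resolved from /

mv rel_link abs_link sub/            # mv renames the links themselves, not their targets

# After the move, sub/rel_link looks for sub/target.txt, which does not exist
if [ -e sub/rel_link ]; then rel_state=ok; else rel_state=broken; fi
if [ -e sub/abs_link ]; then abs_state=ok; else abs_state=broken; fi
echo "rel_link: $rel_state, abs_link: $abs_state"
```

The stored text of `rel_link` is unchanged by the move; only the directory it is interpreted against is different.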

Symlink-aware vs symlink-following operations:

Most system calls follow symlinks automatically, but some have variants that operate on the symlink itself:

Operation	Follows Symlinks	Operates on Symlink
`stat()`	Yes	`lstat()`
`open()`	Yes	None directly; `O_NOFOLLOW` makes `open()` fail with `ELOOP` if the final component is a symlink
`chown()`	Yes	`lchown()`
`chmod()`	Yes	`lchmod()` on BSD and macOS; not supported on Linux
`readlink()`	No	N/A (reads symlink target)
`unlink()`	No	N/A (always removes symlink)

The l prefix typically indicates 'do not follow symlinks.'
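
The same split is visible from the command line. The following sketch assumes GNU `stat`, which behaves like `lstat()` by default and like `stat()` when given `-L`; the file names are illustrative.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
echo "payload" > target.txt
ln -s target.txt link

# Without -L the symlink itself is described; with -L its target is
link_type=$(stat -c %F link)
target_type=$(stat -L -c %F link)
echo "no follow: $link_type"
echo "follow:    $target_type"

# readlink reads the stored path; rm (unlink) removes the link, never the target
stored=$(readlink link)
rm link
[ -e target.txt ] && echo "target.txt survives removing the link"
```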

Permissions and Ownership

Symbolic links have interesting permission semantics that differ significantly from regular files. The symlink itself has an owner and theoretical permissions, but these behave in counterintuitive ways.

Symlink permissions are largely meaningless:

When you ls -l a symlink, you typically see lrwxrwxrwx (777). This isn't because anyone can modify the symlink—it's because symlink permissions are not enforced by most Unix systems.

What actually matters:

Creating/deleting symlinks: Controlled by permissions on the parent directory
Reading symlink target: Always allowed (symlink content is considered public)
Following symlinks: No permission check on the symlink itself
Accessing the target: Controlled by the target's permissions

Some systems (macOS and the BSDs, for example) let you change a symlink's mode and may enforce it, but this is the exception, not the rule.

symlink_permissions.sh
Bash
# Observe symlink permissions
ln -s /etc/passwd passwd_link
ls -l passwd_link
# lrwxrwxrwx 1 user group 11 Jan 16 10:00 passwd_link -> /etc/passwd
# Note: Always 777 (lrwxrwxrwx) regardless of umask
 
# chmod on symlink typically changes the TARGET, not the symlink
chmod 600 passwd_link  # This changes /etc/passwd, not the symlink!
# (Will fail unless you're root)
 
ls -l passwd_link
# Still shows lrwxrwxrwx
 
# To operate on the symlink itself, use lchmod (where available)
# Note: lchmod is not available on Linux (symlink permissions ignored)
 
# Ownership of symlinks
ls -l passwd_link
# Shows owner/group of the symlink itself
 
# To change symlink ownership without following it
chown -h newuser:newgroup passwd_link
# The -h (--no-dereference) flag operates on the symlink
 
# Access check demonstration
mkdir secure_dir
chmod 700 secure_dir
echo "secret" > secure_dir/secret.txt
 
# Create symlink to secret file
ln -s secure_dir/secret.txt secret_link
 
# Can read symlink target (readlink always works)
readlink secret_link
# Output: secure_dir/secret.txt
 
# But can't access the target without directory permission
sudo -u other_user cat secret_link
# Error: Permission denied (secure_dir blocks access)

Security Implications

Symbolic links can point anywhere in the filesystem. A malicious user might create symlinks in a world-writable directory (like /tmp) pointing to sensitive files, hoping a privileged process will follow them. This is why secure programs use O_NOFOLLOW and carefully validate paths. The Linux kernel's protected_symlinks sysctl provides additional protection.
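
For shell scripts, one common defence is to avoid predictable file names in shared directories altogether. The sketch below simulates the attack in private scratch locations (the names `app.tmp` and `app.XXXXXX` are hypothetical) and contrasts a fixed name with `mktemp`, which creates its file exclusively under an unpredictable name. It illustrates the idea and is not a complete hardening recipe.

```shell
shared=$(mktemp -d)    # stands in for a world-writable directory such as /tmp
victim=$(mktemp)       # stands in for a sensitive file
echo "important" > "$victim"

# An attacker pre-plants a symlink at a name the script is known to use
ln -s "$victim" "$shared/app.tmp"

# Unsafe: the fixed name is opened through the planted symlink
echo "scratch data" > "$shared/app.tmp"
after_unsafe=$(cat "$victim")        # the victim file was overwritten

# Safer: mktemp picks an unpredictable name and creates the file exclusively
echo "important" > "$victim"
tmpfile=$(mktemp "$shared/app.XXXXXX")
echo "scratch data" > "$tmpfile"
after_safe=$(cat "$victim")          # the victim file is untouched

echo "fixed name: victim now contains '$after_unsafe'"
echo "mktemp:     victim still contains '$after_safe'"
```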

Broken and Dangling Links

Unlike hard links, symbolic links can become broken (also called dangling)—pointing to a target that doesn't exist. This can happen by design or by accident, and understanding this behavior is crucial for robust scripting and programming.

How symlinks become broken:

Target never existed: ln -s does not check the target, so a link can name a path that doesn't exist yet
Target was deleted: File removed but symlink remains
Target was moved: Renamed or moved to a different location
Filesystem unmounted: Target on a removable filesystem
Relative path breakage: Symlink moved, breaking relative target resolution

broken_symlinks.sh
Bash
# Create a symlink to nonexistent target (allowed!)
ln -s does_not_exist.txt broken_link
ls -l broken_link
# lrwxrwxrwx 1 user group 18 Jan 16 10:00 broken_link -> does_not_exist.txt
 
# Try to access it
cat broken_link
# Error: No such file or directory
 
# The symlink EXISTS, but the TARGET doesn't
# ls shows the symlink itself; cat tries to follow it and fails
 
# Detect broken symlinks programmatically
# test -L checks if it's a symlink
# test -e checks if target exists (follows symlinks)
if [ -L broken_link ] && [ ! -e broken_link ]; then
    echo "broken_link is a broken symlink"
fi
 
# Find all broken symlinks in a directory
find /path/to/dir -xtype l
# -xtype l matches symlinks whose target doesn't exist
 
# Or using the -L flag with -type
find -L /path/to/dir -type l
# With -L, symlinks are dereferenced, so -type l matches only broken ones
 
# Another method using file command
file broken_link
# Output: broken_link: broken symbolic link to does_not_exist.txt
 
file good_link
# Output: good_link: symbolic link to existing_file.txt
 
# Remove only broken symlinks
find /path/to/dir -xtype l -delete
 
# Report every broken symlink under /usr together with its stored target
find /usr -type l -print0 | while IFS= read -r -d '' link; do
    if [ ! -e "$link" ]; then
        echo "Broken: $link -> $(readlink "$link")"
    fi
done

Broken Symlinks Are Silent Failures

Broken symlinks don't cause errors until you try to use them. Scripts that iterate over symlinks should check validity with test -e before proceeding. Many system issues stem from broken symlinks to libraries, configs, or binaries that were once valid.

Testing Symlinks: File Test Operators
Test	Returns True When	Follows Symlinks
`-e file`	File exists (any type, including target)	Yes
`-f file`	Regular FILE exists	Yes
`-d file`	DIRECTORY exists	Yes
`-L file`	File is a symbolic link	No
`-h file`	File is a symbolic link (same as -L)	No
`-L f && ! -e f`	Symlink exists but target doesn't	Combination
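
The operators in the table combine naturally into a single check. The helper `classify` below is a hypothetical name used for illustration; it tests `-L` first because every other operator follows the link.

```shell
# Hypothetical helper that combines the test operators from the table
classify() {
    if [ -L "$1" ]; then
        if [ ! -e "$1" ]; then echo "broken symlink"
        elif [ -d "$1" ]; then echo "symlink to directory"
        else echo "symlink to file"
        fi
    elif [ -d "$1" ]; then echo "directory"
    elif [ -f "$1" ]; then echo "regular file"
    elif [ -e "$1" ]; then echo "other"
    else echo "missing"
    fi
}

# Demo in a scratch directory (names are illustrative)
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir dir
echo x > file
ln -s file good_link
ln -s dir dir_link
ln -s does_not_exist.txt broken_link

for name in file dir good_link dir_link broken_link nothing; do
    printf '%-12s %s\n' "$name" "$(classify "$name")"
done
```

Checking `-e` alone cannot tell a broken symlink from a name that is simply absent, which is why the `-L` test comes first.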

Practical Applications

Symbolic links are ubiquitous in Unix systems, enabling flexible configuration, clean versioning, and portable references. Their ability to cross filesystems and link to directories makes them far more versatile than hard links for many use cases.

Common Symbolic Link Use Cases
Use Case	How Symlinks Help	Example
Version management	Point to 'current' version without changing paths	`/opt/java -> /opt/jdk-17.0.1`
Library compatibility	Provide multiple names for shared libraries	`libssl.so -> libssl.so.1.1`
Config organization	Centralize configs, symlink to expected locations	`~/.bashrc -> ~/dotfiles/bashrc`
Cross-filesystem access	Reference files on mounted filesystems	`/data -> /mnt/external/data`
Build systems	Switch build configurations	`config.h -> config_debug.h`
Web server roots	Point to current deployment	`/var/www/app -> /releases/v2.3.1`

symlink_use_cases.sh
Bash
# USE CASE 1: Version switching with symlinks
# Very common for Java, Node.js, Python installations
 
# Setup
mkdir -p /opt/java/jdk-11 /opt/java/jdk-17
ln -sfn /opt/java/jdk-17 /opt/java/current
 
# Switch to Java 11
ln -sfn /opt/java/jdk-11 /opt/java/current
# -n (--no-dereference): if current already links to a directory, replace the link itself
# instead of creating a new link inside that directory
 
# Applications use /opt/java/current/bin/java
# Switching versions is instant, no app changes needed
 
# USE CASE 2: Atomic deployments
# Deploy new version without downtime
 
# Current state
ls -l /var/www/app
# /var/www/app -> /var/www/releases/v2.3.0
 
# Deploy new version
rsync -a newcode/ /var/www/releases/v2.3.1/
 
# Atomic switch (web server sees change immediately)
ln -sfn /var/www/releases/v2.3.1 /var/www/app.new
mv -T /var/www/app.new /var/www/app
 
# The mv command is atomic within a filesystem!
# Zero downtime deployment achieved
 
# USE CASE 3: Shared library versioning
ls -la /usr/lib/libssl*
# libssl.so -> libssl.so.1.1
# libssl.so.1 -> libssl.so.1.1
# libssl.so.1.1 -> libssl.so.1.1.1
# libssl.so.1.1.1
 
# Programs link against libssl.so (symlink)
# ldconfig manages the symlink chain
# Minor updates don't break old binaries
 
# USE CASE 4: Dotfiles management
# Store configs in git, symlink to expected locations
cd ~
git clone https://github.com/user/dotfiles
 
# Create symlinks to actual config locations
ln -sf ~/dotfiles/bashrc ~/.bashrc
ln -sf ~/dotfiles/vimrc ~/.vimrc
ln -sf ~/dotfiles/gitconfig ~/.gitconfig
 
# Now dotfiles are version-controlled
# Changes sync across machines via git

Atomic Symlink Replacement

To atomically replace a symlink: create the new symlink with a temporary name, then mv -T temp_link real_link. The mv command is atomic within a filesystem, so there's never a moment when the symlink is missing or points to an invalid target. This is the foundation of zero-downtime deployments.
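
The recipe can be wrapped in a small function. This is a sketch under stated assumptions: `switch_link` is a hypothetical helper name, the temporary name is placed next to the final link so both are on the same filesystem, and `mv -T` requires GNU coreutils.

```shell
# Hypothetical helper: point LINK at TARGET without a window where LINK is missing
switch_link() {
    target=$1
    link=$2
    tmp="$link.tmp.$$"               # temporary name in the same directory as the link
    ln -s "$target" "$tmp" || return 1
    mv -T "$tmp" "$link"             # a single rename replaces the old link (GNU mv)
}

# Demo with illustrative release directories in a scratch location
demo=$(mktemp -d)
mkdir -p "$demo/releases/v2.3.0" "$demo/releases/v2.3.1"

switch_link "$demo/releases/v2.3.0" "$demo/app"
first=$(readlink "$demo/app")
switch_link "$demo/releases/v2.3.1" "$demo/app"
second=$(readlink "$demo/app")

echo "app: $first -> $second"
```

Creating the new link under a temporary name matters because `ln -sf` on its own removes the old link before creating the new one, which leaves a brief gap.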

Summary: Soft (Symbolic) Links

We've explored symbolic links comprehensively—from their implementation to their practical applications. Here are the essential insights:

Key Takeaways

• Symlinks are files containing paths: they have their own inodes and store a target path as their content.
• Transparent resolution: the kernel automatically follows symlinks during path resolution and fails with ELOOP after a fixed number of links (typically 40) to prevent loops.
• Can cross filesystem boundaries: unlike hard links, symlinks can point to any path, even on different filesystems.
• Can link to directories: symlinks overcome hard links' restriction against directory linking.
• Can become broken: if the target is deleted or moved, the symlink becomes dangling.
• Permissions on symlinks are mostly ignored: access is controlled by directory permissions and target file permissions.

What's next:

Now that we understand both hard and symbolic links, we'll explore dangling links in greater depth—why they occur, how they cause problems, and techniques for detection and remediation.

Page Complete

You now understand symbolic links at a deep level—their architecture, resolution mechanics, and practical applications. You can explain the trade-offs between hard and soft links and choose appropriately for different scenarios.