When we think of files, we typically imagine documents, images, or executables: collections of data stored on disk. But the Unix philosophy of "everything is a file" extends the file abstraction far beyond simple data storage. In the operating system world, the term "file" encompasses a rich taxonomy of objects that includes regular files, directories, symbolic links, device files, named pipes, and sockets.
Understanding file types is crucial for system programming, administration, and security. Each type has distinct behaviors, uses, and implications.
By the end of this page, you will understand all major file types in Unix/Linux and Windows systems, how to identify them, their purpose and behavior, and the practical implications of working with each type in programming and system administration.
Unix-like systems classify files into distinct types, stored as part of the file's inode. The file type is fundamental—it determines how the kernel handles operations on the file.
The Seven File Types:
$ ls -l
drwxr-xr-x 2 user user 4096 Jan 15 10:00 directory/
-rw-r--r-- 1 user user 1234 Jan 15 10:00 regular_file.txt
lrwxrwxrwx 1 user user 10 Jan 15 10:00 symlink -> target.txt
crw-rw-rw- 1 root root 1, 3 Jan 15 10:00 /dev/null
brw-rw---- 1 root disk 8, 0 Jan 15 10:00 /dev/sda
prw-r--r-- 1 user user 0 Jan 15 10:00 named_pipe
srwxrwxrwx 1 user user 0 Jan 15 10:00 socket_file
^
└── File type indicator
| Symbol | Type | Purpose | Example |
|---|---|---|---|
| - | Regular file | Data storage | document.pdf, program.exe |
| d | Directory | File organization | /home/user, /etc |
| l | Symbolic link | Reference to another file | /usr/bin/python -> python3 |
| c | Character device | Byte-stream hardware access | /dev/tty, /dev/null |
| b | Block device | Block-level hardware access | /dev/sda, /dev/nvme0n1 |
| p | Named pipe (FIFO) | Inter-process communication | Created with mkfifo |
| s | Socket | Network/IPC communication | /var/run/docker.sock |
The file type is an intrinsic property stored in the inode's mode field. It cannot be changed after creation—you cannot convert a regular file into a directory or a symlink. You must delete and recreate with the correct type.
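The type bits can be read directly out of the mode field. The following is a minimal Python sketch, not a definitive tool: the helper names `type_char` and `type_bits` are ours, and the temporary file name is only an illustration. It masks `st_mode` with `stat.S_IFMT` and compares the result with the first character that `ls -l` would print.

```python
import os
import stat
import tempfile


def type_char(path):
    """Return the ls-style type character for path (symlinks are not followed)."""
    mode = os.lstat(path).st_mode
    # stat.filemode() renders the mode the way ls -l does, e.g. "-rw-r--r--";
    # its first character is derived from the type bits alone.
    return stat.filemode(mode)[0]


def type_bits(path):
    """Return only the file-type bits of st_mode, with permission bits masked off."""
    return stat.S_IFMT(os.lstat(path).st_mode)


with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "regular_file.txt")  # hypothetical file name
    with open(regular, "w") as f:
        f.write("data")
    os.chmod(regular, 0o600)  # changing permissions leaves the type bits untouched
    print(type_char(tmp), type_bits(tmp) == stat.S_IFDIR)
    print(type_char(regular), type_bits(regular) == stat.S_IFREG)
```

Note that `chmod` can rewrite the permission bits freely, but no call exists that rewrites the type bits of an existing inode.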
Regular files are the most common file type—what most people mean when they say "file." They contain user data: documents, images, executables, libraries, configuration files, logs, and any other persistent data.
Characteristics of Regular Files:
Creating Regular Files:
# Create empty file
touch newfile.txt
# Create with content
echo "Hello" > greeting.txt
# Create a 100 MiB zero-filled file with dd
dd if=/dev/zero of=zeros.bin bs=1M count=100
Content vs. Type:
The operating system does NOT interpret regular file contents. A .txt extension doesn't make a file text—it's just a naming convention. A file named image.jpg containing random bytes is still a valid regular file to the OS.
Determining Actual Content Type:
$ file document.pdf
document.pdf: PDF document, version 1.7
$ file mystery_file
mystery_file: ELF 64-bit LSB executable, x86-64
$ file fake.jpg
fake.jpg: ASCII text
The file command examines "magic numbers" (characteristic byte sequences, usually at the start of the file) to identify content, regardless of the filename extension.
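The idea behind magic-number detection can be sketched in a few lines of Python. This is a simplified illustration, not how file(1) is implemented: the signature table below is a tiny hand-picked subset, and the names `MAGIC_NUMBERS`, `sniff`, and `sniff_file` are hypothetical.

```python
# A few well-known signatures; a real tool such as file(1) knows thousands.
MAGIC_NUMBERS = [
    (b"%PDF-", "PDF document"),
    (b"\x7fELF", "ELF executable or object"),
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"\xff\xd8\xff", "JPEG image"),
    (b"PK\x03\x04", "ZIP archive"),
]


def sniff(data):
    """Guess the content type from leading bytes, ignoring any file name."""
    if not data:
        return "empty"
    for signature, description in MAGIC_NUMBERS:
        if data.startswith(signature):
            return description
    try:
        data.decode("ascii")
        return "ASCII text"
    except UnicodeDecodeError:
        return "data"


def sniff_file(path, length=512):
    """Read the first `length` bytes of path and classify them."""
    with open(path, "rb") as f:
        return sniff(f.read(length))


print(sniff(b"%PDF-1.7\n..."))         # leading bytes of a PDF
print(sniff(b"plain words, no magic")) # what a "fake.jpg" full of text looks like
```

The file name never enters the decision, which is why a text file named fake.jpg is still reported as text.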
While the OS sees all regular files equally, humans categorize them: text files (human-readable), binary files (machine code, data formats), scripts (executable text), archives (compressed collections), and more. These distinctions are conventions above the OS level.
Directories are special files that contain mappings from names to files. A directory's "data" is a list of directory entries, each associating a name with an inode number.
Directory Entry Structure (Conceptual):
Directory: /home/user/
┌──────────────────┬─────────────┐
│ Name │ Inode │
├──────────────────┼─────────────┤
│ . │ 1234567 │ (this directory)
│ .. │ 1234000 │ (parent directory)
│ documents │ 1234568 │ (subdirectory)
│ file.txt │ 1234600 │ (regular file)
│ link │ 1234601 │ (could be symlink)
└──────────────────┴─────────────┘
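The name-to-inode mapping shown above can be inspected from Python. This is a small sketch assuming a POSIX system; `list_entries` is a hypothetical helper, and the entry names are made up for the demonstration. `os.scandir()` reports each entry's name and inode number as recorded in the directory, but it leaves out the `.` and `..` entries.

```python
import os
import tempfile


def list_entries(directory):
    """Return {name: inode number} for every entry in the directory."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "documents"))
    with open(os.path.join(tmp, "file.txt"), "w") as f:
        f.write("hello")
    for name, inode in sorted(list_entries(tmp).items()):
        print(f"{name:<12} {inode}")
```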
Directory Operations:
| Operation | Command | System Call |
|---|---|---|
| Create | mkdir dir | mkdir() |
| Delete | rmdir dir | rmdir() |
| List | ls dir | opendir(), readdir() |
| Change current | cd dir | chdir() |
| Get current | pwd | getcwd() |
Special Entries: "." and ".."
Every directory contains two special entries:
. (dot): Reference to itself; useful for relative paths
.. (dot-dot): Reference to parent directory; how cd .. works
The root directory (/) has .. pointing to itself.
Why Directories Are "Files":
Directories are files because:
But directories have restrictions:
You cannot write() to a directory directly; its entries change only through calls such as mkdir(), rmdir(), link(), and unlink().
Permissions on directories mean different things than on files: Read (r) = list contents; Write (w) = create/delete files within; Execute (x) = traverse (access files inside). Without 'x', you can't even access files whose names you know!
Symbolic links (symlinks or soft links) are files that contain a reference to another file or directory. When you access a symlink, the system transparently redirects to the target.
Creating Symbolic Links:
$ ln -s /path/to/target linkname
# Examples
$ ln -s /usr/bin/python3.11 /usr/bin/python
$ ln -s ../shared/config.json local-config.json
Symlink Contents:
A symlink stores the target path as its data:
$ ls -l /usr/bin/python
lrwxrwxrwx 1 root root 10 Jan 15 10:00 /usr/bin/python -> python3.11
                       ^^-- symlink size (length of "python3.11")
Symlink Resolution:
When accessing /home/user/data where data is a symlink to /mnt/storage/data:
The kernel reads the path stored in the symlink and continues resolution at /mnt/storage/data.
Controlling Symlink Behavior:
# Operate on the link itself (not target)
$ ls -l link # Shows link info
$ rm link # Removes the link, not target
$ readlink link # Print target path
# Operate on target (default)
$ cat link # Reads target's content
$ chmod 644 link # Changes target's permissions (usually)
The symlink() system call creates a symbolic link. The readlink() system call reads the target path stored in the symlink. These correspond to 'ln -s' and 'readlink' commands. Most other operations follow the symlink transparently.
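Python exposes both calls as `os.symlink()` and `os.readlink()`. The sketch below assumes a POSIX system; `describe_link` is a hypothetical helper and the file names are illustrative. It shows that a symlink stores only a path string, that its size is the length of that string, and that the link keeps existing after its target is gone.

```python
import os
import tempfile


def describe_link(link_path):
    """Return (stored target, size of the link itself, whether it dangles)."""
    target = os.readlink(link_path)            # readlink(): the stored path text
    size = os.lstat(link_path).st_size         # lstat(): inspects the link, not the target
    dangling = not os.path.exists(link_path)   # exists() follows the link
    return target, size, dangling


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "python3.11")
    link = os.path.join(tmp, "python")
    open(target, "w").close()
    os.symlink("python3.11", link)   # symlink(): relative target, like ln -s
    print(describe_link(link))
    os.remove(target)                # the link survives but now dangles
    print(describe_link(link))
```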
Hard links are not a separate file type—they're additional directory entries pointing to the same inode. Every file has at least one hard link (its name). Multiple hard links mean multiple names for the identical file.
Creating Hard Links:
$ ln original.txt hardlink.txt
$ ls -li
1234567 -rw-r--r-- 2 user user 1000 Jan 15 10:00 hardlink.txt
1234567 -rw-r--r-- 2 user user 1000 Jan 15 10:00 original.txt
^^^^^^^^^ ^
same inode link count = 2
Hard Link Characteristics:
| Property | Hard Link | Symbolic Link |
|---|---|---|
| Same inode? | Yes | No (has its own inode) |
| Cross filesystem? | No | Yes |
| Link to directories? | No (except . and ..) | Yes |
| Target must exist? | Yes (it IS the target) | No (can dangle) |
| Survives target rename? | Yes (same inode) | No (path breaks) |
| Space usage | Just directory entry | Directory entry + path string |
| Resolution overhead | None (direct inode) | Kernel must read link target |
Common Hard Link Uses:
. and .. are hard links to directories
Hard Link Restriction: Directories
Users cannot create hard links to directories (except . and .. which the filesystem manages). Why? Hard links could create cycles in the directory tree, so tools that walk the tree could loop forever, and a directory could be referenced from several parents while having only one .. entry.
A link count > 1 indicates hard links exist. Use 'ls -i' to see inode numbers; files with the same inode number on the same filesystem are hard links to each other. 'find . -samefile filename' locates all hard links to a file under the current directory. With GNU coreutils, 'stat --printf="%h\n" file' shows the link count directly.
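The same checks can be made programmatically. This is a minimal sketch assuming a POSIX filesystem that supports hard links; `link_info` is a hypothetical helper and the file names mirror the example above.

```python
import os
import tempfile


def link_info(path):
    """Return (inode number, link count) for path."""
    st = os.stat(path)
    return st.st_ino, st.st_nlink


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    hardlink = os.path.join(tmp, "hardlink.txt")
    with open(original, "w") as f:
        f.write("shared contents")

    os.link(original, hardlink)                   # second name for the same inode
    print(link_info(original), link_info(hardlink))
    print(os.path.samefile(original, hardlink))   # compares device and inode

    os.remove(original)                           # removes one name only
    with open(hardlink) as f:
        print(f.read(), link_info(hardlink)[1])   # data is still reachable
```

Removing a name only decrements the link count; the data is freed once the count reaches zero and no process holds the file open.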
Device files (also called device nodes or special files) represent hardware devices or kernel interfaces. They live primarily in /dev/ and allow programs to interact with hardware using standard file operations.
Two Types of Device Files:
Character Devices (c):
Examples include terminals (/dev/tty), serial ports, and pseudo-random generators.
Block Devices (b):
Device Numbers:
Device files don't store data—they store two numbers that identify the driver:
$ ls -l /dev/sda /dev/null
brw-rw---- 1 root disk 8, 0 Jan 15 10:00 /dev/sda
crw-rw-rw- 1 root root 1, 3 Jan 15 10:00 /dev/null
^ ^
| └── Minor number
└───── Major number
Major number: Identifies the driver (e.g., 8 = SCSI disk driver)
Minor number: Identifies the specific device (e.g., 0 = first disk)
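The two numbers are available to programs through the `st_rdev` field of the stat structure. The following Python sketch assumes a Unix-like system; `device_numbers` is a hypothetical helper. It uses `os.major()` and `os.minor()` to split `st_rdev`.

```python
import os
import stat


def device_numbers(path):
    """Return (kind, major, minor) for a device file; raise ValueError otherwise."""
    st = os.stat(path)
    if stat.S_ISCHR(st.st_mode):
        kind = "char"
    elif stat.S_ISBLK(st.st_mode):
        kind = "block"
    else:
        raise ValueError(f"{path} is not a device file")
    # st_rdev packs both numbers; os.major()/os.minor() unpack them.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


print(device_numbers("/dev/null"))  # on Linux the numbers match the ls -l listing above
```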
Creating Device Files (requires root):
$ mknod /dev/mydevice c 100 0 # Character device
$ mknod /dev/myblock b 200 1 # Block device
| Device | Type | Purpose |
|---|---|---|
| /dev/null | char | Discards everything written; reads return EOF |
| /dev/zero | char | Returns infinite zeros on read |
| /dev/random | char | Random bytes from the kernel CSPRNG; can block (on current Linux kernels, only until the pool is initialized) |
| /dev/urandom | char | Random bytes from the same kernel CSPRNG; never blocks |
| /dev/tty | char | Current terminal |
| /dev/sda | block | First SCSI/SATA disk |
| /dev/sda1 | block | First partition of sda |
| /dev/loop0 | block | Loopback device (file as block device) |
Block device files provide raw access to disk storage, bypassing the filesystem. Writing to /dev/sda directly overwrites the disk's partition table and data. Always be extremely careful with device files—they can destroy systems instantly.
Named pipes (also called FIFOs—First In, First Out) are files that act as communication channels between processes. Unlike anonymous pipes (used with |), named pipes persist in the filesystem and can connect unrelated processes.
Creating Named Pipes:
$ mkfifo mypipe
$ ls -l mypipe
prw-r--r-- 1 user user 0 Jan 15 10:00 mypipe
^
└── 'p' indicates named pipe
Named Pipe Behavior:
open() blocks until both ends are connected
rm removes the pipe
Communication Example:
Terminal 1 (writer):
$ echo "Hello from process A" > mypipe
# (blocks until reader connects and reads)
Terminal 2 (reader):
$ cat < mypipe
Hello from process A
# (received the data)
Programmatic Use:
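// Writer process (needs <fcntl.h> and <unistd.h>)
int fd = open("mypipe", O_WRONLY); // Blocks until a reader opens the pipe
if (fd != -1) {
    write(fd, "Hello", 5);
    close(fd);
}
// Reader process (a separate program, same headers)
int fd = open("mypipe", O_RDONLY); // Blocks until a writer opens the pipe
char buf[100];
if (fd != -1) {
    ssize_t n = read(fd, buf, sizeof(buf)); // n = bytes received; 0 means the writer closed
    close(fd);
}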
Named pipes elegantly connect programs that weren't designed to work together. A GUI program can write commands to a pipe that a background daemon reads. Logs can be piped to processing tools. This is the Unix philosophy of small, cooperating programs in action.
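The writer/reader exchange can be tried out in a single Python script. This is a sketch under stated assumptions: it runs on a POSIX system, and it uses a thread to stand in for the second process so the example stays self-contained. The names `send`, `receive`, and `demo` are hypothetical.

```python
import os
import tempfile
import threading


def send(fifo_path, message):
    # Opening for writing blocks until something opens the FIFO for reading.
    with open(fifo_path, "w") as fifo:
        fifo.write(message)


def receive(fifo_path):
    # Opening for reading blocks until a writer appears; read() returns
    # once every writer has closed its end (end of file).
    with open(fifo_path) as fifo:
        return fifo.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        fifo_path = os.path.join(tmp, "mypipe")
        os.mkfifo(fifo_path)  # same effect as the mkfifo command
        writer = threading.Thread(
            target=send, args=(fifo_path, "Hello from process A\n")
        )
        writer.start()
        received = receive(fifo_path)
        writer.join()
        return received


print(demo(), end="")
```

In real use the two ends would be unrelated programs that only agree on the path of the pipe.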
Unix domain sockets are files that enable bidirectional communication between processes on the same machine. They use socket APIs but communicate via the filesystem rather than the network.
Socket Files in the Filesystem:
$ ls -l /var/run/docker.sock
srw-rw---- 1 root docker 0 Jan 15 10:00 /var/run/docker.sock
^
└── 's' indicates socket
$ file /var/run/docker.sock
/var/run/docker.sock: socket
Characteristics:
Common Unix Socket Locations:
| Path | Service |
|---|---|
| /var/run/docker.sock | Docker daemon |
| /var/run/mysqld/mysqld.sock | MySQL database |
| /tmp/.X11-unix/X0 | X Window System display |
| /var/run/dbus/system_bus_socket | D-Bus system bus |
| /var/run/snapd.socket | Snap package daemon |
Socket vs Named Pipe:
| Feature | Named Pipe | Unix Socket |
|---|---|---|
| Direction | Unidirectional | Bidirectional |
| Mode | Byte stream only | Stream or datagram |
| Multiple clients | Complex | Native support |
| Credential passing | No | Yes |
| FD passing | No | Yes |
Tools like 'socat', 'nc' (netcat with -U flag), and 'curl --unix-socket' can connect to Unix sockets from the command line. Database clients often support socket connections as more secure and faster alternatives to TCP/IP localhost connections.
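A Unix domain socket can also be created and used from Python's standard `socket` module. The sketch below is illustrative only: the socket path `demo.sock`, the upper-casing "protocol", and the helper names are all made up, and a thread plays the server so the example fits in one script. It shows that `bind()` creates the socket file and that one connection carries data in both directions.

```python
import os
import socket
import stat
import tempfile
import threading


def echo_once(server):
    """Accept one client and send its message back in upper case."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024).upper())


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")  # hypothetical socket path
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)                  # bind() creates the socket file
            server.listen(1)
            is_socket = stat.S_ISSOCK(os.stat(path).st_mode)

            worker = threading.Thread(target=echo_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(b"ping")        # client -> server
                reply = client.recv(1024)      # server -> client, same connection
            worker.join()
        return is_socket, reply


print(demo())
```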
Pseudo-files look like regular files but don't store persistent data. Instead, they're interfaces to kernel internals, providing a file-based way to interact with the operating system.
The /proc Filesystem:
/proc provides information about processes and system state:
$ cat /proc/version
Linux version 5.15.0 (gcc 11.2.0)
$ cat /proc/cpuinfo
processor : 0
model name : Intel Core i7-10700
...
$ ls /proc/1234/ # Process with PID 1234
cmdline cwd environ exe fd maps mem status ...
These files are generated on-the-fly when read—they don't exist on any disk.
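Because these are ordinary reads, any language can consume them. The sketch below parses the `status` file listed above, assuming Linux and the `Key:<tab>value` layout that file uses. `/proc/self` (a link to the calling process's own directory) and the helper names are assumptions not taken from the text above.

```python
import os


def parse_status(text):
    """Parse 'Key:<tab>value' lines, as found in /proc/<pid>/status, into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    return fields


def read_own_status():
    """Read the status pseudo-file of the current process (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_status(f.read())


if os.path.exists("/proc/self/status"):
    status = read_own_status()
    print(status.get("Name"), status.get("Pid"))
```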
The /sys Filesystem:
/sys exposes device and driver information hierarchically:
$ cat /sys/class/thermal/thermal_zone0/temp
45000 # Temperature in milli-degrees Celsius (45°C)
$ cat /sys/block/sda/size
1953525168 # Disk size in 512-byte sectors
$ echo 1 > /sys/class/leds/input0::capslock/brightness
# Turn on Caps Lock LED (if you have permission)
Writable Pseudo-Files:
Many /proc and /sys files accept writes to configure the kernel:
# Enable IP forwarding
$ echo 1 > /proc/sys/net/ipv4/ip_forward
# Change CPU governor
$ echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
| Filesystem | Mount Point | Purpose |
|---|---|---|
| procfs | /proc | Process and system information |
| sysfs | /sys | Device and driver interfaces |
| devtmpfs | /dev | Device nodes (auto-created) |
| tmpfs | /tmp, /run | Memory-backed temporary storage |
| cgroupfs | /sys/fs/cgroup | Control group interfaces |
| debugfs | /sys/kernel/debug | Kernel debugging information |
Because kernel interfaces appear as files, you can use standard tools (cat, echo, grep) for system administration. No special APIs needed. Scripts can examine and modify system behavior using file operations. This Unix innovation from the 1970s remains powerful today.
Windows has a different file type model than Unix. While the "everything is a file" philosophy is less pervasive, Windows still has various object types accessible through file-like interfaces.
NTFS File Types:
| Type | Example Path | Purpose |
|---|---|---|
| Devices | \\.\PhysicalDrive0 | Direct disk access |
| Named Pipes | \\.\pipe\mypipe | IPC (like Unix pipes) |
| Mailslots | \\.\mailslot\name | One-way IPC |
| COM Ports | \\.\COM1 | Serial ports |
| Console | CONIN$, CONOUT$ | Console input/output |
Creating Links on Windows:
# Symbolic link (requires admin or developer mode)
> New-Item -ItemType SymbolicLink -Path link -Target targetfile
# Hard link
> New-Item -ItemType HardLink -Path link -Target targetfile
# Junction point (directories only)
> New-Item -ItemType Junction -Path link -Target targetdir
# Or using cmd mklink
> mklink link target # Symlink to file
> mklink /D link target # Symlink to directory
> mklink /H link target # Hard link
> mklink /J link target # Junction
Windows uses the '\\.\' prefix (device namespace) for non-filesystem objects. This is similar to Unix's /dev/ but less integrated. Programs must use specific functions (CreateFile with special paths) rather than normal file operations.
Correctly identifying file types is essential for system programming and administration. Multiple methods exist:
1. Using ls -l:
$ ls -l /dev/* /tmp/pipe /home/user/file.txt
crw-rw-rw- 1 root root ... /dev/null # Character device
brw-rw---- 1 root disk ... /dev/sda # Block device
prw-r--r-- 1 user user ... /tmp/pipe # Named pipe
-rw-r--r-- 1 user user ... file.txt # Regular file
2. Using stat:
$ stat --format=%F /dev/null
character special file
$ stat --format=%F /dev/sda
block special file
3. Using test/[ command:
[ -f file ] # True if regular file
[ -d path ] # True if directory
[ -L path ] # True if symbolic link
[ -p path ] # True if named pipe
[ -S path ] # True if socket
[ -b path ] # True if block device
[ -c path ] # True if character device
4. Programmatic Detection (C):
#include <stdio.h>
#include <sys/stat.h>
struct stat st;
// lstat() does not follow symlinks, so the S_ISLNK test can actually match
if (lstat(path, &st) == 0) {
    if (S_ISREG(st.st_mode)) printf("Regular file\n");
    if (S_ISDIR(st.st_mode)) printf("Directory\n");
    if (S_ISLNK(st.st_mode)) printf("Symbolic link\n");
    if (S_ISCHR(st.st_mode)) printf("Character device\n");
    if (S_ISBLK(st.st_mode)) printf("Block device\n");
    if (S_ISFIFO(st.st_mode)) printf("Named pipe\n");
    if (S_ISSOCK(st.st_mode)) printf("Socket\n");
} else {
    perror("lstat");
}
5. Python:
import os, stat
# os.lstat() does not follow symlinks, so the S_ISLNK test can actually match
mode = os.lstat(path).st_mode
if stat.S_ISREG(mode): print("Regular file")
if stat.S_ISDIR(mode): print("Directory")
if stat.S_ISLNK(mode): print("Symbolic link")
# ... etc.
stat() follows symbolic links—if you stat a symlink, you get info about the target. Use lstat() to get info about the symlink itself. This distinction is critical when checking if something IS a symlink rather than what it points to.
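The difference is easy to observe with a symlink that points at a directory. The following Python sketch assumes a POSIX system; `file_type` and the `CHECKS` table are hypothetical names. It classifies a path either way, using the `follow_symlinks` parameter of `os.stat()` to choose between stat and lstat behavior.

```python
import os
import stat
import tempfile

# Predicate/label pairs covering the seven Unix file types.
CHECKS = [
    (stat.S_ISREG, "regular file"),
    (stat.S_ISDIR, "directory"),
    (stat.S_ISLNK, "symbolic link"),
    (stat.S_ISCHR, "character device"),
    (stat.S_ISBLK, "block device"),
    (stat.S_ISFIFO, "named pipe"),
    (stat.S_ISSOCK, "socket"),
]


def file_type(path, follow_symlinks=True):
    """Name the type of path; follow_symlinks=False behaves like lstat()."""
    mode = os.stat(path, follow_symlinks=follow_symlinks).st_mode
    for predicate, label in CHECKS:
        if predicate(mode):
            return label
    return "unknown"


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target_dir")
    link = os.path.join(tmp, "link")
    os.mkdir(target)
    os.symlink(target, link)
    print(file_type(link))                         # type of what the link points to
    print(file_type(link, follow_symlinks=False))  # type of the link itself
```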
We've explored the complete taxonomy of file types in operating systems—from regular files to devices, pipes, and sockets. Let's consolidate:
What's next:
Now that we understand file types, we'll explore file structure—the internal organization of file data. How are text files different from binary files? What defines a file format? How do applications interpret raw bytes as structured information?
You now understand the complete taxonomy of file types in operating systems. This knowledge is essential for system programming, administration, security analysis, and understanding how the Unix philosophy creates powerful, composable systems.