Operating System Services - Learning Module

File System Manipulation

Organizing Persistent Data

Storage devices present data as vast arrays of blocks—billions of 512-byte or 4KB sectors with numeric addresses. Without higher-level organization, managing data would be nearly impossible. How would you find a document among trillions of bytes? How would you share files between users while protecting sensitive data? How would applications store configuration without conflicting with each other?

The file system solves these problems by imposing structure on raw storage. It provides the abstraction of files (named, typed data containers) organized into directories (hierarchical namespaces), with permissions controlling access. These file system manipulation services are among the most frequently used OS capabilities—virtually every application reads and writes files.

What You Will Learn

By the end of this page, you will understand how file systems organize data, the operations available for file and directory manipulation, how permissions and access control work, the role of metadata and file attributes, and how modern file systems handle advanced features like links, journaling, and atomic operations. This knowledge is essential for effective systems programming.

File Concepts and Attributes

A file is the fundamental unit of persistent storage—a named collection of related information stored on secondary storage. To the application, a file is a logical sequence of bytes; the file system handles the physical layout on disk.

File attributes (metadata):

Beyond content, files carry extensive metadata:

Name: Human-readable identifier (with extension conventions)
Type: Regular file, directory, symbolic link, device, socket, FIFO
Size: Current content length in bytes
Timestamps: Last access, last modification, last metadata change, and (on some file systems) creation time
Owner: User and group ownership
Permissions: Read, write, execute access rights
Location: Physical/logical block addresses (internal)
Flags/Attributes: Read-only, hidden, system, immutable, etc.

file_metadata_example.sh
Bash
# Examining file metadata on Linux
 
$ stat /etc/passwd
  File: /etc/passwd
  Size: 2834            Blocks: 8          IO Block: 4096   regular file
Device: 259,2           Inode: 1048594     Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2025-01-15 10:30:00.000000000 +0000
Modify: 2025-01-10 08:15:30.123456789 +0000
Change: 2025-01-10 08:15:30.123456789 +0000
 Birth: 2024-06-01 00:00:00.000000000 +0000
 
# Breakdown:
# - Size: 2834 bytes of content
# - Blocks: 8 (512-byte blocks allocated = 4KB)
# - IO Block: Optimal I/O transfer size
# - Inode: Unique identifier within filesystem
# - Links: Hard link count
# - Access mode: 0644 = rw-r--r--
# - Timestamps (three classic ones plus birth time):
#   - atime (Access): Last read
#   - mtime (Modify): Last content change
#   - ctime (Change): Last metadata change
#   - btime (Birth): Creation time (not all filesystems)
 
# JSON-format metadata for scripts
$ stat --format='{"name":"%n","size":%s,"uid":%u,"mode":"%a"}' /etc/passwd
{"name":"/etc/passwd","size":2834,"uid":0,"mode":"644"}
 
# Extended attributes (xattrs) - additional metadata
# (-m - dumps every namespace; plain -d shows only user.* attributes)
$ getfattr -d -m - /path/to/file
# file: /path/to/file
user.description="Project documentation"
security.selinux="system_u:object_r:user_home_t:s0"

File types:

Modern file systems distinguish several file types, each with different semantics:

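$ ls -ld /dev/null /dev/sda /tmp/myfifo /var/run/app.sock /bin /home/user/docs /home/user/file
# (entries below are grouped by type rather than shown in ls's default sorted order)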

crw-rw-rw-  1 root root  1,   3 Jan 15 10:00 /dev/null       # c = character device
brw-rw----  1 root disk  8,   0 Jan 15 10:00 /dev/sda        # b = block device
prw-r--r--  1 user user        0 Jan 15 10:00 /tmp/myfifo    # p = named pipe (FIFO)
srwxrwxrwx  1 user user        0 Jan 15 10:00 /var/run/app.sock  # s = socket
lrwxrwxrwx  1 root root       11 Jan 15 10:00 /bin -> /usr/bin   # l = symbolic link
drwxr-xr-x  2 user user     4096 Jan 15 10:00 /home/user/docs   # d = directory
-rw-r--r--  1 user user     1234 Jan 15 10:00 /home/user/file   # - = regular file

The first character indicates type:

- Regular file (data content)
d Directory (contains other files)
l Symbolic link (pointer to another path)
c Character device (byte-stream I/O)
b Block device (block-based I/O)
p Named pipe (FIFO, IPC)
s Socket (network-style IPC)

Extension vs. Magic Numbers

File extensions (.txt, .pdf, .exe) are naming conventions, not enforced types. The OS doesn't care about extensions—you can rename 'doc.pdf' to 'doc.exe' and the content is unchanged.

Magic numbers are byte sequences at file start that identify content types. For example, PDFs start with '%PDF-', PNGs with '\x89PNG'. The file command uses these to detect actual file types regardless of extension.
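
As a rough illustration of magic-number detection, the sketch below reads the first bytes of a file and compares them against the two signatures mentioned above. The function name detect_type and its return strings are hypothetical, and a real tool such as file checks far more signatures than these two.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: classify a file by its leading "magic" bytes.
 * Only the PNG and PDF signatures are checked in this sketch. */
const char *detect_type(const char *path) {
    unsigned char head[8] = {0};
    FILE *f = fopen(path, "rb");
    if (!f) return "unreadable";
    size_t n = fread(head, 1, sizeof(head), f);
    fclose(f);

    /* Full 8-byte PNG signature: \x89 P N G \r \n \x1A \n */
    static const unsigned char png[8] = {0x89, 'P', 'N', 'G', '\r', '\n', 0x1A, '\n'};
    if (n >= 8 && memcmp(head, png, 8) == 0) return "png";
    if (n >= 5 && memcmp(head, "%PDF-", 5) == 0) return "pdf";
    return "unknown";
}
```

Because only the content is inspected, a PDF renamed to doc.exe is still reported as "pdf".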

Directory Structure and Path Resolution

Directories organize files into hierarchical namespaces. A directory is itself a file—one that contains a list of (name, inode) pairs mapping names to file locations.

Hierarchical structure:

Modern file systems use a tree structure rooted at / (Unix) or drive letters (Windows):

/                               C:\
├── bin/                        ├── Windows/
├── home/                       ├── Program Files/
│   ├── alice/                  ├── Users/
│   │   ├── documents/          │   ├── Alice/
│   │   │   └── report.pdf      │   │   ├── Documents/
│   │   └── .bashrc             │   │   │   └── report.pdf
│   └── bob/                    │   │   └── Desktop/
├── etc/                        │   └── Bob/
├── var/                        └── Temp/
└── tmp/

Special directory entries:

. (dot): Refers to current directory
.. (dot-dot): Refers to parent directory
Both entries exist in every directory; in the root directory, .. refers to the root itself

Path resolution:

When you access a file by path, the OS must resolve the path to an actual file location. This involves traversing the directory tree:

Absolute path /home/alice/documents/report.pdf:

1. Start at root directory (/)
2. Look up "home" → get inode for /home directory
3. Read /home directory, look up "alice" → get inode
4. Read /home/alice, look up "documents" → get inode
5. Read /home/alice/documents, look up "report.pdf" → get inode
6. Inode contains file's disk block locations
7. Access file content from those blocks

Relative path ../bob/file.txt (from /home/alice):

1. Start at current directory (/home/alice)
2. Look up ".." → get inode for /home
3. Read /home, look up "bob" → get inode
4. Read /home/bob, look up "file.txt" → get inode
5. Access file

directory_operations.c
C
#include <dirent.h>
#include <limits.h>     /* PATH_MAX */
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>
 
/**
 * Demonstrates directory manipulation operations
 */
 
/* List directory contents */
void list_directory(const char *path) {
    DIR *dir = opendir(path);
    if (!dir) {
        perror("opendir");
        return;
    }
    
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        /* entry->d_type: DT_REG, DT_DIR, DT_LNK, etc.
         * Some file systems report DT_UNKNOWN; use lstat() in that case. */
        char type;
        switch (entry->d_type) {
            case DT_REG: type = 'f'; break;  /* Regular file */
            case DT_DIR: type = 'd'; break;  /* Directory */
            case DT_LNK: type = 'l'; break;  /* Symbolic link */
            default:     type = '?'; break;
        }
        printf("[%c] %s (inode: %lu)\n", type, entry->d_name,
               (unsigned long)entry->d_ino);
    }
    
    closedir(dir);
}
 
/* Create directory */
int create_directory(const char *path) {
    /* 0755 = rwxr-xr-x permissions */
    if (mkdir(path, 0755) != 0) {
        perror("mkdir");
        return -1;
    }
    return 0;
}
 
/* Remove empty directory */
int remove_directory(const char *path) {
    if (rmdir(path) != 0) {
        perror("rmdir");
        return -1;
    }
    return 0;
}
 
/* Change current working directory */
int change_directory(const char *path) {
    if (chdir(path) != 0) {
        perror("chdir");
        return -1;
    }
    return 0;
}
 
/* Get current working directory */
void print_cwd(void) {
    char cwd[PATH_MAX];
    if (getcwd(cwd, sizeof(cwd)) != NULL) {
        printf("Current directory: %s\n", cwd);
    } else {
        perror("getcwd");
    }
}
 
int main() {
    /* Working directory operations */
    print_cwd();
    change_directory("/tmp");
    print_cwd();
    
    /* Create and list directory */
    create_directory("/tmp/test_dir");
    list_directory("/tmp/test_dir");
    
    /* Cleanup */
    remove_directory("/tmp/test_dir");
    
    return 0;
}

Common Directory Operations
Operation	Unix API	Windows API	Purpose
Create directory	mkdir()	CreateDirectory()	Create new directory
Remove directory	rmdir()	RemoveDirectory()	Remove empty directory
Open directory	opendir()	FindFirstFile()	Begin reading directory
Read entry	readdir()	FindNextFile()	Get next directory entry
Close directory	closedir()	FindClose()	Finish reading directory
Change directory	chdir()	SetCurrentDirectory()	Change working directory
Get current dir	getcwd()	GetCurrentDirectory()	Get working directory

Path Resolution Performance

Each path component requires a directory read—potentially a disk access. Deep paths incur more overhead. The OS caches directory contents (dentry cache in Linux) to accelerate repeated lookups. This is why accessing '/a/b/c/d/e/f/file' isn't dramatically slower than '/file' in practice.

File Operations

File operations can be categorized by whether they affect content, metadata, or namespace.

Content operations:

These operations read from or write to file content:

open(): Access file, get file descriptor
read(): Read bytes from file
write(): Write bytes to file
lseek(): Reposition read/write pointer
truncate(): Resize file (shrink or extend)
close(): Release file descriptor (does not guarantee data has reached disk; use fsync() for that)

file_operations_example.c
C
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
 
/**
 * Comprehensive file operations demonstration
 */
 
int main() {
    const char *filename = "/tmp/demo_file.txt";
    char buffer[256];
    
    /* ===== CREATE AND WRITE ===== */
    
    /* Open flags:
     * O_CREAT  - Create if doesn't exist
     * O_WRONLY - Write-only access
     * O_TRUNC  - Truncate if exists
     * O_APPEND - Append mode (writes always at end)
     * O_EXCL   - Fail if file exists (with O_CREAT)
     * O_SYNC   - Synchronous writes (durability)
     */
    int fd = open(filename, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open for write");
        return 1;
    }
    
    /* Write data (write() may fail or transfer fewer bytes than requested) */
    const char *text = "Hello, File System!\n";
    ssize_t written = write(fd, text, strlen(text));
    if (written < 0) {
        perror("write");
        close(fd);
        return 1;
    }
    printf("Wrote %zd bytes\n", written);
    
    /* Seek to position (SEEK_SET=absolute, SEEK_CUR=relative, SEEK_END=from end) */
    off_t pos = lseek(fd, 0, SEEK_END);
    printf("File position after write: %lld\n", (long long)pos);
    
    /* Write more data */
    const char *more = "Second line.\n";
    if (write(fd, more, strlen(more)) < 0) {
        perror("write");
    }
    
    close(fd);
    
    /* ===== READ ===== */
    
    fd = open(filename, O_RDONLY);
    if (fd < 0) {
        perror("open for read");
        return 1;
    }
    
    /* Read entire file */
    ssize_t bytes_read;
    while ((bytes_read = read(fd, buffer, sizeof(buffer) - 1)) > 0) {
        buffer[bytes_read] = '\0';
        printf("Read: %s", buffer);
    }
    
    /* Seek to beginning and read again */
    lseek(fd, 0, SEEK_SET);
    bytes_read = read(fd, buffer, 5);  /* Read first 5 bytes */
    if (bytes_read >= 0) {             /* -1 would index before the buffer */
        buffer[bytes_read] = '\0';
        printf("First 5 bytes: '%s'\n", buffer);
    }
    
    close(fd);
    
    /* ===== METADATA OPERATIONS ===== */
    
    /* Get file information */
    struct stat st;
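    if (stat(filename, &st) == 0) {
        /* Cast explicitly: the widths of off_t, ino_t and nlink_t vary by platform */
        printf("Size: %lld bytes\n", (long long)st.st_size);
        printf("Inode: %llu\n", (unsigned long long)st.st_ino);
        printf("Mode: %o\n", (unsigned)(st.st_mode & 0777));
        printf("Links: %llu\n", (unsigned long long)st.st_nlink);
    }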
    
    /* Change permissions */
    chmod(filename, 0600);  /* rw------- */
    
    /* Change ownership (requires root) */
    // chown(filename, new_uid, new_gid);
    
    /* Truncate to specific size */
    truncate(filename, 10);  /* Keep only first 10 bytes */
    
    /* ===== NAMESPACE OPERATIONS ===== */
    
    /* Rename file */
    rename(filename, "/tmp/renamed_file.txt");
    
    /* Create hard link */
    link("/tmp/renamed_file.txt", "/tmp/hardlink.txt");
    
    /* Create symbolic link */
    symlink("/tmp/renamed_file.txt", "/tmp/symlink.txt");
    
    /* Delete file (unlink removes directory entry) */
    unlink("/tmp/symlink.txt");
    unlink("/tmp/hardlink.txt");
    unlink("/tmp/renamed_file.txt");
    
    return 0;
}

The file descriptor:

When you open a file, the OS returns a file descriptor—a small integer that references an open file. Each process maintains a file descriptor table:

Process File Descriptor Table:
┌─────┬────────────────────────────────────────────────────┐
│ FD  │ Points to (in kernel)                              │
├─────┼────────────────────────────────────────────────────┤
│  0  │ stdin  → terminal input                            │
│  1  │ stdout → terminal output                           │
│  2  │ stderr → terminal output                           │
│  3  │ Open file → /home/user/data.txt (pos: 1024)        │
│  4  │ Socket → TCP connection to 10.0.0.1:80             │
│  5  │ Pipe → write end of pipe to child process          │
└─────┴────────────────────────────────────────────────────┘

Kernel Open File Table Entry (one per open() call; shared by descriptors copied with dup() or fork()):
- Reference to underlying inode/vnode
- Current file position (offset)
- Open mode (read/write/append)
- File status flags

File descriptors 0, 1, 2 are standard input, output, and error—conventionally already open when a process starts. New opens return the lowest available number.
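
To make the table concrete, here is a small sketch showing that the file position lives in the kernel's open file entry rather than in the descriptor itself. The function name offset_demo and its out-parameters are hypothetical; the calls used (open, dup, write, lseek) are standard POSIX.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: write 5 bytes through one descriptor, then report the
 * position seen through a dup()'d copy and through a separate open().
 * Returns 0 on success, -1 on error. The scratch file is removed afterwards. */
int offset_demo(const char *path, long *via_dup, long *via_reopen) {
    int fd = open(path, O_CREAT | O_RDWR | O_TRUNC, 0600);
    if (fd < 0) return -1;

    int copy = dup(fd);                 /* new number, same open file entry */
    int other = open(path, O_RDONLY);   /* new open file entry, own offset  */
    int rc = -1;

    if (copy >= 0 && other >= 0 && write(fd, "hello", 5) == 5) {
        *via_dup = (long)lseek(copy, 0, SEEK_CUR);     /* shared offset      */
        *via_reopen = (long)lseek(other, 0, SEEK_CUR); /* independent offset */
        rc = 0;
    }

    if (other >= 0) close(other);
    if (copy >= 0) close(copy);
    close(fd);
    unlink(path);
    return rc;
}
```

After a successful call, the dup()'d descriptor reports position 5 while the separately opened one still reports 0.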

Critical File Operation Gotchas

•Always check return values — open() returns -1 on failure, read()/write() return bytes transferred which may be less than requested.
•Close files when done — File descriptors are a limited resource (usually 1024 default per process). Leaked handles cause resource exhaustion.
•Beware of TOCTTOU — Time-of-check to time-of-use race conditions between checking a file's state and acting on it. Use atomic operations where possible.
•Write ordering matters — Without fsync(), write order to disk is not guaranteed. Databases and logs must sync explicitly.
•Rename is atomic — rename() within a filesystem is atomic. Use it for safe file updates: write to temp file, then rename over target.
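
Several of these points can be combined in one pattern. The sketch below is one possible way to update a file safely, under stated assumptions: the helper names write_all and atomic_replace are hypothetical, and the temporary name "<path>.tmp" is assumed to be free and on the same file system as the target.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Write all len bytes, retrying after short writes and EINTR. */
static int write_all(int fd, const char *buf, size_t len) {
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Hypothetical helper: replace the contents of `path` atomically.
 * Readers see either the old file or the new one, never a partial write. */
int atomic_replace(const char *path, const char *data, size_t len) {
    char tmp[4096];
    int n = snprintf(tmp, sizeof(tmp), "%s.tmp", path);
    if (n < 0 || (size_t)n >= sizeof(tmp)) return -1;

    int fd = open(tmp, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return -1;

    /* Push the data to stable storage before the new name becomes visible. */
    if (write_all(fd, data, len) != 0 || fsync(fd) != 0) {
        close(fd);
        unlink(tmp);
        return -1;
    }
    if (close(fd) != 0) { unlink(tmp); return -1; }

    /* rename() swaps the directory entry in one step. */
    if (rename(tmp, path) != 0) { unlink(tmp); return -1; }
    return 0;
}
```

For full durability across power loss, callers would typically also fsync() the containing directory after the rename; that step is omitted here to keep the sketch short.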

Memory-Mapped Files

mmap() maps a file directly into process memory. Instead of read()/write(), you access file content as memory addresses. Benefits: zero-copy I/O, demand paging (only accessed pages loaded), shared memory between processes. Used by databases, executables (code pages), and high-performance applications.
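
A minimal sketch of the idea, assuming a regular file that fits in the address space: the hypothetical helper count_lines_mmap maps a file read-only and scans it as an ordinary byte array, with no read() calls.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count newline bytes via a read-only mapping.
 * Returns -1 on error. */
long count_lines_mmap(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;

    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return -1; }
    if (st.st_size == 0) { close(fd); return 0; }   /* mmap rejects length 0 */

    size_t len = (size_t)st.st_size;
    const char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                       /* the mapping stays valid after close */
    if (p == MAP_FAILED) return -1;

    long lines = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] == '\n') lines++;   /* pages are faulted in on first touch */
    }

    munmap((void *)p, len);
    return lines;
}
```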

Permissions and Access Control

Operating systems enforce access control to prevent unauthorized file access. The traditional Unix permission model and modern access control lists (ACLs) provide overlapping but distinct capabilities.

Unix permission model:

Every file has three permission sets: owner (user), group, and others. Each set has three bits: read (r), write (w), execute (x).

  -rwxr-xr--  1 alice developers 4096 Jan 15 10:00 script.sh
  │└┬┘└┬┘└┬┘
  │ │  │  └── Others: read only (r--) = 4
  │ │  └───── Group: read + execute (r-x) = 5
  │ └──────── Owner: full access (rwx) = 7
  └────────── File type (- = regular file)
  
  Octal representation: 754
  
  Permission values:
    r = 4 (read)
    w = 2 (write)
    x = 1 (execute)
    
  For directories:
    r = list contents
    w = create/delete files
    x = traverse (cd into, access files within)

permission_examples.sh
Bash
# Permission manipulation examples
 
# View permissions
$ ls -la file.txt
-rw-r--r-- 1 alice staff 1234 Jan 15 10:00 file.txt
 
# Change permissions (chmod)
$ chmod 755 script.sh      # rwxr-xr-x (octal)
$ chmod u+x script.sh      # Add execute for user (symbolic)
$ chmod g-w file.txt       # Remove write from group
$ chmod o=r file.txt       # Set others to read-only
$ chmod a+r file.txt       # Add read for all (a = all)
$ chmod -R 755 dir/        # Recursive
 
# Change ownership (chown) - changing the owner requires root;
# a file's owner may change its group to one they belong to
$ sudo chown bob:staff file.txt      # Change user and group
$ sudo chown :developers file.txt    # Change group only
$ sudo chown -R alice:alice dir/     # Recursive
 
# Special permissions
$ chmod 4755 program       # setuid: runs as file owner
$ chmod 2755 dir           # setgid: new files inherit group
$ chmod 1777 /tmp          # sticky bit: only a file's owner (or root) can delete it
 
# View numeric permissions
$ stat -c "%a %n" file.txt
644 file.txt
 
# Default permissions (umask)
$ umask              # Show current mask
022
$ umask 027          # Set mask (files: 640, dirs: 750)
 
# File creation with umask:
# Default mode & ~umask = actual mode
# 666 & ~022 = 644 (for files)
# 777 & ~022 = 755 (for directories)
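
The same masking can be observed from C. This is a small sketch with a hypothetical helper, mode_after_umask, that creates a scratch file under a chosen mask and reports the permission bits it actually received. It assumes the target directory has no default ACL, which would take precedence over the umask.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical demo: create `path` requesting `requested` while the process
 * umask is `mask`; return the resulting permission bits, or -1 on error.
 * The scratch file is removed before returning. */
int mode_after_umask(const char *path, mode_t requested, mode_t mask) {
    mode_t old = umask(mask);     /* umask() returns the previous mask */
    unlink(path);                 /* make sure O_CREAT really creates  */
    int fd = open(path, O_CREAT | O_EXCL | O_WRONLY, requested);
    umask(old);                   /* restore the caller's mask         */
    if (fd < 0) return -1;

    struct stat st;
    int result = (fstat(fd, &st) == 0) ? (int)(st.st_mode & 0777) : -1;
    close(fd);
    unlink(path);
    return result;
}
```

Requesting 0666 under mask 022 gives 0644, and under mask 027 gives 0640, in line with the arithmetic shown above.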

Access Control Lists (ACLs):

Traditional Unix permissions are limited—you can only grant access to owner, one group, and everyone else. ACLs extend this with fine-grained, per-user/per-group permissions:

# View ACL
$ getfacl file.txt
# file: file.txt
# owner: alice
# group: staff
user::rw-
user:bob:r--         # Specific user permission
group::r--
group:developers:rw- # Specific group permission
mask::rw-
other::---

# Set ACL
$ setfacl -m u:bob:r file.txt          # Add user ACL
$ setfacl -m g:developers:rw file.txt  # Add group ACL
$ setfacl -x u:bob file.txt            # Remove user ACL
$ setfacl -b file.txt                  # Remove all ACLs

# Default ACLs (inherited by new files in directory)
$ setfacl -d -m g:developers:rw dir/

Windows permission model:

Windows uses a more complex ACL system by default, with granular permissions like "Read Attributes," "Write Extended Attributes," "Delete Child," etc., plus inheritance rules for directories.

Special Permission Bits
Bit	On Files	On Directories	Representation
setuid (4000)	Execute as file owner	No effect	-rwsr-xr-x
setgid (2000)	Execute as file group	New files inherit group	-rwxr-sr-x
sticky (1000)	No effect (historically swap)	Only owner can delete files	drwxrwxrwt

Security Implications of setuid

setuid programs run with the file owner's privileges regardless of who executes them. When owned by root, they're a significant security risk—any vulnerability allows privilege escalation. Examples: /usr/bin/passwd (needs to modify /etc/shadow), sudo, su. Modern systems minimize setuid binaries and prefer capabilities for fine-grained privilege.

Links and Indirection

Unix file systems support two types of links that provide indirection—allowing multiple names to reference the same data.

Hard links:

A hard link is an additional directory entry pointing to the same inode. The file's data has multiple names equally valid—there's no "original" and "link."

$ echo "content" > file1.txt
$ ln file1.txt file2.txt      # Create hard link
$ ls -li
12345 -rw-r--r-- 2 user user 8 Jan 15 10:00 file1.txt
12345 -rw-r--r-- 2 user user 8 Jan 15 10:00 file2.txt
       ^^^^^^^^^^^^^^^
       Same inode (12345), link count = 2

$ rm file1.txt                 # Remove one name
$ cat file2.txt                # Data still accessible!
content
$ ls -li file2.txt
12345 -rw-r--r-- 1 user user 8 Jan 15 10:00 file2.txt
                 ^
                 Link count now 1

Hard link characteristics:

Same inode number (same underlying file)
Changes through one name visible through all
File deleted only when last hard link removed
Cannot cross filesystem boundaries
Cannot link to directories (prevents cycles)

Symbolic links (symlinks):

A symbolic link is a special file containing a path to another file. It's an indirect reference resolved at access time.

$ ln -s /path/to/target linkname    # Create symlink
$ ls -l linkname
lrwxrwxrwx 1 user user 15 Jan 15 10:00 linkname -> /path/to/target

# Symlink characteristics:
# - Different inode from target
# - Contains path string, not data
# - Can cross filesystems
# - Can link to directories
# - Can become "dangling" if target deleted

$ rm /path/to/target
$ cat linkname
cat: linkname: No such file or directory    # Dangling symlink!

$ ls -l linkname                             # Link still exists
lrwxrwxrwx 1 user user 15 Jan 15 10:00 linkname -> /path/to/target

Comparison:

Hard Link                          Symbolic Link
┌────────────────────┐             ┌────────────────────┐
│  Directory Entry   │             │  Directory Entry   │
│  name: "file1.txt" │             │  name: "linkname"  │
│  inode: 12345      │             │  inode: 67890      │
└────────┬───────────┘             └────────┬───────────┘
         │                                   │
         │                                   ▼
         │                         ┌────────────────────┐
         │                         │ Inode 67890        │
         │                         │ type: symlink      │
         │                         │ data: "/path/to/   │
         │                         │        target"     │
         ▼                         └────────┬───────────┘
┌────────────────────┐                      │
│  Inode 12345       │◄─────────────────────┘ (resolved at access)
│  type: regular     │
│  size: 1000        │
│  blocks: [...]     │
└────────────────────┘

When to Use Each Link Type

•Hard links — Space-efficient backups (unchanged files share blocks), multiple access paths on same filesystem, ensuring file isn't accidentally deleted while still in use.
•Symbolic links — Cross-filesystem references, linking to directories, version switching (/usr/bin/python → python3.11), configuration flexibility, shortcuts that should break if target moves.

Resolving Symlinks

Many system calls follow symlinks automatically (open, stat). Some have 'l' variants that don't (lstat, lchown). readlink() reads the symlink content itself. realpath() resolves all symlinks to get the canonical absolute path.

File System Organization

Different file system types implement these abstractions in various ways, optimized for different use cases. Understanding file system organization helps explain performance characteristics and limitations.

On-disk structure (simplified ext2/ext3-style layout; ext4 keeps the block-group design but maps file data with extents by default):

┌────────────────────────────────────────────────────────────────────────┐
│                          Disk Layout                                   │
├─────────┬─────────┬─────────┬──────────────────────────────────────────┤
│ Boot    │ Super-  │ Block   │      Block Groups (repeated)              │
│ Block   │ block   │ Group   │                                           │
│         │         │ Desc.   │ ┌────────────────────────────────────────┐│
│         │         │         │ │ Group 0 │ Group 1 │ Group 2 │ ...      ││
│ 1 KB    │ 1 KB    │ n KB    │ └────────────────────────────────────────┘│
└─────────┴─────────┴─────────┴──────────────────────────────────────────┘

Block Group Structure:
┌──────────────────────────────────────────────────────────────┐
│ Data Block  │ Inode    │ Inode   │     Data Blocks           │
│ Bitmap      │ Bitmap   │ Table   │     (file contents)       │
│             │          │         │                            │
│ 1 bit per   │ 1 bit per│ inodes  │     Actual file data      │
│ data block  │ inode    │         │                            │
└──────────────────────────────────────────────────────────────┘

Inode Structure:
┌────────────────────────────────────────────────────────────┐
│ Type & Permissions        │ Owner UID/GID                  │
├───────────────────────────┼────────────────────────────────┤
│ Size                      │ Timestamps (atime/mtime/ctime) │
├───────────────────────────┼────────────────────────────────┤
│ Link count                │ Flags                          │
├───────────────────────────┴────────────────────────────────┤
│ Direct block pointers (12)           → data blocks         │
│ Indirect pointer (1)                 → block of pointers   │
│ Double indirect pointer (1)          → block of ind. ptrs  │
│ Triple indirect pointer (1)          → block of dbl. ptrs  │
└────────────────────────────────────────────────────────────┘
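
The pointer scheme in the inode diagram determines how many blocks one file can address. The following worked calculation is a sketch under assumptions that the diagram does not state: 4 KB blocks and 4-byte block pointers. The function name max_file_blocks is hypothetical.

```c
#include <stdint.h>

/* Number of data blocks addressable through 12 direct pointers plus one
 * single, one double and one triple indirect pointer. */
uint64_t max_file_blocks(uint64_t block_size, uint64_t ptr_size) {
    uint64_t per_block = block_size / ptr_size;  /* pointers per indirect block */
    return 12                                    /* direct          */
         + per_block                             /* single indirect */
         + per_block * per_block                 /* double indirect */
         + per_block * per_block * per_block;    /* triple indirect */
}
```

With 4096-byte blocks and 4-byte pointers, each indirect block holds 1024 pointers, so the result is 12 + 1024 + 1024² + 1024³ = 1,074,791,436 blocks, or roughly 4 TiB of addressable data. Real file systems may impose lower limits for other reasons.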

Common File Systems Comparison
File System	OS	Max File Size	Features
ext4	Linux	16 TB	Journaling, extents, large FS support
XFS	Linux	8 EB	High performance, large files, scalable
Btrfs	Linux	16 EB	Copy-on-write, snapshots, checksums
NTFS	Windows	16 EB	Journaling, ACLs, compression, encryption
APFS	macOS	8 EB	Copy-on-write, snapshots, encryption
ZFS	Illumos/BSD	16 EB	Checksums, RAID, snapshots, compression
FAT32	Cross-platform	4 GB	Simple, wide compatibility, no journaling
exFAT	Cross-platform	16 EB	Flash-optimized, large files, no journaling

Journaling and crash recovery:

A key concern for file systems is crash recovery—if power fails during a write, the file system should remain consistent. Journaling file systems maintain a log of changes:

1. Write intended changes to journal and commit the journal entry
2. Perform actual changes to file system
3. Mark journal entry as complete

If crash occurs:

During step 1: The journal entry was never committed, so it is ignored; the file system still holds its old, consistent state
Between steps 1-2 or during step 2: Replay the committed journal entry to complete the operation
After step 3: Operation completed, journal entry can be discarded
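
The write-ahead idea can be modelled with a toy in-memory "disk". This is only an illustrative sketch, not how any real file system lays out its journal: the struct and function names are hypothetical, the journal holds a single record, and a "crash" is simulated simply by not calling the later steps.

```c
#include <stdbool.h>
#include <stddef.h>

#define DISK_BLOCKS 8

/* One journal record describing a single pending block update. */
struct journal_record {
    bool committed;   /* set only after the record is fully written */
    size_t block;
    int new_value;
};

struct toy_fs {
    int disk[DISK_BLOCKS];
    struct journal_record journal;
};

/* Step 1: record the intended change, then mark the record committed. */
void journal_write(struct toy_fs *fs, size_t block, int value) {
    fs->journal.block = block;
    fs->journal.new_value = value;
    fs->journal.committed = true;
}

/* Step 2: apply the change in place. */
void apply_change(struct toy_fs *fs) {
    fs->disk[fs->journal.block] = fs->journal.new_value;
}

/* Step 3: retire the record. */
void journal_clear(struct toy_fs *fs) {
    fs->journal.committed = false;
}

/* Recovery at mount time: replay a committed record, ignore anything else. */
void recover(struct toy_fs *fs) {
    if (fs->journal.committed) {
        apply_change(fs);
        journal_clear(fs);
    }
}
```

If the simulated crash happens after journal_write() but before apply_change(), calling recover() finishes the update. A record that was never marked committed is simply ignored, leaving the old block contents in place.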

Journal modes:

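Journal (data journaling): Log both metadata and data (safest, slowest)
Ordered: Log metadata only; data blocks are written before the metadata that references them is committed (balanced)
Writeback: Log metadata only; no ordering between data and metadata writes, so a file may expose stale data after a crash (fastest, least safe)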

Copy-on-Write File Systems

Modern file systems like Btrfs, ZFS, and APFS use copy-on-write: modified blocks are written to new locations rather than overwriting live data. Benefits: near-instant snapshots (just preserve the old block pointers), atomic updates, and detectable corruption where blocks are checksummed (Btrfs and ZFS checksum data and metadata; APFS checksums metadata). Trade-offs: fragmentation over time and write amplification.

Summary: File System Manipulation

We've explored how operating systems organize and manage persistent data through file system services. Let's consolidate the key insights:

Key Takeaways

•Files abstract persistent storage — Named containers with metadata (size, timestamps, permissions) organize raw disk blocks into manageable units.
•Directories provide hierarchical namespace — Tree structure enables organization; path resolution traverses this tree to locate files.
•File operations span content, metadata, and namespace — open/read/write/close for content; chmod/chown for metadata; rename/link/unlink for namespace.
•Permission model controls access — Unix owner/group/other permissions plus ACLs for fine-grained control; special bits (setuid, sticky) for advanced needs.
•Hard and symbolic links provide indirection — Hard links share inodes (same file, multiple names); symlinks contain paths (flexible, can dangle).
•File system types vary in capabilities — Journaling for crash recovery, copy-on-write for snapshots, various limits and features per implementation.
•Understanding internals aids debugging — Knowing about inodes, block allocation, and journaling helps diagnose performance and recovery issues.

What's next:

With file systems covered, we'll explore the final category of OS services in this module: Communication Services. This includes inter-process communication (IPC), networking, and the mechanisms that allow processes and systems to exchange information.

You now understand how operating systems provide file system manipulation services. From file concepts through directory structures, operations, permissions, links, and file system organization—these services form the foundation of persistent data management in computing.