We've established that an inode contains all metadata about a file. But an inode must do more than describe a file—it must also locate the file's actual data on disk. This is the inode's most critical function: providing the mapping from logical file offsets to physical disk blocks.
The Unix designers faced a fundamental challenge: how to organize block pointers for maximum efficiency across files ranging from a few bytes to many gigabytes? Their solution was a multi-level pointer scheme, and at its foundation are direct blocks—simple, fast pointers that handle the common case brilliantly.
Here's a key insight: Most files are small. Studies consistently show that the median file size on typical systems is between 2KB and 8KB. By optimizing for small files, the direct block design ensures that the vast majority of file accesses require minimal I/O operations.
By the end of this page, you will understand: how direct block pointers work; why classic Unix filesystems dedicate 12 of the inode's 15 block pointers to direct blocks; how to calculate the maximum file size addressable by direct blocks alone; the O(1) access pattern that makes small file operations lightning-fast; and how modern filesystems optimize small file storage even further.
Every traditional Unix inode contains an array of block pointers. The classic design uses 15 pointers, organized as follows:
- Pointers 0-11: direct block pointers
- Pointer 12: single indirect pointer
- Pointer 13: double indirect pointer
- Pointer 14: triple indirect pointer
This page focuses on the first 12 entries—the direct block pointers. These are the simplest and fastest form of file data addressing.
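For concreteness, here is a sketch of how this pointer array is typically declared, modeled loosely on ext2's i_block field (the macro and struct names here are illustrative, not the kernel's exact identifiers):

#include <stdint.h>

/* Classic 15-entry block pointer array (ext2/UFS-style layout) */
#define NDIR_BLOCKS 12                /* indices 0-11: direct pointers */
#define IND_BLOCK   NDIR_BLOCKS       /* index 12: single indirect */
#define DIND_BLOCK  (IND_BLOCK + 1)   /* index 13: double indirect */
#define TIND_BLOCK  (DIND_BLOCK + 1)  /* index 14: triple indirect */
#define N_BLOCKS    (TIND_BLOCK + 1)  /* 15 entries total */

struct inode_disk {
    /* ... mode, uid, size, timestamps, link count ... */
    uint32_t i_block[N_BLOCKS];  /* block numbers; 0 means unallocated */
};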
Each direct block pointer is a 32-bit or 64-bit integer containing a block number. When you read bytes 0 through (block_size - 1) of a file, the filesystem:
1. Looks up block[0] from the pointer array
2. Reads that physical block from disk
3. Copies the requested bytes into your buffer
No traversal, no indirection, no additional I/O—just one lookup and one read. This is O(1) access to the file's first blocks.
/*
 * Simplified file read using direct block pointers
 * This illustrates the conceptual access pattern
 */
ssize_t read_file(struct inode *inode, off_t offset, void *buffer, size_t count)
{
    size_t block_size = inode->sb->block_size;  // e.g., 4096

    // Calculate which logical block we need
    loff_t logical_block = offset / block_size;
    size_t block_offset = offset % block_size;

    // Direct blocks handle logical blocks 0-11
    if (logical_block < DIRECT_BLOCKS) {  // DIRECT_BLOCKS = 12
        // Get actual physical block number from inode
        blkcnt_t physical_block = inode->block[logical_block];

        if (physical_block == 0) {
            // Block not allocated (sparse file)
            // Return zeros without disk I/O
            memset(buffer, 0, count);
            return count;
        }

        // Calculate disk byte address
        off_t disk_offset = physical_block * block_size + block_offset;

        // Single disk read - O(1) disk operations
        return disk_read(disk_offset, buffer, count);
    }

    // For larger offsets, we need indirect blocks (covered later)
    return read_via_indirect(inode, logical_block, block_offset, buffer, count);
}

The maximum file size addressable by direct blocks alone is straightforward to calculate:
Direct Block Capacity = Number of Direct Pointers × Block Size
Common configurations:
| Block Size | Direct Pointers | Direct Block Capacity | Practical Impact |
|---|---|---|---|
| 1 KB | 12 | 12 KB | Covers most scripts, configs, small source files |
| 2 KB | 12 | 24 KB | Covers typical text documents, small images |
| 4 KB (common) | 12 | 48 KB | Covers most source files, configs, small documents |
| 8 KB | 12 | 96 KB | Many medium-sized files fit entirely in direct blocks |
| 64 KB (XFS max) | 12 | 768 KB | Even moderately large files use only direct blocks |
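To make the arithmetic concrete, here is a minimal sketch that reproduces the capacity column of the table above (the block sizes are the table's examples):

#include <stdio.h>

#define DIRECT_BLOCKS 12

int main(void) {
    // Block sizes from the table above, in bytes
    long block_sizes[] = { 1024, 2048, 4096, 8192, 65536 };
    int n = sizeof(block_sizes) / sizeof(block_sizes[0]);

    for (int i = 0; i < n; i++) {
        long capacity = DIRECT_BLOCKS * block_sizes[i];
        printf("block size %6ld B -> direct capacity %4ld KB\n",
               block_sizes[i], capacity / 1024);
    }
    return 0;
}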
Why 12 Direct Pointers?
The choice of 12 direct pointers is not arbitrary—it's a carefully considered design decision:
- Storage fit: 15 total pointers at 4 bytes each = 60 bytes, which plus ~68 bytes of other metadata = 128 bytes (the classic inode size)
- Optimal file coverage: studies from the 1980s showed that 80-90% of files were under 48KB, so 12 direct pointers cover the vast majority
- Cache efficiency: 12 pointers × 4 bytes = 48 bytes, fitting within a single 64-byte CPU cache line
- Simple indexing: the array index equals the logical block number (blocks 0-11 use pointers 0-11)
The direct block design embodies the principle of optimizing for the common case. By making small file access as fast as possible (just inode + data read), the design ensures that most file operations complete with minimal overhead. Large files require more indirection, but this cost is amortized over more data.
Let's trace exactly what happens when you access different parts of a small file. Assume a 4KB block size and a 32KB file (uses 8 direct blocks).
Reading the entire 32KB file sequentially:
Operation: read(fd, buffer, 32768) // Read all 32KB
Step 1: Kernel consults file descriptor → gets inode reference
(Inode likely cached in memory - no disk I/O)
Step 2: Calculate blocks needed:
- Bytes 0-4095: block[0] = physical block 1000
- Bytes 4096-8191: block[1] = physical block 1001
- Bytes 8192-12287: block[2] = physical block 1002
- Bytes 12288-16383: block[3] = physical block 1003
- Bytes 16384-20479: block[4] = physical block 1004
- Bytes 20480-24575: block[5] = physical block 1005
- Bytes 24576-28671: block[6] = physical block 1006
- Bytes 28672-32767: block[7] = physical block 1007
Step 3: Issue disk reads (may be batched via readahead):
- Read blocks 1000-1007 (possibly one composite I/O)
Disk operations: 1 (if contiguous) to 8 (if fragmented)
All blocks accessed via direct pointers: YES
Indirection used: NONE
Key insight: If blocks are contiguous, the entire file can be read in a single disk operation. The direct pointers simply provide the starting address.
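A short sketch of the Step 2 arithmetic; the physical block numbers 1000-1007 are the assumed allocation from the trace above:

#include <stdio.h>

#define BLOCK_SIZE 4096

int main(void) {
    // Hypothetical direct-pointer contents matching the trace above
    unsigned long block[12] = { 1000, 1001, 1002, 1003,
                                1004, 1005, 1006, 1007 };

    for (long off = 0; off < 32768; off += BLOCK_SIZE) {
        long logical = off / BLOCK_SIZE;  // logical block index, 0..7
        printf("Bytes %5ld-%5ld: block[%ld] = physical block %lu\n",
               off, off + BLOCK_SIZE - 1, logical, block[logical]);
    }
    return 0;
}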
Direct block pointers elegantly support sparse files—files with holes that contain logical zeros but consume no disk space. A zero value in a block pointer indicates an unallocated block.
How sparse files work with direct blocks:
Consider a file whose logical size is 48KB but that only has data at specific offsets:
# Create a sparse file with data at specific offsets
$ dd if=/dev/urandom of=sparse.bin bs=4096 seek=2 count=1 conv=notrunc 2>/dev/null
$ dd if=/dev/urandom of=sparse.bin bs=4096 seek=4 count=1 conv=notrunc 2>/dev/null
# Set the logical size to 48KB
$ truncate -s 49152 sparse.bin
# Check sizes
$ ls -l sparse.bin
-rw-r--r-- 1 user user 49152 Jan 15 10:00 sparse.bin # Logical: 48KB
$ du -h sparse.bin
8.0K sparse.bin # Physical: 8KB
$ stat sparse.bin
Size: 49152 Blocks: 16 IO Block: 4096 regular file
^^ Only 16 × 512 = 8KB allocated
Reading from holes:
// When reading from a hole (block pointer == 0)
if (inode->block[logical_block] == 0) {
// No disk I/O needed! Just return zeros.
memset(buffer, 0, bytes_to_read);
return bytes_to_read;
}
This is a powerful optimization for virtual machine disk images, pre-sized database files, core dumps, and any file whose logical size far exceeds its actual data.
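Applications can locate holes without reading any zeros at all by using lseek() with SEEK_DATA and SEEK_HOLE (available on Linux since kernel 3.1). A minimal sketch that prints a file's data regions:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    off_t off = 0, data, hole;
    // Alternate between the start of each data region and the hole after it
    while ((data = lseek(fd, off, SEEK_DATA)) >= 0) {
        hole = lseek(fd, data, SEEK_HOLE);  // end of this data region
        printf("data: bytes %lld..%lld\n",
               (long long)data, (long long)hole - 1);
        off = hole;
    }
    close(fd);  // lseek fails with ENXIO once no data remains
    return 0;
}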
Copying sparse files with naive tools (like cp without --sparse) reads all zeros and writes them, converting holes to allocated blocks. Use cp --sparse=always or rsync with sparse support. Also, when filling holes with writes, disk usage grows. A 1TB sparse file that fits in 1GB can suddenly need 1TB if fully written.
Direct block access represents the best-case scenario for file I/O in Unix filesystems. Let's analyze the performance characteristics in detail:
| Operation | Time Complexity | Disk I/O | Notes |
|---|---|---|---|
| Read first byte | O(1) | 1-2 reads (inode + data) | Inode usually cached after open() |
| Read byte at offset N < 48KB | O(1) | 1 data read | Direct calculation: N / block_size |
| Sequential read of 48KB | O(1) | 1-12 reads | May batch if blocks contiguous |
| Random read within 48KB | O(1) | 1 read per block | No pointer traversal needed |
| Append within direct range | O(1) | 1 alloc + 1 write + 1 inode update | Simple block allocation |
| Overwrite within file | O(1) | 1 write | No reallocation needed |
| Truncate to < 48KB | O(n) | n block frees + inode update | Must free each block |
Optimal Cases for Direct Blocks:
✓ Small configuration files
✓ Source code files (most < 48KB)
✓ Log entries/records
✓ Temporary files
✓ Database index pages
✓ Cached web assets
These files—which constitute the majority on most systems—achieve maximum performance with minimal overhead.
Comparison to Other Approaches:
| Method | First Block | Random Block |
|---|---|---|
| Direct (Unix) | O(1) | O(1) |
| Linked List (FAT) | O(1) | O(n) |
| B-tree Index | O(log n) | O(log n) |
| Extent-only | O(log n) | O(log n) |
Direct blocks trade flexibility for speed in the common case.
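The contrast with FAT's linked-list design is easy to see in code. A sketch, where fat_next is a hypothetical stand-in for the file allocation table:

// Direct pointers (Unix): random access is a single array lookup
unsigned long direct_lookup(const unsigned long block[12], long logical) {
    return block[logical];  // O(1), no traversal
}

// Linked list (FAT): must walk the chain from the file's first cluster
unsigned long fat_lookup(const unsigned long fat_next[],
                         unsigned long first_cluster, long logical) {
    unsigned long cluster = first_cluster;
    for (long i = 0; i < logical; i++)  // O(n) hops to reach block n
        cluster = fat_next[cluster];
    return cluster;
}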
In practice, inode caching makes direct block access even faster. After a file is opened, its inode remains in kernel memory. Subsequent reads only need to consult the cached block pointers—no inode I/O at all. Disk I/O occurs only for data blocks that are not already in the page cache.
When a file grows and needs new blocks assigned to direct pointers, the filesystem must choose which physical blocks to allocate. The goal is contiguity—placing blocks sequentially on disk to optimize sequential read performance.
Allocation strategies:
- Block-group locality: place data blocks in the same block group as the file's inode, keeping related data close together
- Goal-based allocation: continue from the block just past the file's last allocated block, so growth stays sequential
- Delayed allocation: defer physical allocation until writeback, so many small writes can be satisfied by one contiguous request
- Preallocation/reservation: reserve a window of blocks beyond the current end of file, so future appends remain contiguous
Example: ext4's allocation approach:
File write of 32KB to a new file:
1. Write request arrives for 32KB (8 blocks)
2. Delayed allocation reserves 8 logical blocks
(No physical blocks allocated yet)
3. At writeback time (after short delay or memory pressure):
- ext4 multi-block allocator (mballoc) activates
- Searches for 8 contiguous free blocks
- Preferably near the file's inode block group
4. If 8 contiguous blocks found at physical blocks 5000-5007:
block[0] = 5000
block[1] = 5001
block[2] = 5002
block[3] = 5003
block[4] = 5004
block[5] = 5005
block[6] = 5006
block[7] = 5007
5. All 8 blocks written in single I/O operation
This batching makes small file creation significantly faster than immediate allocation and improves subsequent read performance through contiguity.
If you know a file's final size in advance, use posix_fallocate() or fallocate() to pre-allocate all blocks at once. Because the allocator sees the entire request up front, it can usually satisfy it with contiguous blocks (space permitting), eliminating incremental fragmentation. Database systems commonly do this for data files.
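A minimal sketch of that pattern (the filename and 1MB size are arbitrary examples; note that posix_fallocate() returns an error number rather than setting errno):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void) {
    int fd = open("data.bin", O_CREAT | O_WRONLY, 0644);
    if (fd < 0) { perror("open"); return 1; }

    // Reserve 1MB up front so the allocator can pick contiguous blocks
    int err = posix_fallocate(fd, 0, 1024 * 1024);
    if (err != 0)
        fprintf(stderr, "posix_fallocate failed: error %d\n", err);

    close(fd);
    return 0;
}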
While the classic 12-direct-pointer design persists in many filesystems, modern implementations have evolved with variations and optimizations:
| Filesystem | Approach | Small File Optimization | Notes |
|---|---|---|---|
| ext2/ext3 | Classic 12 direct + 3 indirect | None | Original Unix-style design |
| ext4 | Extents replace block pointers | Inline data for tiny files | Extents subsume direct blocks for most files |
| XFS | B+ tree extents | Local format for small extent lists | No traditional direct pointers |
| Btrfs | Inline extents in B-tree | File data inline in tree node | Copy-on-write changes semantics |
| ZFS | Object block pointers | Micro-blocks, compression | 128KB variable blocks |
| NTFS | MFT resident data | Small files stored in MFT record | Similar to ext4 inline data |
The Extent Evolution:
Ext4 introduced extents as an alternative to the 15-pointer array. An extent describes a contiguous range of blocks:
struct ext4_extent {
__le32 ee_block; // First logical block
__le16 ee_len; // Number of blocks in extent (up to 32768)
__le16 ee_start_hi; // Upper 16 bits of physical block
__le32 ee_start_lo; // Lower 32 bits of physical block
};
Where the classic scheme needs 1,024 block pointers to map a 4MB contiguous file (at 4KB blocks), a single extent covers it: "blocks 0-1023 at physical location X."
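A sketch of how the split hi/lo fields combine into a 48-bit physical block number, and of the range one extent covers (the numeric values are made up for illustration):

#include <stdint.h>
#include <stdio.h>

// Reconstruct the 48-bit physical start block from the split fields
uint64_t extent_start(uint16_t ee_start_hi, uint32_t ee_start_lo) {
    return ((uint64_t)ee_start_hi << 32) | ee_start_lo;
}

int main(void) {
    uint32_t ee_block = 0;     // first logical block covered
    uint16_t ee_len   = 1024;  // 1024 blocks = 4MB at 4KB blocks
    uint64_t start    = extent_start(0, 341000);  // example location

    printf("logical %u..%u -> physical %llu..%llu\n",
           ee_block, ee_block + ee_len - 1,
           (unsigned long long)start,
           (unsigned long long)(start + ee_len - 1));
    return 0;
}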
However, ext4 maintains compatibility: files created without the extents feature (for example, on a filesystem upgraded in place from ext3) continue to use the classic pointer scheme, and both mapping formats can coexist on the same volume. For small files, the entire extent tree fits inline in the inode's 60-byte i_block area (up to four extents), so no external tree blocks are needed at all.
Ext4 and NTFS can store tiny files entirely within the inode itself (inline data). If a file is ~100 bytes, why waste a 4KB block? The data is placed in the space normally used for block pointers. Reading such files requires only reading the inode—no data block access at all, and zero additional disk I/O.
Understanding direct blocks has practical implications for how you structure applications and data:
- Use posix_fallocate() when you know a file's final size, to encourage contiguous allocation.

Monitoring Tools:
# Check if file uses extents or direct blocks
$ debugfs -R "stat <inode_num>" /dev/sda1
# View inode details including block map
$ hdparm --fibmap /path/to/file
# Analyze file fragmentation
$ filefrag -v /path/to/file
ext: logical_offset: physical_offset: length:
  0:       0..    7:    1000..  1007:      8
Interpreting filefrag Output:
Ideal output for 32KB file:
extent 0: 0..7: 5000..5007: 8 blocks
(contiguous, all direct)
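The block map can also be queried programmatically with the FIBMAP ioctl (root required, and only on filesystems that expose a classic block map). A minimal sketch:

#include <fcntl.h>
#include <linux/fs.h>   /* FIBMAP */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    // Print the physical block behind each of the first 12 logical blocks
    for (int logical = 0; logical < 12; logical++) {
        int blk = logical;  // in: logical block number; out: physical
        if (ioctl(fd, FIBMAP, &blk) < 0) { perror("FIBMAP"); break; }
        printf("logical %2d -> physical %d%s\n",
               logical, blk, blk == 0 ? " (hole)" : "");
    }
    close(fd);
    return 0;
}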
On SSDs, seek time is nearly zero, so the contiguity benefit of direct blocks is less about avoiding seeks and more about enabling larger I/O operations (which SSDs handle efficiently) and reducing extent/pointer metadata overhead.
We've thoroughly examined how direct block pointers enable fast, simple access to small file data. Let's consolidate the key concepts:
- The first 12 inode pointers map logical blocks 0-11 directly to physical blocks: one array lookup, O(1) access
- With 4KB blocks, direct pointers alone cover files up to 48KB, which includes most files on typical systems
- A zero pointer marks a hole: reads return zeros without any disk I/O
- Allocators aim for contiguity, letting sequential reads of a small file complete in as little as one disk operation
- Modern filesystems refine the idea with extents and inline data, but the principle of optimizing the common case is unchanged
What's next:
Direct blocks handle files up to 48KB brilliantly. But what about larger files? The next page explores indirect blocks—the first layer of indirection that extends file capacity by dedicating a data block to hold additional block pointers.
You now understand direct block pointers—the foundation of Unix file data addressing. These 12 simple pointers handle the majority of files with O(1) access, making the common case fast. Next, we'll see how indirect blocks extend this scheme to handle larger files.