Loading content...
If linked allocation is a chain, then pointers are the links that hold everything together. Each block in a linked file carries not just data, but also a critical piece of metadata: the address of the next block in the chain. This seemingly simple concept—embedding a pointer within each block—has profound implications for file system design, performance, reliability, and recovery.
In the previous page, we established the conceptual foundation of linked allocation. Now we dive deeper into the mechanics of these pointers: their structure, their role in file operations, how they're updated, and what happens when things go wrong.
By the end of this page, you will master the detailed mechanics of block pointers: their physical layout within blocks, the algorithms for reading and updating pointer chains, the critical importance of atomic updates, and the catastrophic consequences of pointer corruption. You'll understand why pointer reliability is a central concern in file system design.
Every block in a linked allocation file system is divided into two distinct regions: the data region and the pointer region. Understanding their precise layout is fundamental to understanding linked allocation mechanics.
Block Layout Options:
File systems typically choose one of two layouts for placing the pointer:
Trailing Pointer Layout:
The most common approach places the pointer at the end of the block:
┌─────────────────────────────────────────┐
│ │
│ FILE DATA │
│ │
│ (BlockSize - 4) bytes │
│ │
├─────────────────────────────────────────┤
│ NEXT BLOCK POINTER (4 bytes) │
└─────────────────────────────────────────┘
Offset: BlockSize - 4
Advantages:
Pointer Location:
pointer_offset = block_size - sizeof(uint32_t);
next_block = *(uint32_t*)(block_buffer + pointer_offset);
Pointer Data Types:
The pointer field must be large enough to address all blocks on the storage device:
| Pointer Size | Max Block Number | 4KB Block Disk Size | 512B Sector Disk Size |
|---|---|---|---|
| 16 bits (2 bytes) | 65,535 | 256 MB | 32 MB |
| 24 bits (3 bytes) | 16,777,215 | 64 GB | 8 GB |
| 32 bits (4 bytes) | 4,294,967,295 | 16 TB | 2 TB |
| 48 bits (6 bytes) | 281 trillion | 1 EB | 128 PB |
| 64 bits (8 bytes) | 18 quintillion | 64 ZB | 8 ZB |
When FAT32 was designed in 1996, 32-bit pointers (actually 28 bits used) seemed excessive—16TB with 4KB clusters! Today, consumer SSDs exceed 4TB. The moral: always plan for growth. Modern file systems use 48-64 bit addressing.
Not all pointer values represent valid block addresses. File systems reserve certain values as sentinel markers to indicate special conditions:
End of File (EOF) Marker:
The most critical sentinel is the EOF marker, indicating no more blocks follow. Different systems use different representations:
12345678910111213141516171819202122
// Common sentinel values for 32-bit block pointers // Approach 1: Maximum value (0xFFFFFFFF)#define NULL_BLOCK_1 0xFFFFFFFF // -1 as unsigned // Approach 2: Zero#define NULL_BLOCK_2 0x00000000 // Requires block 0 to be reserved // Approach 3: High bit flag#define NULL_BLOCK_3 0x80000000 // Uses sign bit as flag // FAT32 specific values#define FAT32_EOF 0x0FFFFFF8 // End of file marker range (0xFFF8-0xFFFF)#define FAT32_FREE 0x00000000 // Free cluster#define FAT32_BAD 0x0FFFFFF7 // Bad cluster#define FAT32_RSVD 0x0FFFFFF0 // Reserved range start // Helper macros#define IS_EOF(ptr) ((ptr) >= 0x0FFFFFF8)#define IS_FREE(ptr) ((ptr) == 0x00000000)#define IS_BAD(ptr) ((ptr) == 0x0FFFFFF7)#define IS_VALID(ptr) ((ptr) >= 2 && (ptr) < 0x0FFFFFF0)Complete Sentinel Value Taxonomy:
| Value/Range | Meaning | Usage Context |
|---|---|---|
| 0x00000000 | Free block / Unused | Free space tracking; block not allocated to any file |
| 0x00000001 | Reserved / System | Sometimes reserved for boot sector or superblock |
| 0x0FFFFFF7 | Bad block marker | Marks blocks with physical defects; excluded from allocation |
| 0x0FFFFFF8-0x0FFFFFFF | End of file range | File chain terminator; no subsequent block exists |
| 0xFFFFFFFF | Alternative EOF | Some systems use max value for EOF instead of range |
Most file systems reserve blocks 0 and 1 for metadata (boot sector, superblock, FAT copies). This means valid data block addresses typically start at 2. Using 0 as NULL works because block 0 is never a valid data block destination.
Why Ranges Instead of Single Values?
FAT32 uses a range (0xFFFFFFF8-0xFFFFFFFF) for EOF rather than a single value for several reasons:
Traversing the pointer chain is fundamental to all file operations in linked allocation. The algorithm varies based on the operation type:
Sequential Read Traversal:
The simplest case—read each block in order following pointers:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354
// Sequential file read with complete error handlingtypedef struct { uint32_t first_block; uint64_t file_size;} FileHandle; ssize_t read_file_sequential(FileHandle *file, void *buffer, size_t count) { uint32_t current_block = file->first_block; size_t bytes_read = 0; uint8_t block_buffer[BLOCK_SIZE]; // Track visited blocks for loop detection BlockSet *visited = blockset_create(); while (current_block != EOF_MARKER && bytes_read < count) { // CRITICAL: Check for pointer loop (corruption detection) if (blockset_contains(visited, current_block)) { log_error("Circular pointer detected at block %u", current_block); blockset_destroy(visited); return -EIO; // I/O error due to corruption } blockset_add(visited, current_block); // Validate block number is in valid range if (current_block < 2 || current_block >= total_blocks) { log_error("Invalid block pointer: %u", current_block); blockset_destroy(visited); return -EIO; } // Read the block from disk if (disk_read_block(current_block, block_buffer) != 0) { blockset_destroy(visited); return -EIO; } // Extract data portion (exclude pointer region) size_t data_size = BLOCK_SIZE - POINTER_SIZE; size_t to_copy = min(data_size, count - bytes_read); // Don't read past file end size_t remaining_in_file = file->file_size - bytes_read; to_copy = min(to_copy, remaining_in_file); memcpy(buffer + bytes_read, block_buffer, to_copy); bytes_read += to_copy; // Extract next block pointer current_block = *(uint32_t*)(block_buffer + data_size); } blockset_destroy(visited); return bytes_read;}Random Access (Seek) Traversal:
Random access requires traversing to the target block without reading intervening data:
123456789101112131415161718192021222324252627282930313233343536373839
// Seek to specific offset - must traverse chainint seek_to_offset(FileHandle *file, uint64_t offset, SeekPosition *pos) { // Calculate target block number in chain size_t data_per_block = BLOCK_SIZE - POINTER_SIZE; uint32_t target_block_index = offset / data_per_block; pos->offset_in_block = offset % data_per_block; // Traverse from beginning to target block // THIS IS THE PERFORMANCE KILLER uint32_t current = file->first_block; uint8_t pointer_buffer[POINTER_SIZE]; for (uint32_t i = 0; i < target_block_index; i++) { if (current == EOF_MARKER) { return -EINVAL; // Offset beyond file end } // Read ONLY the pointer portion (optimization) // Some systems support partial block reads if (disk_read_partial(current, BLOCK_SIZE - POINTER_SIZE, // Pointer offset POINTER_SIZE, pointer_buffer) != 0) { return -EIO; } current = *(uint32_t*)pointer_buffer; } pos->current_block = current; return 0;} // Performance analysis:// - Sequential seek from start: O(n) disk reads where n = block number// - Each read typically triggers a disk seek on HDD// - For a 1GB file with 4KB blocks: 262,144 blocks// - Seeking to end: 262,144 disk reads!// - At 5ms per seek: ~22 MINUTES for one random accessA 1GB file accessed at random offsets can require hundreds of thousands of disk seeks. This is why pure linked allocation is impractical for databases, media players, or any application requiring random file access. The FAT optimization addresses this by caching pointers in memory.
Modifying pointer chains is where linked allocation's complexity becomes apparent. Every structural file operation—append, truncate, delete—involves updating pointers, and these updates must be performed carefully to maintain file integrity.
Append Operation (Adding Blocks):
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263
// Append new block to end of file chainint append_block_to_file(FileHandle *file, uint32_t new_block) { uint8_t block_buffer[BLOCK_SIZE]; // Case 1: Empty file - update directory entry if (file->first_block == EOF_MARKER) { file->first_block = new_block; // Initialize new block with EOF pointer memset(block_buffer, 0, BLOCK_SIZE); *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE) = EOF_MARKER; if (disk_write_block(new_block, block_buffer) != 0) { return -EIO; } return update_directory_entry(file); } // Case 2: Non-empty file - find last block and update its pointer uint32_t current = file->first_block; uint32_t previous; // Traverse to find last block (pointer == EOF) while (1) { if (disk_read_block(current, block_buffer) != 0) { return -EIO; } uint32_t next = *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE); if (next == EOF_MARKER) { break; // Found last block } previous = current; current = next; } // CRITICAL SECTION: Update pointer chain // Order matters for crash safety! // Step 1: Initialize new block with EOF pointer FIRST uint8_t new_block_buffer[BLOCK_SIZE]; memset(new_block_buffer, 0, BLOCK_SIZE); *(uint32_t*)(new_block_buffer + BLOCK_SIZE - POINTER_SIZE) = EOF_MARKER; if (disk_write_block(new_block, new_block_buffer) != 0) { // Failed to write new block - no changes made return -EIO; } // Step 2: Update previous last block to point to new block *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE) = new_block; if (disk_write_block(current, block_buffer) != 0) { // DANGER: New block written but not linked! // This creates an orphaned block (space leak) // Recovery will need to mark new_block as free return -EIO; } // Success: Chain extended return 0;}Truncate Operation (Removing Blocks):
Truncating a file requires unlinking blocks and returning them to the free list:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869
// Truncate file to specified block countint truncate_file_to_blocks(FileHandle *file, uint32_t keep_blocks) { if (keep_blocks == 0) { // Delete entire file - return all blocks to free list return delete_file_chain(file->first_block); } uint8_t block_buffer[BLOCK_SIZE]; uint32_t current = file->first_block; // Traverse to the block that will become the new last block for (uint32_t i = 0; i < keep_blocks - 1; i++) { if (current == EOF_MARKER) { // File shorter than keep_blocks - nothing to truncate return 0; } if (disk_read_block(current, block_buffer) != 0) { return -EIO; } current = *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE); } // Read the block that will become the new last block if (disk_read_block(current, block_buffer) != 0) { return -EIO; } // Save pointer to blocks being freed uint32_t freed_chain_start = *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE); // Step 1: Set new last block's pointer to EOF *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE) = EOF_MARKER; if (disk_write_block(current, block_buffer) != 0) { return -EIO; } // Step 2: Return freed blocks to free list // Must happen AFTER the chain is broken to avoid double-use if (freed_chain_start != EOF_MARKER) { return_blocks_to_freelist(freed_chain_start); } return 0;} // Return chain of blocks to free listint return_blocks_to_freelist(uint32_t first_freed) { uint32_t current = first_freed; uint8_t block_buffer[BLOCK_SIZE]; while (current != EOF_MARKER) { if (disk_read_block(current, block_buffer) != 0) { // Orphaned blocks - log for fsck recovery log_error("Orphan at block %u during truncate", current); return -EIO; } uint32_t next = *(uint32_t*)(block_buffer + BLOCK_SIZE - POINTER_SIZE); // Mark block as free mark_block_free(current); current = next; } return 0;}Notice the gap between 'break the chain' and 'mark blocks free'. If a crash occurs here, blocks become orphaned—allocated but unreachable. This is why file system check tools (fsck, chkdsk) exist: to find and reclaim orphaned blocks.
File system integrity depends critically on the order of disk writes. For linked allocation, incorrect ordering can lead to catastrophic data loss.
The Critical Ordering Rule:
New blocks must be initialized BEFORE being linked into the chain.
Why Order Matters:
Consider appending a block without proper ordering:
[BAD ORDER]
1. Update old-last-block's pointer to point to new-block
2. Write data to new-block
3. Set new-block's pointer to EOF
--- CRASH HERE ---
If a crash occurs after step 1 but before step 3:
Correct Write Ordering:
[CORRECT ORDER - Crash-Safe Append]
1. Allocate new block from free list
2. Write data and EOF marker to new block (complete block write)
3. Sync to ensure step 2 is on disk
4. Update old-last-block's pointer to point to new block
5. Sync to ensure step 4 is on disk
With this ordering:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546
// Crash-safe append with explicit barriersint safe_append_block(FileHandle *file, void *data, size_t len) { uint32_t new_block = allocate_block(); if (new_block == INVALID_BLOCK) { return -ENOSPC; } // STEP 1: Prepare new block completely uint8_t new_buffer[BLOCK_SIZE]; memset(new_buffer, 0, BLOCK_SIZE); memcpy(new_buffer, data, min(len, DATA_PER_BLOCK)); *(uint32_t*)(new_buffer + DATA_PER_BLOCK) = EOF_MARKER; // STEP 2: Write new block to disk if (disk_write_block(new_block, new_buffer) != 0) { free_block(new_block); return -EIO; } // STEP 3: I/O BARRIER - Ensure new block is on physical media if (disk_sync() != 0) { // Block might or might not be written // Conservative: treat as failed free_block(new_block); return -EIO; } // STEP 4: Now safe to link into chain uint32_t last_block = find_last_block(file); if (last_block == EOF_MARKER) { // Empty file case file->first_block = new_block; update_directory_entry(file); } else { // Update last block's pointer uint8_t last_buffer[BLOCK_SIZE]; disk_read_block(last_block, last_buffer); *(uint32_t*)(last_buffer + DATA_PER_BLOCK) = new_block; disk_write_block(last_block, last_buffer); } // STEP 5: I/O BARRIER - Ensure link is persistent disk_sync(); return 0;}disk_sync() (fsync/fdatasync) forces data to physical media. However, many disks have write caches that lie about completion. True durability requires disabling disk write cache or using battery-backed controllers. This is why enterprise storage differs from consumer drives.
Pointer corruption is linked allocation's greatest vulnerability. Unlike contiguous allocation where each file's blocks are independent, linked allocation creates a fragile chain where one broken link can render an entire file inaccessible.
Common Corruption Causes:
Corruption Taxonomy:
| Corruption Type | Symptom | Data Loss Severity | Detectability |
|---|---|---|---|
| Null/EOF premature | File appears truncated | Medium - data exists but unreachable | Detected by size mismatch |
| Wild pointer (out of range) | I/O error when following chain | High - file unreadable from error point | Easily detected |
| Crossed chains | Reading file shows other file's data | Variable - data exposed/mixed | Hard to detect without audit |
| Circular pointer | System hangs reading file | High - infinite loop | Requires cycle detection |
| Point to free block | Garbage data or crash | High - unpredictable | Requires free-space cross-check |
| Single bit flip | Points to nearby wrong block | High - silent corruption | Very hard to detect |
The Single Point of Failure Problem:
Consider a file with 1000 blocks. In linked allocation:
Probability of data loss with n blocks and corruption rate p:
P(file intact) = (1 - p)^(n-1)
For p = 0.001% and n = 10,000 blocks:
P(intact) = (0.999999)^9999 ≈ 99.0%
P(some loss) ≈ 1%
Even a tiny corruption rate becomes significant for large files.
In the 1990s, FAT file system corruption was so common that tools like ScanDisk and Norton Disk Doctor became essential. Improper shutdowns, buggy drivers, and unreliable hardware constantly damaged the FAT table, causing lost clusters, cross-linked files, and endless chkdsk runs on boot.
File systems employ various techniques to detect and recover from pointer corruption:
Detection Techniques:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152535455565758596061626364656667686970717273747576777879808182838485868788899091929394959697
// Comprehensive pointer validation during traversaltypedef enum { PTR_VALID, PTR_EOF, PTR_OUT_OF_RANGE, PTR_POINTS_TO_FREE, PTR_CYCLE_DETECTED, PTR_CROSS_LINKED} PointerStatus; PointerStatus validate_pointer(uint32_t ptr, FileHandle *file, BlockBitmap *allocated_map, BlockSet *visited) { // Check 1: EOF marker if (ptr >= EOF_MARKER_START) { return PTR_EOF; } // Check 2: Within valid block range if (ptr < 2 || ptr >= total_data_blocks) { log_warning("Block pointer %u out of range [2, %u)", ptr, total_data_blocks); return PTR_OUT_OF_RANGE; } // Check 3: Points to allocated block if (!bitmap_is_set(allocated_map, ptr)) { log_warning("Block %u points to free block %u", current, ptr); return PTR_POINTS_TO_FREE; } // Check 4: Cycle detection if (blockset_contains(visited, ptr)) { log_error("Cycle detected: block %u already visited", ptr); return PTR_CYCLE_DETECTED; } // Check 5: Cross-link detection (requires global state) FileHandle *owner = get_block_owner(ptr); if (owner != NULL && owner != file) { log_error("Cross-link: block %u owned by both %s and %s", ptr, file->name, owner->name); return PTR_CROSS_LINKED; } return PTR_VALID;} // Full file chain verificationint verify_file_chain(FileHandle *file) { uint32_t current = file->first_block; uint32_t block_count = 0; BlockSet *visited = blockset_create(); while (1) { if (current >= EOF_MARKER_START) { break; // Reached end of file } PointerStatus status = validate_pointer( current, file, global_allocated_map, visited); if (status != PTR_VALID && status != PTR_EOF) { blockset_destroy(visited); return -1; // Chain corrupted } blockset_add(visited, current); block_count++; // Read next pointer uint32_t next = read_block_pointer(current); current = next; // Safety limit to prevent infinite loops from undetected cycles if (block_count > MAX_FILE_BLOCKS) { log_error("Block count exceeds maximum - possible cycle"); blockset_destroy(visited); return -1; } } blockset_destroy(visited); // Verify block count matches file size uint64_t expected_blocks = (file->size + DATA_PER_BLOCK - 1) / DATA_PER_BLOCK; if (block_count != expected_blocks) { log_warning("Block count mismatch: found %u, expected %lu", block_count, expected_blocks); return -1; } return 0; // Chain is valid}Recovery Strategies:
| Strategy | Description | Limitations |
|---|---|---|
| Chain truncation | Mark corrupted block as EOF | Loses all data after corruption point |
| Orphan recovery | Scan for blocks not in any chain | Can't determine original file ownership |
| Signature scanning | Search block data for file signatures | Works for known formats (JPEG, PDF) only |
| Redundant chains | Keep backup of pointer chain | Doubles metadata overhead |
| Checksums per block | Detect (not fix) corruption | Can't recover, only detect |
Modern file systems avoid embedded pointers entirely. Instead, they maintain centralized extent tables (like FAT) or use B-trees (NTFS, ext4, Btrfs) where structural metadata is separated from data. This allows redundant storage of critical pointers without polluting data blocks.
We've conducted an exhaustive examination of block pointers—the critical threads that hold linked files together. Let's consolidate our understanding:
What's next:
The next page explores one of linked allocation's greatest strengths: the complete elimination of external fragmentation. We'll examine why any free block can satisfy any allocation request, how this differs dramatically from contiguous allocation, and why despite this advantage, internal inefficiencies remain.
You now understand the detailed mechanics of block pointers in linked allocation—their structure, traversal algorithms, update ordering requirements, and vulnerability to corruption. This knowledge forms the foundation for understanding why the FAT optimization (centralizing pointers) was such a revolutionary advancement.