Loading learning content...
In 1977, a simple but profound insight transformed linked allocation from an academic concept into the foundation of the most widely-used file system in history. Tim Paterson, working on what would become the FAT file system for Seattle Computer Products' 86-DOS (later MS-DOS), made a critical observation:
What if we didn't store pointers in each block, but instead kept ALL the pointers in one centralized table?
This single architectural decision—moving pointers from scattered disk blocks into a dedicated File Allocation Table—solved linked allocation's random access problem while preserving its fragmentation benefits. The result was a file system that powered billions of computers, and whose direct descendants still run on virtually every USB drive, SD card, and UEFI system partition today.
By the end of this page, you will understand the architectural innovation of the FAT, see how it transforms O(n) traversal into O(1) lookup, analyze the memory-disk tradeoffs, explore the evolution from FAT12 to FAT32, and appreciate why this optimization remains relevant nearly 50 years later.
The fundamental insight of the FAT optimization is separating pointers from data. Instead of embedding the "next block" pointer within each data block, all pointers are stored in a separate, dedicated table.
Traditional Linked Allocation:
Block 0: [Data 0-4091][Ptr→Block 7]
Block 1: [Data][Ptr→...]
...
Block 7: [Data 4092-8183][Ptr→Block 23]
FAT Optimization:
Block 0: [Data 0-4095] (Full 4KB of data)
Block 1: [Data 4096-8191] (No embedded pointer)
...
FAT Table (separate structure):
FAT[0] = 7 (Block 0's next block is Block 7)
FAT[7] = 23 (Block 7's next block is Block 23)
FAT[23] = EOF (Block 23 is end of file)
Key Benefits of Pointer Externalization:
| Aspect | Embedded Pointers | Centralized FAT |
|---|---|---|
| Data per block | BlockSize - PtrSize | Full BlockSize |
| Pointer access | Read full block | Read FAT entry |
| Random access | O(n) disk reads | O(1) FAT lookup |
| Pointer caching | Each block cached separately | Entire FAT cacheable |
| Recovery | Traverse to find | FAT is complete map |
| Corruption | Lose tail of file | FAT backup available |
The FAT is small enough to load entirely into memory. Once loaded, following a chain of blocks requires only RAM lookups—no disk I/O. This transforms O(n) disk operations into O(n) memory operations, which are millions of times faster.
The File Allocation Table is conceptually an array where each entry corresponds to one cluster (block) on the disk.
Terminology:
FAT Structure:
// Conceptual FAT structure
uint32_t FAT[total_clusters];
// Each entry contains either:
// - Next cluster number (link in chain)
// - EOF marker (end of file)
// - FREE marker (available for allocation)
// - BAD marker (defective cluster)
// - RESERVED marker (system use)
Physical Layout on Disk:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
// FAT32 Disk Layout// (not to scale - FAT tables are much smaller than data region) /* * +------------------------+ * | Boot Sector (1 sector) | <- Contains volume info, FAT location * +------------------------+ * | Reserved Sectors | <- Usually 32 sectors in FAT32 * +------------------------+ * | FAT #1 (Primary) | <- The File Allocation Table * +------------------------+ * | FAT #2 (Backup) | <- Exact copy for redundancy * +------------------------+ * | Root Directory | <- FAT32: Part of data region * +------------------------+ * | | * | Data Region | <- Actual file/directory data * | (Clusters) | * | | * +------------------------+ */ // Boot sector structure (partial)typedef struct __attribute__((packed)) { uint8_t jump[3]; // Jump instruction char oem_name[8]; // "MSDOS5.0", "MSWIN4.1", etc. uint16_t bytes_per_sector; // Usually 512 uint8_t sectors_per_cluster; uint16_t reserved_sectors; uint8_t num_fats; // Usually 2 (primary + backup) uint16_t root_entry_count; // 0 for FAT32 uint16_t total_sectors_16; // 0 if > 65535 sectors uint8_t media_type; // 0xF8 for hard disk uint16_t fat_size_16; // 0 for FAT32 uint16_t sectors_per_track; uint16_t num_heads; uint32_t hidden_sectors; uint32_t total_sectors_32; // FAT32-specific fields follow... uint32_t fat_size_32; // Sectors per FAT uint16_t ext_flags; uint16_t fs_version; uint32_t root_cluster; // First cluster of root directory uint16_t fs_info; uint16_t backup_boot_sector; uint8_t reserved[12]; uint8_t drive_number; uint8_t reserved1; uint8_t boot_signature; uint32_t volume_id; char volume_label[11]; char fs_type[8]; // "FAT32 "} FAT32BootSector;FAT Entry Values:
| Value Range | Meaning | Description |
|---|---|---|
| 0x0000000 | FREE | Cluster is available for allocation |
| 0x0000001 | RESERVED | Reserved; should not occur |
| 0x0000002 - 0x0FFFFFEF | Valid cluster | Points to next cluster in chain |
| 0x0FFFFFF0 - 0x0FFFFFF6 | RESERVED | Reserved values |
| 0x0FFFFFF7 | BAD | Cluster has defective sectors |
| 0x0FFFFFF8 - 0x0FFFFFFF | EOF | End of cluster chain (last cluster) |
FAT32 uses only 28 bits of each 32-bit entry; the upper 4 bits are reserved. This allows 268 million clusters (2^28), supporting volumes up to 2TB with 8KB clusters or 16TB with 64KB clusters. The reserved bits were intended for future use but never formally specified.
With the FAT loaded in memory, traversing a file's cluster chain becomes extraordinarily fast.
The Key Insight:
Each FAT entry IS the pointer to the next cluster. Following the chain is simply array indexing:
next_cluster = FAT[current_cluster];
No disk I/O required—just a memory read.
Complete File Reading Algorithm:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071
// FAT file reading with in-memory FAT tabletypedef struct { char name[11]; // 8.3 filename uint8_t attributes; uint8_t reserved; uint8_t create_time_ms; uint16_t create_time; uint16_t create_date; uint16_t access_date; uint16_t first_cluster_high; // High 16 bits (FAT32) uint16_t modify_time; uint16_t modify_date; uint16_t first_cluster_low; // Low 16 bits uint32_t file_size;} __attribute__((packed)) DirectoryEntry; // Global: FAT loaded in memory at mount timeuint32_t *FAT; // FAT[cluster] = next_cluster or EOF/FREE/BAD // Calculate cluster number from directory entryuint32_t get_first_cluster(DirectoryEntry *entry) { return ((uint32_t)entry->first_cluster_high << 16) | entry->first_cluster_low;} // Read entire file into bufferssize_t read_file(DirectoryEntry *entry, void *buffer) { uint32_t cluster = get_first_cluster(entry); uint32_t bytes_read = 0; uint32_t remaining = entry->file_size; while (cluster < 0x0FFFFFF8) { // While not EOF // Calculate disk address of this cluster uint64_t disk_offset = data_region_start + (cluster - 2) * cluster_size; // Read cluster data (this is the only disk I/O!) uint32_t to_read = min(cluster_size, remaining); disk_read(disk_offset, buffer + bytes_read, to_read); bytes_read += to_read; remaining -= to_read; // Follow chain in FAT - JUST A MEMORY READ! cluster = FAT[cluster] & 0x0FFFFFFF; // Mask reserved bits } return bytes_read;} // Random access with FAT - O(n) memory operations, O(1) disk after seekssize_t read_at_offset(DirectoryEntry *entry, uint64_t offset, void *buffer, size_t count) { // Calculate which cluster contains the offset uint32_t target_cluster_index = offset / cluster_size; uint32_t offset_in_cluster = offset % cluster_size; // Traverse FAT to find the cluster (all in memory!) uint32_t cluster = get_first_cluster(entry); for (uint32_t i = 0; i < target_cluster_index; i++) { cluster = FAT[cluster] & 0x0FFFFFFF; if (cluster >= 0x0FFFFFF8) { return -EINVAL; // Offset beyond file end } } // NOW do the disk read - just ONE I/O operation! uint64_t disk_offset = data_region_start + (cluster - 2) * cluster_size + offset_in_cluster; return disk_read(disk_offset, buffer, count);}Performance Transformation:
| Operation | Pure Linked | FAT (loaded) | Improvement |
|---|---|---|---|
| Find cluster N | N disk reads | N memory reads | ~100,000× |
| Random read | N × 8ms = 8N ms | ~0.001ms + 8ms | ~1000× |
| Sequential read | Same (1 read/cluster) | Same | 1× |
The transformation from disk I/O to memory operations is the key. Memory access takes ~100 nanoseconds; disk access takes ~8 million nanoseconds. That's why "caching the FAT" provides such massive improvements.
While technically still O(n) to traverse the chain, n memory operations complete in microseconds. The dominant cost is the final disk read, which is O(1). For practical purposes, FAT-based random access is effectively O(1) for any reasonable file size.
Over the decades, FAT evolved through three major versions, each increasing the number of bits per FAT entry:
FAT12 (1977, 86-DOS/MS-DOS 1.0):
12345678910111213141516171819202122232425262728293031
// FAT12: 12-bit entries packed into bytes// Two entries share three bytes (12 + 12 = 24 bits = 3 bytes) /* * Byte layout: [AAAA AAAA] [BBBB AAAA] [BBBB BBBB] * Entry 0 Entry 0+1 Entry 1 * bits 0-7 bits 8-11 bits 0-11 * bits 0-3 */ uint16_t read_fat12_entry(uint8_t *fat, uint32_t cluster) { // Each entry is 12 bits = 1.5 bytes uint32_t fat_offset = cluster + (cluster / 2); // cluster * 1.5 // Read 16 bits starting at offset uint16_t value = *(uint16_t*)(fat + fat_offset); // Extract 12 bits based on odd/even cluster number if (cluster & 1) { // Odd cluster: take upper 12 bits return value >> 4; } else { // Even cluster: take lower 12 bits return value & 0x0FFF; }} // FAT12 special values#define FAT12_FREE 0x000#define FAT12_EOF 0xFF8 // 0xFF8-0xFFF are EOF markers#define FAT12_BAD 0xFF7FAT16 (1984, MS-DOS 3.0):
123456789
// FAT16: 16-bit entries, nice and simpleuint16_t read_fat16_entry(uint16_t *fat, uint32_t cluster) { return fat[cluster]; // Direct array access!} // FAT16 special values#define FAT16_FREE 0x0000#define FAT16_EOF 0xFFF8 // 0xFFF8-0xFFFF are EOF markers#define FAT16_BAD 0xFFF7FAT32 (1996, Windows 95 OSR2):
| Feature | FAT12 | FAT16 | FAT32 |
|---|---|---|---|
| Entry size | 12 bits | 16 bits | 32 bits (28 used) |
| Max clusters | 4,086 | 65,526 | 268,435,446 |
| Max volume | 16 MB | 2 GB | 2 TB* |
| Max file size | 16 MB | 2 GB | 4 GB - 1 |
| Cluster sizes | 512B-4KB | 512B-32KB | 512B-32KB |
| FAT size (1GB vol) | ~1.5 KB | ~128 KB | ~4 MB |
| Era | 1977-1990s | 1984-2000s | 1996-present |
FAT32's 4GB file size limit comes from the directory entry's 32-bit file_size field, not the FAT itself. A 32-bit unsigned integer maxes out at 4,294,967,295 bytes = 4GB - 1 byte. This limitation is why FAT32 is unsuitable for large video files or disk images.
The FAT optimization trades disk space and memory for performance. Let's analyze these tradeoffs:
Disk Space Overhead:
The FAT table must be stored on disk, consuming space that could otherwise hold data:
1234567891011121314151617181920212223242526272829303132333435363738
// Calculate FAT table disk overhead typedef struct { uint64_t volume_size; uint32_t cluster_size; uint32_t entry_bits; // 12, 16, or 32 uint32_t num_fats; // Usually 2} FATConfig; uint64_t calculate_fat_overhead(FATConfig *config) { // Number of clusters uint64_t data_clusters = config->volume_size / config->cluster_size; // Bits needed for FAT uint64_t fat_bits = data_clusters * config->entry_bits; // Convert to bytes, round up to sector uint64_t fat_bytes = (fat_bits + 7) / 8; uint64_t fat_sectors = (fat_bytes + 511) / 512; // Total FAT overhead (primary + backups) uint64_t total_overhead = fat_sectors * 512 * config->num_fats; return total_overhead;} // Examples:// 1 GB volume, 4KB clusters, FAT32:// 262,144 clusters × 4 bytes = 1,048,576 bytes = 1 MB per FAT// With backup: 2 MB total (0.2% of volume) // 1 TB volume, 4KB clusters, FAT32:// 268,435,456 clusters × 4 bytes = 1,073,741,824 bytes = 1 GB per FAT// With backup: 2 GB total (0.2% of volume) // Compare to embedded pointers:// 4 bytes per 4KB block = 0.1% overhead// FAT is actually 2x MORE disk overhead, but enables cachingMemory Requirements:
For optimal performance, the entire FAT should be loaded into RAM:
| Volume Size | Cluster Size | FAT32 Size | Typical System RAM (Era) |
|---|---|---|---|
| 512 MB | 4 KB | 512 KB | 16-32 MB (Win95) |
| 4 GB | 4 KB | 4 MB | 64-128 MB (Win98) |
| 32 GB | 32 KB | 4 MB | 256 MB-1 GB (WinXP) |
| 256 GB | 32 KB | 32 MB | 2-4 GB (Win7) |
| 2 TB | 32 KB | 256 MB | 8-16 GB (Modern) |
Partial Loading Strategies:
For large volumes where loading the entire FAT is impractical:
Modern Windows keeps FAT cached using its standard file buffer cache, loading portions on demand rather than mapping the entire table.
On USB drives and SD cards (typically under 64GB), the entire FAT fits in a few MB—trivial for modern systems. This is why FAT remains popular for removable media: the FAT table is always small enough to cache completely.
A major advantage of the centralized FAT is the ability to maintain redundant copies for recovery.
Dual FAT Strategy:
Most FAT file systems maintain two identical copies of the FAT:
+-------------------+
| Boot Sector |
+-------------------+
| FAT #1 (Primary) | <- Read from here
+-------------------+
| FAT #2 (Backup) | <- Use for recovery
+-------------------+
| Data Region |
+-------------------+
Synchronization:
Both FATs are updated together during modifications:
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869
// FAT consistency check and recoverytypedef enum { FAT_OK, FAT_MISMATCH, FAT_BOTH_CORRUPT} FATStatus; FATStatus check_fat_consistency(VolumInfo *vol) { uint8_t *fat1 = read_fat(vol, 0); // Primary FAT uint8_t *fat2 = read_fat(vol, 1); // Backup FAT // Compare byte-by-byte if (memcmp(fat1, fat2, vol->fat_size) == 0) { return FAT_OK; } // FATs differ - validate each bool fat1_valid = validate_fat_checksums(fat1, vol); bool fat2_valid = validate_fat_checksums(fat2, vol); if (fat1_valid && !fat2_valid) { // FAT1 good, FAT2 corrupt - copy FAT1 to FAT2 write_fat(vol, 1, fat1); return FAT_MISMATCH; } if (!fat1_valid && fat2_valid) { // FAT2 good, FAT1 corrupt - copy FAT2 to FAT1 write_fat(vol, 0, fat2); return FAT_MISMATCH; } // Both valid but different (crash during update) // Typically trust FAT1 and recover FAT2 log_warning("FAT mismatch - recovering from primary"); write_fat(vol, 1, fat1); return FAT_MISMATCH;} // Lost cluster recoveryint find_lost_clusters(VolumeInfo *vol) { uint32_t *fat = get_fat(vol); bool *reachable = calloc(vol->total_clusters, sizeof(bool)); // Mark all clusters reachable from directory tree mark_reachable_clusters(vol->root_dir, fat, reachable); // Find allocated but unreachable clusters (lost chains) int lost_count = 0; uint32_t lost_chain_start = 0; for (uint32_t c = 2; c < vol->total_clusters; c++) { if (fat[c] != FAT32_FREE && !reachable[c]) { lost_count++; if (lost_chain_start == 0) { lost_chain_start = c; } } } if (lost_count > 0) { // Options: 1) Free them, 2) Save as FOUND.XXX files printf("Found %d lost clusters starting at %u\n", lost_count, lost_chain_start); } free(reachable); return lost_count;}Common FAT Corruption Scenarios:
| Issue | Symptom | Recovery |
|---|---|---|
| FAT mismatch | Different values in FAT1/FAT2 | Copy valid FAT to other |
| Lost clusters | Allocated but not in any file chain | Free or save as FOUND.XXX |
| Cross-linked files | Two files share clusters | Manual intervention required |
| Invalid FAT entry | Entry points to out-of-range cluster | Truncate file at error |
| Circular chain | FAT entry points back to earlier cluster | Break cycle, mark EOF |
The backup FAT only helps if the primary FAT is corrupted but the backup is undamaged. If a bug or crash corrupts both FATs identically, the backup provides no protection. This is why journaling file systems provide superior crash consistency.
Despite being nearly 50 years old, FAT remains ubiquitous. Understanding why reveals important lessons about file system design.
Where FAT Lives Today:
Why FAT Persists:
| Factor | Explanation |
|---|---|
| Simplicity | Easy to implement, debug, and understand |
| Patent-free (mostly) | Microsoft's FAT LFN patents expired |
| Universal support | Every OS can read/write FAT |
| Proven reliability | Decades of real-world validation |
| Small code footprint | Works on microcontrollers |
| Low memory requirement | Runs on devices with KB of RAM |
| No journal overhead | Fast for small writes |
Need to share files between Windows, macOS, and Linux without any driver installation? FAT32. Need to boot any UEFI system? FAT32 on the ESP. Need firmware for an embedded device? Often FAT. Simplicity and ubiquity trump any technical shortcoming.
Successor: exFAT (2006):
Microsoft addressed FAT32's limitations with exFAT:
exFAT is essentially "FAT64"—the same linked allocation concepts with wider fields for modern storage sizes.
We've thoroughly explored the FAT optimization—the brilliant architectural insight that made linked allocation practical for decades. Let's consolidate our understanding:
Module Conclusion:
You've now mastered linked allocation—from the fundamental concept of block chains, through pointer mechanics and fragmentation properties, understanding the random access limitation, to the revolutionary FAT optimization. This complete picture shows how a simple idea (linking blocks with pointers) spawned one of computing's most successful file systems, and why modern systems have moved to more sophisticated schemes like extent-based and B-tree allocation while still maintaining FAT for compatibility.
You've completed the Linked Allocation module! You can now explain how linked allocation works, analyze its strengths (no external fragmentation, dynamic growth) and weaknesses (poor random access), and understand why the FAT optimization made this approach practical for real-world use. This knowledge provides crucial context for understanding modern file system design choices.