Loading learning content...
In our exploration of file system implementation, we've encountered a fundamental tension. Contiguous allocation provides blazing-fast sequential access but suffers from external fragmentation and requires knowing file sizes upfront. Linked allocation eliminates fragmentation but sacrifices random access performance, requiring traversal from the beginning to reach any block.
What if we could combine the best of both worlds? What if we could allocate blocks anywhere on disk—eliminating fragmentation—while still supporting efficient random access to any part of the file?
The answer lies in indexed allocation, and at its heart is a deceptively simple data structure: the index block.
By the end of this page, you will understand what an index block is, how it enables both random access and non-contiguous allocation, and why it forms the foundation of virtually every modern file system from Unix inodes to NTFS extent trees.
An index block (also known as an index node or i-node in Unix terminology) is a dedicated disk block that contains pointers to all the data blocks belonging to a file. Rather than embedding location information within the data blocks themselves (as in linked allocation), indexed allocation externalizes this metadata into a separate structure.
The Core Idea:
Imagine you're writing a book with chapters scattered across different sections of a library. Instead of placing a note at the end of each chapter telling readers where to find the next one, you create a table of contents at the front. This table lists every chapter and its exact shelf location. Readers can jump directly to any chapter without traversing through the others.
The index block is that table of contents for a file.
| Index Entry | Block Pointer | Logical Block Number |
|---|---|---|
| Entry 0 | Block 25 | File bytes 0-4095 |
| Entry 1 | Block 137 | File bytes 4096-8191 |
| Entry 2 | Block 89 | File bytes 8192-12287 |
| Entry 3 | Block 412 | File bytes 12288-16383 |
| Entry 4 | Block 201 | File bytes 16384-20479 |
| Entry 5 | NULL | Not yet allocated |
| ... | ... | ... |
Each entry in the index block contains a block pointer—the physical address of a data block on disk. The position of each entry implicitly defines the logical block number within the file. Entry 0 points to the first block of data, entry 1 to the second, and so on.
Key Characteristics:
To truly understand indexed allocation, we must examine the internal structure of an index block in detail. While implementations vary across file systems, the fundamental components remain consistent.
The block size determines how many pointers fit in an index block. With a 4KB block and 8-byte pointers, you can store 512 pointers per index block, addressing 512 × 4KB = 2MB of file data. This limitation drives the need for multi-level indexing, which we'll explore later.
1234567891011121314151617181920212223242526272829303132333435363738
/* Simplified index block structure for a file system */ #define BLOCK_SIZE 4096 /* 4KB blocks */#define POINTER_SIZE 8 /* 64-bit pointers */#define POINTERS_PER_BLOCK (BLOCK_SIZE / POINTER_SIZE) /* 512 pointers */ typedef uint64_t block_ptr_t; /* Physical block address */ /* * Index Block Layout (Single-Level) * * An index block is simply an array of block pointers. * Each pointer addresses one data block of the file. * * Memory Layout: * +------------------+ * | block_ptr[0] | → First 4KB of file data * +------------------+ * | block_ptr[1] | → Second 4KB of file data * +------------------+ * | block_ptr[2] | → Third 4KB of file data * +------------------+ * | ... | * +------------------+ * | block_ptr[511] | → 512th 4KB of file data * +------------------+ */struct index_block { block_ptr_t pointers[POINTERS_PER_BLOCK];}; /* * Maximum file size with single-level indexing: * 512 pointers × 4KB per block = 2MB * * This is why real file systems use multi-level indexing! */#define MAX_FILE_SIZE_SINGLE_LEVEL (POINTERS_PER_BLOCK * BLOCK_SIZE)The true power of indexed allocation becomes apparent when we examine how file access operations work. Unlike linked allocation, where reaching the 1000th block requires traversing 999 preceding blocks, indexed allocation provides direct access to any block.
The Access Algorithm:
To read data at byte offset position in a file:
block_num = position / BLOCK_SIZEindex_block[block_num] to find the physical block addressoffset_in_block = position % BLOCK_SIZEThis is O(1) with respect to file position—accessing the last byte of a 1GB file takes the same number of disk operations as accessing the first byte.
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071
/* * Reading data from a file using indexed allocation * * This demonstrates the O(1) random access property of * indexed allocation - we can jump directly to any block. */ struct file_descriptor { block_ptr_t index_block_addr; /* Location of index block */ struct index_block *cached_index; /* Cached in memory */ uint64_t file_size; /* For bounds checking */ uint64_t current_position; /* Current read/write position */}; /* * Read 'count' bytes from file at the current position. * * Time Complexity: O(blocks_read), NOT O(file_size) * This is the key advantage over linked allocation. */ssize_t indexed_read(struct file_descriptor *fd, void *buffer, size_t count) { size_t bytes_read = 0; char *buf = (char *)buffer; /* Ensure index block is cached in memory */ if (fd->cached_index == NULL) { fd->cached_index = read_block(fd->index_block_addr); } while (bytes_read < count && fd->current_position < fd->file_size) { /* Step 1: Calculate which logical block we need */ uint64_t logical_block = fd->current_position / BLOCK_SIZE; uint64_t offset_in_block = fd->current_position % BLOCK_SIZE; /* Step 2: Direct lookup in index block - O(1) */ block_ptr_t physical_block = fd->cached_index->pointers[logical_block]; /* Step 3: Read the data block */ char *data = read_block(physical_block); /* Step 4: Copy requested bytes */ size_t bytes_available = BLOCK_SIZE - offset_in_block; size_t bytes_remaining = count - bytes_read; size_t bytes_to_copy = MIN(bytes_available, bytes_remaining); memcpy(buf + bytes_read, data + offset_in_block, bytes_to_copy); bytes_read += bytes_to_copy; fd->current_position += bytes_to_copy; free(data); /* Release temporary buffer */ } return bytes_read;} /* * Seek to a specific position - O(1) operation! * * Unlike linked allocation where seeking requires traversal, * indexed allocation allows instant positioning. */int indexed_seek(struct file_descriptor *fd, uint64_t position) { if (position > fd->file_size) { return -1; /* Beyond end of file */ } /* No disk I/O required - just update the position */ fd->current_position = position; return 0;}In practice, operating systems cache index blocks in memory. Once loaded, subsequent accesses within the same file require only one disk I/O per data block. This makes random access within open files extremely efficient.
To fully appreciate why indexed allocation won the evolutionary battle for file system design, let's systematically compare it with the alternatives we've studied.
| Criterion | Contiguous | Linked | Indexed |
|---|---|---|---|
| Sequential Access | Excellent (O(1) per block) | Good (O(1) per block) | Good (O(1) per block + index lookup) |
| Random Access | Excellent (O(1)) | Poor (O(n)) | Excellent (O(1)) |
| External Fragmentation | Severe | None | None |
| Internal Fragmentation | Minimal | Minimal | Index block waste for small files |
| File Growth | Difficult/Impossible | Easy | Easy |
| Space Overhead | None | One pointer per block | One index block per file |
| Reliability | Good | Chain breakage = data loss | Index loss = file loss |
| Implementation Complexity | Low | Medium | Medium-High |
Why Indexed Allocation Wins:
The critical insight is that modern workloads are dominated by random access patterns—databases seeking to specific records, applications loading configuration files, users opening documents at arbitrary positions. The O(n) random access cost of linked allocation becomes prohibitive.
Meanwhile, external fragmentation in contiguous allocation wastes enormous amounts of disk space and requires defragmentation overhead. Indexed allocation elegantly solves both problems, which is why it (or variations of it) underlies virtually every production file system.
Index blocks don't exist in isolation—they're part of a larger file system architecture. Understanding how directories reference index blocks is crucial for understanding file lookup operations.
The Connection:
When you run open("/home/user/document.txt"), the operating system must:
home, user, document.txtdocument.txt1234567891011121314151617181920212223242526272829303132333435363738394041424344454647
/* * Directory Entry Structure * * Directories are special files that contain mappings from * file names to index block addresses (inode numbers in Unix). */ #define MAX_FILENAME_LENGTH 255 struct directory_entry { char filename[MAX_FILENAME_LENGTH + 1]; uint64_t index_block_number; /* Also called inode number */ uint8_t entry_type; /* File, directory, symlink, etc. */ uint8_t padding[6]; /* Alignment padding */}; /* * Opening a file: The journey from path to index block */struct index_block* path_to_index_block(const char *path) { /* Start at root directory's index block */ struct index_block *current_dir_index = get_root_index_block(); char *path_copy = strdup(path); char *component = strtok(path_copy, "/"); while (component != NULL) { /* Search directory for this component */ struct directory_entry *entry = search_directory( current_dir_index, component ); if (entry == NULL) { free(path_copy); return NULL; /* File not found */ } /* Load the index block for this component */ current_dir_index = read_index_block(entry->index_block_number); component = strtok(NULL, "/"); } free(path_copy); return current_dir_index; /* Index block of the target file */}In Unix file systems, each index block has a unique number called the inode number. Directory entries store only the filename and inode number—all other file metadata lives in the inode (index block) itself. This design enables hard links, where multiple directory entries reference the same inode.
When creating a new file, the file system must allocate an index block. Different strategies optimize for different scenarios.
| Strategy | Advantages | Disadvantages | Best For |
|---|---|---|---|
| Pre-allocated Table | Fast allocation, predictable | Wastes space if unused, fixed limit | General-purpose, many small files |
| Dynamic Allocation | Flexible, no wasted inodes | Fragmentation possible, slower allocation | Variable workloads |
| Embedded/Inline | Zero overhead for tiny files | Limited size, complexity | Systems with many small files |
| Extent-Based | Efficient for large sequential files | Poor for highly fragmented files | Media files, databases |
Let's trace through exactly what happens when a user creates and writes to a file using indexed allocation. This end-to-end example solidifies the concepts we've covered.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106
/* * Complete File Creation Example with Indexed Allocation * * Scenario: User creates /home/user/report.txt and writes 10KB of data * Assumptions: 4KB block size, 512 pointers per index block */ /* Step 1: System call - create("report.txt", mode) */int create_file_indexed(const char *path, mode_t mode) { /* Step 2: Allocate a new index block (inode) */ uint64_t inode_num = allocate_inode(); if (inode_num == 0) { return -ENOSPC; /* No free inodes */ } /* Step 3: Initialize the index block */ struct inode *new_inode = get_inode(inode_num); new_inode->mode = mode; new_inode->size = 0; new_inode->link_count = 1; new_inode->uid = current_user(); new_inode->gid = current_group(); new_inode->atime = new_inode->mtime = new_inode->ctime = now(); /* All block pointers initially NULL */ for (int i = 0; i < POINTERS_PER_INODE; i++) { new_inode->direct_blocks[i] = 0; } new_inode->indirect_block = 0; new_inode->double_indirect = 0; /* Step 4: Add entry to parent directory */ struct inode *parent_dir = get_parent_directory(path); const char *filename = get_filename(path); add_directory_entry(parent_dir, filename, inode_num); /* Step 5: Write the inode and directory to disk */ write_inode(new_inode); write_inode(parent_dir); return open_file(inode_num); /* Return file descriptor */} /* Step 6: User writes 10KB of data - write(fd, buffer, 10240) */ssize_t write_file_indexed(struct file_descriptor *fd, const void *buffer, size_t count) { struct inode *inode = get_inode(fd->inode_num); const char *data = (const char *)buffer; size_t bytes_written = 0; while (bytes_written < count) { /* Calculate which logical block we're writing to */ uint64_t logical_block = fd->position / BLOCK_SIZE; uint64_t offset_in_block = fd->position % BLOCK_SIZE; /* Allocate a data block if this logical block is empty */ if (inode->direct_blocks[logical_block] == 0) { uint64_t new_block = allocate_data_block(); if (new_block == 0) { break; /* Disk full */ } inode->direct_blocks[logical_block] = new_block; } /* Write data to the block */ uint64_t physical_block = inode->direct_blocks[logical_block]; size_t bytes_to_write = MIN(BLOCK_SIZE - offset_in_block, count - bytes_written); write_partial_block(physical_block, offset_in_block, data + bytes_written, bytes_to_write); bytes_written += bytes_to_write; fd->position += bytes_to_write; } /* Update file size and timestamps */ if (fd->position > inode->size) { inode->size = fd->position; } inode->mtime = now(); write_inode(inode); return bytes_written;} /* * Disk State After Writing 10KB: * * Inode (Index Block): * +------------------+ * | mode, size=10240 | * | uid, gid | * | timestamps | * | direct[0] → 500 | ← Block 500: bytes 0-4095 * | direct[1] → 501 | ← Block 501: bytes 4096-8191 * | direct[2] → 502 | ← Block 502: bytes 8192-10239 * | direct[3-11] = 0 | ← Unused * | indirect = 0 | * | double_ind = 0 | * +------------------+ */The index block is a deceptively simple idea with profound implications. By externalizing block location metadata into a dedicated structure, indexed allocation achieves what seemed impossible: efficient random access without external fragmentation.
What's Next:
We've established what index blocks are and why they're powerful. But we glossed over a crucial detail: how do those block pointers actually work? In the next page, we'll dive deep into direct pointers, understanding their structure, limitations, and the role they play in the overall indexing hierarchy.
You now understand the fundamental concept of index blocks—the metadata structures that enable efficient, fragmentation-free file storage. This knowledge forms the foundation for understanding all modern file system designs, from Unix inodes to NTFS MFT entries.