The linked list approach to free space management has an inherent inefficiency: each disk block stores only a single pointer to the next free block. With blocks typically sized at 4KB and pointers requiring only 8 bytes, we're wasting 99.8% of each free block's capacity!
Grouping addresses this waste through a simple but powerful insight: instead of storing one pointer per block, store hundreds or thousands of free block addresses in a single block. This transforms the linked list's O(N) traversal into something far more manageable, dramatically reducing the I/O required for free space operations.
This technique was pioneered in early Unix file systems and influenced many subsequent designs. It occupies an interesting middle ground between the space efficiency of bitmaps and the operational simplicity of linked lists, trading some complexity for significant performance gains in specific scenarios.
By the end of this page, you will understand the motivation behind grouping, how free block addresses are packed into container blocks, the algorithms for allocation and deallocation, performance characteristics compared to pure linked lists, implementation variants including multi-level grouping, and scenarios where grouping excels.
Grouping modifies the linked list approach by using the first free block in each "group" to store the addresses of many subsequent free blocks, not just one. The last entry in each group points to the next group's container block.
Structure of a Group Container Block:
┌─────────────────────────────────────────────────────────────────┐
│ GROUP CONTAINER BLOCK │
├─────────────────────────────────────────────────────────────────┤
│ Count: 510 │
├─────────────────────────────────────────────────────────────────┤
│ Entry 0: Block 1234 (free block) │
│ Entry 1: Block 1567 (free block) │
│ Entry 2: Block 1890 (free block) │
│ ... │
│ Entry 508: Block 9876 (free block) │
│ Entry 509: Block 5555 → (NEXT GROUP CONTAINER) │
└─────────────────────────────────────────────────────────────────┘
Capacity Calculation:
For a 4KB block with 8-byte block addresses:
Usable space = 4096 bytes
Pointer size = 8 bytes
Max entries = 4096 / 8 = 512 entries
Reserving 1 for next-group pointer and 1 for count:
Free blocks per group = 510
For 1 million free blocks, only ⌈1,000,000 / 510⌉ = 1,961 container blocks are needed to track them all.
The tradeoff is that container blocks are "overhead"—they store addresses rather than being directly available for user data. But this overhead is tiny (1 block per 510 free blocks ≈ 0.2%).
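The arithmetic above packages neatly into two helpers; the names are hypothetical, and the constants assume the 4KB-block, 8-byte-address layout described above.

```c
#include <stdint.h>

/* Free-block slots per container: 512 eight-byte slots, minus one for
 * the count field and one for the next-group pointer. */
uint64_t free_per_group(uint64_t block_size, uint64_t ptr_size) {
    return block_size / ptr_size - 2;
}

/* Containers needed to track f free blocks, rounding up. */
uint64_t containers_needed(uint64_t f, uint64_t per_group) {
    return (f + per_group - 1) / per_group;
}
```

Here `free_per_group(4096, 8)` yields 510 and `containers_needed(1000000, 510)` yields 1,961 — one container per 510 free blocks, or about 0.2% overhead, matching the figure above.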
```c
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/**
 * Grouping Free Space Manager
 *
 * Each free block group can hold up to N block addresses.
 * The last entry in a full group points to the next group.
 */

#define BLOCK_SIZE 4096
#define POINTER_SIZE sizeof(uint64_t)
#define ENTRIES_PER_GROUP ((BLOCK_SIZE / POINTER_SIZE) - 1) // Reserve 1 for count

typedef struct __attribute__((packed)) {
    uint64_t count;                      // Number of valid entries
    uint64_t entries[ENTRIES_PER_GROUP]; // Free block addresses
    // entries[0] doubles as the next-group pointer when this group chains on
} GroupBlock;

typedef struct {
    uint64_t first_group;  // Block number of first group container
    uint64_t total_free;   // Total free blocks across all groups
    uint64_t block_size;
    void *disk_handle;
} GroupingManager;

/**
 * Read a group container block from disk
 */
GroupBlock* read_group(GroupingManager *gm, uint64_t block_num) {
    GroupBlock *gb = malloc(sizeof(GroupBlock));
    // disk_read(gm->disk_handle, block_num * gm->block_size, gb, sizeof(GroupBlock));
    return gb;
}

/**
 * Write a group container block to disk
 */
void write_group(GroupingManager *gm, uint64_t block_num, GroupBlock *gb) {
    // disk_write(gm->disk_handle, block_num * gm->block_size, gb, sizeof(GroupBlock));
}

/**
 * Allocate a single block from the group structure.
 *
 * Time complexity: O(1) amortized
 * I/O: Typically 1-2 reads, 1 write
 */
uint64_t group_allocate(GroupingManager *gm) {
    if (gm->first_group == 0) {
        return 0;  // No free blocks
    }

    GroupBlock *current = read_group(gm, gm->first_group);

    if (current->count == 0) {
        // This shouldn't happen in a well-maintained structure
        free(current);
        gm->first_group = 0;
        return 0;
    }

    // Take the last entry (LIFO order for efficiency)
    current->count--;
    uint64_t allocated = current->entries[current->count];

    if (current->count == 0) {
        // Group is now empty. entries[0] held the next-group pointer,
        // so the value we just took names the next container rather
        // than a free data block.
        uint64_t next_group = current->entries[0];

        // The container block itself becomes free!
        // We return it as the allocated block.
        allocated = gm->first_group;
        gm->first_group = next_group;
    } else {
        write_group(gm, gm->first_group, current);
    }

    free(current);
    gm->total_free--;
    return allocated;
}

/**
 * Free a block by adding it to the group structure.
 *
 * Time complexity: O(1)
 * I/O: 1-2 reads, 1-2 writes
 */
bool group_free(GroupingManager *gm, uint64_t block_num) {
    if (gm->first_group == 0) {
        // No existing groups - this block becomes the first container
        GroupBlock *new_group = malloc(sizeof(GroupBlock));
        memset(new_group, 0, sizeof(GroupBlock));
        new_group->count = 0;  // Container itself is overhead, not counted

        write_group(gm, block_num, new_group);
        gm->first_group = block_num;
        free(new_group);
        // Note: total_free is not incremented because the container is overhead
        return true;
    }

    GroupBlock *current = read_group(gm, gm->first_group);

    if (current->count < ENTRIES_PER_GROUP - 1) {
        // Room in current group
        current->entries[current->count] = block_num;
        current->count++;
        write_group(gm, gm->first_group, current);
    } else {
        // Current group is full - this block becomes the new container
        GroupBlock *new_group = malloc(sizeof(GroupBlock));
        memset(new_group, 0, sizeof(GroupBlock));
        new_group->count = 1;
        new_group->entries[0] = gm->first_group;  // First entry points to old group

        write_group(gm, block_num, new_group);
        gm->first_group = block_num;
        free(new_group);
    }

    free(current);
    gm->total_free++;
    return true;
}
```

Grouping provides dramatic performance improvements over pure linked lists while maintaining similar allocation flexibility. Let's quantify these improvements.
Traversal Efficiency:

To enumerate all free blocks, a pure linked list performs one read per free block — F reads in total — while grouping performs one read per container, or ⌈F/N⌉ reads.

For N = 510 and F = 1,000,000: the linked list needs 1,000,000 reads, while grouping needs ⌈1,000,000 / 510⌉ = 1,961 reads — roughly a 510× reduction.

Allocation Patterns:
| Operation | Linked List | Grouping | Improvement |
|---|---|---|---|
| Allocate 1 block | 1 read, 1 write | 1 read, 1 write | None |
| Allocate 100 blocks | 100 reads | 1 read | 100× |
| Allocate 1000 blocks | 1000 reads | 2 reads | 500× |
| Find all free blocks | F reads | F/N reads | N× |
| Free 1 block | 1 write | 1 read + 1 write | -1 read |
The key insight: bulk allocation benefits enormously because reading one group block yields hundreds of available addresses.
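This is easy to demonstrate in code. The sketch below restates the `GroupBlock` layout so it stands alone, and pops up to `want` addresses from a single in-memory container; the disk analogue is one container read serving the whole batch. It is a simplification that ignores group transitions.

```c
#include <stdint.h>

#define ENTRIES_PER_GROUP 511  /* (4096 / 8) - 1, matching the layout above */

typedef struct {
    uint64_t count;
    uint64_t entries[ENTRIES_PER_GROUP];
} GroupBlock;

/* Pop up to 'want' addresses from one group. A linked list would need
 * one disk read per address; here one container read serves them all. */
uint64_t group_allocate_bulk(GroupBlock *g, uint64_t want, uint64_t *out) {
    uint64_t got = (want < g->count) ? want : g->count;
    for (uint64_t i = 0; i < got; i++) {
        g->count--;
        out[i] = g->entries[g->count];  /* LIFO, same as single allocation */
    }
    return got;
}
```

A caller requesting 100 blocks touches one container instead of walking 100 list nodes; only when the group empties does a second read occur.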
| Method | Overhead Size | Overhead % |
|---|---|---|
| Bitmap | 32 MB (fixed) | 0.003% |
| Linked List | ~1 GB (134M × 8 bytes in blocks) | 0.1%* |
| Grouping | ~1 GB (134M/510 containers × 4KB) | ~0.1% |
For linked lists, the '0.1% overhead' represents the storage consumed by pointers within free blocks. For grouping, the overhead is the container blocks themselves: each container occupies one full 4KB block but tracks ~510 free blocks, about 1/511 ≈ 0.2% of the free space. The totals are comparable in size to the linked list, but grouping concentrates the bookkeeping into far fewer blocks, which is what slashes the I/O cost. The bitmap overhead is fixed regardless of utilization.
When Does Grouping Excel?

- Bulk allocation and deallocation: one container read yields hundreds of usable addresses
- Enumerating free space: O(F/N) reads instead of O(F)
- Workloads with LIFO-friendly allocate/free churn, where the head container stays hot in cache

When Does Grouping Struggle?

- Checking whether a specific block is free: still an O(F/N) chain traversal
- Finding contiguous regions: addresses arrive in arbitrary order unless a sorted variant is used
- Concurrency: every operation contends on the single head container
- Crash recovery: one corrupted container orphans every group chained behind it
The basic grouping concept can be enhanced in several ways to address specific performance concerns or operational requirements.
Variant 1: Multi-Level Grouping
Just as file systems use indirect blocks for large files, free space can use multi-level grouping:
Level 0 (Superblock): Points to Level 1 container
│
▼
Level 1 Container: [L2 ptr, L2 ptr, L2 ptr, ... L2 ptr]
│
▼
Level 2 Container: [free, free, free, ... free]
With 510 entries per block, a two-level tree addresses 510 × 510 = 260,100 free blocks, and three levels reach 510³ ≈ 132.7 million.

This reduces the cost of reaching any given container from an O(F/N) chain walk to an O(log_N F) tree descent.
Variant 2: Sorted Grouping
Maintain entries within each group in sorted order by block number:
Group Container (sorted):
[count: 5]
[entry 0: block 100]
[entry 1: block 150]
[entry 2: block 151] ← Note: 150 and 151 are adjacent!
[entry 3: block 200]
[entry 4: block 500]
Benefits:
```c
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

/**
 * Sorted Grouping: Maintain entries in sorted order within each group
 *
 * Enables binary search and easy contiguity detection
 * (ENTRIES_PER_GROUP as defined earlier)
 */

typedef struct {
    uint64_t count;
    uint64_t entries[ENTRIES_PER_GROUP];
} SortedGroupBlock;

/**
 * Find insertion position using binary search
 */
uint64_t find_insert_position(SortedGroupBlock *sg, uint64_t block_num) {
    uint64_t low = 0, high = sg->count;
    while (low < high) {
        uint64_t mid = low + (high - low) / 2;
        if (sg->entries[mid] < block_num) {
            low = mid + 1;
        } else {
            high = mid;
        }
    }
    return low;
}

/**
 * Insert block in sorted order
 */
void sorted_group_insert(SortedGroupBlock *sg, uint64_t block_num) {
    uint64_t pos = find_insert_position(sg, block_num);

    // Shift entries to make room
    for (uint64_t i = sg->count; i > pos; i--) {
        sg->entries[i] = sg->entries[i - 1];
    }

    sg->entries[pos] = block_num;
    sg->count++;
}

/**
 * Check if block_num is free using binary search
 * O(log N) within group
 */
bool sorted_group_contains(SortedGroupBlock *sg, uint64_t block_num) {
    uint64_t pos = find_insert_position(sg, block_num);
    return (pos < sg->count && sg->entries[pos] == block_num);
}

/**
 * Find contiguous runs within a sorted group
 */
typedef struct {
    uint64_t start;
    uint64_t length;
} ContiguousRun;

ContiguousRun* find_contiguous_runs(SortedGroupBlock *sg, uint64_t *run_count) {
    if (sg->count == 0) {
        *run_count = 0;
        return NULL;
    }

    // First pass: count runs
    uint64_t count = 1;
    for (uint64_t i = 1; i < sg->count; i++) {
        if (sg->entries[i] != sg->entries[i-1] + 1) {
            count++;
        }
    }

    ContiguousRun *runs = malloc(count * sizeof(ContiguousRun));
    *run_count = count;

    // Second pass: build runs
    runs[0].start = sg->entries[0];
    runs[0].length = 1;
    uint64_t run_idx = 0;

    for (uint64_t i = 1; i < sg->count; i++) {
        if (sg->entries[i] == sg->entries[i-1] + 1) {
            runs[run_idx].length++;
        } else {
            run_idx++;
            runs[run_idx].start = sg->entries[i];
            runs[run_idx].length = 1;
        }
    }

    return runs;
}

/**
 * Allocate contiguous blocks from sorted group
 * O(N) scan, but contiguity visible immediately
 */
uint64_t sorted_group_alloc_contiguous(SortedGroupBlock *sg, uint64_t needed) {
    uint64_t run_count;
    ContiguousRun *runs = find_contiguous_runs(sg, &run_count);

    for (uint64_t i = 0; i < run_count; i++) {
        if (runs[i].length >= needed) {
            uint64_t result = runs[i].start;
            // Remove allocated blocks from entries
            // (implementation omitted for brevity)
            free(runs);
            return result;
        }
    }

    free(runs);
    return 0;  // No sufficient contiguous run
}
```

Variant 3: Hybrid Grouping with Extent Tracking
Instead of storing individual block addresses, store extents (start, length pairs):
Traditional: [100, 101, 102, 103, 200, 201, 500]
7 entries
Extent-based: [(100, 4), (200, 2), (500, 1)]
3 entries (57% reduction)
With 16 bytes per extent (8 for start, 8 for length), one 4KB block holds 256 extents. If average extent length is 10 blocks, one container represents 2,560 free blocks—5× improvement over simple grouping.
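Converting a sorted address list into extents takes a single linear pass. The sketch below is a hypothetical helper, not part of the manager code above; it compresses a sorted array into (start, length) pairs.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint64_t start;
    uint64_t length;
} Extent;

/* Compress a sorted array of free block numbers into extents.
 * Returns the extent count; the caller frees *out. */
uint64_t pack_extents(const uint64_t *blocks, uint64_t n, Extent **out) {
    if (n == 0) { *out = NULL; return 0; }

    Extent *ext = malloc(n * sizeof(Extent));  /* worst case: no runs at all */
    uint64_t count = 0;
    ext[0] = (Extent){ blocks[0], 1 };

    for (uint64_t i = 1; i < n; i++) {
        if (blocks[i] == ext[count].start + ext[count].length) {
            ext[count].length++;               /* extend the current run */
        } else {
            count++;
            ext[count] = (Extent){ blocks[i], 1 };
        }
    }

    *out = ext;
    return count + 1;
}
```

Applied to the seven-entry example above, this yields exactly the three extents (100, 4), (200, 2), (500, 1).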
This variant converges toward the "Counting" approach covered in the next page.
The grouping technique was implemented in early Unix versions, most notably the Unix Version 7 (V7) file system from 1979. Understanding this historical implementation illuminates how the technique evolved and influenced later designs.
Unix V7 Free List Structure:
The superblock contained:
- `s_nfree`: Count of free blocks in the current group (max 50 in V7)
- `s_free[50]`: Array of free block addresses

When `s_nfree` reached 50, the entire array was copied to a new free block, which became the new head of the chain:
Superblock:
┌────────────┬──────────────────────────────────┐
│ s_nfree │ s_free[0..49] │
│ 50 │ [blk1, blk2, ..., blk49, →grp1] │
└────────────┴──────────────────────────────────┘
│
▼
Group 1 Block:
┌────────────┬──────┐
│ 50 │[...]→┤───▶ Group 2
└────────────┴──────┘
Allocation Algorithm:
```c
// V7-style allocation (simplified)
int alloc() {
    if (s_nfree == 0) {
        return -1;  // No free blocks
    }
    s_nfree--;
    int block = s_free[s_nfree];
    if (s_nfree == 0 && block != 0) {
        // Need to read next group
        // Block contains: [nfree, free[0..49]]
        read_block(block, &s_nfree, s_free);
        // Now we have a new group of 50 blocks
        // And 'block' itself can be returned to caller
    }
    return block;
}
```
Deallocation Algorithm:
```c
void free(int block) {
    if (s_nfree == 50) {
        // Current group is full
        // Write current group to freed block
        write_block(block, s_nfree, s_free);
        // Reset superblock to have just this one entry
        s_nfree = 1;
        s_free[0] = block;  // Points to saved group
    } else {
        s_free[s_nfree] = block;
        s_nfree++;
    }
}
```
Notice how the V7 design elegantly handles the bootstrap problem: when the in-memory group (in the superblock) fills up, the newly freed block becomes the container for the current group. No separate allocation of container blocks is needed—the free blocks themselves rotate through container duty. This is a beautiful example of minimal overhead design.
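The rotation can be watched end to end in a toy simulation. Everything below is hypothetical scaffolding — an in-memory `disk` array standing in for block I/O and a group size of 3 instead of 50 — but the `v7_alloc`/`v7_free` logic mirrors the algorithms above.

```c
#include <string.h>

#define NFREE_MAX 3                  /* toy group size; real V7 used 50 */

static int s_nfree;                  /* superblock: count of cached free blocks */
static int s_free[NFREE_MAX];        /* superblock: cached free block numbers */
static int disk[64][NFREE_MAX + 1];  /* toy disk: room for one saved group per block */

/* Free: when the superblock group is full, the freed block itself
 * becomes the container holding the current group. */
void v7_free(int block) {
    if (s_nfree == NFREE_MAX) {
        disk[block][0] = s_nfree;                        /* save count */
        memcpy(&disk[block][1], s_free, sizeof(s_free)); /* save group */
        s_nfree = 1;
        s_free[0] = block;  /* superblock now chains to the saved group */
    } else {
        s_free[s_nfree++] = block;
    }
}

/* Allocate: when the last cached entry is taken, it names the next
 * container, which is read back into the superblock before returning. */
int v7_alloc(void) {
    if (s_nfree == 0) return -1;
    int block = s_free[--s_nfree];
    if (s_nfree == 0 && block != 0) {
        s_nfree = disk[block][0];
        memcpy(s_free, &disk[block][1], sizeof(s_free));
    }
    return block;  /* 0 means the free list is exhausted */
}
```

Starting from an empty list (`s_nfree = 1`, `s_free[0] = 0` as the end marker), freeing blocks 10 through 13 forces one rotation at block 12; allocating them back returns 13, 12, 11, 10 and then the sentinel 0 — no container block was ever separately allocated.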
| Aspect | Unix V7 | Modern Approach |
|---|---|---|
| Group size | 50 entries (fixed) | 512+ entries (variable) |
| Superblock role | Holds current group | Points to container |
| Block size | 512 bytes | 4KB typical |
| Crash recovery | Minimal (fsck required) | Journaled or COW |
| Sorting | Unsorted | Often sorted for locality |
Why 50?
With 512-byte blocks and 2-byte block addresses (16-bit addressing), 50 entries plus a count field fit comfortably:
50 entries × 2 bytes = 100 bytes
+ count (2 bytes) = 102 bytes
+ padding/alignment ≤ 512 bytes ✓
Modern systems with 4KB blocks and 64-bit addressing could fit 510+ entries, but the principle remains identical.
Grouping introduces specific crash consistency challenges due to the interrelationship between container blocks and the superblock's pointer to them.
Crash Scenarios:
Scenario 1: Crash during allocation when group empties
Before: Superblock → Group A (has 1 entry: block X pointing to Group B)
Step 1: Allocate block X from Group A
Step 2: Read Group B to refill superblock
Step 3: [CRASH before superblock update]
After crash:
- Superblock still points to Group A
- Group A is empty (count=0) or contains stale data
- Block X may have been allocated elsewhere
- Group B is orphaned
Scenario 2: Crash during free when new container created
Before: Superblock group is full (50/50 entries)
Step 1: Write current group content to freed block Y
Step 2: [CRASH before superblock update]
After crash:
- Block Y contains group data (looks like container)
- Superblock still has old full group
- Block Y is neither free nor in any file
Mitigation Strategies:
Write ordering: Always write container blocks before updating superblock pointers
Journaling: Log the group transition as a transaction:
```
BEGIN_TXN
  UPDATE superblock.first_group = new_container
  WRITE new_container content
COMMIT_TXN
```
Redundancy: Keep shadow copy of superblock group in a known location
fsck recovery: Rebuild free list by scanning all inodes and marking non-referenced blocks as free
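The fsck approach is a mark-and-sweep: scan every inode, mark each referenced block, then treat unmarked blocks as free. A minimal sketch, assuming the inode scan has already produced a `referenced` bitmap (both the array and the function name are hypothetical):

```c
#include <stdbool.h>
#include <stdint.h>

/* Rebuild the free-block count from a reference bitmap produced by an
 * inode scan. A real fsck would also rewrite the group chain; this
 * sketch only counts the blocks that would be linked into fresh groups. */
uint64_t rebuild_free_count(const bool *referenced, uint64_t total_blocks,
                            uint64_t first_data_block) {
    uint64_t free_count = 0;
    for (uint64_t b = first_data_block; b < total_blocks; b++) {
        if (!referenced[b]) {
            free_count++;  /* candidate for the rebuilt free list */
        }
    }
    return free_count;
}
```

Because the rebuilt list is derived solely from inode metadata, it is correct even if every container block was corrupted — which is why fsck recovery is the fallback of last resort.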
```c
#include <stdio.h>
#include <stdlib.h>

/**
 * Journaled Grouping Operations
 *
 * Demonstrates how to wrap group transitions in transactions
 * for crash consistency.
 */

typedef struct {
    uint32_t txn_id;
    uint64_t old_first_group;
    uint64_t new_first_group;
    GroupBlock new_group_content;
} GroupTransition;

/**
 * Allocate with crash-safe group transition
 */
uint64_t group_allocate_safe(GroupingManager *gm, Journal *journal) {
    GroupBlock *current = read_group(gm, gm->first_group);

    current->count--;
    uint64_t allocated = current->entries[current->count];

    if (current->count == 0) {
        // Group transition needed - must be atomic
        uint64_t next_group = current->entries[0];

        // Journal the transition
        GroupTransition txn = {
            .txn_id = journal_next_id(journal),
            .old_first_group = gm->first_group,
            .new_first_group = next_group,
        };
        read_group_into(gm, next_group, &txn.new_group_content);

        // Write journal entry
        journal_write(journal, &txn, sizeof(txn));
        journal_commit(journal, txn.txn_id);

        // Now safe to update in-memory state
        gm->first_group = next_group;

        // Persist superblock
        // persist_superblock(gm);

        // Mark journal entry as checkpointed
        journal_checkpoint(journal, txn.txn_id);

        // Return the container block itself
        allocated = txn.old_first_group;
    } else {
        write_group(gm, gm->first_group, current);
    }

    free(current);
    gm->total_free--;
    return allocated;
}

/**
 * Recover after crash - replay committed but unfinished group transitions
 */
void group_recover(GroupingManager *gm, Journal *journal) {
    // Find most recent committed but not checkpointed transition
    GroupTransition *txn = journal_find_pending_group_transition(journal);

    if (txn) {
        printf("Recovering group transition: %lu -> %lu\n",
               txn->old_first_group, txn->new_first_group);

        // Ensure new group is written to disk
        write_group(gm, txn->new_first_group, &txn->new_group_content);

        // Update superblock
        gm->first_group = txn->new_first_group;
        // persist_superblock(gm);

        // Checkpoint
        journal_checkpoint(journal, txn->txn_id);

        printf("Group recovery complete.\n");
    }
}
```

Unlike bitmaps where each bit is independent, grouping creates chains of dependency. A single corrupted container block can make thousands of free blocks inaccessible. This is why modern file systems either use bitmaps (independent) or copy-on-write structures (inherently consistent) rather than linked grouping approaches.
Grouping occupies a middle ground in the spectrum of free space management techniques. Understanding its position helps in selecting the right approach for specific scenarios.
| Criterion | Bitmap | Linked List | Grouping |
|---|---|---|---|
| Space overhead | Fixed (1 bit/block) | Variable (1 ptr/free) | Variable (1 ptr/N free) |
| Single allocation | O(N) scan | O(1) | O(1) amortized |
| Bulk allocation | O(N) scan | O(K) reads | O(K/N) reads |
| Check if free | O(1) | O(F) | O(F/N) |
| Contiguous find | O(N) scan | O(F × log F) | O(F/N) |
| Crash recovery | Simple | Complex | Moderate |
| Parallelism | Excellent | Poor (single head) | Moderate |
Pure grouping is rare in modern file systems. However, the concept influenced designs like ext4's buddy allocator (groups of different sizes) and ZFS's space maps (which group allocation records into digestible chunks before processing). The principle of batching free space information for I/O efficiency remains relevant.
Grouping represents an important optimization over pure linked lists, demonstrating how storing multiple addresses together can dramatically reduce disk I/O. While largely superseded by bitmaps in modern general-purpose file systems, the technique's influence persists in various forms.
What's Next:
Grouping treats all free blocks equally, storing their addresses individually (or in sorted order). The next page explores Counting, which takes the extent concept further—storing only the start and length of free regions rather than enumerating each block. This approach excels when free space tends to form large contiguous regions.
You now understand grouping free space management—how it improves upon linked lists by batching addresses, its historical implementation in Unix, and its trade-offs compared to bitmaps. This technique exemplifies the broader principle of batching to reduce I/O costs.