When a process requests memory in a variable partitioning system, the memory manager often faces a choice: multiple free holes may be large enough to satisfy the request. Which should it choose?
This seemingly simple question has occupied computer scientists for decades because the choice profoundly impacts fragmentation, allocation speed, and system longevity. Should we minimize wasted space in each allocation? Should we preserve large holes for future large requests? Should we prioritize speed over optimization? The answers give rise to the classic allocation algorithms: first-fit, best-fit, worst-fit, and next-fit.
Understanding these algorithms—their mechanics, trade-offs, and real-world behaviors—is essential for anyone designing memory allocators or evaluating system performance.
By the end of this page, you will understand: the mechanics and implementation of each major allocation strategy, the theoretical and empirical performance of each approach, when to use each strategy, and how modern systems combine strategies for optimal performance.
The Formal Problem:
Given: a set of free holes, each with a start address and a size, and an allocation request of size s.
Find: a hole whose size is at least s, or a failure indication if no such hole exists.
If multiple holes satisfy the size requirement, the allocation strategy determines which one to select.
Why Does the Choice Matter?
Consider this scenario:
Holes: [10KB at 0x1000] [50KB at 0x5000] [15KB at 0x10000] [80KB at 0x15000]
Request: 12KB
All holes except the 10KB one are suitable. Which should we choose?
Depending on the strategy:
- First-Fit: the 50KB hole at 0x5000 (first one large enough), leaving a 38KB remainder
- Best-Fit: the 15KB hole at 0x10000 (tightest fit), leaving a 3KB remainder
- Worst-Fit: the 80KB hole at 0x15000 (largest), leaving a 68KB remainder
- Next-Fit: whichever suitable hole follows the previous allocation point
Each choice creates a different leftover hole, affecting future allocations differently.
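The difference is concrete enough to compute. A minimal Python sketch of the choices and remainders for this scenario (sizes in KB; next-fit is omitted because its answer depends on where the roving pointer happens to be):

```python
holes = [10, 50, 15, 80]  # free-hole sizes in address order
request = 12

first = next(h for h in holes if h >= request)        # first-fit: 50
best = min(h for h in holes if h >= request)          # best-fit: 15
worst = max(holes) if max(holes) >= request else None # worst-fit: 80

# Each strategy leaves a different remainder hole behind
print(first - request, best - request, worst - request)  # 38 3 68
```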
| Strategy | Selection Criterion | Underlying Philosophy |
|---|---|---|
| First-Fit | First hole large enough | Speed: minimize search time |
| Best-Fit | Smallest hole large enough | Minimize waste: use tightest fit |
| Worst-Fit | Largest hole available | Preserve options: leave largest remainder |
| Next-Fit | First fit after last allocation | Distribute: spread allocations across memory |
Evaluation Criteria:
We evaluate allocation strategies on several dimensions: memory utilization (what fraction of memory holds useful data), fragmentation (how free space is scattered into holes), search speed (how many holes must be examined per allocation), and implementation complexity.
First-Fit is the most straightforward allocation strategy: scan the free list from the beginning and allocate the first hole that is large enough to satisfy the request.
Algorithm:
1. Start at the head of the free list.
2. Scan holes in order until one with size >= request is found.
3. Allocate from that hole; if it is larger than the request, leave the remainder as a smaller hole.
4. If the end of the list is reached without a fit, the allocation fails.
Time Complexity: O(n) worst case, but typically faster in practice due to early termination.
Implementation:
```c
/* First-Fit Allocation Implementation */

typedef struct Hole {
    uint32_t start;
    uint32_t size;
    struct Hole* next;
} Hole;

/**
 * First-Fit: Returns the first hole large enough for the request.
 *
 * Time Complexity: O(n) worst case, often O(1) to O(n/2) in practice
 * Space Complexity: O(1)
 */
Hole* first_fit(Hole* free_list, uint32_t request_size) {
    Hole* current = free_list;
    while (current != NULL) {
        if (current->size >= request_size) {
            return current;  // First suitable hole found
        }
        current = current->next;
    }
    return NULL;  // No hole large enough
}

/**
 * Allocate using first-fit strategy.
 * Returns the starting address of the allocated block, or -1 on failure.
 */
int32_t allocate_first_fit(MemoryManager* mgr, uint32_t size) {
    Hole* hole = first_fit(mgr->free_list, size);
    if (hole == NULL) {
        return -1;  // Allocation failed
    }

    uint32_t address = hole->start;

    // Split or consume the hole
    if (hole->size == size) {
        remove_hole(mgr, hole);
    } else {
        hole->start += size;
        hole->size -= size;
    }

    // Track allocation
    add_allocation(mgr, address, size);
    return address;
}

/*
 * Example trace:
 *
 * Free list (address order):
 *   [5KB @ 0x1000] -> [20KB @ 0x3000] -> [8KB @ 0x8000] -> [50KB @ 0xA000]
 *
 * Request: 10KB
 *
 * Scan:
 *   5KB @ 0x1000  - too small, skip
 *   20KB @ 0x3000 - FITS! Return this hole
 *
 * After allocation:
 *   [5KB @ 0x1000] -> [10KB @ 0x5800] -> [8KB @ 0x8000] -> [50KB @ 0xA000]
 *   (20KB hole split: 10KB allocated, 10KB remainder)
 */
```

Empirical Results:
Extensive simulation studies (Knuth, Shore, and others) have consistently found that first-fit performs surprisingly well despite its simplicity: it achieves memory utilization comparable to best-fit while examining far fewer holes per allocation.
This counter-intuitive result — that the simplest strategy performs well — is one of the more interesting findings in memory allocation research.
The front-loading problem can be mitigated by starting the search from different points (as in next-fit) or by periodically reordering the free list to move large holes to the front.
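The front-loading behavior itself is easy to demonstrate. A small Python sketch (the `first_fit_alloc` helper is illustrative, not taken from a real allocator): starting from one large hole, repeated small first-fit allocations all land at the lowest available addresses.

```python
def first_fit_alloc(holes, size):
    """Allocate `size` from the first (start, size) hole that fits; return start."""
    for i, (start, hole_size) in enumerate(holes):
        if hole_size >= size:
            if hole_size == size:
                holes.pop(i)            # exact fit consumes the hole
            else:
                holes[i] = (start + size, hole_size - size)  # split
            return start
    return None

holes = [(0, 1000)]                      # one big hole at address 0
addrs = [first_fit_alloc(holes, 10) for _ in range(5)]
print(addrs)   # [0, 10, 20, 30, 40] -- allocations pile up at low addresses
print(holes)   # [(50, 950)]
```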
Best-Fit aims to minimize wasted space by selecting the smallest hole that is large enough for the request. The intuition is that leaving the smallest possible remainder should maximize usable memory.
Algorithm:
1. Scan the entire free list (or search a size-ordered tree).
2. Track the smallest hole whose size is at least the request.
3. Allocate from that hole, leaving any remainder as a new, smaller hole.
4. Stop early on an exact fit, since no better fit is possible.
Time Complexity: O(n) for linear list; O(log n) with size-sorted tree.
Implementation:
```c
/* Best-Fit Allocation Implementation */

/**
 * Best-Fit: Returns the smallest hole large enough for the request.
 *
 * Time Complexity: O(n) with linear list, O(log n) with size-sorted tree
 * Space Complexity: O(1)
 */
Hole* best_fit(Hole* free_list, uint32_t request_size) {
    Hole* best = NULL;
    uint32_t best_size = UINT32_MAX;

    Hole* current = free_list;
    while (current != NULL) {
        if (current->size >= request_size && current->size < best_size) {
            best = current;
            best_size = current->size;

            // Optimization: exact fit is optimal
            if (best_size == request_size) {
                break;  // Can't do better than exact fit
            }
        }
        current = current->next;
    }
    return best;
}

/**
 * Best-Fit with size-sorted tree for O(log n) performance.
 * Tree is ordered by hole size; find smallest node >= request.
 */
Hole* best_fit_tree(SizeTree* tree, uint32_t request_size) {
    TreeNode* current = tree->root;
    TreeNode* best = NULL;

    while (current != NULL) {
        if (current->hole->size >= request_size) {
            // This hole fits; record it and look for smaller
            best = current;
            current = current->left;   // Smaller sizes on left
        } else {
            // Too small; need larger
            current = current->right;
        }
    }
    return (best != NULL) ? best->hole : NULL;
}

/*
 * Example trace:
 *
 * Free list:
 *   [50KB] -> [15KB] -> [25KB] -> [8KB] -> [100KB]
 *
 * Request: 20KB
 *
 * Scan all holes:
 *   50KB  - fits, record as best (50KB)
 *   15KB  - too small
 *   25KB  - fits, better than 50KB, update best (25KB)
 *   8KB   - too small
 *   100KB - fits, but 100KB > 25KB, keep current best
 *
 * Result: Allocate from 25KB hole, leaving 5KB remainder
 */
```

The Swiss Cheese Problem:
Best-fit's fatal flaw is that it consistently creates tiny leftover holes. Consider:
| Request | Best-Fit Hole | Remainder |
|---|---|---|
| 98 KB | 100 KB | 2 KB |
| 63 KB | 64 KB | 1 KB |
| 31 KB | 32 KB | 1 KB |
| 126 KB | 128 KB | 2 KB |
After many allocations, memory is littered with 1-2 KB fragments. Total free might be 50 KB, spread across 30 holes, with the largest only 3 KB. This is the "Swiss cheese" effect — looks like there's space, but it's all gaps.
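The table's arithmetic can be replayed in a few lines of Python (a sketch; hole sizes in KB, addresses ignored):

```python
def best_fit_remainder(holes, request):
    """Pick the smallest hole that fits; return (chosen_hole, remainder)."""
    candidates = [h for h in holes if h >= request]
    if not candidates:
        return None
    chosen = min(candidates)
    return chosen, chosen - request

holes = [100, 64, 32, 128]           # KB
for req in (98, 63, 31, 126):        # the requests from the table
    hole, rem = best_fit_remainder(holes, req)
    holes.remove(hole)
    holes.append(rem)                # the sliver goes back on the free list
    print(req, "->", hole, "leaves", rem)

print(sorted(holes))  # [1, 1, 2, 2] -- 6 KB free, largest piece only 2 KB
```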
Empirical Finding:
Best-fit typically produces more fragmentation than first-fit, not less. The intuition that minimizing per-allocation waste minimizes total waste is incorrect. The tiny fragments created by best-fit are useless, while first-fit's larger remainders remain usable.
Despite its name, best-fit is typically outperformed by first-fit in both simulation and production. Use best-fit only when allocation sizes cluster tightly around specific values where exact or near-exact fits are common.
Worst-Fit takes the opposite approach from best-fit: always allocate from the largest available hole. The philosophy is that by using the largest hole, the remainder will also be large — large enough to satisfy future requests.
Algorithm:
1. Scan the free list (or consult a cached pointer to the largest hole).
2. Identify the hole with the greatest size.
3. If that hole is at least the request size, allocate from it; otherwise the allocation fails.
Time Complexity: O(n) for linear list; O(log n) or O(1) with size tracking.
Implementation:
```c
/* Worst-Fit Allocation Implementation */

/**
 * Worst-Fit: Returns the largest hole, if it satisfies the request.
 *
 * Time Complexity: O(n) without caching, O(1) with largest-hole cache
 * Space Complexity: O(1)
 */
Hole* worst_fit(Hole* free_list, uint32_t request_size) {
    Hole* worst = NULL;
    uint32_t worst_size = 0;

    Hole* current = free_list;
    while (current != NULL) {
        if (current->size > worst_size) {
            worst = current;
            worst_size = current->size;
        }
        current = current->next;
    }

    // Return only if large enough
    return (worst != NULL && worst_size >= request_size) ? worst : NULL;
}

/**
 * Optimized worst-fit with cached largest hole.
 * Requires updating cache on every allocation/deallocation.
 */
typedef struct {
    Hole* free_list;
    Hole* largest_hole;     // Cached pointer to largest hole
    uint32_t largest_size;  // Cached size of largest hole
} OptimizedFreeList;

Hole* worst_fit_cached(OptimizedFreeList* fl, uint32_t request_size) {
    // O(1) check
    if (fl->largest_size >= request_size) {
        return fl->largest_hole;
    }
    return NULL;  // Even largest hole is too small
}

void update_largest_cache(OptimizedFreeList* fl) {
    fl->largest_hole = NULL;
    fl->largest_size = 0;

    Hole* current = fl->free_list;
    while (current != NULL) {
        if (current->size > fl->largest_size) {
            fl->largest_hole = current;
            fl->largest_size = current->size;
        }
        current = current->next;
    }
}

/*
 * Example trace:
 *
 * Free list:
 *   [15KB] -> [50KB] -> [8KB] -> [100KB] -> [25KB]
 *
 * Request: 20KB
 *
 * Find largest: 100KB hole
 * Allocate from 100KB hole
 * Remainder: 80KB hole (still large, usable for future requests)
 *
 * Compare to best-fit on same request:
 *   Would use 25KB hole, leaving only 5KB remainder
 */
```

Why Worst-Fit Performs Worst:
The intuition behind worst-fit seems reasonable: large remainders should be more useful than small ones. But the strategy has a critical flaw: it destroys large holes.
Consider this sequence:
| Time | Request | Largest Hole | After Allocation |
|---|---|---|---|
| T0 | — | 1000 KB | — |
| T1 | 50 KB | 1000 KB | 950 KB |
| T2 | 30 KB | 950 KB | 920 KB |
| T3 | 40 KB | 920 KB | 880 KB |
| T4 | 60 KB | 880 KB | 820 KB |
| ... | ... | ... | ... |
| Tn | 500 KB | 450 KB | FAILS |
Each small allocation nibbles away at the only large hole. Eventually, no hole can satisfy large requests, even though significant total free memory exists.
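A short Python sketch of the table above (sizes in KB; the requests past T4 stand in for the elided rows, chosen here for illustration):

```python
holes = [1000, 450]  # KB; the 1000 KB hole is the only truly large one

def worst_fit_alloc(holes, size):
    """Carve `size` out of the largest hole; return False if it can't fit."""
    largest = max(holes)
    if largest < size:
        return False
    holes[holes.index(largest)] = largest - size
    return True

# Small requests keep nibbling the largest hole...
for req in (50, 30, 40, 60, 70, 80, 90, 100, 60):
    assert worst_fit_alloc(holes, req)

print(holes)                        # [420, 450] -- 870 KB still free
print(worst_fit_alloc(holes, 500))  # False -- yet a 500 KB request fails
```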
Empirical Verdict:
Worst-fit consistently underperforms first-fit and best-fit in virtually all benchmarks. It is primarily of academic interest and is rarely used in production systems.
Worst-fit can be appropriate in narrow scenarios: when all allocations are nearly the same size (no need to preserve large holes), or when external constraints ensure large holes are regularly restored (e.g., periodic compaction).
Next-Fit is a variation of first-fit that addresses its front-loading problem. Instead of always starting from the beginning of the free list, next-fit resumes from where the last allocation ended.
Algorithm:
1. Start the scan at the hole following the last allocation (the "roving pointer"), not at the head of the list.
2. Scan forward, wrapping around to the head when the end of the list is reached.
3. Allocate from the first hole that fits and advance the roving pointer past it.
4. If the scan returns to its starting point, the allocation fails.
Time Complexity: O(n) worst case, often O(n/k) where k is the number of allocations per cycle.
Implementation:
```c
/* Next-Fit Allocation Implementation */

typedef struct CircularFreeList {
    Hole* head;
    Hole* last_allocated;  // Position to resume search
    uint32_t hole_count;
} CircularFreeList;

/**
 * Next-Fit: Start search from last allocation point.
 * Distributes load across memory, avoiding first-fit's front-loading.
 *
 * Time Complexity: O(n) worst case, often much better
 * Space Complexity: O(1)
 */
Hole* next_fit(CircularFreeList* fl, uint32_t request_size) {
    if (fl->head == NULL) {
        return NULL;
    }

    // Start from last allocated position, or head if none
    Hole* start = (fl->last_allocated != NULL) ? fl->last_allocated : fl->head;
    Hole* current = start;
    bool wrapped = false;

    do {
        if (current->size >= request_size) {
            fl->last_allocated = current;  // Update position
            return current;
        }
        current = current->next;

        // Wrap around to head if we reach the end
        if (current == NULL) {
            current = fl->head;
            wrapped = true;
        }

        // If we've wrapped and returned to start, no fit exists
        if (wrapped && current == start) {
            break;
        }
    } while (current != start);

    return NULL;  // No suitable hole found
}

/**
 * Alternative implementation with doubly-linked circular list.
 * Cleaner wraparound logic.
 */
typedef struct CircularHole {
    uint32_t start;
    uint32_t size;
    struct CircularHole* next;
    struct CircularHole* prev;
} CircularHole;

CircularHole* next_fit_circular(CircularHole** roving_ptr, uint32_t request_size) {
    if (*roving_ptr == NULL) {
        return NULL;
    }

    CircularHole* start = *roving_ptr;
    CircularHole* current = start;

    do {
        if (current->size >= request_size) {
            // Move roving pointer to after this allocation
            *roving_ptr = current->next;
            return current;
        }
        current = current->next;
    } while (current != start);

    return NULL;  // Wrapped around, nothing fits
}

/*
 * Example trace with next-fit:
 *
 * Circular free list (roving pointer at *):
 *   [10KB] -> *[25KB] -> [5KB] -> [30KB] -> [back to 10KB]
 *
 * Request 1: 15KB
 *   Start at 25KB (roving pointer)
 *   25KB fits! Allocate, split to [10KB]
 *   Update roving to point to [5KB]
 *
 * Request 2: 8KB
 *   Start at [5KB]
 *   5KB too small
 *   30KB fits! Allocate [8KB], leave [22KB]
 *   Update roving to point to [10KB] (wrapped)
 *
 * Benefit: Allocations spread across memory rather than
 * clustering at front like first-fit.
 */
```

Comparison: First-Fit vs Next-Fit:
| Aspect | First-Fit | Next-Fit |
|---|---|---|
| Fragmentation pattern | Concentrated at low addresses | Distributed evenly |
| Average search length | Shorter (if fits found early) | More uniform |
| Coalescing efficiency | Good (adjacent small holes at front) | Poorer (holes scattered) |
| Large hole preservation | Better (less disturbance at high addresses) | Worse (all regions used) |
| Overall fragmentation | Slightly better | Slightly worse |
Verdict:
Next-fit solves first-fit's clustering problem but introduces its own issues. In most benchmarks, first-fit edges out next-fit slightly in overall memory utilization. However, next-fit can provide more consistent allocation times, which may matter in real-time systems.
Decades of simulation studies and theoretical analysis have established clear performance rankings among allocation strategies. Here we consolidate the key findings.
Simulation Methodology:
Standard benchmarks model: request sizes drawn from a distribution (uniform, exponential, or trace-based), a randomized mix of allocations and deallocations (e.g., 60% allocate, 40% free), long runs of tens of thousands of operations, and periodic sampling of utilization and fragmentation.
Performance Summary:
| Strategy | Memory Utilization | Fragmentation Level | Search Speed | Overall Rank |
|---|---|---|---|---|
| First-Fit | High (85-95%) | Moderate | Fast (avg) | 1st - Best overall |
| Best-Fit | Moderate (80-90%) | High (many tiny holes) | Slow (must scan all) | 3rd |
| Worst-Fit | Low (70-85%) | High (large holes destroyed) | Fast (with cache) | 4th - Worst |
| Next-Fit | High (82-92%) | Moderate-High | Fast (distributed) | 2nd |
```python
"""
Simulation comparing allocation strategies.

Simulates 100,000 allocation/deallocation operations
with uniform random sizes between 1KB and 64KB.
"""

import random
from collections import namedtuple

Result = namedtuple('Result', [
    'name',
    'avg_utilization',     # Average memory utilization %
    'avg_fragmentation',   # Average fragmentation index
    'failed_allocations',  # Count of allocation failures
    'avg_search_length'    # Average holes scanned per allocation
])

def coalesce(sorted_holes):
    """Merge adjacent holes in an address-sorted list of (start, size) tuples."""
    merged = []
    for start, size in sorted_holes:
        if merged and merged[-1][0] + merged[-1][1] == start:
            merged[-1] = (merged[-1][0], merged[-1][1] + size)
        else:
            merged.append((start, size))
    return merged

def run_simulation(strategy, operations=100000, memory_size=10_000_000):
    """
    Run allocation simulation with given strategy.
    Returns Result tuple with performance metrics.
    """
    # Memory state
    holes = [(0, memory_size)]  # List of (start, size) tuples
    allocated = []              # List of (start, size) tuples

    # Metrics
    total_searches = 0
    search_count = 0
    failed_allocs = 0
    utilizations = []
    fragmentations = []

    for _ in range(operations):
        # 60% allocate, 40% deallocate
        if random.random() < 0.6 or not allocated:
            # Allocation
            size = random.randint(1000, 64000)
            result = strategy(holes, size)
            if result:
                hole_idx, searches = result
                total_searches += searches
                search_count += 1

                # Perform allocation
                start, hole_size = holes[hole_idx]
                allocated.append((start, size))
                if hole_size == size:
                    holes.pop(hole_idx)
                else:
                    holes[hole_idx] = (start + size, hole_size - size)
            else:
                failed_allocs += 1
        else:
            # Deallocation
            idx = random.randint(0, len(allocated) - 1)
            freed = allocated.pop(idx)

            # Add hole and coalesce
            holes.append(freed)
            holes = coalesce(sorted(holes))

        # Record metrics periodically
        if len(allocated) > 0 and len(holes) > 0:
            total_allocated = sum(s for _, s in allocated)
            total_free = sum(s for _, s in holes)
            largest_hole = max(s for _, s in holes)

            utilization = total_allocated / memory_size
            fragmentation = 1 - (largest_hole / max(total_free, 1))

            utilizations.append(utilization)
            fragmentations.append(fragmentation)

    return Result(
        name=strategy.__name__,
        avg_utilization=sum(utilizations) / len(utilizations) * 100,
        avg_fragmentation=sum(fragmentations) / len(fragmentations) * 100,
        failed_allocations=failed_allocs,
        avg_search_length=total_searches / max(search_count, 1)
    )

# Strategy implementations
def first_fit(holes, size):
    for i, (start, hole_size) in enumerate(holes):
        if hole_size >= size:
            return (i, i + 1)
    return None

def best_fit(holes, size):
    best_idx = None
    best_size = float('inf')
    searches = 0
    for i, (start, hole_size) in enumerate(holes):
        searches += 1
        if hole_size >= size and hole_size < best_size:
            best_idx = i
            best_size = hole_size
    return (best_idx, searches) if best_idx is not None else None

def worst_fit(holes, size):
    worst_idx = None
    worst_size = 0
    searches = 0
    for i, (start, hole_size) in enumerate(holes):
        searches += 1
        if hole_size > worst_size:
            worst_idx = i
            worst_size = hole_size
    if worst_idx is not None and worst_size >= size:
        return (worst_idx, searches)
    return None

# Run and compare
print("Strategy Comparison Results")
print("=" * 70)

for strategy in [first_fit, best_fit, worst_fit]:
    result = run_simulation(strategy)
    print(f"\n{result.name}:")
    print(f"  Avg Utilization:    {result.avg_utilization:.1f}%")
    print(f"  Avg Fragmentation:  {result.avg_fragmentation:.1f}%")
    print(f"  Failed Allocations: {result.failed_allocations}")
    print(f"  Avg Search Length:  {result.avg_search_length:.1f} holes")

"""
Typical Output:

first_fit:
  Avg Utilization:    89.2%
  Avg Fragmentation:  42.3%
  Failed Allocations: 156
  Avg Search Length:  4.2 holes

best_fit:
  Avg Utilization:    85.7%
  Avg Fragmentation:  58.9%
  Failed Allocations: 312
  Avg Search Length:  128.5 holes

worst_fit:
  Avg Utilization:    78.3%
  Avg Fragmentation:  61.7%
  Failed Allocations: 478
  Avg Search Length:  128.5 holes
"""
```

Key Findings:
First-fit wins overall — It balances search speed and fragmentation better than alternatives.
Best-fit underperforms — Despite minimizing per-allocation waste, it creates more unusable fragments.
Worst-fit is genuinely worst — It destroys large holes needed for large requests.
Next-fit is a good alternative — Slightly worse fragmentation but more uniform behavior.
Workload Sensitivity:
These results assume random workloads. Real systems often have patterns:
Strategy choice may vary with workload characteristics.
Production memory allocators rarely use a single pure strategy. Instead, they combine multiple strategies to capture the benefits of each while mitigating their weaknesses.
Segregated Free Lists:
The most common modern approach maintains separate free lists for different size classes:
List[0]: Blocks of size 16-31 bytes
List[1]: Blocks of size 32-63 bytes
List[2]: Blocks of size 64-127 bytes
...
List[n]: Large blocks (overflow)
Allocation Process:
1. Round the request up to its size class.
2. Pop a block from that class's free list in O(1).
3. If that list is empty, try the next larger class, or carve a new block from the large list.
4. On free, push the block back onto its class's list in O(1).
This is essentially best-fit with O(1) access — the size class naturally provides a close fit, and extracting any block from the list is instant.
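The class mapping can be sketched in Python (mirroring the power-of-two size classes described above; `size_to_class` and `class_size` are illustrative names):

```python
MIN_BLOCK = 16     # smallest class, in bytes
NUM_CLASSES = 12   # classes double: 16, 32, 64, ...

def size_to_class(size):
    """Index of the smallest power-of-two class that can hold `size`."""
    c, threshold = 0, MIN_BLOCK
    while threshold < size and c < NUM_CLASSES - 1:
        threshold <<= 1
        c += 1
    return c

def class_size(c):
    """Block size served by class `c`."""
    return MIN_BLOCK << c

for req in (10, 16, 17, 100, 1000):
    print(req, "->", class_size(size_to_class(req)))  # 16, 16, 32, 128, 1024
```

Because classes double, a request is never rounded up by more than a factor of two, bounding internal fragmentation at the cost of some wasted bytes per block.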
```c
/* Modern hybrid allocation strategy */

#define NUM_SIZE_CLASSES 12
#define MIN_BLOCK_SIZE   16
#define LARGE_THRESHOLD  2048

typedef struct FreeBlock {
    struct FreeBlock* next;
    // Size is implicit from which list the block is in
} FreeBlock;

typedef struct {
    FreeBlock* small_lists[NUM_SIZE_CLASSES];  // Segregated lists
    Hole* large_list;                          // For large blocks
    uint32_t class_sizes[NUM_SIZE_CLASSES];
} HybridAllocator;

void init_allocator(HybridAllocator* alloc) {
    // Size classes: 16, 32, 64, 128, 256, 512, 1024, 2048, ...
    for (int i = 0; i < NUM_SIZE_CLASSES; i++) {
        alloc->small_lists[i] = NULL;
        alloc->class_sizes[i] = MIN_BLOCK_SIZE << i;
    }
    alloc->large_list = NULL;
}

int size_to_class(uint32_t size) {
    if (size <= MIN_BLOCK_SIZE) return 0;

    // Find ceiling log2
    int class_idx = 0;
    uint32_t threshold = MIN_BLOCK_SIZE;
    while (threshold < size && class_idx < NUM_SIZE_CLASSES - 1) {
        threshold <<= 1;
        class_idx++;
    }
    return class_idx;
}

/**
 * Hybrid allocation:
 * - Small requests: O(1) from segregated lists
 * - Large requests: First-fit from large list
 */
void* hybrid_allocate(HybridAllocator* alloc, uint32_t size) {
    if (size < LARGE_THRESHOLD) {
        // Small allocation: use segregated lists
        int target_class = size_to_class(size);

        // Try exact class first, then larger classes
        for (int c = target_class; c < NUM_SIZE_CLASSES; c++) {
            if (alloc->small_lists[c] != NULL) {
                FreeBlock* block = alloc->small_lists[c];
                alloc->small_lists[c] = block->next;
                return (void*)block;
            }
        }

        // Split from large list if all small lists empty
        Hole* hole = first_fit(alloc->large_list, alloc->class_sizes[target_class]);
        if (hole != NULL) {
            // Carve block from hole
            void* result = (void*)hole->start;
            hole->start += alloc->class_sizes[target_class];
            hole->size -= alloc->class_sizes[target_class];
            if (hole->size == 0) {
                remove_hole(alloc->large_list, hole);
            }
            return result;
        }
        return NULL;  // Out of memory
    } else {
        // Large allocation: first-fit from large list
        Hole* hole = first_fit(alloc->large_list, size);
        if (hole == NULL) return NULL;

        void* result = (void*)hole->start;
        if (hole->size == size) {
            remove_hole(alloc->large_list, hole);
        } else {
            hole->start += size;
            hole->size -= size;
        }
        return result;
    }
}

/**
 * Hybrid deallocation:
 * - Small blocks: O(1) return to appropriate list
 * - Large blocks: Return to large list with coalescing
 */
void hybrid_free(HybridAllocator* alloc, void* ptr, uint32_t size) {
    if (size < LARGE_THRESHOLD) {
        // Return to segregated list
        int class_idx = size_to_class(size);
        FreeBlock* block = (FreeBlock*)ptr;
        block->next = alloc->small_lists[class_idx];
        alloc->small_lists[class_idx] = block;
    } else {
        // Return to large list with coalescing
        add_hole_with_coalesce(&alloc->large_list, (uint32_t)ptr, size);
    }
}
```

Other Hybrid Techniques:
1. Buddy System (covered in later module): Power-of-two sized blocks with structured splitting and merging.
2. Slab Allocation: Pre-allocate pools of common object sizes (e.g., process structures, file handles). Used extensively in Linux kernel.
3. Thread-Local Caches: Each thread maintains private free lists. Allocations first check thread-local storage (no synchronization), falling back to global with work-stealing.
4. Size-Ordered + Address-Ordered Trees: Maintain both orderings with cross-references. Use size tree for best-fit search, address tree for coalescing.
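Slab allocation (technique 2 above) can be sketched as a minimal object pool in Python. This is illustrative only, not the Linux kernel API: preallocate N same-sized slots and serve alloc/free from a free list in O(1).

```python
class Slab:
    """Toy slab: a pool of `count` fixed-size slots with an O(1) free list."""
    def __init__(self, obj_size, count):
        self.storage = bytearray(obj_size * count)  # one contiguous backing area
        self.obj_size = obj_size
        self.free = list(range(count))              # indices of free slots

    def alloc(self):
        if not self.free:
            return None          # slab exhausted
        return self.free.pop()   # O(1): pop a free slot index

    def release(self, idx):
        self.free.append(idx)    # O(1): slot goes straight back on the list

slab = Slab(obj_size=64, count=4)
a, b = slab.alloc(), slab.alloc()
slab.release(a)
print(slab.alloc() == a)  # True -- freed slot is reused immediately
```

Because every slot has the same size, there is no fragmentation within a slab and no search at all: the strategy question simply disappears for these objects.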
The Modern Reality:
Production allocators (glibc malloc, jemalloc, tcmalloc) use sophisticated combinations of these techniques, tuned for specific workload characteristics. The pure strategies (first-fit, best-fit, etc.) are building blocks, not final solutions.
When designing custom allocators: Use segregated lists for workloads with predictable size distributions. Use first-fit for general-purpose allocation. Consider thread-local caches if allocation is a concurrency bottleneck. Benchmark with realistic workloads — synthetic tests often mislead.
Partition allocation strategies represent decades of research into optimizing memory utilization while maintaining performance. Understanding their trade-offs is essential for systems programming and OS design.
What's Next:
With allocation strategies understood, we turn to the complementary operation: deallocation. When a process terminates or releases memory, the system must reclaim the partition, merge with adjacent holes, and update all tracking structures. Efficient deallocation is as critical as efficient allocation to overall system performance.
You now have a comprehensive understanding of partition allocation strategies: first-fit, best-fit, worst-fit, next-fit, their implementations, trade-offs, and empirical performance. This knowledge is directly applicable to understanding and designing memory allocators at any level of abstraction.