Throughout this module, we've framed the discussion as arrays versus linked lists — a binary choice. But experienced engineers know that real-world problems often benefit from hybrid approaches that combine the strengths of both structures.
These hybrid data structures aren't just academic curiosities. They power production systems at massive scale, from database storage engines to text editors to memory allocators. Understanding them expands your design toolkit and reveals how creative combination of simple structures can solve complex problems.
This page explores the most important hybrid approaches, when they're appropriate, and how they achieve their performance characteristics.
You will learn about hybrid data structures including arrays of linked lists, unrolled linked lists (linked lists of arrays), skip lists, ropes, and slab-style memory allocators. For each, we'll examine the design rationale, performance trade-offs, and real-world applications.
The most common hybrid structure uses an array where each element is the head of a linked list. This pattern appears throughout computing, most notably in hash tables with chaining and graph adjacency lists.
Design rationale: the array provides O(1) indexed access to any partition, while each linked list grows and shrinks independently, absorbing a variable number of elements per partition without preallocation.
Hash Table with Chaining:
A hash table maps keys to indices via a hash function. Collisions (multiple keys mapping to the same index) are handled by storing all colliding elements in a linked list at that index:
```
// Hash table with chaining
// Array of linked lists

class HashTable:
    buckets: Array<LinkedList>   // Array of size M

    constructor(size):
        buckets = new Array(size)
        for i in 0..size:
            buckets[i] = new LinkedList()   // Empty list per bucket

    insert(key, value):
        index = hash(key) mod buckets.length   // O(1)
        buckets[index].addFirst(key, value)    // O(1)

    find(key):
        index = hash(key) mod buckets.length   // O(1)
        return buckets[index].search(key)      // O(k), k = chain length

    delete(key):
        index = hash(key) mod buckets.length   // O(1)
        buckets[index].remove(key)             // O(k) search + O(1) delete

// Average case (good hash function, load factor ~1):
//   Insert: O(1), Find: O(1), Delete: O(1)
// Worst case (all keys collide):
//   All operations: O(n)
```

Why this works: the array gives constant-time access to the right bucket, and the linked list at each bucket handles however many collisions land there.
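The pseudocode above translates directly to a runnable sketch. This minimal version (names and the use of Python lists as chains are my own choices, not from the source) shows insert and find with per-bucket chains:

```python
# Minimal chaining hash table: an array of buckets, each bucket a chain
# of (key, value) pairs. A sketch, not a production implementation.
class ChainedHashTable:
    def __init__(self, size=8):
        self.buckets = [[] for _ in range(size)]  # one chain per slot

    def _index(self, key):
        return hash(key) % len(self.buckets)      # O(1) bucket lookup

    def insert(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(chain):
            if k == key:                          # update existing key
                chain[i] = (key, value)
                return
        chain.append((key, value))                # O(1) append to chain

    def find(self, key):
        for k, v in self.buckets[self._index(key)]:  # O(chain length)
            if k == key:
                return v
        return None
```

With a good hash and a load factor near 1, the chains stay short and both operations behave as O(1) on average.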
Graph Adjacency Lists:
Graph representations face a similar challenge: each vertex has a variable number of neighbors.
```
// Graph as array of linked lists
// vertices[i] = head of list of neighbors of vertex i

class Graph:
    adjacencyList: Array<LinkedList>   // Index = vertex ID

    constructor(numVertices):
        adjacencyList = new Array(numVertices)
        for i in 0..numVertices:
            adjacencyList[i] = new LinkedList()

    addEdge(from, to):
        adjacencyList[from].addFirst(to)   // O(1)
        // For undirected: adjacencyList[to].addFirst(from)

    neighbors(vertex):
        return adjacencyList[vertex].iterator()   // O(1) to get iterator

    hasEdge(from, to):
        return adjacencyList[from].contains(to)   // O(degree of 'from')

// Space: O(V + E) for V vertices and E edges
// Adding edge: O(1)
// Iterating neighbors: O(degree)
// Checking specific edge: O(degree), but can be O(1) with hash set per vertex
```

Modern hash tables increasingly use open addressing (probing within the array) instead of chaining, for better cache performance. Adjacency lists may use arrays per vertex if edge additions are rare. Consider the specific access patterns before choosing.
An unrolled linked list is a linked list where each node contains an array of elements rather than a single element. This hybrid structure offers a middle ground between arrays and linked lists.
Design rationale: grouping several elements per node restores the cache locality a one-element-per-node list lacks, while keeping nodes linked preserves cheap local insertion without shifting the entire structure.
```
// Unrolled Linked List
// Each node holds multiple elements in an array

const BLOCK_SIZE = 16   // Elements per node (tune for cache line)

class UnrolledNode:
    elements: Array[BLOCK_SIZE]
    count: int            // Current elements in this node
    next: UnrolledNode

class UnrolledLinkedList:
    head: UnrolledNode
    nodeCount: int
    totalElements: int

    insert(index, value):
        // Find the node containing the index
        node, localIndex = findNodeAndIndex(index)
        if node.count < BLOCK_SIZE:
            // Room in current node: shift and insert locally
            shiftRight(node.elements, localIndex)
            node.elements[localIndex] = value
            node.count++
        else:
            // Node full: split into two nodes
            splitNode(node, localIndex, value)

    get(index):
        node, localIndex = findNodeAndIndex(index)
        return node.elements[localIndex]

    findNodeAndIndex(globalIndex):
        node = head
        remaining = globalIndex
        while remaining >= node.count:
            remaining -= node.count
            node = node.next
        return (node, remaining)

// Trade-offs compared to regular linked list:
// + Better cache locality (BLOCK_SIZE elements per fetch)
// + Lower pointer overhead (one pointer per BLOCK_SIZE elements)
// - More complex implementation
// - Some internal fragmentation if nodes aren't full
```

| Aspect | Array | Linked List | Unrolled Linked List |
|---|---|---|---|
| Random access | O(1) | O(n) | O(n/B) where B=block size |
| Insert at position | O(n) | O(1) with reference | O(n/B) + O(B) local shift |
| Pointer overhead | None | 100% (one per element) | ~6% (one per B elements) |
| Cache behavior | Excellent | Poor | Good |
| Memory fragmentation | None | High | Low |
| Implementation complexity | Simple | Moderate | Complex |
Why unrolled linked lists work:
Cache efficiency: When you access one element, nearby elements (in the same node-array) are cached. Iteration becomes much faster.
Reduced overhead: Instead of one pointer per element, you have one pointer per B elements. For B=16, pointer overhead drops from 100% to ~6%.
Maintained flexibility: You can still insert/delete without shifting the entire structure — only the affected node needs modification (or splitting/merging).
Tunable block size: Block size can be tuned for cache line size, element size, and workload characteristics.
Real-world applications: CPython's collections.deque stores its elements in fixed-size blocks linked into a doubly linked list, and cache-conscious sequence libraries in several languages use the same node-of-arrays layout.
Choose block size based on cache line size (typically 64 bytes) and element size. A good heuristic: make each node roughly 1-4 cache lines. Too small: overhead approaches linked list. Too large: insertion overhead approaches array.
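The heuristic above reduces to simple arithmetic. This sketch (the constants are assumptions for illustration: 64-byte cache lines, 8-byte elements) computes a block size and the resulting pointer overhead:

```python
# Block-size arithmetic for the "1-4 cache lines per node" heuristic.
# Assumed: 64-byte cache lines, 8-byte elements (e.g. pointers).
CACHE_LINE = 64
ELEM_SIZE = 8
POINTER_SIZE = 8

def elements_per_node(cache_lines):
    """How many elements fit in a node spanning the given cache lines."""
    return (cache_lines * CACHE_LINE) // ELEM_SIZE

def pointer_overhead(block_size):
    """Fraction of memory spent on the one next-pointer per node."""
    return POINTER_SIZE / (block_size * ELEM_SIZE)

# Two cache lines per node gives B = 16 elements,
# and one pointer per 16 elements is 8/128 = 6.25% overhead.
```

This is where the "~6%" figure in the comparison table comes from: one 8-byte pointer amortized over sixteen 8-byte elements.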
A skip list is a probabilistic data structure that extends sorted linked lists with multiple layers of "express lanes" for faster traversal. It achieves O(log n) search, insert, and delete — comparable to balanced trees — while maintaining some linked list properties.
Design rationale:
A sorted linked list has O(n) search because you must traverse linearly. Skip lists add shortcut pointers at multiple levels:
Search starts at the highest level, moves forward until overshooting, then drops down. This is analogous to binary search but works on a linked structure.
```
// Skip List with probabilistic level assignment

const MAX_LEVEL = 16
const P = 0.5   // Probability of promoting to next level

class SkipNode:
    value: int
    forward: Array<SkipNode>[MAX_LEVEL]   // Pointers at each level

class SkipList:
    header: SkipNode   // Sentinel with value -∞
    level: int         // Current max level in use

    randomLevel():
        lvl = 0
        while random() < P and lvl < MAX_LEVEL - 1:
            lvl++
        return lvl

    search(value):
        current = header
        // Start at highest level, work down
        for i = level down to 0:
            while current.forward[i] != null and current.forward[i].value < value:
                current = current.forward[i]   // Move forward at level i
        // Now at level 0, one step from target
        current = current.forward[0]
        return current != null and current.value == value

    insert(value):
        update = new Array[MAX_LEVEL]   // Track update points
        current = header
        // Find position and track predecessors at each level
        for i = level down to 0:
            while current.forward[i] != null and current.forward[i].value < value:
                current = current.forward[i]
            update[i] = current
        // Insert new node with random height
        newLevel = randomLevel()
        if newLevel > level:
            for i = level + 1 to newLevel:
                update[i] = header
            level = newLevel
        newNode = new SkipNode(value)
        for i = 0 to newLevel:
            newNode.forward[i] = update[i].forward[i]
            update[i].forward[i] = newNode

// Expected complexities:
// Search: O(log n) expected
// Insert: O(log n) expected
// Delete: O(log n) expected
// Space: O(n) expected (2n pointers on average)
```

Real-world usage:
- Redis implements sorted sets with skip lists.
- Java's ConcurrentSkipListMap provides a thread-safe sorted map.

When to choose skip lists over balanced trees:
Redis creator Salvatore Sanfilippo explained choosing skip lists: "Memory usage is about the same as balanced trees [...] But they are simpler to implement, debug, and the algorithms are easier to explain. They also have a very nice property: it's very easy to get an element's rank."
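The "2n pointers on average" figure in the complexity notes above follows from the geometric level distribution: with promotion probability P = 0.5, a node's expected number of forward pointers is 1/(1 − P) = 2. A short simulation (parameters chosen to match the pseudocode) confirms it:

```python
import random

# Simulate the pseudocode's randomLevel(): every node gets the level-0
# pointer; each additional level is granted with probability P = 0.5.
P = 0.5
MAX_LEVEL = 16

def pointer_count():
    """Number of forward pointers a new skip-list node receives."""
    count = 1
    while random.random() < P and count < MAX_LEVEL:
        count += 1
    return count

random.seed(42)  # fixed seed for reproducibility
samples = 100_000
avg = sum(pointer_count() for _ in range(samples)) / samples
# avg comes out very close to the theoretical expectation of 2.0
```

So a skip list spends roughly twice the pointer memory of a plain singly linked list, which is comparable to the two child pointers per node in a binary search tree.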
A rope (or cord) is a binary tree where leaves contain short strings (arrays of characters) and internal nodes represent concatenation. Ropes are designed specifically for text editing, where strings are frequently inserted, deleted, and concatenated.
The problem with strings as arrays:
Text editors face challenging requirements: insertions and deletions at arbitrary positions, documents that may run to many megabytes, responsive per-keystroke editing, and cheap undo/redo.
With a simple character array, every insertion or deletion in the middle shifts all subsequent characters — O(n) work per keystroke, which becomes noticeable on large documents.
How ropes solve this:
```
// Rope: Tree of string segments
// Leaves: short strings (arrays of chars)
// Internal nodes: concatenation + weight (left subtree length)

class RopeNode:
    // Leaf node
    text: String      // Only in leaves
    // Internal node
    left: RopeNode
    right: RopeNode
    weight: int       // Length of left subtree

class Rope:
    root: RopeNode

    // O(log n) to find character at index
    charAt(index):
        return charAtNode(root, index)

    charAtNode(node, index):
        if node is leaf:
            return node.text[index]
        if index < node.weight:
            return charAtNode(node.left, index)
        else:
            return charAtNode(node.right, index - node.weight)

    // O(log n) split: divide rope at position
    split(index):
        // Returns two ropes: chars 0..index-1 and chars index..end
        // Just restructures tree pointers

    // O(log n) concatenate: join two ropes
    concat(rope1, rope2):
        newRoot = new InternalNode()
        newRoot.left = rope1.root
        newRoot.right = rope2.root
        newRoot.weight = size(rope1)
        return new Rope(newRoot)

    // O(log n) insert at position via split + concat
    insert(index, text):
        (left, right) = split(index)
        middle = new Rope(text)
        return concat(concat(left, middle), right)

    // O(log n) delete range via split + concat
    delete(start, end):
        (left, temp) = split(start)
        (_, right) = temp.split(end - start)
        return concat(left, right)
```

| Operation | String/Array | Rope | Notes |
|---|---|---|---|
| Index access | O(1) | O(log n) | Rope trades access speed for edit speed |
| Insert at position | O(n) | O(log n) | Rope: split + concat |
| Delete range | O(n) | O(log n) | Rope: two splits + concat |
| Concatenate | O(n+m) | O(log n) | Rope: create new root node |
| Sequential iteration | O(n), cache-friendly | O(n), more overhead | String is faster for iteration |
| Memory overhead | Minimal | Per-node pointers | Rope uses more memory |
Why ropes matter for text editors:
Interactive editing: Insert/delete in O(log n) instead of O(n). A 10MB document edits as fast as a 10KB document.
Undo/redo efficiency: Splitting and concatenating creates new structure without copying strings. Old versions can be preserved cheaply.
Large file handling: Ropes can represent files larger than available RAM by keeping overflow segments on disk.
Parallel processing: Different subtrees can be processed independently.
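The weight-based index lookup at the heart of the rope can be shown runnable in a few lines. This miniature (class names are my own; only charAt-style traversal and concatenation are shown) follows the pseudocode's rule: go left if the index is below the node's weight, otherwise go right with the weight subtracted:

```python
# Miniature rope: leaves hold short strings, internal nodes hold the
# length of their left subtree ("weight"). A sketch of charAt/concat only.
class Leaf:
    def __init__(self, text):
        self.text = text

    def __len__(self):
        return len(self.text)

    def char_at(self, i):
        return self.text[i]

class Concat:
    def __init__(self, left, right):
        self.left, self.right = left, right
        self.weight = len(left)          # length of the left subtree

    def __len__(self):
        return self.weight + len(self.right)

    def char_at(self, i):
        if i < self.weight:              # index falls in the left subtree
            return self.left.char_at(i)
        return self.right.char_at(i - self.weight)

# Concatenation just creates a new root; no characters are copied.
rope = Concat(Concat(Leaf("Hel"), Leaf("lo, ")), Leaf("world"))
```

Note that building `rope` copied no character data — only two small internal nodes were allocated, which is why rope concatenation is so much cheaper than string concatenation.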
Real-world usage: the SGI STL shipped a rope class for C++, the Xi editor is built around a rope, and some editors (such as Visual Studio Code) use the closely related piece-table approach.
For simpler editors, gap buffers are an alternative: an array with a "gap" at the cursor position. Inserts are O(1) near the cursor and O(n) when moving the cursor far. This is simpler than ropes but less efficient for extreme cases.
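A gap buffer is small enough to sketch in full. In this version (a simplified illustration, not a reference implementation) the "gap" is implicit: characters left of the cursor live in one stack, characters right of it in another, so inserting at the cursor is a single push:

```python
# Minimal gap buffer: two stacks around an implicit gap at the cursor.
class GapBuffer:
    def __init__(self, text=""):
        self.before = list(text)  # characters left of the cursor
        self.after = []           # characters right of the cursor (reversed)

    def insert(self, ch):
        self.before.append(ch)    # O(1) insert at the cursor

    def delete(self):
        if self.before:
            self.before.pop()     # O(1) delete before the cursor

    def move_cursor(self, pos):
        # O(distance): characters cross the gap one at a time
        while len(self.before) > pos:
            self.after.append(self.before.pop())
        while len(self.before) < pos:
            self.before.append(self.after.pop())

    def text(self):
        return "".join(self.before) + "".join(reversed(self.after))
```

The trade-off is visible in `move_cursor`: edits clustered near the cursor are O(1), but jumping far across the document pays O(distance) — exactly the case where a rope wins.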
Memory allocators represent one of the most practical applications of hybrid array-linked-list structures. They combine arrays (for size-class indexing) with linked lists (for tracking free blocks).
The problem:
General-purpose memory allocation (malloc/free) must handle requests of arbitrary sizes, allocation and deallocation in any order, minimal fragmentation, minimal per-block overhead, and fast — ideally O(1) — operation.
Classic slab allocator design:
```
// Simplified slab allocator
// Array of size classes, each with linked free list

SIZE_CLASSES = [8, 16, 32, 64, 128, 256, 512, 1024]   // Bytes

class SlabAllocator:
    freeLists: Array<LinkedList>   // One per size class

    constructor():
        for each sizeClass in SIZE_CLASSES:
            freeLists[sizeClassIndex] = new LinkedList()
            // Preallocate some blocks
            addSlabToFreeList(sizeClassIndex)

    allocate(requestedSize):
        // Find smallest size class that fits
        sizeClass = findSizeClass(requestedSize)   // O(1) lookup
        if freeLists[sizeClass].isEmpty():
            addSlabToFreeList(sizeClass)   // Get more blocks
        // Pop block from free list: O(1)
        block = freeLists[sizeClass].popFront()
        return block

    free(block):
        sizeClass = determineSizeClass(block)
        // Push block onto free list: O(1)
        freeLists[sizeClass].pushFront(block)

// Key insight: The block itself contains the 'next' pointer!
// When free, the block's memory holds the linked list node
// When allocated, that same memory holds user data
// Zero additional memory overhead for free list
```

Why this is brilliant:
Zero overhead for free list: The linked list pointers are stored in the free blocks themselves. A free 64-byte block has room for a 64-bit pointer with plenty to spare.
O(1) allocation and free: Array lookup by size class, then linked list pop/push.
No fragmentation within size classes: All blocks in a class are the same size.
Good cache behavior per slab: Blocks from the same slab are contiguous, improving locality.
Intrusive free list:
The free list is intrusive — the list node is embedded in the data rather than wrapping it:
```
// Intrusive free list: zero overhead
// When block is FREE: first 8 bytes point to next free block
// When block is ALLOCATED: entire block is user data

FreeBlock:
┌────────────────────────────┐
│ next: pointer (8 bytes)    │  ← Only metadata when free
│ [unused space...]          │
└────────────────────────────┘

AllocatedBlock:
┌────────────────────────────┐
│ user data fills entire     │  ← No list overhead
│ block (all bytes usable)   │
└────────────────────────────┘

// Same memory, different interpretation based on state
```

The intrusive pattern (embedding links in the data) appears throughout systems programming. Linux kernel linked lists, high-performance game engines, and database internals all use intrusive structures to eliminate allocation overhead and improve cache behavior.
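The "same memory, different interpretation" idea can be simulated directly. This sketch (an illustration with assumed constants, not real allocator code) models an arena as a bytearray in which each free block's first 8 bytes store the byte offset of the next free block:

```python
import struct

# Simulated intrusive free list over a bytearray "arena" of fixed-size
# blocks. FREE blocks store the next-free offset in their first 8 bytes;
# ALLOCATED blocks carry no metadata at all.
BLOCK = 64
NBLOCKS = 4
END = 0xFFFFFFFFFFFFFFFF  # sentinel: end of the free list

arena = bytearray(BLOCK * NBLOCKS)
free_head = 0

# Thread every block onto the free list through its own first 8 bytes.
for i in range(NBLOCKS):
    nxt = (i + 1) * BLOCK if i + 1 < NBLOCKS else END
    struct.pack_into("<Q", arena, i * BLOCK, nxt)

def allocate():
    """Pop the front block off the free list: O(1)."""
    global free_head
    if free_head == END:
        raise MemoryError("arena exhausted")
    offset = free_head
    (free_head,) = struct.unpack_from("<Q", arena, offset)
    return offset  # caller now owns arena[offset : offset + BLOCK]

def free(offset):
    """Push a block back onto the free list: O(1)."""
    global free_head
    struct.pack_into("<Q", arena, offset, free_head)
    free_head = offset
```

The free list consumes no memory beyond the arena itself: the moment a block is freed, its first 8 bytes are repurposed as the list link.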
Hybrid structures add complexity. Use them when neither pure arrays nor pure linked lists meet requirements, and only when the performance gain justifies the implementation cost.
| Requirement Pattern | Consider This Hybrid | Example Use Case |
|---|---|---|
| O(1) partition access + variable partition sizes | Array of linked lists | Hash table with chaining, adjacency lists |
| Frequent insertion + good cache behavior | Unrolled linked list | Text editor buffers, deques |
| O(log n) sorted operations + simple implementation | Skip list | Concurrent sorted maps, indexes |
| Frequent string manipulation at arbitrary positions | Rope | Text editors, document processing |
| O(1) allocation/free with zero overhead | Intrusive free list | Memory allocators, object pools |
| Need both O(log n) lookup and O(1) cache access | B-tree variants | Database indices, file systems |
Decision guidelines:
Start simple: Use pure arrays or pure linked lists initially. Profile to identify actual bottlenecks.
Quantify the need: Hybrid structures add complexity. Ensure the performance gain is measurable and significant.
Consider maintenance: Will team members understand the hybrid structure? Is there library support?
Think long-term: Hybrid structures may be harder to modify as requirements evolve.
Profile the hybrid: Verify that the hybrid actually improves performance in your specific context. Constants matter.
Don't use skip lists just because they're interesting. Don't implement ropes for a 1000-character text field. Hybrid structures solve specific problems at specific scales. Use them when you have those problems, not in anticipation of problems you might never face.
This module has equipped you with the knowledge to make informed decisions between arrays and linked lists — and to recognize when hybrid approaches offer superior solutions. Let's consolidate the key insights:
Hybrid Structures Summary: arrays of linked lists give O(1) partition access with variable partition sizes; unrolled linked lists trade implementation complexity for cache locality and low pointer overhead; skip lists deliver O(log n) sorted operations with simple code; ropes make editing large text O(log n); and intrusive free lists give allocators O(1) operations with zero metadata overhead.
Module Conclusion: The Decision Framework
You now have a complete framework for data structure selection:
Identify operations: What operations will your code perform most frequently?
Check array criteria: Random access? Sequential iteration? Memory constraints? Sorting? → Favor arrays.
Check linked list criteria: Frequent mid-list changes? Stable references? Merge/split? Real-time? → Favor linked lists.
Consider hybrids: Neither option satisfies requirements? Look for hybrid structures that combine needed properties.
Measure: When in doubt, profile with realistic data. Constants and cache effects can dominate asymptotic complexity.
This isn't a memorization exercise — it's pattern recognition that becomes instinctive with practice. Every time you choose a data structure, you reinforce your intuition for these trade-offs.
You have mastered the comparison between arrays and linked lists — the two fundamental sequential data structures. You understand when to use each, why, and how to recognize situations that call for hybrid approaches. This knowledge will inform every data structure decision throughout your engineering career.