We've established that insertion and deletion in arrays are O(n) operations. But understanding Big O notation is only the beginning. To truly internalize why this matters, we need to examine the mechanics of these operations — what actually happens in memory, why it's unavoidable, and how the costs compound in practice.
This page takes a deep dive into the most problematic array operations: inserting and deleting elements at positions other than the end. By the time we're done, you'll not only understand the cost intellectually but feel why a different data structure is needed.
By the end of this page, you will understand: (1) the exact mechanics of array insertion and deletion, (2) why contiguous storage makes these operations inherently expensive, (3) how costs scale with data size and operation frequency, and (4) real-world scenarios where these costs become prohibitive.
Let's trace through exactly what happens when we insert an element into an array. Understanding each step reveals why the operation cannot be faster than O(n) for non-end insertions.
The fundamental constraint:
Arrays guarantee that array[i] and array[i+1] are adjacent in memory. This means when we insert at position k, elements at positions k, k+1, ..., n-1 must all move one position right.
There is no shortcut. No clever trick. The elements must physically relocate.
```python
def insert_at_position(arr, position, value):
    """
    Insert 'value' at 'position' in 'arr'.
    Precondition: arr has one empty (placeholder) slot at the end,
    so the logical size is len(arr) - 1.
    This demonstrates exactly what happens during insertion.
    """
    n = len(arr) - 1  # logical size: the last physical slot is the empty one
    if position < 0 or position > n:
        raise IndexError("Position out of bounds")

    # Step 1: Shift all elements from position to end, moving RIGHT.
    # We start from the END to avoid overwriting data we haven't moved yet.
    for i in range(n - 1, position - 1, -1):
        arr[i + 1] = arr[i]  # Each shift is one operation
        # When i = n-1:      arr[n]          = arr[n-1]
        # When i = n-2:      arr[n-1]        = arr[n-2]
        # ...
        # When i = position: arr[position+1] = arr[position]

    # Step 2: Place the new value at the now-empty position
    arr[position] = value

    # Total operations:
    # - For position = 0: shift n elements + 1 write = n + 1 operations
    # - For position = n: shift 0 elements + 1 write = 1 operation
    # - General case:     n - position + 1 operations
```

Why we shift from right to left:
This detail matters! If we shifted left-to-right, we'd overwrite elements before moving them:
```
WRONG: Shifting left-to-right

Array: [A, B, C, D, E]
Insert X at position 1

Step 1: arr[2] = arr[1] → [A, B, B, D, E]   // C is LOST!
Step 2: arr[3] = arr[2] → [A, B, B, B, E]   // D is LOST!
...

CORRECT: Shifting right-to-left

Array: [A, B, C, D, E, _]   (with slack space)
Insert X at position 1

Step 1: arr[5] = arr[4] → [A, B, C, D, E, E]
Step 2: arr[4] = arr[3] → [A, B, C, D, D, E]
Step 3: arr[3] = arr[2] → [A, B, C, C, D, E]
Step 4: arr[2] = arr[1] → [A, B, B, C, D, E]
Step 5: arr[1] = X      → [A, X, B, C, D, E] ✓
```

Each `arr[i+1] = arr[i]` involves reading from one memory location and writing to another. For a large array, a single insertion can mean millions of these memory operations, and once the data no longer fits in cache, each access becomes one of the costlier primitive operations a CPU performs.
Deletion is the mirror image of insertion. Instead of making room, we fill a gap. Instead of shifting right, we shift left.
The gap problem:
After removing element at position k, we have a gap. But arrays cannot have gaps — the contiguity guarantee requires that array[k] contains a valid element. So elements at positions k+1, k+2, ..., n-1 must all shift left by one.
```python
def delete_at_position(arr, position, n):
    """
    Delete element at 'position' from 'arr'.
    'n' is the current logical size of the array.
    Returns the new logical size.
    This demonstrates exactly what happens during deletion.
    """
    if position < 0 or position >= n:
        raise IndexError("Position out of bounds")

    # Step 1: Shift all elements after position LEFT.
    # We start from the position and move toward the end.
    for i in range(position, n - 1):
        arr[i] = arr[i + 1]  # Each shift is one operation
        # When i = position:   arr[position]   = arr[position+1]
        # When i = position+1: arr[position+1] = arr[position+2]
        # ...
        # When i = n-2:        arr[n-2]        = arr[n-1]

    # Step 2: The last logical position is now invalid
    # (Optionally clear it, though it's now outside our logical array)
    arr[n - 1] = None

    # Total operations:
    # - For position = 0:   shift n-1 elements = n-1 operations
    # - For position = n-1: shift 0 elements   = 0 operations
    # - General case:       n - position - 1 operations
    return n - 1  # New logical size
```

Visualizing a front deletion:
```
Array with 6 elements: [A, B, C, D, E, F]
Delete at position 0 (remove A)

Memory layout before:
Address: 100  104  108  112  116  120
Value:   [A]  [B]  [C]  [D]  [E]  [F]

Step 1: arr[0] = arr[1]  → [B, B, C, D, E, F]
Step 2: arr[1] = arr[2]  → [B, C, C, D, E, F]
Step 3: arr[2] = arr[3]  → [B, C, D, D, E, F]
Step 4: arr[3] = arr[4]  → [B, C, D, E, E, F]
Step 5: arr[4] = arr[5]  → [B, C, D, E, F, F]
Step 6: Clear position 5 → [B, C, D, E, F, _]

Total: 5 shift operations for a 6-element array
In general: n-1 operations for an n-element array = O(n)
```

A queue implemented with an array removes from the front repeatedly. Each removal is O(n). Processing n items takes O(n²) time total! This is why production queue implementations use circular arrays (fixed size) or linked lists (dynamic size).
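You can watch this quadratic blow-up directly by draining a queue built on a plain Python list versus one built on `collections.deque`, which supports O(1) removal at both ends. This is a rough benchmark sketch; exact timings vary by machine:

```python
# Rough benchmark: draining n items from the front of a list vs. a deque.
# Exact timings vary by machine; the shape of the growth is the point.
import time
from collections import deque

def drain_list(n):
    q = list(range(n))
    while q:
        q.pop(0)                  # O(n) each time: shifts every remaining element

def drain_deque(n):
    q = deque(range(n))
    while q:
        q.popleft()               # O(1) each time

for n in [10_000, 20_000, 40_000]:
    for fn in (drain_list, drain_deque):
        start = time.perf_counter()
        fn(n)
        print(f"{fn.__name__}({n:>6}): {time.perf_counter() - start:.3f}s")

# Doubling n roughly quadruples drain_list's time (O(n²) total work)
# but only doubles drain_deque's time (O(n) total work).
```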
Let's move from abstract O(n) to concrete numbers. Understanding the actual scale helps build intuition for when array costs become problematic.
Single operation costs:
| Array Size | Front Insert/Delete | Middle Insert/Delete | End Insert/Delete |
|---|---|---|---|
| 100 | 100 shifts | 50 shifts | 0-1 shifts |
| 1,000 | 1,000 shifts | 500 shifts | 0-1 shifts |
| 10,000 | 10,000 shifts | 5,000 shifts | 0-1 shifts |
| 100,000 | 100,000 shifts | 50,000 shifts | 0-1 shifts |
| 1,000,000 | 1,000,000 shifts | 500,000 shifts | 0-1 shifts |
Multiple operation costs (cumulative):
Single operations often seem affordable, but operations accumulate:
| Scenario | Operations | Total Shifts | Complexity |
|---|---|---|---|
| n front insertions (building up from empty) | n | 0 + 1 + ... + (n-1) = n(n-1)/2 | O(n²) |
| n random-position insertions (building up from empty) | n | about half that on average ≈ n²/4 | O(n²) |
| n front deletions (draining an n-element array) | n | (n-1) + (n-2) + ... + 0 = n(n-1)/2 | O(n²) |
| n end insertions | n | 0 (ignoring resize) | O(n) |
| n end deletions | n | 0 | O(n) |
Time estimation:
Modern memory operations take roughly 1 nanosecond for cached access, 100+ nanoseconds for main memory. Let's estimate conservatively with 10ns per shift:
| Array Size | Shifts | Time @ 10ns/shift |
|---|---|---|
| 1,000 | 1,000 | 10 microseconds |
| 10,000 | 10,000 | 100 microseconds |
| 100,000 | 100,000 | 1 millisecond |
| 1,000,000 | 1,000,000 | 10 milliseconds |
| 10,000,000 | 10,000,000 | 100 milliseconds |
Performing 10,000 front insertions on a million-element array: 10,000 × 1,000,000 = 10 billion shifts. At 10ns each: 100 seconds. That's over a minute and a half for what seems like a simple operation!
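You can sanity-check these estimates with a plain Python list. CPython shifts the underlying pointer array in compiled C, so each individual shift costs well under 10ns, but the quadratic total is still painfully visible. A rough sketch; the numbers depend on your machine:

```python
# Rough reality check (CPython shifts the underlying pointer array in C,
# so each shift costs well under 10ns, but the total is still painful).
import time

lst = list(range(1_000_000))
start = time.perf_counter()
for i in range(10_000):
    lst.insert(0, i)              # each call shifts ~a million elements
print(f"10,000 front insertions: {time.perf_counter() - start:.2f}s")

lst = list(range(1_000_000))
start = time.perf_counter()
for i in range(10_000):
    lst.append(i)                 # no shifting at all
print(f"10,000 appends:          {time.perf_counter() - start:.4f}s")

# Expect the front insertions to take a noticeable number of seconds and the
# appends to be essentially instantaneous; exact figures vary by machine.
```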
The naive analysis assumes constant cost per shift. Reality is messier. Modern computers have a memory hierarchy that dramatically affects actual performance.
The memory hierarchy:
| Memory Level | Size | Latency | Bandwidth |
|---|---|---|---|
| CPU Registers | ~1KB | <1ns | - |
| L1 Cache | 32-64KB | ~1ns | ~1 TB/s |
| L2 Cache | 256KB-1MB | ~4ns | ~500 GB/s |
| L3 Cache | 4-64MB | ~12ns | ~200 GB/s |
| Main Memory | 16GB+ | ~100ns | ~50 GB/s |
| SSD | 256GB+ | ~100,000ns | ~3 GB/s |
| HDD | 1TB+ | ~10,000,000ns | ~150 MB/s |

How this affects insertion/deletion:
When shifting elements:
Small arrays (< L1 cache): All elements fit in cache. Shifts are fast (~1-2ns each). The overhead is minimal.
Medium arrays (L2/L3 cache): Some elements are in cache, some aren't. Performance becomes unpredictable — sometimes fast, sometimes slow.
Large arrays (> L3 cache): Most shifts require main memory access. Each shift costs ~100ns instead of ~1ns. Performance degrades 100x compared to the small array case.
Performance doesn't degrade smoothly. There's often a 'cliff' where the array's data exceeds the cache size and per-operation cost suddenly jumps. An array of 3 million 8-byte elements (roughly 24 MB) might be many times slower per operation than one of 2 million (roughly 16 MB), simply because the larger array no longer fits in L3 cache.
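One rough way to see this on your own machine is to time a "shift everything by one" copy over buffers of increasing size. The sketch below uses `bytearray` so the copy itself moves raw bytes rather than Python objects; the sizes chosen and the exact results are machine-dependent assumptions, not fixed facts.

```python
# Machine-dependent experiment: per-byte cost of one "shift everything by one"
# copy, for buffers of increasing size. Uses bytearray so the copy itself
# involves raw bytes rather than Python objects.
import time

for n in [32 * 1024, 1024 * 1024, 8 * 1024 * 1024, 128 * 1024 * 1024]:
    src = bytearray(n)
    dst = bytearray(n)
    reps = max(1, (64 * 1024 * 1024) // n)    # keep total work roughly constant
    start = time.perf_counter()
    for _ in range(reps):
        dst[1:] = src[:-1]                    # one big shift-by-one copy
    elapsed = time.perf_counter() - start
    print(f"{n:>11} bytes: {elapsed / reps / n * 1e9:.3f} ns per byte")

# Typical pattern: per-byte cost is lowest while the buffer fits in cache and
# rises once the copy has to stream through main memory. The exact numbers and
# the location of the 'cliff' depend entirely on the machine.
```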
Arrays have good cache locality... for traversal:
Arrays are cache-friendly for sequential access because adjacent elements are loaded together. But insertion/deletion destroys this advantage:
Every element from position k to the end must be read and then written back one slot over, and for a large array most of that region will not be sitting in cache. The cache advantage of arrays primarily helps reading, not writing.
When discussing dynamic arrays, we often hear that append is "amortized O(1)." This is true and important, but it's frequently misunderstood — and it does not apply to insertion at arbitrary positions.
What amortized O(1) means:
```
Appending to a dynamic array (capacity doubling):

Operation 1: Write to index 0.              Cost: 1
Operation 2: Resize (1→2), copy 1, write.   Cost: 2
Operation 3: Resize (2→4), copy 2, write.   Cost: 3
Operation 4: Write to index 3.              Cost: 1
Operation 5: Resize (4→8), copy 4, write.   Cost: 5
Operation 6-8: Write.                       Cost: 1 each = 3
...

After n operations:
- Total resizes: log₂(n)
- Total copies from resizing: 1 + 2 + 4 + ... + n/2 ≈ n
- Total direct writes: n
- Total cost: ~2n

Average cost per operation: 2n / n = O(1) amortized

The expensive operations (resizing) are rare enough that they
average out to constant time.
```

Why amortization doesn't save insertion at position k:
The amortized argument works because resizing is rare — it happens only log(n) times across n operations. But when inserting at position k:
Every single insertion must shift the n - k elements after position k. The expensive work happens on every operation, not just occasionally, so there is nothing rare to average away. No matter how you analyze it, n front insertions require on the order of n² shifts. The cost is inherent, not amortizable.
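To see the contrast in numbers, here is a small counting sketch. It assumes the capacity-doubling policy described above, and the helper names are illustrative:

```python
# Counting sketch (illustrative helper names), assuming the capacity-doubling
# policy described above: tally element copies for n appends vs. n front inserts.

def append_copies(n):
    """Element copies for n appends into a dynamic array that doubles."""
    copies, capacity, size = 0, 1, 0
    for _ in range(n):
        if size == capacity:
            copies += size          # resize: copy every existing element once
            capacity *= 2
        copies += 1                 # write the new element
        size += 1
    return copies

def front_insert_copies(n):
    """Element copies for n insertions at position 0 (resizes ignored)."""
    return sum(size + 1 for size in range(n))   # shift `size` elements + 1 write

for n in [1_000, 10_000, 100_000]:
    print(f"n={n:>7}  appends: {append_copies(n):>9}  "
          f"front inserts: {front_insert_copies(n):>12}")

# Appends stay within a small constant multiple of n: amortized O(1) each.
# Front insertions total about n²/2: the expensive work happens every time.
```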
Some developers assume that because ArrayList/Vector has 'amortized O(1) add,' all additions are cheap. No! Amortized O(1) applies only to add at the END. add(0, element) is always O(n) — every single time, no amortization possible.
| Method | Complexity | Notes |
|---|---|---|
| add(element) | O(1) amortized | Appends to end |
| add(0, element) | O(n) | Inserts at front, shifts all |
| add(i, element) | O(n) | Inserts at i, shifts rest |
| remove(size-1) | O(1) | Removes from end |
| remove(0) | O(n) | Removes from front, shifts all |
| get(i) | O(1) | Random access by index |
Let's examine specific scenarios where array insertion/deletion costs create genuine engineering problems.
Case 1: The Naive Priority Queue
```python
class NaivePriorityQueue:
    """
    Priority queue implemented as a sorted array.
    Highest priority = highest value = end of array.
    This is how NOT to implement a priority queue!
    """

    def __init__(self):
        self.items = []

    def insert(self, value):
        # Find correct position to maintain sorted order
        position = 0
        for i, v in enumerate(self.items):
            if v < value:
                position = i + 1
        # Insert at position (requires shifting!)
        self.items.insert(position, value)  # O(n)!

    def remove_max(self):
        # Remove from end is O(1)
        return self.items.pop()


# Performance test
import time

pq = NaivePriorityQueue()
for size in [1000, 5000, 10000, 20000]:
    start = time.time()
    for i in range(size):
        pq.insert(i)  # Insert in sorted order
    end = time.time()
    print(f"Insert {size} items: {end-start:.3f}s")
    pq.items.clear()

# Typical output:
# Insert 1000 items: 0.005s
# Insert 5000 items: 0.120s
# Insert 10000 items: 0.480s
# Insert 20000 items: 1.920s
#
# Note: Time grows quadratically! 2x size → 4x time
```
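For contrast, Python's standard-library `heapq` module keeps its items in a list but organizes them as a binary heap: each insert appends at the end and performs O(log n) local swaps instead of an O(n) shift. A minimal sketch of a max-priority queue built on it (the `HeapPriorityQueue` wrapper is hypothetical; values are negated because `heapq` is a min-heap):

```python
# For contrast (not part of the naive version above): heapq appends at the end
# and does O(log n) local swaps, avoiding the O(n) shift. heapq is a min-heap,
# so values are negated to pop the maximum first.
import heapq

class HeapPriorityQueue:
    def __init__(self):
        self._heap = []

    def insert(self, value):
        heapq.heappush(self._heap, -value)   # O(log n), no mass shifting

    def remove_max(self):
        return -heapq.heappop(self._heap)    # O(log n)
```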
Case 2: The Document Editor

Consider a simple document editor that stores the document's characters in one big array. Every keystroke in the middle of the document is an insertion that shifts every character after the cursor, and every backspace is a deletion that shifts them all back. In a document of a million characters, a single edit near the beginning can move on the order of a million characters, and editing produces thousands of such operations per minute.

This is why real text editors use specialized structures such as gap buffers, ropes, and piece tables. None of these are pure arrays.
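To make the first of those concrete, here is a minimal gap-buffer sketch (a hypothetical `GapBuffer` class, not taken from any particular editor): a block of free slots is kept at the cursor, so typing and deleting at the cursor never shift the rest of the text, and shifting happens only when the cursor moves, and only across the distance moved.

```python
# Minimal gap-buffer sketch (hypothetical class, not any particular editor's).
# A block of free slots (the "gap") sits at the cursor, so inserting or
# deleting at the cursor is O(1).

class GapBuffer:
    def __init__(self, capacity=16):
        self.buf = [None] * capacity
        self.gap_start = 0               # first free slot (the cursor)
        self.gap_end = capacity          # one past the last free slot

    def insert(self, ch):
        if self.gap_start == self.gap_end:
            self._grow()
        self.buf[self.gap_start] = ch    # O(1): write into the gap
        self.gap_start += 1

    def delete(self):
        if self.gap_start > 0:
            self.gap_start -= 1          # O(1): the gap absorbs the character

    def move_cursor(self, position):
        # Shift only the characters between the old and new cursor positions,
        # which is a short distance for typical editing patterns.
        while position < self.gap_start:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while position > self.gap_start:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_end += 1
            self.gap_start += 1

    def _grow(self):
        extra = len(self.buf)
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```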
A message queue processes items in FIFO order: new messages append, processed messages remove from front. With an array: append is O(1), but remove from front is O(n). Processing n messages = O(n²) time. At 10,000 messages/second, the system quickly falls behind.
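For the fixed-size case, a circular array sidesteps the problem entirely: removing from the front just advances a head index, so nothing shifts. A minimal sketch, using a hypothetical `RingQueue` class with a bounded capacity:

```python
# Minimal bounded circular-array queue (hypothetical class). Head and tail
# indices wrap around, so removing from the front never shifts anything.

class RingQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0                        # index of the oldest element
        self.size = 0

    def enqueue(self, item):
        if self.size == len(self.buf):
            raise OverflowError("queue is full")
        tail = (self.head + self.size) % len(self.buf)
        self.buf[tail] = item                # O(1): write at the logical end
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        item = self.buf[self.head]           # O(1): just advance the head index
        self.buf[self.head] = None
        self.head = (self.head + 1) % len(self.buf)
        self.size -= 1
        return item
```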
A natural question: Can't we be clever and avoid these costs? Let's examine common "solutions" and why they fail.
Attempt 1: Lazy deletion (tombstoning)
```python
# Instead of shifting, mark elements as "deleted"
arr = ['A', 'B', 'C', 'D', 'E']
deleted = [False, False, False, False, False]

def delete_lazy(index):
    deleted[index] = True  # O(1)!

def get(index):
    # Problem: We need the index-th NON-DELETED element
    # Must scan from beginning
    count = 0
    for i in range(len(arr)):
        if not deleted[i]:
            if count == index:
                return arr[i]
            count += 1
    raise IndexError("Not found")

# Lazy deletion makes get() O(n)!
# We've shifted the cost, not eliminated it.
# Plus we're wasting memory on "deleted" elements.
```

Attempt 2: Keep deleted slots, insert into gaps
```python
# Track free slots and reuse them
arr = ['A', None, 'C', None, 'E']  # B and D deleted
free_slots = [1, 3]

def insert(value):
    if free_slots:
        slot = free_slots.pop()
        arr[slot] = value  # O(1)!
    else:
        arr.append(value)

# Problem: Order is lost!
# arr might end up as ['A', 'F', 'C', 'G', 'E']
# The "first" element is no longer at index 0
# For many use cases (queues, ordered lists), this is unacceptable.
```

Attempt 3: Use a different index
```python
# Keep an indirection layer that maps logical to physical indices
arr = ['A', 'B', 'C', 'D', 'E']
logical_to_physical = [0, 1, 2, 3, 4]

def get(logical_index):
    physical = logical_to_physical[logical_index]
    return arr[physical]

def insert_at(logical_index, value):
    # Add to end of array
    physical = len(arr)
    arr.append(value)
    # Update the indirection
    logical_to_physical.insert(logical_index, physical)  # O(n) - same problem!

# We've just moved the shifting from the data to the index.
# The fundamental O(n) insertion remains.
```

There is no way to have both: (1) O(1) arbitrary-position insertion/deletion AND (2) O(1) random access by contiguous index. This is a fundamental data structure trade-off. Arrays choose random access; linked lists choose insertion/deletion.
We've thoroughly established the problem. Now the question becomes: what properties would an alternative structure need?
Requirements for a better insertion/deletion structure: (1) inserting an element at a known position must not require shifting other elements; (2) deleting an element must not leave a gap that has to be closed by shifting; (3) the structure must grow and shrink one element at a time, without wholesale copying. As the trade-off above suggests, we should expect to give something up in return, namely O(1) random access by index.
The insight:
What if, instead of storing elements contiguously, we stored each element with a reference to the next element? Then inserting means creating one new element and redirecting a couple of references, deleting means routing a reference around the removed element, and no other elements have to move at all.
This is the core idea of linked structures.
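A minimal sketch of that idea, using a hypothetical `Node` class (the next page develops it properly): once you hold a reference to a node, inserting after it or removing its successor touches only a couple of references, regardless of how long the chain is.

```python
# Minimal sketch of the idea (hypothetical Node class; the next page develops it).

class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

def insert_after(node, value):
    """O(1): create one node and rewire one reference; nothing else moves."""
    node.next = Node(value, node.next)

def delete_after(node):
    """O(1): route 'next' around the removed node; nothing else moves."""
    if node.next is not None:
        node.next = node.next.next

# Build A -> C, then splice B in between without touching any other node.
head = Node("A", Node("C"))
insert_after(head, "B")      # A -> B -> C
delete_after(head)           # A -> C again
```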
You now understand in depth why array insertion and deletion are expensive. The costs are fundamental to contiguous storage and cannot be optimized away without changing the data structure. The next page explores the need for dynamic, flexible structures and how linked lists provide a fundamentally different set of trade-offs.