We've established that insertion and deletion in arrays are O(n) operations. But understanding Big O notation is only the beginning. To truly internalize why this matters, we need to examine the mechanics of these operations — what actually happens in memory, why it's unavoidable, and how the costs compound in practice.
This page takes a deep dive into the most problematic array operations: inserting and deleting elements at positions other than the end. By the time we're done, you'll not only understand the cost intellectually but feel why a different data structure is needed.
By the end of this page, you will understand: (1) the exact mechanics of array insertion and deletion, (2) why contiguous storage makes these operations inherently expensive, (3) how costs scale with data size and operation frequency, and (4) real-world scenarios where these costs become prohibitive.
Let's trace through exactly what happens when we insert an element into an array. Understanding each step reveals why the operation cannot be faster than O(n) for non-end insertions.
The fundamental constraint:
Arrays guarantee that array[i] and array[i+1] are adjacent in memory. This means when we insert at position k, elements at positions k, k+1, ..., n-1 must all move one position right.
There is no shortcut. No clever trick. The elements must physically relocate.
```python
def insert_at_position(arr, position, value):
    """
    Insert 'value' at 'position' in 'arr'.
    Precondition: arr has one empty (placeholder) slot at the end,
    so the logical size is len(arr) - 1.
    This demonstrates exactly what happens during insertion.
    """
    n = len(arr) - 1  # logical size: the last physical slot is the empty one
    if position < 0 or position > n:
        raise IndexError("Position out of bounds")

    # Step 1: Shift all elements from position to end, moving RIGHT.
    # We start from the END to avoid overwriting data we haven't moved yet.
    for i in range(n - 1, position - 1, -1):
        arr[i + 1] = arr[i]  # Each shift is one operation
        # When i = n-1:      arr[n]          = arr[n-1]
        # When i = n-2:      arr[n-1]        = arr[n-2]
        # ...
        # When i = position: arr[position+1] = arr[position]

    # Step 2: Place the new value at the now-empty position
    arr[position] = value

    # Total operations:
    # - For position = 0: shift n elements + 1 write = n + 1 operations
    # - For position = n: shift 0 elements + 1 write = 1 operation
    # - General case:     n - position + 1 operations
```

Why we shift from right to left:
This detail matters! If we shifted left-to-right, we'd overwrite elements before moving them:
```
WRONG: Shifting left-to-right

Array: [A, B, C, D, E]
Insert X at position 1

Step 1: arr[2] = arr[1] → [A, B, B, D, E]   // C is LOST!
Step 2: arr[3] = arr[2] → [A, B, B, B, E]   // D is LOST!
...

CORRECT: Shifting right-to-left

Array: [A, B, C, D, E, _]   (with slack space)
Insert X at position 1

Step 1: arr[5] = arr[4] → [A, B, C, D, E, E]
Step 2: arr[4] = arr[3] → [A, B, C, D, D, E]
Step 3: arr[3] = arr[2] → [A, B, C, C, D, E]
Step 4: arr[2] = arr[1] → [A, B, B, C, D, E]
Step 5: arr[1] = X      → [A, X, B, C, D, E] ✓
```

Each `arr[i+1] = arr[i]` involves reading from one memory location and writing to another. For a large array, a single insertion can mean millions of these memory operations, and once the data no longer fits in cache, each access becomes one of the costlier primitive operations a CPU performs.
Deletion is the mirror image of insertion. Instead of making room, we fill a gap. Instead of shifting right, we shift left.
The gap problem:
After removing element at position k, we have a gap. But arrays cannot have gaps — the contiguity guarantee requires that array[k] contains a valid element. So elements at positions k+1, k+2, ..., n-1 must all shift left by one.
```python
def delete_at_position(arr, position, n):
    """
    Delete element at 'position' from 'arr'.
    'n' is the current logical size of the array.
    Returns the new logical size.
    This demonstrates exactly what happens during deletion.
    """
    if position < 0 or position >= n:
        raise IndexError("Position out of bounds")

    # Step 1: Shift all elements after position LEFT.
    # We start from the position and move toward the end.
    for i in range(position, n - 1):
        arr[i] = arr[i + 1]  # Each shift is one operation
        # When i = position:   arr[position]   = arr[position+1]
        # When i = position+1: arr[position+1] = arr[position+2]
        # ...
        # When i = n-2:        arr[n-2]        = arr[n-1]

    # Step 2: The last logical position is now invalid
    # (Optionally clear it, though it's now outside our logical array)
    arr[n - 1] = None

    # Total operations:
    # - For position = 0:   shift n-1 elements = n-1 operations
    # - For position = n-1: shift 0 elements   = 0 operations
    # - General case:       n - position - 1 operations
    return n - 1  # New logical size
```

Visualizing a front deletion:
```
Array with 6 elements: [A, B, C, D, E, F]
Delete at position 0 (remove A)

Memory layout before:
Address: 100  104  108  112  116  120
Value:   [A]  [B]  [C]  [D]  [E]  [F]

Step 1: arr[0] = arr[1]  → [B, B, C, D, E, F]
Step 2: arr[1] = arr[2]  → [B, C, C, D, E, F]
Step 3: arr[2] = arr[3]  → [B, C, D, D, E, F]
Step 4: arr[3] = arr[4]  → [B, C, D, E, E, F]
Step 5: arr[4] = arr[5]  → [B, C, D, E, F, F]
Step 6: Clear position 5 → [B, C, D, E, F, _]

Total: 5 shift operations for a 6-element array
In general: n-1 operations for an n-element array = O(n)
```

A queue implemented with an array removes from the front repeatedly. Each removal is O(n). Processing n items takes O(n²) time total! This is why production queue implementations use circular arrays (fixed size) or linked lists (dynamic size).
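You can watch this quadratic blow-up directly by draining a queue built on a plain Python list versus one built on `collections.deque`, which supports O(1) removal at both ends. This is a rough benchmark sketch; exact timings vary by machine:

```python
# Rough benchmark: draining n items from the front of a list vs. a deque.
# Exact timings vary by machine; the shape of the growth is the point.
import time
from collections import deque

def drain_list(n):
    q = list(range(n))
    while q:
        q.pop(0)                  # O(n) each time: shifts every remaining element

def drain_deque(n):
    q = deque(range(n))
    while q:
        q.popleft()               # O(1) each time

for n in [10_000, 20_000, 40_000]:
    for fn in (drain_list, drain_deque):
        start = time.perf_counter()
        fn(n)
        print(f"{fn.__name__}({n:>6}): {time.perf_counter() - start:.3f}s")

# Doubling n roughly quadruples drain_list's time (O(n²) total work)
# but only doubles drain_deque's time (O(n) total work).
```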
Let's move from abstract O(n) to concrete numbers. Understanding the actual scale helps build intuition for when array costs become problematic.
Single operation costs:
| Array Size | Front Insert/Delete | Middle Insert/Delete | End Insert/Delete |
|---|---|---|---|
| 100 | 100 shifts | 50 shifts | 0-1 shifts |
| 1,000 | 1,000 shifts | 500 shifts | 0-1 shifts |
| 10,000 | 10,000 shifts | 5,000 shifts | 0-1 shifts |
| 100,000 | 100,000 shifts | 50,000 shifts | 0-1 shifts |
| 1,000,000 | 1,000,000 shifts | 500,000 shifts | 0-1 shifts |
Multiple operation costs (cumulative):
Single operations often seem affordable, but operations accumulate:
| Scenario | Operations | Total Shifts | Complexity |
|---|---|---|---|
| n front insertions (building up from empty) | n | 0 + 1 + ... + (n-1) = n(n-1)/2 | O(n²) |
| n random-position insertions (building up from empty) | n | about half that on average ≈ n²/4 | O(n²) |
| n front deletions (draining an n-element array) | n | (n-1) + (n-2) + ... + 0 = n(n-1)/2 | O(n²) |
| n end insertions | n | 0 (ignoring resize) | O(n) |
| n end deletions | n | 0 | O(n) |
Time estimation:
Modern memory operations take roughly 1 nanosecond for cached access, 100+ nanoseconds for main memory. Let's estimate conservatively with 10ns per shift:
| Array Size | Shifts | Time @ 10ns/shift |
|---|---|---|
| 1,000 | 1,000 | 10 microseconds |
| 10,000 | 10,000 | 100 microseconds |
| 100,000 | 100,000 | 1 millisecond |
| 1,000,000 | 1,000,000 | 10 milliseconds |
| 10,000,000 | 10,000,000 | 100 milliseconds |
Performing 10,000 front insertions on a million-element array: 10,000 × 1,000,000 = 10 billion shifts. At 10ns each: 100 seconds. That's over a minute and a half for what seems like a simple operation!
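You can sanity-check these estimates with a plain Python list. CPython shifts the underlying pointer array in compiled C, so each individual shift costs well under 10ns, but the quadratic total is still painfully visible. A rough sketch; the numbers depend on your machine:

```python
# Rough reality check (CPython shifts the underlying pointer array in C,
# so each shift costs well under 10ns, but the total is still painful).
import time

lst = list(range(1_000_000))
start = time.perf_counter()
for i in range(10_000):
    lst.insert(0, i)              # each call shifts ~a million elements
print(f"10,000 front insertions: {time.perf_counter() - start:.2f}s")

lst = list(range(1_000_000))
start = time.perf_counter()
for i in range(10_000):
    lst.append(i)                 # no shifting at all
print(f"10,000 appends:          {time.perf_counter() - start:.4f}s")

# Expect the front insertions to take a noticeable number of seconds and the
# appends to be essentially instantaneous; exact figures vary by machine.
```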
The naive analysis assumes constant cost per shift. Reality is messier. Modern computers have a memory hierarchy that dramatically affects actual performance.
The memory hierarchy:
| Memory Level | Size | Latency | Bandwidth |
|---|---|---|---|
| CPU Registers | ~1KB | <1ns | - |
| L1 Cache | 32-64KB | ~1ns | ~1 TB/s |
| L2 Cache | 256KB-1MB | ~4ns | ~500 GB/s |
| L3 Cache | 4-64MB | ~12ns | ~200 GB/s |
| Main Memory | 16GB+ | ~100ns | ~50 GB/s |
| SSD | 256GB+ | ~100,000ns | ~3 GB/s |
| HDD | 1TB+ | ~10,000,000ns | ~150 MB/s |

How this affects insertion/deletion:
When shifting elements:
Small arrays (< L1 cache): All elements fit in cache. Shifts are fast (~1-2ns each). The overhead is minimal.
Medium arrays (L2/L3 cache): Some elements are in cache, some aren't. Performance becomes unpredictable — sometimes fast, sometimes slow.
Large arrays (> L3 cache): Most shifts require main memory access. Each shift costs ~100ns instead of ~1ns. Performance degrades 100x compared to the small array case.
Performance doesn't degrade smoothly. There's often a 'cliff' where the array's data exceeds the cache size and per-operation cost suddenly jumps. An array of 3 million 8-byte elements (roughly 24 MB) might be many times slower per operation than one of 2 million (roughly 16 MB), simply because the larger array no longer fits in L3 cache.
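One rough way to see this on your own machine is to time a "shift everything by one" copy over buffers of increasing size. The sketch below uses `bytearray` so the copy itself moves raw bytes rather than Python objects; the sizes chosen and the exact results are machine-dependent assumptions, not fixed facts.

```python
# Machine-dependent experiment: per-byte cost of one "shift everything by one"
# copy, for buffers of increasing size. Uses bytearray so the copy itself
# involves raw bytes rather than Python objects.
import time

for n in [32 * 1024, 1024 * 1024, 8 * 1024 * 1024, 128 * 1024 * 1024]:
    src = bytearray(n)
    dst = bytearray(n)
    reps = max(1, (64 * 1024 * 1024) // n)    # keep total work roughly constant
    start = time.perf_counter()
    for _ in range(reps):
        dst[1:] = src[:-1]                    # one big shift-by-one copy
    elapsed = time.perf_counter() - start
    print(f"{n:>11} bytes: {elapsed / reps / n * 1e9:.3f} ns per byte")

# Typical pattern: per-byte cost is lowest while the buffer fits in cache and
# rises once the copy has to stream through main memory. The exact numbers and
# the location of the 'cliff' depend entirely on the machine.
```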
Arrays have good cache locality... for traversal:
Arrays are cache-friendly for sequential access because adjacent elements are loaded together. But insertion/deletion destroys this advantage:
Every element from position k to the end must be read and then written back one slot over, and for a large array most of that region will not be sitting in cache. The cache advantage of arrays primarily helps reading, not writing.
When discussing dynamic arrays, we often hear that append is "amortized O(1)." This is true and important, but it's frequently misunderstood — and it does not apply to insertion at arbitrary positions.
What amortized O(1) means:
```
Appending to a dynamic array (capacity doubling):

Operation 1: Write to index 0.              Cost: 1
Operation 2: Resize (1→2), copy 1, write.   Cost: 2
Operation 3: Resize (2→4), copy 2, write.   Cost: 3
Operation 4: Write to index 3.              Cost: 1
Operation 5: Resize (4→8), copy 4, write.   Cost: 5
Operation 6-8: Write.                       Cost: 1 each = 3
...

After n operations:
- Total resizes: log₂(n)
- Total copies from resizing: 1 + 2 + 4 + ... + n/2 ≈ n
- Total direct writes: n
- Total cost: ~2n

Average cost per operation: 2n / n = O(1) amortized

The expensive operations (resizing) are rare enough that they
average out to constant time.
```

Why amortization doesn't save insertion at position k:
The amortized argument works because resizing is rare — it happens only log(n) times across n operations. But when inserting at position k:
Every single insertion must shift the n - k elements after position k. The expensive work happens on every operation, not just occasionally, so there is nothing rare to average away. No matter how you analyze it, n front insertions require on the order of n² shifts. The cost is inherent, not amortizable.
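To see the contrast in numbers, here is a small counting sketch. It assumes the capacity-doubling policy described above, and the helper names are illustrative:

```python
# Counting sketch (illustrative helper names), assuming the capacity-doubling
# policy described above: tally element copies for n appends vs. n front inserts.

def append_copies(n):
    """Element copies for n appends into a dynamic array that doubles."""
    copies, capacity, size = 0, 1, 0
    for _ in range(n):
        if size == capacity:
            copies += size          # resize: copy every existing element once
            capacity *= 2
        copies += 1                 # write the new element
        size += 1
    return copies

def front_insert_copies(n):
    """Element copies for n insertions at position 0 (resizes ignored)."""
    return sum(size + 1 for size in range(n))   # shift `size` elements + 1 write

for n in [1_000, 10_000, 100_000]:
    print(f"n={n:>7}  appends: {append_copies(n):>9}  "
          f"front inserts: {front_insert_copies(n):>12}")

# Appends stay within a small constant multiple of n: amortized O(1) each.
# Front insertions total about n²/2: the expensive work happens every time.
```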
Some developers assume that because ArrayList/Vector has 'amortized O(1) add,' all additions are cheap. No! Amortized O(1) applies only to add at the END. add(0, element) is always O(n) — every single time, no amortization possible.
| Method | Complexity | Notes |
|---|---|---|
| add(element) | O(1) amortized | Appends to end |
| add(0, element) | O(n) | Inserts at front, shifts all |
| add(i, element) | O(n) | Inserts at i, shifts rest |
| remove(size-1) | O(1) | Removes from end |
| remove(0) | O(n) | Removes from front, shifts all |
| get(i) | O(1) | Random access by index |
Let's examine specific scenarios where array insertion/deletion costs create genuine engineering problems.
Case 1: The Naive Priority Queue
```python
class NaivePriorityQueue:
    """
    Priority queue implemented as a sorted array.
    Highest priority = highest value = end of array.
    This is how NOT to implement a priority queue!
    """

    def __init__(self):
        self.items = []

    def insert(self, value):
        # Find correct position to maintain sorted order
        position = 0
        for i, v in enumerate(self.items):
            if v < value:
                position = i + 1
        # Insert at position (requires shifting!)
        self.items.insert(position, value)  # O(n)!

    def remove_max(self):
        # Remove from end is O(1)
        return self.items.pop()


# Performance test
import time

pq = NaivePriorityQueue()
for size in [1000, 5000, 10000, 20000]:
    start = time.time()
    for i in range(size):
        pq.insert(i)  # Insert in sorted order
    end = time.time()
    print(f"Insert {size} items: {end-start:.3f}s")
    pq.items.clear()

# Typical output:
# Insert 1000 items: 0.005s
# Insert 5000 items: 0.120s
# Insert 10000 items: 0.480s
# Insert 20000 items: 1.920s
#
# Note: Time grows quadratically! 2x size → 4x time
```
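For contrast, Python's standard-library `heapq` module keeps its items in a list but organizes them as a binary heap: each insert appends at the end and performs O(log n) local swaps instead of an O(n) shift. A minimal sketch of a max-priority queue built on it (the `HeapPriorityQueue` wrapper is hypothetical; values are negated because `heapq` is a min-heap):

```python
# For contrast (not part of the naive version above): heapq appends at the end
# and does O(log n) local swaps, avoiding the O(n) shift. heapq is a min-heap,
# so values are negated to pop the maximum first.
import heapq

class HeapPriorityQueue:
    def __init__(self):
        self._heap = []

    def insert(self, value):
        heapq.heappush(self._heap, -value)   # O(log n), no mass shifting

    def remove_max(self):
        return -heapq.heappop(self._heap)    # O(log n)
```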
Case 2: The Document Editor

Consider a simple document editor that stores the document's characters in one big array. Every keystroke in the middle of the document is an insertion that shifts every character after the cursor, and every backspace is a deletion that shifts them all back. In a document of a million characters, a single edit near the beginning can move on the order of a million characters, and editing produces thousands of such operations per minute.

This is why real text editors use specialized structures such as gap buffers, ropes, and piece tables. None of these are pure arrays.
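To make the first of those concrete, here is a minimal gap-buffer sketch (a hypothetical `GapBuffer` class, not taken from any particular editor): a block of free slots is kept at the cursor, so typing and deleting at the cursor never shift the rest of the text, and shifting happens only when the cursor moves, and only across the distance moved.

```python
# Minimal gap-buffer sketch (hypothetical class, not any particular editor's).
# A block of free slots (the "gap") sits at the cursor, so inserting or
# deleting at the cursor is O(1).

class GapBuffer:
    def __init__(self, capacity=16):
        self.buf = [None] * capacity
        self.gap_start = 0               # first free slot (the cursor)
        self.gap_end = capacity          # one past the last free slot

    def insert(self, ch):
        if self.gap_start == self.gap_end:
            self._grow()
        self.buf[self.gap_start] = ch    # O(1): write into the gap
        self.gap_start += 1

    def delete(self):
        if self.gap_start > 0:
            self.gap_start -= 1          # O(1): the gap absorbs the character

    def move_cursor(self, position):
        # Shift only the characters between the old and new cursor positions,
        # which is a short distance for typical editing patterns.
        while position < self.gap_start:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while position > self.gap_start:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_end += 1
            self.gap_start += 1

    def _grow(self):
        extra = len(self.buf)
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])
```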
A message queue processes items in FIFO order: new messages append, processed messages remove from front. With an array: append is O(1), but remove from front is O(n). Processing n messages = O(n²) time. At 10,000 messages/second, the system quickly falls behind.
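For the fixed-size case, a circular array sidesteps the problem entirely: removing from the front just advances a head index, so nothing shifts. A minimal sketch, using a hypothetical `RingQueue` class with a bounded capacity:

```python
# Minimal bounded circular-array queue (hypothetical class). Head and tail
# indices wrap around, so removing from the front never shifts anything.

class RingQueue:
    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0                        # index of the oldest element
        self.size = 0

    def enqueue(self, item):
        if self.size == len(self.buf):
            raise OverflowError("queue is full")
        tail = (self.head + self.size) % len(self.buf)
        self.buf[tail] = item                # O(1): write at the logical end
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        item = self.buf[self.head]           # O(1): just advance the head index
        self.buf[self.head] = None
        self.head = (self.head + 1) % len(self.buf)
        self.size -= 1
        return item
```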
A natural question: Can't we be clever and avoid these costs? Let's examine common "solutions" and why they fail.
Attempt 1: Lazy deletion (tombstoning)
```python
# Instead of shifting, mark elements as "deleted"
arr = ['A', 'B', 'C', 'D', 'E']
deleted = [False, False, False, False, False]

def delete_lazy(index):
    deleted[index] = True  # O(1)!

def get(index):
    # Problem: We need the index-th NON-DELETED element
    # Must scan from beginning
    count = 0
    for i in range(len(arr)):
        if not deleted[i]:
            if count == index:
                return arr[i]
            count += 1
    raise IndexError("Not found")

# Lazy deletion makes get() O(n)!
# We've shifted the cost, not eliminated it.
# Plus we're wasting memory on "deleted" elements.
```

Attempt 2: Keep deleted slots, insert into gaps
```python
# Track free slots and reuse them
arr = ['A', None, 'C', None, 'E']  # B and D deleted
free_slots = [1, 3]

def insert(value):
    if free_slots:
        slot = free_slots.pop()
        arr[slot] = value  # O(1)!
    else:
        arr.append(value)

# Problem: Order is lost!
# arr might end up as ['A', 'F', 'C', 'G', 'E']
# The "first" element is no longer at index 0
# For many use cases (queues, ordered lists), this is unacceptable.
```

Attempt 3: Use a different index
```python
# Keep an indirection layer that maps logical to physical indices
arr = ['A', 'B', 'C', 'D', 'E']
logical_to_physical = [0, 1, 2, 3, 4]

def get(logical_index):
    physical = logical_to_physical[logical_index]
    return arr[physical]

def insert_at(logical_index, value):
    # Add to end of array
    physical = len(arr)
    arr.append(value)
    # Update the indirection
    logical_to_physical.insert(logical_index, physical)  # O(n) - same problem!

# We've just moved the shifting from the data to the index.
# The fundamental O(n) insertion remains.
```

There is no way to have both: (1) O(1) arbitrary-position insertion/deletion AND (2) O(1) random access by contiguous index. This is a fundamental data structure trade-off. Arrays choose random access; linked lists choose insertion/deletion.
We've thoroughly established the problem. Now the question becomes: what properties would an alternative structure need?
Requirements for a better insertion/deletion structure: (1) inserting an element at a known position must not require shifting other elements; (2) deleting an element must not leave a gap that has to be closed by shifting; (3) the structure must grow and shrink one element at a time, without wholesale copying. As the trade-off above suggests, we should expect to give something up in return, namely O(1) random access by index.
The insight:
What if, instead of storing elements contiguously, we stored each element with a reference to the next element? Then inserting means creating one new element and redirecting a couple of references, deleting means routing a reference around the removed element, and no other elements have to move at all.
This is the core idea of linked structures.
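A minimal sketch of that idea, using a hypothetical `Node` class (the next page develops it properly): once you hold a reference to a node, inserting after it or removing its successor touches only a couple of references, regardless of how long the chain is.

```python
# Minimal sketch of the idea (hypothetical Node class; the next page develops it).

class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

def insert_after(node, value):
    """O(1): create one node and rewire one reference; nothing else moves."""
    node.next = Node(value, node.next)

def delete_after(node):
    """O(1): route 'next' around the removed node; nothing else moves."""
    if node.next is not None:
        node.next = node.next.next

# Build A -> C, then splice B in between without touching any other node.
head = Node("A", Node("C"))
insert_after(head, "B")      # A -> B -> C
delete_after(head)           # A -> C again
```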
You now understand in depth why array insertion and deletion are expensive. The costs are fundamental to contiguous storage and cannot be optimized away without changing the data structure. The next page explores the need for dynamic, flexible structures and how linked lists provide a fundamentally different set of trade-offs.