So far, we've seen a tension in array operations: arrays give O(1) access by index, but their fixed size makes growing them expensive.
This raises a practical question: How do languages like Python, JavaScript, and Java offer 'arrays' that can grow indefinitely, yet still claim O(1) append performance?
The answer is dynamic arrays (ArrayList in Java, list in Python, Array in JavaScript) combined with a clever strategy called amortized analysis.
Append is O(1) on average—but understanding why requires understanding what happens when the array runs out of space.
By the end of this page, you will understand how dynamic arrays grow, why the doubling strategy leads to O(1) amortized time for append, and how to reason about amortized complexity—a powerful tool for analyzing algorithms with occasional expensive operations.
Traditional arrays, as defined at the hardware level, have a fixed size determined at creation. If you allocate an array of 10 integers, it reserves exactly 40 bytes (assuming 4-byte integers) in contiguous memory.
What happens when you need an 11th element?
You can't simply extend the array. The memory immediately after your array might already be occupied by other data. There's no 'slide everyone over' button at the memory level.
The naive solution: allocate a new, larger array, copy every existing element into it, and discard the old one.
This works, but copying n elements is O(n). If you did this for every single append, n appends would cost O(n²)—disastrous.
If you grow the array by exactly 1 element each time you append, every append requires copying all existing elements. Appending n elements costs: 1 + 2 + 3 + ... + n = O(n²). This is catastrophically slow and why no real implementation does this.
Example: Growing by 1
Append #1: [_] → [1] (copy 0 elements)
Append #2: [1, _] → [1, 2] (copy 1 element)
Append #3: [1, 2, _] → [1, 2, 3] (copy 2 elements)
Append #4: [1, 2, 3, _] → [1, 2, 3, 4] (copy 3 elements)
...
Append #n: copy n-1 elements
Total copies: 0 + 1 + 2 + ... + (n-1) = n(n-1)/2 ≈ n²/2
This quadratic growth is exactly what we want to avoid.
The solution is elegant: instead of growing by 1, double the capacity whenever you run out of space.
The doubling strategy: when the array runs out of space, allocate a new array with twice the capacity, copy the existing elements over once, and keep appending into the spare slots.
This seems wasteful—you're sometimes copying elements only to have lots of empty space. But the math works out beautifully.
Example: Appending 17 elements with doubling (starting capacity 4)
Capacity 4: Append 1, 2, 3, 4 ← Full! Resize
Copy 4 elements to capacity 8
Capacity 8: Append 5, 6, 7, 8 ← Full! Resize
Copy 8 elements to capacity 16
Capacity 16: Append 9, 10, 11, 12, 13, 14, 15, 16 ← Full! Resize
Copy 16 elements to capacity 32
Capacity 32: Append 17 ← Room for 15 more!
Total copies: 4 + 8 + 16 = 28 copies for 17 appends
Naive (grow by 1): 0+1+2+...+16 = 136 copies for 17 appends
Doubling uses ~5× fewer copies than growing by 1, and this ratio improves as n grows.
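To double-check the arithmetic above, here is a small sketch that counts copies under each strategy. The helper names are just for illustration, not from any library:

```python
def copies_grow_by_one(n_appends):
    """Growing capacity by 1 copies every existing element on each append."""
    return sum(range(n_appends))            # 0 + 1 + ... + (n - 1)

def copies_doubling(n_appends, initial_capacity=4):
    """Doubling copies the whole array only when it fills up."""
    capacity, size, copies = initial_capacity, 0, 0
    for _ in range(n_appends):
        if size == capacity:                # full: double and copy everything
            copies += size
            capacity *= 2
        size += 1
    return copies

print(copies_grow_by_one(17))   # 136
print(copies_doubling(17))      # 28
```

Running it for larger values of n makes the gap even more dramatic: grow-by-1 scales quadratically while doubling stays within a small constant multiple of n.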
Any constant factor > 1 works (1.5×, 2×, 3×). Doubling is common because it's simple and the math is clean. Some implementations use 1.5× to save memory at the cost of slightly more frequent resizes. The key insight is GEOMETRIC growth—each resize must increase capacity proportionally, not by a fixed amount.
Amortized analysis asks: "What's the average cost per operation over a sequence of operations?"
It's not the same as average-case analysis (which considers random inputs). Amortized analysis considers the worst-case sequence and spreads the total cost evenly.
The banker's analogy:
Imagine each append normally costs 1 'dollar'. Occasionally (when resizing), an append costs n dollars, because n existing elements must be copied.
With doubling, those expensive appends happen only when the size hits 1, 2, 4, 8, ..., so almost all appends stay cheap.
Amortized analysis asks: If we 'prepay' a bit on each cheap operation, can we cover the expensive ones?
Amortized O(1) is a GUARANTEE, not a probability. It means that for ANY sequence of n operations, the TOTAL cost is O(n), so the average is O(1). It's not 'usually O(1) but sometimes O(n)'; it's 'the total cost of any n operations is at most cn, which guarantees O(1) per operation on average.'
The aggregate method (simplest proof):
Let's count total copy operations for n appends, starting with capacity 1 and doubling. Resizes occur when the size reaches 1, 2, 4, 8, ..., 2^k (where 2^k is the largest power of two below n), and each resize copies that many elements.
Total copies: 1 + 2 + 4 + 8 + ... + 2^k = 2^(k+1) - 1 < 2n
Total work for n appends: n (for the appends themselves) + 2n (for copies) = 3n = O(n)
Therefore, each append costs O(n)/n = O(1) amortized.
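Written out as a single chain of (in)equalities, with k the largest integer such that 2^k < n:

$$
\text{total copies} = 1 + 2 + 4 + \cdots + 2^{k} = 2^{k+1} - 1 < 2n,
\qquad
\text{total work} \le n + 2n = 3n = O(n).
$$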
The accounting method provides additional intuition. We assign a 'price' to each operation that may differ from its actual cost, as long as the total charged over any sequence of operations covers the total actual cost (our 'savings' never go negative).
For dynamic arrays, charge 3 units for each append: 1 unit pays for writing the new element itself, 1 unit is saved to copy that element at the next resize, and 1 unit is saved to copy one element from the older half of the array.
Why this works:
When we resize from capacity n to 2n:
Total paid: n units (the n/2 recently appended elements saved 2 units each)
Total needed: n units (to copy all n elements)
Balance: exactly covered!
By prepaying 3 units per append, we always have enough 'savings' to cover resize costs. Since 3 is a constant, each append is O(1) amortized.
Think of it this way: The elements added in the second half of any array 'pay forward' for their own copy AND for the elements in the first half. Since the number of new elements always equals the number of old elements at resize time (we doubled), the prepayment exactly covers the work.
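Here is a minimal simulation of this accounting argument, assuming a charge of 3 units per append as above. The function and variable names (simulate_accounting, bank) are illustrative, not from any library; the assertion checks that the saved credits never go negative:

```python
def simulate_accounting(n_appends, charge=3):
    """Charge `charge` units per append; spend 1 unit per element written or copied."""
    capacity, size, bank = 1, 0, 0
    for _ in range(n_appends):
        bank += charge                 # prepay for this append
        if size == capacity:           # resize: pay 1 unit per copied element
            bank -= size
            capacity *= 2
        bank -= 1                      # pay for writing the new element
        size += 1
        assert bank >= 0, "savings went negative -- the charge was too low"
    return bank

print(simulate_accounting(1_000_000))  # leftover credits; never raises with charge=3
```

With charge=3 the assertion never fires. Try charge=2 and it fails at an early resize, which is one way to see why the prepayment must cover both the element's own future copy and one older element's copy.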
Let's trace dynamic array growth step by step:
| Operation | Size | Capacity | Resize? | Copies | Total Work |
|---|---|---|---|---|---|
| append(1) | 1 | 1 | No | 0 | 1 |
| append(2) | 2 | 2 | Yes→2 | 1 | 2 |
| append(3) | 3 | 4 | Yes→4 | 2 | 3 |
| append(4) | 4 | 4 | No | 0 | 1 |
| append(5) | 5 | 8 | Yes→8 | 4 | 5 |
| append(6) | 6 | 8 | No | 0 | 1 |
| append(7) | 7 | 8 | No | 0 | 1 |
| append(8) | 8 | 8 | No | 0 | 1 |
| append(9) | 9 | 16 | Yes→16 | 8 | 9 |
9 appends, total work = 1+2+3+1+5+1+1+1+9 = 24 units
Average per append = 24/9 ≈ 2.67 units = O(1)
Key observations:
Resizes get exponentially rarer: they occur when the size hits 1, 2, 4, 8, 16, ..., so the gap between consecutive resizes keeps doubling.
When resizes happen, they're proportionally larger: each one copies twice as many elements as the previous one.
The two effects cancel out: each resize costs twice as much but arrives half as often, so the average work per append stays constant.
If we added a fixed amount (say, 10 elements) each resize, the expensive resizes wouldn't become proportionally rarer. Geometric growth (doubling) is essential—each resize buys us proportionally more cheap appends.
Let's implement a simple dynamic array to see the mechanics in action:
```python
class DynamicArray:
    """
    A simplified dynamic array implementation demonstrating
    O(1) amortized append.
    """

    def __init__(self, initial_capacity=4):
        self._capacity = initial_capacity
        self._size = 0
        self._data = [None] * self._capacity
        self._resize_count = 0
        self._total_copies = 0

    def append(self, item):
        """Add item to end. O(1) amortized."""
        if self._size == self._capacity:
            self._resize()
        self._data[self._size] = item
        self._size += 1

    def _resize(self):
        """Double capacity and copy elements. O(n) but rarely called."""
        new_capacity = self._capacity * 2
        new_data = [None] * new_capacity
        # Copy existing elements
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._total_copies += self._size
        self._data = new_data
        self._capacity = new_capacity
        self._resize_count += 1
        print(f"  Resized! New capacity: {new_capacity}, copied {self._size} elements")

    def __getitem__(self, index):
        """O(1) access by index."""
        if 0 <= index < self._size:
            return self._data[index]
        raise IndexError("Index out of range")

    def __len__(self):
        return self._size

    def stats(self):
        return (f"Size: {self._size}, Capacity: {self._capacity}, "
                f"Resizes: {self._resize_count}, Total copies: {self._total_copies}")


# Demonstration
arr = DynamicArray(initial_capacity=2)
for i in range(1, 33):
    resizes_before = arr._resize_count
    arr.append(i)
    if arr._resize_count > resizes_before:   # a resize just happened
        print(f"After {i} appends: {arr.stats()}")

print(f"\nFinal: {arr.stats()}")
print(f"Amortized copies per append: {arr._total_copies / len(arr):.2f}")
```

Real dynamic arrays (Python list, Java ArrayList) use similar logic but with optimizations: they may use 1.5× growth instead of 2×, pre-allocate in blocks for memory alignment, or use OS-level memory mapping for very large arrays. The amortized O(1) guarantee remains.
The doubling strategy has memory implications worth understanding.
Space overhead:
With doubling, a dynamic array may have up to 2× the capacity it needs. If you have 513 elements, the capacity has just jumped to 1024 (assuming power-of-two capacities), leaving 511 slots unused, nearly half the allocation.
On average, arrays are ~75% full (between 50% after resize and 100% before).
| Scenario | Size | Capacity | Overhead | Utilization |
|---|---|---|---|---|
| Just resized | n | 2n | 100% | 50% |
| Halfway between resizes | n | ~1.33n | ~33% | ~75% |
| About to resize | n | n | 0% | 100% |
| Average case | n | ~1.33n | ~33% | ~75% |
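If you're curious, you can observe CPython's over-allocation indirectly: sys.getsizeof reports the bytes allocated for the list object, so jumps in that number mark resizes of the underlying array. The exact growth pattern is an implementation detail that varies between versions; this sketch just makes the jumps visible:

```python
import sys

# Watch the allocated size of a CPython list as it grows. Jumps in
# sys.getsizeof() mark the points where the underlying array was reallocated.
lst = []
allocated = sys.getsizeof(lst)
for i in range(64):
    lst.append(i)
    now = sys.getsizeof(lst)
    if now != allocated:
        print(f"len={len(lst):3d}  allocated bytes: {allocated} -> {now}")
        allocated = now
```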
When to consider lower growth factors: when memory is tight or individual arrays are very large, a smaller factor such as 1.5× leaves less unused capacity, at the cost of more frequent resizes.
When doubling is fine: for most general-purpose code, where the simplicity and smaller number of resizes outweigh the extra slack space.
Many languages let you pre-allocate capacity if you know the size upfront. Python doesn't expose list capacity directly, but you can pre-size with [None]*n and assign by index. Java: new ArrayList<>(expectedSize). JavaScript: new Array(size). This avoids all resizes and achieves true O(1) for each append.
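For example, a minimal sketch of the pre-sizing trick in Python (variable names are arbitrary): one allocation up front, then plain index assignments, so no write ever triggers a resize.

```python
n = 1_000_000
squares = [None] * n       # allocate the full-length list once
for i in range(n):
    squares[i] = i * i     # O(1) write by index, never triggers a resize
```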
We've discussed growing, but what happens when elements are removed?
Naive approach: never shrink
Many implementations never shrink automatically. If you add 1 million elements then remove 999,999, you still have capacity for 1 million. This is simple but can waste memory.
Shrinking strategy: halve the capacity when utilization drops below a threshold (e.g., 25%).
If you shrink at 50% utilization and grow at 100%, a sequence of alternating appends and deletes right at the boundary triggers a resize on every single operation. That's why we shrink at 25%, not 50%—there's a buffer zone between the grow and shrink thresholds to prevent thrashing.
Example threshold design:
```
           Shrink                          Grow
           threshold                       threshold
              ↓                               ↓
├─────────────┼───────────────────────────────┤
0%           25%                           100%
```
Buffer zone: 25% to 100% (75% of capacity)
Add at 100% → resize to 2× → now at 50% (in buffer)
Remove at 50% → now at 49% (still in buffer, no resize)
Remove at 26% → now at 25% (exactly at threshold)
Remove below 25% → shrink to 1/2 → now at 50% (in buffer)
The buffer zone prevents ping-ponging between grow and shrink.
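As a sketch of how this might look in code (not taken from any particular library), here is a pop method with a 25% shrink threshold, written as an extension of the DynamicArray class shown earlier:

```python
class ShrinkableDynamicArray(DynamicArray):
    """DynamicArray from above, extended with a pop() that shrinks at 25% utilization."""

    def pop(self):
        """Remove and return the last element. O(1) amortized."""
        if self._size == 0:
            raise IndexError("pop from empty array")
        self._size -= 1
        item = self._data[self._size]
        self._data[self._size] = None          # release the reference
        # Shrink by half when utilization drops below 25%,
        # but never below a small minimum capacity.
        if self._capacity > 4 and self._size < self._capacity // 4:
            self._shrink()
        return item

    def _shrink(self):
        """Halve capacity and copy surviving elements. O(n) but rarely called."""
        new_capacity = self._capacity // 2
        new_data = [None] * new_capacity
        for i in range(self._size):
            new_data[i] = self._data[i]
        self._data = new_data
        self._capacity = new_capacity
```

Right after a shrink, utilization sits just under 50%, which lands back in the middle of the buffer zone rather than at either threshold.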
We've explored one of the most elegant ideas in data structure design. Let's consolidate:
The Complete Array Operations Picture:
| Operation | Complexity | Notes |
|---|---|---|
| Access by index | O(1) | Always constant |
| Search (unsorted) | O(n) | Must scan |
| Search (sorted) | O(log n) | Binary search |
| Insert at end | O(1) amortized | Dynamic array |
| Insert elsewhere | O(n) | Must shift |
| Delete at end | O(1) | No shift needed |
| Delete elsewhere | O(n) | Must shift |
You now have a complete understanding of array operations and their costs!
Congratulations! You've mastered the cost model of core array operations. You understand WHY arrays excel at access and struggle with insertions—and the elegant amortized analysis that makes dynamic arrays practical. This knowledge forms the foundation for evaluating when arrays are the right choice versus when you need alternative structures.