Here's a puzzle: dynamic arrays (like ArrayList in Java or list in Python) claim O(1) append time. But occasionally, when the array is full, they must:
- Allocate a new, larger array
- Copy all n existing elements into it
- Then perform the insert
Copying n elements is O(n), which is clearly not O(1). So how can we claim O(1) append?
The answer is amortized analysis—a technique that analyzes the average cost per operation over a sequence of operations, rather than the worst case of a single operation.
The key insight:
That expensive O(n) copy happens so infrequently that when we spread its cost across all the cheap O(1) appends that preceded it, the average cost per operation is still O(1).
By the end of this page, you will understand the intuition behind amortized analysis, why it differs from both worst-case and average-case analysis, and how to apply amortized thinking to common data structures. No formal proofs—we're building conceptual understanding.
The core idea:
Amortized analysis computes the average cost per operation over a worst-case sequence of operations.
Notice the nuance: we average over the operations in a sequence, but the sequence itself is the worst one possible. There is no randomness involved.
Why is this different from average case?
Average-case analysis assumes a probability distribution over inputs. Amortized analysis makes no distributional assumptions—it guarantees that any sequence of n operations takes at most f(n) total time, so the average is at most f(n)/n per operation.
Intuitive analogy: Paying off debt
Imagine expensive operations as "going into debt." Cheap operations "pay down the debt." If expensive operations are rare enough, the cheap operations can cover the debt, keeping the average cost low.
Formal definition (simplified):
For a data structure supporting a sequence of operations, the amortized cost per operation is the total cost of a worst-case sequence of n operations divided by n.
If we can show that any sequence of n operations takes O(n) total time, then the amortized cost per operation is O(1).
What amortized analysis IS:
- A worst-case guarantee on the total cost of any sequence of operations
- An average taken over the operations in that sequence, with no probabilistic assumptions
What amortized analysis is NOT:
- A bound on any individual operation (a single operation can still be expensive)
- Average-case analysis over a distribution of "typical" inputs
This distinction is crucial. Average-case says 'on typical inputs, expect this cost.' Amortized says 'over ANY sequence of operations, the average per-operation is at most this cost.' Amortized is a stronger guarantee—it's about worst-case sequences, not typical inputs.
The dynamic array's append operation is the canonical example of amortized analysis. Let's work through it carefully.
How dynamic arrays grow:
A dynamic array maintains:
- A size: how many elements are currently stored
- A capacity: how many slots the underlying array has
When an append finds size equal to capacity, it allocates a new underlying array of double the capacity, copies every element over, and then inserts.
Cost breakdown:
- Append with spare capacity: O(1) (write into the next free slot)
- Append when full: O(n) to copy the existing elements, plus O(1) to insert
Let's trace appending n elements to an initially empty array with capacity 1:
| Append # | Capacity Before | Operation | Cost |
|---|---|---|---|
| 1 | 1 | Fits, insert | 1 |
| 2 | 1 | Full! Resize to 2, copy 1, insert | 2 |
| 3 | 2 | Full! Resize to 4, copy 2, insert | 3 |
| 4 | 4 | Fits, insert | 1 |
| 5 | 4 | Full! Resize to 8, copy 4, insert | 5 |
| 6-8 | 8 | Fits, insert (each) | 1 each |
| 9 | 8 | Full! Resize to 16, copy 8, insert | 9 |
Counting total cost for n appends:
Resizing happens at capacities 1, 2, 4, 8, 16, ... up to n.
Total copy cost: 1 + 2 + 4 + 8 + ... + n/2 = n - 1 (geometric series sums to ~n)
Total insert cost: n (one insert per append)
Total cost for n appends: ~2n = O(n)
Amortized cost per append: O(n) / n = O(1)
Despite occasional O(n) resizes, the amortized cost per operation is constant!
The O(1) amortized bound depends on doubling capacity, not incrementing it. If we increased capacity by 1 each time (resize at every insert), resizing would cost 1+2+3+...+n = O(n²) total, giving O(n) amortized per insert. Doubling ensures each element is copied at most O(log n) times total (once per doubling), yielding O(1) amortized.
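To make the doubling argument concrete, here is a minimal sketch of a doubling dynamic array in Python that counts element copies. The class name DynamicArray and the total_copies counter are illustrative only; real implementations (CPython's list, Java's ArrayList) use more carefully tuned growth policies.

```python
# Minimal sketch of a doubling dynamic array that counts element copies.
class DynamicArray:
    def __init__(self):
        self.capacity = 1
        self.size = 0
        self.data = [None] * self.capacity
        self.total_copies = 0          # bookkeeping for the analysis only

    def append(self, value):
        if self.size == self.capacity:
            self._resize(2 * self.capacity)   # double when full
        self.data[self.size] = value
        self.size += 1

    def _resize(self, new_capacity):
        new_data = [None] * new_capacity
        for i in range(self.size):            # O(size) copy
            new_data[i] = self.data[i]
        self.total_copies += self.size
        self.data = new_data
        self.capacity = new_capacity

arr = DynamicArray()
n = 2 ** 20
for i in range(n):
    arr.append(i)
# For n a power of two, total copies = 1 + 2 + 4 + ... + n/2 = n - 1.
print(arr.total_copies)   # 1048575, i.e. n - 1; total work stays under 2n
```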
The accounting intuition:
Think of it this way: charge each append 3 units of "credit". One unit pays for the insert itself, one is banked to pay for copying this element at the next resize, and one is banked to pay for copying an older element whose credit was already spent.
With this charging scheme, resizes are "free" because the copying has been pre-paid, and the charge per append is still a constant.
This "banker's method" is one formal approach to proving amortized bounds. The intuition: cheap operations save up credit; expensive operations spend it.
Amortized analysis is applicable when:
1. Expensive operations are rare and enable cheap operations
The pattern is: do something expensive occasionally that makes subsequent operations cheap.
2. Expensive operations have diminishing frequency
The key is that expensive operations can't happen too often. In the dynamic array, a resize to capacity 2n cannot be followed by another resize until n more appends occur, so resizes become exponentially rarer as the structure grows.
3. We care about total work, not individual operation latency
Amortized analysis doesn't bound individual operations; some may be slow. Use it when total throughput matters more than the latency of any single operation: batch processing, bulk loading, or any workload judged by overall running time rather than per-operation deadlines.
| Data Structure | Expensive Operation | Amortized Cost | Why It Works |
|---|---|---|---|
| Dynamic Array | Resize (O(n)) | Append: O(1) | Doubling means resizes are rare |
| Hash Table | Rehash (O(n)) | Insert: O(1) | Doubling capacity limits rehash frequency |
| Stack (array-backed) | Resize (O(n)) | Push: O(1) | Same as dynamic array |
| Binary Counter | Cascading flips (O(log n) worst) | Increment: O(1) | High bits flip exponentially less often |
| Splay Tree | Rotations (O(n) worst) | Access: O(log n) | Recent accesses move to root |
When amortized analysis does NOT apply:
Real-time systems:
Amortized O(1) means some operations might take O(n). For real-time systems that must respond within a deadline, amortized bounds are insufficient.
Example: An embedded controller using a dynamic array might miss a deadline during resize. Real-time systems need worst-case O(1) approaches (like pre-allocated fixed arrays or incremental resizing).
Latency-sensitive operations:
If each operation needs consistent latency (not just a good average), an amortized bound can be misleading: the occasional expensive operation shows up directly as a user-visible latency spike.
Adversarial scenarios:
Amortized bounds still limit total work over any sequence, but they say nothing about when that work happens. An adversary who knows the data structure's state can time requests so that latency-critical moments coincide with the expensive operations, turning occasional spikes into a reliable attack on tail latency.
Never confuse amortized O(1) with worst-case O(1). Saying 'dynamic array append is O(1)' without the 'amortized' qualifier is technically imprecise. When latency matters, this distinction is crucial.
Let's explore more examples to cement the amortized analysis intuition.
Example 1: Binary Counter Increment
Consider a k-bit binary counter incrementing from 0 to 2ᵏ - 1:
000 → 001 → 010 → 011 → 100 → 101 → 110 → 111
Naive analysis (worst case): Incrementing can flip up to k bits (e.g., 0111 → 1000). k = O(log n) for n increments. Worst case per increment: O(log n).
Amortized analysis:
Bit 0 flips on every increment, bit 1 flips on every 2nd increment, bit 2 on every 4th, and in general bit i flips once every 2ⁱ increments.
Total flips for n increments: n + n/2 + n/4 + ... + 1 ≤ 2n = O(n)
Amortized cost per increment: O(n) / n = O(1)
Despite worst case O(log n), amortized is O(1)!
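The same counting can be checked empirically. This short sketch increments a counter represented as a list of bits and tallies every flip; the total stays under 2n. The function name total_flips is ours.

```python
# Count total bit flips while incrementing a binary counter n times.
def total_flips(n):
    bits = []              # bits[i] is the i-th least significant bit
    flips = 0
    for _ in range(n):
        i = 0
        # Carry: flip trailing 1s to 0 until we can flip a 0 to 1.
        while i < len(bits) and bits[i] == 1:
            bits[i] = 0
            flips += 1
            i += 1
        if i == len(bits):
            bits.append(1)
        else:
            bits[i] = 1
        flips += 1
    return flips

n = 1 << 16
print(total_flips(n), "<", 2 * n)   # 131071 < 131072
```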
Example 2: Multi-Pop Stack
Consider a stack with three operations:
- Push(x): push one element onto the stack (O(1))
- Pop(): pop the top element (O(1))
- MultiPop(k): pop up to k elements, stopping early if the stack empties
Naive analysis: MultiPop can be O(n). A sequence such as "push n elements, then MultiPop(n)" makes that final MultiPop cost O(n).
Amortized analysis:
Key insight: each element can only be popped once!
Starting from empty, after any sequence of n operations, the number of elements ever pushed is some P ≤ n, and each pushed element can be popped at most once (whether by Pop or by MultiPop).
Total work from all pops ≤ P ≤ n, so the total cost of the whole sequence is O(n).
Amortized cost per operation: O(1)
Even MultiPop is O(1) amortized because you can't pop more than you pushed.
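A minimal sketch of such a stack (the class name MultiPopStack is just for illustration). No single MultiPop can cost more than the number of pushes that preceded it.

```python
# Multi-pop stack: Push and Pop are O(1); MultiPop(k) pops up to k elements.
class MultiPopStack:
    def __init__(self):
        self.items = []

    def push(self, x):
        self.items.append(x)

    def pop(self):
        return self.items.pop() if self.items else None

    def multipop(self, k):
        popped = []
        while self.items and len(popped) < k:
            popped.append(self.items.pop())
        return popped

s = MultiPopStack()
for i in range(1000):
    s.push(i)
s.multipop(600)      # expensive in isolation (600 pops)...
s.multipop(600)      # ...but only 400 elements remain to pop
```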
Example 3: Hash Table with Rehashing
Hash tables rehash when the load factor (elements divided by buckets) exceeds a threshold, typically 0.7 or 0.75. Rehashing allocates a larger bucket array and re-inserts all n elements, costing O(n).
Amortized analysis (similar to dynamic array):
With doubling capacity, each rehash is preceded by enough cheap inserts to pay for it, so the total rehash cost over n inserts is the same geometric series as before: O(n) overall.
Amortized cost per insert: O(1)
Note: This is amortized on top of the expected O(1) for hash table operations. Hash tables have:
- Expected O(1) per operation, from the assumption that the hash function spreads keys evenly
- Amortized O(1) per insert, from the doubling rehash policy
Both analyses are needed for the complete picture.
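For concreteness, here is a toy chaining hash table that doubles and rehashes past a 0.75 load factor. It is a sketch of the scheme described above (class name ToyHashTable is ours), not how production hash tables such as CPython's dict or java.util.HashMap are actually implemented.

```python
# Toy chaining hash table: rehash (O(n)) when load factor exceeds 0.75.
class ToyHashTable:
    def __init__(self):
        self.capacity = 8
        self.size = 0
        self.buckets = [[] for _ in range(self.capacity)]

    def insert(self, key, value):
        if (self.size + 1) / self.capacity > 0.75:
            self._rehash(2 * self.capacity)       # rare, thanks to doubling
        bucket = self.buckets[hash(key) % self.capacity]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)          # update existing key
                return
        bucket.append((key, value))
        self.size += 1

    def _rehash(self, new_capacity):
        old_buckets = self.buckets
        self.capacity = new_capacity
        self.buckets = [[] for _ in range(new_capacity)]
        for bucket in old_buckets:                # re-insert every entry: O(n)
            for key, value in bucket:
                self.buckets[hash(key) % new_capacity].append((key, value))

table = ToyHashTable()
for i in range(10_000):
    table.insert(i, i * i)
```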
Notice the common pattern: operations 'bank' credit that expensive operations 'spend.' Push banks credit for pop; insert banks credit for resize; each bit flip banks credit for future flips. When expensive operations consume at most what cheap operations produced, amortized cost stays low.
Let's precisely distinguish these three analytical frameworks with a side-by-side comparison.
| Aspect | Worst Case | Average Case | Amortized |
|---|---|---|---|
| Focus | Single operation, single (worst) input | Single operation, distribution of inputs | Sequence of operations, worst sequence |
| Question answered | What's the max cost of one operation? | What's the expected cost over random inputs? | What's the average cost per operation in any sequence? |
| Probability? | No | Yes (input distribution) | No (but averages over ops in sequence) |
| Guarantee type | Every operation bounded | Average over inputs bounded | Total over sequence bounded |
| Individual operation | Bounded | Expected | May vary widely |
Visual intuition:
Imagine a graph of operation costs over a sequence:
Cost
|
| * (Worst case bounds every point)
| |
|---*-|------*--------- Worst case bound
| * | *
|* * * * *
|** * * * * *
+-----------------------> Operations
↑
Expensive op
For the graph above: the worst-case bound must sit above the single tallest spike, even though almost every operation costs far less. The amortized bound only requires that the total cost of the sequence stays low, so the occasional spike is acceptable as long as it is rare.
When documentation says 'O(1) insert,' ask: Is that worst-case O(1), expected O(1), or amortized O(1)? They have very different implications. ArrayList.add() is amortized O(1). LinkedList.addFirst() is worst-case O(1). Hash table insert is expected O(1) and amortized O(1) for resize. Precision matters.
Understanding amortized analysis changes how you reason about code and data structure choices.
Implication 1: ArrayList vs. LinkedList
The classic debate: Which is better for appending elements?
| Operation | ArrayList | LinkedList |
|---|---|---|
| Append | Amortized O(1) | Worst-case O(1) |
| Random access | O(1) | O(n) |
| Memory overhead | Low (contiguous) | High (node pointers) |
Despite LinkedList having "true" O(1) append (no resizing ever), ArrayList often wins because:
- Contiguous memory gives excellent cache locality, while LinkedList chases scattered node pointers
- Per-element memory overhead is far lower (no node objects or next/prev pointers)
- Random access is O(1) instead of O(n)
The lesson: Amortized O(1) is usually acceptable; don't choose worse data structures just to avoid occasional expensive operations.
Implication 2: Understanding "Occasional Spikes"
Profiling code might show occasional slow operations. Before panicking, check whether the spike is a designed amortized cost: does it occur at predictable sizes (a resize or rehash boundary), and does the average cost per operation stay low?
Knowing amortized analysis helps you triage: some spikes are designed; others are bugs.
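One way to see designed spikes in practice is to time individual appends and check whether the slow ones line up with growth events. A quick sketch using Python's built-in list; the exact growth policy and numbers are interpreter- and machine-dependent, and garbage collection can add unrelated spikes.

```python
import time

data, timings = [], []
for i in range(1_000_000):
    start = time.perf_counter()
    data.append(i)                       # the operation under test
    timings.append(time.perf_counter() - start)

# The handful of slow appends are the reallocations (plus occasional GC noise).
slowest = sorted(range(len(timings)), key=timings.__getitem__)[-5:]
print("slowest appends at indices:", sorted(slowest))
print("average append time:", sum(timings) / len(timings))
```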
Implication 3: Pre-sizing to avoid spikes
If you know the final size, pre-allocate:
# Python: pre-size a list
data = [None] * expected_size

// Java: pre-size an ArrayList
List<Integer> data = new ArrayList<>(expectedSize);

// Java: pre-size a HashMap
// (HashMap resizes at 75% load, so use an initial capacity of at least expectedSize / 0.75 to avoid rehashing entirely)
Map<K, V> map = new HashMap<>(expectedSize);
This eliminates resize spikes entirely, turning amortized O(1) into true O(1).
Implication 4: Incremental resizing
For systems that can't tolerate spikes, use incremental approaches: keep the old and new structures alive simultaneously and migrate a small, bounded amount of data on each subsequent operation instead of all at once (sketched after the note below).
These trade implementation complexity for bounded worst-case time.
Java's ConcurrentHashMap uses incremental rehashing: during resize, both old and new tables are active, and entries migrate gradually. This bounds worst-case time while maintaining amortized O(1). It's more complex but suitable for concurrent, latency-sensitive contexts.
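To show the shape of the idea without any concurrency machinery, here is a single-threaded sketch of incremental rehashing in Python. The class and constant names are invented for this illustration, duplicate keys are not handled, and ConcurrentHashMap's real transfer protocol is considerably more sophisticated.

```python
# Incremental rehashing sketch: when the table grows, keep both bucket arrays
# and migrate a few old buckets per insert, so no single insert pays O(n).
class IncrementalHashTable:
    MIGRATE_PER_OP = 4                    # old buckets migrated per insert (tunable)

    def __init__(self):
        self.capacity = 8
        self.size = 0
        self.buckets = [[] for _ in range(self.capacity)]
        self.old_buckets = None           # non-None while a migration is in progress
        self.migrate_index = 0

    def insert(self, key, value):         # no duplicate-key handling, for brevity
        self._migrate_some()
        if self.old_buckets is None and self.size + 1 > 0.75 * self.capacity:
            self._start_resize()
        self.buckets[hash(key) % self.capacity].append((key, value))
        self.size += 1

    def get(self, key):
        for k, v in self.buckets[hash(key) % self.capacity]:
            if k == key:
                return v
        if self.old_buckets is not None:  # entry may not have migrated yet
            for k, v in self.old_buckets[hash(key) % len(self.old_buckets)]:
                if k == key:
                    return v
        return None

    def _start_resize(self):
        self.old_buckets = self.buckets   # keep the old table alive
        self.capacity *= 2
        self.buckets = [[] for _ in range(self.capacity)]
        self.migrate_index = 0

    def _migrate_some(self):
        # Move a bounded number of old buckets into the new table.
        if self.old_buckets is None:
            return
        moved = 0
        while moved < self.MIGRATE_PER_OP and self.migrate_index < len(self.old_buckets):
            for key, value in self.old_buckets[self.migrate_index]:
                self.buckets[hash(key) % self.capacity].append((key, value))
            self.migrate_index += 1
            moved += 1
        if self.migrate_index == len(self.old_buckets):
            self.old_buckets = None       # migration complete

table = IncrementalHashTable()
for i in range(100_000):
    table.insert(i, i * i)
print(table.get(12345))                   # 152399025
```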
Amortized analysis provides a powerful middle ground between overly pessimistic worst-case bounds and probability-dependent average-case bounds.
What's next:
We've established the conceptual foundation. The next page brings amortized analysis to life with real-world examples—systems where amortized thinking explains observed performance and guides engineering decisions.
You now understand amortized analysis conceptually: what it means, when it applies, and how it differs from worst-case and average-case analysis. You can recognize amortized bounds in documentation and reason about their practical implications. Next, we'll see real-world applications of amortized thinking.