In the previous page, we claimed that dynamic arrays provide O(1) "amortized" time for append operations. But what exactly does "amortized" mean? And how can we claim O(1) when individual operations sometimes take O(n) time due to resizing?
This isn't handwaving or an approximation — amortized analysis is a rigorous mathematical technique used throughout computer science to analyze the true cost of operations in data structures with occasional expensive operations. Understanding amortized analysis transforms how you think about algorithm efficiency and is essential for evaluating any data structure with resize, rebalance, or cleanup operations.
By the end of this page, you will understand the formal concept of amortized analysis, be able to prove why dynamic array append is O(1) amortized using multiple analysis techniques, and recognize how this fundamental concept applies to many other data structures and algorithms.
Definition:
Amortized analysis is a method for analyzing the time complexity of a sequence of operations by averaging the total cost over all operations, even if some individual operations are expensive.
The key distinction from average-case analysis:
Amortized analysis is about guarantees, not probability. There's no randomness involved — we're precisely computing total work.
The intuition:
Imagine a $120/year subscription billed as a single annual payment. The "amortized" cost is $10 per month — even though January costs $120 and the other eleven months cost $0.
Similarly, if you append 1 million elements to a dynamic array:
If total cost is O(n) for n operations, amortized cost per operation is O(n)/n = O(1).
Formal definition:
Given a sequence of n operations with individual costs c₁, c₂, ..., cₙ:
Amortized cost per operation = (c₁ + c₂ + ... + cₙ) / n
We say an operation has O(f(n)) amortized cost if the total cost of any n operations is O(n × f(n)).
Amortized cost is not about probability or expected values. It's about the guaranteed total cost of a sequence of operations. Every sequence of n appends has a deterministic, predictable total cost. We're dividing that deterministic total by n.
The simplest amortized analysis technique is the aggregate method: compute the total cost of n operations directly, then divide by n.
Applying to dynamic array append:
Let's analyze appending n elements to an initially empty dynamic array with doubling strategy (growth factor 2).
The costs:
A normal append writes one element: 1 unit of work. An append that triggers a resize must first copy all existing elements to the new allocation: size units of copying, plus 1 unit for the write itself.
When do resizes occur?
With doubling and initial capacity 1:
Resizes happen when size is 1, 2, 4, 8, ..., up to some 2^k where 2^k ≤ n < 2^(k+1).
Total resize cost:
Total elements copied = 1 + 2 + 4 + 8 + ... + 2^k
This is a geometric series with closed form:
1 + 2 + 4 + ... + 2^k = 2^(k+1) - 1 < 2 × 2^k ≤ 2n
So total elements copied across ALL resizes is less than 2n.
Total cost of n appends:
Element writes contribute n units, and resize copying contributes fewer than 2n units, so the total is fewer than 3n units.
Amortized cost per append:
Amortized cost = Total cost / n = 3n / n = 3 = O(1)
Each append to a dynamic array costs O(1) amortized. The total work for n appends is O(n), meaning the average work per append is constant. Even though some individual appends cost O(n), they're rare enough that the total remains linear.
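The aggregate argument is easy to check empirically. Below is a small sketch using the unit-cost model from this page (one unit per element write, one unit per element copied during a resize) that tallies the exact total cost of n appends:

```python
def total_append_cost(n):
    """Simulate n appends with capacity doubling; return total unit cost."""
    size, capacity, cost = 0, 1, 0
    for _ in range(n):
        if size == capacity:        # array full: resize before appending
            cost += size            # copy every existing element
            capacity *= 2
        cost += 1                   # write the new element
        size += 1
    return cost

for n in (10, 1_000, 1_000_000):
    assert total_append_cost(n) < 3 * n   # total < 3n, so amortized < 3
```

For n = 10, the simulation gives exactly 25 units (10 writes plus 1 + 2 + 4 + 8 = 15 copies), comfortably under the 3n = 30 bound.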
The geometric series insight:
The magic is in the geometric series. With doubling:
1 + 2 + 4 + 8 + ... + n/2 + n = 2n - 1 (for n a power of two)
The sum of all previous terms is less than the last term! This is why doubling works — the total cost of ALL previous resizes is bounded by the cost of one resize at the current capacity.
Another way to see it: each element is copied at most log₂(n) times across all resizes (once when array has 2 elements, once when it has 4, once when it has 8, ...). So n elements × log₂(n) copies each? No! Most elements are added late and copied few times. The average element is copied O(1) times.
The accounting method is a more intuitive approach: we assign each operation an "amortized cost" that may differ from its actual cost. The extra charge (when amortized > actual) is stored as "credit" to pay for expensive operations later.
The rule:
The total amortized cost charged so far must always be at least the total actual cost so far — equivalently, the stored credit must never go negative. If this holds for every prefix of the operation sequence, the amortized costs are valid upper bounds on the true cost.
Applying to dynamic array:
Let's assign each append an amortized cost of 3 units:
How does this pay for resizes?
Consider the moments before and after a resize from capacity c to 2c:
Before resize:
The array is full: size = capacity = c. Since the last resize (when the array held c/2 elements), c/2 new elements were appended, each depositing 2 spare credits — a total of c credits stored.
During resize:
Copying the c elements to the new allocation costs c units, which is paid exactly by the c stored credits.
After resize:
The new array has capacity 2c and is half full, with zero credits remaining — ready to start saving for the next resize.
The accounting always works out! Each element "prepays" for its future copying costs.
1 unit for the append itself. 1 unit for copying THIS element in the next resize. 1 unit for copying a PREVIOUS element that was already in the array before this append. Each new element helps pay for an old element's copying too!
At any moment, the number of credits stored equals 2×(size - capacity/2). When size = capacity, credits = capacity. This exactly matches the resize cost. When size = capacity/2, credits = 0 (just after a resize).
The accounting method conclusion:
Since we can pay for all operations (including resizes) with amortized cost 3 per append, and credits never go negative, the amortized cost per append is O(1).
This approach is often more intuitive than the aggregate method because it explicitly shows HOW the expensive operations are paid for. Each cheap operation "saves up" for the occasional expensive one.
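The accounting scheme can be checked mechanically as well. This sketch charges each append 3 units, pays all actual costs out of the credit balance, and verifies the balance never goes negative:

```python
def check_accounting(n, charge=3):
    """Charge each append `charge` units; verify credit never goes negative."""
    size, capacity, balance = 0, 1, 0
    for _ in range(n):
        balance += charge           # amortized charge for this append
        if size == capacity:        # resize: pay 1 unit per copied element
            balance -= size
            capacity *= 2
        balance -= 1                # pay for writing the new element
        assert balance >= 0, "credit went negative"
        size += 1
    return balance

check_accounting(1_000_000)         # a charge of 3 per append suffices
```

Lowering the charge to 2 makes the assertion fire on the first resize it cannot cover, which is a nice way to convince yourself the constant 3 is genuinely needed in this scheme.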
The potential method is the most powerful (and most mathematical) amortized analysis technique. It defines a "potential function" Φ that maps data structure states to non-negative numbers, similar to potential energy in physics.
The formula:
Amortized cost = Actual cost + ΔΦ
Where ΔΦ = Φ(after) - Φ(before) is the change in potential.
Requirements:
Φ must be non-negative for every reachable state, and Φ of the initial (empty) structure should be 0 (or at least bounded). Under these conditions, the total amortized cost is at least the total actual cost, so amortized bounds are valid.
For dynamic array:
Define potential function:
Φ(array) = 2 × size - capacity
Where size = number of elements, capacity = allocated slots.
Verify the requirements:
After the first resize, a doubling array is always at least half full, so 2 × size ≥ capacity and Φ ≥ 0. For the initial empty array Φ is a bounded constant, which does not affect the asymptotic result.
Analysis of append (no resize):
Actual cost = 1 (write the element)
ΔΦ = (2(size+1) - capacity) - (2×size - capacity)
= 2(size+1) - 2×size
= 2
Amortized cost = 1 + 2 = 3 = O(1)
Analysis of append (with resize from capacity c to 2c):
Before: size = c, capacity = c, Φ = 2c - c = c
After: size = c+1, capacity = 2c, Φ = 2(c+1) - 2c = 2
Actual cost = 1 (write new element) + c (copy c elements) = c + 1
ΔΦ = 2 - c = -(c-2)
Amortized cost = (c + 1) + (2 - c) = 3 = O(1)
The expensive resize cost (c copies) is exactly offset by the large decrease in potential! The potential we built up over many cheap operations is "released" during the expensive operation.
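The potential-method arithmetic can be verified operation by operation: with Φ = 2 × size − capacity, every append (resize or not) should have actual cost plus ΔΦ equal to exactly 3. A small sketch:

```python
def phi(size, capacity):
    return 2 * size - capacity      # the potential function from the text

def check_potential(n):
    """Verify: actual cost + change in potential == 3 for every append."""
    size, capacity = 0, 1
    for _ in range(n):
        before = phi(size, capacity)
        actual = 1                  # write the new element
        if size == capacity:        # resize: copy all `size` elements
            actual += size
            capacity *= 2
        size += 1
        amortized = actual + (phi(size, capacity) - before)
        assert amortized == 3
    return size

check_potential(1_000_000)
```

Note how the two cases in the text reappear here: a cheap append has actual cost 1 and ΔΦ = +2, while a resizing append has actual cost c + 1 and ΔΦ = 2 − c; both sum to 3.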
Aggregate, Accounting, and Potential methods all prove the same result: dynamic array append is O(1) amortized. The Potential method is most formal and generalizes best to complex data structures. The Accounting method is most intuitive. The Aggregate method is simplest for straightforward cases.
We've proven that doubling gives O(1) amortized append. Does this hold for other growth factors? The answer reveals fascinating trade-offs.
General growth factor α > 1:
With growth factor α, capacity sequence is:
c₀, αc₀, α²c₀, α³c₀, ...
For n appends, number of resizes is approximately log_α(n).
Total copying work:
c₀ + αc₀ + α²c₀ + ... + c₀αᵏ (where αᵏ ≈ n)
= c₀(αᵏ⁺¹ - 1)/(α - 1)
≈ αn/(α - 1)
= O(n) for any constant α > 1
So any constant growth factor > 1 gives O(1) amortized append!
But the constants matter:
Total copying work is approximately αn/(α - 1):
| Growth Factor α | Copies per n appends | Amortized copies/append |
|---|---|---|
| 2.0 | 2n | 2 |
| 1.5 | 3n | 3 |
| 1.25 | 5n | 5 |
| 1.1 | 11n | 11 |
Smaller growth factors mean more resize operations and more total copying.
| Factor | Copies/Append | Max Wasted Space | Resizes for n=1M |
|---|---|---|---|
| 2.0 | ~2 | 100% | 20 |
| 1.5 | ~3 | 50% | 35 |
| 1.25 | ~5 | 25% | 62 |
| 1.1 | ~11 | 10% | 145 |
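These per-append copy counts can be confirmed by simulation. The sketch below grows capacity by `ceil(capacity × α)` (one plausible rounding choice; real implementations vary) and checks that total copying stays under α/(α − 1) per append:

```python
import math

def total_copies(n, alpha):
    """Count elements copied across all resizes with growth factor alpha."""
    size, capacity, copies = 0, 1, 0
    for _ in range(n):
        if size == capacity:        # full: grow by factor alpha
            copies += size
            capacity = math.ceil(capacity * alpha)
        size += 1
    return copies

n = 1_000_000
for alpha in (2.0, 1.5, 1.25, 1.1):
    # copies per append is bounded by alpha / (alpha - 1): 2, 3, 5, 11
    assert total_copies(n, alpha) < alpha / (alpha - 1) * n
```

The bound is a worst case over n; depending on where n falls between resizes, the observed ratio can be noticeably lower.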
The space-time trade-off:
A larger growth factor means fewer resizes and less total copying, but more wasted space after each resize; a smaller factor wastes less space but copies more. There is no universally best factor — implementations pick different points on this curve.
Why not growth factor 1.0 (adding a constant)?
With constant growth (e.g., adding 100 slots each time), the resize cost analysis breaks down:
Total copies = c₀ + (c₀+100) + (c₀+200) + ...
= O(n²/100) = O(n²)
This is quadratic! Linear growth does not amortize to O(1).
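The quadratic blow-up from additive growth also shows up immediately in simulation — doubling n roughly quadruples the total copying:

```python
def copies_additive(n, k=100):
    """Total elements copied when each resize adds a constant k slots."""
    size, capacity, copies = 0, k, 0
    for _ in range(n):
        if size == capacity:        # full: add k more slots
            copies += size
            capacity += k
        size += 1
    return copies

# Total copies ≈ n² / (2k): doubling n roughly quadruples the work.
a, b = copies_additive(100_000), copies_additive(200_000)
assert 3.9 < b / a < 4.1
```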
For O(1) amortized append, growth must be multiplicative with a factor strictly greater than 1. Additive growth (adding a constant number of slots) results in O(n) amortized append. The factor 2 is popular because it's simple, relatively space-efficient, and has a clean analysis.
Understanding amortized cost has practical implications for how you use dynamic arrays:
1. Occasional latency spikes
Amortized O(1) doesn't mean every operation is fast. An append that triggers a resize of a large array can pause while millions of elements are copied — a real problem for latency-sensitive code.
Mitigation strategies: pre-size the structure when you can estimate the final element count (e.g., Java's `new ArrayList<>(expectedSize)` or C++'s `vector::reserve`), or choose a structure with worst-case O(1) append when spikes are unacceptable.
2. Memory allocation overhead
Each resize involves a memory allocation, which has its own overhead: the allocator must find a suitable block, and the old block must eventually be freed or reused.
3. Cache effects during resize
Copying all elements during a resize touches the entire array, causing a burst of memory traffic and evicting other useful data from CPU caches.
4. When amortized analysis fails you
Amortized analysis assumes you perform many operations. If you perform only a few operations, hit a resize at a latency-critical moment, or need hard worst-case guarantees (as in real-time systems), the amortized perspective may not reflect your actual experience. The expensive operations, though rare, do happen.
5. Capacity hints are valuable
Most dynamic array implementations allow specifying initial capacity:
arr = []              # Default capacity, will resize as it grows
arr = [None] * 1000   # Pre-sized list of 1000 slots, filled by index (arr[i] = x)
If you know approximately how many elements you'll add, pre-allocating can eliminate all resizes, turning amortized O(1) into actual O(1).
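In CPython specifically, `list` exposes no explicit capacity/reserve API, so the pre-allocation pattern means creating a full-length list up front and filling it by index rather than appending. A minimal comparison of the two patterns:

```python
n = 1_000

# Growing by append: the list resizes internally as it grows.
grown = []
for i in range(n):
    grown.append(i * i)

# Pre-allocated: one full-length list up front, filled by index.
prealloc = [None] * n
for i in range(n):
    prealloc[i] = i * i

assert grown == prealloc    # same final contents either way
```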
Amortized analysis provides a theoretical guarantee, but real performance depends on cache behavior, allocator efficiency, memory pressure, and more. When performance is critical, profile actual behavior. Use capacity hints when you can estimate sizes. Be aware that occasional expensive operations exist.
The amortized analysis techniques we've learned apply to many data structures. Understanding this pattern helps you recognize and analyze similar situations.
Hash tables:
Hash tables resize when the load factor exceeds a threshold. The analysis is nearly identical to dynamic arrays:
Splay trees:
Splay trees move accessed nodes to the root, occasionally requiring O(n) rotations. Through potential function analysis, operations are O(log n) amortized.
Union-Find with path compression:
Each operation might walk a long path, but the work "flattens" the structure for future operations. Amortized analysis (using inverse Ackermann function) proves near-O(1) operations.
Fibonacci heaps:
DecreaseKey is O(1) amortized despite occasionally triggering cascading cuts. Potential function tracks trees and marked nodes.
The core insight of amortized analysis is universal: expensive operations can be 'paid for' by cheap operations if the cheap operations are sufficiently more frequent. This applies to algorithms, data structures, resource management, and even economic and physical systems. Recognizing this pattern is a key thinking tool for computer scientists.
We have developed a rigorous understanding of amortized analysis and its application to dynamic arrays. Let's consolidate the key insights:
What's next:
We've now analyzed both static and dynamic arrays, plus the amortized cost of dynamic array operations. The final page synthesizes everything: Trade-offs between Static and Dynamic Arrays — when to choose which, and how to make informed engineering decisions based on your specific constraints.
You now understand amortized analysis — a fundamental technique in algorithm analysis that extends far beyond dynamic arrays. The ability to prove and reason about amortized costs is a distinguishing skill of experienced engineers. Next, we'll apply our complete understanding to make practical trade-off decisions between static and dynamic arrays.