We've seen how Dijkstra's algorithm works, traced through examples, and analyzed its complexity. But a fundamental question remains: How do we know it always finds the shortest paths? Not just for specific examples, but for any graph with non-negative weights?
The answer lies in a formal proof of correctness. This isn't just academic rigor—understanding the proof reveals the deep structure of the algorithm and why each piece is necessary. When you understand why the algorithm works, you can confidently modify it, adapt it to new problems, and debug it when implementation errors creep in.
By the end of this page, you will understand: (1) The formal statement of correctness, (2) The inductive proof structure, (3) Why each algorithmic choice is necessary, and (4) How to reason about algorithm correctness in general.
Before proving anything, we must state precisely what we're proving. Mathematics demands clarity.
Formal Statement of Correctness:
THEOREM (Correctness of Dijkstra's Algorithm):

Let G = (V, E, w) be a weighted directed graph where:
  - V is the set of vertices
  - E is the set of directed edges
  - w: E → ℝ≥0 is a weight function with all non-negative weights

Let s ∈ V be a designated source vertex. Then upon termination of Dijkstra's algorithm:

  1. For every vertex v reachable from s: dist[v] = δ(s, v),
     where δ(s, v) is the weight of the shortest path from s to v.
  2. For every vertex v not reachable from s: dist[v] = ∞.
  3. The parent pointers define a shortest-path tree rooted at s.

Breaking down the claim:
Part 1 says the algorithm correctly computes shortest path distances for all reachable vertices. This is the main result.
Part 2 says unreachable vertices are correctly identified (they keep their initial ∞ distance). This is almost trivial—no relaxation can lower their distance.
Part 3 says the parent pointers form a valid shortest-path tree. This follows from Part 1 plus the Path Relaxation Property we established earlier.
Our proof will focus on Part 1, the substantive claim.
Notice that the theorem explicitly requires w(e) ≥ 0 for all edges. The proof will show exactly where this constraint is used. Without it, the theorem is false—Dijkstra's algorithm can return incorrect results with negative edges.
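To keep the theorem's claims concrete, here is a minimal Python sketch of the algorithm being proven correct, in the lazy-deletion style discussed earlier in the module (the variable names and dictionary-based graph representation are illustrative, not the only way to implement it):

```python
import heapq
import math

def dijkstra(graph, s):
    """Lazy-deletion Dijkstra. graph: {u: [(v, w), ...]} with all w >= 0."""
    dist = {v: math.inf for v in graph}   # Upper Bound Property: dist[v] >= delta(s, v)
    dist[s] = 0                           # anchor: delta(s, s) = 0
    parent = {s: None}
    processed = set()                     # the set S from the invariant
    pq = [(0, s)]                         # (distance estimate, vertex)
    while pq:
        d, u = heapq.heappop(pq)          # extract minimum dist among unprocessed
        if u in processed:
            continue                      # stale duplicate entry: skip
        processed.add(u)                  # the theorem claims dist[u] is now final
        for v, w in graph[u]:
            if d + w < dist[v]:           # edge relaxation
                dist[v] = d + w
                parent[v] = u
                heapq.heappush(pq, (dist[v], v))
    return dist, parent
```

Unreachable vertices are never pushed onto the queue, so they keep their initial ∞, matching Part 2 of the theorem.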
To prove the algorithm correct, we use a loop invariant—a property that: (1) holds before the loop begins, (2) is preserved by every iteration, and (3) at termination implies the result we want.
Our invariant:
INVARIANT: At the start of each iteration of the while loop:

  For every vertex v in the processed set S: dist[v] = δ(s, v)

That is, every processed vertex has its correct shortest-path distance.

Corollary (from invariant + termination): When the algorithm terminates, S contains all reachable vertices, and all have correct shortest-path distances.

Proof outline:
Initialization: Before the first iteration, S = ∅ (empty set). The invariant is vacuously true—there are no processed vertices to have incorrect distances.
Maintenance: We show that if the invariant holds before an iteration, it holds after. This is the heart of the proof.
Termination: When the loop ends, every reachable vertex is in S. By the invariant, all have correct distances.
Let's tackle each part.
Loop invariants are the standard way to prove iterative algorithms correct. The technique mirrors mathematical induction: establish a base case, show the property is preserved, then conclude the final result. Master this pattern to reason about any loop-based algorithm.
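The invariant can even be checked mechanically during execution. The sketch below (an illustrative instrumentation, not part of the algorithm itself) computes reference distances by brute-force repeated relaxation, then asserts the invariant each time a vertex is finalized:

```python
import heapq
import math

def true_distances(graph, s):
    """Reference delta(s, v) via |V|-1 rounds of relaxing every edge."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist

def dijkstra_check_invariant(graph, s):
    """Dijkstra that asserts the loop invariant at every finalization."""
    delta = true_distances(graph, s)
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    processed = set()
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in processed:
            continue
        processed.add(u)
        # INVARIANT: every processed vertex has its true shortest distance.
        assert all(dist[v] == delta[v] for v in processed)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist
```

Running this on any graph with non-negative weights never trips the assertion; the proof on this page explains why.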
The base case is straightforward. We need to show the invariant holds before the first iteration of the main loop.
State at initialization:
After initialization, before the first iteration:

  dist[s] = 0
  dist[v] = ∞ for all v ≠ s
  processed = { }  (empty set)
  Priority queue contains: { (s, 0) }

Checking the invariant: "For every vertex v in the processed set: dist[v] = δ(s, v)"

The processed set is empty. There are no vertices in the processed set. The statement "for every vertex v in S, [property]" is vacuously true when S is empty.

Therefore, the invariant holds at initialization. ✓

About vacuous truth:
A statement of the form "for all x in S, P(x)" is vacuously true when S is empty. There are no elements to provide a counterexample. This might seem like a technicality, but it's important for mathematical rigor.
Think of it this way: if I say "every unicorn in this room can fly," and there are no unicorns in the room, I haven't made a false statement. You can't point to a non-flying unicorn to contradict me.
Besides the invariant, note that dist[s] = 0 = δ(s, s) is correct (distance from source to itself is 0), and dist[v] = ∞ ≥ δ(s, v) for all other vertices (infinity is an upper bound on any distance). This establishes the Upper Bound Property we discussed earlier.
This is the heart of the proof. We must show that if the invariant holds at the start of an iteration, it holds at the end.
Setup:
Assume the invariant holds: every vertex in S has its correct shortest-path distance.
Let u be the vertex extracted from the priority queue (minimum dist among non-processed vertices).
We must prove: dist[u] = δ(s, u)
Once we prove this, adding u to S maintains the invariant.
CLAIM: When u is extracted with minimum dist[u] among unprocessed vertices, dist[u] = δ(s, u).

PROOF (by contradiction):

Suppose dist[u] > δ(s, u). (That is, we have an incorrect, too-high distance estimate for u.)

Let P be a true shortest path from s to u with weight δ(s, u). P starts at s (which is in S) and ends at u (which is not in S). Therefore, P must cross from S to V \ S at some point.

Let y be the FIRST vertex along P that is not in S, and let x be the vertex immediately before y on P. Note: x is in S, y is not in S, and (x, y) is an edge on P.

  s ──...──> x ────> y ──...──> u
  └── in S ──┘       └──── not in S ────┘

Since x ∈ S, by the invariant:
  dist[x] = δ(s, x)                          [x has correct distance]

When x was processed, we relaxed edge (x, y):
  dist[y] ≤ dist[x] + w(x, y) = δ(s, x) + w(x, y)

Since P is a shortest path and x, y are consecutive on it:
  δ(s, y) = δ(s, x) + w(x, y)

Therefore:
  dist[y] ≤ δ(s, y)

But we also know (from the Upper Bound Property):
  dist[y] ≥ δ(s, y)

Therefore:
  dist[y] = δ(s, y)                          [y has the correct distance!]

Now, since y lies on the shortest path P from s to u, and P continues on to u:
  δ(s, y) ≤ δ(s, u)                          [can't get shorter by going farther]

  This uses the NON-NEGATIVE WEIGHT assumption! If weights could be
  negative, a longer path might have smaller weight.

We assumed dist[u] > δ(s, u), so:
  dist[y] = δ(s, y) ≤ δ(s, u) < dist[u]

But we extracted u because it had MINIMUM dist among unprocessed vertices, and y is unprocessed (y ∉ S by construction). So dist[u] ≤ dist[y].

CONTRADICTION: We can't have both dist[y] < dist[u] and dist[u] ≤ dist[y].

Therefore, our assumption was wrong, and dist[u] = δ(s, u). ∎

The critical step is "δ(s, y) ≤ δ(s, u) since y comes before u on the path." This only holds with non-negative edge weights! With negative weights, going farther along the path could decrease the distance, breaking this inequality and the entire proof.
We've shown the invariant is maintained through each iteration. Now we must show the algorithm terminates and the conclusion follows.
Termination:
TERMINATION PROOF:

The algorithm terminates when the priority queue is empty.

Progress argument:
  - Each iteration extracts exactly one element from the priority queue.
  - Each vertex is added to S at most once (the processed check prevents
    re-addition), so at most V extractions lead to finalization.
  - With lazy deletion, the priority queue receives at most O(V + E)
    entries in total: one for the source, plus at most one per
    successful edge relaxation.
  - Each extraction either finalizes a vertex or skips a stale duplicate.
  - Total iterations ≤ V (finalizations) + O(E) (skips).
  - A finite bound on iterations → the algorithm terminates. ✓

More formally: edges are relaxed only when their tail vertex is finalized, and each vertex is finalized at most once, so at most E + 1 entries are ever pushed onto the queue. Each iteration pops exactly one entry, and pops cannot exceed pushes, so the loop runs at most E + 1 times. ✓

Conclusion from invariant:
When the algorithm terminates, the priority queue is empty. Every vertex reachable from s was pushed onto the queue at some point (when its predecessor along a path from s was finalized and the connecting edge relaxed), so every reachable vertex has been extracted and added to S. By the invariant, each such vertex satisfies dist[v] = δ(s, v).
What about unreachable vertices?
Unreachable vertices are never added to the priority queue (no path leads to them from s). Their distance remains ∞, which correctly represents "no path exists."
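This behavior is easy to observe directly. In the sketch below (illustrative graph; the vertex 'x' has no incoming edges), 'x' is never pushed onto the queue, so its distance stays ∞:

```python
import heapq
import math

def dijkstra(graph, s):
    """Minimal lazy-deletion Dijkstra; returns final distance estimates."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    done = set()
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

# 'x' is unreachable from 's': no relaxation ever lowers dist['x'],
# so it keeps its initial value of infinity.
g = {'s': [('a', 2)], 'a': [], 'x': []}
```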
We've proven that Dijkstra's algorithm correctly computes shortest-path distances for all vertices in a weighted graph with non-negative edge weights. The proof relied on: (1) the loop invariant technique, (2) the non-negative weight assumption for the key inequality, and (3) the greedy min-extraction property.
Let's reflect on what the proof tells us about the algorithm's structure. Each piece of the algorithm corresponds to a piece of the proof.
Why initialize dist[s] = 0?
Because δ(s, s) = 0. The source vertex has distance 0 to itself. This is the anchor of our induction—the first correct distance.
Why use a priority queue keyed by distance?
The proof relies on extracting the vertex with minimum dist. This greedy choice ensures that when we process a vertex, no later path through unprocessed vertices can be shorter.
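To see that the extraction order is what makes finalization safe, here is a sketch (the helper structure and the small test graph are my own, chosen to expose the difference) that runs the same relaxation loop with either a priority queue or a plain FIFO queue:

```python
import heapq
import math
from collections import deque

def sssp(graph, s, use_priority_queue):
    """Same finalize-and-relax loop; only the extraction order differs."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    done = set()
    if use_priority_queue:
        pq = [(0, s)]
        pop = lambda: heapq.heappop(pq)[1]
        push = lambda v: heapq.heappush(pq, (dist[v], v))
        empty = lambda: not pq
    else:
        q = deque([s])
        pop = q.popleft          # FIFO: finalize in discovery order
        push = q.append
        empty = lambda: not q
    while not empty():
        u = pop()
        if u in done:
            continue
        done.add(u)              # u is finalized and never revisited
        for v, w in graph[u]:
            if v not in done and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                push(v)
    return dist

# s->A costs 5 directly, but only 2 via B. FIFO order finalizes A first
# (it was discovered first) and locks in the wrong distance.
g = {'s': [('A', 5), ('B', 1)], 'A': [], 'B': [('A', 1)]}
```

With the priority queue, B (distance 1) is processed before A, so the cheaper path s→B→A is found in time; with the FIFO queue, A is finalized at 5.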
Why doesn't this work with negative weights?
The proof breaks at: "δ(s, y) ≤ δ(s, u) since y comes before u on the shortest path."
With negative edges, a path could pass through y, then traverse negative edges, and reach u with a total weight smaller than δ(s, y). The intuition that "distances never decrease as we move along a shortest path" fails.
Example failure with negative edges:
Graph:
  s ──(1)──> A
  s ──(3)──> C ──(-3)──> A

True shortest path to A: s → C → A, with weight 3 + (-3) = 0.

Dijkstra's execution:
  - Initialize: dist[s]=0, dist[A]=∞, dist[C]=∞
  - Extract s: relax s→A (dist[A]=1), relax s→C (dist[C]=3)
  - Extract A (dist=1 is minimum): A is FINALIZED at distance 1
  - Extract C (dist=3): relaxing C→A would give 3 + (-3) = 0 < 1,
    but A is already processed and is never revisited
  - Result: dist[A] = 1, yet δ(s, A) = 0. WRONG.

The "minimum distance so far is final" assumption fails: the negative edge C→A lets a path through a farther vertex undercut A's finalized distance. We need Bellman-Ford for graphs with negative edges.

Understanding the proof makes the non-negative weight requirement feel necessary rather than arbitrary. The proof doesn't just tell us the algorithm is correct—it explains what breaks if we violate the preconditions.
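The failure can be demonstrated by running both algorithms on a small graph with one negative edge (the edge set below is illustrative, chosen so that Dijkstra's greedy finalization demonstrably locks in a wrong answer):

```python
import heapq
import math

def dijkstra(graph, s):
    """Dijkstra with the usual rule: finalized vertices are never revisited."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    done = set()
    pq = [(0, s)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)  # finalized, even if a shorter path appears later
        for v, w in graph[u]:
            if v not in done and d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (dist[v], v))
    return dist

def bellman_ford(graph, s):
    """Handles negative edges by relaxing every edge |V|-1 times."""
    dist = {v: math.inf for v in graph}
    dist[s] = 0
    for _ in range(len(graph) - 1):
        for u in graph:
            for v, w in graph[u]:
                if dist[u] + w < dist[v]:
                    dist[v] = dist[u] + w
    return dist

# True shortest path to A is s -> C -> A with weight 3 + (-3) = 0,
# but Dijkstra finalizes A at distance 1 before C is processed.
g = {'s': [('A', 1), ('C', 3)], 'A': [], 'C': [('A', -3)]}
```

Dijkstra reports dist[A] = 1 while Bellman-Ford finds the true δ(s, A) = 0.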
The contradiction proof we presented is the most common, but there are other ways to view Dijkstra's correctness. These alternative perspectives deepen understanding.
Perspective 1: Greedy Choice Property
Dijkstra as a greedy algorithm:

Greedy Choice Property: The locally optimal choice (extract minimum) is globally safe. When we finalize a vertex, its distance is truly optimal.

Optimal Substructure: The shortest path to u contains the shortest path to u's predecessor.

  s ──shortest──> x ──(x,u)──> u

If this is the shortest s-to-u path, then s-to-x is shortest too.

Together, these properties guarantee greedy algorithms work. Dijkstra exhibits both (under the non-negative weight assumption).

Perspective 2: BFS Generalization
BFS finds shortest paths in unweighted graphs. Why? It processes vertices in order of hop count. The first time we reach a vertex is via a shortest path.

Dijkstra generalizes this to weighted graphs. It processes vertices in order of total path weight. The first time we finalize a vertex is via a shortest path.

  BFS:      FIFO queue     (order by discovery time = order by hops)
  Dijkstra: priority queue (order by cumulative weight)

Same structure, different ordering criterion.

Perspective 3: Dynamic Programming
Bellman-Ford as DP:

  dist_k[v] = shortest path to v using at most k edges
  dist_k[v] = min over all (u, v) edges of: dist_{k-1}[u] + w(u, v)

Dijkstra optimizes this DP by processing in the right order:
  - If we process vertices in order of dist value, then when we finalize
    dist[v], all of its shortest-path predecessors are already final.
  - We need only one pass, not V passes.

Dijkstra is "DP with optimal subproblem ordering."

Having multiple ways to understand an algorithm strengthens your grasp. See Dijkstra as a greedy algorithm (making locally optimal choices), as BFS for weighted graphs (generalizing hop-count to weight-sum), or as optimized DP (solving subproblems in the right order). All perspectives are valid.
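The dist_k recurrence from the DP perspective translates directly into code. This sketch (illustrative function and variable names) builds the table one edge-budget at a time:

```python
import math

def shortest_at_most_k_edges(graph, s):
    """DP: dist_k[v] = weight of the shortest s->v path using <= k edges."""
    n = len(graph)
    dist = {v: math.inf for v in graph}
    dist[s] = 0  # dist_0: only s is reachable using zero edges
    for k in range(1, n):  # a shortest simple path uses at most n-1 edges
        new = dict(dist)   # start from dist_{k-1}[v]
        for u in graph:
            if dist[u] == math.inf:
                continue
            for v, w in graph[u]:
                # dist_k[v] = min(dist_{k-1}[v], dist_{k-1}[u] + w(u, v))
                if dist[u] + w < new[v]:
                    new[v] = dist[u] + w
        dist = new
    return dist
```

This takes V passes over the edges; Dijkstra's ordering collapses the work to a single pass, at the cost of requiring non-negative weights.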
We've now completed our deep dive into Dijkstra's algorithm, culminating in a formal proof of correctness. Let's consolidate this page and the entire module:
Module Summary — Dijkstra's Algorithm Mastery:
Over five pages, we've covered Dijkstra's algorithm comprehensively:
Page 1: Why weighted graphs require a new approach; the SSSP problem; non-negative weight constraint; the key insight of processing by distance.
Page 2: Priority queues and min-heaps; complete algorithm implementation; tracing through examples; implementation variations; path reconstruction.
Page 3: Edge relaxation formalized; triangle inequality; path optimality; greedy choice property; relaxation order and efficiency; invariants.
Page 4: Time complexity derivation O((V+E) log V); implementation comparisons; graph density effects; space complexity; practical optimizations.
Page 5: Formal proof of correctness; loop invariant technique; the core contradiction argument; termination; alternative proof perspectives.
You now have a complete, rigorous understanding of one of computer science's most important algorithms.
Congratulations! You've mastered Dijkstra's algorithm—not just the how, but the why. You can implement it efficiently, analyze its complexity, trace through its execution, and prove its correctness. This deep understanding prepares you for advanced algorithms like A*, Bellman-Ford, and sophisticated graph optimization techniques.