Sometimes, knowing the shortest path from one source isn't enough; many problems require the distance between every pair of vertices.
In these cases, we need the all-pairs shortest path (APSP) solution—a complete matrix giving the shortest distance between every pair of vertices. While we could run Dijkstra or Bellman-Ford V times (once from each source), Floyd-Warshall offers an elegant alternative: a simple dynamic programming algorithm that computes all shortest paths in a single unified sweep.
This page explores the Floyd-Warshall algorithm as the solution for all-pairs shortest paths. You'll understand its elegant dynamic programming formulation, when it outperforms running single-source algorithms multiple times, its space requirements, and the specific scenarios where it's the optimal choice.
The all-pairs shortest path (APSP) problem asks: for every pair of vertices (u, v) in a graph, what is the shortest path from u to v?
Output Format:
The solution is typically represented as a V × V matrix D where D[i][j] = shortest distance from vertex i to vertex j. Additionally, we often maintain a "next" matrix N where N[i][j] = first vertex on the shortest path from i to j, enabling path reconstruction.
Problem Scale:
For a graph with V vertices, the output alone is a V × V matrix with V² entries, so any APSP algorithm needs at least Ω(V²) time and space just to write down the answer.
This baseline tells us that APSP is inherently more expensive than single-source shortest paths. The question is: can we do better than running SSSP V times?
Approaches to APSP:
| Approach | Sparse (E ≈ V) | Moderate (E ≈ V√V) | Dense (E ≈ V²) |
|---|---|---|---|
| Dijkstra × V | O(V² log V) | O(V^2.5 log V) | O(V³ log V) |
| Bellman-Ford × V | O(V³) | O(V^3.5) | O(V⁴) |
| Floyd-Warshall | O(V³) | O(V³) | O(V³) |
| Johnson's Algorithm | O(V² log V + V²) | O(V² log V + V^2.5) | O(V² log V + V³) |
Floyd-Warshall's O(V³) complexity is independent of edge count. This makes it relatively better for dense graphs (where E ≈ V²) and relatively worse for sparse graphs. For sparse graphs with non-negative weights, running Dijkstra V times is usually faster.
Floyd-Warshall is remarkable for its simplicity. The entire algorithm is three nested loops with a single conditional update. Yet this simple structure correctly computes all shortest paths.
The Core Idea:
For each pair (i, j), consider all possible intermediate vertices k. The shortest path from i to j either avoids k entirely, or passes through k, in which case it splits into a shortest path from i to k followed by a shortest path from k to j.
If we've already computed shortest paths using only vertices {0, 1, ..., k-1} as intermediates, we can extend to include k by checking: is going through k shorter than the best path so far?
The Recurrence:
Let D^(k)[i][j] = shortest path from i to j using only vertices {0, 1, ..., k-1} as intermediates.
Base case: D^(0)[i][j] = edge weight from i to j (or ∞ if no edge)
Recurrence: D^(k+1)[i][j] = min(D^(k)[i][j], D^(k)[i][k] + D^(k)[k][j])
Final answer: D^(V)[i][j] = shortest path from i to j (all vertices allowed as intermediates)
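The recurrence can be watched layer by layer on a tiny example. A minimal sketch (the graph and weights are made up for illustration): edges 0→1 of weight 4, 1→2 of weight 2, and a direct edge 0→2 of weight 7, where the indirect route wins once vertex 1 is allowed as an intermediate.

```typescript
// Layer-by-layer evaluation of the recurrence on a 3-vertex graph:
// edges 0→1 (4), 1→2 (2), 0→2 (7).
const V = 3;
const INF = Infinity;

// D^(0): direct edges only
let D: number[][] = [
  [0, 4, 7],
  [INF, 0, 2],
  [INF, INF, 0],
];

for (let k = 0; k < V; k++) {
  // Build D^(k+1) from D^(k) — a fresh matrix, to mirror the notation exactly
  const layer: number[][] = D.map((row) => row.slice());
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      layer[i][j] = Math.min(D[i][j], D[i][k] + D[k][j]);
  D = layer;
}

console.log(D[0][2]); // 6 — the path 0→1→2 beats the direct edge of weight 7
```

Keeping a fresh matrix per layer mirrors the D^(k) notation; the full implementation collapses all layers into one in-place matrix, which is why it needs only O(V²) space.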
```typescript
interface FloydWarshallResult {
  distances: number[][];
  next: (number | null)[][]; // For path reconstruction
  hasNegativeCycle: boolean;
}

function floydWarshall(
  numVertices: number,
  edges: { from: number; to: number; weight: number }[]
): FloydWarshallResult {
  const V = numVertices;

  // Initialize distance matrix
  const dist: number[][] = Array.from({ length: V }, () =>
    new Array(V).fill(Infinity)
  );

  // Initialize next matrix for path reconstruction
  const next: (number | null)[][] = Array.from({ length: V }, () =>
    new Array<number | null>(V).fill(null)
  );

  // Self-loops have distance 0
  for (let i = 0; i < V; i++) {
    dist[i][i] = 0;
  }

  // Initialize with direct edges
  for (const edge of edges) {
    // Handle multiple edges between the same pair (keep minimum)
    if (edge.weight < dist[edge.from][edge.to]) {
      dist[edge.from][edge.to] = edge.weight;
      next[edge.from][edge.to] = edge.to;
    }
  }

  // Floyd-Warshall: consider each vertex k as intermediate
  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        // Can we improve i→j by going through k?
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          const throughK = dist[i][k] + dist[k][j];
          if (throughK < dist[i][j]) {
            dist[i][j] = throughK;
            next[i][j] = next[i][k]; // First step toward j is first step toward k
          }
        }
      }
    }
  }

  // Check for negative cycles (negative value on diagonal)
  let hasNegativeCycle = false;
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      hasNegativeCycle = true;
      break;
    }
  }

  return { distances: dist, next, hasNegativeCycle };
}

// Reconstruct path from i to j using the next matrix
function reconstructPath(
  next: (number | null)[][],
  from: number,
  to: number
): number[] | null {
  if (from === to) {
    return [from]; // Trivial path: a vertex reaches itself
  }
  if (next[from][to] === null) {
    return null; // No path exists
  }
  const path: number[] = [from];
  let current = from;
  while (current !== to) {
    current = next[current][to]!;
    path.push(current);
    // Safety check to prevent infinite loops on negative cycles
    if (path.length > next.length) {
      return null; // Cycle detected
    }
  }
  return path;
}
```

Notice that we update the distance matrix in place rather than creating a new matrix for each k. This is safe because during iteration k the values dist[i][k] and dist[k][j] cannot change: dist[k][k] = 0, so relaxing a path that starts or ends at k through k itself adds nothing. Reading from the partially updated matrix therefore produces the same result as keeping a separate matrix per k.
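To see the pieces working together, here is a small self-contained sketch (the 4-vertex graph is invented for illustration) that fills the dist and next matrices and then walks next to recover a path:

```typescript
// Graph: 0→1 (3), 1→2 (4), 0→2 (10), 2→3 (1). Shortest 0→3 is 0→1→2→3 = 8.
const V = 4;
const dist: number[][] = Array.from({ length: V }, () => new Array(V).fill(Infinity));
const next: (number | null)[][] = Array.from({ length: V }, () => new Array(V).fill(null));
for (let i = 0; i < V; i++) dist[i][i] = 0;

const edges = [
  { from: 0, to: 1, weight: 3 },
  { from: 1, to: 2, weight: 4 },
  { from: 0, to: 2, weight: 10 },
  { from: 2, to: 3, weight: 1 },
];
for (const e of edges) {
  dist[e.from][e.to] = e.weight;
  next[e.from][e.to] = e.to;
}

// No Infinity guard needed here: with non-negative weights, Infinity sums stay Infinity
for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (dist[i][k] + dist[k][j] < dist[i][j]) {
        dist[i][j] = dist[i][k] + dist[k][j];
        next[i][j] = next[i][k];
      }

// Walk the next matrix from 0 toward 3
const path = [0];
let cur = 0;
while (cur !== 3) {
  cur = next[cur][3]!;
  path.push(cur);
}
console.log(dist[0][3], path); // → 8, [0, 1, 2, 3]
```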
Floyd-Warshall's correctness comes from a beautiful dynamic programming formulation. Let's understand it deeply:
The Subproblem Structure:
Define D^(k)[i][j] as the length of the shortest path from i to j that only uses vertices {0, 1, ..., k-1} as intermediate vertices (i.e., vertices strictly between i and j on the path).
Key Observation:
The shortest path from i to j using vertices {0, ..., k} as possible intermediates either skips vertex k entirely (so its length is D^(k)[i][j]), or passes through k exactly once, splitting into a shortest i→k path and a shortest k→j path whose intermediates all come from {0, ..., k-1} (so its length is D^(k)[i][k] + D^(k)[k][j]).
Note: vertex k appears at most once on an optimal path, because (in the absence of negative cycles) no shortest path needs to revisit a vertex.
The Recurrence:
D^(k+1)[i][j] = min(D^(k)[i][j], D^(k)[i][k] + D^(k)[k][j])
Why This Leads to O(V³):
We have O(V²) subproblems (one for each (i, j) pair), and for each k from 0 to V-1, we update all V² entries. Total: O(V × V²) = O(V³).
Visual Intuition:
Imagine the vertices numbered 0 to V-1. Initially, we only know direct edges (paths with no intermediate vertices). In iteration k=0, we consider paths that might go through vertex 0. In iteration k=1, we consider paths through vertices 0 or 1. By iteration k=V-1, we've considered all possible intermediate vertices.
Start:        Direct edges only (i → j)
↓
After k=0:    Paths through vertex 0 (i → 0 → j)
↓
After k=1:    Paths through vertices {0, 1}
↓
...
↓
After k=V-1:  Paths through any vertex (shortest paths!)
Handling Negative Weights:
Floyd-Warshall correctly handles negative edge weights because it never relies on Dijkstra's assumption that a settled distance is final; it only compares sums of edge weights, and those comparisons remain valid when some weights are negative, as long as no negative cycle exists.
The only complication is negative cycles, detected by checking if any dist[i][i] becomes negative after completion.
The key insight is that the order of the k loop matters. By considering vertices as intermediates in order, we ensure that when we check "can we improve i→j by going through k?", the paths i→k and k→j have already been optimized using all vertices less than k.
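One way to convince yourself that the k loop must be outermost is to try the wrong order on a graph whose shortest path needs several intermediates that are "discovered" late. A small sketch, with a made-up 5-vertex chain, comparing k-outermost against k-innermost in a single pass:

```typescript
// Edges (all weight 1): 0→3, 3→2, 2→1, 1→4.
// The only 0→4 path is 0→3→2→1→4, length 4.
const V = 5;
const INF = Infinity;
const edges: [number, number][] = [[0, 3], [3, 2], [2, 1], [1, 4]];

function init(): number[][] {
  const d: number[][] = Array.from({ length: V }, () => new Array(V).fill(INF));
  for (let i = 0; i < V; i++) d[i][i] = 0;
  for (const [u, v] of edges) d[u][v] = 1;
  return d;
}

// Correct: k outermost
const good = init();
for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (good[i][k] + good[k][j] < good[i][j]) good[i][j] = good[i][k] + good[k][j];

// Wrong: k innermost — one pass over (i, j) misses multi-hop improvements,
// because row 0 is finalized before later rows are relaxed
const bad = init();
for (let i = 0; i < V; i++)
  for (let j = 0; j < V; j++)
    for (let k = 0; k < V; k++)
      if (bad[i][k] + bad[k][j] < bad[i][j]) bad[i][j] = bad[i][k] + bad[k][j];

console.log(good[0][4], bad[0][4]); // 4 Infinity
```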
Floyd-Warshall isn't always the best choice for APSP, but there are specific scenarios where it excels:
Ideal Conditions for Floyd-Warshall:
- The graph is dense (E approaches V²), so the O(V³) bound matches or beats edge-dependent algorithms
- V is small enough that V³ operations and a V × V matrix are affordable (roughly V ≤ 1000-2000)
- Edge weights may be negative (ruling out repeated Dijkstra)
- You genuinely need distances between all pairs, not just a few sources
- Implementation simplicity matters
Concrete Example — When Floyd-Warshall Wins:
Consider a complete graph (every vertex connected to every other) with 1000 vertices, i.e., E ≈ 10⁶ edges. Floyd-Warshall performs V³ = 10⁹ simple, cache-friendly array updates, while Dijkstra × V must push those ~10⁶ edges per run through a priority queue, on the order of V · E log V ≈ 10¹⁰ heap operations.
Floyd-Warshall comes out roughly 5× faster in practice and is much simpler to implement.
Concrete Example — When Floyd-Warshall Loses:
Consider a sparse road network with 1000 cities and an average of 5 roads per city, so E ≈ 5000. Floyd-Warshall still performs all V³ = 10⁹ updates, but each Dijkstra run only touches those ~5000 edges, so all V runs together cost on the order of a few × 10⁷ operations.
Dijkstra × V is roughly 40× faster.
V³ grows rapidly. At V=1000, that's 1 billion operations (~1 second). At V=10,000, that's 1 trillion operations (~15 minutes). At V=100,000, that's 10^15 operations (~2 weeks at a billion operations per second). Floyd-Warshall simply doesn't scale to large graphs.
Floyd-Warshall requires O(V²) space for the distance matrix, and typically another O(V²) for path reconstruction. This can be a significant constraint:
Memory Requirements (Distance Matrix Only):
| Vertices | Entries (V²) | Memory (8-byte doubles) | Memory (4-byte floats) |
|---|---|---|---|
| 100 | 10,000 | 80 KB | 40 KB |
| 1,000 | 1,000,000 | 8 MB | 4 MB |
| 5,000 | 25,000,000 | 200 MB | 100 MB |
| 10,000 | 100,000,000 | 800 MB | 400 MB |
| 50,000 | 2,500,000,000 | 20 GB | 10 GB |
| 100,000 | 10,000,000,000 | 80 GB | 40 GB |
Implications: beyond a few thousand vertices, the matrices dominate memory; at 10,000 vertices the distance matrix alone approaches a gigabyte, and a next matrix doubles that.
Space Optimization Techniques: store 4-byte floats (or 32-bit integers for integer weights) instead of doubles, and drop the next matrix if you only need distances rather than paths.
When Memory Is Critical: if you need APSP but can't afford O(V²) memory, consider answering distance queries on demand with repeated single-source runs instead of materializing the full matrix.
Before choosing Floyd-Warshall, calculate V² × value_size and verify you have sufficient memory. It's better to discover insufficiency during design than when the program crashes with out-of-memory errors.
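That back-of-envelope check can be a one-liner. A minimal sketch (the helper name is ours, not from any library):

```typescript
// Rough memory estimate for one V × V matrix of fixed-size values
function matrixBytes(V: number, bytesPerValue: number): number {
  return V * V * bytesPerValue;
}

console.log(matrixBytes(1000, 8));  // 8_000_000 bytes ≈ 8 MB (matches the table)
console.log(matrixBytes(10_000, 4)); // 400_000_000 bytes ≈ 400 MB
```

Remember to double the estimate if you also keep a next matrix for path reconstruction.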
Like Bellman-Ford, Floyd-Warshall can detect negative-weight cycles. The method is elegant:
Detection Method:
After the algorithm completes, check the diagonal of the distance matrix. If dist[i][i] < 0 for any vertex i, a negative cycle exists that is reachable from (and can return to) vertex i.
Why This Works:
Initially, dist[i][i] = 0 (zero cost to stay in place). A negative diagonal value means we found a path from i back to i with negative total weight—exactly the definition of a negative cycle.
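A minimal sketch of this on a made-up 2-vertex graph whose only cycle has weight -2:

```typescript
// Edges: 0→1 (1) and 1→0 (-3), so the cycle 0→1→0 has total weight -2
const V = 2;
const d: number[][] = [
  [0, 1],
  [-3, 0],
];

for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];

console.log(d[0][0], d[1][1]); // both diagonal entries are now negative
```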
Affected Vertices:
Vertices affected by a negative cycle (those whose shortest distances are undefined, conceptually −∞) can be identified by running an additional iteration:
// After main algorithm, identify affected vertices
for (let k = 0; k < V; k++) {
if (dist[k][k] < 0) {
// k is on a negative cycle
for (let i = 0; i < V; i++) {
for (let j = 0; j < V; j++) {
// If there's a path i→k→j, mark (i,j) as affected
if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
dist[i][j] = -Infinity;
}
}
}
}
}
```typescript
// Complete Floyd-Warshall with full negative cycle handling
function floydWarshallWithNegativeCycleHandling(
  numVertices: number,
  edges: { from: number; to: number; weight: number }[]
): { distances: number[][]; negativeCycleVertices: number[] } {
  const V = numVertices;

  // Initialize (-Infinity is an ordinary number in TypeScript,
  // so the matrix type can stay number[][])
  const dist: number[][] = Array.from({ length: V }, () =>
    new Array(V).fill(Infinity)
  );
  for (let i = 0; i < V; i++) dist[i][i] = 0;
  for (const e of edges) {
    dist[e.from][e.to] = Math.min(dist[e.from][e.to], e.weight);
  }

  // Main Floyd-Warshall
  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          if (dist[i][k] + dist[k][j] < dist[i][j]) {
            dist[i][j] = dist[i][k] + dist[k][j];
          }
        }
      }
    }
  }

  // Find vertices on negative cycles
  const negativeCycleVertices: number[] = [];
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      negativeCycleVertices.push(i);
    }
  }

  // Mark all pairs affected by negative cycles as -Infinity
  for (const k of negativeCycleVertices) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          dist[i][j] = Number.NEGATIVE_INFINITY;
        }
      }
    }
  }

  return { distances: dist, negativeCycleVertices };
}
```

Both Bellman-Ford and Floyd-Warshall can detect negative cycles, but through different mechanisms. Bellman-Ford checks if any edge can still be relaxed after V-1 iterations. Floyd-Warshall checks for negative values on the diagonal. Use whichever algorithm fits your problem scope (single-source vs. all-pairs).
Floyd-Warshall finds applications in many domains where all-pairs information is naturally required:
1. Network Topology Analysis:
In network management, understanding the complete latency matrix helps:
2. Transitive Closure:
A variant of Floyd-Warshall computes transitive closure—can vertex i reach vertex j at all?
// Transitive closure using Floyd-Warshall structure
for (let k = 0; k < V; k++) {
for (let i = 0; i < V; i++) {
for (let j = 0; j < V; j++) {
// reachable[i][j] = can reach j from i through k?
reachable[i][j] = reachable[i][j] ||
(reachable[i][k] && reachable[k][j]);
}
}
}
This is useful for dependency analysis, access control computation, and graph connectivity queries.
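The snippet above assumes a reachability matrix already initialized from the edges; a self-contained version might look like this (function name and graph are ours, for illustration):

```typescript
function transitiveClosure(V: number, edges: [number, number][]): boolean[][] {
  const reach: boolean[][] = Array.from({ length: V }, () => new Array(V).fill(false));
  for (let i = 0; i < V; i++) reach[i][i] = true; // every vertex reaches itself
  for (const [u, v] of edges) reach[u][v] = true;

  // Same triple loop, with (AND, OR) in place of (+, min)
  for (let k = 0; k < V; k++)
    for (let i = 0; i < V; i++)
      for (let j = 0; j < V; j++)
        reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);

  return reach;
}

const r = transitiveClosure(3, [[0, 1], [1, 2]]);
console.log(r[0][2], r[2][0]); // true false
```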
3. Minimax and Widest Path Problems:
Floyd-Warshall's structure adapts to other optimization objectives:
Minimax Path: Find the path that minimizes the maximum edge weight.
dist[i][j] = min(dist[i][j], max(dist[i][k], dist[k][j]))
Widest Path (Maximum Capacity): Find the path with maximum bottleneck capacity.
cap[i][j] = max(cap[i][j], min(cap[i][k], cap[k][j]))
These variants use the same O(V³) structure but different combining operations.
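As a concrete sketch of the widest-path variant (function name and capacities invented for illustration), note the swapped identities: 0 means "no path" and Infinity sits on the diagonal, mirroring how Infinity and 0 are used in the shortest-path version:

```typescript
function widestPaths(V: number, edges: [number, number, number][]): number[][] {
  // cap[i][j] = maximum bottleneck capacity over all i→j paths
  const cap: number[][] = Array.from({ length: V }, () => new Array(V).fill(0));
  for (let i = 0; i < V; i++) cap[i][i] = Infinity;
  for (const [u, v, c] of edges) cap[u][v] = Math.max(cap[u][v], c);

  // Same triple loop, with (max, min) in place of (min, +)
  for (let k = 0; k < V; k++)
    for (let i = 0; i < V; i++)
      for (let j = 0; j < V; j++)
        cap[i][j] = Math.max(cap[i][j], Math.min(cap[i][k], cap[k][j]));

  return cap;
}

// Capacities: 0→1 is 5, 1→2 is 3, direct 0→2 is 2
const w = widestPaths(3, [[0, 1, 5], [1, 2, 3], [0, 2, 2]]);
console.log(w[0][2]); // 3 — going via vertex 1 gives bottleneck min(5, 3)
```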
Floyd-Warshall works on any closed semiring. The standard version uses (min, +) for shortest paths. Replace with (max, min) for widest paths, (AND, OR) for transitive closure, or (min, max) for minimax paths. This algebraic viewpoint enables many creative applications.
For completeness, let's compare Floyd-Warshall with other APSP approaches:
Floyd-Warshall vs. Dijkstra × V:
| Factor | Floyd-Warshall | Dijkstra × V |
|---|---|---|
| Time (sparse, E ≈ V) | O(V³) | O(V² log V) |
| Time (dense, E ≈ V²) | O(V³) | O(V³ log V) |
| Negative weights | ✅ Yes | ❌ No |
| Space | O(V²) permanent | O(V) per run |
| Implementation | Very simple | Needs priority queue |
| Memory access | Very cache-friendly | Less predictable |
Floyd-Warshall vs. Johnson's Algorithm:
Johnson's algorithm is designed for sparse graphs with potentially negative weights:
| Factor | Floyd-Warshall | Johnson's Algorithm |
|---|---|---|
| Time Complexity | O(V³) | O(V² log V + VE) |
| Best for Sparse | ❌ No advantage | ✅ O(V² log V) |
| Best for Dense | ✅ O(V³) | ⚠️ O(V³ log V) |
| Negative Weights | ✅ Yes | ✅ Yes |
| Implementation | Very simple | Complex (multiple algorithms) |
| Negative Cycle Detection | ✅ Easy | ✅ During Bellman-Ford phase |
Recommendation Matrix:
- Dense graph, small-to-medium V, or negative weights with all pairs needed: Floyd-Warshall
- Sparse graph with non-negative weights: Dijkstra × V
- Sparse graph with negative weights: Johnson's algorithm
- Very large graphs (tens of thousands of vertices or more): avoid full APSP entirely; answer queries on demand
In practice, Floyd-Warshall is often chosen for its simplicity even when theoretically sub-optimal. For graphs under 1000-2000 vertices, the performance difference rarely matters, and Floyd-Warshall's simplicity reduces bugs and development time.
Let's consolidate the key points about Floyd-Warshall algorithm selection:
- Three nested loops, O(V³) time regardless of edge count, O(V²) space
- Handles negative edge weights and detects negative cycles via negative diagonal entries
- Best on dense graphs and modest vertex counts (roughly V ≤ 1000-2000)
- For sparse graphs with non-negative weights, Dijkstra × V is usually faster; for sparse graphs with negative weights, consider Johnson's algorithm
What's Next:
In the final page of this module, we'll synthesize everything into a complete decision framework. You'll learn to systematically analyze any shortest path problem and select the optimal algorithm based on problem constraints, graph characteristics, and performance requirements.
You now understand Floyd-Warshall as the algorithm of choice for all-pairs shortest paths, particularly on dense graphs. You can evaluate when its O(V³) complexity and O(V²) space are acceptable trade-offs for its simplicity and capability to handle negative weights.