Sometimes, knowing the shortest path from one source isn't enough; many problems require the distance between every pair of vertices.
In these cases, we need the all-pairs shortest path (APSP) solution—a complete matrix giving the shortest distance between every pair of vertices. While we could run Dijkstra or Bellman-Ford V times (once from each source), Floyd-Warshall offers an elegant alternative: a simple dynamic programming algorithm that computes all shortest paths in a single unified sweep.
This page explores the Floyd-Warshall algorithm as the solution for all-pairs shortest paths. You'll understand its elegant dynamic programming formulation, when it outperforms running single-source algorithms multiple times, its space requirements, and the specific scenarios where it's the optimal choice.
The all-pairs shortest path (APSP) problem asks: for every pair of vertices (u, v) in a graph, what is the shortest path from u to v?
Output Format:
The solution is typically represented as a V × V matrix D where D[i][j] = shortest distance from vertex i to vertex j. Additionally, we often maintain a "next" matrix N where N[i][j] = first vertex on the shortest path from i to j, enabling path reconstruction.
Problem Scale:
For a graph with V vertices, the output alone is a V × V matrix with V² entries, so any APSP algorithm needs at least Ω(V²) time and space just to write down the answer.
This baseline tells us that APSP is inherently more expensive than single-source shortest paths. The question is: can we do better than running SSSP V times?
Approaches to APSP:
| Approach | Sparse (E ≈ V) | Moderate (E ≈ V√V) | Dense (E ≈ V²) |
|---|---|---|---|
| Dijkstra × V | O(V² log V) | O(V^2.5 log V) | O(V³ log V) |
| Bellman-Ford × V | O(V³) | O(V^3.5) | O(V⁴) |
| Floyd-Warshall | O(V³) | O(V³) | O(V³) |
| Johnson's Algorithm | O(V² log V + V²) | O(V² log V + V^2.5) | O(V² log V + V³) |
Floyd-Warshall's O(V³) complexity is independent of edge count. This makes it relatively better for dense graphs (where E ≈ V²) and relatively worse for sparse graphs. For sparse graphs with non-negative weights, running Dijkstra V times is usually faster.
Floyd-Warshall is remarkable for its simplicity. The entire algorithm is three nested loops with a single conditional update. Yet this simple structure correctly computes all shortest paths.
The Core Idea:
For each pair (i, j), consider all possible intermediate vertices k. The shortest path from i to j either avoids k entirely, or passes through k, in which case it splits into a shortest path from i to k followed by a shortest path from k to j.
If we've already computed shortest paths using only vertices {0, 1, ..., k-1} as intermediates, we can extend to include k by checking: is going through k shorter than the best path so far?
The Recurrence:
Let D^(k)[i][j] = shortest path from i to j using only vertices {0, 1, ..., k-1} as intermediates.
Base case: D^(0)[i][j] = edge weight from i to j (or ∞ if no edge)
Recurrence: D^(k+1)[i][j] = min(D^(k)[i][j], D^(k)[i][k] + D^(k)[k][j])
Final answer: D^(V)[i][j] = shortest path from i to j (all vertices allowed as intermediates)
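The recurrence can be watched layer by layer on a tiny example. A minimal sketch (the graph and weights are made up for illustration): edges 0→1 of weight 4, 1→2 of weight 2, and a direct edge 0→2 of weight 7, where the indirect route wins once vertex 1 is allowed as an intermediate.

```typescript
// Layer-by-layer evaluation of the recurrence on a 3-vertex graph:
// edges 0→1 (4), 1→2 (2), 0→2 (7).
const V = 3;
const INF = Infinity;

// D^(0): direct edges only
let D: number[][] = [
  [0, 4, 7],
  [INF, 0, 2],
  [INF, INF, 0],
];

for (let k = 0; k < V; k++) {
  // Build D^(k+1) from D^(k) — a fresh matrix, to mirror the notation exactly
  const layer: number[][] = D.map((row) => row.slice());
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      layer[i][j] = Math.min(D[i][j], D[i][k] + D[k][j]);
  D = layer;
}

console.log(D[0][2]); // 6 — the path 0→1→2 beats the direct edge of weight 7
```

Keeping a fresh matrix per layer mirrors the D^(k) notation; the full implementation collapses all layers into one in-place matrix, which is why it needs only O(V²) space.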
```typescript
interface FloydWarshallResult {
  distances: number[][];
  next: (number | null)[][]; // For path reconstruction
  hasNegativeCycle: boolean;
}

function floydWarshall(
  numVertices: number,
  edges: { from: number; to: number; weight: number }[]
): FloydWarshallResult {
  const V = numVertices;

  // Initialize distance matrix
  const dist: number[][] = Array.from({ length: V }, () =>
    new Array(V).fill(Infinity)
  );

  // Initialize next matrix for path reconstruction
  const next: (number | null)[][] = Array.from({ length: V }, () =>
    new Array<number | null>(V).fill(null)
  );

  // Self-loops have distance 0
  for (let i = 0; i < V; i++) {
    dist[i][i] = 0;
  }

  // Initialize with direct edges
  for (const edge of edges) {
    // Handle multiple edges between the same pair (keep minimum)
    if (edge.weight < dist[edge.from][edge.to]) {
      dist[edge.from][edge.to] = edge.weight;
      next[edge.from][edge.to] = edge.to;
    }
  }

  // Floyd-Warshall: consider each vertex k as intermediate
  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        // Can we improve i→j by going through k?
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          const throughK = dist[i][k] + dist[k][j];
          if (throughK < dist[i][j]) {
            dist[i][j] = throughK;
            next[i][j] = next[i][k]; // First step toward j is first step toward k
          }
        }
      }
    }
  }

  // Check for negative cycles (negative value on diagonal)
  let hasNegativeCycle = false;
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      hasNegativeCycle = true;
      break;
    }
  }

  return { distances: dist, next, hasNegativeCycle };
}

// Reconstruct path from i to j using the next matrix
function reconstructPath(
  next: (number | null)[][],
  from: number,
  to: number
): number[] | null {
  if (from === to) {
    return [from]; // Trivial path: a vertex reaches itself
  }
  if (next[from][to] === null) {
    return null; // No path exists
  }
  const path: number[] = [from];
  let current = from;
  while (current !== to) {
    current = next[current][to]!;
    path.push(current);
    // Safety check to prevent infinite loops on negative cycles
    if (path.length > next.length) {
      return null; // Cycle detected
    }
  }
  return path;
}
```

Notice that we update the distance matrix in place rather than creating a new matrix for each k. This is safe because during iteration k the values dist[i][k] and dist[k][j] cannot change: dist[k][k] = 0, so relaxing a path that starts or ends at k through k itself adds nothing. Reading from the partially updated matrix therefore produces the same result as keeping a separate matrix per k.
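To see the pieces working together, here is a small self-contained sketch (the 4-vertex graph is invented for illustration) that fills the dist and next matrices and then walks next to recover a path:

```typescript
// Graph: 0→1 (3), 1→2 (4), 0→2 (10), 2→3 (1). Shortest 0→3 is 0→1→2→3 = 8.
const V = 4;
const dist: number[][] = Array.from({ length: V }, () => new Array(V).fill(Infinity));
const next: (number | null)[][] = Array.from({ length: V }, () => new Array(V).fill(null));
for (let i = 0; i < V; i++) dist[i][i] = 0;

const edges = [
  { from: 0, to: 1, weight: 3 },
  { from: 1, to: 2, weight: 4 },
  { from: 0, to: 2, weight: 10 },
  { from: 2, to: 3, weight: 1 },
];
for (const e of edges) {
  dist[e.from][e.to] = e.weight;
  next[e.from][e.to] = e.to;
}

// No Infinity guard needed here: with non-negative weights, Infinity sums stay Infinity
for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (dist[i][k] + dist[k][j] < dist[i][j]) {
        dist[i][j] = dist[i][k] + dist[k][j];
        next[i][j] = next[i][k];
      }

// Walk the next matrix from 0 toward 3
const path = [0];
let cur = 0;
while (cur !== 3) {
  cur = next[cur][3]!;
  path.push(cur);
}
console.log(dist[0][3], path); // → 8, [0, 1, 2, 3]
```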
Floyd-Warshall's correctness comes from a beautiful dynamic programming formulation. Let's understand it deeply:
The Subproblem Structure:
Define D^(k)[i][j] as the length of the shortest path from i to j that only uses vertices {0, 1, ..., k-1} as intermediate vertices (i.e., vertices strictly between i and j on the path).
Key Observation:
The shortest path from i to j using vertices {0, ..., k} as possible intermediates either skips vertex k entirely (so its length is D^(k)[i][j]), or passes through k exactly once, splitting into a shortest i→k path and a shortest k→j path whose intermediates all come from {0, ..., k-1} (so its length is D^(k)[i][k] + D^(k)[k][j]).
Note: vertex k appears at most once on an optimal path, because (in the absence of negative cycles) no shortest path needs to revisit a vertex.
The Recurrence:
D^(k+1)[i][j] = min(D^(k)[i][j], D^(k)[i][k] + D^(k)[k][j])
Why This Leads to O(V³):
We have O(V²) subproblems (one for each (i, j) pair), and for each k from 0 to V-1, we update all V² entries. Total: O(V × V²) = O(V³).
Visual Intuition:
Imagine the vertices numbered 0 to V-1. Initially, we only know direct edges (paths with no intermediate vertices). In iteration k=0, we consider paths that might go through vertex 0. In iteration k=1, we consider paths through vertices 0 or 1. By iteration k=V-1, we've considered all possible intermediate vertices.
Start:        Direct edges only (i → j)
↓
After k=0:    Paths through vertex 0 (i → 0 → j)
↓
After k=1:    Paths through vertices {0, 1}
↓
...
↓
After k=V-1:  Paths through any vertex (shortest paths!)
Handling Negative Weights:
Floyd-Warshall correctly handles negative edge weights because it never relies on Dijkstra's assumption that a settled distance is final; it only compares sums of edge weights, and those comparisons remain valid when some weights are negative, as long as no negative cycle exists.
The only complication is negative cycles, detected by checking if any dist[i][i] becomes negative after completion.
The key insight is that the order of the k loop matters. By considering vertices as intermediates in order, we ensure that when we check "can we improve i→j by going through k?", the paths i→k and k→j have already been optimized using all vertices less than k.
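One way to convince yourself that the k loop must be outermost is to try the wrong order on a graph whose shortest path needs several intermediates that are "discovered" late. A small sketch, with a made-up 5-vertex chain, comparing k-outermost against k-innermost in a single pass:

```typescript
// Edges (all weight 1): 0→3, 3→2, 2→1, 1→4.
// The only 0→4 path is 0→3→2→1→4, length 4.
const V = 5;
const INF = Infinity;
const edges: [number, number][] = [[0, 3], [3, 2], [2, 1], [1, 4]];

function init(): number[][] {
  const d: number[][] = Array.from({ length: V }, () => new Array(V).fill(INF));
  for (let i = 0; i < V; i++) d[i][i] = 0;
  for (const [u, v] of edges) d[u][v] = 1;
  return d;
}

// Correct: k outermost
const good = init();
for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (good[i][k] + good[k][j] < good[i][j]) good[i][j] = good[i][k] + good[k][j];

// Wrong: k innermost — one pass over (i, j) misses multi-hop improvements,
// because row 0 is finalized before later rows are relaxed
const bad = init();
for (let i = 0; i < V; i++)
  for (let j = 0; j < V; j++)
    for (let k = 0; k < V; k++)
      if (bad[i][k] + bad[k][j] < bad[i][j]) bad[i][j] = bad[i][k] + bad[k][j];

console.log(good[0][4], bad[0][4]); // 4 Infinity
```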
Floyd-Warshall isn't always the best choice for APSP, but there are specific scenarios where it excels:
Ideal Conditions for Floyd-Warshall:
- The graph is dense (E approaches V²), so the O(V³) bound matches or beats edge-dependent algorithms
- V is small enough that V³ operations and a V × V matrix are affordable (roughly V ≤ 1000-2000)
- Edge weights may be negative (ruling out repeated Dijkstra)
- You genuinely need distances between all pairs, not just a few sources
- Implementation simplicity matters
Concrete Example — When Floyd-Warshall Wins:
Consider a complete graph (every vertex connected to every other) with 1000 vertices, i.e., E ≈ 10⁶ edges. Floyd-Warshall performs V³ = 10⁹ simple, cache-friendly array updates, while Dijkstra × V must push those ~10⁶ edges per run through a priority queue, on the order of V · E log V ≈ 10¹⁰ heap operations.
Floyd-Warshall comes out roughly 5× faster in practice and is much simpler to implement.
Concrete Example — When Floyd-Warshall Loses:
Consider a sparse road network with 1000 cities and an average of 5 roads per city, so E ≈ 5000. Floyd-Warshall still performs all V³ = 10⁹ updates, but each Dijkstra run only touches those ~5000 edges, so all V runs together cost on the order of a few × 10⁷ operations.
Dijkstra × V is roughly 40× faster.
V³ grows rapidly. At V=1000, that's 1 billion operations (~1 second). At V=10,000, that's 1 trillion operations (~15 minutes). At V=100,000, that's 10^15 operations (~2 weeks at a billion operations per second). Floyd-Warshall simply doesn't scale to large graphs.
Floyd-Warshall requires O(V²) space for the distance matrix, and typically another O(V²) for path reconstruction. This can be a significant constraint:
Memory Requirements (Distance Matrix Only):
| Vertices | Entries (V²) | Memory (8-byte doubles) | Memory (4-byte floats) |
|---|---|---|---|
| 100 | 10,000 | 80 KB | 40 KB |
| 1,000 | 1,000,000 | 8 MB | 4 MB |
| 5,000 | 25,000,000 | 200 MB | 100 MB |
| 10,000 | 100,000,000 | 800 MB | 400 MB |
| 50,000 | 2,500,000,000 | 20 GB | 10 GB |
| 100,000 | 10,000,000,000 | 80 GB | 40 GB |
Implications: beyond a few thousand vertices, the matrices dominate memory; at 10,000 vertices the distance matrix alone approaches a gigabyte, and a next matrix doubles that.
Space Optimization Techniques: store 4-byte floats (or 32-bit integers for integer weights) instead of doubles, and drop the next matrix if you only need distances rather than paths.
When Memory Is Critical: if you need APSP but can't afford O(V²) memory, consider answering distance queries on demand with repeated single-source runs instead of materializing the full matrix.
Before choosing Floyd-Warshall, calculate V² × value_size and verify you have sufficient memory. It's better to discover insufficiency during design than when the program crashes with out-of-memory errors.
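That back-of-envelope check can be a one-liner. A minimal sketch (the helper name is ours, not from any library):

```typescript
// Rough memory estimate for one V × V matrix of fixed-size values
function matrixBytes(V: number, bytesPerValue: number): number {
  return V * V * bytesPerValue;
}

console.log(matrixBytes(1000, 8));  // 8_000_000 bytes ≈ 8 MB (matches the table)
console.log(matrixBytes(10_000, 4)); // 400_000_000 bytes ≈ 400 MB
```

Remember to double the estimate if you also keep a next matrix for path reconstruction.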
Like Bellman-Ford, Floyd-Warshall can detect negative-weight cycles. The method is elegant:
Detection Method:
After the algorithm completes, check the diagonal of the distance matrix. If dist[i][i] < 0 for any vertex i, a negative cycle exists that is reachable from (and can return to) vertex i.
Why This Works:
Initially, dist[i][i] = 0 (zero cost to stay in place). A negative diagonal value means we found a path from i back to i with negative total weight—exactly the definition of a negative cycle.
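A minimal sketch of this on a made-up 2-vertex graph whose only cycle has weight -2:

```typescript
// Edges: 0→1 (1) and 1→0 (-3), so the cycle 0→1→0 has total weight -2
const V = 2;
const d: number[][] = [
  [0, 1],
  [-3, 0],
];

for (let k = 0; k < V; k++)
  for (let i = 0; i < V; i++)
    for (let j = 0; j < V; j++)
      if (d[i][k] + d[k][j] < d[i][j]) d[i][j] = d[i][k] + d[k][j];

console.log(d[0][0], d[1][1]); // both diagonal entries are now negative
```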
Affected Vertices:
Vertices affected by a negative cycle (those whose shortest distances are undefined, conceptually −∞) can be identified by running an additional iteration:
// After main algorithm, identify affected vertices
for (let k = 0; k < V; k++) {
if (dist[k][k] < 0) {
// k is on a negative cycle
for (let i = 0; i < V; i++) {
for (let j = 0; j < V; j++) {
// If there's a path i→k→j, mark (i,j) as affected
if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
dist[i][j] = -Infinity;
}
}
}
}
}
```typescript
// Complete Floyd-Warshall with full negative cycle handling
function floydWarshallWithNegativeCycleHandling(
  numVertices: number,
  edges: { from: number; to: number; weight: number }[]
): { distances: number[][]; negativeCycleVertices: number[] } {
  const V = numVertices;

  // Initialize (-Infinity is an ordinary number in TypeScript,
  // so the matrix type can stay number[][])
  const dist: number[][] = Array.from({ length: V }, () =>
    new Array(V).fill(Infinity)
  );
  for (let i = 0; i < V; i++) dist[i][i] = 0;
  for (const e of edges) {
    dist[e.from][e.to] = Math.min(dist[e.from][e.to], e.weight);
  }

  // Main Floyd-Warshall
  for (let k = 0; k < V; k++) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          if (dist[i][k] + dist[k][j] < dist[i][j]) {
            dist[i][j] = dist[i][k] + dist[k][j];
          }
        }
      }
    }
  }

  // Find vertices on negative cycles
  const negativeCycleVertices: number[] = [];
  for (let i = 0; i < V; i++) {
    if (dist[i][i] < 0) {
      negativeCycleVertices.push(i);
    }
  }

  // Mark all pairs affected by negative cycles as -Infinity
  for (const k of negativeCycleVertices) {
    for (let i = 0; i < V; i++) {
      for (let j = 0; j < V; j++) {
        if (dist[i][k] !== Infinity && dist[k][j] !== Infinity) {
          dist[i][j] = Number.NEGATIVE_INFINITY;
        }
      }
    }
  }

  return { distances: dist, negativeCycleVertices };
}
```

Both Bellman-Ford and Floyd-Warshall can detect negative cycles, but through different mechanisms. Bellman-Ford checks if any edge can still be relaxed after V-1 iterations. Floyd-Warshall checks for negative values on the diagonal. Use whichever algorithm fits your problem scope (single-source vs. all-pairs).
Floyd-Warshall finds applications in many domains where all-pairs information is naturally required:
1. Network Topology Analysis:
In network management, understanding the complete latency matrix helps:
2. Transitive Closure:
A variant of Floyd-Warshall computes transitive closure—can vertex i reach vertex j at all?
// Transitive closure using Floyd-Warshall structure
for (let k = 0; k < V; k++) {
for (let i = 0; i < V; i++) {
for (let j = 0; j < V; j++) {
// reachable[i][j] = can reach j from i through k?
reachable[i][j] = reachable[i][j] ||
(reachable[i][k] && reachable[k][j]);
}
}
}
This is useful for dependency analysis, access control computation, and graph connectivity queries.
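The snippet above assumes a reachability matrix already initialized from the edges; a self-contained version might look like this (function name and graph are ours, for illustration):

```typescript
function transitiveClosure(V: number, edges: [number, number][]): boolean[][] {
  const reach: boolean[][] = Array.from({ length: V }, () => new Array(V).fill(false));
  for (let i = 0; i < V; i++) reach[i][i] = true; // every vertex reaches itself
  for (const [u, v] of edges) reach[u][v] = true;

  // Same triple loop, with (AND, OR) in place of (+, min)
  for (let k = 0; k < V; k++)
    for (let i = 0; i < V; i++)
      for (let j = 0; j < V; j++)
        reach[i][j] = reach[i][j] || (reach[i][k] && reach[k][j]);

  return reach;
}

const r = transitiveClosure(3, [[0, 1], [1, 2]]);
console.log(r[0][2], r[2][0]); // true false
```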
3. Minimax and Widest Path Problems:
Floyd-Warshall's structure adapts to other optimization objectives:
Minimax Path: Find the path that minimizes the maximum edge weight.
dist[i][j] = min(dist[i][j], max(dist[i][k], dist[k][j]))
Widest Path (Maximum Capacity): Find the path with maximum bottleneck capacity.
cap[i][j] = max(cap[i][j], min(cap[i][k], cap[k][j]))
These variants use the same O(V³) structure but different combining operations.
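As a concrete sketch of the widest-path variant (function name and capacities invented for illustration), note the swapped identities: 0 means "no path" and Infinity sits on the diagonal, mirroring how Infinity and 0 are used in the shortest-path version:

```typescript
function widestPaths(V: number, edges: [number, number, number][]): number[][] {
  // cap[i][j] = maximum bottleneck capacity over all i→j paths
  const cap: number[][] = Array.from({ length: V }, () => new Array(V).fill(0));
  for (let i = 0; i < V; i++) cap[i][i] = Infinity;
  for (const [u, v, c] of edges) cap[u][v] = Math.max(cap[u][v], c);

  // Same triple loop, with (max, min) in place of (min, +)
  for (let k = 0; k < V; k++)
    for (let i = 0; i < V; i++)
      for (let j = 0; j < V; j++)
        cap[i][j] = Math.max(cap[i][j], Math.min(cap[i][k], cap[k][j]));

  return cap;
}

// Capacities: 0→1 is 5, 1→2 is 3, direct 0→2 is 2
const w = widestPaths(3, [[0, 1, 5], [1, 2, 3], [0, 2, 2]]);
console.log(w[0][2]); // 3 — going via vertex 1 gives bottleneck min(5, 3)
```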
Floyd-Warshall works on any closed semiring. The standard version uses (min, +) for shortest paths. Replace with (max, min) for widest paths, (AND, OR) for transitive closure, or (min, max) for minimax paths. This algebraic viewpoint enables many creative applications.
For completeness, let's compare Floyd-Warshall with other APSP approaches:
Floyd-Warshall vs. Dijkstra × V:
| Factor | Floyd-Warshall | Dijkstra × V |
|---|---|---|
| Time (sparse, E ≈ V) | O(V³) | O(V² log V) |
| Time (dense, E ≈ V²) | O(V³) | O(V³ log V) |
| Negative weights | ✅ Yes | ❌ No |
| Space | O(V²) permanent | O(V) per run |
| Implementation | Very simple | Needs priority queue |
| Memory access | Very cache-friendly | Less predictable |
Floyd-Warshall vs. Johnson's Algorithm:
Johnson's algorithm is designed for sparse graphs with potentially negative weights:
| Factor | Floyd-Warshall | Johnson's Algorithm |
|---|---|---|
| Time Complexity | O(V³) | O(V² log V + VE) |
| Best for Sparse | ❌ No advantage | ✅ O(V² log V) |
| Best for Dense | ✅ O(V³) | ⚠️ O(V³ log V) |
| Negative Weights | ✅ Yes | ✅ Yes |
| Implementation | Very simple | Complex (multiple algorithms) |
| Negative Cycle Detection | ✅ Easy | ✅ During Bellman-Ford phase |
Recommendation Matrix:
- Dense graph, small-to-medium V, or negative weights with all pairs needed: Floyd-Warshall
- Sparse graph with non-negative weights: Dijkstra × V
- Sparse graph with negative weights: Johnson's algorithm
- Very large graphs (tens of thousands of vertices or more): avoid full APSP entirely; answer queries on demand
In practice, Floyd-Warshall is often chosen for its simplicity even when theoretically sub-optimal. For graphs under 1000-2000 vertices, the performance difference rarely matters, and Floyd-Warshall's simplicity reduces bugs and development time.
Let's consolidate the key points about Floyd-Warshall algorithm selection:
- Three nested loops, O(V³) time regardless of edge count, O(V²) space
- Handles negative edge weights and detects negative cycles via negative diagonal entries
- Best on dense graphs and modest vertex counts (roughly V ≤ 1000-2000)
- For sparse graphs with non-negative weights, Dijkstra × V is usually faster; for sparse graphs with negative weights, consider Johnson's algorithm
What's Next:
In the final page of this module, we'll synthesize everything into a complete decision framework. You'll learn to systematically analyze any shortest path problem and select the optimal algorithm based on problem constraints, graph characteristics, and performance requirements.
You now understand Floyd-Warshall as the algorithm of choice for all-pairs shortest paths, particularly on dense graphs. You can evaluate when its O(V³) complexity and O(V²) space are acceptable trade-offs for its simplicity and capability to handle negative weights.