Network Flow — Conceptual Introduction

Level: Advanced

Duration: 90 mins

Topic: Network Flow

Applications — Resource Allocation, Matching, and Beyond

Network Flow as a Problem-Solving Swiss Army Knife

The true power of network flow lies not in solving flow problems directly, but in its ability to model an astonishing variety of seemingly unrelated optimization problems. Once you recognize a problem as a flow problem in disguise, you unlock polynomial-time solutions and deep theoretical insights.

This page explores the most important applications of maximum flow, from classic bipartite matching to sophisticated scheduling and allocation problems. Each reduction demonstrates the art of problem transformation—a skill that separates algorithm designers from algorithm users.

What You'll Master

By the end of this page, you'll understand how to: reduce bipartite matching to max flow, model assignment problems as flow networks, solve resource allocation with capacity constraints, recognize flow patterns in scheduling problems, and apply network flow to real-world scenarios from organ donation to project management.

Bipartite Matching — The Classic Application

The Maximum Bipartite Matching Problem:

Given a bipartite graph G = (X ∪ Y, E) where edges only connect vertices in X to vertices in Y, find the largest set of edges M ⊆ E such that no two edges in M share a vertex.

Real-World Examples:

Assigning students to dorm rooms (each student gets one room, each room one student)
Matching job applicants to positions
Pairing organ donors with recipients
Assigning tasks to workers
Matching medical residents to hospitals

The Key Insight: Maximum bipartite matching reduces directly to maximum flow!

The Reduction:

Create a source s connected to every vertex in X with capacity 1
Create a sink t connected from every vertex in Y with capacity 1
For each edge (x, y) in E, create directed edge x → y with capacity 1

       1    [x₁]----1----[y₁]    1
      ↗       \    /       ↘
    [S]        \  /         [T]
      ↘       /    \       ↗
       1    [x₂]----1----[y₂]    1

Why This Works:

Capacity 1 from source ensures each x ∈ X is matched at most once
Capacity 1 to sink ensures each y ∈ Y is matched at most once
Capacity 1 on matching edges ensures each edge is used at most once
Maximum flow = maximum matching size
The edges carrying flow correspond to matched pairs

Integrality Guarantees Success

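Because all capacities are integers, the integrality theorem guarantees a maximum flow in which every edge carries an integer amount, and augmenting-path algorithms such as Ford-Fulkerson find one. With unit capacities each edge then carries either 0 or 1, never a fraction, so the flow directly gives a valid matching: matched pairs are precisely the X-to-Y edges with flow 1.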

bipartite_matching.py
Python
def max_bipartite_matching(X, Y, edges):
    """
    Find maximum bipartite matching using max flow reduction.
    
    Args:
        X: List of vertices on left side
        Y: List of vertices on right side  
        edges: List of (x, y) pairs representing allowed matchings
    
    Returns:
        List of matched pairs
    """
    from collections import deque
    
    # Build flow network
    SOURCE = 'SOURCE'
    SINK = 'SINK'
    
    # Capacity graph: graph[u][v] = capacity
    graph = {SOURCE: {}, SINK: {}}
    for x in X:
        graph[SOURCE][x] = 1
        graph[x] = {}
    for y in Y:
        graph[y] = {SINK: 1}
    for x, y in edges:
        graph[x][y] = 1
    graph[SINK] = {}
    
    # Run Edmonds-Karp (max flow)
    def bfs_augment():
        parent = {SOURCE: None}
        queue = deque([SOURCE])
        
        while queue:
            u = queue.popleft()
            for v, cap in graph[u].items():
                if v not in parent and cap > 0:
                    parent[v] = u
                    if v == SINK:
                        # Found augmenting path
                        path = []
                        curr = SINK
                        while parent[curr] is not None:
                            path.append((parent[curr], curr))
                            curr = parent[curr]
                        return path
                    queue.append(v)
        return None
    
    # Augment flow
    max_flow = 0
    while True:
        path = bfs_augment()
        if path is None:
            break
        
        # Augment by 1 (all capacities are 1)
        for u, v in path:
            graph[u][v] -= 1
            if v not in graph:
                graph[v] = {}
            graph[v][u] = graph.get(v, {}).get(u, 0) + 1
        max_flow += 1
    
    # Extract matching: an edge (x, y) carries one unit of flow exactly when
    # its reverse residual edge y -> x has positive capacity
    matching = []
    for x in X:
        for y in Y:
            if graph[y].get(x, 0) > 0:
                matching.append((x, y))
    
    return matching
 
 
# Example: Job assignment
workers = ['Alice', 'Bob', 'Charlie']
jobs = ['Frontend', 'Backend', 'DevOps']
skills = [
    ('Alice', 'Frontend'),
    ('Alice', 'Backend'),
    ('Bob', 'Backend'),
    ('Bob', 'DevOps'),
    ('Charlie', 'Frontend'),
]
 
matching = max_bipartite_matching(workers, jobs, skills)
print(f"Maximum matching: {matching}")
# Possible output: [('Alice', 'Backend'), ('Bob', 'DevOps'), ('Charlie', 'Frontend')]

Assignment Problems with Preferences

Basic bipartite matching is binary: either an edge exists or it doesn't. But many real problems have preferences or costs.

The Assignment Problem:

Given n workers and n jobs, where worker i doing job j has cost c(i, j), find a one-to-one assignment minimizing total cost.

The Minimum-Cost Maximum-Flow Variation:

Max flow with costs assigns not just capacities but also per-unit costs to edges. The goal: find maximum flow with minimum total cost.

Reduction for assignment:

Source to each worker: capacity 1, cost 0
Each job to sink: capacity 1, cost 0
Worker i to job j: capacity 1, cost c(i, j)

Minimum-cost max-flow finds the assignment minimizing total cost while ensuring everyone is assigned.
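
The reduction above can be sketched in code. The following is a minimal illustration, assuming a square cost matrix and using successive shortest paths with Bellman-Ford as the path finder; the function name `min_cost_assignment` and the node numbering are assumptions made for this example, not part of the lesson.

```python
def min_cost_assignment(cost):
    """Return (total_cost, {worker: job}) for a square matrix cost[i][j]."""
    n = len(cost)
    source, sink = 2 * n, 2 * n + 1        # workers are 0..n-1, jobs are n..2n-1
    num_nodes = 2 * n + 2
    edges = []                              # edges[k] = [to, remaining capacity, cost]
    adj = [[] for _ in range(num_nodes)]    # adj[u] = indices into edges

    def add_edge(u, v, cap, c):
        # Forward edge at an even index, residual twin at the next odd index,
        # so edge k and edge k ^ 1 always form a forward/backward pair.
        adj[u].append(len(edges))
        edges.append([v, cap, c])
        adj[v].append(len(edges))
        edges.append([u, 0, -c])

    for i in range(n):
        add_edge(source, i, 1, 0)           # source -> worker: capacity 1, cost 0
        add_edge(n + i, sink, 1, 0)         # job -> sink: capacity 1, cost 0
        for j in range(n):
            add_edge(i, n + j, 1, cost[i][j])

    total = 0
    for _augmentation in range(n):          # each round assigns one more worker
        # Bellman-Ford finds the cheapest residual path and tolerates the
        # negative costs on backward edges.
        dist = [float("inf")] * num_nodes
        via = [-1] * num_nodes
        dist[source] = 0
        for _pass in range(num_nodes - 1):
            changed = False
            for u in range(num_nodes):
                if dist[u] == float("inf"):
                    continue
                for k in adj[u]:
                    v, cap, c = edges[k]
                    if cap > 0 and dist[u] + c < dist[v]:
                        dist[v], via[v] = dist[u] + c, k
                        changed = True
            if not changed:
                break
        # Push one unit of flow back along the path from sink to source.
        v = sink
        while v != source:
            k = via[v]
            edges[k][1] -= 1
            edges[k ^ 1][1] += 1
            v = edges[k ^ 1][0]
        total += dist[sink]

    assignment = {}
    for i in range(n):
        for k in adj[i]:
            to, cap, _ = edges[k]
            if n <= to < 2 * n and cap == 0:    # saturated worker -> job edge
                assignment[i] = to - n
    return total, assignment


cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
total, assignment = min_cost_assignment(cost)
print(total, assignment)  # 5 {0: 1, 1: 0, 2: 2}
```

Bellman-Ford keeps the sketch short; production code would more likely use Dijkstra with potentials or a dedicated Hungarian implementation.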

Beyond Basic Max Flow

Minimum-cost maximum-flow requires different algorithms: successive shortest paths, cycle-canceling, or cost-scaling. The key insight is that it's still polynomial-time solvable, extending the power of flow-based approaches to weighted optimization.

Example: Hospital-Resident Matching

The National Resident Matching Program pairs medical residents with hospitals:

Residents rank hospitals by preference
Hospitals rank applicants by preference
Goal: stable matching (no resident-hospital pair would prefer each other over their assignments)

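Stable matching itself is computed by the Gale-Shapley (deferred acceptance) algorithm rather than by flow. A related, capacity-constrained variant that minimizes total preference rank instead of guaranteeing stability can be modeled as flow:

Each hospital has an edge to the sink with capacity = number of positions
Resident-hospital edges have cost = the rank the resident gives that hospital (lower is better)
Min-cost max-flow finds an assignment of minimum total rank that respects the capacity constraints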

Example: Organ Donation Networks

Kidney exchange programs match donor-recipient pairs:

Incompatible pairs can swap donors
Chains of swaps increase total transplants
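Flow networks model compatibility constraints (which donors can give to which recipients)
In this simplified model, maximum flow = maximum number of transplants; real programs also limit the length of swap cycles, which makes the exact problem harder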

Assignment Problem Variants and Reductions
Problem	Reduction Approach	Algorithm
Simple matching	Max flow with unit capacities	Edmonds-Karp / Dinic
Weighted matching	Min-cost max flow	Hungarian / Successive shortest paths
Many-to-one matching	Source/sink capacities > 1	Standard max flow
Constrained matching	Add edges only for valid pairs	Max flow on restricted graph

Resource Allocation — Distributing Limited Resources

Many optimization problems involve distributing limited resources to competing demands. Network flow provides elegant solutions.

The General Resource Allocation Framework:

Sources represent resource providers (factories, warehouses, servers)
Sinks represent resource consumers (stores, users, data centers)
Intermediate nodes represent distribution points (hubs, routers)
Edge capacities represent transportation/processing limits
Maximum flow = maximum total resources delivered

Example: Supply Chain Optimization

A company has:

3 factories producing goods (with production capacities)
2 distribution centers (with handling capacities)
4 retail stores (with demand)
Transportation links with shipping capacities

Network Model:

                     ┌─────┐         ┌───────┐
          ┌─────────►│ DC1 ├────────►│Store1 │
          │  100     └──┬──┘  50     └───────┘
┌─────┐   │             │              
│Fact1├───┤             │80      ┌───────┐
└─────┘   │             ├───────►│Store2 │
   200    │         ┌───┴──┐     └───────┘
          └────────►│ DC2  │
┌─────┐      120    └──┬───┘     ┌───────┐
│Fact2├───────────────►│────────►│Store3 │
└─────┘    direct      │  60     └───────┘
   150                 │
                       │         ┌───────┐
┌─────┐     90         └────────►│Store4 │
│Fact3├───────────────────►      └───────┘
└─────┘            70

Maximum flow tells us the maximum goods deliverable. The min-cut reveals which links to upgrade for higher throughput.

Multi-Commodity Extensions

If different product types share capacity (e.g., trucks carry mixed goods), the problem becomes multi-commodity flow, which is significantly harder: the fractional version can still be solved with linear programming, but requiring integer flows makes it NP-hard. For single-commodity or separable problems, standard max flow suffices.

supply_chain.py
Python
def model_supply_chain(factories, warehouses, stores, links):
    """
    Model a supply chain as a flow network.
    
    Args:
        factories: Dict[name, capacity] - production capacity
        warehouses: Dict[name, capacity] - handling capacity  
        stores: Dict[name, demand] - store demands
        links: List[(from, to, capacity)] - transportation links
    
    Returns:
        Tuple of (graph, source, sink) suitable for max flow algorithms
    """
    SOURCE = 'SUPER_SOURCE'
    SINK = 'SUPER_SINK'
    
    graph = {SOURCE: {}, SINK: {}}
    
    # Super-source connects to factories
    for factory, capacity in factories.items():
        graph[SOURCE][factory] = capacity
        graph[factory] = {}
    
    # Warehouses need vertex splitting for handling capacity
    for wh, capacity in warehouses.items():
        wh_in = f"{wh}_IN"
        wh_out = f"{wh}_OUT"
        graph[wh_in] = {wh_out: capacity}
        graph[wh_out] = {}
    
    # Stores connect to super-sink (with demand as capacity)
    for store, demand in stores.items():
        graph[store] = {SINK: demand}
    
    # Add transportation links
    for src, dst, cap in links:
        # Handle warehouse splits
        actual_src = f"{src}_OUT" if src in warehouses else src
        actual_dst = f"{dst}_IN" if dst in warehouses else dst
        
        # Register both endpoints so every node appears as a key in the graph
        graph.setdefault(actual_src, {})[actual_dst] = cap
        graph.setdefault(actual_dst, {})
    
    return graph, SOURCE, SINK
 
 
# Example usage
factories = {'Factory_A': 200, 'Factory_B': 150, 'Factory_C': 100}
warehouses = {'DC_North': 180, 'DC_South': 200}
stores = {'Store_1': 80, 'Store_2': 100, 'Store_3': 120, 'Store_4': 90}
links = [
    ('Factory_A', 'DC_North', 150),
    ('Factory_A', 'DC_South', 100),
    ('Factory_B', 'DC_North', 80),
    ('Factory_B', 'DC_South', 120),
    ('Factory_C', 'DC_South', 100),
    ('DC_North', 'Store_1', 60),
    ('DC_North', 'Store_2', 70),
    ('DC_South', 'Store_2', 50),
    ('DC_South', 'Store_3', 100),
    ('DC_South', 'Store_4', 80),
]
 
graph, source, sink = model_supply_chain(factories, warehouses, stores, links)
# Now run max_flow(graph, source, sink) to find maximum deliverable goods
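
The `max_flow` call mentioned in the comment above is not defined on this page. A minimal Edmonds-Karp sketch that accepts the same dict-of-dicts shape (`graph[u][v] = capacity`) might look like the following; the function name, its return value, and the small network used to exercise it are assumptions for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns (flow value, residual graph)."""
    # Copy into a residual graph and give every edge a reverse partner.
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, residual
        # Walk back from the sink to collect the path, then augment.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


# Hypothetical miniature supply chain: two factories, one warehouse that is
# split into W_IN -> W_OUT to model its handling capacity, and two stores.
network = {
    "S": {"F1": 10, "F2": 5},
    "F1": {"W_IN": 8},
    "F2": {"W_IN": 5},
    "W_IN": {"W_OUT": 9},
    "W_OUT": {"R1": 6, "R2": 6},
    "R1": {"T": 4},
    "R2": {"T": 7},
}
delivered, _ = max_flow(network, "S", "T")
print(delivered)  # 9, limited by the warehouse handling capacity
```

Returning the residual graph alongside the value makes it possible to read off a min cut afterwards by searching from the source over edges with remaining capacity.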

Scheduling with Constraints

Scheduling problems often reduce to network flow when constraints are capacity-like.

The Crew Scheduling Problem:

Airline needs to assign crews to flights:

Each crew member can work a limited number of hours
Flights require specific crew sizes
Transitions between flights have constraints (rest time, location)

Flow Modeling:

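Source connects to each crew member (capacity = maximum number of flights that member can work, derived from available hours)
Each flight is a node connected to the sink (capacity = required crew size)
Edges from crew members to flights they are qualified and available for (capacity 1)
All flights can be crewed exactly when max flow = total crew required across all flights
Transition constraints (rest time, location) are not captured by this basic model and need a richer network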
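
A feasibility check along these lines can be sketched as follows, assuming each crew member's limit is expressed as a maximum number of flights and ignoring transition constraints. The names `crew_schedule`, `max_flights`, `required`, and `qualified` are hypothetical and chosen only for this example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns (flow value, residual graph)."""
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, residual
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def crew_schedule(max_flights, required, qualified):
    """Return (feasible, roster) where roster maps each flight to its crew."""
    capacity = {"s": {}}
    for member, limit in max_flights.items():
        capacity["s"][("crew", member)] = limit       # workload limit
        capacity[("crew", member)] = {}
    for flight, need in required.items():
        capacity[("flight", flight)] = {"t": need}    # crew size needed
    for member, flight in qualified:
        capacity[("crew", member)][("flight", flight)] = 1
    value, residual = max_flow(capacity, "s", "t")
    feasible = value == sum(required.values())
    # A crew -> flight edge carried flow if its single unit of capacity is gone.
    roster = {flight: [] for flight in required}
    for member, flight in qualified:
        if residual[("crew", member)][("flight", flight)] == 0:
            roster[flight].append(member)
    return feasible, roster


max_flights = {"Ann": 2, "Ben": 1, "Cy": 1}
qualified = [("Ann", "F1"), ("Ann", "F2"), ("Ben", "F1"), ("Cy", "F2")]
print(crew_schedule(max_flights, {"F1": 2, "F2": 1}, qualified))
print(crew_schedule(max_flights, {"F1": 3, "F2": 1}, qualified)[0])  # False
```

The second call fails because only two crew members are qualified for F1, so the flight-to-sink edge of capacity 3 can never be saturated.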

The Meeting Room Problem:

Schedule n meetings in k rooms:

Each meeting has a time slot
Overlapping meetings need different rooms
Some meetings require specific room features

As Bipartite Matching:

Left side: meetings
Right side: (room, time slot) pairs
Edge exists if meeting can use that room at that time
Maximum matching = maximum schedulable meetings

The Sports Tournament Problem (Baseball Elimination):

Can team X still win the championship?

Given: current wins, remaining games between teams
Team X can get at most w_X + r_X total wins
Other teams might accumulate too many wins

Flow formulation reveals impossibility:

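Source → game node for each pair (i, j) of other teams, capacity = games remaining between i and j
Game node (i, j) → team i and team j, capacity ∞ (each game's win goes to one of the two teams)
Team i → sink, capacity = w_X + r_X - w_i (the most wins team i may still collect without passing X)
If max flow = total remaining games between other teams, the wins can be distributed so that no team exceeds w_X + r_X, and X can still win
If max flow < that total, team X is mathematically eliminated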
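
The elimination test can be sketched as below. The network uses the standard construction (source to game nodes, game nodes to their two teams, teams to sink with capacity w_X + r_X - w_i); the function name `is_eliminated`, the data layout, and the four-team standings are assumptions made for this example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns the flow value."""
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def is_eliminated(wins, remaining, games, x):
    """wins/remaining: per-team dicts; games[(i, j)]: games left between i and j."""
    best = wins[x] + remaining[x]             # X wins all of its remaining games
    others = [team for team in wins if team != x]
    if any(wins[team] > best for team in others):
        return True                            # someone is already out of reach
    capacity = {"s": {}}
    total_games = 0
    for (i, j), count in games.items():
        if x in (i, j) or count == 0:
            continue                           # only games between other teams
        node = ("game", i, j)
        capacity["s"][node] = count
        capacity[node] = {("team", i): float("inf"), ("team", j): float("inf")}
        total_games += count
    for team in others:
        capacity[("team", team)] = {"t": best - wins[team]}
    return max_flow(capacity, "s", "t") < total_games


wins = {"Atlanta": 83, "Philadelphia": 80, "New York": 78, "Montreal": 77}
remaining = {"Atlanta": 8, "Philadelphia": 3, "New York": 6, "Montreal": 3}
games = {("Atlanta", "Philadelphia"): 1, ("Atlanta", "New York"): 6,
         ("Atlanta", "Montreal"): 1, ("Philadelphia", "Montreal"): 2}
for team in wins:
    print(team, is_eliminated(wins, remaining, games, team))
```

Philadelphia is the interesting case: it can still reach 83 wins, which ties Atlanta's current total, yet Atlanta and New York still play each other six times and one of them must finish above 83.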

The Power of Feasibility Checking

Often we don't need the actual flow value—just whether a target is achievable. In scheduling, max flow = required resources means the schedule is feasible. This binary check uses the same algorithm but answers a yes/no question.

Scheduling Problems as Flow Networks
Problem	Left Side (Sources)	Right Side (Sinks)	Edge Meaning
Crew scheduling	Crew members	Flights	Crew qualified for flight
Room scheduling	Meetings	Room-time slots	Meeting fits in slot
Exam scheduling	Exams	Time slots	No student conflict
Sports elimination	Games to play	Teams	Game winner

Project Selection — Maximizing Profit with Dependencies

The Project Selection Problem:

Given n projects with profits p₁, ..., pₙ (some negative, representing costs) and dependencies (if you do project i, you must also do project j), find the subset maximizing total profit.

Why Dependencies Matter:

Project A (profit $100) requires prerequisite Project B (cost $30).

Taking A alone: impossible (dependency)
Taking B alone: -$30
Taking both: $100 - $30 = $70 ✓
Taking neither: $0

The optimal is taking both. Dependencies create constraints that flow models elegantly.

The Flow Reduction:

Source s and sink t
For each project i:
- If pᵢ > 0: edge from s to i with capacity pᵢ
- If pᵢ < 0: edge from i to t with capacity |pᵢ|
For each dependency (i requires j): edge from i to j with capacity ∞ (or very large)

The Interpretation:

A min cut (S, T) corresponds to a project selection:

Projects in S (source side) are selected
Projects in T (sink side) are not selected

Cut capacity = lost positive profits + incurred negative costs

If positive-profit project i is in T: we lose pᵢ (edge s→i is cut)
If negative-profit project i is in S: we pay |pᵢ| (edge i→t is cut)
Dependency edges: if i requires j but j ∈ T and i ∈ S, infinite capacity is cut (invalid)

Maximum profit = (sum of all positive profits) - (min cut capacity)

The Elegant Reduction

The min cut naturally respects dependencies (infinite capacity edges prevent violations) and optimizes profit (minimizing cut = maximizing retained profits minus costs). This reduction transforms a combinatorial optimization problem into a graph problem.

project_selection.py
Python
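from collections import deque


def max_profit_projects(projects, dependencies):
    """
    Find maximum profit subset of projects respecting dependencies.

    Args:
        projects: Dict[project_id, profit] - profit (negative = cost)
        dependencies: List[(required, dependent)] - if take dependent, must take required

    Returns:
        Tuple of (max_profit, selected_projects)
    """
    SOURCE = 'SOURCE'
    SINK = 'SINK'
    INF = float('inf')

    # Residual capacities: graph[u][v] = remaining capacity on u -> v
    graph = {SOURCE: {}, SINK: {}}
    for project in projects:
        graph[project] = {}

    def add_edge(u, v, cap):
        graph[u][v] = graph[u].get(v, 0) + cap
        graph[v].setdefault(u, 0)  # reverse residual edge

    # Source edges for positive profits, sink edges for costs
    total_positive = 0
    for project, profit in projects.items():
        if profit > 0:
            add_edge(SOURCE, project, profit)
            total_positive += profit
        elif profit < 0:
            add_edge(project, SINK, -profit)

    # Dependency edges run dependent -> required with infinite capacity, so a
    # finite cut can never keep a project while dropping what it requires
    for required, dependent in dependencies:
        add_edge(dependent, required, INF)

    def bfs_parents():
        """BFS over edges with remaining capacity; returns the parent map."""
        parent = {SOURCE: None}
        queue = deque([SOURCE])
        while queue:
            u = queue.popleft()
            for v, cap in graph[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Run max flow (Edmonds-Karp); its value equals the min cut capacity
    min_cut_value = 0
    while True:
        parent = bfs_parents()
        if SINK not in parent:
            break
        path, v = [], SINK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(graph[u][w] for u, w in path)
        for u, w in path:
            graph[u][w] -= bottleneck
            graph[w][u] += bottleneck
        min_cut_value += bottleneck

    # Selected projects = source side of the min cut, i.e. everything still
    # reachable from the source in the final residual graph
    selected = set(parent) - {SOURCE}

    max_profit = total_positive - min_cut_value

    return max_profit, selected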
 
 
# Example
projects = {
    'Website': 100,      # Revenue $100
    'Backend': -40,      # Cost $40
    'Database': -20,     # Cost $20
    'Analytics': 50,     # Revenue $50
    'ML_Model': 80,      # Revenue $80
    'Data_Pipeline': -30 # Cost $30
}
 
dependencies = [
    ('Backend', 'Website'),     # Website requires Backend
    ('Database', 'Backend'),    # Backend requires Database
    ('Data_Pipeline', 'ML_Model'),  # ML requires Data Pipeline
    ('Database', 'Analytics'),  # Analytics requires Database
]
 
# Optimal: take all six projects
# = 100 - 40 - 20 + 50 + 80 - 30 = $140
# (ML_Model + Data_Pipeline nets 80 - 30 = +$50, so adding the pair raises profit from 90 to 140)

Image Segmentation and Graph Cuts

One of the most elegant applications of max-flow min-cut is in computer vision for image segmentation—separating foreground objects from background.

The Setup:

Each pixel is a node in a graph
Adjacent pixels are connected by edges with weights = similarity
Special treatment for "seed" pixels known to be foreground/background

The Network Model:

Source s represents "foreground"
Sink t represents "background"
Pixel nodes are all image pixels
Source edges: s → pixel with weight = likelihood pixel is foreground
Sink edges: pixel → t with weight = likelihood pixel is background
Pixel-pixel edges: neighboring pixels connected with weight = similarity

Why Min-Cut Works:

The minimum cut partitions pixels into S (foreground) and T (background):

Cutting s→pixel edge: pixel is classified as background despite foreground likelihood (penalty = likelihood score)
Cutting pixel→t edge: pixel is classified as foreground despite background likelihood (penalty = likelihood score)
Cutting pixel-pixel edge: adjacent pixels have different labels (penalty = similarity)

Min cut = minimum total penalty

This naturally:

Respects foreground/background likelihoods
Keeps similar pixels together
Creates smooth, sensible boundaries

The Algorithm:

1. For each pixel, compute foreground and background likelihood
2. Build the graph with appropriate edge weights
3. Compute min cut = max flow
4. Pixels reachable from source in residual graph = foreground
5. Others = background
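
The five steps can be sketched on a toy one-dimensional "image". This is an illustration under simplifying assumptions: pixel values run from 0 to 10, the value itself serves as the foreground likelihood, `10 - value` as the background likelihood, and every pair of neighbours gets the same smoothness weight. The function name `segment` and all parameters are hypothetical.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns (flow value, residual graph)."""
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, residual
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def segment(pixels, max_value=10, smoothness=3):
    """Return (cut cost, list of booleans: True where a pixel is foreground)."""
    capacity = {"s": {}}
    for i, value in enumerate(pixels):
        capacity["s"][i] = value                   # step 1: foreground likelihood
        capacity[i] = {"t": max_value - value}     # step 1: background likelihood
    for i in range(len(pixels) - 1):               # step 2: neighbour edges
        capacity[i][i + 1] = smoothness
        capacity[i + 1][i] = smoothness
    cut_cost, residual = max_flow(capacity, "s", "t")   # step 3
    # Steps 4 and 5: pixels still reachable from the source are foreground.
    reachable, stack = {"s"}, ["s"]
    while stack:
        u = stack.pop()
        for v, cap in residual[u].items():
            if cap > 0 and v not in reachable:
                reachable.add(v)
                stack.append(v)
    return cut_cost, [i in reachable for i in range(len(pixels))]


print(segment([9, 8, 4, 9, 1, 2]))
# (16, [True, True, True, True, False, False])
```

The pixel with value 4 leans towards background on its own, but both of its neighbours are strongly foreground, so paying its background likelihood (6) is cheaper than cutting two smoothness edges plus its foreground likelihood (3 + 3 + 4).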

Interactive Segmentation

In practice, users draw "scribbles" marking definite foreground and background. These become infinite-capacity edges to source/sink. The algorithm then finds the optimal boundary respecting user input—a perfect blend of human guidance and algorithmic optimization.

Real-World Applications:

Medical imaging: Segmenting tumors from healthy tissue
Photo editing: Automatic background removal
Video processing: Object tracking across frames
Autonomous vehicles: Identifying road, obstacles, pedestrians
Augmented reality: Separating person from background for virtual backgrounds

Hall's Theorem and Perfect Matching

Max-flow min-cut provides an algorithmic proof of Hall's Marriage Theorem, a classical result in combinatorics.

Hall's Theorem:

A bipartite graph G = (X ∪ Y, E) has a matching that saturates all of X (every vertex in X is matched) if and only if for every subset S ⊆ X:

|N(S)| ≥ |S|

where N(S) is the neighborhood of S (all vertices in Y adjacent to some vertex in S).

The Condition Intuition:

If there's a subset of, say, 5 elements in X that collectively only connect to 4 elements in Y, we can't match all 5—there aren't enough "partners." Hall's condition says this bottleneck never exists.

Proof via Max-Flow Min-Cut:

If Hall's condition holds:

Construct the flow network (source to X, X to Y by edges, Y to sink)
All capacities are 1
Consider any min cut (S', T') separating source from sink
Let A = X ∩ S' (X vertices on source side) and B = Y ∩ S' (Y vertices on source side)
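Cut capacity = |X - A| + |B| + (number of edges from A to Y - B)

Every vertex of N(A) is paid for by the cut: it either lies in B, which costs 1 for its edge to the sink, or lies in Y - B, in which case at least one edge reaching it from A is cut. Hence |B| + (number of edges from A to Y - B) ≥ |N(A)| ≥ |A|, where the last step is Hall's condition.

So every cut has capacity ≥ |X - A| + |A| = |X|. Cutting all source edges costs exactly |X|, so min cut = max flow = |X|, and by integrality a matching saturating X exists.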

If matching exists:

The matching shows how to route flow 1 to each vertex in X
This achieves flow = |X|
For any subset S ⊆ X, the |S| distinct partners matched to the vertices of S all lie in N(S), so |N(S)| ≥ |S|
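
Hall's condition can be checked directly on small graphs by enumerating subsets. The sketch below is exponential in |X| and meant only to make the condition concrete; the function name `hall_violation` and the adjacency-dict input are assumptions for this example.

```python
from itertools import combinations


def hall_violation(X, adj):
    """Return (S, N(S)) for some S with |N(S)| < |S|, or None if none exists.

    adj maps each vertex of X to an iterable of its neighbours in Y.
    Tries every non-empty subset of X, so use it on small inputs only.
    """
    for size in range(1, len(X) + 1):
        for subset in combinations(X, size):
            neighbourhood = set().union(*(adj.get(x, ()) for x in subset))
            if len(neighbourhood) < size:
                return set(subset), neighbourhood
    return None


# 'a' and 'b' compete for the single partner '1', so no matching saturates X.
print(hall_violation(["a", "b", "c"], {"a": ["1"], "b": ["1"], "c": ["1", "2"]}))

# Here every subset has enough neighbours, so a saturating matching exists.
print(hall_violation(["a", "b", "c"], {"a": ["1", "2"], "b": ["2", "3"], "c": ["1"]}))
```

In practice one runs max flow instead: if the flow value falls short of |X|, the X vertices on the source side of a min cut form a violating subset.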

König's Theorem

A related result: in bipartite graphs, maximum matching size = minimum vertex cover size. This too follows from max-flow min-cut. The duality between matchings and covers mirrors the duality between flows and cuts.
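
König's theorem is constructive: a minimum vertex cover can be read off a maximum matching. The sketch below assumes the standard construction (search along alternating paths from the unmatched X vertices, then take the unvisited X vertices plus the visited Y vertices) and finds the matching with simple augmenting paths; the function name `konig_vertex_cover` is hypothetical.

```python
def konig_vertex_cover(X, adj):
    """Return (matching as {x: y}, minimum vertex cover) of a bipartite graph.

    adj maps each vertex of X to an iterable of its neighbours in Y.
    """
    match_of_y = {}

    def try_augment(x, seen):
        # Depth-first search for an augmenting path starting at x.
        for y in adj.get(x, ()):
            if y in seen:
                continue
            seen.add(y)
            if y not in match_of_y or try_augment(match_of_y[y], seen):
                match_of_y[y] = x
                return True
        return False

    for x in X:
        try_augment(x, set())
    match_of_x = {x: y for y, x in match_of_y.items()}

    # Alternating search: leave X along unmatched edges, return along matched ones.
    visited_x = {x for x in X if x not in match_of_x}
    visited_y = set()
    stack = list(visited_x)
    while stack:
        x = stack.pop()
        for y in adj.get(x, ()):
            if y in visited_y or match_of_x.get(x) == y:
                continue
            visited_y.add(y)
            partner = match_of_y.get(y)
            if partner is not None and partner not in visited_x:
                visited_x.add(partner)
                stack.append(partner)

    cover = {x for x in X if x not in visited_x} | visited_y
    return match_of_x, cover


X = ["a", "b", "c"]
adj = {"a": ["1", "2"], "b": ["1"], "c": ["1"]}
matching, cover = konig_vertex_cover(X, adj)
print(matching, cover)
```

For this graph the matching has two edges and the cover has two vertices (`a` and `1`), which together touch all four edges.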

Classical Theorems from Max-Flow Min-Cut
Theorem	Statement	Connection to Flow
Hall's Marriage	Matching saturating X ↔ Hall's condition	Min cut ≥ |X| ↔ Hall's condition
König's	Max matching = min vertex cover (bipartite)	Direct duality
Menger's	Max disjoint paths = min vertex cut	Unit capacity flow
Dilworth's	Min chain cover = max antichain	DAG path cover reduction

Edge-Disjoint and Vertex-Disjoint Paths

Menger's Theorem connects path connectivity to cut size, and max flow provides an algorithmic proof.

Edge-Disjoint Paths:

The maximum number of edge-disjoint paths from s to t equals the minimum number of edges whose removal disconnects s from t.

Proof via Max Flow:

Give every edge capacity 1
Max flow value = max number of edge-disjoint paths (each path uses 1 unit, paths don't share edges)
Min cut capacity = min number of edges to remove
Max-flow min-cut theorem gives equality
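
Counting edge-disjoint paths then amounts to one max-flow call on a unit-capacity graph. A minimal sketch, assuming a directed graph given as a list of `(u, v)` pairs; the function name `edge_disjoint_paths` is an assumption for this example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns the flow value."""
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def edge_disjoint_paths(edges, s, t):
    """Maximum number of s-t paths that share no edge."""
    capacity = {}
    for u, v in edges:
        capacity.setdefault(u, {})
        capacity[u][v] = capacity[u].get(v, 0) + 1   # parallel edges add up
    return max_flow(capacity, s, t)


two_routes = [("s", "a"), ("s", "b"), ("a", "t"), ("b", "t"), ("a", "b")]
one_bridge = [("s", "a"), ("a", "b"), ("a", "c"), ("b", "t"), ("c", "t")]
print(edge_disjoint_paths(two_routes, "s", "t"))  # 2
print(edge_disjoint_paths(one_bridge, "s", "t"))  # 1, every path uses s -> a
```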

Vertex-Disjoint Paths:

The maximum number of vertex-disjoint paths (sharing no vertices except s and t) equals the minimum number of vertices whose removal disconnects s from t, provided s and t are not adjacent.

Proof via Vertex Splitting:

Split each vertex v (except s, t) into vᵢₙ and vₒᵤₜ, joined by an edge vᵢₙ → vₒᵤₜ with capacity 1; edges into v now enter vᵢₙ and edges out of v leave vₒᵤₜ
This enforces that each vertex is used by at most one path
Max flow = max vertex-disjoint paths
Min cut (now potentially cutting internal edges) = min vertex removal
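
Vertex splitting can be sketched as follows, assuming a directed graph given as `(u, v)` pairs. Each internal vertex `v` becomes the pair `(v, "in")` and `(v, "out")`; that naming scheme and the function name `vertex_disjoint_paths` are assumptions for this example.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on capacity[u][v]; returns the flow value."""
    residual = {source: {}, sink: {}}
    for u, neighbours in capacity.items():
        for v, cap in neighbours.items():
            residual.setdefault(u, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual.setdefault(v, {}).setdefault(u, 0)
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def vertex_disjoint_paths(edges, s, t):
    """Maximum number of s-t paths that share no vertex other than s and t."""
    def entry(v):   # node that edges into v should point at
        return v if v in (s, t) else (v, "in")

    def exit_(v):   # node that edges out of v should leave from
        return v if v in (s, t) else (v, "out")

    capacity = {}
    for u, v in edges:
        for w in (u, v):
            if w not in (s, t):
                # Internal edge: at most one path may pass through w.
                capacity.setdefault((w, "in"), {})[(w, "out")] = 1
        capacity.setdefault(exit_(u), {})[entry(v)] = 1
    return max_flow(capacity, s, t)


# Two edge-disjoint routes exist here, but both must pass through vertex m.
hub = [("s", "a"), ("s", "b"), ("a", "m"), ("b", "m"),
       ("m", "c"), ("m", "d"), ("c", "t"), ("d", "t")]
print(vertex_disjoint_paths(hub, "s", "t"))  # 1
print(vertex_disjoint_paths([("s", "a"), ("a", "t"), ("s", "b"), ("b", "t")], "s", "t"))  # 2
```

The hub graph shows why the two notions differ: removing the single vertex `m` disconnects s from t even though no single edge does.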

Applications of Disjoint Paths:

Network Reliability:

Edge-disjoint paths = redundant connections (if one cable fails, alternatives exist)
Maximum disjoint paths = fault tolerance level

Routing:

Finding independent routes for load balancing
Ensuring no single link failure breaks connectivity

Security:

Minimizing single points of failure
Identifying critical edges whose failure partitions the network

The Unifying Theme

Edge-disjoint paths, vertex-disjoint paths, connectivity, and reliability all reduce to max flow. The flow value tells you the robustness; the min cut tells you the vulnerabilities. One algorithm, many insights.

Summary and Looking Forward

We've explored the rich tapestry of problems solvable via network flow. Let's consolidate the key insights:

Key Takeaways

• Bipartite matching reduces to max flow with unit capacities. Flow value = matching size.
• Assignment problems with preferences use minimum-cost maximum-flow for optimal weighted matching.
• Resource allocation models supply chains, distribution networks, and logistics as flow problems.
• Scheduling with capacity constraints (crews, rooms, time slots) often has natural flow formulations.
• Project selection with dependencies uses min-cut to maximize profit while respecting constraints.
• Image segmentation uses graph cuts to partition pixels into foreground/background optimally.
• Classical theorems (Hall, König, Menger, Dilworth) are consequences of max-flow min-cut.
• Disjoint paths and connectivity problems reduce to flow on unit-capacity networks.

The Reduction Mindset:

The most valuable skill from this module isn't memorizing these reductions—it's developing the reduction mindset. When you encounter an optimization problem, ask:

What's being maximized/minimized?
Are there capacity-like constraints?
Is there a natural "source" and "sink"?
Can I model it as: maximize flow from A to B subject to constraints?
Can I model it as: minimize cost of separating A from B?

If the answers align with flow structure, you've found a polynomial-time solution. This pattern recognition is what makes network flow a powerful tool in your algorithmic toolkit.

Module Complete

Congratulations! You've completed the Network Flow — Conceptual Introduction module. You now understand flow networks, the Ford-Fulkerson method, the max-flow min-cut theorem, and a wealth of applications. These concepts form the foundation for advanced topics like minimum-cost flow, multi-commodity flow, and approximation algorithms for NP-hard problems.

What's Next in Your Journey:

Network flow opens doors to advanced algorithmic techniques:

Linear programming generalizes flow to arbitrary linear constraints
Approximation algorithms use flow as a subroutine for NP-hard problems
Parameterized complexity analyzes flow on special graph classes
Randomized algorithms for faster flow computation

The concepts you've learned here are foundational—they'll appear again and again as you advance in algorithmic problem-solving.

5 / 5

Loading learning content...

Data Structures & AlgorithmsNetwork Flow

Network Flow — Conceptual Introduction

LevelAdvanced

Duration90 mins

TopicNetwork Flow

5 / 5

Applications — Resource Allocation, Matching, and Beyond

Network Flow as a Problem-Solving Swiss Army Knife

What You'll Master

Bipartite Matching — The Classic Application

The Maximum Bipartite Matching Problem:

Given a bipartite graph G = (X ∪ Y, E) where edges only connect vertices in X to vertices in Y, find the largest set of edges M ⊆ E such that no two edges in M share a vertex.

Real-World Examples:

Assigning students to dorm rooms (each student gets one room, each room one student)
Matching job applicants to positions
Pairing organ donors with recipients
Assigning tasks to workers
Matching medical residents to hospitals

The Key Insight: Maximum bipartite matching reduces directly to maximum flow!

The Reduction:

Create a source s connected to every vertex in X with capacity 1
Create a sink t connected from every vertex in Y with capacity 1
For each edge (x, y) in E, create directed edge x → y with capacity 1

       1    [x₁]----1----[y₁]    1
      ↗       \    /       ↘
    [S]        \  /         [T]
      ↘       /    \       ↗
       1    [x₂]----1----[y₂]    1

Why This Works:

Capacity 1 from source ensures each x ∈ X is matched at most once
Capacity 1 to sink ensures each y ∈ Y is matched at most once
Capacity 1 on matching edges ensures each edge is used at most once
Maximum flow = maximum matching size
The edges carrying flow correspond to matched pairs

Integrality Guarantees Success

bipartite_matching.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
def max_bipartite_matching(X, Y, edges):
    """
    Find maximum bipartite matching using max flow reduction.
    
    Args:
        X: List of vertices on left side
        Y: List of vertices on right side  
        edges: List of (x, y) pairs representing allowed matchings
    
    Returns:
        List of matched pairs
    """
    from collections import deque
    
    # Build flow network
    SOURCE = 'SOURCE'
    SINK = 'SINK'
    
    # Capacity graph: graph[u][v] = capacity
    graph = {SOURCE: {}, SINK: {}}
    for x in X:
        graph[SOURCE][x] = 1
        graph[x] = {}
    for y in Y:
        graph[y] = {SINK: 1}
    for x, y in edges:
        graph[x][y] = 1
    graph[SINK] = {}
    
    # Run Edmonds-Karp (max flow)
    def bfs_augment():
        parent = {SOURCE: None}
        queue = deque([SOURCE])
        
        while queue:
            u = queue.popleft()
            for v, cap in graph[u].items():
                if v not in parent and cap > 0:
                    parent[v] = u
                    if v == SINK:
                        # Found augmenting path
                        path = []
                        curr = SINK
                        while parent[curr] is not None:
                            path.append((parent[curr], curr))
                            curr = parent[curr]
                        return path
                    queue.append(v)
        return None
    
    # Augment flow
    max_flow = 0
    while True:
        path = bfs_augment()
        if path is None:
            break
        
        # Augment by 1 (all capacities are 1)
        for u, v in path:
            graph[u][v] -= 1
            if v not in graph:
                graph[v] = {}
            graph[v][u] = graph.get(v, {}).get(u, 0) + 1
        max_flow += 1
    
    # Extract matching: edges from X to Y with flow (i.e., capacity exhausted)
    matching = []
    for x in X:
        for y in Y:
            if (x, y) in [(u, v) for u, v in edges]:
                # If original capacity 1 is now 0, edge is matched
                if graph[x].get(y, 0) == 0 and y in edges_dict.get(x, set()):
                    matching.append((x, y))
    
    # Cleaner extraction: check reverse edges
    matching = []
    for x in X:
        for y in Y:
            if graph.get(y, {}).get(x, 0) > 0:  # Reverse edge exists with flow
                matching.append((x, y))
    
    return matching
 
 
# Example: Job assignment
workers = ['Alice', 'Bob', 'Charlie']
jobs = ['Frontend', 'Backend', 'DevOps']
skills = [
    ('Alice', 'Frontend'),
    ('Alice', 'Backend'),
    ('Bob', 'Backend'),
    ('Bob', 'DevOps'),
    ('Charlie', 'Frontend'),
]
 
matching = max_bipartite_matching(workers, jobs, skills)
print(f"Maximum matching: {matching}")
# Possible output: [('Alice', 'Backend'), ('Bob', 'DevOps'), ('Charlie', 'Frontend')]

Assignment Problems with Preferences

Basic bipartite matching is binary: either an edge exists or it doesn't. But many real problems have preferences or costs.

The Assignment Problem:

Given n workers and n jobs, where worker i doing job j has cost c(i, j), find a one-to-one assignment minimizing total cost.

The Minimum-Cost Maximum-Flow Variation:

Max flow with costs assigns not just capacities but also per-unit costs to edges. The goal: find maximum flow with minimum total cost.

Reduction for assignment:

Source to each worker: capacity 1, cost 0
Each job to sink: capacity 1, cost 0
Worker i to job j: capacity 1, cost c(i, j)

Minimum-cost max-flow finds the assignment minimizing total cost while ensuring everyone is assigned.

Beyond Basic Max Flow

Example: Hospital-Resident Matching

The National Resident Matching Program pairs medical residents with hospitals:

Residents rank hospitals by preference
Hospitals rank applicants by preference
Goal: stable matching (no resident-hospital pair would prefer each other over their assignments)

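Stable matching itself is solved by the Gale-Shapley algorithm, and a flow-based assignment is not guaranteed to be stable. When the goal is instead to optimize overall preference satisfaction under capacity limits, the problem can be modeled as flow:

Each hospital is a vertex whose edge to the sink has capacity = number of positions
Resident-hospital edges have costs derived from preference rankings (a better rank means a lower cost)
Min-cost max-flow finds the lowest-cost assignment respecting capacity constraints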

Example: Organ Donation Networks

Kidney exchange programs match donor-recipient pairs:

Incompatible pairs can swap donors
Chains of swaps increase total transplants
Flow networks model compatibility constraints (which donor can give to which recipient)
In a simplified model, maximum flow = maximum number of compatible transplants

Assignment Problem Variants and Reductions
Problem	Reduction Approach	Algorithm
Simple matching	Max flow with unit capacities	Edmonds-Karp / Dinic
Weighted matching	Min-cost max flow	Hungarian / Successive shortest paths
Many-to-one matching	Source/sink capacities > 1	Standard max flow
Constrained matching	Add edges only for valid pairs	Max flow on restricted graph
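
The many-to-one row can be illustrated with a short sketch. It assumes a plain Edmonds-Karp helper written for this example (`max_flow`, not necessarily the one used elsewhere on this page), and the resident and hospital names are invented. The only change from one-to-one matching is that each hospital's edge to the sink has capacity equal to its number of positions.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp. capacity is a dict of dicts; returns (flow value, residual graph)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, cap in nbrs.items():
            residual[u][v] += cap
            residual[v][u] += 0          # make sure the reverse edge exists
    value = 0
    while True:
        parent = {source: None}          # BFS for a shortest augmenting path
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, residual
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def many_to_one_matching(acceptable, positions):
    """acceptable: resident -> list of hospitals; positions: hospital -> open slots."""
    capacity = {'s': {}}
    for resident, hospitals in acceptable.items():
        capacity['s'][('r', resident)] = 1
        capacity[('r', resident)] = {('h', h): 1 for h in hospitals}
    for hospital, slots in positions.items():
        capacity[('h', hospital)] = {'t': slots}   # sink-side capacity > 1
    value, residual = max_flow(capacity, 's', 't')
    # A resident -> hospital edge carries flow when its reverse residual is positive
    placed = {r: h for r, hs in acceptable.items() for h in hs
              if residual[('h', h)][('r', r)] > 0}
    return value, placed


acceptable = {'Ana': ['General'], 'Ben': ['General'],
              'Cy': ['General', 'Mercy'], 'Dee': ['Mercy']}
positions = {'General': 2, 'Mercy': 1}
value, placed = many_to_one_matching(acceptable, positions)
print(value, placed)  # 3 residents placed; one is left unmatched
```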

Resource Allocation — Distributing Limited Resources

Many optimization problems involve distributing limited resources to competing demands. Network flow provides elegant solutions.

The General Resource Allocation Framework:

Sources represent resource providers (factories, warehouses, servers)
Sinks represent resource consumers (stores, users, data centers)
Intermediate nodes represent distribution points (hubs, routers)
Edge capacities represent transportation/processing limits
Maximum flow = maximum total resources delivered

Example: Supply Chain Optimization

A company has:

3 factories producing goods (with production capacities)
2 distribution centers (with handling capacities)
4 retail stores (with demand)
Transportation links with shipping capacities

Network Model:

                     ┌─────┐         ┌───────┐
          ┌─────────►│ DC1 ├────────►│Store1 │
          │  100     └──┬──┘  50     └───────┘
┌─────┐   │             │              
│Fact1├───┤             │80      ┌───────┐
└─────┘   │             ├───────►│Store2 │
   200    │         ┌───┴──┐     └───────┘
          └────────►│ DC2  │
┌─────┐      120    └──┬───┘     ┌───────┐
│Fact2├───────────────►│────────►│Store3 │
└─────┘    direct      │  60     └───────┘
   150                 │
                       │         ┌───────┐
┌─────┐     90         └────────►│Store4 │
│Fact3├───────────────────►      └───────┘
└─────┘            70

Maximum flow tells us the maximum goods deliverable. The min-cut reveals which links to upgrade for higher throughput.

Multi-Commodity Extensions

supply_chain.py
Python
def model_supply_chain(factories, warehouses, stores, links):
    """
    Model a supply chain as a flow network.
    
    Args:
        factories: Dict[name, capacity] - production capacity
        warehouses: Dict[name, capacity] - handling capacity  
        stores: Dict[name, demand] - store demands
        links: List[(from, to, capacity)] - transportation links
    
    Returns:
        Tuple of (graph, source, sink): a dict-of-dicts capacity graph
        suitable for max flow algorithms, plus the super-source/super-sink names
    """
    SOURCE = 'SUPER_SOURCE'
    SINK = 'SUPER_SINK'
    
    graph = {SOURCE: {}, SINK: {}}
    
    # Super-source connects to factories
    for factory, capacity in factories.items():
        graph[SOURCE][factory] = capacity
        graph[factory] = {}
    
    # Warehouses need vertex splitting for handling capacity
    for wh, capacity in warehouses.items():
        wh_in = f"{wh}_IN"
        wh_out = f"{wh}_OUT"
        graph[wh_in] = {wh_out: capacity}
        graph[wh_out] = {}
    
    # Stores connect to super-sink (with demand as capacity)
    for store, demand in stores.items():
        graph[store] = {SINK: demand}
    
    # Add transportation links
    for src, dst, cap in links:
        # Handle warehouse splits
        actual_src = f"{src}_OUT" if src in warehouses else src
        actual_dst = f"{dst}_IN" if dst in warehouses else dst
        
        if actual_src not in graph:
            graph[actual_src] = {}
        graph[actual_src][actual_dst] = cap
    
    return graph, SOURCE, SINK
 
 
# Example usage
factories = {'Factory_A': 200, 'Factory_B': 150, 'Factory_C': 100}
warehouses = {'DC_North': 180, 'DC_South': 200}
stores = {'Store_1': 80, 'Store_2': 100, 'Store_3': 120, 'Store_4': 90}
links = [
    ('Factory_A', 'DC_North', 150),
    ('Factory_A', 'DC_South', 100),
    ('Factory_B', 'DC_North', 80),
    ('Factory_B', 'DC_South', 120),
    ('Factory_C', 'DC_South', 100),
    ('DC_North', 'Store_1', 60),
    ('DC_North', 'Store_2', 70),
    ('DC_South', 'Store_2', 50),
    ('DC_South', 'Store_3', 100),
    ('DC_South', 'Store_4', 80),
]
 
graph, source, sink = model_supply_chain(factories, warehouses, stores, links)
# Now run max_flow(graph, source, sink) to find maximum deliverable goods
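
To make the last step concrete, here is a hedged sketch of a max flow routine that accepts the same dict-of-dicts shape and also reports which edges cross the min cut, i.e. the bottleneck links. The helper name `max_flow_min_cut` and the small graph are assumptions for illustration; the graph uses one split distribution center (`DC_IN` → `DC_OUT`) with made-up capacities.

```python
from collections import deque


def max_flow_min_cut(graph, source, sink):
    """Edmonds-Karp on a dict-of-dicts capacity graph.

    Returns (flow value, list of original edges crossing the min cut).
    """
    # Copy capacities into a residual graph and add zero-capacity reverse edges
    residual = {u: dict(nbrs) for u, nbrs in graph.items()}
    for u, nbrs in graph.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    def bfs():
        """Parent map of every node reachable over positive-residual edges."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    value = 0
    parent = bfs()
    while sink in parent:
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck
        parent = bfs()

    # Nodes still reachable from the source form the source side of a min cut
    cut = [(u, v) for u, nbrs in graph.items() for v in nbrs
           if u in parent and v not in parent]
    return value, cut


toy = {
    'S': {'F1': 200, 'F2': 150},
    'F1': {'DC_IN': 100},
    'F2': {'DC_IN': 120},
    'DC_IN': {'DC_OUT': 180},            # handling capacity via vertex splitting
    'DC_OUT': {'Store1': 50, 'Store2': 80, 'Store3': 60},
    'Store1': {'T': 50}, 'Store2': {'T': 80}, 'Store3': {'T': 100},
    'T': {},
}
flow, bottlenecks = max_flow_min_cut(toy, 'S', 'T')
print(flow, bottlenecks)  # 180 [('DC_IN', 'DC_OUT')]
```

In this toy network the distribution center's handling capacity is the single bottleneck, so raising any shipping link alone would not increase throughput.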

Scheduling with Constraints

Scheduling problems often reduce to network flow when constraints are capacity-like.

The Crew Scheduling Problem:

Airline needs to assign crews to flights:

Each crew member can work a limited number of hours
Flights require specific crew sizes
Transitions between flights have constraints (rest time, location)

Flow Modeling:

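Source connects to each crew member (capacity = number of flights that member can work within their hour limit)
Each flight is a node connected to the sink (capacity = required crew size)
Edges from crew members to flights (capacity 1) based on qualification/availability
Flow units are crew-to-flight assignments, so every capacity is measured in assignments
All flights can be crewed exactly when max flow = total crew required across all flights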

The Meeting Room Problem:

Schedule n meetings in k rooms:

Each meeting has a time slot
Overlapping meetings need different rooms
Some meetings require specific room features

As Bipartite Matching:

Left side: meetings
Right side: (room, time slot) pairs
Edge exists if meeting can use that room at that time
Maximum matching = maximum schedulable meetings
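
As a small illustration of this matching view, the sketch below uses the augmenting-path method for bipartite matching (Kuhn's algorithm, which is max flow specialized to unit capacities). The function name, the rooms, and the meetings are hypothetical, and each meeting is assumed to have one fixed time slot.

```python
def schedule_meetings(meetings, rooms):
    """meetings: name -> (slot, required features); rooms: name -> available features.

    Returns a dict mapping each scheduled meeting to its (room, slot) pair.
    """
    owner = {}  # (room, slot) -> meeting currently holding it

    def try_assign(meeting, seen):
        slot, needs = meetings[meeting]
        for room, features in rooms.items():
            key = (room, slot)
            if needs <= features and key not in seen:   # edge exists in the bipartite graph
                seen.add(key)
                # Take a free slot, or evict the holder if it can move elsewhere
                if key not in owner or try_assign(owner[key], seen):
                    owner[key] = meeting
                    return True
        return False

    for meeting in meetings:
        try_assign(meeting, set())
    return {meeting: key for key, meeting in owner.items()}


rooms = {'R1': {'projector'}, 'R2': set()}
meetings = {
    'standup': (9, set()),
    'demo': (9, {'projector'}),
    'review': (9, {'projector'}),
    'retro': (10, set()),
}
schedule = schedule_meetings(meetings, rooms)
print(len(schedule), schedule)  # 3 meetings fit; 'demo' and 'review' compete for R1 at 9
```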

The Sports Tournament Problem (Baseball Elimination):

Can team X still win the championship?

Given: current wins, remaining games between teams
Team X can get at most w_X + r_X total wins
Other teams might accumulate too many wins

Flow formulation reveals impossibility:

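Construct a network: source → one node per remaining pairing of other teams (capacity = games left in that pairing), pairing node → each of its two teams, team i → sink (capacity = w_X + r_X - w_i)
If max flow = total remaining games between other teams, every game can be given a winner without any team exceeding w_X + r_X
If max flow < total remaining games, team X is mathematically eliminated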
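
A minimal sketch of this elimination test follows. The standings are invented for illustration, the helper names are assumptions, and game-to-team edges use the pairing's game count as capacity, which is equivalent to an unbounded capacity here because the source edge already limits them.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp on a dict-of-dicts capacity map; returns the flow value."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, cap in nbrs.items():
            residual[u][v] += cap
            residual[v][u] += 0
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def is_eliminated(team, wins, remaining):
    """wins: team -> current wins; remaining: (team_a, team_b) -> games left between them."""
    best = wins[team] + sum(g for pair, g in remaining.items() if team in pair)
    if any(w > best for w in wins.values()):
        return True                                      # trivially eliminated
    capacity = defaultdict(dict)
    total = 0
    for (a, b), games in remaining.items():
        if team in (a, b) or games == 0:
            continue
        capacity['SOURCE'][('game', a, b)] = games       # games left to hand out
        capacity[('game', a, b)] = {a: games, b: games}  # each game is won by a or b
        total += games
    for other, w in wins.items():
        if other != team:
            capacity[other]['SINK'] = best - w           # wins `other` may still take
    return max_flow(capacity, 'SOURCE', 'SINK') < total


wins = {'A': 83, 'B': 80, 'C': 78, 'D': 77}
remaining = {('A', 'B'): 1, ('A', 'C'): 6, ('A', 'D'): 1, ('B', 'D'): 2}
print({t: is_eliminated(t, wins, remaining) for t in wins})
# {'A': False, 'B': True, 'C': False, 'D': True}
```

Team B is the interesting case: it can still reach 83 wins, matching A's current total, but A and C have six games left between them and C can absorb at most five of those wins without passing 83.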

The Power of Feasibility Checking

Scheduling Problems as Flow Networks
Problem	Left Side (Sources)	Right Side (Sinks)	Edge Meaning
Crew scheduling	Crew members	Flights	Crew qualified for flight
Room scheduling	Meetings	Room-time slots	Meeting fits in slot
Exam scheduling	Exams	Time slots	No student conflict
Sports elimination	Games to play	Teams	Game winner

Project Selection — Maximizing Profit with Dependencies

The Project Selection Problem:

Given n projects with profits p₁, ..., pₙ (some negative, representing costs) and dependencies (if you do project i, you must also do project j), find the subset maximizing total profit.

Why Dependencies Matter:

Project A (profit $100) requires prerequisite Project B (cost $30).

Taking A alone: impossible (dependency)
Taking B alone: -$30
Taking both: $100 - $30 = $70 ✓
Taking neither: $0

The optimal is taking both. Dependencies create constraints that flow models elegantly.

The Flow Reduction:

Source s and sink t
For each project i:
- If pᵢ > 0: edge from s to i with capacity pᵢ
- If pᵢ < 0: edge from i to t with capacity |pᵢ|
For each dependency (i requires j): edge from i to j with capacity ∞ (or very large)

The Interpretation:

A min cut (S, T) corresponds to a project selection:

Projects in S (source side) are selected
Projects in T (sink side) are not selected

Cut capacity = lost positive profits + incurred negative costs

If positive-profit project i is in T: we lose pᵢ (edge s→i is cut)
If negative-profit project i is in S: we pay |pᵢ| (edge i→t is cut)
Dependency edges: if i requires j but j ∈ T and i ∈ S, infinite capacity is cut (invalid)

Maximum profit = (sum of all positive profits) - (min cut capacity)
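
As a quick check of this formula, consider the two-project example from earlier (A earns $100 and requires B, which costs $30), with the dependency edge pointing from A to the project it requires. Writing c(S) for the capacity of the cut whose source side is S, all four cuts can be enumerated by hand:

```latex
\begin{aligned}
\text{edges: } & s \to A \ (100), \quad A \to B \ (\infty), \quad B \to t \ (30) \\
c(\{s\}) &= 100 && \text{(take neither)} \\
c(\{s, A, B\}) &= 30 && \text{(take both)} \\
c(\{s, B\}) &= 100 + 30 = 130 && \text{(take B alone)} \\
c(\{s, A\}) &= \infty && \text{(take A without B: invalid)} \\
\text{max profit} &= 100 - \min(100,\ 30,\ 130,\ \infty) = 100 - 30 = 70
\end{aligned}
```

Each finite cut gives 100 minus its capacity as the profit of the corresponding selection (0, 70, and -30), which agrees with the four options listed in the example.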

The Elegant Reduction

project_selection.py
Python
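from collections import deque


def max_profit_projects(projects, dependencies):
    """
    Find maximum profit subset of projects respecting dependencies.

    Args:
        projects: Dict[project_id, profit] - profit (negative = cost)
        dependencies: List[(required, dependent)] - if take dependent, must take required

    Returns:
        Tuple of (max_profit, selected_projects)
    """
    SOURCE = 'SOURCE'
    SINK = 'SINK'
    INF = float('inf')

    # Residual graph: residual[u][v] = remaining capacity on u -> v
    residual = {SOURCE: {}, SINK: {}}
    for project in projects:
        residual[project] = {}

    def add_edge(u, v, cap):
        residual[u][v] = residual[u].get(v, 0) + cap
        residual[v].setdefault(u, 0)  # reverse edge, used to undo flow

    # Source edges for positive profits, sink edges for costs
    total_positive = 0
    for project, profit in projects.items():
        if profit > 0:
            add_edge(SOURCE, project, profit)
            total_positive += profit
        elif profit < 0:
            add_edge(project, SINK, -profit)

    # Dependency edges run dependent -> required with infinite capacity, so no
    # finite cut can keep a dependent project while dropping its requirement
    for required, dependent in dependencies:
        add_edge(dependent, required, INF)

    def bfs():
        """Parent map of all nodes reachable from SOURCE over positive-residual edges."""
        parent = {SOURCE: None}
        queue = deque([SOURCE])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    # Min cut = max flow (Edmonds-Karp: augment along BFS paths)
    min_cut_value = 0
    parent = bfs()
    while SINK in parent:
        path, v = [], SINK
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        min_cut_value += bottleneck
        parent = bfs()

    # Selected projects = nodes still reachable from the source (source side of the min cut)
    selected = set(parent) - {SOURCE}

    max_profit = total_positive - min_cut_value

    return max_profit, selected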
 
 
# Example
projects = {
    'Website': 100,      # Revenue $100
    'Backend': -40,      # Cost $40
    'Database': -20,     # Cost $20
    'Analytics': 50,     # Revenue $50
    'ML_Model': 80,      # Revenue $80
    'Data_Pipeline': -30 # Cost $30
}
 
dependencies = [
    ('Backend', 'Website'),     # Website requires Backend
    ('Database', 'Backend'),    # Backend requires Database
    ('Data_Pipeline', 'ML_Model'),  # ML requires Data Pipeline
    ('Database', 'Analytics'),  # Analytics requires Database
]
 
# Optimal: take all six projects
# = 100 - 40 - 20 + 50 + 80 - 30 = $140
# (ML_Model + Data_Pipeline nets 80 - 30 = +50 and shares no dependencies
#  with the other projects, so adding it raises the total from 90 to 140)

Image Segmentation and Graph Cuts

One of the most elegant applications of max-flow min-cut is in computer vision for image segmentation—separating foreground objects from background.

The Setup:

Each pixel is a node in a graph
Adjacent pixels are connected by edges with weights = similarity
Special treatment for "seed" pixels known to be foreground/background

The Network Model:

Source s represents "foreground"
Sink t represents "background"
Pixel nodes are all image pixels
Source edges: s → pixel with weight = likelihood pixel is foreground
Sink edges: pixel → t with weight = likelihood pixel is background
Pixel-pixel edges: neighboring pixels connected with weight = similarity

Why Min-Cut Works:

The minimum cut partitions pixels into S (foreground) and T (background):

Cutting s→pixel edge: pixel is classified as background despite foreground likelihood (penalty = likelihood score)
Cutting pixel→t edge: pixel is classified as foreground despite background likelihood (penalty = likelihood score)
Cutting pixel-pixel edge: adjacent pixels have different labels (penalty = similarity)

Min cut = minimum total penalty

This naturally:

Respects foreground/background likelihoods
Keeps similar pixels together
Creates smooth, sensible boundaries

The Algorithm:

1. For each pixel, compute foreground and background likelihood
2. Build the graph with appropriate edge weights
3. Compute min cut = max flow
4. Pixels reachable from source in residual graph = foreground
5. Others = background
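
The five steps can be tried on a toy one-dimensional "image". This is only a sketch under stated assumptions: intensities are integers on a 0 to 10 scale, a pixel's intensity is used directly as its foreground likelihood and 10 minus the intensity as its background likelihood, and every neighboring pair gets the same smoothness weight. Real systems derive these weights from color models and local contrast.

```python
from collections import defaultdict, deque


def min_cut(capacity, source, sink):
    """Edmonds-Karp; returns (cut value, set of nodes on the source side)."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, nbrs in capacity.items():
        for v, cap in nbrs.items():
            residual[u][v] += cap
            residual[v][u] += 0
    value = 0
    while True:
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value, set(parent)    # step 4: source-reachable nodes
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
        value += bottleneck


def segment(intensity, max_level=10, smoothness=3):
    """Label pixels of a 1-D image; returns (total penalty, foreground pixel indices)."""
    capacity = defaultdict(dict)
    for i, level in enumerate(intensity):
        capacity['s'][i] = level                 # step 1: foreground likelihood
        capacity[i]['t'] = max_level - level     # step 1: background likelihood
        if i + 1 < len(intensity):               # step 2: neighbor edges, both directions
            capacity[i][i + 1] = smoothness
            capacity[i + 1][i] = smoothness
    penalty, source_side = min_cut(capacity, 's', 't')   # step 3
    return penalty, sorted(p for p in source_side if p != 's')


penalty, foreground = segment([9, 8, 7, 2, 1, 0])
print(penalty, foreground)  # 12 [0, 1, 2]
```

The penalty of 12 breaks down as 1 + 2 + 3 for the foreground pixels' cut sink edges, 2 + 1 + 0 for the background pixels' cut source edges, and 3 for the single boundary between pixels 2 and 3.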

Interactive Segmentation

Real-World Applications:

Medical imaging: Segmenting tumors from healthy tissue
Photo editing: Automatic background removal
Video processing: Object tracking across frames
Autonomous vehicles: Identifying road, obstacles, pedestrians
Augmented reality: Separating person from background for virtual backgrounds

Hall's Theorem and Perfect Matching

Max-flow min-cut provides an algorithmic proof of Hall's Marriage Theorem, a classical result in combinatorics.

Hall's Theorem:

A bipartite graph G = (X ∪ Y, E) has a matching that saturates all of X (every vertex in X is matched) if and only if for every subset S ⊆ X:

|N(S)| ≥ |S|

where N(S) is the neighborhood of S (all vertices in Y adjacent to some vertex in S).

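The Condition Intuition:

If some group S of k vertices in X has fewer than k neighbors in Y between them, there are not enough partners to go around, so they cannot all be matched. Hall's theorem says this obvious obstacle is the only one.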

Proof via Max-Flow Min-Cut:

If Hall's condition holds:

Construct the flow network (source to X, X to Y by edges, Y to sink)
All capacities are 1
Consider any min cut (S', T') separating source from sink
Let A = X ∩ S' (X vertices on source side) and B = Y ∩ S' (Y vertices on source side)
Cut capacity = |X - A| + |B| + (edges from A to Y - B)

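Every vertex in N(A) is either in B (contributing 1 to |B|) or in Y - B (so at least one edge from A to it crosses the cut). Hence |B| + (edges from A to Y - B) ≥ |N(A)| ≥ |A| by Hall's condition: the cut must "pay" for each vertex in A either by including its neighbor in B or by cutting the edge.

This forces cut capacity ≥ |X - A| + |A| = |X| for every cut. The cut around the source alone has capacity exactly |X|, so min cut = max flow = |X|, meaning a matching saturating X exists.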

If matching exists:

The matching shows how to route 1 unit of flow through each vertex in X
This achieves flow = |X|
For any subset S ⊆ X, the matching pairs the vertices of S with |S| distinct vertices of Y, all of which lie in N(S), so |N(S)| ≥ |S|
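
On small instances, both sides of the theorem can be checked by brute force. The sketch below is exponential and meant only as an illustration; the function names are hypothetical. The first instance reuses the worker/job skills from the job assignment example, and the second is a made-up instance that violates Hall's condition.

```python
from itertools import combinations, permutations


def hall_violation(X, neighbors):
    """Return a subset S of X with |N(S)| < |S|, or None if Hall's condition holds."""
    for k in range(1, len(X) + 1):
        for S in combinations(X, k):
            N = set().union(*(neighbors[x] for x in S))
            if len(N) < len(S):
                return set(S)
    return None


def has_saturating_matching(X, Y, neighbors):
    """Try every injective assignment of X into Y (fine for tiny inputs only)."""
    return any(all(y in neighbors[x] for x, y in zip(X, choice))
               for choice in permutations(Y, len(X)))


workers = ['Alice', 'Bob', 'Charlie']
jobs = ['Frontend', 'Backend', 'DevOps']
skills = {'Alice': {'Frontend', 'Backend'},
          'Bob': {'Backend', 'DevOps'},
          'Charlie': {'Frontend'}}
print(hall_violation(workers, skills), has_saturating_matching(workers, jobs, skills))
# None True

narrow = {'Alice': {'Frontend'}, 'Bob': {'Frontend'}, 'Charlie': {'Backend', 'DevOps'}}
print(hall_violation(workers, narrow), has_saturating_matching(workers, jobs, narrow))
# {'Alice', 'Bob'} False (the two names may print in either order)
```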

König's Theorem

Classical Theorems from Max-Flow Min-Cut
Theorem	Statement	Connection to Flow
Hall's Marriage	X-saturating matching ↔ Hall's condition	Min cut = |X| ↔ Hall's condition
König's	Max matching = min vertex cover (bipartite)	Direct duality
Menger's	Max disjoint paths = min vertex cut	Unit capacity flow
Dilworth's	Min chain cover = max antichain	DAG path cover reduction

Edge-Disjoint and Vertex-Disjoint Paths

Menger's Theorem connects path connectivity to cut size, and max flow provides an algorithmic proof.

Edge-Disjoint Paths:

The maximum number of edge-disjoint paths from s to t equals the minimum number of edges whose removal disconnects s from t.

Proof via Max Flow:

Give every edge capacity 1
Max flow value = max number of edge-disjoint paths (each path uses 1 unit, paths don't share edges)
Min cut capacity = min number of edges to remove
Max-flow min-cut theorem gives equality

Vertex-Disjoint Paths:

The maximum number of vertex-disjoint paths (sharing no vertices except s and t) equals the minimum number of vertices whose removal disconnects s from t.

Proof via Vertex Splitting:

Split each vertex v (except s, t) into vᵢₙ and vₒᵤₜ joined by an edge vᵢₙ → vₒᵤₜ of capacity 1; edges entering v go to vᵢₙ and edges leaving v depart from vₒᵤₜ
This enforces that each vertex is used by at most one path
Max flow = max vertex-disjoint paths
Min cut (now potentially cutting internal edges) = min vertex removal
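
Both constructions fit in a few lines for directed graphs. The sketch below counts edge-disjoint paths with unit-capacity augmenting paths (plain DFS-based Ford-Fulkerson) and reuses that routine on the split graph for vertex-disjoint paths. Function names and the example graph are assumptions made for this illustration.

```python
from collections import defaultdict


def edge_disjoint_paths(edges, s, t):
    """Count edge-disjoint s-t paths in a directed graph via unit-capacity max flow."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, v in edges:
        residual[u][v] += 1      # every edge has capacity 1
        residual[v][u] += 0      # reverse edge for undoing flow

    def augment(u, seen):
        if u == t:
            return True
        seen.add(u)
        for v in list(residual[u]):
            if residual[u][v] > 0 and v not in seen and augment(v, seen):
                residual[u][v] -= 1
                residual[v][u] += 1
                return True
        return False

    count = 0
    while augment(s, set()):     # each successful augmentation adds one path
        count += 1
    return count


def vertex_disjoint_paths(edges, s, t):
    """Count internally vertex-disjoint s-t paths by splitting every other vertex."""
    def out(v):
        return v if v in (s, t) else (v, 'out')

    def inn(v):
        return v if v in (s, t) else (v, 'in')

    split = [(out(u), inn(v)) for u, v in edges]
    internal = {v for edge in edges for v in edge} - {s, t}
    split += [((v, 'in'), (v, 'out')) for v in internal]   # capacity-1 internal edge
    return edge_disjoint_paths(split, s, t)


edges = [('s', 'a'), ('s', 'b'), ('a', 'c'), ('b', 'c'),
         ('c', 'd'), ('c', 'e'), ('d', 't'), ('e', 't')]
print(edge_disjoint_paths(edges, 's', 't'))    # 2
print(vertex_disjoint_paths(edges, 's', 't'))  # 1
```

Every path in this graph passes through vertex c, so two paths can avoid sharing an edge but not a vertex: removing c alone disconnects s from t, while disconnecting by edges takes two removals.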

Applications of Disjoint Paths:

Network Reliability:

Edge-disjoint paths = redundant connections (if one cable fails, alternatives exist)
Maximum disjoint paths = fault tolerance level

Routing:

Finding independent routes for load balancing
Ensuring no single link failure breaks connectivity

Security:

Minimizing single points of failure
Identifying critical edges whose failure partitions the network

The Unifying Theme

Summary and Looking Forward

We've explored the rich tapestry of problems solvable via network flow. Let's consolidate the key insights:

Key Takeaways

• Bipartite matching reduces to max flow with unit capacities. Flow value = matching size.
• Assignment problems with preferences use minimum-cost maximum-flow for optimal weighted matching.
• Resource allocation models supply chains, distribution networks, and logistics as flow problems.
• Scheduling with capacity constraints (crews, rooms, time slots) often has natural flow formulations.
• Project selection with dependencies uses min-cut to maximize profit while respecting constraints.
• Image segmentation uses graph cuts to partition pixels into foreground/background optimally.
• Classical theorems (Hall, König, Menger, Dilworth) are consequences of max-flow min-cut.
• Disjoint paths and connectivity problems reduce to flow on unit-capacity networks.

The Reduction Mindset:

The most valuable skill from this module isn't memorizing these reductions—it's developing the reduction mindset. When you encounter an optimization problem, ask:

What's being maximized/minimized?
Are there capacity-like constraints?
Is there a natural "source" and "sink"?
Can I model it as: maximize flow from A to B subject to constraints?
Can I model it as: minimize cost of separating A from B?

If the answers align with flow structure, you've found a polynomial-time solution. This pattern recognition is what makes network flow a powerful tool in your algorithmic toolkit.

Module Complete

What's Next in Your Journey:

Network flow opens doors to advanced algorithmic techniques:

Linear programming generalizes flow to arbitrary linear constraints
Approximation algorithms use flow as a subroutine for NP-hard problems
Parameterized complexity analyzes flow on special graph classes
Randomized algorithms for faster flow computation

The concepts you've learned here are foundational—they'll appear again and again as you advance in algorithmic problem-solving.
