Here's a profound insight that will transform how you think about graph algorithms: nearly every graph algorithm is a traversal with additional bookkeeping.
Shortest paths? Traversal with distance tracking. Cycle detection? Traversal with state tracking. Topological sort? Traversal with finish ordering. Connected components? Multiple traversals with labeling. The pattern repeats endlessly.
This isn't a coincidence; it's a consequence of what graphs are. To answer any question about a graph's structure, you must examine that structure, which means traversing it. Different algorithms simply track different information during traversal and interpret it differently afterward.
Understanding this relationship is immensely practical: once you master traversal, learning new graph algorithms becomes a matter of understanding what additional information to track and how to use it. The core traversal logic remains constant.
This page reveals how traversal underlies all major graph algorithms. You'll see the pattern of 'traversal plus augmentation' across shortest paths, cycle detection, topological sorting, and more. By the end, you'll understand why mastering traversal means mastering the foundation of graph algorithms.
Every graph algorithm follows the same meta-pattern: traverse the graph, track algorithm-specific information along the way, and extract the answer from what was tracked.
The traversal provides the scaffolding—a systematic way to visit vertices and edges. The tracking provides the computation—the algorithm-specific work. The extraction provides the result—the answer to whatever question we're asking.
The Generic Template:
def generic_graph_algorithm(graph, start):
    # Standard traversal infrastructure
    visited = set()
    frontier = initialize_frontier(start)

    # Algorithm-specific data structures
    additional_data = initialize_data()  # Varies by algorithm

    while frontier:
        current = select_from_frontier(frontier)  # Selection strategy varies
        if current in visited:
            continue
        visited.add(current)

        # Algorithm-specific processing
        process(current, additional_data)  # Varies by algorithm

        for neighbor in graph.neighbors(current):
            if neighbor not in visited:
                # Algorithm-specific edge handling
                handle_edge(current, neighbor, additional_data)  # Varies
                add_to_frontier(frontier, neighbor)  # Addition strategy varies

    # Extract result from additional_data
    return extract_result(additional_data)  # Varies by algorithm
Different algorithms substitute different implementations for the varying parts, but the traversal backbone remains constant.
| Algorithm | Additional Data | Processing Step | Extraction |
|---|---|---|---|
| BFS Shortest Path | Distance array | Record distance for current | Read distances from array |
| DFS Timestamps | Discovery/finish times, clock | Record discovery, increment clock, later record finish | Use times for various analyses |
| Cycle Detection | Recursion stack / colors | Check for back edges to gray vertices | Return true if back edge found |
| Topological Sort | Finish order list | Append to list when finished | Reverse the finish order list |
| Connected Components | Component ID map | Assign current component ID | Return component ID mapping |
| Bipartite Check | Color/parity array | Assign color, check neighbor colors | Return false if conflict, true otherwise |
When encountering a new graph algorithm, first identify: What traversal strategy does it use (BFS/DFS)? What data does it track? How does it use that data? This decomposition makes complex algorithms comprehensible.
The simplest shortest path algorithm—BFS for unweighted graphs—is just traversal with a distance array. This connection reveals why BFS finds shortest paths and foreshadows more sophisticated algorithms.
Why BFS Finds Shortest Paths:
BFS explores vertices in layers—first all vertices at distance 1, then all at distance 2, and so on. This layer-by-layer expansion means the first time we reach a vertex is via a shortest path.
Key Insight: In BFS, vertices are added to the frontier in order of discovery. Since the frontier is a queue (FIFO), vertices are processed in the same order. When we reach vertex v for the first time, we've arrived via a shortest path: every vertex at a smaller distance was processed first, so if a shorter route to v existed, it would already have discovered v.
The Augmentation for Shortest Paths:
from collections import deque

def bfs_shortest_paths(graph, start):
    # Standard traversal infrastructure
    visited = set()
    frontier = deque([start])

    # Shortest path augmentation: distance tracking
    distance = {start: 0}   # Maps vertex -> shortest distance from start
    parent = {start: None}  # Maps vertex -> predecessor (for path reconstruction)

    while frontier:
        current = frontier.popleft()  # BFS: FIFO order is crucial
        if current in visited:
            continue
        visited.add(current)

        for neighbor in graph.neighbors(current):
            if neighbor not in visited and neighbor not in distance:
                # First time reaching neighbor: this is the shortest path
                distance[neighbor] = distance[current] + 1
                parent[neighbor] = current
                frontier.append(neighbor)

    return distance, parent
The traversal structure is identical to basic BFS. The augmentation is:
- distance: track shortest path lengths
- parent: track how we reached each vertex (for path reconstruction)

This is exactly the pattern: traversal + tracking = algorithm.
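The parent map exists precisely so paths can be rebuilt afterward. Here's a minimal helper (a sketch of my own, assuming the parent dict returned above) that walks predecessor pointers back from a target:

def reconstruct_path(parent, target):
    # Walk parent pointers from target back to the start (whose parent is None),
    # then reverse. Returns [] if target was never reached.
    if target not in parent:
        return []
    path = []
    vertex = target
    while vertex is not None:
        path.append(vertex)
        vertex = parent[vertex]
    return path[::-1]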
Dijkstra's algorithm for weighted graphs follows the same pattern, but with a priority queue instead of a regular queue. The 'distance tracking' becomes 'tentative distance tracking with relaxation.' Same pattern, more sophisticated execution. We'll study this in detail in later modules.
Cycle detection exemplifies traversal augmentation beautifully. By tracking vertex states during DFS, we can detect cycles in both directed and undirected graphs.
The Three-Color Approach:
Recall our three-state model from Page 1: unvisited, in progress, and finished. For cycle detection, we track these states explicitly as colors: WHITE (not yet visited), GRAY (currently being explored, i.e., on the active DFS path), and BLACK (fully explored, all descendants finished).
Cycle Detection Insight (Directed Graphs):
A directed graph has a cycle if and only if DFS discovers a back edge—an edge from a gray vertex to another gray vertex. Gray vertices are on the current DFS path; an edge back to one creates a cycle.
def has_cycle_directed(graph, all_vertices):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in all_vertices}

    def dfs(vertex):
        color[vertex] = GRAY  # We're now exploring this vertex
        for neighbor in graph.get(vertex, []):
            if color[neighbor] == GRAY:
                # Back edge to an ancestor! Cycle detected.
                return True
            if color[neighbor] == WHITE:
                if dfs(neighbor):
                    return True
        color[vertex] = BLACK  # Finished exploring
        return False

    for vertex in all_vertices:
        if color[vertex] == WHITE:
            if dfs(vertex):
                return True
    return False
Why This Works:
Gray vertices form the current path from the DFS root. If we're at vertex u (gray) and find an edge to vertex v that's also gray, v is an ancestor of u on the current path. This means:
path from root → ... → v → ... → u → v
The suffix v → ... → u → v is a cycle.
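As a quick sanity check, here's the detector on two tiny hand-built adjacency dicts (illustrative inputs, not from the text above):

# 'A' -> 'B' -> 'C' -> 'A' closes a loop; the second graph is a DAG.
cyclic = {'A': ['B'], 'B': ['C'], 'C': ['A']}
acyclic = {'A': ['B', 'C'], 'B': ['C'], 'C': []}

print(has_cycle_directed(cyclic, ['A', 'B', 'C']))   # True
print(has_cycle_directed(acyclic, ['A', 'B', 'C']))  # False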
Undirected Graphs:
For undirected graphs, cycle detection is simpler but requires one adjustment: we must not count the edge to our immediate parent as a back edge (since undirected edges go both ways).
def has_cycle_undirected(graph, all_vertices):
    visited = set()

    def dfs(vertex, parent):
        visited.add(vertex)
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited:
                if dfs(neighbor, vertex):
                    return True
            elif neighbor != parent:
                # Edge to visited non-parent: cycle!
                return True
        return False

    for vertex in all_vertices:
        if vertex not in visited:
            if dfs(vertex, None):
                return True
    return False
In undirected graphs, every edge appears twice (once from each endpoint). When we traverse from u to v, the edge (v, u) will be seen when processing v. Without the parent check, we'd falsely detect every edge as a cycle. Always track and exclude the immediate parent.
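To see the parent check earn its keep, compare a triangle against a simple path (example graphs of my own construction). Both store every edge in both directions, but only the triangle has a cycle:

triangle = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}
path = {'A': ['B'], 'B': ['A', 'C'], 'C': ['B']}

print(has_cycle_undirected(triangle, ['A', 'B', 'C']))  # True
print(has_cycle_undirected(path, ['A', 'B', 'C']))      # False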
Topological sorting—ordering vertices so all directed edges point forward—emerges naturally from DFS's finish ordering. This is one of the most elegant examples of traversal augmentation.
The Problem:
Given a directed acyclic graph (DAG), produce an ordering of vertices such that for every edge (u, v), u appears before v in the ordering. Such an ordering is called a topological order or topological sort.
Applications: scheduling courses around prerequisites, ordering build steps so dependencies compile first, resolving package installation order, and evaluating spreadsheet formulas that reference other cells.
The DFS Solution:
DFS naturally produces a topological order through its finish times. When we finish exploring a vertex (mark it black), all its descendants have already been finished. So vertices finished later have no path to vertices finished earlier.
def topological_sort(graph, all_vertices):
    visited = set()
    finish_order = []  # Will hold vertices in reverse topological order

    def dfs(vertex):
        visited.add(vertex)
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited:
                dfs(neighbor)
        # Finished exploring vertex and all descendants
        finish_order.append(vertex)

    for vertex in all_vertices:
        if vertex not in visited:
            dfs(vertex)

    # Reverse to get topological order
    return finish_order[::-1]
Why This Works:
Consider any edge (u, v). We must show u appears before v in the result.
When we explore u, one of two things happens to v:

- v is unvisited (white): DFS visits v as a descendant of u, so v finishes before u does.
- v is already finished (black): v's finish time precedes u's by definition. (v cannot be gray, since a gray v would mean a back edge from u to v, i.e., a cycle, contradicting that the graph is a DAG.)

In both cases, v finishes before u. Since we reverse the finish order, u appears before v. ✓
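A small worked example (a made-up prerequisite graph, with edges pointing from prerequisite to dependent):

prereqs = {
    'math': ['physics'],
    'physics': ['engineering'],
    'programming': ['engineering'],
    'engineering': [],
}
order = topological_sort(prereqs, ['math', 'physics', 'programming', 'engineering'])
print(order)  # ['programming', 'math', 'physics', 'engineering'] -- one valid order

Every edge points forward in the result. Note that a DAG can have many valid topological orders; which one you get depends on iteration order.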
An Iterative Alternative (Kahn's Algorithm):
from collections import deque

def topological_sort_iterative(graph, all_vertices):
    # Compute in-degrees
    in_degree = {v: 0 for v in all_vertices}
    for v in all_vertices:
        for neighbor in graph.get(v, []):
            in_degree[neighbor] += 1

    # Start with vertices having no incoming edges
    queue = deque([v for v in all_vertices if in_degree[v] == 0])
    result = []

    while queue:
        current = queue.popleft()
        result.append(current)
        for neighbor in graph.get(current, []):
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)

    if len(result) != len(all_vertices):
        raise ValueError("Graph has a cycle; topological sort impossible")
    return result
This BFS-based approach (Kahn's algorithm) is also traversal with augmentation—we track in-degrees and process vertices when their in-degree reaches zero.
Both DFS (finish-order based) and BFS (in-degree based) approaches to topological sort are traversal with augmentation. The DFS approach tracks finish order; the BFS approach tracks in-degrees. Both produce valid topological orderings.
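A bonus of the in-degree approach is that cycle detection comes for free: on a cyclic input the queue drains before every vertex is output, triggering the ValueError in the listing above. The inputs below are illustrative:

dag = {'a': ['b'], 'b': ['c'], 'c': []}
print(topological_sort_iterative(dag, ['a', 'b', 'c']))  # ['a', 'b', 'c']

cyclic = {'a': ['b'], 'b': ['a']}
try:
    topological_sort_iterative(cyclic, ['a', 'b'])
except ValueError as err:
    print(err)  # Graph has a cycle; topological sort impossible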
We saw connected component detection on the previous page, but let's examine it through the lens of the augmentation pattern.
The Augmentation:
Connected component detection uses:

- component_id: a map labeling each vertex with the ID of its component
- num_components: a counter, incremented each time a traversal starts from an unvisited vertex
def find_connected_components(graph, all_vertices):
    visited = set()

    # Augmentation: component tracking
    component_id = {}
    num_components = 0

    def traverse(start, component):
        frontier = [start]
        while frontier:
            current = frontier.pop()
            if current in visited:
                continue
            visited.add(current)
            # Augmentation: label with component ID
            component_id[current] = component
            for neighbor in graph.get(current, []):
                if neighbor not in visited:
                    frontier.append(neighbor)

    for vertex in all_vertices:
        if vertex not in visited:
            # Augmentation: new component discovered
            num_components += 1
            traverse(vertex, num_components)

    return num_components, component_id
Pattern Analysis:
| Element | Standard Traversal | Component-Finding Augmentation |
|---|---|---|
| Starting | From a given start | From each unvisited vertex |
| Processing | Visit and mark | Visit, mark, AND label |
| Tracking | Just visited set | Visited set + component IDs |
| Result | Set of visited | Component count + labeling |
The traversal logic is unchanged; we've added labeling on top.
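A quick run on a two-component graph (illustrative input) shows both returned values:

# Two components: {1, 2, 3} and {4, 5}.
g = {1: [2], 2: [1, 3], 3: [2], 4: [5], 5: [4]}
count, labels = find_connected_components(g, [1, 2, 3, 4, 5])
print(count)   # 2
print(labels)  # {1: 1, 2: 1, 3: 1, 4: 2, 5: 2}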
Union-Find Alternative:
For some applications (especially in Kruskal's MST algorithm), Union-Find provides more efficient component tracking:
class UnionFind:
    def __init__(self, vertices):
        self.parent = {v: v for v in vertices}
        self.rank = {v: 0 for v in vertices}

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]

    def union(self, x, y):
        px, py = self.find(x), self.find(y)
        if px == py:
            return False  # Already same component
        if self.rank[px] < self.rank[py]:
            px, py = py, px
        self.parent[py] = px
        if self.rank[px] == self.rank[py]:
            self.rank[px] += 1
        return True

def find_components_union_find(vertices, edges):
    uf = UnionFind(vertices)
    for u, v in edges:
        uf.union(u, v)
    # Count distinct roots
    roots = set(uf.find(v) for v in vertices)
    return len(roots)
Union-Find processes edges rather than performing traversal, but conceptually it's answering the same reachability question.
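On the same five-vertex graph as above, listing each undirected edge once (illustrative usage):

vertices = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (4, 5)]
print(find_components_union_find(vertices, edges))  # 2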
Determining if a graph is bipartite—whether its vertices can be split into two groups with edges only between groups—is another traversal augmentation. We track a two-coloring and check for conflicts.
The Bipartiteness Property:
A graph is bipartite if and only if its vertices can be partitioned into two disjoint sets such that every edge connects a vertex in one set to a vertex in the other.
Equivalently: A graph is bipartite if and only if it contains no odd-length cycles.
The Two-Coloring Approach:
We attempt to assign each vertex a color (0 or 1) such that adjacent vertices have different colors. If we succeed, the graph is bipartite; if we find a conflict (adjacent vertices with the same color), it's not.
from collections import deque

def is_bipartite(graph, all_vertices):
    color = {}  # Maps vertex -> color (0 or 1)

    for start in all_vertices:
        if start in color:
            continue

        # BFS with coloring augmentation
        queue = deque([start])
        color[start] = 0

        while queue:
            current = queue.popleft()
            current_color = color[current]
            neighbor_color = 1 - current_color  # Opposite color

            for neighbor in graph.get(current, []):
                if neighbor not in color:
                    color[neighbor] = neighbor_color
                    queue.append(neighbor)
                elif color[neighbor] != neighbor_color:
                    # Conflict! Adjacent vertices have same color
                    return False, {}

    return True, color
Understanding the Algorithm:

- The outer loop starts a fresh BFS from every vertex that hasn't been colored yet, so disconnected components are each checked independently.
- Each start vertex receives color 0; every newly discovered neighbor receives the opposite color of the vertex that discovered it.
- An already-colored neighbor must have the opposite color; if it matches instead, two adjacent vertices share a color and the graph cannot be bipartite.
Why This Works:
If the graph is bipartite, a valid two-coloring exists, and BFS will find one (assigning colors by distance parity: even distance = color 0, odd distance = color 1).
If the graph is not bipartite, there's an odd cycle. BFS will eventually reach both ends of some edge with the same color (because oddness causes parity mismatch), detecting the conflict.
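When the check succeeds, the returned color map directly yields the two sides of the partition. A short follow-up sketch (example graph invented for illustration):

ok, color = is_bipartite({'u': ['v'], 'v': ['u', 'w'], 'w': ['v']}, ['u', 'v', 'w'])
if ok:
    side0 = [v for v, c in color.items() if c == 0]
    side1 = [v for v, c in color.items() if c == 1]
    print(side0, side1)  # ['u', 'w'] ['v']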
Bipartite graphs arise in many contexts: matching students to projects, scheduling (tasks vs. time slots), and modeling relationships that cross between two types. Bipartite checking often precedes bipartite matching algorithms like Hopcroft-Karp.
The traversal-plus-augmentation pattern extends to even the most sophisticated graph algorithms. Let's see how this plays out in advanced scenarios.
| Algorithm | Base Traversal | Key Augmentation | Purpose |
|---|---|---|---|
| Dijkstra's Algorithm | Modified BFS (priority queue) | Tentative distances + relaxation | Shortest paths in weighted graphs |
| Bellman-Ford | Repeated edge relaxation | Distance array + predecessor tracking | Shortest paths with negative edges |
| Prim's MST | Modified BFS (priority queue) | Edge weights to MST frontier | Minimum spanning tree |
| Kosaraju's SCC | Two DFS passes | Finish order + transpose graph | Strongly connected components |
| Tarjan's SCC | Single DFS | Discovery times + low-link values | Strongly connected components |
| Articulation Points | DFS | Discovery times + low-link values | Cut vertices in graphs |
| A* Search | Priority queue traversal | Heuristic estimates + costs | Shortest paths with heuristics |
Case Study: Dijkstra's Algorithm
Dijkstra's algorithm for weighted shortest paths is BFS with a priority queue and distance relaxation:
import heapq

def dijkstra(graph, start):
    # graph: {vertex: [(neighbor, weight), ...]}
    distance = {start: 0}   # Shortest known distances
    parent = {start: None}  # For path reconstruction
    visited = set()

    # Priority queue: (distance, vertex)
    pq = [(0, start)]

    while pq:
        dist, current = heapq.heappop(pq)
        if current in visited:
            continue
        visited.add(current)

        for neighbor, weight in graph.get(current, []):
            if neighbor not in visited:
                new_dist = dist + weight
                if neighbor not in distance or new_dist < distance[neighbor]:
                    distance[neighbor] = new_dist
                    parent[neighbor] = current
                    heapq.heappush(pq, (new_dist, neighbor))

    return distance, parent
Compare to BFS:

The structure mirrors bfs_shortest_paths almost line for line. The differences are:

- A priority queue (min-heap keyed on tentative distance) replaces the FIFO queue, so the closest unvisited vertex is always processed next.
- Edge handling performs relaxation: a neighbor's tentative distance is overwritten whenever a shorter path through current is found, so a vertex can be pushed multiple times (stale entries are skipped by the visited check).
The traversal structure is preserved; the augmentation handles weighted edges.
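A tiny weighted graph (weights invented for illustration) shows relaxation at work: the direct edge A -> C costs 5, but the route through B costs only 3, and the distance table reflects the improvement:

g = {
    'A': [('B', 1), ('C', 5)],
    'B': [('C', 2)],
    'C': [],
}
distance, parent = dijkstra(g, 'A')
print(distance)  # {'A': 0, 'B': 1, 'C': 3}
print(parent)    # {'A': None, 'B': 'A', 'C': 'B'}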
Case Study: Strongly Connected Components (Kosaraju's)
Kosaraju's algorithm uses two DFS passes with clever augmentation:

1. Run DFS on the original graph, recording vertices in order of finish time.
2. Transpose the graph: reverse the direction of every edge.
3. Run DFS on the transpose, starting from unvisited vertices in decreasing order of finish time from step 1.

Each DFS tree in the second pass is a strongly connected component.
Both DFS passes are standard traversals; the innovation is in how they're combined and what information flows between them.
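For concreteness, here is a compact sketch of those two passes (my own condensed implementation following the steps above; the recursive DFS will hit Python's recursion limit on very deep graphs):

def kosaraju_scc(graph, all_vertices):
    # Pass 1: DFS on the original graph, recording finish order.
    visited, finish_order = set(), []

    def dfs1(v):
        visited.add(v)
        for n in graph.get(v, []):
            if n not in visited:
                dfs1(n)
        finish_order.append(v)

    for v in all_vertices:
        if v not in visited:
            dfs1(v)

    # Transpose the graph: reverse every edge.
    transpose = {v: [] for v in all_vertices}
    for v in all_vertices:
        for n in graph.get(v, []):
            transpose[n].append(v)

    # Pass 2: DFS on the transpose, seeding in decreasing finish order.
    visited.clear()
    components = []

    def dfs2(v, comp):
        visited.add(v)
        comp.append(v)
        for n in transpose[v]:
            if n not in visited:
                dfs2(n, comp)

    for v in reversed(finish_order):
        if v not in visited:
            comp = []
            dfs2(v, comp)
            components.append(comp)  # Each tree is one SCC
    return components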
When studying any new graph algorithm, first identify the traversal structure, then identify the augmentation. This decomposition makes even complex algorithms learnable by connecting them to foundations you already understand.
We've seen that graph traversal isn't just a graph algorithm—it's the graph algorithm upon which all others are built. Let's consolidate these insights.
The Meta-Principle:
Traversal is the means by which we interrogate graphs. Every question we might ask about a graph's structure requires examining that structure, which means traversing it. Different questions require different information to be tracked during traversal, but the fundamental process of systematically visiting vertices and following edges remains constant.
This is why mastering traversal is so valuable: once you deeply understand BFS and DFS, every other graph algorithm becomes a variation on a theme you already know.
What's Next:
We've established that there are two primary traversal strategies: DFS and BFS. The next page provides a detailed comparison of these two approaches: when to choose each, their characteristic properties, and how to think about the trade-offs.
You now understand how traversal serves as the foundation for all graph algorithms. This insight transforms learning graph algorithms from memorizing procedures to understanding variations on a core theme. Next, we'll compare DFS and BFS in depth.