Loading content...
Consider a build system compiling your code. Module A imports from B, B imports from C. Each import is directional—A depends on B, not vice versa. Now imagine C also imports from A. This creates a directed cycle: A → B → C → A. No module can be compiled first because each waits for another in an endless loop.
This scenario illustrates why cycle detection in directed graphs is fundamentally different from undirected graphs. In undirected graphs, if you can get from A to B, you can always get from B to A. In directed graphs, reachability is asymmetric—paths have direction, and this changes everything about cycle detection.
By the end of this page, you will master the three-color DFS algorithm for directed cycle detection, understand why the undirected approach fails for directed graphs, and be able to implement production-quality cycle detection with cycle reconstruction capabilities.
Let's first understand why we can't simply use the parent-tracking approach from undirected graphs.
The Core Problem:
In undirected graphs, when we visit a vertex that's already visited AND it's not our parent, we've found an alternative path back—a cycle. But in directed graphs, finding a visited vertex doesn't necessarily mean we've found a cycle.
Consider this directed graph:
0 → 1 → 3
↓ ↑
2 ──────┘
Vertex 3 is reachable from 0 via two paths: 0→1→3 and 0→2→3. When DFS explores 0→1→3 first, vertex 3 gets marked visited. Then when exploring 0→2, we find edge 2→3 leading to already-visited 3.
Is this a cycle? No! There's no way to get back from 3 to any vertex in our current path. We've just found a cross edge—an edge to a vertex that was completed in a different branch of the DFS tree.
In directed graphs, we must distinguish between: (1) edges to vertices currently in our recursion stack (BACK EDGES → cycles), and (2) edges to vertices already completely explored in a different path (CROSS/FORWARD EDGES → no cycle). A simple visited/unvisited distinction cannot make this differentiation.
Edge Classification in Directed Graphs:
During DFS of a directed graph, edges are classified into four types:
Only back edges create cycles. Our algorithm must specifically detect back edges, which means tracking not just "visited" but "currently in the active DFS path."
| Aspect | Undirected Graph | Directed Graph |
|---|---|---|
| What indicates cycle? | Any edge to visited non-parent | Only edge to ancestor in current path |
| Edge types | Tree edges, Back edges only | Tree, Back, Forward, Cross edges |
| Sufficient tracking | visited[] array | Need recursion stack state |
| Parent tracking | Required to avoid false positives | Not needed (edges are one-way) |
| Cross edges | Don't exist (would be back edges) | Exist and must not trigger false positive |
The standard algorithm for detecting cycles in directed graphs uses a three-color marking scheme (or equivalently, tracking an explicit recursion stack). Each vertex can be in one of three states:
WHITE (Unvisited): The vertex has not yet been discovered.
GRAY (In Progress): The vertex is currently being explored—it's in our active DFS path. We've started exploring it but haven't finished (still have unvisited neighbors).
BLACK (Completed): The vertex and ALL its descendants have been fully explored. We've finished with it completely.
The Key Insight:
A cycle exists if and only if we find an edge from a GRAY vertex to another GRAY vertex. This is a back edge—an edge from a vertex in the current path back to an ancestor in that same path.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153
from collections import defaultdictfrom enum import Enumfrom typing import Optional class Color(Enum): """ Three-color states for directed graph DFS. """ WHITE = 0 # Unvisited GRAY = 1 # Currently in DFS path (in progress) BLACK = 2 # Completely processed class DirectedGraph: """ Directed graph with cycle detection using three-color DFS. """ def __init__(self, vertices: int): self.V = vertices self.adj: dict[int, list[int]] = defaultdict(list) def add_edge(self, u: int, v: int) -> None: """Add directed edge from u to v.""" self.adj[u].append(v) def has_cycle(self) -> bool: """ Detect if the directed graph contains a cycle. Uses three-color DFS: - WHITE: unvisited - GRAY: currently in DFS stack (being processed) - BLACK: completely finished A cycle exists iff we find a GRAY → GRAY edge (back edge). Time Complexity: O(V + E) Space Complexity: O(V) """ color = [Color.WHITE] * self.V for vertex in range(self.V): if color[vertex] == Color.WHITE: if self._dfs_has_cycle(vertex, color): return True return False def _dfs_has_cycle(self, vertex: int, color: list[Color]) -> bool: """ DFS helper for cycle detection. Returns True if cycle is found starting from vertex. """ # Mark as gray (in progress / in current DFS path) color[vertex] = Color.GRAY for neighbor in self.adj[vertex]: if color[neighbor] == Color.GRAY: # Found edge to vertex in current path - CYCLE! return True elif color[neighbor] == Color.WHITE: # Explore unvisited neighbor if self._dfs_has_cycle(neighbor, color): return True # If BLACK, neighbor was fully processed in different path # This is a cross edge or forward edge - NOT a cycle # Mark as black (completely processed) color[vertex] = Color.BLACK return False def find_cycle(self) -> Optional[list[int]]: """ Find and return a cycle if one exists. Returns: List of vertices forming the cycle, or None if no cycle. """ color = [Color.WHITE] * self.V parent = [-1] * self.V for vertex in range(self.V): if color[vertex] == Color.WHITE: cycle = self._dfs_find_cycle(vertex, color, parent) if cycle: return cycle return None def _dfs_find_cycle( self, vertex: int, color: list[Color], parent: list[int] ) -> Optional[list[int]]: """ DFS helper that returns the cycle path when found. """ color[vertex] = Color.GRAY for neighbor in self.adj[vertex]: if color[neighbor] == Color.GRAY: # Cycle found! Reconstruct it by following parent pointers cycle = [neighbor, vertex] current = vertex while parent[current] != -1 and parent[current] != neighbor: current = parent[current] cycle.append(current) cycle.append(neighbor) # Complete the cycle return cycle[::-1] elif color[neighbor] == Color.WHITE: parent[neighbor] = vertex result = self._dfs_find_cycle(neighbor, color, parent) if result: return result color[vertex] = Color.BLACK return None # Example usage demonstrating directed cycle detectionif __name__ == "__main__": # Graph with cycle: 0 → 1 → 2 → 0 g1 = DirectedGraph(4) g1.add_edge(0, 1) g1.add_edge(1, 2) g1.add_edge(2, 0) # Creates cycle g1.add_edge(2, 3) print(f"Graph 1 has cycle: {g1.has_cycle()}") # True print(f"Cycle found: {g1.find_cycle()}") # [0, 1, 2, 0] # DAG (no cycle) g2 = DirectedGraph(4) g2.add_edge(0, 1) g2.add_edge(0, 2) g2.add_edge(1, 3) g2.add_edge(2, 3) print(f"Graph 2 has cycle: {g2.has_cycle()}") # False # Graph with cross edge but no cycle # 0 → 1 → 3 # ↓ ↑ # 2 ──────┘ g3 = DirectedGraph(4) g3.add_edge(0, 1) g3.add_edge(0, 2) g3.add_edge(1, 3) g3.add_edge(2, 3) # Cross edge, NOT a cycle print(f"Graph 3 has cycle: {g3.has_cycle()}") # FalseInstead of three colors, you can use a separate inStack[] boolean array alongside visited[]. The vertex is 'gray' when visited[v] && inStack[v], and 'black' when visited[v] && !inStack[v]. Both approaches are equivalent. The three-color version is more conceptually elegant; the in-stack version may be clearer for some implementations.
Let's trace through the three-color algorithm on two graphs: one with a cycle, one without.
Example 1: Graph WITH Cycle
Edges: 0→1, 1→2, 2→3, 3→1 (cycle: 1→2→3→1)
Starting DFS from vertex 0:
| Step | Action | Vertex 0 | Vertex 1 | Vertex 2 | Vertex 3 | Result |
|---|---|---|---|---|---|---|
| 1 | Start DFS(0) | GRAY | WHITE | WHITE | WHITE | Continue |
| 2 | DFS(1) from 0 | GRAY | GRAY | WHITE | WHITE | Continue |
| 3 | DFS(2) from 1 | GRAY | GRAY | GRAY | WHITE | Continue |
| 4 | DFS(3) from 2 | GRAY | GRAY | GRAY | GRAY | Continue |
| 5 | 3's neighbor is 1 | GRAY | GRAY | GRAY | GRAY | 1 is GRAY → CYCLE! |
At step 5, vertex 3 examines its neighbor (vertex 1). Vertex 1 is GRAY, meaning it's currently in our DFS path: 0→1→2→3. Finding an edge from 3 back to 1 creates the cycle 1→2→3→1.
Example 2: Graph WITHOUT Cycle (but with cross edge)
Edges: 0→1, 0→2, 1→3, 2→3
This graph has two paths to 3: 0→1→3 and 0→2→3.
| Step | Action | Vertex 0 | Vertex 1 | Vertex 2 | Vertex 3 | Result |
|---|---|---|---|---|---|---|
| 1 | Start DFS(0) | GRAY | WHITE | WHITE | WHITE | Continue |
| 2 | DFS(1) from 0 | GRAY | GRAY | WHITE | WHITE | Continue |
| 3 | DFS(3) from 1 | GRAY | GRAY | WHITE | GRAY | Continue |
| 4 | 3 done, backtrack | GRAY | GRAY | WHITE | BLACK | Continue |
| 5 | 1 done, backtrack | GRAY | BLACK | WHITE | BLACK | Continue |
| 6 | DFS(2) from 0 | GRAY | BLACK | GRAY | BLACK | Continue |
| 7 | 2's neighbor is 3 | GRAY | BLACK | GRAY | BLACK | 3 is BLACK → No cycle |
| 8 | 2 done, 0 done | BLACK | BLACK | BLACK | BLACK | Complete |
At step 7, vertex 2 examines neighbor 3. Vertex 3 is BLACK (fully processed), not GRAY. This is a cross edge—it goes to a vertex that was completed in a different branch of exploration. No cycle exists.
Every vertex follows the same color progression: WHITE → GRAY → BLACK. A vertex becomes GRAY when we start exploring it, and BLACK when we finish exploring all its descendants. A back edge (cycle) requires going from current GRAY vertex to another GRAY vertex—one that we started but haven't finished yet.
Some implementations use two separate arrays instead of the three-color scheme:
This is functionally equivalent to three colors but can be more intuitive for some programmers.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143
from collections import defaultdict class DirectedGraphRecStack: """ Directed cycle detection using explicit recursion stack tracking. """ def __init__(self, vertices: int): self.V = vertices self.adj: dict[int, list[int]] = defaultdict(list) def add_edge(self, u: int, v: int) -> None: self.adj[u].append(v) def has_cycle(self) -> bool: """ Detect cycle using visited and in_stack arrays. visited[v] = True means we've seen v at some point in_stack[v] = True means v is in our current DFS path Cycle exists if we find edge to vertex that's in_stack """ visited = [False] * self.V in_stack = [False] * self.V for vertex in range(self.V): if not visited[vertex]: if self._dfs(vertex, visited, in_stack): return True return False def _dfs( self, vertex: int, visited: list[bool], in_stack: list[bool] ) -> bool: """ DFS with explicit stack tracking. """ visited[vertex] = True in_stack[vertex] = True # Add to current path for neighbor in self.adj[vertex]: # If neighbor is in current DFS path → CYCLE if in_stack[neighbor]: return True # If not visited, explore it if not visited[neighbor]: if self._dfs(neighbor, visited, in_stack): return True # If visited but NOT in stack, it was completed # in a different path → no cycle from this edge # Remove from stack (backtracking) in_stack[vertex] = False return False class DirectedGraphIterative: """ Iterative DFS for cycle detection - avoids stack overflow. """ def __init__(self, vertices: int): self.V = vertices self.adj: dict[int, list[int]] = defaultdict(list) def add_edge(self, u: int, v: int) -> None: self.adj[u].append(v) def has_cycle(self) -> bool: """ Iterative DFS cycle detection using explicit stack. Uses state machine: each vertex on stack has an iterator position to track which neighbors are yet to be explored. """ WHITE, GRAY, BLACK = 0, 1, 2 color = [WHITE] * self.V for source in range(self.V): if color[source] != WHITE: continue # Stack contains (vertex, iterator over neighbors) stack = [(source, iter(self.adj[source]))] color[source] = GRAY while stack: vertex, neighbors = stack[-1] try: neighbor = next(neighbors) if color[neighbor] == GRAY: # Back edge - CYCLE! return True elif color[neighbor] == WHITE: # Explore unvisited neighbor color[neighbor] = GRAY stack.append((neighbor, iter(self.adj[neighbor]))) # If BLACK, cross/forward edge - continue except StopIteration: # Done with all neighbors stack.pop() color[vertex] = BLACK return False # Verify both implementations agreedef test_implementations(): """Test that both approaches give same results.""" test_cases = [ # (edges, expected_has_cycle) ([(0, 1), (1, 2), (2, 0)], True), # Triangle cycle ([(0, 1), (0, 2), (1, 3), (2, 3)], False), # DAG with cross edge ([(0, 1), (1, 2), (2, 3), (3, 1)], True), # Cycle: 1→2→3→1 ([(0, 1)], False), # Single edge ([], False), # No edges ] for edges, expected in test_cases: n = max([max(e) for e in edges]) + 1 if edges else 1 g1 = DirectedGraphRecStack(n) g2 = DirectedGraphIterative(n) for u, v in edges: g1.add_edge(u, v) g2.add_edge(u, v) r1, r2 = g1.has_cycle(), g2.has_cycle() assert r1 == r2 == expected, f"Failed: {edges}" print("All tests passed!") test_implementations()There's an elegant alternative for cycle detection: Kahn's algorithm for topological sorting. The key insight is:
A directed graph has a valid topological ordering if and only if it has no cycles (is a DAG).
Kahn's algorithm works by repeatedly removing vertices with no incoming edges. If the graph is acyclic, we'll eventually remove all vertices. If a cycle exists, vertices in the cycle will never have zero incoming edges (each depends on another in the cycle), so we'll get stuck.
Algorithm:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253545556575859606162636465666768697071727374757677787980
from collections import deque, defaultdict def has_cycle_kahn(vertices: int, edges: list[tuple[int, int]]) -> bool: """ Detect cycle using Kahn's algorithm (BFS-based topological sort). If topological sort processes all vertices, no cycle exists. If some vertices remain unprocessed, they're part of a cycle. Time: O(V + E) Space: O(V + E) """ # Build adjacency list and compute in-degrees adj: dict[int, list[int]] = defaultdict(list) in_degree = [0] * vertices for u, v in edges: adj[u].append(v) in_degree[v] += 1 # Initialize queue with all vertices having in-degree 0 queue = deque() for v in range(vertices): if in_degree[v] == 0: queue.append(v) processed_count = 0 while queue: vertex = queue.popleft() processed_count += 1 for neighbor in adj[vertex]: in_degree[neighbor] -= 1 if in_degree[neighbor] == 0: queue.append(neighbor) # If we processed all vertices, no cycle # If some remain, they're in cycles (never reached 0 in-degree) return processed_count < vertices def find_cycle_vertices_kahn( vertices: int, edges: list[tuple[int, int]]) -> list[int]: """ Find all vertices that are part of cycles. Returns list of vertices that couldn't be topologically sorted. """ adj: dict[int, list[int]] = defaultdict(list) in_degree = [0] * vertices for u, v in edges: adj[u].append(v) in_degree[v] += 1 queue = deque([v for v in range(vertices) if in_degree[v] == 0]) processed = set() while queue: vertex = queue.popleft() processed.add(vertex) for neighbor in adj[vertex]: in_degree[neighbor] -= 1 if in_degree[neighbor] == 0: queue.append(neighbor) # Vertices not processed are in cycles return [v for v in range(vertices) if v not in processed] # Examplesprint(has_cycle_kahn(4, [(0, 1), (1, 2), (2, 3), (3, 1)])) # Trueprint(has_cycle_kahn(4, [(0, 1), (0, 2), (1, 3), (2, 3)])) # False print(find_cycle_vertices_kahn(5, [(0, 1), (1, 2), (2, 3), (3, 1), (3, 4)]))# Output: [1, 2, 3] - these vertices are in the cycleKahn's algorithm is ideal when you also need a topological ordering, or when you want to identify ALL vertices involved in cycles (not just detect one cycle). It's BFS-based, so it avoids recursion depth issues. However, it doesn't easily give you the actual cycle path—just which vertices are 'stuck.'
Cycle detection is intimately related to a more general concept: Strongly Connected Components (SCCs).
Definition: A strongly connected component is a maximal set of vertices such that every vertex is reachable from every other vertex in the set.
Key Insight:
A directed graph has a cycle if and only if it has a strongly connected component with more than one vertex.
If vertices A and B are in the same SCC, there's a path from A to B and from B to A. These two paths together form at least one cycle.
Implications:
Algorithms like Tarjan's or Kosaraju's find all SCCs in O(V + E) time, effectively solving cycle detection as a byproduct.
| Approach | Time | Detects Cycles | Finds All SCCs | Returns Cycle Path |
|---|---|---|---|---|
| Three-Color DFS | O(V+E) | Yes | No | Yes (with modification) |
| Kahn's Algorithm | O(V+E) | Yes | No | No (but finds cycle vertices) |
| Tarjan's SCC | O(V+E) | Yes (byproduct) | Yes | No (but gives SCC groupings) |
| Kosaraju's SCC | O(V+E) | Yes (byproduct) | Yes | No (but gives SCC groupings) |
If you only need to know WHETHER a cycle exists, use three-color DFS—it's simplest. If you need to understand the full cyclic structure, identify all cycles, or condense the graph into a DAG of SCCs, use Tarjan's or Kosaraju's algorithm. SCC algorithms provide more information but are more complex to implement.
Directed cycle detection has several subtle cases that can cause bugs if not handled properly:
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152
def test_edge_cases(graph_class): """ Comprehensive edge case testing for cycle detection. """ # Test 1: Self-loop g1 = graph_class(3) g1.add_edge(0, 1) g1.add_edge(1, 1) # Self-loop on vertex 1 assert g1.has_cycle() == True, "Should detect self-loop" # Test 2: Empty graph g2 = graph_class(0) assert g2.has_cycle() == False, "Empty graph has no cycle" # Test 3: Single vertex, no edges g3 = graph_class(1) assert g3.has_cycle() == False, "Single vertex, no self-loop" # Test 4: Disconnected - cycle in unreachable component g4 = graph_class(5) g4.add_edge(0, 1) # Component 1: just an edge g4.add_edge(2, 3) g4.add_edge(3, 4) g4.add_edge(4, 2) # Component 2: cycle 2→3→4→2 assert g4.has_cycle() == True, "Should find cycle in second component" # Test 5: Cross edge (looks like cycle but isn't) g5 = graph_class(4) g5.add_edge(0, 1) g5.add_edge(0, 2) g5.add_edge(1, 3) g5.add_edge(2, 3) # Cross edge to already-processed 3 assert g5.has_cycle() == False, "Cross edge is not a cycle" # Test 6: Forward edge (also not a cycle) g6 = graph_class(3) g6.add_edge(0, 1) g6.add_edge(1, 2) g6.add_edge(0, 2) # Forward edge: 0→2 but 0→1→2 exists assert g6.has_cycle() == False, "Forward edge is not a cycle" # Test 7: Long cycle g7 = graph_class(100) for i in range(99): g7.add_edge(i, i + 1) g7.add_edge(99, 0) # Complete the cycle assert g7.has_cycle() == True, "Should detect long cycle" print("All edge case tests passed!") # Run with your DirectedGraph implementation# test_edge_cases(DirectedGraph)Directed cycle detection is fundamentally more nuanced than the undirected case. Let's consolidate the key insights:
visited[] + inStack[] arrays. Same logic, different representation.Coming Up Next:
In the next page, we'll dive deeper into the three-color marking system, exploring its variants and how to use it not just for cycle detection but for computing discovery and finish times, which unlock powerful graph algorithms.
You now have mastery of cycle detection in directed graphs. You understand why the undirected approach fails, can implement three-color DFS and its variants, and know when to use Kahn's algorithm. Next, we'll explore the powerful state-marking system in greater depth.