Imagine you're exploring a vast cave system. You stand at the entrance with multiple tunnels branching before you. One approach is to fully explore the first tunnel—following it as far as it goes, exploring every sub-tunnel before returning—before moving to the next. This is the essence of Depth-First Search (DFS): commit fully to one path before backtracking.
DFS is one of the two fundamental graph traversal algorithms (alongside Breadth-First Search), yet it holds a special place in computer science. Its elegant recursive structure maps directly onto the call stack, making it intuitive to implement. Its iterative version reveals the explicit stack mechanics that power recursive algorithms. And its applications span from detecting cycles in dependency graphs to topological sorting in build systems.
By the end of this page, you will deeply understand DFS mechanics from first principles. You'll implement both recursive and iterative versions, trace through execution step-by-step, recognize when each version is preferred, and understand the role of the stack in graph exploration. This foundation prepares you for all subsequent DFS applications.
Depth-First Search (DFS) is a graph traversal algorithm that explores as far as possible along each branch before backtracking. Given a starting vertex, DFS visits a vertex, then recursively visits all vertices reachable from it through unvisited edges, before returning to explore other branches.
The core principle: Go deep before going wide.
This contrasts with Breadth-First Search (BFS), which explores all neighbors at the current level before moving deeper. DFS commits to following one path completely—like a determined explorer who refuses to turn around until hitting a dead end.
Formal Definition:
Given a graph G = (V, E) and a starting vertex s, DFS systematically explores the graph by one rule: visit the current vertex, mark it, and then recursively explore each unvisited neighbor before returning.
This simple rule generates a traversal that visits every vertex reachable from s exactly once.
Think of DFS as a depth-obsessed explorer who always takes the first unexplored doorway they find, going as deep as possible before retreating. BFS, in contrast, is methodical—checking every room on the current floor before taking stairs to the next level. Both visit all reachable vertices, but in fundamentally different orders.
Why "Depth-First"?
The name captures the algorithm's defining characteristic: it prioritizes depth over breadth. When multiple unvisited neighbors exist, DFS doesn't systematically explore all neighbors before proceeding—it picks one and dives deeper immediately.
Consider a tree structure:
      A
    / | \
   B  C  D
  / \     \
 E   F     G
Starting from A, DFS might visit: A → B → E → F → C → D → G
Notice how it fully explores B's subtree (E, F) before moving to C, then fully explores D's subtree (G). Each subtree is completely traversed before sibling subtrees are considered.
Key Insight: The Traversal Forms a Spanning Tree
As DFS explores, the edges it traverses form a DFS spanning tree (or forest, if the graph is disconnected). This tree structure is not arbitrary—it encodes the discovery relationships: for any vertex v, all vertices discovered from v form the subtree rooted at v. This property is foundational to many DFS applications.
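The discovery relationships can be made concrete by recording each vertex's discoverer during the traversal. A minimal sketch (the `dfs_tree` name and `parent` dictionary are illustrative, not a standard API):

```python
def dfs_tree(graph, start):
    """Run DFS and record, for each vertex, the vertex that discovered it.

    The (parent, child) pairs are exactly the edges of the DFS
    spanning tree rooted at start.
    """
    parent = {start: None}          # the root has no parent
    def dfs(v):
        for u in graph.get(v, []):
            if u not in parent:     # u is undiscovered
                parent[u] = v       # tree edge v -> u
                dfs(u)
    dfs(start)
    return parent

graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}
print(dfs_tree(graph, 'A'))
# {'A': None, 'B': 'A', 'E': 'B', 'F': 'B', 'C': 'A', 'D': 'A', 'G': 'D'}
```

Reading the result: every vertex except the root has exactly one parent, and the subtree under any vertex v contains precisely the vertices discovered from v.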
The recursive implementation of DFS is remarkably elegant—it mirrors the algorithm's natural definition almost verbatim. The system call stack implicitly tracks where we've been, eliminating the need for explicit stack management.
Core Recursive Structure:
DFS(vertex v):
mark v as visited
process v (if desired)
for each neighbor u of v:
if u is not visited:
DFS(u)
That's it. The recursion handles backtracking automatically—when a DFS call completes (no more unvisited neighbors), the call returns and the previous call continues from where it left off.
Why Recursion Maps So Naturally
DFS has an inherently recursive structure: to explore a vertex, explore each of its unvisited neighbors in turn, and exploring a neighbor is itself a complete DFS rooted at that neighbor.
This maps perfectly to function calls: each recursive call explores a subtree, and returning from the call naturally backtracks to the parent.
def dfs_recursive(graph, start, visited=None):
    """
    Perform Depth-First Search recursively on a graph.

    Args:
        graph: Adjacency list representation {vertex: [neighbors]}
        start: Starting vertex for traversal
        visited: Set of visited vertices (created on first call)

    Returns:
        List of vertices in DFS traversal order
    """
    # Initialize visited set on first call
    if visited is None:
        visited = set()

    # Mark current vertex as visited
    visited.add(start)

    # Process current vertex (here we collect traversal order)
    result = [start]

    # Explore all unvisited neighbors
    for neighbor in graph.get(start, []):
        if neighbor not in visited:
            # Recursive call - dive deeper into this branch
            result.extend(dfs_recursive(graph, neighbor, visited))

    # Backtracking happens implicitly when we return
    return result


# Example usage with an undirected graph
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'E', 'F'],
    'C': ['A'],
    'D': ['A', 'G'],
    'E': ['B'],
    'F': ['B'],
    'G': ['D']
}

print("DFS Traversal:", dfs_recursive(graph, 'A'))
# Output: DFS Traversal: ['A', 'B', 'E', 'F', 'C', 'D', 'G']

Step-by-Step Execution Trace
Let's trace through the recursive DFS on our example graph, showing the call stack at each step:
Graph:
C ── A ── B ── E
     |    |
     D    F
     |
     G
| Step | Action | Call Stack | Visited |
|---|---|---|---|
| 1 | Call DFS(A) | [A] | {A} |
| 2 | A's first unvisited neighbor is B. Call DFS(B) | [A, B] | {A, B} |
| 3 | B's first unvisited neighbor is E. Call DFS(E) | [A, B, E] | {A, B, E} |
| 4 | E has no unvisited neighbors. Return from DFS(E) | [A, B] | {A, B, E} |
| 5 | Back in DFS(B), next unvisited neighbor is F. Call DFS(F) | [A, B, F] | {A, B, E, F} |
| 6 | F has no unvisited neighbors. Return from DFS(F) | [A, B] | {A, B, E, F} |
| 7 | Back in DFS(B), no more unvisited neighbors. Return | [A] | {A, B, E, F} |
| 8 | Back in DFS(A), next unvisited neighbor is C. Call DFS(C) | [A, C] | {A, B, C, E, F} |
| 9 | C has no unvisited neighbors. Return from DFS(C) | [A] | {A, B, C, E, F} |
| 10 | Back in DFS(A), next unvisited neighbor is D. Call DFS(D) | [A, D] | {A, B, C, D, E, F} |
| 11 | D's first unvisited neighbor is G. Call DFS(G) | [A, D, G] | {A, B, C, D, E, F, G} |
| 12 | G has no unvisited neighbors. Return from DFS(G) | [A, D] | {A, B, C, D, E, F, G} |
| 13 | Back in DFS(D), no more unvisited neighbors. Return | [A] | All visited |
| 14 | Back in DFS(A), no more unvisited neighbors. Return | [] | All visited |
Final traversal order: A → B → E → F → C → D → G
Recursive DFS uses system call stack space proportional to the maximum depth of the DFS tree. For graphs with very deep paths (e.g., a linked list of 100,000 nodes), this can cause stack overflow. Python's default recursion limit is ~1000. Use iterative DFS for potentially deep graphs, or increase the recursion limit carefully.
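To see the limit in practice, here is a small experiment with a path graph whose DFS depth far exceeds Python's default recursion limit (the `dfs_rec`/`dfs_iter` helpers are stripped-down sketches of the implementations on this page):

```python
def dfs_rec(graph, v, visited):
    """Bare-bones recursive DFS (traversal only)."""
    visited.add(v)
    for u in graph.get(v, []):
        if u not in visited:
            dfs_rec(graph, u, visited)

def dfs_iter(graph, start):
    """Bare-bones iterative DFS (traversal only)."""
    visited, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in graph.get(v, []):
            if u not in visited:
                visited.add(u)
                stack.append(u)
    return visited

# Path graph 0 - 1 - 2 - ... - 49999: DFS depth equals the number of vertices
n = 50_000
path = {i: [i + 1] for i in range(n - 1)}

try:
    dfs_rec(path, 0, set())
    print("recursive DFS finished")
except RecursionError:
    print("recursive DFS overflowed the call stack")

print("iterative DFS visited", len(dfs_iter(path, 0)), "vertices")
```

On a default Python configuration the recursive call overflows long before reaching the end of the path, while the iterative version visits all 50,000 vertices.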
While recursive DFS is elegant, it relies on the implicit call stack. The iterative version makes this stack explicit, giving us direct control over memory usage and eliminating recursion depth limits.
Core Insight: Replace Recursion with an Explicit Stack
Every recursive algorithm can be converted to an iterative one using an explicit stack data structure. In recursive DFS, the call stack does this bookkeeping for us: each frame remembers which vertex its call is exploring and how far through that vertex's neighbor list it has progressed.
In iterative DFS, we manage this stack ourselves:
Iterative-DFS(vertex start):
create empty stack S
push start onto S
while S is not empty:
v = pop from S
if v is not visited:
mark v as visited
process v
for each neighbor u of v:
if u is not visited:
push u onto S
Subtle Difference: Traversal Order
The iterative version may produce a slightly different traversal order than the recursive version. Why? The recursive version processes neighbors in their listed order (first neighbor first). The iterative version pushes all neighbors onto the stack, so the last neighbor pushed is processed first (LIFO behavior).
To achieve identical order, push neighbors in reverse order.
def dfs_iterative(graph, start):
    """
    Perform Depth-First Search iteratively using an explicit stack.

    Args:
        graph: Adjacency list representation {vertex: [neighbors]}
        start: Starting vertex for traversal

    Returns:
        List of vertices in DFS traversal order
    """
    visited = set()
    stack = [start]  # Explicit stack replaces call stack
    result = []

    while stack:
        # Pop the next vertex to explore
        vertex = stack.pop()

        # Skip if already visited (handles duplicates in stack)
        if vertex in visited:
            continue

        # Mark as visited and process
        visited.add(vertex)
        result.append(vertex)

        # Push all unvisited neighbors onto the stack
        # Reverse to maintain same order as recursive version
        for neighbor in reversed(graph.get(vertex, [])):
            if neighbor not in visited:
                stack.append(neighbor)

    return result


# Alternative: Check visited before pushing (more memory efficient)
def dfs_iterative_optimized(graph, start):
    """
    Memory-optimized iterative DFS that checks visited before pushing.
    This prevents duplicate vertices from being added to the stack.
    """
    visited = set([start])  # Mark start as visited immediately
    stack = [start]
    result = []

    while stack:
        vertex = stack.pop()
        result.append(vertex)

        # Only push unvisited neighbors, mark them visited immediately
        for neighbor in reversed(graph.get(vertex, [])):
            if neighbor not in visited:
                visited.add(neighbor)  # Mark visited when pushing, not popping
                stack.append(neighbor)

    return result


# Example usage
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'E', 'F'],
    'C': ['A'],
    'D': ['A', 'G'],
    'E': ['B'],
    'F': ['B'],
    'G': ['D']
}

print("Iterative DFS:", dfs_iterative(graph, 'A'))
print("Optimized DFS:", dfs_iterative_optimized(graph, 'A'))
# Both output: ['A', 'B', 'E', 'F', 'C', 'D', 'G']

Detailed Stack Trace: Iterative DFS Execution
Let's trace the iterative version step by step:
| Step | Stack (bottom → top) | Action | Result | Visited |
|---|---|---|---|---|
| 0 | [A] | Initialize | [] | {} |
| 1 | [A] | Pop A, visit, push D,C,B | [A] | {A} |
| 2 | [D, C, B] | Pop B, visit, push F,E | [A,B] | {A,B} |
| 3 | [D, C, F, E] | Pop E, visit (no new neighbors) | [A,B,E] | {A,B,E} |
| 4 | [D, C, F] | Pop F, visit (no new neighbors) | [A,B,E,F] | {A,B,E,F} |
| 5 | [D, C] | Pop C, visit (A already visited) | [A,B,E,F,C] | {A,B,C,E,F} |
| 6 | [D] | Pop D, visit, push G | [A,B,E,F,C,D] | {A,B,C,D,E,F} |
| 7 | [G] | Pop G, visit (no new neighbors) | [A,B,E,F,C,D,G] | {A,B,C,D,E,F,G} |
| 8 | [] | Stack empty, done | [A,B,E,F,C,D,G] | All visited |
Final traversal: A → B → E → F → C → D → G
(Same as recursive version because we pushed neighbors in reverse order)
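To see why the reversal matters, here is a sketch of what happens when neighbors are pushed in their listed order instead (the `dfs_no_reverse` name is illustrative):

```python
def dfs_no_reverse(graph, start):
    """Iterative DFS pushing neighbors in listed order (no reversal)."""
    visited, stack, order = set(), [start], []
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        for u in graph.get(v, []):  # listed order: last neighbor lands on top
            if u not in visited:
                stack.append(u)
    return order

graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}
print(dfs_no_reverse(graph, 'A'))  # ['A', 'D', 'G', 'C', 'B', 'F', 'E']
```

The result is still a valid depth-first traversal—each branch is fully explored before backtracking—it just explores A's neighbors in the opposite order from the recursive version.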
Mark on pop: Simple, but the same vertex can appear multiple times in the stack before being visited.
Mark on push: More memory efficient—each vertex appears at most once in the stack. But slightly more complex: you must mark the start vertex before entering the loop.
For most applications, "mark on pop" is fine. For very large, dense graphs, "mark on push" saves memory.
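The memory difference can be measured directly. A sketch comparing peak stack sizes on a dense graph (the complete-graph example and function names are assumptions for illustration):

```python
def peak_stack_mark_on_pop(graph, start):
    """Peak stack size when vertices are marked visited on pop."""
    visited, stack, peak = set(), [start], 1
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        for u in graph[v]:
            if u not in visited:
                stack.append(u)  # duplicates possible: u may already be queued
        peak = max(peak, len(stack))
    return peak

def peak_stack_mark_on_push(graph, start):
    """Peak stack size when vertices are marked visited on push."""
    visited, stack, peak = {start}, [start], 1
    while stack:
        v = stack.pop()
        for u in graph[v]:
            if u not in visited:
                visited.add(u)   # marked at push: appears on the stack once
                stack.append(u)
        peak = max(peak, len(stack))
    return peak

# Complete graph on 100 vertices: every vertex neighbors every other
K = {v: [u for u in range(100) if u != v] for v in range(100)}
print(peak_stack_mark_on_pop(K, 0))   # thousands of entries (many duplicates)
print(peak_stack_mark_on_push(K, 0))  # never exceeds V = 100
```

On this graph the mark-on-pop stack grows roughly quadratically in V, while the mark-on-push stack is bounded by V.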
Both implementations achieve the same goal—visiting all reachable vertices in depth-first order—but differ in important ways. Understanding these differences helps you choose the right approach for each situation.
| Aspect | Recursive DFS | Iterative DFS |
|---|---|---|
| Code Simplicity | Very simple, mirrors definition | Slightly more verbose |
| Stack Management | Implicit (call stack) | Explicit (data structure) |
| Memory Location | Call stack (limited size) | Heap (dynamically sized) |
| Max Depth (Python default) | ~1000 calls | Limited only by heap memory |
| Tail Call Optimization | Rarely available (not in Python/Java) | Not applicable |
| Debugging | Clear stack trace | Manual stack inspection |
| State Preservation | Function variables preserved naturally | Must explicitly store state |
| Pre/Post-order Actions | Natural placement | Requires extra bookkeeping |
| Performance (typical) | Slightly slower (function call overhead) | Slightly faster |
Unbounded recursion is a real production hazard: a pathological input—a very long path, a deeply nested structure—can crash a recursive traversal with a stack overflow. In production systems, always consider whether recursion depth is bounded. If you can't guarantee a depth limit, use iterative DFS, or increase system limits deliberately and document why the new bound is safe.
Handling Pre-order and Post-order in Iterative DFS
Recursive DFS naturally supports both pre-order (process before children) and post-order (process after children) actions:
def dfs_recursive(v):
pre_order_action(v) # Before exploring children
for child in children(v):
        dfs_recursive(child)
post_order_action(v) # After all children explored
In iterative DFS, pre-order is straightforward (process when popping), but post-order requires explicitly tracking when all children have been processed. A common technique is to push a "marker" or use state flags:
def dfs_with_postorder(start):
stack = [(start, False)] # (vertex, is_processed_flag)
while stack:
vertex, processed = stack.pop()
if processed:
post_order_action(vertex)
continue
pre_order_action(vertex)
stack.append((vertex, True)) # Push back with flag = True
for child in reversed(children(vertex)):
if not visited(child):
stack.append((child, False))
This technique is essential for applications like topological sorting that rely on post-order.
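Here is a runnable sketch of that marker technique (the `postorder` name and sample DAG are illustrative). Note the visited check happens when a vertex is popped for expansion, not when it is pushed—this keeps the post-order correct when a vertex is reachable along several paths. Reversing the post-order of a DAG yields a topological order:

```python
def postorder(graph, start):
    """Iterative DFS returning vertices in post-order.

    Each vertex is pushed twice: once (flag False) to expand its
    children, and once (flag True) to emit it after all children
    have been fully processed.
    """
    visited = set()
    stack = [(start, False)]
    order = []
    while stack:
        v, processed = stack.pop()
        if processed:
            order.append(v)       # all descendants of v already emitted
            continue
        if v in visited:
            continue              # reached earlier via another path
        visited.add(v)
        stack.append((v, True))   # revisit v after its children
        for u in reversed(graph.get(v, [])):
            if u not in visited:
                stack.append((u, False))
    return order

# Small dependency DAG: edge u -> v means u must come before v
dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
post = postorder(dag, 'a')
print(post)        # ['d', 'b', 'c', 'a']
print(post[::-1])  # ['a', 'c', 'b', 'd'] - a valid topological order
```

Checking the reversed order: 'a' precedes 'b' and 'c', and both precede 'd', satisfying every edge.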
DFS can operate on any graph representation, but the choice affects both implementation details and performance. Let's examine DFS across the three common representations.
1. Adjacency List (Most Common)
This is the standard representation for DFS. Each vertex stores a list of its neighbors, allowing O(degree(v)) iteration over neighbors.
# Adjacency List: {vertex: [neighbor1, neighbor2, ...]}
graph = {
'A': ['B', 'C'],
'B': ['A', 'D'],
'C': ['A'],
'D': ['B']
}
Total DFS complexity: O(V + E) — We visit each vertex once, and for each vertex, we iterate over its adjacency list. The sum of all adjacency list lengths equals 2E (for undirected) or E (for directed).
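That accounting is easy to verify on the example graph used throughout this page:

```python
# Undirected example graph: 6 edges (A-B, A-C, A-D, B-E, B-F, D-G)
graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}

# Each undirected edge appears in two adjacency lists, so the total
# list length is 2E
total_list_length = sum(len(nbrs) for nbrs in graph.values())
print(total_list_length)  # 12 = 2 * 6 edges
```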
2. Adjacency Matrix
Matrix representation uses M[i][j] = 1 to indicate an edge from i to j.
# Adjacency Matrix: matrix[i][j] = 1 if edge (i,j) exists
matrix = [
[0, 1, 1, 0], # A's connections
[1, 0, 0, 1], # B's connections
[1, 0, 0, 0], # C's connections
[0, 1, 0, 0] # D's connections
]
Total DFS complexity: O(V²) — For each vertex, we scan the entire row (V entries) to find neighbors. This is inefficient for sparse graphs.
def dfs_on_matrix(matrix, start):
    """
    DFS on adjacency matrix representation.

    Args:
        matrix: 2D list where matrix[i][j] = 1 if edge exists
        start: Starting vertex index

    Returns:
        List of vertex indices in DFS order
    """
    n = len(matrix)
    visited = [False] * n
    result = []

    def dfs(v):
        visited[v] = True
        result.append(v)
        # Must scan entire row to find neighbors - O(V) per vertex
        for neighbor in range(n):
            if matrix[v][neighbor] == 1 and not visited[neighbor]:
                dfs(neighbor)

    dfs(start)
    return result


# Example: Graph with 4 vertices
matrix = [
    [0, 1, 1, 0],  # Vertex 0: connected to 1, 2
    [1, 0, 0, 1],  # Vertex 1: connected to 0, 3
    [1, 0, 0, 0],  # Vertex 2: connected to 0
    [0, 1, 0, 0]   # Vertex 3: connected to 1
]

print("DFS on matrix:", dfs_on_matrix(matrix, 0))
# Output: [0, 1, 3, 2]

3. Edge List
Edge list stores edges as pairs (u, v). This representation is space-efficient but requires preprocessing for efficient DFS.
# Edge List: [(u, v), ...]
edges = [('A', 'B'), ('A', 'C'), ('B', 'D')]
Direct DFS on edge list: O(V × E) — For each vertex, scan all edges to find neighbors.
Better approach: Convert to adjacency list first (O(E)), then run standard DFS (O(V + E)). Total: O(V + E).
def edge_list_to_adj_list(edges):
graph = {}
for u, v in edges:
graph.setdefault(u, []).append(v)
graph.setdefault(v, []).append(u) # For undirected
return graph
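Putting the two steps together—convert once, then run a standard DFS (the conversion helper is repeated here so the sketch runs standalone):

```python
def edge_list_to_adj_list(edges):
    """Convert an undirected edge list to an adjacency list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)  # for undirected graphs
    return graph

def dfs(graph, start):
    """Iterative DFS (mark on push), matching recursive neighbor order."""
    visited, stack, order = {start}, [start], []
    while stack:
        v = stack.pop()
        order.append(v)
        for u in reversed(graph.get(v, [])):
            if u not in visited:
                visited.add(u)
                stack.append(u)
    return order

edges = [('A', 'B'), ('A', 'C'), ('B', 'D')]
print(dfs(edge_list_to_adj_list(edges), 'A'))  # ['A', 'B', 'D', 'C']
```

The O(E) conversion cost is paid once, after which every traversal runs in O(V + E).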
| Representation | DFS Time Complexity | Space for Graph | Best For |
|---|---|---|---|
| Adjacency List | O(V + E) | O(V + E) | Sparse graphs, most applications |
| Adjacency Matrix | O(V²) | O(V²) | Dense graphs, edge existence queries |
| Edge List (direct) | O(V × E) | O(E) | Edge-centric algorithms |
| Edge List (convert first) | O(V + E) | O(V + E) | When starting from edge data |
Let's put everything together with a complete, production-quality DFS implementation that handles common edge cases and provides both recursive and iterative versions.
from collections import defaultdict
from typing import List, Optional, Callable


class Graph:
    """
    A flexible graph class supporting both directed and undirected graphs.
    Uses adjacency list representation for efficient DFS.
    """

    def __init__(self, directed: bool = False):
        self.graph = defaultdict(list)
        self.directed = directed
        self.vertices = set()

    def add_edge(self, u, v):
        """Add an edge from u to v (and v to u if undirected)."""
        self.graph[u].append(v)
        self.vertices.add(u)
        self.vertices.add(v)
        if not self.directed:
            self.graph[v].append(u)

    def add_vertex(self, v):
        """Add an isolated vertex."""
        self.vertices.add(v)

    def dfs_recursive(self, start, on_visit: Optional[Callable] = None) -> List:
        """
        Recursive DFS with optional callback function.

        Args:
            start: Starting vertex
            on_visit: Optional function called with each visited vertex

        Returns:
            List of vertices in DFS traversal order
        """
        if start not in self.vertices:
            raise ValueError(f"Start vertex {start} not in graph")

        visited = set()
        result = []

        def _dfs(v):
            visited.add(v)
            result.append(v)
            if on_visit:
                on_visit(v)
            for neighbor in self.graph[v]:
                if neighbor not in visited:
                    _dfs(neighbor)

        _dfs(start)
        return result

    def dfs_iterative(self, start, on_visit: Optional[Callable] = None) -> List:
        """
        Iterative DFS with optional callback function.

        Args:
            start: Starting vertex
            on_visit: Optional function called with each visited vertex

        Returns:
            List of vertices in DFS traversal order
        """
        if start not in self.vertices:
            raise ValueError(f"Start vertex {start} not in graph")

        visited = set([start])
        stack = [start]
        result = []

        while stack:
            vertex = stack.pop()
            result.append(vertex)
            if on_visit:
                on_visit(vertex)
            # Reverse to maintain same order as recursive
            for neighbor in reversed(self.graph[vertex]):
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)

        return result

    def dfs_all_vertices(self) -> List[List]:
        """
        Perform DFS on entire graph, handling disconnected components.

        Returns:
            List of components, each component is a list of vertices
        """
        visited = set()
        components = []

        for vertex in self.vertices:
            if vertex not in visited:
                component = []
                stack = [vertex]
                visited.add(vertex)
                while stack:
                    v = stack.pop()
                    component.append(v)
                    for neighbor in reversed(self.graph[v]):
                        if neighbor not in visited:
                            visited.add(neighbor)
                            stack.append(neighbor)
                components.append(component)

        return components


# Demonstration
if __name__ == "__main__":
    # Create an undirected graph
    g = Graph(directed=False)
    edges = [('A', 'B'), ('A', 'C'), ('B', 'D'),
             ('B', 'E'), ('C', 'F'), ('E', 'F')]
    for u, v in edges:
        g.add_edge(u, v)

    print("Graph edges:", dict(g.graph))
    print()

    # DFS with callback
    print("DFS Recursive from 'A':")
    result = g.dfs_recursive('A', on_visit=lambda v: print(f"  Visiting: {v}"))
    print(f"  Traversal order: {result}")
    print()

    print("DFS Iterative from 'A':")
    result = g.dfs_iterative('A', on_visit=lambda v: print(f"  Visiting: {v}"))
    print(f"  Traversal order: {result}")
    print()

    # Disconnected graph
    g.add_vertex('X')
    g.add_edge('X', 'Y')
    g.add_vertex('Z')  # Isolated vertex

    print("All components (disconnected graph):")
    components = g.dfs_all_vertices()
    for i, comp in enumerate(components):
        print(f"  Component {i + 1}: {comp}")

This implementation includes: (1) Support for both directed and undirected graphs, (2) Callback functions for custom processing, (3) Handling of disconnected components, (4) Input validation with clear error messages, (5) Both recursive and iterative methods producing identical traversal orders.
We've established a deep understanding of DFS mechanics. Let's consolidate the key insights:
- DFS goes deep before wide: it commits to one branch fully, then backtracks.
- Recursive DFS leans on the call stack; iterative DFS makes the same stack explicit on the heap, removing recursion depth limits.
- Pushing neighbors in reverse order makes the iterative traversal match the recursive one.
- "Mark on push" keeps each vertex on the stack at most once; "mark on pop" is simpler but can store duplicates.
- With an adjacency list, DFS runs in O(V + E) time; an adjacency matrix costs O(V²).
What's Next:
With the DFS algorithm firmly understood, we'll explore its core mechanism: stack-based exploration. The next page examines how the stack determines traversal order, why DFS naturally produces certain orderings, and how this stack behavior enables powerful applications like backtracking and topological sorting.
You now have a thorough understanding of DFS mechanics in both recursive and iterative forms. You can implement DFS from scratch, trace its execution, and choose the appropriate version for any situation. Next, we'll dive deeper into the stack behavior that gives DFS its unique character.