Imagine you're exploring a vast cave system. You stand at the entrance with multiple tunnels branching before you. One approach is to fully explore the first tunnel—following it as far as it goes, exploring every sub-tunnel before returning—before moving to the next. This is the essence of Depth-First Search (DFS): commit fully to one path before backtracking.
DFS is one of the two fundamental graph traversal algorithms (alongside Breadth-First Search), yet it holds a special place in computer science. Its elegant recursive structure maps directly onto the call stack, making it intuitive to implement. Its iterative version reveals the explicit stack mechanics that power recursive algorithms. And its applications span from detecting cycles in dependency graphs to topological sorting in build systems.
By the end of this page, you will deeply understand DFS mechanics from first principles. You'll implement both recursive and iterative versions, trace through execution step-by-step, recognize when each version is preferred, and understand the role of the stack in graph exploration. This foundation prepares you for all subsequent DFS applications.
Depth-First Search (DFS) is a graph traversal algorithm that explores as far as possible along each branch before backtracking. Given a starting vertex, DFS visits a vertex, then recursively visits all vertices reachable from it through unvisited edges, before returning to explore other branches.
The core principle: Go deep before going wide.
This contrasts with Breadth-First Search (BFS), which explores all neighbors at the current level before moving deeper. DFS commits to following one path completely—like a determined explorer who refuses to turn around until hitting a dead end.
Formal Definition:
Given a graph G = (V, E) and a starting vertex s, DFS systematically explores the graph by one rule: visit the current vertex, mark it, and then recursively explore each unvisited neighbor before returning.
This simple rule generates a traversal that visits every vertex reachable from s exactly once.
Think of DFS as a depth-obsessed explorer who always takes the first unexplored doorway they find, going as deep as possible before retreating. BFS, in contrast, is methodical—checking every room on the current floor before taking stairs to the next level. Both visit all reachable vertices, but in fundamentally different orders.
Why "Depth-First"?
The name captures the algorithm's defining characteristic: it prioritizes depth over breadth. When multiple unvisited neighbors exist, DFS doesn't systematically explore all neighbors before proceeding—it picks one and dives deeper immediately.
Consider a tree structure:
      A
    / | \
   B  C  D
  / \     \
 E   F     G
Starting from A, DFS might visit: A → B → E → F → C → D → G
Notice how it fully explores B's subtree (E, F) before moving to C, then fully explores D's subtree (G). Each subtree is completely traversed before sibling subtrees are considered.
Key Insight: The Traversal Forms a Spanning Tree
As DFS explores, the edges it traverses form a DFS spanning tree (or forest, if the graph is disconnected). This tree structure is not arbitrary—it encodes the discovery relationships: for any vertex v, all vertices discovered from v form the subtree rooted at v. This property is foundational to many DFS applications.
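The discovery relationships can be made concrete by recording each vertex's discoverer during the traversal. A minimal sketch (the `dfs_tree` name and `parent` dictionary are illustrative, not a standard API):

```python
def dfs_tree(graph, start):
    """Run DFS and record, for each vertex, the vertex that discovered it.

    The (parent, child) pairs are exactly the edges of the DFS
    spanning tree rooted at start.
    """
    parent = {start: None}          # the root has no parent
    def dfs(v):
        for u in graph.get(v, []):
            if u not in parent:     # u is undiscovered
                parent[u] = v       # tree edge v -> u
                dfs(u)
    dfs(start)
    return parent

graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}
print(dfs_tree(graph, 'A'))
# {'A': None, 'B': 'A', 'E': 'B', 'F': 'B', 'C': 'A', 'D': 'A', 'G': 'D'}
```

Reading the result: every vertex except the root has exactly one parent, and the subtree under any vertex v contains precisely the vertices discovered from v.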
The recursive implementation of DFS is remarkably elegant—it mirrors the algorithm's natural definition almost verbatim. The system call stack implicitly tracks where we've been, eliminating the need for explicit stack management.
Core Recursive Structure:
DFS(vertex v):
mark v as visited
process v (if desired)
for each neighbor u of v:
if u is not visited:
DFS(u)
That's it. The recursion handles backtracking automatically—when a DFS call completes (no more unvisited neighbors), the call returns and the previous call continues from where it left off.
Why Recursion Maps So Naturally
DFS has an inherently recursive structure: to explore a vertex, explore each of its unvisited neighbors in turn, and exploring a neighbor is itself a complete DFS rooted at that neighbor.
This maps perfectly to function calls: each recursive call explores a subtree, and returning from the call naturally backtracks to the parent.
def dfs_recursive(graph, start, visited=None):
    """
    Perform Depth-First Search recursively on a graph.

    Args:
        graph: Adjacency list representation {vertex: [neighbors]}
        start: Starting vertex for traversal
        visited: Set of visited vertices (created on first call)

    Returns:
        List of vertices in DFS traversal order
    """
    # Initialize visited set on first call
    if visited is None:
        visited = set()

    # Mark current vertex as visited
    visited.add(start)

    # Process current vertex (here we collect traversal order)
    result = [start]

    # Explore all unvisited neighbors
    for neighbor in graph.get(start, []):
        if neighbor not in visited:
            # Recursive call - dive deeper into this branch
            result.extend(dfs_recursive(graph, neighbor, visited))

    # Backtracking happens implicitly when we return
    return result


# Example usage with an undirected graph
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'E', 'F'],
    'C': ['A'],
    'D': ['A', 'G'],
    'E': ['B'],
    'F': ['B'],
    'G': ['D']
}

print("DFS Traversal:", dfs_recursive(graph, 'A'))
# Output: DFS Traversal: ['A', 'B', 'E', 'F', 'C', 'D', 'G']

Step-by-Step Execution Trace
Let's trace through the recursive DFS on our example graph, showing the call stack at each step:
Graph:
C ── A ── B ── E
     |    |
     D    F
     |
     G
| Step | Action | Call Stack | Visited |
|---|---|---|---|
| 1 | Call DFS(A) | [A] | {A} |
| 2 | A's first unvisited neighbor is B. Call DFS(B) | [A, B] | {A, B} |
| 3 | B's first unvisited neighbor is E. Call DFS(E) | [A, B, E] | {A, B, E} |
| 4 | E has no unvisited neighbors. Return from DFS(E) | [A, B] | {A, B, E} |
| 5 | Back in DFS(B), next unvisited neighbor is F. Call DFS(F) | [A, B, F] | {A, B, E, F} |
| 6 | F has no unvisited neighbors. Return from DFS(F) | [A, B] | {A, B, E, F} |
| 7 | Back in DFS(B), no more unvisited neighbors. Return | [A] | {A, B, E, F} |
| 8 | Back in DFS(A), next unvisited neighbor is C. Call DFS(C) | [A, C] | {A, B, C, E, F} |
| 9 | C has no unvisited neighbors. Return from DFS(C) | [A] | {A, B, C, E, F} |
| 10 | Back in DFS(A), next unvisited neighbor is D. Call DFS(D) | [A, D] | {A, B, C, D, E, F} |
| 11 | D's first unvisited neighbor is G. Call DFS(G) | [A, D, G] | {A, B, C, D, E, F, G} |
| 12 | G has no unvisited neighbors. Return from DFS(G) | [A, D] | {A, B, C, D, E, F, G} |
| 13 | Back in DFS(D), no more unvisited neighbors. Return | [A] | All visited |
| 14 | Back in DFS(A), no more unvisited neighbors. Return | [] | All visited |
Final traversal order: A → B → E → F → C → D → G
Recursive DFS uses system call stack space proportional to the maximum depth of the DFS tree. For graphs with very deep paths (e.g., a linked list of 100,000 nodes), this can cause stack overflow. Python's default recursion limit is ~1000. Use iterative DFS for potentially deep graphs, or increase the recursion limit carefully.
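To see the limit in practice, here is a small experiment with a path graph whose DFS depth far exceeds Python's default recursion limit (the `dfs_rec`/`dfs_iter` helpers are stripped-down sketches of the implementations on this page):

```python
def dfs_rec(graph, v, visited):
    """Bare-bones recursive DFS (traversal only)."""
    visited.add(v)
    for u in graph.get(v, []):
        if u not in visited:
            dfs_rec(graph, u, visited)

def dfs_iter(graph, start):
    """Bare-bones iterative DFS (traversal only)."""
    visited, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for u in graph.get(v, []):
            if u not in visited:
                visited.add(u)
                stack.append(u)
    return visited

# Path graph 0 - 1 - 2 - ... - 49999: DFS depth equals the number of vertices
n = 50_000
path = {i: [i + 1] for i in range(n - 1)}

try:
    dfs_rec(path, 0, set())
    print("recursive DFS finished")
except RecursionError:
    print("recursive DFS overflowed the call stack")

print("iterative DFS visited", len(dfs_iter(path, 0)), "vertices")
```

On a default Python configuration the recursive call overflows long before reaching the end of the path, while the iterative version visits all 50,000 vertices.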
While recursive DFS is elegant, it relies on the implicit call stack. The iterative version makes this stack explicit, giving us direct control over memory usage and eliminating recursion depth limits.
Core Insight: Replace Recursion with an Explicit Stack
Every recursive algorithm can be converted to an iterative one using an explicit stack data structure. In recursive DFS, the call stack does this bookkeeping for us: each frame remembers which vertex its call is exploring and how far through that vertex's neighbor list it has progressed.
In iterative DFS, we manage this stack ourselves:
Iterative-DFS(vertex start):
create empty stack S
push start onto S
while S is not empty:
v = pop from S
if v is not visited:
mark v as visited
process v
for each neighbor u of v:
if u is not visited:
push u onto S
Subtle Difference: Traversal Order
The iterative version may produce a slightly different traversal order than the recursive version. Why? The recursive version processes neighbors in their listed order (first neighbor first). The iterative version pushes all neighbors onto the stack, so the last neighbor pushed is processed first (LIFO behavior).
To achieve identical order, push neighbors in reverse order.
def dfs_iterative(graph, start):
    """
    Perform Depth-First Search iteratively using an explicit stack.

    Args:
        graph: Adjacency list representation {vertex: [neighbors]}
        start: Starting vertex for traversal

    Returns:
        List of vertices in DFS traversal order
    """
    visited = set()
    stack = [start]  # Explicit stack replaces call stack
    result = []

    while stack:
        # Pop the next vertex to explore
        vertex = stack.pop()

        # Skip if already visited (handles duplicates in stack)
        if vertex in visited:
            continue

        # Mark as visited and process
        visited.add(vertex)
        result.append(vertex)

        # Push all unvisited neighbors onto the stack
        # Reverse to maintain same order as recursive version
        for neighbor in reversed(graph.get(vertex, [])):
            if neighbor not in visited:
                stack.append(neighbor)

    return result


# Alternative: Check visited before pushing (more memory efficient)
def dfs_iterative_optimized(graph, start):
    """
    Memory-optimized iterative DFS that checks visited before pushing.
    This prevents duplicate vertices from being added to the stack.
    """
    visited = set([start])  # Mark start as visited immediately
    stack = [start]
    result = []

    while stack:
        vertex = stack.pop()
        result.append(vertex)

        # Only push unvisited neighbors, mark them visited immediately
        for neighbor in reversed(graph.get(vertex, [])):
            if neighbor not in visited:
                visited.add(neighbor)  # Mark visited when pushing, not popping
                stack.append(neighbor)

    return result


# Example usage
graph = {
    'A': ['B', 'C', 'D'],
    'B': ['A', 'E', 'F'],
    'C': ['A'],
    'D': ['A', 'G'],
    'E': ['B'],
    'F': ['B'],
    'G': ['D']
}

print("Iterative DFS:", dfs_iterative(graph, 'A'))
print("Optimized DFS:", dfs_iterative_optimized(graph, 'A'))
# Both output: ['A', 'B', 'E', 'F', 'C', 'D', 'G']

Detailed Stack Trace: Iterative DFS Execution
Let's trace the iterative version step by step:
| Step | Stack (bottom → top) | Action | Result | Visited |
|---|---|---|---|---|
| 0 | [A] | Initialize | [] | {} |
| 1 | [A] | Pop A, visit, push D,C,B | [A] | {A} |
| 2 | [D, C, B] | Pop B, visit, push F,E | [A,B] | {A,B} |
| 3 | [D, C, F, E] | Pop E, visit (no new neighbors) | [A,B,E] | {A,B,E} |
| 4 | [D, C, F] | Pop F, visit (no new neighbors) | [A,B,E,F] | {A,B,E,F} |
| 5 | [D, C] | Pop C, visit (A already visited) | [A,B,E,F,C] | {A,B,C,E,F} |
| 6 | [D] | Pop D, visit, push G | [A,B,E,F,C,D] | {A,B,C,D,E,F} |
| 7 | [G] | Pop G, visit (no new neighbors) | [A,B,E,F,C,D,G] | {A,B,C,D,E,F,G} |
| 8 | [] | Stack empty, done | [A,B,E,F,C,D,G] | All visited |
Final traversal: A → B → E → F → C → D → G
(Same as recursive version because we pushed neighbors in reverse order)
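To see why the reversal matters, here is a sketch of what happens when neighbors are pushed in their listed order instead (the `dfs_no_reverse` name is illustrative):

```python
def dfs_no_reverse(graph, start):
    """Iterative DFS pushing neighbors in listed order (no reversal)."""
    visited, stack, order = set(), [start], []
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        order.append(v)
        for u in graph.get(v, []):  # listed order: last neighbor lands on top
            if u not in visited:
                stack.append(u)
    return order

graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}
print(dfs_no_reverse(graph, 'A'))  # ['A', 'D', 'G', 'C', 'B', 'F', 'E']
```

The result is still a valid depth-first traversal—each branch is fully explored before backtracking—it just explores A's neighbors in the opposite order from the recursive version.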
Mark on pop: Simple, but the same vertex can appear multiple times in the stack before being visited.
Mark on push: More memory efficient—each vertex appears at most once in the stack. But slightly more complex: you must mark the start vertex before entering the loop.
For most applications, "mark on pop" is fine. For very large, dense graphs, "mark on push" saves memory.
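The memory difference can be measured directly. A sketch comparing peak stack sizes on a dense graph (the complete-graph example and function names are assumptions for illustration):

```python
def peak_stack_mark_on_pop(graph, start):
    """Peak stack size when vertices are marked visited on pop."""
    visited, stack, peak = set(), [start], 1
    while stack:
        v = stack.pop()
        if v in visited:
            continue
        visited.add(v)
        for u in graph[v]:
            if u not in visited:
                stack.append(u)  # duplicates possible: u may already be queued
        peak = max(peak, len(stack))
    return peak

def peak_stack_mark_on_push(graph, start):
    """Peak stack size when vertices are marked visited on push."""
    visited, stack, peak = {start}, [start], 1
    while stack:
        v = stack.pop()
        for u in graph[v]:
            if u not in visited:
                visited.add(u)   # marked at push: appears on the stack once
                stack.append(u)
        peak = max(peak, len(stack))
    return peak

# Complete graph on 100 vertices: every vertex neighbors every other
K = {v: [u for u in range(100) if u != v] for v in range(100)}
print(peak_stack_mark_on_pop(K, 0))   # thousands of entries (many duplicates)
print(peak_stack_mark_on_push(K, 0))  # never exceeds V = 100
```

On this graph the mark-on-pop stack grows roughly quadratically in V, while the mark-on-push stack is bounded by V.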
Both implementations achieve the same goal—visiting all reachable vertices in depth-first order—but differ in important ways. Understanding these differences helps you choose the right approach for each situation.
| Aspect | Recursive DFS | Iterative DFS |
|---|---|---|
| Code Simplicity | Very simple, mirrors definition | Slightly more verbose |
| Stack Management | Implicit (call stack) | Explicit (data structure) |
| Memory Location | Call stack (limited size) | Heap (dynamically sized) |
| Max Depth (Python default) | ~1000 calls | Limited only by heap memory |
| Tail Call Optimization | Rarely available (not in Python/Java) | Not applicable |
| Debugging | Clear stack trace | Manual stack inspection |
| State Preservation | Function variables preserved naturally | Must explicitly store state |
| Pre/Post-order Actions | Natural placement | Requires extra bookkeeping |
| Performance (typical) | Slightly slower (function call overhead) | Slightly faster |
Unbounded recursion is a real production hazard: a pathological input—a very long path, a deeply nested structure—can crash a recursive traversal with a stack overflow. In production systems, always consider whether recursion depth is bounded. If you can't guarantee a depth limit, use iterative DFS, or increase system limits deliberately and document why the new bound is safe.
Handling Pre-order and Post-order in Iterative DFS
Recursive DFS naturally supports both pre-order (process before children) and post-order (process after children) actions:
def dfs_recursive(v):
pre_order_action(v) # Before exploring children
for child in children(v):
        dfs_recursive(child)
post_order_action(v) # After all children explored
In iterative DFS, pre-order is straightforward (process when popping), but post-order requires explicitly tracking when all children have been processed. A common technique is to push a "marker" or use state flags:
def dfs_with_postorder(start):
stack = [(start, False)] # (vertex, is_processed_flag)
while stack:
vertex, processed = stack.pop()
if processed:
post_order_action(vertex)
continue
pre_order_action(vertex)
stack.append((vertex, True)) # Push back with flag = True
for child in reversed(children(vertex)):
if not visited(child):
stack.append((child, False))
This technique is essential for applications like topological sorting that rely on post-order.
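Here is a runnable sketch of that marker technique (the `postorder` name and sample DAG are illustrative). Note the visited check happens when a vertex is popped for expansion, not when it is pushed—this keeps the post-order correct when a vertex is reachable along several paths. Reversing the post-order of a DAG yields a topological order:

```python
def postorder(graph, start):
    """Iterative DFS returning vertices in post-order.

    Each vertex is pushed twice: once (flag False) to expand its
    children, and once (flag True) to emit it after all children
    have been fully processed.
    """
    visited = set()
    stack = [(start, False)]
    order = []
    while stack:
        v, processed = stack.pop()
        if processed:
            order.append(v)       # all descendants of v already emitted
            continue
        if v in visited:
            continue              # reached earlier via another path
        visited.add(v)
        stack.append((v, True))   # revisit v after its children
        for u in reversed(graph.get(v, [])):
            if u not in visited:
                stack.append((u, False))
    return order

# Small dependency DAG: edge u -> v means u must come before v
dag = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d'], 'd': []}
post = postorder(dag, 'a')
print(post)        # ['d', 'b', 'c', 'a']
print(post[::-1])  # ['a', 'c', 'b', 'd'] - a valid topological order
```

Checking the reversed order: 'a' precedes 'b' and 'c', and both precede 'd', satisfying every edge.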
DFS can operate on any graph representation, but the choice affects both implementation details and performance. Let's examine DFS across the three common representations.
1. Adjacency List (Most Common)
This is the standard representation for DFS. Each vertex stores a list of its neighbors, allowing O(degree(v)) iteration over neighbors.
# Adjacency List: {vertex: [neighbor1, neighbor2, ...]}
graph = {
'A': ['B', 'C'],
'B': ['A', 'D'],
'C': ['A'],
'D': ['B']
}
Total DFS complexity: O(V + E) — We visit each vertex once, and for each vertex, we iterate over its adjacency list. The sum of all adjacency list lengths equals 2E (for undirected) or E (for directed).
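That accounting is easy to verify on the example graph used throughout this page:

```python
# Undirected example graph: 6 edges (A-B, A-C, A-D, B-E, B-F, D-G)
graph = {
    'A': ['B', 'C', 'D'], 'B': ['A', 'E', 'F'], 'C': ['A'],
    'D': ['A', 'G'], 'E': ['B'], 'F': ['B'], 'G': ['D'],
}

# Each undirected edge appears in two adjacency lists, so the total
# list length is 2E
total_list_length = sum(len(nbrs) for nbrs in graph.values())
print(total_list_length)  # 12 = 2 * 6 edges
```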
2. Adjacency Matrix
Matrix representation uses M[i][j] = 1 to indicate an edge from i to j.
# Adjacency Matrix: matrix[i][j] = 1 if edge (i,j) exists
matrix = [
[0, 1, 1, 0], # A's connections
[1, 0, 0, 1], # B's connections
[1, 0, 0, 0], # C's connections
[0, 1, 0, 0] # D's connections
]
Total DFS complexity: O(V²) — For each vertex, we scan the entire row (V entries) to find neighbors. This is inefficient for sparse graphs.
def dfs_on_matrix(matrix, start):
    """
    DFS on adjacency matrix representation.

    Args:
        matrix: 2D list where matrix[i][j] = 1 if edge exists
        start: Starting vertex index

    Returns:
        List of vertex indices in DFS order
    """
    n = len(matrix)
    visited = [False] * n
    result = []

    def dfs(v):
        visited[v] = True
        result.append(v)
        # Must scan entire row to find neighbors - O(V) per vertex
        for neighbor in range(n):
            if matrix[v][neighbor] == 1 and not visited[neighbor]:
                dfs(neighbor)

    dfs(start)
    return result


# Example: Graph with 4 vertices
matrix = [
    [0, 1, 1, 0],  # Vertex 0: connected to 1, 2
    [1, 0, 0, 1],  # Vertex 1: connected to 0, 3
    [1, 0, 0, 0],  # Vertex 2: connected to 0
    [0, 1, 0, 0]   # Vertex 3: connected to 1
]

print("DFS on matrix:", dfs_on_matrix(matrix, 0))
# Output: [0, 1, 3, 2]

3. Edge List
Edge list stores edges as pairs (u, v). This representation is space-efficient but requires preprocessing for efficient DFS.
# Edge List: [(u, v), ...]
edges = [('A', 'B'), ('A', 'C'), ('B', 'D')]
Direct DFS on edge list: O(V × E) — For each vertex, scan all edges to find neighbors.
Better approach: Convert to adjacency list first (O(E)), then run standard DFS (O(V + E)). Total: O(V + E).
def edge_list_to_adj_list(edges):
graph = {}
for u, v in edges:
graph.setdefault(u, []).append(v)
graph.setdefault(v, []).append(u) # For undirected
return graph
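Putting the two steps together—convert once, then run a standard DFS (the conversion helper is repeated here so the sketch runs standalone):

```python
def edge_list_to_adj_list(edges):
    """Convert an undirected edge list to an adjacency list."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)  # for undirected graphs
    return graph

def dfs(graph, start):
    """Iterative DFS (mark on push), matching recursive neighbor order."""
    visited, stack, order = {start}, [start], []
    while stack:
        v = stack.pop()
        order.append(v)
        for u in reversed(graph.get(v, [])):
            if u not in visited:
                visited.add(u)
                stack.append(u)
    return order

edges = [('A', 'B'), ('A', 'C'), ('B', 'D')]
print(dfs(edge_list_to_adj_list(edges), 'A'))  # ['A', 'B', 'D', 'C']
```

The O(E) conversion cost is paid once, after which every traversal runs in O(V + E).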
| Representation | DFS Time Complexity | Space for Graph | Best For |
|---|---|---|---|
| Adjacency List | O(V + E) | O(V + E) | Sparse graphs, most applications |
| Adjacency Matrix | O(V²) | O(V²) | Dense graphs, edge existence queries |
| Edge List (direct) | O(V × E) | O(E) | Edge-centric algorithms |
| Edge List (convert first) | O(V + E) | O(V + E) | When starting from edge data |
Let's put everything together with a complete, production-quality DFS implementation that handles common edge cases and provides both recursive and iterative versions.
from collections import defaultdict
from typing import List, Optional, Callable


class Graph:
    """
    A flexible graph class supporting both directed and undirected graphs.
    Uses adjacency list representation for efficient DFS.
    """

    def __init__(self, directed: bool = False):
        self.graph = defaultdict(list)
        self.directed = directed
        self.vertices = set()

    def add_edge(self, u, v):
        """Add an edge from u to v (and v to u if undirected)."""
        self.graph[u].append(v)
        self.vertices.add(u)
        self.vertices.add(v)
        if not self.directed:
            self.graph[v].append(u)

    def add_vertex(self, v):
        """Add an isolated vertex."""
        self.vertices.add(v)

    def dfs_recursive(self, start, on_visit: Optional[Callable] = None) -> List:
        """
        Recursive DFS with optional callback function.

        Args:
            start: Starting vertex
            on_visit: Optional function called with each visited vertex

        Returns:
            List of vertices in DFS traversal order
        """
        if start not in self.vertices:
            raise ValueError(f"Start vertex {start} not in graph")

        visited = set()
        result = []

        def _dfs(v):
            visited.add(v)
            result.append(v)
            if on_visit:
                on_visit(v)
            for neighbor in self.graph[v]:
                if neighbor not in visited:
                    _dfs(neighbor)

        _dfs(start)
        return result

    def dfs_iterative(self, start, on_visit: Optional[Callable] = None) -> List:
        """
        Iterative DFS with optional callback function.

        Args:
            start: Starting vertex
            on_visit: Optional function called with each visited vertex

        Returns:
            List of vertices in DFS traversal order
        """
        if start not in self.vertices:
            raise ValueError(f"Start vertex {start} not in graph")

        visited = set([start])
        stack = [start]
        result = []

        while stack:
            vertex = stack.pop()
            result.append(vertex)
            if on_visit:
                on_visit(vertex)
            # Reverse to maintain same order as recursive
            for neighbor in reversed(self.graph[vertex]):
                if neighbor not in visited:
                    visited.add(neighbor)
                    stack.append(neighbor)

        return result

    def dfs_all_vertices(self) -> List[List]:
        """
        Perform DFS on entire graph, handling disconnected components.

        Returns:
            List of components, each component is a list of vertices
        """
        visited = set()
        components = []

        for vertex in self.vertices:
            if vertex not in visited:
                component = []
                stack = [vertex]
                visited.add(vertex)
                while stack:
                    v = stack.pop()
                    component.append(v)
                    for neighbor in reversed(self.graph[v]):
                        if neighbor not in visited:
                            visited.add(neighbor)
                            stack.append(neighbor)
                components.append(component)

        return components


# Demonstration
if __name__ == "__main__":
    # Create an undirected graph
    g = Graph(directed=False)
    edges = [('A', 'B'), ('A', 'C'), ('B', 'D'),
             ('B', 'E'), ('C', 'F'), ('E', 'F')]
    for u, v in edges:
        g.add_edge(u, v)

    print("Graph edges:", dict(g.graph))
    print()

    # DFS with callback
    print("DFS Recursive from 'A':")
    result = g.dfs_recursive('A', on_visit=lambda v: print(f"  Visiting: {v}"))
    print(f"  Traversal order: {result}")
    print()

    print("DFS Iterative from 'A':")
    result = g.dfs_iterative('A', on_visit=lambda v: print(f"  Visiting: {v}"))
    print(f"  Traversal order: {result}")
    print()

    # Disconnected graph
    g.add_vertex('X')
    g.add_edge('X', 'Y')
    g.add_vertex('Z')  # Isolated vertex

    print("All components (disconnected graph):")
    components = g.dfs_all_vertices()
    for i, comp in enumerate(components):
        print(f"  Component {i + 1}: {comp}")

This implementation includes: (1) Support for both directed and undirected graphs, (2) Callback functions for custom processing, (3) Handling of disconnected components, (4) Input validation with clear error messages, (5) Both recursive and iterative methods producing identical traversal orders.
We've established a deep understanding of DFS mechanics. Let's consolidate the key insights:
- DFS goes deep before wide: it commits to one branch fully, then backtracks.
- Recursive DFS leans on the call stack; iterative DFS makes the same stack explicit on the heap, removing recursion depth limits.
- Pushing neighbors in reverse order makes the iterative traversal match the recursive one.
- "Mark on push" keeps each vertex on the stack at most once; "mark on pop" is simpler but can store duplicates.
- With an adjacency list, DFS runs in O(V + E) time; an adjacency matrix costs O(V²).
What's Next:
With the DFS algorithm firmly understood, we'll explore its core mechanism: stack-based exploration. The next page examines how the stack determines traversal order, why DFS naturally produces certain orderings, and how this stack behavior enables powerful applications like backtracking and topological sorting.
You now have a thorough understanding of DFS mechanics in both recursive and iterative forms. You can implement DFS from scratch, trace its execution, and choose the appropriate version for any situation. Next, we'll dive deeper into the stack behavior that gives DFS its unique character.