Imagine navigating a city where every street is one-way. You can travel from point A to point B, but returning might require an entirely different route—or might be impossible altogether. This asymmetry, where connections have a specific direction, is the defining characteristic of directed graphs.
Directed graphs, also known as digraphs, are among the most powerful modeling tools in computer science. They capture relationships where order matters: following someone on Twitter doesn't mean they follow you back; a function calling another function doesn't imply a reverse call; a web page linking to another doesn't guarantee a reciprocal link. This inherent asymmetry makes directed graphs essential for modeling countless real-world systems.
By the end of this page, you will understand the formal definition and notation of directed graphs, recognize their applications in software systems and algorithms, master essential digraph-specific concepts like in-degree and out-degree, and appreciate why direction fundamentally changes how we reason about graph problems.
A directed graph (or digraph) is a mathematical structure that formalizes the concept of one-way relationships between entities. Unlike undirected graphs where edges represent symmetric connections, directed graphs capture asymmetric relationships where the direction of the connection carries meaning.
Formal Definition:
A directed graph G is defined as an ordered pair G = (V, E), where:
V is a finite set of vertices (also called nodes), and
E ⊆ V × V is a set of directed edges, each an ordered pair (u, v) of vertices.
The crucial distinction from undirected graphs lies in the word ordered. In a directed edge (u, v), the vertex u is the source (or tail), and v is the target (or head). This edge represents a connection from u to v, but explicitly not from v to u.
In directed graphs, (u, v) ≠ (v, u). The edge (A, B) means 'A connects to B,' while (B, A) means 'B connects to A.' These are distinct edges. In contrast, undirected graphs treat {A, B} and {B, A} as the same edge.
Example:
Consider a directed graph with V = {A, B, C, D} and E = {(A, B), (A, C), (B, C), (C, D), (D, B)}.
In this graph:
A points to B and C but has no incoming edges, making it a source.
B points to C and is pointed to by A and D.
C points to D and is pointed to by A and B.
D points to B and is pointed to by C, so B→C→D→B forms a directed cycle.
This creates interesting reachability patterns. You can get from A to D (via A→C→D), but you cannot get from D to A through any path—the graph has no directed path from D to A.
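To make the asymmetry tangible, here is a small sketch of this example graph in Python. Representing E as a set of ordered tuples is just for illustration; the rest of this page uses adjacency lists, so the sketch also shows the equivalent adjacency-list form.

```python
# The example digraph: V = {A, B, C, D}, E = {(A,B), (A,C), (B,C), (C,D), (D,B)}
V = {"A", "B", "C", "D"}
E = {("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "B")}

# Direction matters: (A, B) is an edge, but (B, A) is not.
print(("A", "B") in E)   # True
print(("B", "A") in E)   # False

# Equivalent adjacency-list view: each vertex maps to its out-neighbors.
adj = {v: [w for (u, w) in E if u == v] for v in V}
print(adj["A"])          # ['B', 'C'] (order may vary)
print(adj["D"])          # ['B']
```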
| Term | Definition | Example |
|---|---|---|
| Directed Edge (Arc) | An ordered pair (u, v) representing a connection from u to v | (A, B) means A points to B |
| Source (Tail) | The starting vertex of a directed edge | In (A, B), A is the source |
| Target (Head) | The ending vertex of a directed edge | In (A, B), B is the target |
| Out-neighbor | A vertex v such that (u, v) ∈ E | B is an out-neighbor of A if (A, B) exists |
| In-neighbor | A vertex u such that (u, v) ∈ E | A is an in-neighbor of B if (A, B) exists |
| Directed Path | A sequence of vertices where each consecutive pair forms a directed edge | A→B→C is a path if (A,B) and (B,C) exist |
| Directed Cycle | A directed path that starts and ends at the same vertex | B→C→D→B if (B,C), (C,D), (D,B) exist |
In undirected graphs, each vertex has a single degree—the number of edges incident to it. Directed graphs require a more nuanced concept because edges have direction. We distinguish between edges entering a vertex and edges leaving a vertex.
Out-Degree (deg⁺(v)): The out-degree of a vertex v is the number of edges that have v as their source. Formally:
deg⁺(v) = |{(v, u) ∈ E : u ∈ V}|
This counts how many vertices you can reach directly from v in one step.
In-Degree (deg⁻(v)): The in-degree of a vertex v is the number of edges that have v as their target. Formally:
deg⁻(v) = |{(u, v) ∈ E : u ∈ V}|
This counts how many vertices can reach v directly in one step.
In any directed graph, the sum of all in-degrees equals the sum of all out-degrees, and both equal the total number of edges: Σdeg⁺(v) = Σdeg⁻(v) = |E|. Every edge contributes exactly 1 to some vertex's out-degree and exactly 1 to some vertex's in-degree.
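As a quick sanity check of this identity, here is a short sketch (using the adjacency-list convention adopted throughout this page) that sums both kinds of degree and compares them with the edge count:

```python
# A quick check of the degree-sum identity on a small digraph.
# Adjacency-list convention: each vertex maps to its list of out-neighbors.
graph = {"A": ["B", "C"], "B": ["C"], "C": ["D"], "D": ["B"]}

num_edges = sum(len(neighbors) for neighbors in graph.values())          # |E|
sum_out = sum(len(graph[v]) for v in graph)                              # Σ deg⁺(v)
sum_in = sum(sum(ns.count(v) for ns in graph.values()) for v in graph)   # Σ deg⁻(v)

assert sum_out == sum_in == num_edges == 5
print(sum_out, sum_in, num_edges)  # 5 5 5
```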
Special Vertices Based on Degree:
The in-degree and out-degree of vertices reveal their structural role in the graph:
Source vertex: A vertex with in-degree 0 (no incoming edges). No other vertex points to it. In a dependency graph, this represents a task with no prerequisites.
Sink vertex: A vertex with out-degree 0 (no outgoing edges). It doesn't point to any other vertex. In a web graph, this might be a page with no outbound links.
Isolated vertex: A vertex with both in-degree and out-degree equal to 0. It has no connections whatsoever.
Hub vertex: Informally, a vertex with high out-degree, pointing to many other vertices.
Authority vertex: Informally, a vertex with high in-degree, pointed to by many other vertices.
```python
def calculate_degrees(adjacency_list):
    """
    Calculate in-degree and out-degree for all vertices.

    Args:
        adjacency_list: Dict mapping each vertex to its list of out-neighbors

    Returns:
        Tuple of (in_degrees, out_degrees) dictionaries
    """
    vertices = set(adjacency_list.keys())

    # Initialize degrees
    in_degrees = {v: 0 for v in vertices}
    out_degrees = {v: 0 for v in vertices}

    # Count degrees
    for source, neighbors in adjacency_list.items():
        out_degrees[source] = len(neighbors)
        for target in neighbors:
            if target in in_degrees:
                in_degrees[target] += 1

    return in_degrees, out_degrees


def find_sources_and_sinks(adjacency_list):
    """
    Find all source vertices (in-degree 0) and sink vertices (out-degree 0)
    in a directed graph.
    """
    in_deg, out_deg = calculate_degrees(adjacency_list)

    sources = [v for v, deg in in_deg.items() if deg == 0]
    sinks = [v for v, deg in out_deg.items() if deg == 0]

    return sources, sinks


# Example usage
graph = {
    'A': ['B', 'C'],
    'B': ['C'],
    'C': ['D'],
    'D': ['B']
}

in_deg, out_deg = calculate_degrees(graph)
print("In-degrees:", in_deg)    # {'A': 0, 'B': 2, 'C': 2, 'D': 1}
print("Out-degrees:", out_deg)  # {'A': 2, 'B': 1, 'C': 1, 'D': 1}

sources, sinks = find_sources_and_sinks(graph)
print("Sources:", sources)  # ['A']
print("Sinks:", sinks)      # []
```

Directed graphs are ubiquitous in computing and beyond. The asymmetric nature of their edges makes them the natural choice for modeling relationships where direction carries semantic meaning. Let's explore the most significant applications across different domains.
Call graphs: Compilers and static-analysis tools model code as a digraph in which each function is a vertex; when foo() calls bar(), there's an edge from foo to bar. This enables dead code detection, inlining decisions, and cycle detection (which may indicate recursion or mutual recursion).

| Domain | Vertices | Edges | Key Algorithms |
|---|---|---|---|
| Web Graph | Web pages | Hyperlinks | PageRank, HITS, Crawling |
| Dependencies | Packages/tasks | Dependencies | Topological sort, Cycle detection |
| Social (asymmetric) | Users | Follows | Influence propagation, Community detection |
| Call Graphs | Functions | Calls | Reachability, Dead code elimination |
| State Machines | States | Transitions | DFS/BFS traversal, Minimization |
| Citation Networks | Papers | Citations | Impact analysis, Trend detection |
When analyzing a real-world problem, ask: 'If entity A relates to entity B, does B necessarily relate to A in the same way?' If not—if the relationship is inherently one-directional—you're dealing with a directed graph problem.
The concept of reachability in directed graphs is fundamentally different from undirected graphs. In an undirected connected graph, you can reach any vertex from any other vertex. In directed graphs, connectivity is far more nuanced—you might reach vertex B from A but not reach A from B.
Directed Path:
A directed path from vertex u to vertex v is a sequence of vertices (u = v₀, v₁, v₂, ..., vₖ = v) such that for each consecutive pair (vᵢ, vᵢ₊₁), the directed edge (vᵢ, vᵢ₊₁) exists in E. The direction of edges must be respected.
Reachability:
Vertex v is reachable from vertex u if there exists a directed path from u to v. We denote this as u ⟿ v. Note that u ⟿ v does NOT imply v ⟿ u in general.
Transitive Closure:
The transitive closure of a directed graph G = (V, E) is a graph G* = (V, E*) where (u, v) ∈ E* if and only if there exists a directed path from u to v in G. The transitive closure makes reachability queries O(1) after O(V³) preprocessing—useful when you need many reachability queries.
Practical Implications:
Consider a dependency graph where edges represent 'depends on.' If package A depends on B, and B depends on C, then A transitively depends on C. The transitive closure reveals all transitive dependencies, critical for determining what needs to be installed or rebuilt.
```python
from collections import deque


def find_reachable_vertices(graph, start):
    """
    Find all vertices reachable from a starting vertex using BFS.
    Respects edge direction.

    Args:
        graph: Adjacency list representation
        start: Starting vertex

    Returns:
        Set of all reachable vertices (including start)
    """
    reachable = set()
    queue = deque([start])
    reachable.add(start)

    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, []):
            if neighbor not in reachable:
                reachable.add(neighbor)
                queue.append(neighbor)

    return reachable


def is_reachable(graph, source, target):
    """
    Check if target is reachable from source.
    """
    return target in find_reachable_vertices(graph, source)


def compute_transitive_closure(graph):
    """
    Compute the transitive closure using Floyd-Warshall.
    Returns a set of all (u, v) pairs where u can reach v.

    Time Complexity: O(V³)
    Space Complexity: O(V²)
    """
    vertices = list(graph.keys())
    n = len(vertices)
    vertex_to_idx = {v: i for i, v in enumerate(vertices)}

    # Initialize reachability matrix
    reach = [[False] * n for _ in range(n)]

    # Direct edges
    for u, neighbors in graph.items():
        i = vertex_to_idx[u]
        reach[i][i] = True  # Self-reachability
        for v in neighbors:
            j = vertex_to_idx.get(v, -1)
            if j != -1:
                reach[i][j] = True

    # Floyd-Warshall for transitive closure
    for k in range(n):
        for i in range(n):
            for j in range(n):
                reach[i][j] = reach[i][j] or (reach[i][k] and reach[k][j])

    # Convert back to vertex pairs
    closure = set()
    for i in range(n):
        for j in range(n):
            if reach[i][j]:
                closure.add((vertices[i], vertices[j]))

    return closure


# Example
graph = {'A': ['B', 'C'], 'B': ['C'], 'C': ['D'], 'D': ['B']}

print("Reachable from A:", find_reachable_vertices(graph, 'A'))
# Output: {'A', 'B', 'C', 'D'}

print("Reachable from D:", find_reachable_vertices(graph, 'D'))
# Output: {'D', 'B', 'C'} - Note: A is not reachable from D!
```

Directed Cycles:
A directed cycle is a directed path that starts and ends at the same vertex, with at least one edge. Formally, it's a sequence (v₀, v₁, ..., vₖ) where k ≥ 1, edges (vᵢ, vᵢ₊₁) exist for all i, and v₀ = vₖ.
Cycles in directed graphs have profound implications:
In dependency graphs, a cycle means a circular dependency—there is no valid order in which to build or install everything.
In call graphs, a cycle indicates recursion or mutual recursion.
In resource-allocation graphs, a cycle can signal deadlock.
A directed graph admits a topological ordering if and only if it contains no directed cycle.
Directed Acyclic Graphs (DAGs):
A DAG is a directed graph with no directed cycles. DAGs are extraordinarily important because:
Every DAG admits at least one topological ordering—a linear ordering of the vertices in which every edge points "forward."
Every non-empty DAG has at least one source and at least one sink.
Problems that are hard on general digraphs, such as longest path, become tractable on DAGs via dynamic programming over a topological order.
They are the natural model for dependency and scheduling problems, where a cycle would make the problem unsolvable.
DAGs hit a sweet spot: they're more expressive than trees (a node can have multiple parents), but more structured than arbitrary digraphs (no cycles). This makes them ideal for version control (Git's commit graph), expression evaluation, data pipelines, and scheduling problems.
Cycle Detection in Directed Graphs:
Detecting cycles in directed graphs is subtly different from undirected graphs. In undirected graphs, any back edge during DFS indicates a cycle. In directed graphs, we must be more careful—we need to track whether a vertex is currently in our DFS recursion stack.
We classify edges during DFS into:
Tree edges: edges to a vertex visited for the first time; these form the DFS tree.
Back edges: edges to a vertex still on the current recursion stack (an ancestor in the DFS tree).
Forward edges: edges to an already-finished descendant in the DFS tree.
Cross edges: edges to an already-finished vertex that is neither an ancestor nor a descendant.
A directed graph has a cycle if and only if DFS encounters a back edge.
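Here is an optional sketch (separate from the cycle detector below) that labels every edge with its DFS classification; the "gray" state plays the role of the recursion stack:

```python
def classify_edges(graph):
    """
    Label each directed edge as a tree, back, forward, or cross edge during DFS.
    WHITE = unvisited, GRAY = on the recursion stack, BLACK = finished.
    """
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in graph}
    disc = {}       # discovery time of each vertex
    time = [0]
    labels = {}     # (u, v) -> edge type

    def dfs(u):
        color[u] = GRAY
        disc[u] = time[0]
        time[0] += 1
        for v in graph.get(u, []):
            if v not in color:               # vertex appearing only as a target
                color[v] = WHITE
            if color[v] == WHITE:
                labels[(u, v)] = "tree"
                dfs(v)
            elif color[v] == GRAY:
                labels[(u, v)] = "back"      # cycle detected
            elif disc[u] < disc[v]:
                labels[(u, v)] = "forward"   # finished descendant
            else:
                labels[(u, v)] = "cross"     # finished non-descendant
        color[u] = BLACK

    for v in list(graph):
        if color[v] == WHITE:
            dfs(v)
    return labels


# On the running example (which contains the cycle B -> C -> D -> B),
# the edge D -> B is labeled a back edge.
example = {'A': ['B', 'C'], 'B': ['C'], 'C': ['D'], 'D': ['B']}
print(classify_edges(example))
```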
```python
def has_cycle_directed(graph):
    """
    Detect if a directed graph contains a cycle.
    Uses DFS with recursion-stack tracking.

    Args:
        graph: Adjacency list (dict of vertex -> list of neighbors)

    Returns:
        True if cycle exists, False otherwise
    """
    # States: 0 = unvisited, 1 = in current path, 2 = completed
    state = {v: 0 for v in graph}

    def dfs(vertex):
        if state[vertex] == 1:
            # Back edge: cycle found
            return True
        if state[vertex] == 2:
            # Already processed: no cycle here
            return False

        state[vertex] = 1  # Mark as being processed

        for neighbor in graph.get(vertex, []):
            if neighbor in state and dfs(neighbor):
                return True

        state[vertex] = 2  # Mark as completed
        return False

    # Check from all vertices (graph may be disconnected)
    for vertex in graph:
        if state[vertex] == 0:
            if dfs(vertex):
                return True

    return False


def topological_sort(graph):
    """
    Returns a topological ordering of a DAG.
    Raises ValueError if the graph contains a cycle.
    Uses Kahn's algorithm (BFS-based).
    """
    # Calculate in-degrees
    in_degree = {v: 0 for v in graph}
    for v, neighbors in graph.items():
        for neighbor in neighbors:
            if neighbor in in_degree:
                in_degree[neighbor] += 1

    # Start with all sources (in-degree 0)
    queue = [v for v, deg in in_degree.items() if deg == 0]
    result = []

    while queue:
        vertex = queue.pop(0)
        result.append(vertex)

        for neighbor in graph.get(vertex, []):
            if neighbor in in_degree:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)

    if len(result) != len(graph):
        raise ValueError("Graph contains a cycle - no topological order exists")

    return result


# Example: DAG
dag = {'A': ['B', 'C'], 'B': ['D'], 'C': ['D'], 'D': []}
print("Has cycle:", has_cycle_directed(dag))        # False
print("Topological order:", topological_sort(dag))  # ['A', 'B', 'C', 'D'] or similar

# Example: Graph with cycle
cyclic = {'A': ['B'], 'B': ['C'], 'C': ['A']}
print("Has cycle:", has_cycle_directed(cyclic))     # True
```

In undirected graphs, we talk about connected components—maximal sets of vertices where every pair is connected by a path. Directed graphs require a more nuanced concept: strongly connected components (SCCs).
Definition:
A strongly connected component of a directed graph is a maximal set of vertices such that for every pair of vertices u and v in the set, both u ⟿ v (u can reach v) AND v ⟿ u (v can reach u) hold.
In other words, within an SCC, you can travel between any two vertices in both directions. Every vertex can reach every other vertex, and vice versa.
Key Properties:
Every vertex belongs to exactly one SCC; the SCCs partition the vertex set.
A vertex that lies on no directed cycle forms an SCC by itself.
Contracting each SCC to a single vertex produces the condensation of the graph, which is always a DAG.
Once you decompose a graph into SCCs and form the condensation DAG, you've reduced an arbitrary digraph problem to a DAG problem. This is a powerful technique: analyze the macro-level structure between components (DAG structure), then handle intra-component details (each SCC is strongly connected).
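To illustrate that technique, here is a small helper sketch (a hypothetical build_condensation function, not part of the SCC code below) that collapses a graph into its condensation DAG, given a precomputed list of SCCs:

```python
def build_condensation(graph, sccs):
    """
    Build the condensation DAG of a directed graph.

    Args:
        graph: Adjacency list (dict of vertex -> list of neighbors)
        sccs:  List of SCCs, each a list of vertices (e.g. from Kosaraju's algorithm)

    Returns:
        Adjacency list of the condensation: one node per SCC index, with an
        edge i -> j whenever some original edge goes from SCC i to a different SCC j.
    """
    # Map each original vertex to the index of its SCC
    scc_of = {}
    for idx, component in enumerate(sccs):
        for v in component:
            scc_of[v] = idx

    condensation = {idx: set() for idx in range(len(sccs))}
    for u, neighbors in graph.items():
        for v in neighbors:
            cu, cv = scc_of[u], scc_of[v]
            if cu != cv:                  # ignore edges inside an SCC
                condensation[cu].add(cv)

    # Convert neighbor sets to sorted lists for a stable result
    return {idx: sorted(targets) for idx, targets in condensation.items()}


# Tiny example: two 2-vertex cycles joined by a single edge B -> C
g = {'A': ['B'], 'B': ['A', 'C'], 'C': ['D'], 'D': ['C']}
sccs = [['A', 'B'], ['C', 'D']]           # assumed precomputed
print(build_condensation(g, sccs))        # {0: [1], 1: []}
```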
Finding SCCs:
Two classic linear-time algorithms find all SCCs:
Kosaraju's Algorithm: Run two DFS passes—one on the original graph, one on the transpose (reversed edges)—using finish times from the first pass to order the second pass.
Tarjan's Algorithm: Single DFS with low-link values to identify SCC roots during the traversal.
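The implementation below uses Kosaraju's algorithm; for comparison, here is a compact sketch of Tarjan's algorithm (recursive, so it assumes the graph is small enough to stay within Python's recursion limit):

```python
def find_sccs_tarjan(graph):
    """
    Find all strongly connected components using Tarjan's algorithm.
    Single DFS; a vertex whose low-link equals its own index is the root of an SCC.

    Time Complexity: O(V + E)
    """
    index = {}        # discovery index of each vertex
    low = {}          # low-link value of each vertex
    on_stack = set()
    stack = []
    counter = [0]
    sccs = []

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)

        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])

        if low[v] == index[v]:            # v is the root of an SCC
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            sccs.append(component)

    for v in graph:
        if v not in index:
            strongconnect(v)
    return sccs


# Same example graph as the Kosaraju code below
graph = {
    'A': ['B'], 'B': ['C', 'E', 'F'], 'C': ['D', 'G'], 'D': ['C', 'H'],
    'E': ['A', 'F'], 'F': ['G'], 'G': ['F', 'H'], 'H': ['H']
}
print(find_sccs_tarjan(graph))
# e.g. [['H'], ['F', 'G'], ['D', 'C'], ['E', 'B', 'A']] (order within SCCs may vary)
```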
Applications of SCCs:
Solving 2-SAT in linear time by analyzing the implication graph.
Detecting deadlocks and circular dependencies in systems and build tools.
Analyzing the large-scale structure of the web graph, including its giant strongly connected core.
Grouping mutually recursive functions in compilers so they can be analyzed or optimized together.
```python
def find_sccs_kosaraju(graph):
    """
    Find all strongly connected components using Kosaraju's algorithm.

    Args:
        graph: Adjacency list (dict of vertex -> list of neighbors)

    Returns:
        List of SCCs, where each SCC is a list of vertices

    Time Complexity: O(V + E)
    Space Complexity: O(V + E)
    """
    # Step 1: Build transpose graph
    vertices = set(graph.keys())
    for neighbors in graph.values():
        vertices.update(neighbors)

    transpose = {v: [] for v in vertices}
    for u, neighbors in graph.items():
        for v in neighbors:
            transpose[v].append(u)

    # Step 2: First DFS to get finish order
    visited = set()
    finish_order = []

    def dfs1(vertex):
        visited.add(vertex)
        for neighbor in graph.get(vertex, []):
            if neighbor not in visited:
                dfs1(neighbor)
        finish_order.append(vertex)

    for vertex in vertices:
        if vertex not in visited:
            dfs1(vertex)

    # Step 3: Second DFS on transpose in reverse finish order
    visited.clear()
    sccs = []

    def dfs2(vertex, component):
        visited.add(vertex)
        component.append(vertex)
        for neighbor in transpose[vertex]:
            if neighbor not in visited:
                dfs2(neighbor, component)

    for vertex in reversed(finish_order):
        if vertex not in visited:
            component = []
            dfs2(vertex, component)
            sccs.append(component)

    return sccs


# Example
graph = {
    'A': ['B'],
    'B': ['C', 'E', 'F'],
    'C': ['D', 'G'],
    'D': ['C', 'H'],
    'E': ['A', 'F'],
    'F': ['G'],
    'G': ['F', 'H'],
    'H': ['H']
}

sccs = find_sccs_kosaraju(graph)
print("Strongly Connected Components:")
for i, scc in enumerate(sccs, 1):
    print(f"  SCC {i}: {scc}")

# Output (order may vary):
# SCC 1: ['H']
# SCC 2: ['G', 'F']
# SCC 3: ['D', 'C']
# SCC 4: ['E', 'B', 'A']
```

We've explored directed graphs comprehensively—from their formal mathematical foundation to practical algorithms and real-world applications. Let's consolidate the key concepts:
Directed edges are ordered pairs: (u, v) connects u to v, not the reverse.
Each vertex has both an in-degree and an out-degree, and their sums each equal |E|.
Reachability is asymmetric: u ⟿ v does not imply v ⟿ u.
Cycle-free digraphs (DAGs) admit topological orderings, making them ideal for dependencies and scheduling.
Strongly connected components partition any digraph, and their condensation is always a DAG.
What's Next:
We've mastered directed graphs with their one-way edges. Next, we'll explore undirected graphs, where every edge is bidirectional. While simpler in some ways (no in/out-degree distinction), undirected graphs have their own rich properties and application domains. Understanding both types—and when to use each—is essential for effective graph modeling.
You now have a thorough understanding of directed graphs: their formal definition, degree concepts, reachability, cycles, DAGs, and strongly connected components. You can recognize when a problem requires directed graph modeling and apply appropriate algorithms to analyze digraph structure.