Loading learning content...
One of the most powerful skills in algorithm design is the ability to see graphs where they aren't explicitly present. Many problems that seem unrelated to graphs—puzzles, optimization tasks, configuration problems—can be elegantly transformed into graph problems, unlocking decades of refined graph algorithms.
This transformation isn't trivial. It requires a trained eye to recognize relationships as edges, entities as vertices, and constraints as graph properties. But once mastered, this skill becomes a superpower—turning seemingly intractable problems into applications of well-known graph traversals, shortest paths, or connectivity algorithms.
By the end of this page, you will understand how to systematically identify graph structures in problems that don't explicitly mention graphs. You'll learn to define vertices and edges, recognize implicit graph structures, and apply this modeling technique to solve problems ranging from word puzzles to flight planning to social network analysis.
Before diving into techniques, let's establish the fundamental mindset for graph modeling. At its core, a graph is nothing more than a representation of pairwise relationships. The key insight is this:
If your problem involves entities that have relationships with each other, there's likely a graph hiding inside.
This simple principle has profound implications. Consider:
| Problem Domain | Entities → Vertices | Relationships → Edges | Properties → Weights |
|---|---|---|---|
| Social Network | People | Friendships | Interaction frequency |
| Road Network | Intersections | Roads | Distance or travel time |
| Flight Planning | Cities/Airports | Direct flights | Flight duration or cost |
| Word Ladder Puzzle | Words | Differ by one letter | Number of changes |
| Task Scheduling | Tasks | Dependencies | Time to complete |
| Web Crawling | Web pages | Hyperlinks | Link relevance |
When faced with any problem, ask yourself: 'What are the things (vertices)? What connects them (edges)? What's important about those connections (weights)?' These three questions form the foundation of all graph modeling.
Converting a problem to a graph requires a structured approach. Here's a battle-tested framework used by experienced engineers and competitive programmers:
Step 1: Identify the Decision Space
Ask: "What am I trying to find or optimize?" The answer often reveals what your vertices should represent. If you're finding the shortest path between configurations, each configuration is a vertex. If you're finding the best sequence of actions, states resulting from actions are vertices.
Step 2: Define the Vertices Precisely
This is often the trickiest part. Vertices must encode all information needed to:
Ambiguity in vertex definition leads to broken algorithms.
A frequent mistake is not including enough information in the vertex definition. For example, in a maze where you can pick up keys, the vertex isn't just (row, col)—it's (row, col, keys_collected). Missing state components leads to incorrect algorithms that visit 'different' states as if they were the same.
Let's apply our framework to a classic problem that beautifully illustrates graph modeling:
The Word Ladder Problem:
Given two words
beginWordandendWord, and a dictionary of words, find the length of the shortest transformation sequence frombeginWordtoendWord, where:
- Only one letter can be changed at a time
- Each transformed word must exist in the dictionary
Example:
beginWord = "hit"endWord = "cog"dictionary = ["hot", "dot", "dog", "lot", "log", "cog"]5 (hit → hot → dot → dog → cog)At first glance, this looks nothing like a graph problem. There are no vertices, no edges, no paths mentioned. But let's apply our framework:
1234567891011121314151617181920212223242526272829303132333435363738394041424344454647484950515253
from collections import dequefrom typing import List, Set def word_ladder(begin_word: str, end_word: str, word_list: List[str]) -> int: """ Find the shortest transformation sequence length using BFS on an implicit graph. Time Complexity: O(N × M²) where N = number of words, M = word length Space Complexity: O(N × M) for the word set and queue """ # Build the vertex set (dictionary + begin word) word_set: Set[str] = set(word_list) if end_word not in word_set: return 0 # Goal vertex doesn't exist in graph # Edge definition: words differing by exactly one letter def get_neighbors(word: str) -> List[str]: """Find all vertices (words) connected to this vertex by an edge.""" neighbors = [] for i in range(len(word)): for c in 'abcdefghijklmnopqrstuvwxyz': if c != word[i]: # Create a word differing by one character at position i new_word = word[:i] + c + word[i+1:] if new_word in word_set: neighbors.append(new_word) return neighbors # BFS to find shortest path in unweighted graph queue = deque([(begin_word, 1)]) # (current_vertex, path_length) visited = {begin_word} while queue: current_word, distance = queue.popleft() # Check if we've reached the goal vertex if current_word == end_word: return distance # Explore all neighboring vertices (words differing by one letter) for neighbor in get_neighbors(current_word): if neighbor not in visited: visited.add(neighbor) queue.append((neighbor, distance + 1)) return 0 # No path exists from start to goal # Example usagebegin_word = "hit"end_word = "cog"word_list = ["hot", "dot", "dog", "lot", "log", "cog"]result = word_ladder(begin_word, end_word, word_list)print(f"Shortest transformation: {result} steps") # Output: 5Why this works:
The insight is that we've reframed string manipulation as graph traversal. Instead of thinking about letter changes, we think about moving between connected vertices. The shortest transformation sequence is simply the shortest path in our implicitly defined graph.
This transformation is powerful because:
Let's examine a more directly graph-like problem, but one where the modeling choices still matter:
The Problem:
You're given a list of flights with source city, destination city, and price. Find the cheapest route from city A to city B with at most K stops.
Example:
[("NYC", "CHI", 100), ("CHI", "LA", 200), ("NYC", "LA", 500)]300 (NYC → CHI → LA)This problem explicitly involves connections, so the graph structure is more obvious. But the constraint "at most K stops" adds a twist that affects our vertex definition.
Without the K-stop constraint, vertices would simply be cities. But with the constraint, we need to track how many stops we've used. This leads to an expanded state space where vertices are (city, stops_remaining) pairs—a classic pattern in constrained graph problems.
123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566
import heapqfrom typing import List, Tuple, Dict def find_cheapest_flight( flights: List[Tuple[str, str, int]], src: str, dst: str, k: int) -> int: """ Find cheapest route with at most k stops using modified Dijkstra's. Vertex definition: (city, stops_remaining) Edge definition: direct flights between cities Edge weight: flight price Time Complexity: O(E × K × log(V × K)) Space Complexity: O(V × K) for the expanded state space """ # Build adjacency list: city -> [(destination, price), ...] graph: Dict[str, List[Tuple[str, int]]] = {} for from_city, to_city, price in flights: if from_city not in graph: graph[from_city] = [] graph[from_city].append((to_city, price)) # State: (total_cost, city, stops_remaining) # Priority queue for Dijkstra's algorithm # k+1 because k stops means k+1 flights pq = [(0, src, k + 1)] # Track best cost to reach each (city, stops) state # This prevents revisiting worse states best: Dict[Tuple[str, int], int] = {} while pq: cost, city, remaining = heapq.heappop(pq) # Goal check: reached destination? if city == dst: return cost # Skip if we've seen this state with lower cost if (city, remaining) in best: continue best[(city, remaining)] = cost # Can't expand if no stops remaining if remaining == 0: continue # Explore all outgoing edges (flights from this city) for next_city, price in graph.get(city, []): new_cost = cost + price new_state = (next_city, remaining - 1) # Only explore if we haven't found a better path to this state if new_state not in best: heapq.heappush(pq, (new_cost, next_city, remaining - 1)) return -1 # No valid route exists # Example usageflights = [("NYC", "CHI", 100), ("CHI", "LA", 200), ("NYC", "LA", 500)]result = find_cheapest_flight(flights, "NYC", "LA", 1)print(f"Cheapest route: ${result}") # Output: 300Key Modeling Insight:
Notice how the constraint fundamentally changed our vertex definition. Without it, each city appears once. With it, each city appears up to K+1 times (once for each possible "remaining stops" value).
This is a recurring pattern called state expansion or state augmentation:
This pattern appears whenever:
With practice, you'll develop an instinct for recognizing graph problems in disguise. Here are the telltale signals:
Signal 1: "Find the shortest/minimum/fastest way to get from A to B"
This screams shortest path. The question is: what's the graph?
Signal 2: "Can you reach state X from state Y?"
This is graph reachability—either BFS/DFS or connectivity check.
Signal 3: "In how many ways can you..." (with overlapping structure)
Often a DAG (Directed Acyclic Graph) where we count paths.
Signal 4: "Find all things connected to..." or "Group similar things"
Connected components or graph clustering.
Signal 5: "What order should we do things in?"
Topological sort on a dependency graph.
| Problem Phrase | Graph Interpretation | Algorithm |
|---|---|---|
| Shortest path from A to B | Find minimum edge path | BFS (unweighted) / Dijkstra (weighted) |
| Minimum cost to transform | Weighted shortest path | Dijkstra / Bellman-Ford |
| Is it possible to reach... | Path existence / Reachability | BFS / DFS |
| Find all connected items | Connected components | Union-Find / BFS |
| Optimal ordering of tasks | Topological sort | Kahn's algorithm / DFS |
| Cycle detection | Does a cycle exist? | DFS with coloring / Union-Find |
| Minimum connections needed | Minimum spanning tree | Kruskal / Prim |
| Maximum flow/matching | Flow network | Ford-Fulkerson |
When you see 'transform A to B' or 'convert X to Y', think graphs. Each valid state is a vertex, each valid transformation is an edge. Finding the minimum/optimal transformation is finding the shortest/optimal path.
Over time, certain modeling patterns emerge repeatedly. Internalizing these patterns accelerates your problem-solving:
Pattern 1: State as Vertex
When a problem involves transitions between configurations or states, model each unique state as a vertex. Edges represent valid transitions.
Examples: Rubik's cube configurations, game board states, process states in scheduling.
Pattern 2: Entity as Vertex, Relationship as Edge
The most direct mapping—entities (people, places, objects) are vertices; relationships (knows, connects, flows) are edges.
Examples: Social networks, road networks, computer networks.
Pattern 3: Build Graph Explicitly vs Implicitly
For some problems, you build the entire graph upfront (adjacency list/matrix). For others—especially with large or infinite state spaces—you generate neighbors on-the-fly during traversal. This is called an implicit graph.
Explicit graph: When the graph is small and you'll do multiple queries on it. Implicit graph: When the state space is huge but you only explore a small fraction (e.g., puzzle solving with BFS).
We'll explore implicit graphs in depth on the next page.
To solidify your understanding, let's walk through how you would model several diverse problems as graphs:
Problem A: Minimum Genetic Mutation
A gene string is 8 characters from {A, C, G, T}. Given start gene, end gene, and a bank of valid genes, find minimum mutations to transform start to end.
Modeling:
Problem B: Open the Lock
A lock has 4 wheels, each with digits 0-9. From 0000, reach target, avoiding deadends.
Modeling:
Given a 2x3 board with tiles 1-5 and one empty space (0), slide tiles to reach the goal state [[1,2,3],[4,5,0]]. Minimum moves?
Modeling:
This is a classic example where the graph is implicit—you don't precompute all 6! = 720 configurations; you generate neighbors during BFS.
Problem D: Course Schedule
Given n courses and prerequisites [[a,b],...] meaning 'b before a', determine if you can finish all courses.
Modeling:
Problem E: Accounts Merge
Given accounts where each has a name and emails, merge accounts that share emails.
Modeling:
The ability to convert problems to graphs is a fundamental skill that opens up a vast toolkit of efficient algorithms. Let's consolidate what we've learned:
What's Next:
Now that you understand explicit graph modeling, the next page explores implicit graphs and state spaces—a powerful technique for handling problems with enormous or even infinite state spaces without constructing the entire graph in memory.
You now understand how to systematically convert problems into graph representations. This skill transforms seemingly unrelated problems into applications of well-studied graph algorithms. Next, we'll explore implicit graphs—the technique for handling massive state spaces efficiently.