Imagine you're building a word puzzle game—something like Boggle or a crossword validator. You have a 2D grid of characters and a dictionary of valid words. Your task is to find all words from the dictionary that can be formed by starting at any cell and traversing adjacent cells (horizontally, vertically, or diagonally) without revisiting cells.
This problem appears everywhere: word games, text mining, pattern matching, and is a favorite in technical interviews because it tests multiple skills simultaneously—graph traversal, backtracking, data structure selection, and algorithmic optimization.
The naive approach—check each word individually—quickly becomes catastrophically slow. But with tries, we transform this from an intractable problem into an elegant, efficient solution. This is one of the most powerful and instructive applications of the trie data structure.
By the end of this page, you will understand: (1) Why naive word search fails at scale, (2) How tries enable searching for multiple words simultaneously, (3) The complete implementation of trie-optimized grid search, (4) Time and space complexity analysis, and (5) Optimization techniques used in production systems.
Let's formalize the Word Search in Grid problem, often called Word Search II in competitive programming contexts.
Problem Statement:
Given:
- An m × n grid board of characters
- k words from a dictionary

Find all words that can be constructed by sequentially connecting adjacent cells. Adjacent cells are those horizontally, vertically, or diagonally neighboring (8 directions, or 4 directions in simpler variants). Each cell may be used at most once per word.
Example:
Grid (3×4):

```
┌───┬───┬───┬───┐
│ o │ a │ a │ n │
├───┼───┼───┼───┤
│ e │ t │ a │ e │
├───┼───┼───┼───┤
│ i │ h │ k │ r │
└───┴───┴───┴───┘
```

Dictionary: ["oath", "pea", "eat", "rain", "oat", "eta"]

Found words: ["oath", "eat", "oat", "eta"]

Paths:
- "oath": (0,0)→(0,1)→(1,1)→(2,1) [o→a→t→h] ✓
- "eat": (1,0)→(0,1)→(1,1) [e→a→t] ✓
- "oat": (0,0)→(0,1)→(1,1) [o→a→t] ✓
- "eta": (1,0)→(1,1)→(0,1) [e→t→a] ✓

Why This Problem Is Challenging:
The challenge lies in the combinatorial explosion of possible paths. From each cell, you can move to up to 8 neighbors. For a path of length L, the number of possible paths from a single starting cell is roughly 8^L. Multiply this across all m × n starting cells and all k words in the dictionary.
With a 10×10 grid and 1000 words averaging 5 characters each, the naive approach performs approximately:
100 (cells) × 1000 (words) × 8^5 (paths) ≈ 3.3 billion operations
This is clearly unacceptable. We need a smarter approach.
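As a quick sanity check, the estimate above can be computed directly (a throwaway sketch, not part of the solution):

```python
# Back-of-the-envelope check of the naive cost estimate above.
cells = 10 * 10            # 10×10 grid
num_words = 1000           # dictionary size
paths_per_start = 8 ** 5   # 8 directions, average word length 5

total_ops = cells * num_words * paths_per_start
print(f"{total_ops:,}")    # 3,276,800,000, i.e. roughly 3.3 billion
```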
Before appreciating the trie optimization, let's understand exactly why the straightforward approach fails.
The Naive Strategy:
For each word in the dictionary:
```python
# Naive Approach: O(k × m × n × 8^L) - INEFFICIENT
def find_words_naive(board: list[list[str]], words: list[str]) -> list[str]:
    """
    Naive approach: Check each word individually using DFS.

    Time: O(k × m × n × 8^L) where L = max word length
    Space: O(L) for recursion stack
    """
    m, n = len(board), len(board[0])
    directions = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
                  (0, 1), (1, -1), (1, 0), (1, 1)]
    found = []

    def dfs(row: int, col: int, word_index: int, word: str, visited: set) -> bool:
        # Base case: word completely matched
        if word_index == len(word):
            return True

        # Bounds check and character match
        if (row < 0 or row >= m or col < 0 or col >= n or
                (row, col) in visited or board[row][col] != word[word_index]):
            return False

        # Mark as visited
        visited.add((row, col))

        # Try all 8 directions
        for dr, dc in directions:
            if dfs(row + dr, col + dc, word_index + 1, word, visited):
                visited.remove((row, col))  # Backtrack
                return True

        # Backtrack
        visited.remove((row, col))
        return False

    # Check each word individually - THIS IS THE PROBLEM
    for word in words:
        word_found = False
        for i in range(m):
            for j in range(n):
                if board[i][j] == word[0]:  # Start only if first char matches
                    if dfs(i, j, 0, word, set()):
                        word_found = True
                        break
            if word_found:
                break
        if word_found:
            found.append(word)

    return found
```

The naive approach performs a complete DFS for every single word. If 100 words share the prefix 'pre', we perform 100 separate, redundant searches to find 'pre' in the grid. This duplication of effort is catastrophic at scale.
Complexity Analysis of Naive Approach:
| Component | Complexity |
|---|---|
| Words to check | O(k) |
| Starting positions | O(m × n) |
| Paths per position | O(8^L) where L = word length |
| Total Time | O(k × m × n × 8^L) |
| Space (recursion) | O(L) |
The Fundamental Problem:
We're treating each word as an isolated search. But words share structure—they share prefixes! The words 'cat', 'car', 'card', 'care' all start with 'ca'. When we find 'ca' in the grid, we should be able to check all four words simultaneously, not search four separate times.
The key insight that transforms this problem is recognizing that tries naturally represent shared prefixes as shared paths.
Instead of searching for words one at a time, we: (1) build a single trie containing every dictionary word, (2) run one DFS from each grid cell, moving through the trie in lockstep with the grid, and (3) record a word whenever the walk reaches an end-of-word node.
This is a paradigm shift: we go from k separate word searches to one grid traversal with k-word awareness.
Visual: Trie Enables Prefix Sharing
Consider the dictionary: ['cat', 'car', 'card', 'care', 'cards']
```
(root)
   │
   c
   │
   a
 ┌─┴──┐
 t*   r*
    ┌─┴──┐
    d*   e*
    │
    s*
```

Nodes marked with * indicate end-of-word. When exploring the grid:

- Find 'c' → enter trie at 'c'
- Find 'a' → move to the 'a' node (both the 'cat' and 'car' branches remain live)
- Find 't' → move to 't*' → found "cat"! Continue searching...
- Find 'r' → move to 'r*' → found "car"! Continue for 'd' and 'e'...

ONE traversal checks ALL words simultaneously!

When the trie node has no child for the current character, we immediately prune that search path. If no dictionary word starts with 'cx', we never explore past 'cx' in the grid, regardless of how many cells we'd otherwise visit. This pruning is the source of the dramatic speedup.
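To make the prefix sharing concrete, here is a minimal sketch using a plain dict-based trie (illustrative only; the specialized TrieNode class introduced below uses a richer node type). It counts how many character nodes the dictionary ['cat', 'car', 'card', 'care', 'cards'] actually needs:

```python
# A tiny dict-based trie: each node is a dict of children; the '$' key
# marks end-of-word. (Illustrative sketch, not the lesson's TrieNode.)
def build(words):
    root = {}
    for w in words:
        node = root
        for ch in w:
            node = node.setdefault(ch, {})
        node['$'] = w  # end-of-word marker
    return root

def count_nodes(node):
    """Count character nodes in the trie (excluding the root)."""
    return sum(1 + count_nodes(c) for k, c in node.items() if k != '$')

words = ['cat', 'car', 'card', 'care', 'cards']
trie = build(words)
print(sum(len(w) for w in words))  # 19 characters across the five words
print(count_nodes(trie))           # 7 trie nodes, thanks to shared prefixes
```

Nineteen characters collapse into seven nodes: every word reuses the 'c'→'a' path, so a grid search checks that prefix once instead of five times.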
The first step is constructing a trie from the dictionary. For word search problems, we need a trie that stores the complete word at each end-of-word node (for easy result collection), offers O(1) child lookup, and tracks how many words pass through each node (to enable pruning).
Let's build this specialized trie:
```python
class TrieNode:
    """
    A trie node optimized for word search problems.

    Key features:
    - 'children' dict for O(1) child lookup
    - 'word' stores the complete word at end nodes (for easy result collection)
    - 'count' tracks number of words passing through (for optimization)
    """
    __slots__ = ['children', 'word', 'count']  # Memory optimization

    def __init__(self):
        self.children: dict[str, 'TrieNode'] = {}
        self.word: str | None = None  # Non-None if this node marks end of a word
        self.count: int = 0           # Number of words passing through this node


class Trie:
    """
    Trie data structure for word search optimization.

    Supports:
    - insert(word): Add a word to the trie
    - Word storage at end nodes for result collection
    - Count tracking for optimization during search
    """

    def __init__(self):
        self.root = TrieNode()

    def insert(self, word: str) -> None:
        """
        Insert a word into the trie.

        Time: O(L) where L = len(word)
        Space: O(L) in worst case (no prefix sharing)
        """
        node = self.root
        for char in word:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
            node.count += 1  # Track words passing through
        node.word = word  # Store complete word at end node

    def build_from_words(self, words: list[str]) -> None:
        """
        Build trie from a list of words.

        Time: O(sum of all word lengths)
        """
        for word in words:
            self.insert(word)


# Example usage:
words = ["oath", "pea", "eat", "rain", "oat", "eta"]
trie = Trie()
trie.build_from_words(words)

# The trie structure now looks like:
#
#        (root)
#       /   |   \
#      o    p    e ...
#      |    |
#      a    e
#      |    |
#      t*   a*
#     /
#    h*
#
# Where * indicates node.word is set to the complete word
```

By storing the complete word at end-of-word nodes (node.word), we avoid reconstructing the word during DFS. When we reach an end-of-word node during grid traversal, we simply add node.word to our results. This is cleaner and more efficient than tracking the path.
The count Field Explained:
The count field tracks how many not-yet-found dictionary words pass through each node. When a word is discovered, we decrement count along its path; once a node's count reaches zero, no undiscovered words remain below it, and the entire branch can be deleted from the trie. Combined with clearing node.word after a match, which prevents reporting the same word twice, this makes the search accelerate as words are discovered.
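Here is a minimal, self-contained sketch of count-based pruning (hypothetical helper names, simpler than the full solution that follows):

```python
class Node:
    """Trie node with a pass-through counter (illustrative sketch)."""
    def __init__(self):
        self.children: dict[str, "Node"] = {}
        self.word: str | None = None
        self.count = 0  # words not yet found that pass through this node

def insert(root: Node, word: str) -> None:
    node = root
    for ch in word:
        node = node.children.setdefault(ch, Node())
        node.count += 1
    node.word = word

def mark_found(root: Node, word: str) -> None:
    """Decrement counts along the word's path after it is found."""
    node = root
    for ch in word:
        node = node.children[ch]
        node.count -= 1

root = Node()
for w in ["oat", "oath"]:
    insert(root, w)

mark_found(root, "oat")   # 'oath' still pending: counts on o→a→t stay positive
mark_found(root, "oath")  # now every node on the path has count 0
print(root.children['o'].count)  # 0: the whole 'o' branch can be pruned
```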
Now we combine everything into the complete, optimized word search algorithm. The key insight is that we traverse the grid while simultaneously navigating the trie. The trie node acts as our "state" of which words are still possible.
```python
class TrieNode:
    __slots__ = ['children', 'word', 'count']

    def __init__(self):
        self.children: dict[str, 'TrieNode'] = {}
        self.word: str | None = None
        self.count: int = 0


class Solution:
    """
    Word Search II - Trie-Optimized Solution

    Time Complexity: O(m × n × 4^L) where L = max word length,
                     but much faster in practice with aggressive pruning
    Space Complexity: O(k × L) for trie + O(L) for recursion,
                      where k = number of words
    """

    def find_words(self, board: list[list[str]], words: list[str]) -> list[str]:
        if not board or not board[0] or not words:
            return []

        # Step 1: Build trie from all dictionary words
        root = TrieNode()
        for word in words:
            node = root
            for char in word:
                if char not in node.children:
                    node.children[char] = TrieNode()
                node = node.children[char]
                node.count += 1
            node.word = word

        # Step 2: Set up grid traversal
        m, n = len(board), len(board[0])
        # 4-directional movement (can be 8 to allow diagonals)
        directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]
        result = []

        def dfs(row: int, col: int, node: TrieNode) -> None:
            """
            DFS traversal of grid, synchronized with trie navigation.

            Args:
                row, col: Current position in grid
                node: Current position in trie (prefix matched so far)
            """
            char = board[row][col]

            # Pruning: if current char not in trie children, no word continues
            if char not in node.children:
                return

            # Move to child node in trie
            child_node = node.children[char]

            # Check if we've found a word
            if child_node.word is not None:
                found_word = child_node.word
                result.append(found_word)
                # OPTIMIZATION: Clear word to prevent duplicates
                child_node.word = None
                # Decrement counts along the path to enable pruning
                self._decrement_counts(root, found_word)

            # OPTIMIZATION: If no words remain through this path, prune
            if child_node.count == 0:
                # Remove this child from parent to speed up future searches
                node.children.pop(char, None)
                return

            # Mark cell as visited (in-place modification)
            board[row][col] = '#'  # Use special marker

            # Explore all 4 directions
            for dr, dc in directions:
                new_row, new_col = row + dr, col + dc
                if 0 <= new_row < m and 0 <= new_col < n:
                    if board[new_row][new_col] != '#':  # Not visited
                        dfs(new_row, new_col, child_node)

            # Backtrack: restore cell
            board[row][col] = char

        # Step 3: Start DFS from every cell
        for i in range(m):
            for j in range(n):
                if board[i][j] in root.children:
                    dfs(i, j, root)

        return result

    def _decrement_counts(self, root: TrieNode, word: str) -> None:
        """
        Decrement counts along the path for a found word. Once a node's
        count reaches 0, dfs() prunes that branch from the trie.
        """
        node = root
        for char in word:
            child = node.children.get(char)
            if child is None:  # Branch already pruned
                return
            child.count -= 1
            node = child


# Example usage
solution = Solution()
board = [
    ['o', 'a', 'a', 'n'],
    ['e', 't', 'a', 'e'],
    ['i', 'h', 'k', 'r']
]
words = ["oath", "pea", "eat", "rain", "oat", "eta"]

result = solution.find_words(board, words)
print(result)  # ['oath', 'oat', 'eat', 'eta'] (order may vary)
```

Algorithm Walkthrough:
1. Build the trie: Insert all dictionary words. Each end-of-word node stores the complete word.
2. Start DFS from each cell: For each cell whose character exists as a child of the root, begin DFS.
3. Synchronized traversal: As we move through the grid, we simultaneously move through the trie. The current trie node tells us which words are still possible.
4. Word discovery: When we reach a node with word set, we've found a dictionary word. Add it to the results and clear word so it is reported only once.
5. Backtracking: Restore the cell and return, allowing other paths to use this cell.
6. Pruning: When a branch is exhausted (count reaches 0), remove it from the trie to prevent future wasteful exploration.
The basic trie-optimized algorithm is already dramatically faster than the naive approach. However, several additional optimizations can make it production-ready:
1. In-place visited marking: Instead of a visited set (O(1) operations, but with high constant overhead), modify the board directly by replacing characters with a marker like '#'. Restore during backtracking. This avoids set operations entirely.
2. Dynamic trie pruning: Remove found words (set word = None) and decrement counts. When counts hit zero, delete the branch. This shrinks the trie as we find words.
3. Start-cell filtering: Only begin DFS from cells whose character appears in root.children. Skip cells that can't start any word.
4. Word filtering: Discard words longer than m × n (impossible to form). Track the maximum word length to limit DFS depth.
```python
def build_trie(words: list[str]) -> TrieNode:
    """Build a trie (using the TrieNode class above) from the given words."""
    root = TrieNode()
    for word in words:
        node = root
        for char in word:
            node = node.children.setdefault(char, TrieNode())
            node.count += 1
        node.word = word
    return root


def find_words_optimized(board: list[list[str]], words: list[str]) -> list[str]:
    """
    Fully optimized word search with all production-level enhancements.
    """
    if not board or not board[0] or not words:
        return []

    m, n = len(board), len(board[0])

    # OPTIMIZATION 1: Pre-compute the set of characters in the grid
    grid_chars = set()
    for row in board:
        grid_chars.update(row)

    # OPTIMIZATION 2: Filter out words with impossible characters or lengths
    valid_words = []
    for word in words:
        if all(c in grid_chars for c in word) and len(word) <= m * n:
            valid_words.append(word)

    if not valid_words:
        return []

    # Build trie only with valid words
    root = build_trie(valid_words)

    # OPTIMIZATION 3: Pre-compute which cells can start a word
    starting_cells = []
    for i in range(m):
        for j in range(n):
            if board[i][j] in root.children:
                starting_cells.append((i, j))

    result = []
    directions = [(-1, 0), (1, 0), (0, -1), (0, 1)]

    def dfs(row: int, col: int, node: TrieNode) -> None:
        char = board[row][col]
        if char not in node.children:
            return

        child = node.children[char]

        if child.word:
            result.append(child.word)
            child.word = None  # Prevent duplicates

        # OPTIMIZATION 4: In-place marking; 'char' keeps the original value
        board[row][col] = '#'

        for dr, dc in directions:
            nr, nc = row + dr, col + dc
            if 0 <= nr < m and 0 <= nc < n and board[nr][nc] != '#':
                if board[nr][nc] in child.children:  # Extra pruning
                    dfs(nr, nc, child)

        board[row][col] = char  # Restore

        # OPTIMIZATION 5: Prune exhausted branches
        if not child.children and child.word is None:
            del node.children[char]

    # Only start from viable cells
    for i, j in starting_cells:
        dfs(i, j, root)

    return result
```

In worst-case scenarios, these optimizations can provide 10-100x speedup. The character frequency pre-check alone can eliminate 50%+ of words in many real-world dictionaries. Dynamic pruning ensures the algorithm gets faster as it progresses: the trie shrinks with each found word.
Understanding the complexity of the trie-optimized solution requires careful analysis of both the trie construction and the grid search phases.
| Aspect | Naive Approach | Trie-Optimized |
|---|---|---|
| Time: Build Phase | N/A | O(W × L) where W = total words, L = avg length |
| Time: Search Phase | O(k × m × n × 4^L) | O(m × n × 4^L) |
| Time: Overall | O(k × m × n × 4^L) | O(W × L + m × n × 4^L) |
| Space: Data Structure | O(1) | O(W × L) for trie |
| Space: Recursion | O(L) | O(L) |
| Pruning Benefit | None | Massive (early termination) |
Detailed Analysis:
Time Complexity:
Trie Construction: O(W × L), where W is the number of words and L the average word length; equivalently, linear in the total number of characters across all words
Grid Search: O(m × n × 4^L) in the worst case
Key Improvement: The factor of k (number of words) is eliminated!
In Practice:
The 4^L factor is a loose upper bound. In practice, most paths terminate after only a few steps because no dictionary word continues with that prefix, the no-revisit rule limits branching, and dynamic pruning shrinks the trie as words are found.
Eliminating the k factor is the crucial win. With 10,000 dictionary words, the naive approach is 10,000× slower. The trie-optimized approach handles 10 words or 10 million words with the same grid traversal—only the trie build time increases linearly.
Space Complexity:
Trie Storage: O(W × L) in the worst case
Recursion Stack: O(L) where L is the maximum word length
Result Storage: O(k) for storing found words
Space Optimization Note:
For memory-constrained environments, consider processing words in batches—build smaller tries, search, then rebuild. This trades time for space.
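A sketch of that batching idea, assuming a search_fn(board, words) callable such as find_words_optimized above (the wrapper itself and its parameter names are hypothetical):

```python
def find_words_batched(search_fn, board, words, batch_size=1000):
    """Search the dictionary in batches so only one small trie is alive at a time."""
    found = []
    for i in range(0, len(words), batch_size):
        # Each batch builds its own (smaller) trie inside search_fn,
        # which becomes garbage before the next batch starts.
        found.extend(search_fn(board, words[i:i + batch_size]))
    return found
```

Peak memory drops from O(W × L) for the full dictionary to O(batch_size × L), at the cost of repeating the grid traversal once per batch.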
The trie-optimized word search pattern extends far beyond puzzle games. Understanding this pattern unlocks solutions to many practical problems:
Whenever you face a problem involving: (1) a 2D grid of characters, (2) a dictionary of patterns to find, and (3) path-based matching, immediately consider the trie-optimized approach. It's the canonical solution for this problem class.
We've covered one of the most powerful applications of the trie data structure: optimizing word search in grids. The key insights to consolidate: a single trie lets one grid traversal check every dictionary word at once, shared prefixes are explored exactly once, and aggressive pruning (clearing found words, deleting exhausted branches) makes the search faster as it progresses.
You now understand how to use tries to optimize word search in grids—a classic and frequently-tested pattern. Next, we'll explore another essential trie pattern: finding the longest common prefix among strings.