Once you've navigated to the prefix node in a trie, you face a subtree containing all possible completions. Think of this subtree as a maze in which every path that reaches an end-of-word marker spells out a valid completion. Your task is to explore the maze systematically and collect every word hiding within it.
Depth-First Search (DFS) is the algorithm of choice for this exploration. But DFS isn't a single algorithm; it's a family of traversal strategies, each with different characteristics.
This page will make you a master of trie traversal, equipping you with multiple implementation strategies and the wisdom to choose among them.
By the end of this page, you will understand the mechanics of DFS in tree structures, implement both recursive and iterative trie traversals, compare pre-order vs post-order vs level-order approaches, construct words efficiently during traversal, and apply filtering and transformation during collection.
Before implementing DFS for tries, let's build a solid conceptual foundation by examining how DFS applies to tree structures in general, and tries specifically.
DFS Core Principle:
DFS explores a tree by going as deep as possible along each branch before backtracking. When you reach a leaf or a node with no unvisited children, you backtrack to explore remaining branches.
In a Trie Context:
Each path from the root to an end-of-word node spells out a complete word. DFS follows these paths, diving character by character until it reaches word endings, then backtracks to explore alternative branches.
Visual Example:
```text
prefix node: 'pro'
      │
      g─────────j─────────m
      │         │         │
      r         e         i
      │         │         │
      a         c         s
      │         │         │
      m ∎       t ∎       e ∎
      │
      m
      │
      e─────────i
      │         │
      r ∎       n
                │
                g ∎
```
DFS Traversal Order (one possibility): program → programmer → programming → project → promise
Notice how DFS naturally handles the tree structure—we don't need to remember which level we're at or explicitly track paths.
BFS (Breadth-First Search) explores level by level, which would find shorter words before longer ones. DFS doesn't guarantee any particular length order but uses less memory (O(max_depth) vs O(max_width)) and maps naturally to recursive implementations. For autocomplete, DFS is the standard choice.
Recursive DFS leverages the call stack to track the traversal state. It's concise, readable, and follows the natural recursive structure of trees.
The Pattern:
```text
DFS(node):
    process(node)               // Do something with this node
    for each child of node:
        DFS(child)              // Recurse on children
```
For trie word collection, "process" means checking if the node is an end-of-word and adding the word to our results.
```typescript
class TrieNode {
  children: Map<string, TrieNode> = new Map();
  isEndOfWord: boolean = false;
  word: string | null = null;   // Store complete word for easy retrieval
  frequency?: number;           // Optional usage count (referenced by the ranking examples below)
}

/**
 * Recursive DFS to collect all words in a trie subtree.
 * Simple and elegant—leverages the call stack for state.
 *
 * @param node - Current node in traversal
 * @param results - Array collecting found words
 *
 * Time: O(n) where n is number of nodes in subtree
 * Space: O(h) recursion stack where h is max word length
 */
function collectWordsDFS(node: TrieNode, results: string[]): void {
  // Base case is implicit: when children is empty, we don't recurse

  // Process current node: check if it's an end-of-word
  if (node.isEndOfWord && node.word !== null) {
    results.push(node.word);
  }

  // Recurse on all children
  for (const [char, child] of node.children) {
    collectWordsDFS(child, results);
  }
}

/**
 * Main function: Get all words starting with a prefix.
 */
function getWordsWithPrefix(root: TrieNode, prefix: string): string[] {
  // Phase 1: Navigate to prefix node
  let current = root;
  for (const char of prefix) {
    if (!current.children.has(char)) {
      return []; // Prefix doesn't exist
    }
    current = current.children.get(char)!;
  }

  // Phase 2: Collect all words via DFS
  const results: string[] = [];
  collectWordsDFS(current, results);
  return results;
}

// Usage
const trie = buildTrie(['program', 'programmer', 'progress']);
const words = getWordsWithPrefix(trie.root, 'pro');
console.log(words); // ['program', 'programmer', 'progress']
```

Tracing the Recursion:
Let's trace through collecting words from the 'pro' prefix node:
```text
collectWordsDFS('pro' node)
├─ isEndOfWord? No
├─ child 'g' → collectWordsDFS('g' node)
│  ├─ child 'r' → collectWordsDFS('r' node)
│  │  ├─ child 'a' → collectWordsDFS('a' node)
│  │  │  ├─ child 'm' → collectWordsDFS('m' node)
│  │  │  │  ├─ isEndOfWord? Yes → push 'program'
│  │  │  │  ├─ child 'm' → collectWordsDFS...
│  │  │  │  │  └─ (and so on for 'programmer', 'programming')
│  │  │  │  └─ return
│  │  │  └─ return
│  │  └─ return
│  └─ return
├─ child 'j' → collectWordsDFS('j' node)
│  └─ ... ('project')
└─ child 'm' → collectWordsDFS('m' node)
   └─ ... ('promise')
```
The call stack naturally tracks our position in the trie. When we return from a recursive call, we 'pop' back to the parent node and continue with the next child.
Most languages have limited stack space for recursion. Python defaults to ~1000 frames; JavaScript engines typically allow 10,000+. For natural language (words rarely exceed 30 characters), this is fine. For applications with very long strings (URLs, file paths, genomic data), use iterative DFS instead.
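One loose end from the usage snippet above: it calls a buildTrie helper that isn't defined on this page. A minimal sketch, consistent with the assumption that each end-of-word node stores its complete word, might look like this:

```typescript
// Minimal sketch of a buildTrie helper (assumed here; not defined elsewhere
// on this page). It inserts each word and stores the complete word on its
// end-of-word node so collectWordsDFS can return words without rebuilding them.
function buildTrie(words: string[]): { root: TrieNode } {
  const root = new TrieNode();
  for (const word of words) {
    let current = root;
    for (const char of word) {
      if (!current.children.has(char)) {
        current.children.set(char, new TrieNode());
      }
      current = current.children.get(char)!;
    }
    current.isEndOfWord = true;
    current.word = word; // Makes collection trivial, at the cost of extra storage per word
  }
  return { root };
}
```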
In the previous examples, we assumed each end-of-word node stores the complete word. But sometimes we need to construct words during traversal—either because nodes don't store words, or for memory efficiency.
Two Approaches:

1. String concatenation: pass `currentWord + char` down into each recursive call, so every node receives the path spelled out as a string.
2. Array-based path: keep one shared array of characters, push before recursing and pop when backtracking, and join it into a string only at end-of-word nodes.

The array approach is more efficient because string concatenation creates new string objects at every level in many languages.
```typescript
/**
 * Approach 1: String concatenation (simpler but creates many string objects)
 */
function collectWordsConcat(
  node: TrieNode,
  currentWord: string,
  results: string[]
): void {
  if (node.isEndOfWord) {
    results.push(currentWord); // Current path is a complete word
  }

  for (const [char, child] of node.children) {
    // Pass concatenated string to child
    collectWordsConcat(child, currentWord + char, results);
  }
}

// Usage:
// const results: string[] = [];
// collectWordsConcat(prefixNode, prefix, results);

/**
 * Approach 2: Array-based path (more efficient, avoids string copies)
 */
function collectWordsArray(
  node: TrieNode,
  path: string[],      // Mutable path array
  results: string[]
): void {
  if (node.isEndOfWord) {
    results.push(path.join('')); // Join array to string only when needed
  }

  for (const [char, child] of node.children) {
    path.push(char);                          // Add character to path
    collectWordsArray(child, path, results);
    path.pop();                               // Remove character (backtrack)
  }
}

// Usage:
// const results: string[] = [];
// const path = [...prefix]; // Start with prefix characters
// collectWordsArray(prefixNode, path, results);

/**
 * Approach 3: StringBuilder pattern (optimal for languages with mutable strings)
 * JavaScript doesn't have true StringBuilders, but we can simulate
 */
class StringBuilder {
  private chars: string[] = [];

  append(char: string): void {
    this.chars.push(char);
  }

  pop(): void {
    this.chars.pop();
  }

  toString(): string {
    return this.chars.join('');
  }

  get length(): number {
    return this.chars.length;
  }
}

function collectWithStringBuilder(
  node: TrieNode,
  sb: StringBuilder,
  results: string[]
): void {
  if (node.isEndOfWord) {
    results.push(sb.toString());
  }

  for (const [char, child] of node.children) {
    sb.append(char);
    collectWithStringBuilder(child, sb, results);
    sb.pop(); // Backtrack
  }
}
```

| Approach | Time per Word | Memory Allocation | Code Complexity |
|---|---|---|---|
| String concatenation | O(word_length²) | Many temporary strings | Simplest |
| Array/List path | O(word_length) | One array, reused | Medium |
| StringBuilder | O(word_length) | One buffer, reused | Medium |
| Store in nodes | O(1) | Pre-paid at insertion | Simplest at collection |
Notice the 'push-recurse-pop' pattern: we modify state (push character), recurse to explore that branch, then undo the modification (pop character). This backtracking pattern is fundamental to DFS and appears throughout algorithm design—in graph traversal, constraint satisfaction, game tree exploration, and more.
Iterative DFS replaces the recursion call stack with a manually-managed stack data structure. This gives us more control and avoids stack overflow issues.
The Key Insight:
Recursion uses the call stack to remember:

- which node is currently being processed,
- which of its children have already been visited (and which remain), and
- where to resume in the parent once a subtree is finished.
For iterative DFS, we manage this state explicitly. For simple word collection from tries, we just need to track which nodes to visit.
```typescript
/**
 * Simple iterative DFS when words are stored in nodes.
 * No complex state tracking needed.
 */
function collectWordsIterative(prefixNode: TrieNode): string[] {
  const results: string[] = [];
  const stack: TrieNode[] = [prefixNode];

  while (stack.length > 0) {
    const current = stack.pop()!;

    // Process this node
    if (current.isEndOfWord && current.word !== null) {
      results.push(current.word);
    }

    // Add children to stack for later processing
    // Note: DFS order depends on which children we push first
    for (const child of current.children.values()) {
      stack.push(child);
    }
  }

  return results;
}

/**
 * Iterative DFS when constructing words during traversal.
 * Need to track both node AND current path.
 */
interface StackFrame {
  node: TrieNode;
  path: string; // Path so far including this node's character
}

function collectWordsIterativeConstruct(
  prefixNode: TrieNode,
  prefix: string
): string[] {
  const results: string[] = [];
  const stack: StackFrame[] = [{ node: prefixNode, path: prefix }];

  while (stack.length > 0) {
    const { node, path } = stack.pop()!;

    if (node.isEndOfWord) {
      results.push(path);
    }

    for (const [char, child] of node.children) {
      stack.push({ node: child, path: path + char });
    }
  }

  return results;
}

/**
 * Memory-optimized: avoid storing full path in each stack frame.
 * Track depth and use a single mutable path.
 */
interface CompactFrame {
  node: TrieNode;
  depth: number;                           // How deep is this node from prefix node
  charIter: Iterator<[string, TrieNode]>;  // Iterator over children
}

function collectWordsIterativeOptimized(
  prefixNode: TrieNode,
  prefix: string
): string[] {
  const results: string[] = [];
  const path: string[] = [...prefix]; // Mutable path

  const stack: CompactFrame[] = [{
    node: prefixNode,
    depth: prefix.length,
    charIter: prefixNode.children.entries()
  }];

  // Check if prefix node itself is a word
  if (prefixNode.isEndOfWord) {
    results.push(path.join(''));
  }

  while (stack.length > 0) {
    const frame = stack[stack.length - 1]; // Peek, don't pop
    const next = frame.charIter.next();

    if (next.done) {
      // All children processed, backtrack
      stack.pop();
      path.pop();
    } else {
      const [char, childNode] = next.value;
      path.push(char);

      if (childNode.isEndOfWord) {
        results.push(path.join(''));
      }

      // Push child frame if it has children
      if (childNode.children.size > 0) {
        stack.push({
          node: childNode,
          depth: frame.depth + 1,
          charIter: childNode.children.entries()
        });
      } else {
        // Leaf node, backtrack immediately
        path.pop();
      }
    }
  }

  return results;
}
```

Use iterative DFS when: (1) Maximum recursion depth might be exceeded, (2) You need fine-grained control (early termination, progress tracking), (3) You want to avoid call stack overhead in performance-critical code. Use recursive DFS when code clarity is the priority and depths are manageable.
DFS traversal can process nodes in different orders relative to recursing on children. Each order has different properties that can be useful in various scenarios.
Pre-Order Traversal: Process node BEFORE children.
```text
visit(node)
for each child: traverse(child)
```
Post-Order Traversal: Process node AFTER children.
```text
for each child: traverse(child)
visit(node)
```
Level-Order (BFS): Process level by level.
```text
enqueue(root)
while queue not empty:
    node = dequeue()
    visit(node)
    enqueue(children)
```
```typescript
/**
 * Pre-order: Process node before children.
 * Words are found in depth-first order (not sorted by length).
 */
function preOrderCollect(node: TrieNode, results: string[]): void {
  // PROCESS FIRST
  if (node.isEndOfWord && node.word) {
    results.push(node.word);
  }

  // THEN RECURSE
  for (const child of node.children.values()) {
    preOrderCollect(child, results);
  }
}

/**
 * Post-order: Process node after children.
 * Children's words are collected before the current word.
 * (Rarely useful for autocomplete, included for completeness)
 */
function postOrderCollect(node: TrieNode, results: string[]): void {
  // RECURSE FIRST
  for (const child of node.children.values()) {
    postOrderCollect(child, results);
  }

  // THEN PROCESS
  if (node.isEndOfWord && node.word) {
    results.push(node.word);
  }
}

/**
 * Level-order (BFS): Process level by level.
 * Finds shorter words before longer words.
 * Useful when you want results sorted by length.
 */
function levelOrderCollect(prefixNode: TrieNode): string[] {
  const results: string[] = [];
  const queue: TrieNode[] = [prefixNode];

  while (queue.length > 0) {
    const current = queue.shift()!; // Dequeue front

    if (current.isEndOfWord && current.word) {
      results.push(current.word);
    }

    // Enqueue children (they'll be processed after current level)
    for (const child of current.children.values()) {
      queue.push(child);
    }
  }

  return results;
}

// Example outputs for prefix 'pro' with words:
// [program, programmer, programming, project, promise]

// Pre-order might output:
// ['program', 'programmer', 'programming', 'project', 'promise']
// (order depends on hash map iteration order)

// Level-order outputs (approximate, depends on branching):
// ['project', 'promise', 'program', 'programmer', 'programming'] // Shorter words first
```

| Order | Process Timing | Memory | Result Order | Use Case |
|---|---|---|---|---|
| Pre-order (DFS) | Before children | O(depth) | Depth-first | Standard word collection |
| Post-order (DFS) | After children | O(depth) | Reversed depth | Tree cleanup, dependency resolution |
| Level-order (BFS) | By level | O(width) | Shorter words first | Length-prioritized results |
For autocomplete, pre-order DFS is the standard choice because: (1) It's memory-efficient, (2) Ranking usually overrides natural order anyway, (3) We're collecting, not deleting. Use level-order only if you specifically need shorter words first and can afford the memory cost.
Often we don't want all words in a subtree—we want filtered or transformed results. Doing this during traversal (rather than post-collection) can be more efficient, especially when we can prune entire branches.
Common Filtering Scenarios:

- Length constraints: only collect words within a minimum/maximum length.
- Frequency thresholds: skip words whose usage count falls below a minimum.
- Result limits: stop once the first K matches have been gathered.
- Exclusions: omit words the user has dismissed or that sit on a blocklist.
- Transformations: normalize casing or formatting as words are collected.
```typescript
interface CollectionOptions {
  minLength?: number;
  maxLength?: number;
  minFrequency?: number;
  maxResults?: number;
  excludeWords?: Set<string>;
  transformFn?: (word: string) => string;
}

/**
 * Flexible DFS with filtering and transformation.
 * Applies filters during traversal for efficiency.
 */
function collectWithFilters(
  prefixNode: TrieNode,
  prefix: string,
  options: CollectionOptions = {}
): string[] {
  const {
    minLength = 0,
    maxLength = Infinity,
    minFrequency = 0,
    maxResults = Infinity,
    excludeWords = new Set(),
    transformFn = (w) => w
  } = options;

  const results: string[] = [];
  const prefixLen = prefix.length;

  function dfs(node: TrieNode, depth: number): boolean {
    // Early termination if we have enough results
    if (results.length >= maxResults) {
      return true; // Signal to stop
    }

    // Current word length would be prefix length + depth from prefix node
    const currentLength = prefixLen + depth;

    // Pruning: if we're already past maxLength, don't go deeper
    if (currentLength > maxLength) {
      return false;
    }

    // Check if this node is a valid word to collect
    if (node.isEndOfWord && node.word !== null) {
      const word = node.word;
      const wordLen = word.length;

      // Apply filters
      const passesLengthFilter = wordLen >= minLength && wordLen <= maxLength;
      const passesFrequencyFilter = (node.frequency ?? 0) >= minFrequency;
      const passesExclusionFilter = !excludeWords.has(word);

      if (passesLengthFilter && passesFrequencyFilter && passesExclusionFilter) {
        results.push(transformFn(word));
        if (results.length >= maxResults) {
          return true; // Stop after reaching limit
        }
      }
    }

    // Recurse on children
    for (const child of node.children.values()) {
      if (dfs(child, depth + 1)) {
        return true; // Propagate stop signal
      }
    }

    return false;
  }

  dfs(prefixNode, 0);
  return results;
}

// Usage examples:

// Get words of length 5-8 with high frequency
const options1: CollectionOptions = {
  minLength: 5,
  maxLength: 8,
  minFrequency: 100
};

// Get first 10 words, lowercase transformed
const options2: CollectionOptions = {
  maxResults: 10,
  transformFn: (word) => word.toLowerCase()
};

// Exclude specific words
const options3: CollectionOptions = {
  excludeWords: new Set(['program', 'progress'])
};
```

The maxLength pruning check demonstrates an important optimization: if we know the current path is already too long, we don't need to explore any children. This can eliminate entire subtrees. Similar logic applies to frequency thresholds if nodes store max-descendant frequencies.
For production autocomplete, we need ranked results—not just all words that match. There are two primary strategies:
Strategy 1: Collect All, Then Rank. Run a plain DFS that gathers every matching word, then sort by score and keep the top K. It is simple and always correct, but the work grows with the size of the whole subtree.

Strategy 2: Rank-Guided Traversal. Use score information during the traversal itself: keep only a bounded heap of the best K candidates seen so far and, if nodes store the best score of any descendant, prune branches that cannot compete.
```typescript
import { MinHeap, MaxHeap } from './heap';

interface ScoredWord {
  word: string;
  score: number;
}

/**
 * Strategy 1: Collect all, then rank.
 * Simple and correct but may do extra work.
 */
function collectAndRank(prefixNode: TrieNode, topK: number): string[] {
  const allWords: ScoredWord[] = [];

  function dfs(node: TrieNode): void {
    if (node.isEndOfWord && node.word) {
      allWords.push({ word: node.word, score: node.frequency ?? 0 });
    }
    for (const child of node.children.values()) {
      dfs(child);
    }
  }

  dfs(prefixNode);

  // Sort by score descending, take top K
  return allWords
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(sw => sw.word);
}

/**
 * Strategy 2: Use a bounded min-heap for top-K.
 * Maintains only the K best words found so far.
 * Still visits all nodes but avoids storing all words.
 */
function collectTopKWithHeap(prefixNode: TrieNode, topK: number): string[] {
  // Min-heap of size K (smallest of top-K at root)
  const heap = new MinHeap<ScoredWord>((a, b) => a.score - b.score);

  function dfs(node: TrieNode): void {
    if (node.isEndOfWord && node.word) {
      const candidate = { word: node.word, score: node.frequency ?? 0 };

      if (heap.size() < topK) {
        heap.push(candidate);
      } else if (candidate.score > heap.peek()!.score) {
        heap.pop();
        heap.push(candidate);
      }
    }

    for (const child of node.children.values()) {
      dfs(child);
    }
  }

  dfs(prefixNode);

  // Extract in sorted order
  const result: string[] = [];
  while (heap.size() > 0) {
    result.push(heap.pop()!.word);
  }
  return result.reverse(); // Highest score first
}

/**
 * Strategy 3: Priority-guided traversal with pruning.
 * Requires nodes to store max descendant score.
 * Can skip entire branches that can't compete.
 */
interface TrieNodeWithMax extends TrieNode {
  maxDescendantScore: number; // Max score of any word in this subtree
}

function collectTopKOptimized(
  prefixNode: TrieNodeWithMax,
  topK: number
): string[] {
  const heap = new MinHeap<ScoredWord>((a, b) => a.score - b.score);

  // Priority queue: visit promising branches first
  const pq = new MaxHeap<TrieNodeWithMax>(
    (a, b) => a.maxDescendantScore - b.maxDescendantScore
  );
  pq.push(prefixNode);

  while (pq.size() > 0) {
    const current = pq.pop()!;

    // Pruning: can this branch improve our top-K?
    if (heap.size() >= topK && current.maxDescendantScore <= heap.peek()!.score) {
      // No word in this subtree can beat our worst top-K
      continue;
    }

    if (current.isEndOfWord && current.word) {
      const candidate = { word: current.word, score: current.frequency ?? 0 };

      if (heap.size() < topK) {
        heap.push(candidate);
      } else if (candidate.score > heap.peek()!.score) {
        heap.pop();
        heap.push(candidate);
      }
    }

    for (const child of current.children.values()) {
      pq.push(child as TrieNodeWithMax);
    }
  }

  const result: string[] = [];
  while (heap.size() > 0) {
    result.push(heap.pop()!.word);
  }
  return result.reverse();
}
```

| Strategy | Time | Extra Space | Maintenance Cost | Best For |
|---|---|---|---|---|
| Collect + Sort | O(s + s log s) | O(s) | None | Small subtrees |
| Bounded Heap | O(s log K) | O(K) | None | Large subtrees, moderate K |
| Priority + Prune | O(visited log K) | O(K + pq) | O(n) to maintain maxScores | Very large, skewed distributions |
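The "O(n) to maintain maxScores" entry in the last row refers to keeping every node's maxDescendantScore consistent with the words beneath it. Here is a minimal sketch of doing that at insertion time, assuming the TrieNodeWithMax shape above; insertWithScore is a hypothetical helper, not part of the earlier code.

```typescript
// Sketch: keep maxDescendantScore accurate while inserting a scored word,
// so Strategy 3's pruning bound is trustworthy. insertWithScore is hypothetical.
function insertWithScore(root: TrieNodeWithMax, word: string, score: number): void {
  let current = root;
  // Every node on the insertion path can reach this word,
  // so its subtree maximum is at least `score`.
  current.maxDescendantScore = Math.max(current.maxDescendantScore, score);
  for (const char of word) {
    if (!current.children.has(char)) {
      const child = new TrieNode() as TrieNodeWithMax;
      child.maxDescendantScore = -Infinity;
      current.children.set(char, child);
    }
    current = current.children.get(char) as TrieNodeWithMax;
    current.maxDescendantScore = Math.max(current.maxDescendantScore, score);
  }
  current.isEndOfWord = true;
  current.word = word;
  current.frequency = score; // Score doubles as the frequency used by the collectors above
}
```

This stays cheap as long as scores only ever increase; if a word's score can drop (decaying frequencies, for instance), the maxima along affected paths must be recomputed, which is where the ongoing maintenance cost comes from.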
Production systems often use a hybrid: pre-compute top-K for very common prefixes (single characters, common bigrams), use bounded heap for typical queries, and fall back to collect-all-sort for unusual prefixes with few matches.
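As a rough sketch of that hybrid dispatch (the precomputedTopK cache and the suggest function are illustrative assumptions, not a prescribed design):

```typescript
// Illustrative hybrid: serve very common prefixes from a precomputed cache,
// otherwise fall back to the bounded-heap collector from Strategy 2.
// precomputedTopK would be filled offline (e.g. for every single-character prefix).
const precomputedTopK = new Map<string, string[]>();

function suggest(root: TrieNode, prefix: string, k: number): string[] {
  // 1. Hot prefixes: answer straight from the cache
  const cached = precomputedTopK.get(prefix);
  if (cached) {
    return cached.slice(0, k);
  }

  // 2. Otherwise, walk to the prefix node...
  let node: TrieNode | undefined = root;
  for (const char of prefix) {
    node = node.children.get(char);
    if (!node) return []; // Prefix not in the dictionary
  }

  // 3. ...and collect the top K with a bounded heap
  return collectTopKWithHeap(node, k);
}
```

The collect-all-and-sort path for unusual prefixes with only a handful of matches is omitted here for brevity; in practice you would branch on an estimated subtree size before picking a collector.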
Understanding performance characteristics helps you optimize for your specific use case. Let's examine the factors that affect DFS performance in tries.
Memory Access Patterns:
Tries are pointer-heavy structures. Each node points to its children, which may be scattered in memory. This leads to poor cache locality compared to array-based structures.
Optimization Opportunities:

- Replace Map-based child lookups with fixed-size arrays when the alphabet is small (for example, 26 lowercase letters).
- Store all words in one central array and keep only an index in each node.
- Pre-allocate the results array when the maximum number of suggestions is known.
- Use iterative traversal to avoid function-call overhead on hot paths.
```typescript
/**
 * Optimized trie node for ASCII lowercase letters.
 * Uses fixed-size array instead of Map.
 */
class OptimizedTrieNode {
  // Fixed array: indices 0-25 for 'a'-'z'
  children: (OptimizedTrieNode | null)[] = new Array(26).fill(null);
  isEndOfWord: boolean = false;
  wordIndex: number = -1; // Index into a separate word array

  getChild(char: string): OptimizedTrieNode | null {
    const index = char.charCodeAt(0) - 97; // 'a' = 97
    return this.children[index];
  }

  setChild(char: string, node: OptimizedTrieNode): void {
    const index = char.charCodeAt(0) - 97;
    this.children[index] = node;
  }
}

/**
 * Optimized collection with pre-allocated result array.
 */
function collectOptimized(
  prefixNode: OptimizedTrieNode,
  wordArray: string[], // All words stored in a central array
  maxResults: number
): string[] {
  // Pre-allocate result array
  const results: string[] = new Array(maxResults);
  let resultCount = 0;

  const stack: OptimizedTrieNode[] = [prefixNode];

  while (stack.length > 0 && resultCount < maxResults) {
    const current = stack.pop()!;

    if (current.isEndOfWord) {
      results[resultCount++] = wordArray[current.wordIndex];
    }

    // Iterate in reverse to maintain order (optional)
    for (let i = 25; i >= 0; i--) {
      const child = current.children[i];
      if (child !== null) {
        stack.push(child);
      }
    }
  }

  // Trim to actual size
  results.length = resultCount;
  return results;
}
```

These optimizations add complexity. Don't apply them blindly. Profile your specific workload first. For most autocomplete scenarios with reasonable dictionary sizes (under 1M words), the simple recursive DFS with Map-based children performs excellently.
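If you do decide to profile first, a rough micro-benchmark along these lines is usually enough to see whether collection is actually a bottleneck; the declared trie variables are placeholders for your own data.

```typescript
// Rough timing harness comparing the simple recursive collector with the
// array-based optimized one. The declared constants are placeholders.
declare const prefixNode: TrieNode;
declare const optimizedPrefixNode: OptimizedTrieNode;
declare const wordArray: string[];

function benchmark(label: string, runs: number, fn: () => void): void {
  const start = performance.now();
  for (let i = 0; i < runs; i++) fn();
  const elapsed = performance.now() - start;
  console.log(`${label}: ${(elapsed / runs).toFixed(3)} ms per run`);
}

benchmark('recursive, Map-based children', 1000, () => {
  const results: string[] = [];
  collectWordsDFS(prefixNode, results);
});

benchmark('iterative, array-based children', 1000, () => {
  collectOptimized(optimizedPrefixNode, wordArray, 10);
});
```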
We've thoroughly explored DFS for trie traversal—the algorithm that powers word collection in autocomplete systems. Let's consolidate the key insights:

- Recursive DFS is the simplest collector: record the current node's word if it ends one, then recurse into each child.
- Iterative DFS trades a little bookkeeping for explicit control and immunity to call-stack limits.
- Words can be stored on end-of-word nodes or rebuilt during traversal with the push-recurse-pop backtracking pattern.
- Pre-order DFS is the default for autocomplete; level-order is only worth its extra memory when shorter words must come first.
- Filtering, early termination, and pruning during traversal can skip entire subtrees that cannot contribute results.
- Ranking can happen after collection, through a bounded heap during collection, or via score-guided traversal that prunes uncompetitive branches.
What's Next:
In the final page of this module, we'll explore ranking suggestions—how to order the words we've collected to present the most useful suggestions first. We'll cover popularity-based ranking, personalization, recency, and more sophisticated ranking algorithms.
You now have comprehensive knowledge of DFS techniques for trie traversal. You can implement recursive and iterative traversals, construct words during traversal, apply filtering and ranking, and optimize for performance. This algorithmic foundation is essential for building efficient autocomplete systems.