In the previous page, we established that standard tries suffer from an abundance of unary nodes—nodes with exactly one child that contribute nothing to disambiguation. These nodes form chains that inflate memory usage without adding structural value.
Now we turn to the practical question: How exactly do we identify and eliminate these chains?
Compacting single-child chains is the heart of trie compression. It's a deceptively simple idea—concatenate edge labels through unary nodes—but the implementation requires careful attention to correctness, especially during dynamic operations like insertion and deletion.
This page provides a thorough treatment of the compaction process: the algorithms for identification, the mechanics of concatenation, the handling of edge cases, and the maintenance of invariants during tree modifications.
By the end of this page, you will understand:

- How to identify unary chains in a trie structure
- The step-by-step algorithm for edge concatenation
- How to split edges during insertion (the inverse operation)
- How to merge edges during deletion (restoring compression)
- Implementation strategies for efficient compaction
- Correctness criteria and invariant maintenance
Before we can compact, we must identify. A unary chain is a maximal sequence of nodes where each internal node has exactly one child and is not an end-of-word marker.
Formal Definition:
A unary chain in a trie T is a path p₁ → p₂ → ... → pₖ where:

- each of p₁, ..., pₖ₋₁ has exactly one child and is not marked end-of-word;
- pₖ is the chain's endpoint: a branching node, an end-of-word node, or a leaf;
- the chain is maximal: p₁'s parent is a branching node, an end-of-word node, or the root, so the chain cannot be extended upward.
Why end-of-word matters:
Consider storing "app" and "apple". The path is:
root ─a→ ● ─p→ ● ─p→ ● ✓(app) ─l→ ● ─e→ ● ✓(apple)
The node marked "app" has one child but IS an end-of-word. We cannot compress through it because that would lose the information that "app" is a complete word. The compression must be:
root ─"app"→ ● ✓ ─"le"→ ● ✓
Not:
root ─"apple"→ ● ✓ (WRONG! Lost "app")
End-of-word markers act as compression boundaries. A unary node that's also an end-of-word cannot be eliminated—it must remain as a node to record that a word terminates there. Forgetting this leads to lost words and corrupted data.
Chain Detection Algorithm:
To find all unary chains suitable for compaction:
```python
def find_compressible_chains(node):
    """Collect every compressible chain in the subtree rooted at node."""
    chains = []
    for child in node.children.values():
        if is_compressible_chain_start(child):
            chain = extract_chain(child)
            chains.append(chain)
            # Keep searching below the chain's endpoint.
            chains.extend(find_compressible_chains(chain[-1]))
        else:
            chains.extend(find_compressible_chains(child))
    return chains

def is_compressible_chain_start(node):
    # A node starts a chain if it has exactly one child
    # and is not marked end-of-word.
    return len(node.children) == 1 and not node.is_end

def extract_chain(start_node):
    chain = [start_node]
    current = start_node
    # Extend the chain while the current node has exactly one child and
    # isn't end-of-word; the endpoint (branch, EOW node, or leaf) is included.
    while len(current.children) == 1 and not current.is_end:
        current = next(iter(current.children.values()))
        chain.append(current)
    return chain
```
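To see the detection in action, here is a small self-contained run on the "app"/"apple" example. The TrieNode class matches the one defined with the compression code further down; insert_standard is a hypothetical helper for building the uncompressed trie:

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.is_end = False

def insert_standard(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_end = True

root = TrieNode()
for w in ("app", "apple"):
    insert_standard(root, w)

chains = find_compressible_chains(root)
print([len(c) for c in chains])  # [3, 2]: the 'a','p','p' chain and the 'l','e' chain
```

The end-of-word marker on the "app" node is exactly what breaks the path into two chains rather than one. The table below classifies each step of this example (plus a generic branching case) by compressibility.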
| Path Segment | End-of-Word? | Children Count | Compressible? | Reason |
|---|---|---|---|---|
| a → p | No | 1 | Yes | Unary, not EOW |
| p → p | No | 1 | Yes | Unary, not EOW |
| p → l | Yes ("app") | 1 | No | EOW boundary |
| l → e | No | 1 | Yes | Unary, not EOW |
| e → (none) | Yes ("apple") | 0 | N/A | Leaf node |
| n → {i,g} | No | 2 | No | Branching node |
Visualization of Chain Detection:
Consider a standard trie storing {"testing", "test", "team", "tea"}:
root
└── t    [chain start: 1 child, not EOW]
    └── e    [branch point: 2 children]
        ├── s → ...    [leads to "test", "testing"]
        └── a → ...    [leads to "tea", "team"]

Note that 't' leads to 'e', and 'e' has TWO children ('s' and 'a'). So the chain from the root is just t → e, and 'e' is the branch-point endpoint.

Actual structure:

root ─t→ ● ─e→ ●
               ├─s→ ● ─t→ ● ✓(test) ─i─n─g→ ● ✓(testing)
               └─a→ ● ✓(tea) ─m→ ● ✓(team)
Compressible chains:

- t → e: the edges root→t and t→e concatenate into the single edge "te"
- s → t: concatenates into edge "st", ending at the node marking "test"
- i → n → g: concatenates into edge "ing", ending at the node marking "testing"

The 'a' node (marking "tea") is unary but end-of-word, so it survives as a compression boundary. The compressed result:

root ─"te"→ ●
             ├─"st"→ ● ✓(test) ─"ing"→ ● ✓(testing)
             └─"a"→ ● ✓(tea) ─"m"→ ● ✓(team)
Once chains are identified, compaction transforms the structure. The algorithm takes a standard trie and produces an equivalent compressed trie.
Top-Down Compaction (Building Compressed Trie):
The cleanest approach builds a compressed trie directly from a set of strings, avoiding construction of an intermediate standard trie:
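A minimal sketch of the idea, assuming the CompressedTrieNode class and the common_prefix_length helper that appear later on this page (build_compressed is a hypothetical name): recursively group the strings by first character, and turn each group's longest common prefix into one edge.

```python
def build_compressed(words):
    """Build a compressed trie top-down, with no intermediate standard trie."""
    root = CompressedTrieNode()

    def build(node, strings):
        groups = {}
        for s in strings:
            if s == "":
                node.is_end = True  # some word ends exactly at this node
            else:
                groups.setdefault(s[0], []).append(s)
        for group in groups.values():
            # The group's longest common prefix becomes the edge label.
            lcp = group[0]
            for s in group[1:]:
                lcp = lcp[:common_prefix_length(lcp, s)]
            child = CompressedTrieNode()
            node.children[lcp] = child
            # Recurse on the suffixes past the shared prefix.
            build(child, [s[len(lcp):] for s in group])

    build(root, list(set(words)))
    return root
```

Because each edge label is a maximal common prefix, every node created this way is either a branch point or end-of-word, so the compression invariant holds by construction.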
This is the preferred approach for building from scratch.
Bottom-Up Compaction (Compressing Existing Trie):
When given an existing standard trie to compress:
```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode for standard trie
        self.is_end = False

class CompressedTrieNode:
    def __init__(self):
        self.children = {}  # edge_label (string) -> CompressedTrieNode
        self.is_end = False

def compress_trie(standard_root):
    """
    Convert a standard trie to a compressed trie.
    Uses DFS to handle each subtree recursively.
    """
    compressed_root = CompressedTrieNode()

    def compress_from_node(std_node, comp_node):
        """
        Process children of std_node, adding to comp_node with compression.
        """
        for char, child in std_node.children.items():
            # Start building edge label
            edge_label = char
            current = child
            # Extend edge label through unary non-EOW nodes
            while len(current.children) == 1 and not current.is_end:
                next_char = list(current.children.keys())[0]
                edge_label += next_char
                current = current.children[next_char]
            # 'current' is now the endpoint: it either branches, is EOW, or is leaf
            new_child = CompressedTrieNode()
            new_child.is_end = current.is_end
            comp_node.children[edge_label] = new_child
            # Recurse to handle current's children
            compress_from_node(current, new_child)

    # Handle root specially since root is never EOW
    compress_from_node(standard_root, compressed_root)
    return compressed_root
```

Step-by-Step Example:
Compressing a standard trie with words {"abc", "abd", "xyz"}:
Standard Trie:
root
├── a
│   └── b
│       ├── c ✓
│       └── d ✓
└── x
    └── y
        └── z ✓
Compression Trace:

Process 'a' branch:

- Edge label starts as "a"; the 'a' node has one child ('b') and is not end-of-word, so the label extends to "ab"
- The 'b' node has two children, so the edge stops there: create a compressed node reached by edge "ab"

Process 'b''s children (at the new compressed node):

- 'c' is an end-of-word leaf: edge "c" to a node marked ✓
- 'd' is an end-of-word leaf: edge "d" to a node marked ✓

Process 'x' branch:

- Edge label starts as "x"; 'x' and 'y' are unary non-EOW nodes, so the label extends through them to "xyz"
- 'z' is an end-of-word leaf: edge "xyz" to a node marked ✓
Compressed Trie:
root
├──"ab"──→ ●
│   ├──"c"──→ ● ✓
│   └──"d"──→ ● ✓
└──"xyz"──→ ● ✓
Node count: 8 (standard) → 5 (compressed) — and the difference grows dramatically with longer chains.
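A quick way to check such counts in tests is a small helper (count_nodes is a hypothetical name; it works for both node types above, since both expose a children mapping):

```python
def count_nodes(node):
    """Count all nodes in the subtree rooted at node, including node itself."""
    return 1 + sum(count_nodes(child) for child in node.children.values())

# For this example: count_nodes(standard_root) == 8,
# and count_nodes(compress_trie(standard_root)) == 5.
```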
Building a compressed trie isn't just about initial construction—we need to support dynamic insertions. When inserting a new string, three scenarios arise:
Scenario 1: Completely New Path
The new string has no common prefix with any existing edge from the current node. Simply create a new edge labeled with the remaining suffix.
Scenario 2: Existing Edge is Prefix of New String
Follow the edge completely and continue insertion from the child node.
Scenario 3: Divergence Within Edge (Edge Splitting)
The new string and an existing edge share a prefix but then diverge. We must split the edge at the divergence point.
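Here is how the three scenarios play out concretely, using the insert routine shown later on this page (a sketch; the asserted edge labels follow from the splitting rules):

```python
root = CompressedTrieNode()
insert(root, "abc")     # first key: single edge "abc"
insert(root, "xyz")     # Scenario 1: no shared prefix -> new edge "xyz"
insert(root, "abcde")   # Scenario 2: edge "abc" fully matches -> child gains edge "de"
insert(root, "abde")    # Scenario 3: diverges inside "abc" -> split into "ab" + {"c", "de"}
assert sorted(root.children) == ["ab", "xyz"]
```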
The Edge Splitting Process:
When inserting key k splits edge (label, child) at position p (where k and label first differ):

1. Create a new internal node S.
2. In the parent, replace the edge with one labeled label[0..p) that points to S.
3. Attach the original child to S under the edge labeled label[p..).
4. If the key has characters left past the split, attach a new end-of-word leaf to S under k[pos+p..); otherwise mark S itself as end-of-word.

Where pos is the current position in key k when we reached this edge.
Example: Split in Action
Compressed trie contains "application":
root ──"application"──→ ● ✓
Inserting "apply":
Split process:

- The common prefix of "application" and "apply" is "appl" (length 4)
- Create a new internal node; re-label the edge from the root as "appl", pointing to it
- Attach the original end-of-word node under the edge remainder "ication"
- Attach a new end-of-word leaf under the key remainder "y"
Result:
root ──"appl"──→ ●
                  ├──"ication"──→ ● ✓
                  └──"y"──→ ● ✓
```python
def insert(root, key):
    """
    Insert key into compressed trie, splitting edges as needed.
    Time: O(m) where m = len(key)
    """
    node = root
    i = 0  # Position in key
    while i < len(key):
        remaining = key[i:]
        found = False
        for edge_label, child in list(node.children.items()):
            # Find longest common prefix between remaining key and edge label
            common_len = common_prefix_length(remaining, edge_label)
            if common_len == 0:
                continue  # No match, try next edge
            found = True
            if common_len == len(edge_label):
                # Edge label is entirely contained; follow edge
                i += common_len
                node = child
                break
            else:
                # Divergence within edge - SPLIT!
                # Split edge_label at common_len
                common = edge_label[:common_len]
                edge_remainder = edge_label[common_len:]
                key_remainder = remaining[common_len:]
                # Create new internal node
                split_node = CompressedTrieNode()
                # Update parent: replace old edge with edge to split_node
                del node.children[edge_label]
                node.children[common] = split_node
                # Add original child under edge_remainder
                split_node.children[edge_remainder] = child
                # Add new key under key_remainder (if any)
                if key_remainder:
                    new_leaf = CompressedTrieNode()
                    new_leaf.is_end = True
                    split_node.children[key_remainder] = new_leaf
                else:
                    # Key ends exactly at split point
                    split_node.is_end = True
                return  # Insertion complete
        if not found:
            # No matching edge; add entire remaining key
            new_leaf = CompressedTrieNode()
            new_leaf.is_end = True
            node.children[remaining] = new_leaf
            return
    # Key is a prefix of existing path
    node.is_end = True

def common_prefix_length(s1, s2):
    """Return length of common prefix between two strings."""
    i = 0
    while i < len(s1) and i < len(s2) and s1[i] == s2[i]:
        i += 1
    return i
```

Every split operation creates an internal node with exactly two children (the remainder of the original edge and the new key's suffix). This is the only way new internal nodes arise in compressed tries. The total internal node count equals the number of splits performed during all insertions.
Deletion is the inverse of insertion, and edge merging is the inverse of edge splitting. When removing a string, we may create unary internal nodes that violate the compression invariant.
When Merging is Needed:
After deleting a string, an internal node might become:

- Dead (no children, not end-of-word): remove the node and its incoming edge entirely
- Unary (exactly one child, not end-of-word): the compression invariant is violated, so its incoming and outgoing edges must be merged
The Edge Merging Process:
If node N has exactly one child C, and N is not end-of-word:

1. Let ℓ₁ be the label on the edge from N's parent to N, and ℓ₂ the label on the edge from N to C.
2. In the parent, replace the edge to N with a single edge labeled ℓ₁ + ℓ₂ that points directly to C.
3. Discard N.
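In code, the merge itself is only a few lines. A sketch assuming the CompressedTrieNode class from earlier, where parent reaches N via label (merge_unary is a hypothetical helper name):

```python
def merge_unary(parent, label):
    """Splice out parent's unary, non-end-of-word child by fusing the two edges."""
    node = parent.children[label]
    assert len(node.children) == 1 and not node.is_end
    child_label, child = next(iter(node.children.items()))
    del parent.children[label]
    parent.children[label + child_label] = child  # node N is now unreachable
```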
Example: Deletion and Merge
Compressed trie contains "apply" and "application":
root ──"appl"──→ ●
├──"ication"──→ ● ✓
└──"y"──→ ● ✓
Deleting "apply":

- Follow "appl" then "y", and unmark the "y" leaf as end-of-word
- The "y" leaf now has no children and no marker: remove it
- The "appl" node is left with one child ("ication") and no marker: merge the edges into "application"
root ──"application"──→ ● ✓
We've restored the original single-edge structure.
```python
def delete(root, key):
    """
    Delete key from compressed trie, merging edges as needed.
    Returns True if deletion was successful.
    """
    # First, find the path to the key
    path = []  # Stack of (parent, edge_label, node)
    node = root
    i = 0
    while i < len(key):
        remaining = key[i:]
        found = False
        for edge_label, child in node.children.items():
            if remaining.startswith(edge_label):
                # Follow this edge
                path.append((node, edge_label, child))
                i += len(edge_label)
                node = child
                found = True
                break
            elif edge_label.startswith(remaining):
                # Key ends mid-edge - not an exact match
                return False
        if not found:
            return False  # Key not in trie
    # Check if this node is actually marked as end-of-word
    if not node.is_end:
        return False
    # Unmark as end-of-word
    node.is_end = False
    # Now clean up: remove/merge nodes bottom-up
    cleanup_after_delete(path, node)
    return True

def cleanup_after_delete(path, leaf_node):
    """
    Clean up tree after deletion, merging edges as needed.
    """
    node = leaf_node
    while path:
        parent, edge_label, current = path.pop()
        if len(node.children) == 0 and not node.is_end:
            # Node is useless; remove it
            del parent.children[edge_label]
            node = parent
        elif len(node.children) == 1 and not node.is_end:
            # Node is unary; merge with child
            child_label = list(node.children.keys())[0]
            child = node.children[child_label]
            # Create merged edge
            new_label = edge_label + child_label
            del parent.children[edge_label]
            parent.children[new_label] = child
            node = parent
        else:
            # Node is valid; stop cleanup
            break
```

One deletion can require cleanup at more than one level: removing a dead leaf can leave its parent unary, which forces an edge merge. The cleanup loop therefore walks the recorded path bottom-up and stops at the first node that still satisfies the invariants; everything above that point is already valid.
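Running the worked example from above through these routines (using the insert function from the previous section):

```python
root = CompressedTrieNode()
insert(root, "application")
insert(root, "apply")        # splits into "appl" -> {"ication", "y"}
assert sorted(root.children) == ["appl"]

delete(root, "apply")        # removes the "y" leaf, then re-merges the edges
assert list(root.children) == ["application"]
```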
A critical implementation decision is how to store edge labels. The naive approach—copying substrings—works but may waste memory. Several strategies exist:
Strategy 1: Copied Substrings (Naive)
Each edge stores a full copy of its label string.
Pros:

- Simple to implement and debug; each label is a self-contained string
- No lifetime coupling to the original input strings

Cons:

- Character data is duplicated; total label storage can approach the total input length
- Each edge may incur its own allocation
Strategy 2: Start/Length Pointers
Store references into original strings: (original_string_id, start_index, length).
Pros:

- Fixed O(1) space per edge regardless of label length
- No character copying during construction or splits

Cons:

- The original strings must stay alive for as long as the trie does
- Every label access goes through an extra indirection
- Deletion is harder: freeing an original string requires knowing that no edge still references it
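A sketch of the pointer-based representation; the EdgeRef and StringTable names are illustrative, not from any standard library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EdgeRef:
    """Edge label stored as a view into an original string."""
    string_id: int   # index into a table of original strings
    start: int       # offset of the label's first character
    length: int      # label length in characters

class StringTable:
    def __init__(self):
        self.strings = []

    def add(self, s):
        self.strings.append(s)
        return len(self.strings) - 1

    def label(self, ref):
        s = self.strings[ref.string_id]
        return s[ref.start:ref.start + ref.length]

# Usage: the edge "ication" from the earlier example, without copying it.
table = StringTable()
sid = table.add("application")
edge = EdgeRef(string_id=sid, start=4, length=7)
assert table.label(edge) == "ication"
```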
| Strategy | Space Per Edge | Access Time | Implementation Complexity | Best For |
|---|---|---|---|---|
| Copied String | O(label length) | O(1) to access | Low | Simple implementations |
| Start/Length | O(1) fixed | O(1) with indirection | Medium | Memory-critical systems |
| Shared Pool | O(1) fixed + pool | O(1) with indirection | High | Many shared substrings |
| Inline Short | O(1) for short, O(n) otherwise | O(1) | Medium | Mixed length labels |
Strategy 3: String Interning Pool
Maintain a pool of unique substrings. Edges reference pool entries.
Pros:

- Repeated substrings are stored once, which can shrink label storage substantially
- Edge records stay fixed-size (a pool reference)

Cons:

- The pool itself needs hashing, growth, and (if deletions matter) reference counting
- Extra indirection on every access
- Pays off only when labels actually repeat
Strategy 4: Inline Short Strings
A hybrid: store short labels inline (e.g., ≤8 characters) and only allocate separately for longer labels.
Pros:

- The common case (short labels) needs no separate heap allocation and is cache-friendly
- Long labels still work, just through the slower path

Cons:

- Two representations mean two code paths and a branch on every access
- The inline capacity wastes padding bytes for very short labels
```typescript
interface EdgeLabel {
  // If length <= 8, characters are stored inline
  // Otherwise, pointer to heap-allocated string
  inline: string | null;    // For labels <= 8 chars
  external: string | null;  // For labels > 8 chars
}

class CompressedTrieNode {
  children: Map<string, CompressedTrieNode> = new Map();
  isEnd: boolean = false;

  /**
   * Attach a child under the given label.
   * In a full implementation, short labels avoid heap allocation.
   */
  setEdge(label: string, child: CompressedTrieNode): void {
    // In a real implementation, we'd store inline vs external;
    // here simplified to use the string key directly.
    this.children.set(label, child);
  }
}

// A production implementation might use:
// - Small Buffer Optimization (SBO) pattern
// - Tagged union for inline vs pointer
// - Platform-specific optimizations
```

The Linux kernel's radix tree uses a clever technique: keys are often integers, and the 'label' is implicitly determined by bit patterns. No separate label storage is needed—the path through the tree encodes the key. This is the ultimate space optimization when keys have regular structure.
Maintaining correctness in compressed tries requires careful attention to invariants. Bugs in edge splitting/merging can cause subtle corruption that manifests late—strings that should exist don't, or phantom strings appear.
The Core Invariants (Revisited):

1. No unary non-EOW nodes: every internal node other than the root either has at least two children or is marked end-of-word
2. No edge-label collisions: no edge label out of a node is a prefix of a sibling's label (equivalently, sibling labels start with distinct characters)
3. Exact representation: concatenating edge labels from the root to each end-of-word marker spells out exactly the set of stored strings, nothing more and nothing less
Testing Strategies:
Robust testing is essential for trie implementations:
Property-Based Testing:

Generate random word sets, insert them all, and check that (a) every inserted word is found, (b) randomly generated non-members are not found, and (c) the structural invariants hold after every operation, as in the sketch after this list.

Metamorphic Testing:

Check relations between operation sequences: inserting a set of words should yield the same contents regardless of insertion order, and inserting then deleting a word should leave the trie equivalent to never having inserted it.

Edge Cases:

- The empty string and single-character strings
- One stored string being a prefix of another ("app" / "apple")
- Many strings sharing a long common prefix
- Deleting a key that is absent, or that is only a prefix of a stored key
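A sketch of the property-based loop, assuming the CompressedTrie wrapper class exercised in the test code below and the verify_invariants checker that follows; random_word and property_test are illustrative names:

```python
import random
import string

def random_word(max_len=10):
    length = random.randint(1, max_len)
    return "".join(random.choices(string.ascii_lowercase, k=length))

def property_test(trials=100, words_per_trial=50):
    for _ in range(trials):
        trie, model = CompressedTrie(), set()
        for _ in range(words_per_trial):
            w = random_word()
            trie.insert(w)
            model.add(w)
            verify_invariants(trie.root)  # invariants hold after each insert
        for w in model:
            assert trie.search(w)         # every inserted word is found
        for _ in range(100):
            probe = random_word(15)
            assert trie.search(probe) == (probe in model)  # no phantom words
```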
```python
def verify_invariants(root):
    """
    Verify all compressed trie invariants.
    Call after operations during testing.
    """
    errors = []

    def check_node(node, path=""):
        # Check 1: No unary internal nodes.
        # (The root is exempt: a trie holding a single word
        # legitimately has a root with one child.)
        if path and len(node.children) == 1 and not node.is_end:
            errors.append(f"Unary internal node at path: {path}")
        # Check 2: No edge prefix collisions
        labels = list(node.children.keys())
        for i, l1 in enumerate(labels):
            for l2 in labels[i+1:]:
                if l1.startswith(l2) or l2.startswith(l1):
                    errors.append(
                        f"Edge prefix collision at {path}: '{l1}' vs '{l2}'"
                    )
        # Recurse
        for label, child in node.children.items():
            check_node(child, path + label)

    check_node(root)
    if errors:
        raise AssertionError("Invariant violations:\n" + "\n".join(errors))

# Usage in tests:
def test_insert_delete():
    trie = CompressedTrie()
    words = ["apple", "apply", "application", "apt"]
    for word in words:
        trie.insert(word)
        verify_invariants(trie.root)  # Check after each insert
    for word in words:
        assert trie.search(word), f"{word} should exist"
    for word in words[:2]:
        trie.delete(word)
        verify_invariants(trie.root)  # Check after each delete
    assert not trie.search("apple")
    assert trie.search("application")
```

Understanding the performance impact of compaction helps in making implementation decisions.
Time Complexity:
| Operation | Standard Trie | Compressed Trie | Notes |
|---|---|---|---|
| Search | O(m) | O(m) | Same total comparisons |
| Insert | O(m) | O(m) | Plus possible allocation |
| Delete | O(m) | O(m) | Plus possible merge cost |
| Build (n strings) | O(L) | O(L) | L = total length |
The O(m) complexity is preserved because we process each character exactly once—just in larger chunks per node visit.
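The code on this page covers insert and delete; for completeness, here is a lookup sketch under the same CompressedTrieNode assumptions. Each loop iteration consumes an entire edge label, so the total character work is O(m):

```python
def search(root, key):
    """Exact-match lookup in a compressed trie. O(len(key)) character work."""
    node, i = root, 0
    while i < len(key):
        remaining = key[i:]
        for edge_label, child in node.children.items():
            if remaining.startswith(edge_label):
                # Consume the whole chunk at once
                node, i = child, i + len(edge_label)
                break
        else:
            return False  # no edge matches, or the key ends mid-edge
    return node.is_end
```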
Constant Factors:
While big-O is identical, constant factors differ:
Standard Trie Advantages:

- Child lookup can be a direct array index on the next character
- Uniform, fixed-size nodes simplify allocation and the inner loop
- Insert and delete never restructure edges

Compressed Trie Advantages:

- Far fewer node visits, so fewer pointer dereferences and cache misses
- Character comparisons happen in batches against a contiguous label
- Less total allocation, improving locality

When Compressed Tries Win on Speed:

- Long keys with sparse branching (deep unary chains in the standard trie)
- Memory-bound workloads, where fewer cache misses dominate the cost
- Read-heavy usage, where the extra split/merge cost is rarely paid

When Standard Tries Win on Speed:

- Dense sets of short, similar keys, where chains are rare anyway
- Write-heavy workloads, where splits and merges add constant-factor overhead
| Operation Step | Standard Trie | Compressed Trie |
|---|---|---|
| Locate next edge | Array index: O(1) | String prefix match: O(edge length) |
| Node allocation | One per character | One per branch point |
| Character comparison | One per node | Many per node (batched) |
| Insert edge | Set array slot | Hash map insert + possible split |
| Delete edge | Clear array slot | Hash map delete + possible merge |
Compression trades implementation simplicity and insert/delete speed for memory efficiency. This tradeoff is usually favorable because: (1) memory is often the bottleneck, (2) reads typically dominate writes, and (3) the memory savings enable larger datasets to fit in faster storage tiers (cache, RAM).
We've thoroughly explored the mechanics of compacting single-child chains—the core technique that transforms space-hungry standard tries into efficient compressed tries.
What's Next:
With the mechanics of compression fully understood, we turn to a specific instantiation: Radix Trees. Radix trees apply compression principles with particular optimizations for numeric and binary keys, enabling efficient implementations in operating system kernels, network routers, and memory management systems. We'll explore their structure, properties, and real-world applications.
You now understand the detailed algorithms for compacting single-child chains—how to identify unary chains, how to concatenate edge labels, how to split edges during insertion, and how to merge edges during deletion. These techniques are the operational foundation of all compressed trie variants.