In the previous page, we established that standard tries suffer from an abundance of unary nodes—nodes with exactly one child that contribute nothing to disambiguation. These nodes form chains that inflate memory usage without adding structural value.
Now we turn to the practical question: How exactly do we identify and eliminate these chains?
Compacting single-child chains is the heart of trie compression. It's a deceptively simple idea—concatenate edge labels through unary nodes—but the implementation requires careful attention to correctness, especially during dynamic operations like insertion and deletion.
This page provides a thorough treatment of the compaction process: the algorithms for identification, the mechanics of concatenation, the handling of edge cases, and the maintenance of invariants during tree modifications.
By the end of this page, you will understand:

- How to identify unary chains in a trie structure
- The step-by-step algorithm for edge concatenation
- How to split edges during insertion (the inverse operation)
- How to merge edges during deletion (restoring compression)
- Implementation strategies for efficient compaction
- Correctness criteria and invariant maintenance
Before we can compact, we must identify. A unary chain is a maximal sequence of nodes where each internal node has exactly one child and is not an end-of-word marker.
Formal Definition:
A unary chain in a trie T is a path p₁ → p₂ → ... → pₖ where:

- each of p₁, ..., pₖ₋₁ has exactly one child and is not marked end-of-word;
- pₖ is the chain's endpoint: a branching node, an end-of-word node, or a leaf;
- the chain is maximal: p₁'s parent is a branching node, an end-of-word node, or the root, so the chain cannot be extended upward.
Why end-of-word matters:
Consider storing "app" and "apple". The path is:
root ─a→ ● ─p→ ● ─p→ ● ✓(app) ─l→ ● ─e→ ● ✓(apple)
The node marked "app" has one child but IS an end-of-word. We cannot compress through it because that would lose the information that "app" is a complete word. The compression must be:
root ─"app"→ ● ✓ ─"le"→ ● ✓
Not:
root ─"apple"→ ● ✓ (WRONG! Lost "app")
End-of-word markers act as compression boundaries. A unary node that's also an end-of-word cannot be eliminated—it must remain as a node to record that a word terminates there. Forgetting this leads to lost words and corrupted data.
Chain Detection Algorithm:
To find all unary chains suitable for compaction:
```python
def find_compressible_chains(node):
    """Collect every compressible chain in the subtree rooted at node."""
    chains = []
    for child in node.children.values():
        if is_compressible_chain_start(child):
            chain = extract_chain(child)
            chains.append(chain)
            # Keep searching below the chain's endpoint.
            chains.extend(find_compressible_chains(chain[-1]))
        else:
            chains.extend(find_compressible_chains(child))
    return chains

def is_compressible_chain_start(node):
    # A node starts a chain if it has exactly one child
    # and is not marked end-of-word.
    return len(node.children) == 1 and not node.is_end

def extract_chain(start_node):
    chain = [start_node]
    current = start_node
    # Extend the chain while the current node has exactly one child and
    # isn't end-of-word; the endpoint (branch, EOW node, or leaf) is included.
    while len(current.children) == 1 and not current.is_end:
        current = next(iter(current.children.values()))
        chain.append(current)
    return chain
```
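To see the detection in action, here is a small self-contained run on the "app"/"apple" example. The TrieNode class matches the one defined with the compression code further down; insert_standard is a hypothetical helper for building the uncompressed trie:

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode
        self.is_end = False

def insert_standard(root, word):
    node = root
    for ch in word:
        node = node.children.setdefault(ch, TrieNode())
    node.is_end = True

root = TrieNode()
for w in ("app", "apple"):
    insert_standard(root, w)

chains = find_compressible_chains(root)
print([len(c) for c in chains])  # [3, 2]: the 'a','p','p' chain and the 'l','e' chain
```

The end-of-word marker on the "app" node is exactly what breaks the path into two chains rather than one. The table below classifies each step of this example (plus a generic branching case) by compressibility.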
| Path Segment | End-of-Word? | Children Count | Compressible? | Reason |
|---|---|---|---|---|
| a → p | No | 1 | Yes | Unary, not EOW |
| p → p | No | 1 | Yes | Unary, not EOW |
| p → l | Yes ("app") | 1 | No | EOW boundary |
| l → e | No | 1 | Yes | Unary, not EOW |
| e → (none) | Yes ("apple") | 0 | N/A | Leaf node |
| n → {i,g} | No | 2 | No | Branching node |
Visualization of Chain Detection:
Consider a standard trie storing {"testing", "test", "team", "tea"}:
root
└── t    [chain start: 1 child, not EOW]
    └── e    [branch point: 2 children]
        ├── s → ...    [leads to "test", "testing"]
        └── a → ...    [leads to "tea", "team"]

Note that 't' leads to 'e', and 'e' has TWO children ('s' and 'a'). So the chain from the root is just t → e, and 'e' is the branch-point endpoint.

Actual structure:

root ─t→ ● ─e→ ●
               ├─s→ ● ─t→ ● ✓(test) ─i─n─g→ ● ✓(testing)
               └─a→ ● ✓(tea) ─m→ ● ✓(team)
Compressible chains:

- t → e: the edges root→t and t→e concatenate into the single edge "te"
- s → t: concatenates into edge "st", ending at the node marking "test"
- i → n → g: concatenates into edge "ing", ending at the node marking "testing"

The 'a' node (marking "tea") is unary but end-of-word, so it survives as a compression boundary. The compressed result:

root ─"te"→ ●
             ├─"st"→ ● ✓(test) ─"ing"→ ● ✓(testing)
             └─"a"→ ● ✓(tea) ─"m"→ ● ✓(team)
Once chains are identified, compaction transforms the structure. The algorithm takes a standard trie and produces an equivalent compressed trie.
Top-Down Compaction (Building Compressed Trie):
The cleanest approach builds a compressed trie directly from a set of strings, avoiding construction of an intermediate standard trie:
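A minimal sketch of the idea, assuming the CompressedTrieNode class and the common_prefix_length helper that appear later on this page (build_compressed is a hypothetical name): recursively group the strings by first character, and turn each group's longest common prefix into one edge.

```python
def build_compressed(words):
    """Build a compressed trie top-down, with no intermediate standard trie."""
    root = CompressedTrieNode()

    def build(node, strings):
        groups = {}
        for s in strings:
            if s == "":
                node.is_end = True  # some word ends exactly at this node
            else:
                groups.setdefault(s[0], []).append(s)
        for group in groups.values():
            # The group's longest common prefix becomes the edge label.
            lcp = group[0]
            for s in group[1:]:
                lcp = lcp[:common_prefix_length(lcp, s)]
            child = CompressedTrieNode()
            node.children[lcp] = child
            # Recurse on the suffixes past the shared prefix.
            build(child, [s[len(lcp):] for s in group])

    build(root, list(set(words)))
    return root
```

Because each edge label is a maximal common prefix, every node created this way is either a branch point or end-of-word, so the compression invariant holds by construction.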
This is the preferred approach for building from scratch.
Bottom-Up Compaction (Compressing Existing Trie):
When given an existing standard trie to compress:
```python
class TrieNode:
    def __init__(self):
        self.children = {}  # char -> TrieNode for standard trie
        self.is_end = False

class CompressedTrieNode:
    def __init__(self):
        self.children = {}  # edge_label (string) -> CompressedTrieNode
        self.is_end = False

def compress_trie(standard_root):
    """
    Convert a standard trie to a compressed trie.
    Uses DFS to handle each subtree recursively.
    """
    compressed_root = CompressedTrieNode()

    def compress_from_node(std_node, comp_node):
        """
        Process children of std_node, adding to comp_node with compression.
        """
        for char, child in std_node.children.items():
            # Start building edge label
            edge_label = char
            current = child
            # Extend edge label through unary non-EOW nodes
            while len(current.children) == 1 and not current.is_end:
                next_char = list(current.children.keys())[0]
                edge_label += next_char
                current = current.children[next_char]
            # 'current' is now the endpoint: it either branches, is EOW, or is leaf
            new_child = CompressedTrieNode()
            new_child.is_end = current.is_end
            comp_node.children[edge_label] = new_child
            # Recurse to handle current's children
            compress_from_node(current, new_child)

    # Handle root specially since root is never EOW
    compress_from_node(standard_root, compressed_root)
    return compressed_root
```

Step-by-Step Example:
Compressing a standard trie with words {"abc", "abd", "xyz"}:
Standard Trie:
root
├── a
│   └── b
│       ├── c ✓
│       └── d ✓
└── x
    └── y
        └── z ✓
Compression Trace:

Process 'a' branch:

- Edge label starts as "a"; the 'a' node has one child ('b') and is not end-of-word, so the label extends to "ab"
- The 'b' node has two children, so the edge stops there: create a compressed node reached by edge "ab"

Process 'b''s children (at the new compressed node):

- 'c' is an end-of-word leaf: edge "c" to a node marked ✓
- 'd' is an end-of-word leaf: edge "d" to a node marked ✓

Process 'x' branch:

- Edge label starts as "x"; 'x' and 'y' are unary non-EOW nodes, so the label extends through them to "xyz"
- 'z' is an end-of-word leaf: edge "xyz" to a node marked ✓
Compressed Trie:
root
├──"ab"──→ ●
│   ├──"c"──→ ● ✓
│   └──"d"──→ ● ✓
└──"xyz"──→ ● ✓
Node count: 8 (standard) → 5 (compressed) — and the difference grows dramatically with longer chains.
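A quick way to check such counts in tests is a small helper (count_nodes is a hypothetical name; it works for both node types above, since both expose a children mapping):

```python
def count_nodes(node):
    """Count all nodes in the subtree rooted at node, including node itself."""
    return 1 + sum(count_nodes(child) for child in node.children.values())

# For this example: count_nodes(standard_root) == 8,
# and count_nodes(compress_trie(standard_root)) == 5.
```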
Building a compressed trie isn't just about initial construction—we need to support dynamic insertions. When inserting a new string, three scenarios arise:
Scenario 1: Completely New Path
The new string has no common prefix with any existing edge from the current node. Simply create a new edge labeled with the remaining suffix.
Scenario 2: Existing Edge is Prefix of New String
Follow the edge completely and continue insertion from the child node.
Scenario 3: Divergence Within Edge (Edge Splitting)
The new string and an existing edge share a prefix but then diverge. We must split the edge at the divergence point.
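Here is how the three scenarios play out concretely, using the insert routine shown later on this page (a sketch; the asserted edge labels follow from the splitting rules):

```python
root = CompressedTrieNode()
insert(root, "abc")     # first key: single edge "abc"
insert(root, "xyz")     # Scenario 1: no shared prefix -> new edge "xyz"
insert(root, "abcde")   # Scenario 2: edge "abc" fully matches -> child gains edge "de"
insert(root, "abde")    # Scenario 3: diverges inside "abc" -> split into "ab" + {"c", "de"}
assert sorted(root.children) == ["ab", "xyz"]
```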
The Edge Splitting Process:
When inserting key k splits edge (label, child) at position p (where k and label first differ):

1. Create a new internal node S.
2. In the parent, replace the edge with one labeled label[0..p) that points to S.
3. Attach the original child to S under the edge labeled label[p..).
4. If the key has characters left past the split, attach a new end-of-word leaf to S under k[pos+p..); otherwise mark S itself as end-of-word.

Where pos is the current position in key k when we reached this edge.
Example: Split in Action
Compressed trie contains "application":
root ──"application"──→ ● ✓
Inserting "apply":
Split process:

- The common prefix of "application" and "apply" is "appl" (length 4)
- Create a new internal node; re-label the edge from the root as "appl", pointing to it
- Attach the original end-of-word node under the edge remainder "ication"
- Attach a new end-of-word leaf under the key remainder "y"
Result:
root ──"appl"──→ ●
                  ├──"ication"──→ ● ✓
                  └──"y"──→ ● ✓
```python
def insert(root, key):
    """
    Insert key into compressed trie, splitting edges as needed.
    Time: O(m) where m = len(key)
    """
    node = root
    i = 0  # Position in key
    while i < len(key):
        remaining = key[i:]
        found = False
        for edge_label, child in list(node.children.items()):
            # Find longest common prefix between remaining key and edge label
            common_len = common_prefix_length(remaining, edge_label)
            if common_len == 0:
                continue  # No match, try next edge
            found = True
            if common_len == len(edge_label):
                # Edge label is entirely contained; follow edge
                i += common_len
                node = child
                break
            else:
                # Divergence within edge - SPLIT!
                # Split edge_label at common_len
                common = edge_label[:common_len]
                edge_remainder = edge_label[common_len:]
                key_remainder = remaining[common_len:]
                # Create new internal node
                split_node = CompressedTrieNode()
                # Update parent: replace old edge with edge to split_node
                del node.children[edge_label]
                node.children[common] = split_node
                # Add original child under edge_remainder
                split_node.children[edge_remainder] = child
                # Add new key under key_remainder (if any)
                if key_remainder:
                    new_leaf = CompressedTrieNode()
                    new_leaf.is_end = True
                    split_node.children[key_remainder] = new_leaf
                else:
                    # Key ends exactly at split point
                    split_node.is_end = True
                return  # Insertion complete
        if not found:
            # No matching edge; add entire remaining key
            new_leaf = CompressedTrieNode()
            new_leaf.is_end = True
            node.children[remaining] = new_leaf
            return
    # Key is a prefix of existing path
    node.is_end = True

def common_prefix_length(s1, s2):
    """Return length of common prefix between two strings."""
    i = 0
    while i < len(s1) and i < len(s2) and s1[i] == s2[i]:
        i += 1
    return i
```

Every split operation creates an internal node with exactly two children (the remainder of the original edge and the new key's suffix). This is the only way new internal nodes arise in compressed tries. The total internal node count equals the number of splits performed during all insertions.
Deletion is the inverse of insertion, and edge merging is the inverse of edge splitting. When removing a string, we may create unary internal nodes that violate the compression invariant.
When Merging is Needed:
After deleting a string, an internal node might become:

- Dead (no children, not end-of-word): remove the node and its incoming edge entirely
- Unary (exactly one child, not end-of-word): the compression invariant is violated, so its incoming and outgoing edges must be merged
The Edge Merging Process:
If node N has exactly one child C, and N is not end-of-word:

1. Let ℓ₁ be the label on the edge from N's parent to N, and ℓ₂ the label on the edge from N to C.
2. In the parent, replace the edge to N with a single edge labeled ℓ₁ + ℓ₂ that points directly to C.
3. Discard N.
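In code, the merge itself is only a few lines. A sketch assuming the CompressedTrieNode class from earlier, where parent reaches N via label (merge_unary is a hypothetical helper name):

```python
def merge_unary(parent, label):
    """Splice out parent's unary, non-end-of-word child by fusing the two edges."""
    node = parent.children[label]
    assert len(node.children) == 1 and not node.is_end
    child_label, child = next(iter(node.children.items()))
    del parent.children[label]
    parent.children[label + child_label] = child  # node N is now unreachable
```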
Example: Deletion and Merge
Compressed trie contains "apply" and "application":
root ──"appl"──→ ●
├──"ication"──→ ● ✓
└──"y"──→ ● ✓
Deleting "apply":

- Follow "appl" then "y", and unmark the "y" leaf as end-of-word
- The "y" leaf now has no children and no marker: remove it
- The "appl" node is left with one child ("ication") and no marker: merge the edges into "application"
root ──"application"──→ ● ✓
We've restored the original single-edge structure.
```python
def delete(root, key):
    """
    Delete key from compressed trie, merging edges as needed.
    Returns True if deletion was successful.
    """
    # First, find the path to the key
    path = []  # Stack of (parent, edge_label, node)
    node = root
    i = 0
    while i < len(key):
        remaining = key[i:]
        found = False
        for edge_label, child in node.children.items():
            if remaining.startswith(edge_label):
                # Follow this edge
                path.append((node, edge_label, child))
                i += len(edge_label)
                node = child
                found = True
                break
            elif edge_label.startswith(remaining):
                # Key ends mid-edge - not an exact match
                return False
        if not found:
            return False  # Key not in trie
    # Check if this node is actually marked as end-of-word
    if not node.is_end:
        return False
    # Unmark as end-of-word
    node.is_end = False
    # Now clean up: remove/merge nodes bottom-up
    cleanup_after_delete(path, node)
    return True

def cleanup_after_delete(path, leaf_node):
    """
    Clean up tree after deletion, merging edges as needed.
    """
    node = leaf_node
    while path:
        parent, edge_label, current = path.pop()
        if len(node.children) == 0 and not node.is_end:
            # Node is useless; remove it
            del parent.children[edge_label]
            node = parent
        elif len(node.children) == 1 and not node.is_end:
            # Node is unary; merge with child
            child_label = list(node.children.keys())[0]
            child = node.children[child_label]
            # Create merged edge
            new_label = edge_label + child_label
            del parent.children[edge_label]
            parent.children[new_label] = child
            node = parent
        else:
            # Node is valid; stop cleanup
            break
```

One deletion can require cleanup at more than one level: removing a dead leaf can leave its parent unary, which forces an edge merge. The cleanup loop therefore walks the recorded path bottom-up and stops at the first node that still satisfies the invariants; everything above that point is already valid.
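Running the worked example from above through these routines (using the insert function from the previous section):

```python
root = CompressedTrieNode()
insert(root, "application")
insert(root, "apply")        # splits into "appl" -> {"ication", "y"}
assert sorted(root.children) == ["appl"]

delete(root, "apply")        # removes the "y" leaf, then re-merges the edges
assert list(root.children) == ["application"]
```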
A critical implementation decision is how to store edge labels. The naive approach—copying substrings—works but may waste memory. Several strategies exist:
Strategy 1: Copied Substrings (Naive)
Each edge stores a full copy of its label string.
Pros:

- Simple to implement and debug; each label is a self-contained string
- No lifetime coupling to the original input strings

Cons:

- Character data is duplicated; total label storage can approach the total input length
- Each edge may incur its own allocation
Strategy 2: Start/Length Pointers
Store references into original strings: (original_string_id, start_index, length).
Pros:

- Fixed O(1) space per edge regardless of label length
- No character copying during construction or splits

Cons:

- The original strings must stay alive for as long as the trie does
- Every label access goes through an extra indirection
- Deletion is harder: freeing an original string requires knowing that no edge still references it
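A sketch of the pointer-based representation; the EdgeRef and StringTable names are illustrative, not from any standard library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EdgeRef:
    """Edge label stored as a view into an original string."""
    string_id: int   # index into a table of original strings
    start: int       # offset of the label's first character
    length: int      # label length in characters

class StringTable:
    def __init__(self):
        self.strings = []

    def add(self, s):
        self.strings.append(s)
        return len(self.strings) - 1

    def label(self, ref):
        s = self.strings[ref.string_id]
        return s[ref.start:ref.start + ref.length]

# Usage: the edge "ication" from the earlier example, without copying it.
table = StringTable()
sid = table.add("application")
edge = EdgeRef(string_id=sid, start=4, length=7)
assert table.label(edge) == "ication"
```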
| Strategy | Space Per Edge | Access Time | Implementation Complexity | Best For |
|---|---|---|---|---|
| Copied String | O(label length) | O(1) to access | Low | Simple implementations |
| Start/Length | O(1) fixed | O(1) with indirection | Medium | Memory-critical systems |
| Shared Pool | O(1) fixed + pool | O(1) with indirection | High | Many shared substrings |
| Inline Short | O(1) for short, O(n) otherwise | O(1) | Medium | Mixed length labels |
Strategy 3: String Interning Pool
Maintain a pool of unique substrings. Edges reference pool entries.
Pros:

- Repeated substrings are stored once, which can shrink label storage substantially
- Edge records stay fixed-size (a pool reference)

Cons:

- The pool itself needs hashing, growth, and (if deletions matter) reference counting
- Extra indirection on every access
- Pays off only when labels actually repeat
Strategy 4: Inline Short Strings
A hybrid: store short labels inline (e.g., ≤8 characters) and only allocate separately for longer labels.
Pros:

- The common case (short labels) needs no separate heap allocation and is cache-friendly
- Long labels still work, just through the slower path

Cons:

- Two representations mean two code paths and a branch on every access
- The inline capacity wastes padding bytes for very short labels
```typescript
interface EdgeLabel {
  // If length <= 8, characters are stored inline
  // Otherwise, pointer to heap-allocated string
  inline: string | null;    // For labels <= 8 chars
  external: string | null;  // For labels > 8 chars
}

class CompressedTrieNode {
  children: Map<string, CompressedTrieNode> = new Map();
  isEnd: boolean = false;

  /**
   * Attach a child under the given label.
   * In a full implementation, short labels avoid heap allocation.
   */
  setEdge(label: string, child: CompressedTrieNode): void {
    // In a real implementation, we'd store inline vs external;
    // here simplified to use the string key directly.
    this.children.set(label, child);
  }
}

// A production implementation might use:
// - Small Buffer Optimization (SBO) pattern
// - Tagged union for inline vs pointer
// - Platform-specific optimizations
```

The Linux kernel's radix tree uses a clever technique: keys are often integers, and the 'label' is implicitly determined by bit patterns. No separate label storage is needed—the path through the tree encodes the key. This is the ultimate space optimization when keys have regular structure.
Maintaining correctness in compressed tries requires careful attention to invariants. Bugs in edge splitting/merging can cause subtle corruption that manifests late—strings that should exist don't, or phantom strings appear.
The Core Invariants (Revisited):

1. No unary non-EOW nodes: every internal node other than the root either has at least two children or is marked end-of-word
2. No edge-label collisions: no edge label out of a node is a prefix of a sibling's label (equivalently, sibling labels start with distinct characters)
3. Exact representation: concatenating edge labels from the root to each end-of-word marker spells out exactly the set of stored strings, nothing more and nothing less
Testing Strategies:
Robust testing is essential for trie implementations:
Property-Based Testing:

Generate random word sets, insert them all, and check that (a) every inserted word is found, (b) randomly generated non-members are not found, and (c) the structural invariants hold after every operation, as in the sketch after this list.

Metamorphic Testing:

Check relations between operation sequences: inserting a set of words should yield the same contents regardless of insertion order, and inserting then deleting a word should leave the trie equivalent to never having inserted it.

Edge Cases:

- The empty string and single-character strings
- One stored string being a prefix of another ("app" / "apple")
- Many strings sharing a long common prefix
- Deleting a key that is absent, or that is only a prefix of a stored key
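A sketch of the property-based loop, assuming the CompressedTrie wrapper class exercised in the test code below and the verify_invariants checker that follows; random_word and property_test are illustrative names:

```python
import random
import string

def random_word(max_len=10):
    length = random.randint(1, max_len)
    return "".join(random.choices(string.ascii_lowercase, k=length))

def property_test(trials=100, words_per_trial=50):
    for _ in range(trials):
        trie, model = CompressedTrie(), set()
        for _ in range(words_per_trial):
            w = random_word()
            trie.insert(w)
            model.add(w)
            verify_invariants(trie.root)  # invariants hold after each insert
        for w in model:
            assert trie.search(w)         # every inserted word is found
        for _ in range(100):
            probe = random_word(15)
            assert trie.search(probe) == (probe in model)  # no phantom words
```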
```python
def verify_invariants(root):
    """
    Verify all compressed trie invariants.
    Call after operations during testing.
    """
    errors = []

    def check_node(node, path=""):
        # Check 1: No unary internal nodes.
        # (The root is exempt: a trie holding a single word
        # legitimately has a root with one child.)
        if path and len(node.children) == 1 and not node.is_end:
            errors.append(f"Unary internal node at path: {path}")
        # Check 2: No edge prefix collisions
        labels = list(node.children.keys())
        for i, l1 in enumerate(labels):
            for l2 in labels[i+1:]:
                if l1.startswith(l2) or l2.startswith(l1):
                    errors.append(
                        f"Edge prefix collision at {path}: '{l1}' vs '{l2}'"
                    )
        # Recurse
        for label, child in node.children.items():
            check_node(child, path + label)

    check_node(root)
    if errors:
        raise AssertionError("Invariant violations:\n" + "\n".join(errors))

# Usage in tests:
def test_insert_delete():
    trie = CompressedTrie()
    words = ["apple", "apply", "application", "apt"]
    for word in words:
        trie.insert(word)
        verify_invariants(trie.root)  # Check after each insert
    for word in words:
        assert trie.search(word), f"{word} should exist"
    for word in words[:2]:
        trie.delete(word)
        verify_invariants(trie.root)  # Check after each delete
    assert not trie.search("apple")
    assert trie.search("application")
```

Understanding the performance impact of compaction helps in making implementation decisions.
Time Complexity:
| Operation | Standard Trie | Compressed Trie | Notes |
|---|---|---|---|
| Search | O(m) | O(m) | Same total comparisons |
| Insert | O(m) | O(m) | Plus possible allocation |
| Delete | O(m) | O(m) | Plus possible merge cost |
| Build (n strings) | O(L) | O(L) | L = total length |
The O(m) complexity is preserved because we process each character exactly once—just in larger chunks per node visit.
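The code on this page covers insert and delete; for completeness, here is a lookup sketch under the same CompressedTrieNode assumptions. Each loop iteration consumes an entire edge label, so the total character work is O(m):

```python
def search(root, key):
    """Exact-match lookup in a compressed trie. O(len(key)) character work."""
    node, i = root, 0
    while i < len(key):
        remaining = key[i:]
        for edge_label, child in node.children.items():
            if remaining.startswith(edge_label):
                # Consume the whole chunk at once
                node, i = child, i + len(edge_label)
                break
        else:
            return False  # no edge matches, or the key ends mid-edge
    return node.is_end
```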
Constant Factors:
While big-O is identical, constant factors differ:
Standard Trie Advantages:

- Child lookup can be a direct array index on the next character
- Uniform, fixed-size nodes simplify allocation and the inner loop
- Insert and delete never restructure edges

Compressed Trie Advantages:

- Far fewer node visits, so fewer pointer dereferences and cache misses
- Character comparisons happen in batches against a contiguous label
- Less total allocation, improving locality

When Compressed Tries Win on Speed:

- Long keys with sparse branching (deep unary chains in the standard trie)
- Memory-bound workloads, where fewer cache misses dominate the cost
- Read-heavy usage, where the extra split/merge cost is rarely paid

When Standard Tries Win on Speed:

- Dense sets of short, similar keys, where chains are rare anyway
- Write-heavy workloads, where splits and merges add constant-factor overhead
| Operation Step | Standard Trie | Compressed Trie |
|---|---|---|
| Locate next edge | Array index: O(1) | String prefix match: O(edge length) |
| Node allocation | One per character | One per branch point |
| Character comparison | One per node | Many per node (batched) |
| Insert edge | Set array slot | Hash map insert + possible split |
| Delete edge | Clear array slot | Hash map delete + possible merge |
Compression trades implementation simplicity and insert/delete speed for memory efficiency. This tradeoff is usually favorable because: (1) memory is often the bottleneck, (2) reads typically dominate writes, and (3) the memory savings enable larger datasets to fit in faster storage tiers (cache, RAM).
We've thoroughly explored the mechanics of compacting single-child chains—the core technique that transforms space-hungry standard tries into efficient compressed tries.
What's Next:
With the mechanics of compression fully understood, we turn to a specific instantiation: Radix Trees. Radix trees apply compression principles with particular optimizations for numeric and binary keys, enabling efficient implementations in operating system kernels, network routers, and memory management systems. We'll explore their structure, properties, and real-world applications.
You now understand the detailed algorithms for compacting single-child chains—how to identify unary chains, how to concatenate edge labels, how to split edges during insertion, and how to merge edges during deletion. These techniques are the operational foundation of all compressed trie variants.