Data Structures & AlgorithmsBST Variants & Extensions

BST Variants & Extensions

LevelIntermediate

Duration90 mins

TopicBST Variants & Extensions

3 / 4

Augmented BSTs

Beyond Basic BST: The Augmentation Pattern

In the previous page, we saw how adding a single field—subtree size—transforms a basic BST into a powerful Order Statistics Tree capable of O(log n) rank queries. This wasn't magic or a special data structure; it was an application of a general pattern: tree augmentation.

Tree augmentation is the technique of storing additional computed information at each node to efficiently support operations beyond basic insert, delete, and search. The key insight is that trees naturally aggregate information along paths—if we carefully choose what to aggregate, we can answer sophisticated queries without traversing the entire structure.

This page generalizes the augmentation pattern, exploring multiple examples and providing a framework for designing your own augmentations.

What You Will Master

By the end of this page, you will understand the general augmentation methodology, see how it applies to interval trees, range queries, and other patterns, learn the rules for what can and cannot be augmented efficiently, and gain a framework for designing custom augmentations for your specific requirements.

The Augmentation Theorem

Not all node properties can be efficiently maintained during tree modifications. The Augmentation Theorem tells us exactly what can work:

Augmentation Theorem (informal): A field f can be efficiently maintained in a BST if and only if f at any node can be computed solely from:

Information stored in the node itself (key, value, other node fields)

f values of its children

O(1) additional information

In formal terms, if we can express:

node.f = compute(node.key, node.value, node.left.f, node.right.f)

...where compute runs in O(1) time, then the augmentation is efficient.

Why This Works

The theorem works because tree modifications (insert, delete, rotate) only affect nodes on a single path from leaf to root. If each node's augmented field depends only on itself and its children, we can update all affected nodes in O(h) time by walking up the path and recomputing. In a balanced tree, this is O(log n).

Examples of valid augmentations:

Augmentation	Formula	Use Case
Subtree size	`1 + size(left) + size(right)`	Order statistics
Subtree height	`1 + max(height(left), height(right))`	AVL balance
Subtree max	`max(key, max(left), max(right))`	Interval trees
Subtree sum	`value + sum(left) + sum(right)`	Range sum queries
Black height	Based on children's black heights	Red-Black trees

Examples of invalid augmentations:

Attempted Augmentation	Why It Fails
Ancestor count	Depends on path to root, not children
Median of subtree	Requires O(n) knowledge of all subtree values
Parent's key	Requires parent reference, not derived from children
Depth	Depends on ancestors, not descendants

The Locality Principle

Efficient augmentation requires locality—each node's augmented data must be computable from local information (itself and children). Any augmentation requiring knowledge of ancestors, cousins, or the global tree structure cannot be maintained in O(log n) per modification.

Augmentation for Range Queries

One of the most powerful applications of augmentation is enabling efficient range queries. Consider a BST where we want to quickly answer: 'What is the sum of all values with keys in range [L, R]?'

Naive approach: Traverse all nodes, check if key is in range, sum values. Time: O(n).

Augmented approach: Store the sum of values in each subtree. Use the BST property to prune subtrees that are entirely outside the range. Time: O(log n + k) where k is the number of keys in range, or O(log n) with clever aggregation.

range-sum-bst.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
interface RangeSumNode {
    key: number;
    value: number;
    left: RangeSumNode | null;
    right: RangeSumNode | null;
    subtreeSum: number;  // Sum of values in this subtree
}
 
function getSum(node: RangeSumNode | null): number {
    return node === null ? 0 : node.subtreeSum;
}
 
function updateNode(node: RangeSumNode): void {
    node.subtreeSum = node.value + getSum(node.left) + getSum(node.right);
}
 
/**
 * Range sum query: sum of values where key ∈ [low, high]
 * 
 * Time Complexity: O(log n) when range spans a small fraction of keys
 *                  O(n) worst case when range includes all keys
 * 
 * The key insight: we don't need to visit every node in range.
 * We can use subtreeSum to include entire subtrees when they're
 * fully contained in the range.
 */
function rangeSum(
    node: RangeSumNode | null,
    low: number,
    high: number
): number {
    if (node === null) return 0;
 
    // Case 1: Current node is below range - search right only
    if (node.key < low) {
        return rangeSum(node.right, low, high);
    }
 
    // Case 2: Current node is above range - search left only
    if (node.key > high) {
        return rangeSum(node.left, low, high);
    }
 
    // Case 3: Current node is in range [low, high]
    // Include this node's value plus contributions from children
    let sum = node.value;
 
    // Left child: we need keys >= low
    if (node.left !== null) {
        if (node.left.key >= low) {
            // Entire left subtree might be in range
            sum += rangeSumOptimized(node.left, low, high);
        } else {
            sum += rangeSum(node.left, low, high);
        }
    }
 
    // Right child: we need keys <= high
    if (node.right !== null) {
        if (node.right.key <= high) {
            // Entire right subtree might be in range
            sum += rangeSumOptimized(node.right, low, high);
        } else {
            sum += rangeSum(node.right, low, high);
        }
    }
 
    return sum;
}
 
/**
 * Optimized range sum when we know entire subtree is in range.
 * Uses the subtreeSum augmentation directly.
 */
function rangeSumOptimized(
    node: RangeSumNode | null,
    low: number,
    high: number
): number {
    if (node === null) return 0;
 
    // Check if entire subtree is within range
    // (This requires also storing subtreeMin and subtreeMax)
    // For simplicity, we do standard rangeSum
    return rangeSum(node, low, high);
}

Enhanced version with min/max augmentation:

To truly achieve O(log n) range sums, we can add subtreeMin and subtreeMax fields. When processing a node:

If [subtreeMin, subtreeMax] is entirely within [low, high] → return subtreeSum directly
If [subtreeMin, subtreeMax] is entirely outside [low, high] → return 0
Otherwise → recurse into children

This allows us to short-circuit entire subtrees, achieving O(log n) for typical range queries.

Interval Trees: A Classic Augmentation

Interval Trees are a canonical example of BST augmentation. They solve the interval overlap query problem:

Given a set of intervals [start, end], efficiently find all intervals that overlap with a query interval [q_start, q_end].

Applications include:

Calendar systems: Find meetings that conflict with a proposed time slot
Computational geometry: Find segments intersecting a query region
Database systems: Range predicate optimization
Network management: IP ranges containing a given address

The augmentation: Each node stores the maximum endpoint value in its subtree.

interval-tree.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
interface Interval {
    start: number;
    end: number;
}
 
interface IntervalNode {
    interval: Interval;
    left: IntervalNode | null;
    right: IntervalNode | null;
    maxEnd: number;  // Maximum 'end' value in this subtree
}
 
/**
 * Get the maximum end value in a subtree.
 */
function getMaxEnd(node: IntervalNode | null): number {
    return node === null ? -Infinity : node.maxEnd;
}
 
/**
 * Update the maxEnd augmentation after modification.
 */
function updateMaxEnd(node: IntervalNode): void {
    node.maxEnd = Math.max(
        node.interval.end,
        getMaxEnd(node.left),
        getMaxEnd(node.right)
    );
}
 
/**
 * Insert an interval. BST is keyed by interval.start.
 */
function insertInterval(
    root: IntervalNode | null,
    interval: Interval
): IntervalNode {
    if (root === null) {
        return {
            interval,
            left: null,
            right: null,
            maxEnd: interval.end,
        };
    }
 
    if (interval.start < root.interval.start) {
        root.left = insertInterval(root.left, interval);
    } else {
        root.right = insertInterval(root.right, interval);
    }
 
    updateMaxEnd(root);
    return root;
}
 
/**
 * Two intervals overlap if: a.start <= b.end AND b.start <= a.end
 */
function intervalsOverlap(a: Interval, b: Interval): boolean {
    return a.start <= b.end && b.start <= a.end;
}
 
/**
 * Find ONE interval that overlaps with the query.
 * Time: O(log n) expected
 */
function findOverlapping(
    node: IntervalNode | null,
    query: Interval
): Interval | null {
    if (node === null) return null;
 
    // Check current node
    if (intervalsOverlap(node.interval, query)) {
        return node.interval;
    }
 
    // Decide which subtree to search:
    // If left subtree exists AND its maxEnd >= query.start,
    // there MIGHT be an overlapping interval on the left.
    if (node.left !== null && node.left.maxEnd >= query.start) {
        const found = findOverlapping(node.left, query);
        if (found !== null) return found;
    }
 
    // Otherwise, search right subtree
    return findOverlapping(node.right, query);
}
 
/**
 * Find ALL intervals that overlap with the query.
 * Time: O(log n + k) where k is the number of overlapping intervals
 */
function findAllOverlapping(
    node: IntervalNode | null,
    query: Interval,
    results: Interval[] = []
): Interval[] {
    if (node === null) return results;
 
    // Add current node if it overlaps
    if (intervalsOverlap(node.interval, query)) {
        results.push(node.interval);
    }
 
    // Search left if there might be overlaps
    if (node.left !== null && node.left.maxEnd >= query.start) {
        findAllOverlapping(node.left, query, results);
    }
 
    // Search right if there might be overlaps
    // The right subtree can have overlaps if its intervals start
    // before the query ends
    if (node.right !== null && node.interval.start <= query.end) {
        findAllOverlapping(node.right, query, results);
    }
 
    return results;
}

The Power of maxEnd

The maxEnd field enables powerful pruning. When searching for overlaps with query [q_start, q_end], if a subtree's maxEnd < q_start, we know NO interval in that subtree can overlap with the query (all intervals end before the query starts). This single augmented field transforms O(n) searching into O(log n + k).

Interval tree complexity:

Operation	Time Complexity
Insert	O(log n)
Delete	O(log n)
Find any overlap	O(log n)
Find all k overlaps	O(log n + k)
Space	O(n)

The interval tree is a textbook example of how a single augmented field (maxEnd) enables an entirely new category of queries while preserving the O(log n) modification complexity.

Multi-Field Augmentation

We're not limited to a single augmented field. Complex applications often require multiple augmentations that work together:

Example: A comprehensive statistics tree

Imagine we need:

Order statistics (k-th element, rank)
Range sum queries
Min/max queries
All with O(log n) performance

We can combine multiple augmentations in a single structure:

multi-augmented-bst.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
interface StatisticsNode {
    key: number;
    value: number;
    left: StatisticsNode | null;
    right: StatisticsNode | null;
    
    // Multiple augmentations
    size: number;       // For order statistics
    sum: number;        // For range sum queries
    min: number;        // Minimum key in subtree
    max: number;        // Maximum key in subtree
    height: number;     // For balance checking
}
 
/**
 * Helper functions for null-safe access
 */
function getSize(node: StatisticsNode | null): number {
    return node === null ? 0 : node.size;
}
 
function getSum(node: StatisticsNode | null): number {
    return node === null ? 0 : node.sum;
}
 
function getMin(node: StatisticsNode | null): number {
    return node === null ? Infinity : node.min;
}
 
function getMax(node: StatisticsNode | null): number {
    return node === null ? -Infinity : node.max;
}
 
function getHeight(node: StatisticsNode | null): number {
    return node === null ? -1 : node.height;
}
 
/**
 * Update ALL augmented fields after modification.
 * Each field follows the augmentation theorem:
 * - Computable from self and children only
 * - O(1) computation time
 */
function updateAugmentations(node: StatisticsNode): void {
    node.size = 1 + getSize(node.left) + getSize(node.right);
    node.sum = node.value + getSum(node.left) + getSum(node.right);
    node.min = Math.min(node.key, getMin(node.left), getMin(node.right));
    node.max = Math.max(node.key, getMax(node.left), getMax(node.right));
    node.height = 1 + Math.max(getHeight(node.left), getHeight(node.right));
}
 
/**
 * Create a new leaf node with all augmentations initialized.
 */
function createStatsNode(key: number, value: number): StatisticsNode {
    return {
        key,
        value,
        left: null,
        right: null,
        size: 1,
        sum: value,
        min: key,
        max: key,
        height: 0,
    };
}
 
/**
 * Now we can implement multiple operations efficiently:
 */
 
// Order statistics: Select k-th element
function select(node: StatisticsNode | null, k: number): StatisticsNode | null {
    if (node === null || k < 1 || k > getSize(node)) return null;
    
    const leftSize = getSize(node.left);
    if (k <= leftSize) return select(node.left, k);
    if (k === leftSize + 1) return node;
    return select(node.right, k - leftSize - 1);
}
 
// Range sum: sum of values in key range [lo, hi]
function rangeSum(
    node: StatisticsNode | null,
    lo: number,
    hi: number
): number {
    if (node === null) return 0;
    
    // Use min/max to check if entire subtree is in range
    if (node.min >= lo && node.max <= hi) {
        return node.sum;  // Entire subtree in range!
    }
    
    if (node.max < lo || node.min > hi) {
        return 0;  // Entire subtree out of range
    }
    
    // Partial overlap: recurse
    let result = 0;
    if (node.key >= lo && node.key <= hi) {
        result += node.value;
    }
    result += rangeSum(node.left, lo, hi);
    result += rangeSum(node.right, lo, hi);
    return result;
}
 
// Check AVL balance using height
function isBalanced(node: StatisticsNode | null): boolean {
    if (node === null) return true;
    const balanceFactor = getHeight(node.left) - getHeight(node.right);
    return Math.abs(balanceFactor) <= 1 &&
           isBalanced(node.left) &&
           isBalanced(node.right);
}

Augmentation Overhead

Each augmented field adds: • O(1) space per node • O(1) time per modification (for updates)

With k augmented fields, insertions and deletions become O(k × log n). For a fixed number of augmentations, this is still O(log n). Be mindful of adding too many augmentations if modification performance is critical.

Designing Custom Augmentations

When you need an operation not covered by standard augmentations, follow this systematic process:

Step 1: Define the query precisely

What exactly do you need to compute? Write it mathematically.

Step 2: Identify what information would help

What aggregate data, if available at each node, would let you answer the query without full traversal?

Step 3: Check the augmentation theorem

Can this data be computed from the node itself and its children's augmented values? If not, the augmentation won't be efficient.

Step 4: Design the update formula

Write the explicit recurrence for computing the field from children.

Step 5: Handle edge cases

What value does a null (empty) subtree contribute? This becomes your base case.

Worked example: Second-maximum in subtree

Query: For any subtree, find the second-largest key.

Why useful: Efficiently find 'runner-up' in range queries.

Analysis:

Can we compute secondMax from children? Let's think:

From left child: max_L, secondMax_L
From right child: max_R, secondMax_R
Current node: key

The secondMax of current subtree is the second element in: {key, max_L, max_R, secondMax_L, secondMax_R}

This can be computed in O(1) by sorting 5 values and taking the second.

second-max-augmentation.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
interface SecondMaxNode {
    key: number;
    left: SecondMaxNode | null;
    right: SecondMaxNode | null;
    max: number;        // Maximum in subtree
    secondMax: number;  // Second maximum in subtree
}
 
/**
 * Compute max and secondMax from children.
 * 
 * We need the two largest values among:
 * - Current node's key
 * - Left child's max and secondMax
 * - Right child's max and secondMax
 */
function updateSecondMax(node: SecondMaxNode): void {
    // Collect all candidate values
    const candidates: number[] = [node.key];
    
    if (node.left !== null) {
        candidates.push(node.left.max);
        if (node.left.secondMax !== -Infinity) {
            candidates.push(node.left.secondMax);
        }
    }
    
    if (node.right !== null) {
        candidates.push(node.right.max);
        if (node.right.secondMax !== -Infinity) {
            candidates.push(node.right.secondMax);
        }
    }
    
    // Find max and second max from candidates
    let max1 = -Infinity;
    let max2 = -Infinity;
    
    for (const val of candidates) {
        if (val > max1) {
            max2 = max1;
            max1 = val;
        } else if (val > max2 && val !== max1) {
            max2 = val;
        }
    }
    
    node.max = max1;
    node.secondMax = max2;
}
 
/**
 * Edge case: When is secondMax = -Infinity?
 * - Single-node subtree: only one value, no second max
 * - All nodes have same key: no distinct second max
 * 
 * Client code should check for secondMax === -Infinity to handle these.
 */
 
/**
 * Application: Find second-largest in a key range [lo, hi]
 * 
 * This is more complex because we need to filter by range,
 * but the augmentation still helps by providing quick answers
 * for fully-contained subtrees.
 */
function secondMaxInRange(
    node: SecondMaxNode | null,
    lo: number,
    hi: number
): number {
    if (node === null) return -Infinity;
    
    // If entire subtree is in range, use augmented values
    // (Would need subtreeMin, subtreeMax augmentations for this check)
    
    // For now, collect and process
    const values: number[] = [];
    collectInRange(node, lo, hi, values);
    
    // Sort and get second max
    values.sort((a, b) => b - a);
    return values.length >= 2 ? values[1] : -Infinity;
}
 
function collectInRange(
    node: SecondMaxNode | null,
    lo: number,
    hi: number,
    results: number[]
): void {
    if (node === null) return;
    
    if (node.key > lo) collectInRange(node.left, lo, hi, results);
    if (node.key >= lo && node.key <= hi) results.push(node.key);
    if (node.key < hi) collectInRange(node.right, lo, hi, results);
}

Augmentation with Self-Balancing Trees

For augmented BSTs to be truly useful, we need O(log n) height—which requires self-balancing. The good news: augmentations that satisfy the augmentation theorem work seamlessly with AVL or Red-Black trees.

Key insight: Rotations (the core balancing operation) affect only O(1) nodes at a time. After each rotation, we update augmented fields on the affected nodes. Since each update is O(1), rotations remain O(1), and balancing remains O(log n).

augmented-rotation.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
interface AugmentedNode<T> {
    key: number;
    value: T;
    left: AugmentedNode<T> | null;
    right: AugmentedNode<T> | null;
    height: number;  // For AVL balance
    size: number;    // For order statistics
}
 
/**
 * Right rotation with augmentation updates.
 * 
 *       Y                 X
 *      / \               / \
 *     X   γ     →       α   Y
 *    / \                   / \
 *   α   β                 β   γ
 * 
 * Both X and Y may have augmented fields that need updating.
 * Order matters: update Y first (now lower), then X (now higher).
 */
function rotateRight<T>(y: AugmentedNode<T>): AugmentedNode<T> {
    const x = y.left!;
    const beta = x.right;
    
    // Perform rotation
    x.right = y;
    y.left = beta;
    
    // Update augmentations bottom-up
    updateAugmentations(y);  // Y is now lower
    updateAugmentations(x);  // X is now root, uses Y's updated values
    
    return x;
}
 
/**
 * Left rotation with augmentation updates.
 * Mirror of right rotation.
 */
function rotateLeft<T>(x: AugmentedNode<T>): AugmentedNode<T> {
    const y = x.right!;
    const beta = y.left;
    
    // Perform rotation
    y.left = x;
    x.right = beta;
    
    // Update augmentations bottom-up
    updateAugmentations(x);  // X is now lower
    updateAugmentations(y);  // Y is now root
    
    return y;
}
 
function updateAugmentations<T>(node: AugmentedNode<T>): void {
    node.height = 1 + Math.max(
        getHeight(node.left),
        getHeight(node.right)
    );
    node.size = 1 + getSize(node.left) + getSize(node.right);
}
 
function getHeight<T>(node: AugmentedNode<T> | null): number {
    return node === null ? -1 : node.height;
}
 
function getSize<T>(node: AugmentedNode<T> | null): number {
    return node === null ? 0 : node.size;
}
 
/**
 * Complete AVL insert with augmentation maintenance.
 */
function avlInsertAugmented<T>(
    node: AugmentedNode<T> | null,
    key: number,
    value: T
): AugmentedNode<T> {
    // Standard BST insert
    if (node === null) {
        return { key, value, left: null, right: null, height: 0, size: 1 };
    }
 
    if (key < node.key) {
        node.left = avlInsertAugmented(node.left, key, value);
    } else {
        node.right = avlInsertAugmented(node.right, key, value);
    }
 
    // Update augmentations before potential rotations
    updateAugmentations(node);
 
    // Get balance factor
    const balance = getHeight(node.left) - getHeight(node.right);
 
    // Left-Left case
    if (balance > 1 && key < node.left!.key) {
        return rotateRight(node);
    }
 
    // Right-Right case
    if (balance < -1 && key > node.right!.key) {
        return rotateLeft(node);
    }
 
    // Left-Right case
    if (balance > 1 && key > node.left!.key) {
        node.left = rotateLeft(node.left!);
        return rotateRight(node);
    }
 
    // Right-Left case
    if (balance < -1 && key < node.right!.key) {
        node.right = rotateRight(node.right!);
        return rotateLeft(node);
    }
 
    return node;
}

The Bottom-Up Rule

After any restructuring (rotation, node removal), always update augmentations from bottom to top. A node's augmented field may depend on its children's fields, so children must be updated first. Missing or incorrect update order is a common source of bugs in augmented tree implementations.

Real-World Augmented BST Applications

Augmented BSTs power critical systems across many domains:

Text Editors (Rope Data Structure)

•Structure: Self-balancing BST where each node represents a text chunk
•Augmentation: Subtree weight (total characters in subtree)
•Query: Find character at position i in O(log n) navigating by weights
•Operations: Insert/delete text chunks with O(log n) rebalancing
•Used by: VS Code, Sublime Text, other modern editors

IP Routing Tables

•Structure: Patricia trie (variant of BST on bit strings)
•Augmentation: Longest prefix match information
•Query: Given IP address, find longest matching prefix for routing
•Scale: Millions of routing entries with O(log n) lookups
•Used by: Core internet routers

Database B+ Trees

•Structure: Multi-way search tree (BST generalization)
•Augmentation: Key counts, min/max values per node
•Query: Range scans, cardinality estimation, ordered iteration
•Operations: Efficient bulk insert, split/merge for parallel access
•Used by: PostgreSQL, MySQL, Oracle, all major databases

Geographic Information Systems (R-Trees)

•Structure: Multi-dimensional BST variant
•Augmentation: Bounding box of all objects in subtree
•Query: Find all objects intersecting a rectangular region
•Operations: Efficient insertion with node splitting
•Used by: PostGIS, Google Maps, any spatial database

Summary: Augmented BSTs

We've explored the powerful pattern of tree augmentation. Let's consolidate the key insights:

Key Takeaways

•Augmentation adds computed fields to nodes — Store aggregate information that enables new query types
•The augmentation theorem defines what works — Field must be computable from self + children in O(1) time
•Multiple augmentations can coexist — Combine size, sum, min, max, height as needed
•Interval trees showcase the power — Single maxEnd field enables O(log n) overlap queries
•Custom augmentations follow a systematic process — Define query, identify needed aggregates, verify theorem, implement
•Compatible with self-balancing — Update augmentations bottom-up after rotations
•Underlies many real-world systems — Text editors, databases, GIS, routing all use augmented trees

What's next:

Our exploration of BST extensions concludes with understanding how BSTs connect to the broader family of balanced tree structures. The final page examines the connection to other balanced trees—how concepts from BSTs extend to AVL trees, Red-Black trees, B-trees, and beyond, and how to choose the right variant for your needs.

Page Complete

You now understand augmented BSTs as a general pattern—how to identify valid augmentations, implement them correctly, maintain them through modifications and rotations, and apply them to solve real-world problems. This knowledge transforms BSTs from a basic data structure into a flexible framework for building specialized solutions.

3 / 4

Loading learning content...

Data Structures & AlgorithmsBST Variants & Extensions

BST Variants & Extensions

LevelIntermediate

Duration90 mins

TopicBST Variants & Extensions

3 / 4

Augmented BSTs

Beyond Basic BST: The Augmentation Pattern

This page generalizes the augmentation pattern, exploring multiple examples and providing a framework for designing your own augmentations.

What You Will Master

The Augmentation Theorem

Not all node properties can be efficiently maintained during tree modifications. The Augmentation Theorem tells us exactly what can work:

Augmentation Theorem (informal): A field f can be efficiently maintained in a BST if and only if f at any node can be computed solely from:

Information stored in the node itself (key, value, other node fields)

f values of its children

O(1) additional information

In formal terms, if we can express:

node.f = compute(node.key, node.value, node.left.f, node.right.f)

...where compute runs in O(1) time, then the augmentation is efficient.

Why This Works

Examples of valid augmentations:

Augmentation	Formula	Use Case
Subtree size	`1 + size(left) + size(right)`	Order statistics
Subtree height	`1 + max(height(left), height(right))`	AVL balance
Subtree max	`max(key, max(left), max(right))`	Interval trees
Subtree sum	`value + sum(left) + sum(right)`	Range sum queries
Black height	Based on children's black heights	Red-Black trees

Examples of invalid augmentations:

Attempted Augmentation	Why It Fails
Ancestor count	Depends on path to root, not children
Median of subtree	Requires O(n) knowledge of all subtree values
Parent's key	Requires parent reference, not derived from children
Depth	Depends on ancestors, not descendants

The Locality Principle

Augmentation for Range Queries

One of the most powerful applications of augmentation is enabling efficient range queries. Consider a BST where we want to quickly answer: 'What is the sum of all values with keys in range [L, R]?'

Naive approach: Traverse all nodes, check if key is in range, sum values. Time: O(n).

range-sum-bst.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
interface RangeSumNode {
    key: number;
    value: number;
    left: RangeSumNode | null;
    right: RangeSumNode | null;
    subtreeSum: number;  // Sum of values in this subtree
}
 
function getSum(node: RangeSumNode | null): number {
    return node === null ? 0 : node.subtreeSum;
}
 
function updateNode(node: RangeSumNode): void {
    node.subtreeSum = node.value + getSum(node.left) + getSum(node.right);
}
 
/**
 * Range sum query: sum of values where key ∈ [low, high]
 * 
 * Time Complexity: O(log n) when range spans a small fraction of keys
 *                  O(n) worst case when range includes all keys
 * 
 * The key insight: we don't need to visit every node in range.
 * We can use subtreeSum to include entire subtrees when they're
 * fully contained in the range.
 */
function rangeSum(
    node: RangeSumNode | null,
    low: number,
    high: number
): number {
    if (node === null) return 0;
 
    // Case 1: Current node is below range - search right only
    if (node.key < low) {
        return rangeSum(node.right, low, high);
    }
 
    // Case 2: Current node is above range - search left only
    if (node.key > high) {
        return rangeSum(node.left, low, high);
    }
 
    // Case 3: Current node is in range [low, high]
    // Include this node's value plus contributions from children
    let sum = node.value;
 
    // Left child: we need keys >= low
    if (node.left !== null) {
        if (node.left.key >= low) {
            // Entire left subtree might be in range
            sum += rangeSumOptimized(node.left, low, high);
        } else {
            sum += rangeSum(node.left, low, high);
        }
    }
 
    // Right child: we need keys <= high
    if (node.right !== null) {
        if (node.right.key <= high) {
            // Entire right subtree might be in range
            sum += rangeSumOptimized(node.right, low, high);
        } else {
            sum += rangeSum(node.right, low, high);
        }
    }
 
    return sum;
}
 
/**
 * Optimized range sum when we know entire subtree is in range.
 * Uses the subtreeSum augmentation directly.
 */
function rangeSumOptimized(
    node: RangeSumNode | null,
    low: number,
    high: number
): number {
    if (node === null) return 0;
 
    // Check if entire subtree is within range
    // (This requires also storing subtreeMin and subtreeMax)
    // For simplicity, we do standard rangeSum
    return rangeSum(node, low, high);
}

Enhanced version with min/max augmentation:

To truly achieve O(log n) range sums, we can add subtreeMin and subtreeMax fields. When processing a node:

If [subtreeMin, subtreeMax] is entirely within [low, high] → return subtreeSum directly
If [subtreeMin, subtreeMax] is entirely outside [low, high] → return 0
Otherwise → recurse into children

This allows us to short-circuit entire subtrees, achieving O(log n) for typical range queries.

Interval Trees: A Classic Augmentation

Interval Trees are a canonical example of BST augmentation. They solve the interval overlap query problem:

Given a set of intervals [start, end], efficiently find all intervals that overlap with a query interval [q_start, q_end].

Applications include:

Calendar systems: Find meetings that conflict with a proposed time slot
Computational geometry: Find segments intersecting a query region
Database systems: Range predicate optimization
Network management: IP ranges containing a given address

The augmentation: Each node stores the maximum endpoint value in its subtree.

interval-tree.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
interface Interval {
    start: number;
    end: number;
}
 
interface IntervalNode {
    interval: Interval;
    left: IntervalNode | null;
    right: IntervalNode | null;
    maxEnd: number;  // Maximum 'end' value in this subtree
}
 
/**
 * Get the maximum end value in a subtree.
 */
function getMaxEnd(node: IntervalNode | null): number {
    return node === null ? -Infinity : node.maxEnd;
}
 
/**
 * Update the maxEnd augmentation after modification.
 */
function updateMaxEnd(node: IntervalNode): void {
    node.maxEnd = Math.max(
        node.interval.end,
        getMaxEnd(node.left),
        getMaxEnd(node.right)
    );
}
 
/**
 * Insert an interval. BST is keyed by interval.start.
 */
function insertInterval(
    root: IntervalNode | null,
    interval: Interval
): IntervalNode {
    if (root === null) {
        return {
            interval,
            left: null,
            right: null,
            maxEnd: interval.end,
        };
    }
 
    if (interval.start < root.interval.start) {
        root.left = insertInterval(root.left, interval);
    } else {
        root.right = insertInterval(root.right, interval);
    }
 
    updateMaxEnd(root);
    return root;
}
 
/**
 * Two intervals overlap if: a.start <= b.end AND b.start <= a.end
 */
function intervalsOverlap(a: Interval, b: Interval): boolean {
    return a.start <= b.end && b.start <= a.end;
}
 
/**
 * Find ONE interval that overlaps with the query.
 * Time: O(log n) expected
 */
function findOverlapping(
    node: IntervalNode | null,
    query: Interval
): Interval | null {
    if (node === null) return null;
 
    // Check current node
    if (intervalsOverlap(node.interval, query)) {
        return node.interval;
    }
 
    // Decide which subtree to search:
    // If left subtree exists AND its maxEnd >= query.start,
    // there MIGHT be an overlapping interval on the left.
    if (node.left !== null && node.left.maxEnd >= query.start) {
        const found = findOverlapping(node.left, query);
        if (found !== null) return found;
    }
 
    // Otherwise, search right subtree
    return findOverlapping(node.right, query);
}
 
/**
 * Find ALL intervals that overlap with the query.
 * Time: O(log n + k) where k is the number of overlapping intervals
 */
function findAllOverlapping(
    node: IntervalNode | null,
    query: Interval,
    results: Interval[] = []
): Interval[] {
    if (node === null) return results;
 
    // Add current node if it overlaps
    if (intervalsOverlap(node.interval, query)) {
        results.push(node.interval);
    }
 
    // Search left if there might be overlaps
    if (node.left !== null && node.left.maxEnd >= query.start) {
        findAllOverlapping(node.left, query, results);
    }
 
    // Search right if there might be overlaps
    // The right subtree can have overlaps if its intervals start
    // before the query ends
    if (node.right !== null && node.interval.start <= query.end) {
        findAllOverlapping(node.right, query, results);
    }
 
    return results;
}

The Power of maxEnd

Interval tree complexity:

Operation	Time Complexity
Insert	O(log n)
Delete	O(log n)
Find any overlap	O(log n)
Find all k overlaps	O(log n + k)
Space	O(n)

The interval tree is a textbook example of how a single augmented field (maxEnd) enables an entirely new category of queries while preserving the O(log n) modification complexity.

Multi-Field Augmentation

We're not limited to a single augmented field. Complex applications often require multiple augmentations that work together:

Example: A comprehensive statistics tree

Imagine we need:

Order statistics (k-th element, rank)
Range sum queries
Min/max queries
All with O(log n) performance

We can combine multiple augmentations in a single structure:

multi-augmented-bst.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
interface StatisticsNode {
    key: number;
    value: number;
    left: StatisticsNode | null;
    right: StatisticsNode | null;
    
    // Multiple augmentations
    size: number;       // For order statistics
    sum: number;        // For range sum queries
    min: number;        // Minimum key in subtree
    max: number;        // Maximum key in subtree
    height: number;     // For balance checking
}
 
/**
 * Helper functions for null-safe access
 */
function getSize(node: StatisticsNode | null): number {
    return node === null ? 0 : node.size;
}
 
function getSum(node: StatisticsNode | null): number {
    return node === null ? 0 : node.sum;
}
 
function getMin(node: StatisticsNode | null): number {
    return node === null ? Infinity : node.min;
}
 
function getMax(node: StatisticsNode | null): number {
    return node === null ? -Infinity : node.max;
}
 
function getHeight(node: StatisticsNode | null): number {
    return node === null ? -1 : node.height;
}
 
/**
 * Update ALL augmented fields after modification.
 * Each field follows the augmentation theorem:
 * - Computable from self and children only
 * - O(1) computation time
 */
function updateAugmentations(node: StatisticsNode): void {
    node.size = 1 + getSize(node.left) + getSize(node.right);
    node.sum = node.value + getSum(node.left) + getSum(node.right);
    node.min = Math.min(node.key, getMin(node.left), getMin(node.right));
    node.max = Math.max(node.key, getMax(node.left), getMax(node.right));
    node.height = 1 + Math.max(getHeight(node.left), getHeight(node.right));
}
 
/**
 * Create a new leaf node with all augmentations initialized.
 */
function createStatsNode(key: number, value: number): StatisticsNode {
    return {
        key,
        value,
        left: null,
        right: null,
        size: 1,
        sum: value,
        min: key,
        max: key,
        height: 0,
    };
}
 
/**
 * Now we can implement multiple operations efficiently:
 */
 
// Order statistics: Select k-th element
function select(node: StatisticsNode | null, k: number): StatisticsNode | null {
    if (node === null || k < 1 || k > getSize(node)) return null;
    
    const leftSize = getSize(node.left);
    if (k <= leftSize) return select(node.left, k);
    if (k === leftSize + 1) return node;
    return select(node.right, k - leftSize - 1);
}
 
// Range sum: sum of values in key range [lo, hi]
function rangeSum(
    node: StatisticsNode | null,
    lo: number,
    hi: number
): number {
    if (node === null) return 0;
    
    // Use min/max to check if entire subtree is in range
    if (node.min >= lo && node.max <= hi) {
        return node.sum;  // Entire subtree in range!
    }
    
    if (node.max < lo || node.min > hi) {
        return 0;  // Entire subtree out of range
    }
    
    // Partial overlap: recurse
    let result = 0;
    if (node.key >= lo && node.key <= hi) {
        result += node.value;
    }
    result += rangeSum(node.left, lo, hi);
    result += rangeSum(node.right, lo, hi);
    return result;
}
 
// Check AVL balance using height
function isBalanced(node: StatisticsNode | null): boolean {
    if (node === null) return true;
    const balanceFactor = getHeight(node.left) - getHeight(node.right);
    return Math.abs(balanceFactor) <= 1 &&
           isBalanced(node.left) &&
           isBalanced(node.right);
}

Augmentation Overhead

Each augmented field adds: • O(1) space per node • O(1) time per modification (for updates)

Designing Custom Augmentations

When you need an operation not covered by standard augmentations, follow this systematic process:

Step 1: Define the query precisely

What exactly do you need to compute? Write it mathematically.

Step 2: Identify what information would help

What aggregate data, if available at each node, would let you answer the query without full traversal?

Step 3: Check the augmentation theorem

Can this data be computed from the node itself and its children's augmented values? If not, the augmentation won't be efficient.

Step 4: Design the update formula

Write the explicit recurrence for computing the field from children.

Step 5: Handle edge cases

What value does a null (empty) subtree contribute? This becomes your base case.

Worked example: Second-maximum in subtree

Query: For any subtree, find the second-largest key.

Why useful: Efficiently find 'runner-up' in range queries.

Analysis:

Can we compute secondMax from children? Let's think:

From left child: max_L, secondMax_L
From right child: max_R, secondMax_R
Current node: key

The secondMax of current subtree is the second element in: {key, max_L, max_R, secondMax_L, secondMax_R}

This can be computed in O(1) by sorting 5 values and taking the second.

second-max-augmentation.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
interface SecondMaxNode {
    key: number;
    left: SecondMaxNode | null;
    right: SecondMaxNode | null;
    max: number;        // Maximum in subtree
    secondMax: number;  // Second maximum in subtree
}
 
/**
 * Compute max and secondMax from children.
 * 
 * We need the two largest values among:
 * - Current node's key
 * - Left child's max and secondMax
 * - Right child's max and secondMax
 */
function updateSecondMax(node: SecondMaxNode): void {
    // Collect all candidate values
    const candidates: number[] = [node.key];
    
    if (node.left !== null) {
        candidates.push(node.left.max);
        if (node.left.secondMax !== -Infinity) {
            candidates.push(node.left.secondMax);
        }
    }
    
    if (node.right !== null) {
        candidates.push(node.right.max);
        if (node.right.secondMax !== -Infinity) {
            candidates.push(node.right.secondMax);
        }
    }
    
    // Find max and second max from candidates
    let max1 = -Infinity;
    let max2 = -Infinity;
    
    for (const val of candidates) {
        if (val > max1) {
            max2 = max1;
            max1 = val;
        } else if (val > max2 && val !== max1) {
            max2 = val;
        }
    }
    
    node.max = max1;
    node.secondMax = max2;
}
 
/**
 * Edge case: When is secondMax = -Infinity?
 * - Single-node subtree: only one value, no second max
 * - All nodes have same key: no distinct second max
 * 
 * Client code should check for secondMax === -Infinity to handle these.
 */
 
/**
 * Application: Find second-largest in a key range [lo, hi]
 * 
 * This is more complex because we need to filter by range,
 * but the augmentation still helps by providing quick answers
 * for fully-contained subtrees.
 */
function secondMaxInRange(
    node: SecondMaxNode | null,
    lo: number,
    hi: number
): number {
    if (node === null) return -Infinity;
    
    // If entire subtree is in range, use augmented values
    // (Would need subtreeMin, subtreeMax augmentations for this check)
    
    // For now, collect and process
    const values: number[] = [];
    collectInRange(node, lo, hi, values);
    
    // Sort and get second max
    values.sort((a, b) => b - a);
    return values.length >= 2 ? values[1] : -Infinity;
}
 
function collectInRange(
    node: SecondMaxNode | null,
    lo: number,
    hi: number,
    results: number[]
): void {
    if (node === null) return;
    
    if (node.key > lo) collectInRange(node.left, lo, hi, results);
    if (node.key >= lo && node.key <= hi) results.push(node.key);
    if (node.key < hi) collectInRange(node.right, lo, hi, results);
}

Augmentation with Self-Balancing Trees

augmented-rotation.ts
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
interface AugmentedNode<T> {
    key: number;
    value: T;
    left: AugmentedNode<T> | null;
    right: AugmentedNode<T> | null;
    height: number;  // For AVL balance
    size: number;    // For order statistics
}
 
/**
 * Right rotation with augmentation updates.
 * 
 *       Y                 X
 *      / \               / \
 *     X   γ     →       α   Y
 *    / \                   / \
 *   α   β                 β   γ
 * 
 * Both X and Y may have augmented fields that need updating.
 * Order matters: update Y first (now lower), then X (now higher).
 */
function rotateRight<T>(y: AugmentedNode<T>): AugmentedNode<T> {
    const x = y.left!;
    const beta = x.right;
    
    // Perform rotation
    x.right = y;
    y.left = beta;
    
    // Update augmentations bottom-up
    updateAugmentations(y);  // Y is now lower
    updateAugmentations(x);  // X is now root, uses Y's updated values
    
    return x;
}
 
/**
 * Left rotation with augmentation updates.
 * Mirror of right rotation.
 */
function rotateLeft<T>(x: AugmentedNode<T>): AugmentedNode<T> {
    const y = x.right!;
    const beta = y.left;
    
    // Perform rotation
    y.left = x;
    x.right = beta;
    
    // Update augmentations bottom-up
    updateAugmentations(x);  // X is now lower
    updateAugmentations(y);  // Y is now root
    
    return y;
}
 
function updateAugmentations<T>(node: AugmentedNode<T>): void {
    node.height = 1 + Math.max(
        getHeight(node.left),
        getHeight(node.right)
    );
    node.size = 1 + getSize(node.left) + getSize(node.right);
}
 
function getHeight<T>(node: AugmentedNode<T> | null): number {
    return node === null ? -1 : node.height;
}
 
function getSize<T>(node: AugmentedNode<T> | null): number {
    return node === null ? 0 : node.size;
}
 
/**
 * Complete AVL insert with augmentation maintenance.
 */
function avlInsertAugmented<T>(
    node: AugmentedNode<T> | null,
    key: number,
    value: T
): AugmentedNode<T> {
    // Standard BST insert
    if (node === null) {
        return { key, value, left: null, right: null, height: 0, size: 1 };
    }
 
    if (key < node.key) {
        node.left = avlInsertAugmented(node.left, key, value);
    } else {
        node.right = avlInsertAugmented(node.right, key, value);
    }
 
    // Update augmentations before potential rotations
    updateAugmentations(node);
 
    // Get balance factor
    const balance = getHeight(node.left) - getHeight(node.right);
 
    // Left-Left case
    if (balance > 1 && key < node.left!.key) {
        return rotateRight(node);
    }
 
    // Right-Right case
    if (balance < -1 && key > node.right!.key) {
        return rotateLeft(node);
    }
 
    // Left-Right case
    if (balance > 1 && key > node.left!.key) {
        node.left = rotateLeft(node.left!);
        return rotateRight(node);
    }
 
    // Right-Left case
    if (balance < -1 && key < node.right!.key) {
        node.right = rotateRight(node.right!);
        return rotateLeft(node);
    }
 
    return node;
}

The Bottom-Up Rule

Real-World Augmented BST Applications

Augmented BSTs power critical systems across many domains:

Text Editors (Rope Data Structure)

•Structure: Self-balancing BST where each node represents a text chunk
•Augmentation: Subtree weight (total characters in subtree)
•Query: Find character at position i in O(log n) navigating by weights
•Operations: Insert/delete text chunks with O(log n) rebalancing
•Used by: VS Code, Sublime Text, other modern editors

IP Routing Tables

•Structure: Patricia trie (variant of BST on bit strings)
•Augmentation: Longest prefix match information
•Query: Given IP address, find longest matching prefix for routing
•Scale: Millions of routing entries with O(log n) lookups
•Used by: Core internet routers

Database B+ Trees

•Structure: Multi-way search tree (BST generalization)
•Augmentation: Key counts, min/max values per node
•Query: Range scans, cardinality estimation, ordered iteration
•Operations: Efficient bulk insert, split/merge for parallel access
•Used by: PostgreSQL, MySQL, Oracle, all major databases

Geographic Information Systems (R-Trees)

•Structure: Multi-dimensional BST variant
•Augmentation: Bounding box of all objects in subtree
•Query: Find all objects intersecting a rectangular region
•Operations: Efficient insertion with node splitting
•Used by: PostGIS, Google Maps, any spatial database

Summary: Augmented BSTs

We've explored the powerful pattern of tree augmentation. Let's consolidate the key insights:

Key Takeaways

•Augmentation adds computed fields to nodes — Store aggregate information that enables new query types
•The augmentation theorem defines what works — Field must be computable from self + children in O(1) time
•Multiple augmentations can coexist — Combine size, sum, min, max, height as needed
•Interval trees showcase the power — Single maxEnd field enables O(log n) overlap queries
•Custom augmentations follow a systematic process — Define query, identify needed aggregates, verify theorem, implement
•Compatible with self-balancing — Update augmentations bottom-up after rotations
•Underlies many real-world systems — Text editors, databases, GIS, routing all use augmented trees

What's next:

Page Complete

3 / 4