Common Tree Patterns - Learning Module

Loading content...

0/276

Lowest Common Ancestor (LCA)

Finding the Meeting Point — Where Two Paths Converge

Consider two people in a large family tree who want to find their nearest common relative. Both can trace their ancestry back to the family's founding member, but somewhere along those ascending lines, their paths meet at their most recent shared ancestor.

This intuitive question captures the essence of the Lowest Common Ancestor (LCA) problem: given two nodes in a tree, find the deepest node that is an ancestor of both. 'Deepest' here means closest to the given nodes — the most recent point where their ancestral paths diverge.

LCA is far more than an elegant puzzle. It underpins numerous tree algorithms and appears in:

Version control systems (finding the merge base of two branches)
Network routing (finding the common backbone for two endpoints)
Computational biology (finding the most recent common ancestor of species)
Query optimization (finding common subexpressions in execution plans)

What You Will Master

By the end of this page, you will:

• Deeply understand what LCA means and why it matters • Implement the elegant recursive O(n) solution for binary trees • Apply BST properties to optimize LCA to O(h) time • Handle edge cases with precision (node not found, one node is ancestor of other) • Recognize LCA patterns in real-world engineering problems • Understand the path to O(log n) / O(1) query solutions for repeated queries

Understanding the LCA Problem

Formal Definition:

The Lowest Common Ancestor of two nodes p and q in a tree is the deepest node that has both p and q as descendants. A node is considered a descendant of itself (by convention), so if p is an ancestor of q, then the LCA of p and q is p itself.

Key Properties:

Existence: In a connected tree with both nodes present, an LCA always exists (worst case: the root)
Uniqueness: The LCA is always unique for any pair of nodes
Descendant convention: A node is an ancestor/descendant of itself
Path perspective: The LCA is where the paths from root to p and root to q last coincide

Why 'Lowest'?

The root is a common ancestor of all nodes, so we need to find the deepest (lowest in the tree, farthest from root) common ancestor. This is the most informative — it tells us exactly where the paths to p and q split.

Visual Example:

LCA Examples:

LCA(6, 2) = 5 — Both are descendants of 5; 5 is deeper than 3
LCA(6, 4) = 5 — 4 is under 2, 6 and 4's paths meet at 5
LCA(5, 1) = 3 — These are in different subtrees; only root is common
LCA(5, 4) = 5 — 4 is a descendant of 5, so LCA is 5 itself
LCA(7, 8) = 3 — They are in completely different subtrees

The 'Self-Ancestor' Convention:

The convention that a node is its own ancestor is computationally convenient and semantically sensible. If we ask 'what's the most recent common ancestor of me and my child?' the answer should be 'me' — I am the most recent point where our lineages meet.

LCA as Path Intersection

Conceptually, LCA is where two paths from the root diverge. If path-to-p is [3, 5, 6] and path-to-q is [3, 5, 2, 7], the common prefix is [3, 5], so LCA = 5. This 'path intersection' view motivates one algorithm: find both paths, then find the last common element.

The Elegant Recursive Solution

The recursive LCA algorithm for binary trees is remarkably elegant. It leverages a profound insight: we don't need to explicitly find paths — we can discover the LCA by exploring where p and q exist in the tree structure.

The Core Logic:

At each node, we ask a simple question: "Are p and q in my left subtree, my right subtree, or split across both?"

If both p and q are in the left subtree: LCA is in the left subtree
If both p and q are in the right subtree: LCA is in the right subtree
If p and q are split (one left, one right): Current node is the LCA!
If current node is p or q: Current node is the LCA (due to ancestor convention)

Why this works:

The algorithm implicitly traces paths by propagating found/not-found information upward. When a node receives 'found' signals from both subtrees, it knows it's the split point — the LCA.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
interface TreeNode {
    val: number;
    left: TreeNode | null;
    right: TreeNode | null;
}
 
/**
 * Finds the Lowest Common Ancestor of nodes p and q in a binary tree.
 * 
 * Key insight: Traverse the tree once, propagating "found" signals upward.
 * - If a node is p or q, return that node (found!)
 * - If left and right subtrees both return non-null, this node is LCA
 * - If only one side returns non-null, propagate that result up
 * 
 * Time: O(n) - visit each node at most once
 * Space: O(h) - recursion stack depth
 * 
 * ASSUMPTION: Both p and q exist in the tree. See notes for variants.
 */
function lowestCommonAncestor(
    root: TreeNode | null,
    p: TreeNode,
    q: TreeNode
): TreeNode | null {
    // Base case 1: empty tree
    if (root === null) {
        return null;
    }
    
    // Base case 2: current node is p or q
    // If we find p, we don't need to search further in this subtree
    // If q is below p, LCA is p. If q is elsewhere, parent will handle.
    if (root === p || root === q) {
        return root;
    }
    
    // Recursively search left and right subtrees
    const leftResult = lowestCommonAncestor(root.left, p, q);
    const rightResult = lowestCommonAncestor(root.right, p, q);
    
    // Case 1: Both subtrees returned something — current node is LCA
    // This means p is in one subtree and q is in the other
    if (leftResult !== null && rightResult !== null) {
        return root;
    }
    
    // Case 2: Only one subtree returned something — propagate up
    // Both p and q are in that subtree; LCA is deeper down
    return leftResult !== null ? leftResult : rightResult;
}

Critical Assumption: Both Nodes Exist

This classic solution assumes both p and q exist in the tree. If p exists but q doesn't, it incorrectly returns p as the LCA. For robust production code, you may need a two-pass approach: first verify both nodes exist, then find LCA. We'll cover this variant shortly.

Tracing the Algorithm:

Let's trace LCA(5, 1) in our example tree:

           3          <- Returns 3 (both found, split here)
          / \
         5   1        <- Left returns 5, Right returns 1
        / \ / \
       6  2 0  8

Step-by-step:

At 3: Not p(5) or q(1). Search children.
At 5: This IS p! Return 5. (Don't search 5's subtrees for LCA purposes)
At 1: This IS q! Return 1.
Back at 3: leftResult=5 (non-null), rightResult=1 (non-null)
- Both non-null → 3 is the LCA!

Another trace — LCA(6, 4) where 4 is under 5:

           3          <- Returns 5 (only left found anything)
          / \
         5   1        <- Returns 5 (both found in this subtree)
        / \ / \
       6  2 0  8      <- 6 returns 6, 2 returns 4
         / \
        7   4         <- 4 returns 4

At 6: This IS p! Return 6.
At 4: This IS q! Return 4.
At 2: Not p or q. Left returns null, right returns 4. Propagate 4.
At 5: Not p or q. Left returns 6, right returns 4. Both non-null — 5 is LCA!
At 3: Left returns 5, right returns null. Propagate 5.

Final answer: 5, which is correct!

Handling Edge Cases Robustly

The classic algorithm is elegant but has a subtle flaw: if only one of p or q exists in the tree, it returns that node — incorrectly suggesting it's the LCA. For production code, we need a more robust approach.

The Two-Pass Solution:

We can make the algorithm fully correct by tracking whether we actually found both nodes:

Traverse and search for LCA candidate
Verify that the candidate is valid (both nodes were actually found)

A Cleaner Single-Pass Approach:

We can track both the LCA and the count of found nodes in a single traversal.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
/**
 * Robust LCA that returns null if either p or q is not in the tree.
 * Uses an auxiliary structure to track both result and count.
 * 
 * Time: O(n)
 * Space: O(h)
 */
function lowestCommonAncestorRobust(
    root: TreeNode | null,
    p: TreeNode,
    q: TreeNode
): TreeNode | null {
    
    // Track: [potential LCA, count of found nodes (0, 1, or 2)]
    interface SearchResult {
        node: TreeNode | null;
        foundCount: number;
    }
    
    function search(current: TreeNode | null): SearchResult {
        if (current === null) {
            return { node: null, foundCount: 0 };
        }
        
        const leftResult = search(current.left);
        const rightResult = search(current.right);
        
        // Count how many of {p, q} we've found including current node
        let foundCount = leftResult.foundCount + rightResult.foundCount;
        if (current === p || current === q) {
            foundCount += 1;
        }
        
        // Determine the LCA candidate
        let lcaCandidate: TreeNode | null = null;
        
        // If both found and split here, this is the LCA
        if (leftResult.node !== null && rightResult.node !== null) {
            lcaCandidate = current;
        }
        // If current is p or q and something found below
        else if ((current === p || current === q) && 
                 (leftResult.foundCount > 0 || rightResult.foundCount > 0)) {
            lcaCandidate = current;
        }
        // Otherwise propagate up any LCA found below
        else {
            lcaCandidate = leftResult.node ?? rightResult.node;
        }
        
        return { node: lcaCandidate, foundCount };
    }
    
    const result = search(root);
    
    // Only return LCA if we actually found both nodes
    return result.foundCount === 2 ? result.node : null;
}

Edge Cases to Consider:

Scenario	Expected Behavior
p is ancestor of q	Return p
q is ancestor of p	Return q
p and q are the same node	Return that node
Only p exists in tree	Return null (robust version)
Neither p nor q exists	Return null (robust version)
Empty tree	Return null

Interview Tip:

In interviews, clarify assumptions! Ask: "Can I assume both nodes exist in the tree?" If yes, use the simple version. If no, use the robust version. This shows you understand the algorithm's limitations.

When Simple is Better

If the problem guarantees both nodes exist (common in LeetCode-style problems), use the simple algorithm. It's cleaner and easier to verify. Add robustness only when requirements demand it — premature complexity is its own bug.

LCA in Binary Search Trees — Leveraging Order

When the tree is a Binary Search Tree (BST), we can exploit the ordering property to find LCA without exploring the entire tree. The BST property tells us exactly which subtree to explore.

The BST LCA Insight:

In a BST, at any node with value V:

If both p.val and q.val are < V: Both are in left subtree
If both p.val and q.val are > V: Both are in right subtree
Otherwise (one ≤ V, one ≥ V): Current node is where they split — it's the LCA!

This leads to an O(h) algorithm that follows a single path from root to LCA, never needing to explore both subtrees.

Why it works:

The BST property guarantees that all nodes less than V are left, all greater are right. So if p < V and q > V, p and q must be in different subtrees. The current node is therefore the deepest common ancestor — the LCA.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
/**
 * Finds LCA in a Binary Search Tree using the BST property.
 * 
 * Key insight: BST ordering tells us which subtree to explore.
 * We never need to explore both subtrees — follow the guided path.
 * 
 * Time: O(h) where h is tree height (O(log n) balanced, O(n) skewed)
 * Space: O(1) iterative, O(h) recursive
 * 
 * ASSUMPTION: Both p and q exist and p.val ≠ q.val
 */
function lowestCommonAncestorBST(
    root: TreeNode | null,
    p: TreeNode,
    q: TreeNode
): TreeNode | null {
    // Iterative version for O(1) space
    let current = root;
    
    while (current !== null) {
        // Both values are smaller → LCA must be in left subtree
        if (p.val < current.val && q.val < current.val) {
            current = current.left;
        }
        // Both values are larger → LCA must be in right subtree
        else if (p.val > current.val && q.val > current.val) {
            current = current.right;
        }
        // Split! One is ≤, one is ≥ → current node is LCA
        // (This also handles when current === p or current === q)
        else {
            return current;
        }
    }
    
    // Should never reach here if both nodes exist
    return null;
}
 
/**
 * Recursive version — same logic, different style
 */
function lowestCommonAncestorBSTRecursive(
    root: TreeNode | null,
    p: TreeNode,
    q: TreeNode
): TreeNode | null {
    if (root === null) return null;
    
    if (p.val < root.val && q.val < root.val) {
        return lowestCommonAncestorBSTRecursive(root.left, p, q);
    }
    if (p.val > root.val && q.val > root.val) {
        return lowestCommonAncestorBSTRecursive(root.right, p, q);
    }
    
    return root;
}

The Split Condition

The condition (p.val <= current.val && q.val >= current.val) || (p.val >= current.val && q.val <= current.val) simplifies to: 'values are on different sides OR one equals the current node.' The else clause in our code elegantly captures this — if neither 'both left' nor 'both right' is true, we've found the split.

Complexity Comparison:

Approach	Time	Space	When to Use
General binary tree	O(n)	O(h)	Any binary tree
BST-specific	O(h)	O(1) or O(h)	When BST property is known

For a balanced BST, the BST-specific approach is O(log n) vs O(n) — a significant improvement.

Tracing the BST Algorithm:

LCA(2, 8) = 6:

At 6: 2 < 6 and 8 > 6 → Split! Return 6. ✓

LCA(3, 5) = 4:

At 6: 3 < 6 and 5 < 6 → Go left
At 2: 3 > 2 and 5 > 2 → Go right
At 4: 3 < 4 and 5 > 4 → Split! Return 4. ✓

LCA(0, 4) = 2:

At 6: 0 < 6 and 4 < 6 → Go left
At 2: 0 < 2 but 4 > 2 → Split! Return 2. ✓

Path-Based LCA — The Explicit Approach

An alternative approach makes the 'path intersection' intuition explicit: find paths from root to both nodes, then find where the paths diverge.

Algorithm:

Find path from root to p: [root, ..., p]
Find path from root to q: [root, ..., q]
Compare paths to find the last common node

This approach is conceptually clearer and can be useful when:

You need the actual path, not just the LCA
You're working with trees that have parent pointers
You want to extend to LCA of multiple nodes

Trade-off:

Less elegant: requires storing paths
Same asymptotic complexity: O(n) time, O(h) space
More flexible for extensions

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
/**
 * Finds LCA by explicitly computing paths from root to each node.
 * More explicit than the recursive solution, useful for debugging
 * or when you need the actual paths.
 * 
 * Time: O(n) - up to two full traversals
 * Space: O(h) - storing paths
 */
function lowestCommonAncestorPathBased(
    root: TreeNode | null,
    p: TreeNode,
    q: TreeNode
): TreeNode | null {
    
    /**
     * Find path from root to target node.
     * Returns null if target not found.
     */
    function findPath(
        current: TreeNode | null,
        target: TreeNode,
        path: TreeNode[]
    ): boolean {
        if (current === null) {
            return false;
        }
        
        // Add current node to path
        path.push(current);
        
        // Found target — path is complete
        if (current === target) {
            return true;
        }
        
        // Search children
        if (findPath(current.left, target, path) || 
            findPath(current.right, target, path)) {
            return true;
        }
        
        // Target not in this subtree — backtrack
        path.pop();
        return false;
    }
    
    // Find paths to both nodes
    const pathToP: TreeNode[] = [];
    const pathToQ: TreeNode[] = [];
    
    const foundP = findPath(root, p, pathToP);
    const foundQ = findPath(root, q, pathToQ);
    
    // If either not found, no LCA exists
    if (!foundP || !foundQ) {
        return null;
    }
    
    // Find the last common node in both paths
    // Both paths start at root, so compare from start
    let lca: TreeNode | null = null;
    const minLength = Math.min(pathToP.length, pathToQ.length);
    
    for (let i = 0; i < minLength; i++) {
        if (pathToP[i] === pathToQ[i]) {
            lca = pathToP[i];  // Still on common path
        } else {
            break;  // Paths diverged
        }
    }
    
    return lca;
}

When Path-Based is Preferable:

Debugging: You can print paths to understand what's happening
Multiple queries with same node: Cache the path to one node
Need distance: The path directly gives you the distance to LCA
Multiple nodes: Generalize to LCA of k nodes by finding common prefix of k paths
Parent pointers available: Can build paths by ascending instead of descending

LCA with Parent Pointers:

If nodes have parent pointers, finding LCA becomes similar to finding the intersection of two linked lists:

function lcaWithParent(p: NodeWithParent, q: NodeWithParent): NodeWithParent {
    const ancestors = new Set<NodeWithParent>();
    
    // Add all ancestors of p to set
    let current: NodeWithParent | null = p;
    while (current !== null) {
        ancestors.add(current);
        current = current.parent;
    }
    
    // Find first ancestor of q that's in the set
    current = q;
    while (current !== null) {
        if (ancestors.has(current)) {
            return current;
        }
        current = current.parent;
    }
    
    return null;  // Should never happen in valid tree
}

Advanced — LCA for Multiple Queries

What if we need to answer many LCA queries on the same tree? With n nodes and q queries, naive approach takes O(q × n) time. Can we do better by preprocessing?

Yes! Several approaches exist:

1. Euler Tour + Range Minimum Query (RMQ):

Preprocess: O(n log n) time and space
Query: O(1) time!
Key insight: LCA corresponds to the node with minimum depth in a range of the Euler tour

2. Binary Lifting:

Preprocess: O(n log n) time and space
Query: O(log n) time
Key insight: Precompute 2^k-th ancestors for each node

3. Sparse Table:

Preprocess: O(n log n)
Query: O(1)
Works with Euler tour approach

While implementing these is beyond our current scope, understanding they exist is crucial for system design.

LCA Algorithm Comparison for q Queries on n Nodes
Approach	Preprocess Time	Query Time	Space	Use Case
Naive (no preprocess)	—	O(n)	O(h)	Few queries
Binary Lifting	O(n log n)	O(log n)	O(n log n)	Balanced trade-off
Euler + Sparse Table	O(n log n)	O(1)	O(n log n)	Many queries
Euler + Segment Tree	O(n)	O(log n)	O(n)	Dynamic trees

The Euler Tour Insight

An Euler tour traverses every edge twice, creating a sequence where nodes appear multiple times. Between any two occurrences of nodes p and q, their LCA is the node with minimum depth. This reduces LCA to Range Minimum Query — a well-studied problem with O(1) solutions!

Binary Lifting Intuition:

We precompute ancestor[node][k] = the 2^k-th ancestor of node. Any distance can be expressed as sum of powers of 2, so we can 'jump' to any ancestor in O(log n) steps.

To find LCA(p, q):

Bring p and q to the same depth (binary jumps)
Jump both upward together until they meet

// Pseudocode for binary lifting LCA
function lcaWithLifting(p, q):
    // Step 1: Bring to same depth
    if depth[p] > depth[q]: swap(p, q)
    diff = depth[q] - depth[p]
    for k from log(n) down to 0:
        if diff has kth bit set:
            q = ancestor[q][k]  // Jump 2^k steps
    
    // Step 2: Jump together until meet
    if p == q: return p
    for k from log(n) down to 0:
        if ancestor[p][k] != ancestor[q][k]:
            p = ancestor[p][k]
            q = ancestor[q][k]
    
    return ancestor[p][0]  // Parent is LCA

This technique is extremely powerful and appears in many competitive programming problems.

Real-World Applications of LCA

LCA is far more than an algorithm exercise — it's a fundamental operation in systems you use every day.

1. Version Control Systems (Git):

When you merge two branches, Git finds their merge base — the LCA of their commit histories. This determines what changes each branch made:

         main
           |
     A --- B --- C
      \         /
       D --- E
           |
        feature

LCA(C, E) = A (the merge base)

Git's three-way merge uses LCA to identify:

What main changed (diff from A to C)
What feature changed (diff from A to E)
How to combine them

2. Network Routing:

In hierarchical networks, LCA represents the common backbone where paths meet. Finding LCA helps optimize routing and identify bottlenecks.

3. Taxonomic Classification:

Biologists use LCA to find the most recent common ancestor of species, critical for evolutionary analysis and phylogenetic studies.

LCA Applications Across Domains
Domain	Tree Represents	LCA Represents	Application
Version Control	Commit history	Merge base	Three-way merge, blame tracking
HTML/DOM	Document structure	Common container	Event bubbling target, styling scope
File Systems	Directory hierarchy	Common parent dir	Relative path computation
Networking	Router topology	Common backbone	Routing optimization
Compilers	AST	Common expression	CSE, optimization scope
Biology	Phylogenetic tree	Common ancestor	Evolutionary distance

LCA in DOM Operations

When you use JavaScript's element.contains(other) or compute relative paths between elements, the browser internally computes LCA. React's reconciliation algorithm uses LCA-like concepts to determine the minimal DOM updates needed.

Summary and Key Takeaways

The Lowest Common Ancestor problem elegantly demonstrates how tree structure enables efficient computation. Let's consolidate our understanding:

Core Concepts Mastered

•Definition: LCA is the deepest node that is an ancestor of both given nodes (a node is its own ancestor)
•Recursive solution: Propagate 'found' signals upward; LCA is where signals from both subtrees meet
•BST optimization: Use ordering property to follow a single path in O(h) time
•Path-based: Explicitly find paths and compare — clearer and more extensible
•Robustness: Handle edge cases (missing nodes, same node) for production quality
•Advanced techniques: Binary lifting and Euler tour enable O(1) queries after preprocessing

Algorithm Selection Guide

Generic binary tree, single query: Use the recursive O(n) solution

BST: Use value-based navigation for O(h) solution

Many queries on static tree: Preprocess with binary lifting or Euler tour

Need actual paths: Use path-based approach

Nodes have parent pointers: Use hash set intersection approach

What's Next:

LCA taught us to find where two paths converge. In the next page, we'll tackle Tree Symmetry and Comparison — a different kind of structural analysis where we ask whether trees are mirrors of each other, copies, or subtrees of one another. These problems reveal the deep relationship between tree structure and recursive reasoning.

Page Complete

You now mastery the Lowest Common Ancestor problem — from the elegant recursive solution to BST optimizations to advanced preprocessing techniques. LCA is a cornerstone of tree algorithms; its applications span version control, networking, and beyond. Recognize it in disguise when you encounter path-convergence problems.