Throughout our study of trees—binary trees, BSTs, and now heaps—we've used pointer-based representations. Each node contains data plus pointers to its children (and sometimes parent). This is intuitive: the tree structure maps directly to the pointer structure.
But heaps break from this pattern. Despite being tree-structured conceptually, heaps are almost universally implemented as arrays, with no pointers at all. The tree structure is implicit in the array indices.
This isn't a mere implementation trick—it's a fundamental insight about the relationship between complete binary trees and contiguous sequences. Understanding this connection deeply will clarify not just heap implementation, but also segment trees, tournament trees, and other array-based tree structures.
By the end of this page, you will understand why arbitrary binary trees cannot be efficiently stored in arrays, how the complete property creates a bijection between trees and arrays, the formal proof that array storage is always efficient for complete trees, and the practical implementation of implicit tree representation.
Before appreciating why complete trees are special, let's understand why arbitrary binary trees cannot be efficiently stored in arrays using index arithmetic.
The Standard Index Convention:
For any binary tree, we could attempt array storage using the standard level-order convention: the root goes at index 1, and the node at index i has its left child at index 2i and its right child at index 2i + 1.
This formula works perfectly if we're willing to store 'null' or 'empty' markers for missing nodes. But here's the problem: for unbalanced trees, the number of array positions needed grows exponentially worse than the number of nodes.
```
CASE 1: Skewed Tree (Linked List Shape) — n = 4 nodes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Tree Structure:      Index allocation (using 2i for left, 2i+1 for right):
  (A)                  Position 1:  A (root)
    \                  Position 3:  B (right child of 1)
    (B)                Position 7:  C (right child of 3)
      \                Position 15: D (right child of 7)
      (C)
        \
        (D)

Array (indices 1-15): [A, -, B, -, -, -, C, -, -, -, -, -, -, -, D]

Efficiency: 4 nodes, 15 positions = 26.7% utilization
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CASE 2: Longer Skewed Tree — n = 10 nodes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
With 10 right-only nodes:
  Node 1  at index 1
  Node 2  at index 3
  Node 3  at index 7
  Node 4  at index 15
  Node 5  at index 31
  ...
  Node 10 at index 2^10 - 1 = 1023

Required array size: 1023 positions for 10 nodes = 0.98% utilization
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CASE 3: General Unbalanced Tree — n = 8 nodes
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
      (1)
     /   \
   (2)   (3)
   /        \
 (4)        (7)
               \
               (15)
               /
            (30)
            /
         (60)

Required index: 60 (at least)
Array size: 60+ positions for 8 nodes ≈ 13% utilization
```

The Exponential Blowup:
For a right-skewed tree of height h, the deepest node lands at index 2^(h+1) − 1, so the array needs 2^(h+1) − 1 positions to store just h + 1 nodes.
Ratio: array size ÷ node count ≈ 2^(h+1) ÷ (h + 1), which is exponential in h.
For h = 20 (a mere 21-node skewed tree), the array needs 2^21 − 1 = 2,097,151 positions, roughly 0.001% utilization.
This is why general binary trees use pointer-based representation—it scales linearly with node count regardless of shape.
The problem is that arbitrary trees have 'gaps' in their level-order numbering. Missing nodes create holes in the index sequence. We either waste space storing these holes, or we need a separate mapping from logical indices to packed storage—which defeats the simplicity benefit of array representation.
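The blowup is easy to verify numerically. A quick sketch (the helper name is ours): the deepest node of an all-right chain of n nodes lands at index 2^n − 1.

```python
def skewed_array_size(n: int) -> int:
    """Array index occupied by the deepest node of an all-right chain of n nodes."""
    index = 1
    for _ in range(n - 1):
        index = 2 * index + 1      # step to the right child: 2i + 1
    return index                    # equals 2**n - 1

assert skewed_array_size(4) == 15      # CASE 1: 4 nodes need 15 slots
assert skewed_array_size(10) == 1023   # CASE 2: 10 nodes need 1023 slots
print(f"utilization for n=10: {10 / skewed_array_size(10):.2%}")
```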
Complete binary trees are precisely the class of trees where the level-order index sequence has no gaps. Let's prove this formally and understand why it enables perfect array storage.
Theorem: Complete ⟺ No Gaps in Level-Order Indices
A binary tree is complete if and only if, when nodes are numbered 1, 2, 3, ... in level-order, the set of indices used is exactly {1, 2, 3, ..., n} for some n.
Proof (⟹ direction, Complete → No Gaps):
Assume T is a complete binary tree with n nodes. We prove the indices are {1, 2, ..., n}.
```
COMPLETE TREE (n = 10): No Gaps
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Level 0:               [1]
                     /     \
Level 1:          [2]       [3]
                 /   \     /   \
Level 2:       [4]   [5] [6]   [7]
              /  \   /
Level 3:    [8] [9][10]

Indices used: {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
Gaps: NONE
Array: [_, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]   (index 0 unused)

Array utilization: 10 nodes / 10 positions = 100%
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
NOT COMPLETE (gap at position 6): Has Gaps
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Level 0:               [1]
                     /     \
Level 1:          [2]       [3]
                 /   \         \
Level 2:       [4]   [5]       [7]   ← Position 6 is missing!

Indices used: {1, 2, 3, 4, 5, 7}
Gaps: Position 6 is missing
Array must be: [_, 1, 2, 3, 4, 5, -, 7]

To use index formulas correctly:
  Position 6 must exist in array (storing null/sentinel)
  Otherwise, children of 3 would compute wrong indices
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
VERIFICATION: Why Level-Order Must Be Gapless for Complete
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
The level-order filling rule states:
  "All nodes at level k exist before any node at level k+1"
  "Within level k, all positions fill left-to-right with no gaps"

Consequence:
  If node at position i exists, ALL positions 1 through i-1 must exist.

Therefore: n nodes occupy exactly positions 1, 2, ..., n. No gaps possible.
```

Proof (⟸ direction, No Gaps → Complete):
Assume T has nodes at indices {1, 2, ..., n} with no gaps. Level k occupies the index range 2^k through 2^(k+1) − 1, so a gapless prefix {1, ..., n} fills every level completely except possibly the last, and fills the last level left to right. That is exactly the definition of complete.
The Bijection:
This theorem establishes a one-to-one correspondence between complete binary trees with n nodes and arrays of length n.
Every complete tree maps to exactly one array, and every array of length n defines exactly one complete binary tree. This is the mathematical foundation of implicit heap representation.
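The bijection can be made concrete in a few lines (the helper name is ours): given only n, every parent/child edge of the complete tree is recoverable by index arithmetic alone.

```python
def edges(n: int) -> list[tuple[int, int]]:
    """All (parent, child) pairs of the complete binary tree on indices 1..n."""
    return [(i // 2, i) for i in range(2, n + 1)]

# The shape is fully determined by n; no structure is stored anywhere.
print(edges(6))   # [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6)]
```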
An implicit data structure is one where relationships between elements are encoded in their positions rather than stored explicitly as pointers. The array representation of complete binary trees is the canonical example of implicit structure.
Explicit vs Implicit Representation:
| Aspect | Explicit (Pointer-Based) | Implicit (Array-Based) |
|---|---|---|
| Parent access | Follow parent pointer | Compute i // 2 |
| Child access | Follow left/right pointer | Compute 2i, 2i+1 |
| Navigation cost | Pointer dereference | Arithmetic + array access |
| Memory for structure | 2-3 pointers per node | Zero (structure is free) |
| Shape constraint | None | Must be complete |
| Locality | Poor (scattered nodes) | Excellent (contiguous) |
```python
from dataclasses import dataclass
from typing import Optional

# ===============================================
# EXPLICIT (POINTER-BASED) TREE NODE
# ===============================================
@dataclass
class ExplicitNode:
    """Traditional pointer-based tree node."""
    value: int
    left: Optional['ExplicitNode'] = None
    right: Optional['ExplicitNode'] = None
    parent: Optional['ExplicitNode'] = None  # Optional, adds more overhead

def explicit_node_memory():
    """Estimate memory usage for pointer-based node."""
    # Python object overhead: ~56 bytes
    # Reference (pointer): 8 bytes each
    # Integer value: ~28 bytes (Python int object)
    # Total with 3 pointers: ~56 + 28 + 3*8 = ~108 bytes per node
    return "~108 bytes per node (Python), ~40 bytes (C struct with pointers)"

# ===============================================
# IMPLICIT (ARRAY-BASED) TREE
# ===============================================
class ImplicitCompleteTree:
    """Array-based complete binary tree with implicit structure."""

    def __init__(self, capacity: int = 16):
        # Index 0 unused for cleaner arithmetic
        self._data = [None] * (capacity + 1)
        self._size = 0

    def _parent(self, i: int) -> int:
        return i >> 1          # i // 2, but faster

    def _left(self, i: int) -> int:
        return i << 1          # 2 * i, but faster

    def _right(self, i: int) -> int:
        return (i << 1) | 1    # 2 * i + 1, but faster

    def get_value(self, i: int) -> Optional[int]:
        if 1 <= i <= self._size:
            return self._data[i]
        return None

    def set_value(self, i: int, value: int):
        if 1 <= i <= self._size:
            self._data[i] = value

    def get_parent_value(self, i: int) -> Optional[int]:
        """Navigate to parent via index arithmetic—no pointer chase."""
        parent_idx = self._parent(i)
        return self.get_value(parent_idx)

    def get_children_values(self, i: int) -> tuple[Optional[int], Optional[int]]:
        """Get children via index arithmetic."""
        left_val = self.get_value(self._left(i))
        right_val = self.get_value(self._right(i))
        return (left_val, right_val)

# ===============================================
# MEMORY COMPARISON
# ===============================================
def memory_comparison(n: int):
    """Compare memory usage for n nodes."""
    # Pointer-based (conservative estimate for compiled languages)
    # Node: value (8 bytes) + 2 pointers (16 bytes) = 24 bytes minimum
    # With parent pointer: 32 bytes
    explicit_per_node = 24  # Without parent
    explicit_total = n * explicit_per_node

    # Array-based
    # Just the values, stored contiguously: 8 bytes per value (64-bit)
    # Plus one unused index 0: negligible
    implicit_per_node = 8
    implicit_total = n * implicit_per_node

    savings_bytes = explicit_total - implicit_total
    savings_percent = (savings_bytes / explicit_total) * 100

    return {
        'nodes': n,
        'explicit_total_kb': explicit_total / 1024,
        'implicit_total_kb': implicit_total / 1024,
        'savings_kb': savings_bytes / 1024,
        'savings_percent': savings_percent,
    }

# Print comparison for various sizes
for n in [1000, 10000, 100000, 1000000]:
    comp = memory_comparison(n)
    print(f"n = {comp['nodes']:,}:")
    print(f"  Explicit: {comp['explicit_total_kb']:.1f} KB")
    print(f"  Implicit: {comp['implicit_total_kb']:.1f} KB")
    print(f"  Savings:  {comp['savings_kb']:.1f} KB ({comp['savings_percent']:.0f}%)")
    print()
```

The Implicit Representation Formulas:
For 1-indexed arrays (index 0 unused):
parent(i) = i / 2 (integer division)
left(i) = 2 * i
right(i) = 2 * i + 1
For 0-indexed arrays (index 0 is root):
parent(i) = (i - 1) / 2 (integer division)
left(i) = 2 * i + 1
right(i) = 2 * i + 2
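As a quick sanity check, here is a sketch of both conventions side by side (function names are illustrative); each 0-indexed formula is the 1-indexed one shifted down by one index:

```python
# 1-indexed convention (bit-shift forms of i // 2, 2*i, 2*i + 1)
def parent1(i): return i >> 1
def left1(i):   return i << 1
def right1(i):  return (i << 1) | 1

# 0-indexed convention
def parent0(i): return (i - 1) // 2
def left0(i):   return 2 * i + 1
def right0(i):  return 2 * i + 2

# Both describe the same tree, offset by one index:
for i in range(2, 100):
    assert left1(i) - 1 == left0(i - 1)
    assert right1(i) - 1 == right0(i - 1)
    assert parent1(i) - 1 == parent0(i - 1)
```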
Most heap implementations use 1-indexing for cleaner formulas, sacrificing one array position.
The 1-indexed formulas can be implemented with bit operations: parent(i) = i >> 1, left(i) = i << 1, right(i) = (i << 1) | 1. Shifts and ORs are among the fastest operations on any CPU, typically one cycle each, whereas a pointer dereference requires a memory access (10-100+ cycles on a cache miss).
The implicit representation only works correctly when the tree is complete. Let's examine exactly what fails when this constraint is violated.
Case Study: Non-Complete Tree Array Representation
Consider a tree where node 5 has a right child but no left child (violating completeness):
```
        (1)
       /   \
     (2)   (3)
    /   \
  (4)   (5)
           \
           (11)   ← Right child only
```
```
PROBLEM: Non-Complete Tree with Index Formulas

Tree Structure:
        1
       / \
      2   3
     / \
    4   5
         \
          11   (right child of 5, no left child)

Level-order indices:
  1  → index 1
  2  → index 2
  3  → index 3
  4  → index 4
  5  → index 5
  (position 10 = left child of 5: EMPTY)
  11 → index 11 (right child of 5 by formula: 2*5+1 = 11)

Array Representation:
Index: 0  1  2  3  4  5  6  7  8  9  10  11
Value: -  1  2  3  4  5  -  -  -  -  -   11

Problems:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
1. WASTED SPACE: 6 nodes require 11 positions (6/11 ≈ 55% utilization)

2. INDEX HOLES: Indices 6, 7, 8, 9, 10 are empty but must be tracked

3. VALIDITY COMPLEXITY:
   - Can't just check "i <= n" for node existence
   - Need separate tracking of which indices are occupied
   - Storage for "occupied" bitmask adds overhead

4. NAVIGATION ISSUES:
   - left(5) = 10, but position 10 is empty
   - Must check "is position occupied" at every navigation step
   - Loses O(1) simplicity of implicit structure

5. INSERTION CONFUSION:
   - Where does next node go? Not obvious
   - Need to track "first empty index" or similar
   - Breaks the simple "n+1" insertion rule

COMPLETE TREE: Same 6 nodes, but MUST include position 6

        1
       / \
      2   3
     / \  /
    4  5 6   ← Node 6 is mandatory!

Array: [-, 1, 2, 3, 4, 5, 6]
All indices 1-6 used. Zero waste. Simple validity: 1 ≤ i ≤ 6.
```

The Only-Complete Theorem:
Implicit array representation with the parent/child index formulas is space-efficient if and only if the tree is complete.
Proof:
(⟹) If the tree is complete, the level-order theorem gives indices exactly {1, ..., n}, so an array of length n (plus the optional unused slot 0) stores every node with no empty positions: 100% utilization.
(⟸) If utilization is 100%, every position 1 through n holds a node, so the index set is gapless; by the theorem, the tree is complete.
Necessity of Completeness:
If we relax completeness, missing nodes leave holes in the index sequence; we must either reserve array slots for the holes (wasting exponentially much space in the worst case) or maintain a separate occupancy map (reintroducing per-node overhead).
There's no way to have implicit structure, arbitrary shape, AND space efficiency. Completeness is the price for implicit representation.
Heaps don't need arbitrary shape because they don't maintain BST ordering. The heap property (parent ≥ children) doesn't depend on tree shape. So we CAN enforce completeness without losing functionality—and gain major efficiency benefits.
Let's examine the key implementation patterns for array-based complete binary trees, which form the foundation of heap implementations.
Pattern 1: Capacity Management
Arrays have fixed capacity, but heaps need to grow. Standard approach:
```python
from typing import TypeVar, Generic, Optional, Iterator

T = TypeVar('T')

class ArrayCompleteTree(Generic[T]):
    """
    Complete binary tree backed by a dynamic array.
    Forms the structural foundation of binary heap implementations.
    """

    DEFAULT_CAPACITY = 16

    def __init__(self, initial_capacity: int = DEFAULT_CAPACITY):
        """Initialize with given capacity. Position 0 is unused."""
        self._capacity = initial_capacity
        self._data: list[Optional[T]] = [None] * (initial_capacity + 1)
        self._size = 0

    # =========================================
    # SIZE AND CAPACITY MANAGEMENT
    # =========================================

    def __len__(self) -> int:
        """Return number of elements."""
        return self._size

    def is_empty(self) -> bool:
        """Check if tree is empty."""
        return self._size == 0

    def _ensure_capacity(self, required: int) -> None:
        """Grow array if needed. Amortized O(1) via doubling."""
        if required > self._capacity:
            new_capacity = self._capacity * 2
            while new_capacity < required:
                new_capacity *= 2
            new_data: list[Optional[T]] = [None] * (new_capacity + 1)
            for i in range(self._size + 1):
                new_data[i] = self._data[i]
            self._data = new_data
            self._capacity = new_capacity

    # =========================================
    # INDEX ARITHMETIC (1-INDEXED)
    # =========================================

    @staticmethod
    def parent(i: int) -> int:
        """Parent index. Root's parent is 0 (sentinel)."""
        return i >> 1

    @staticmethod
    def left(i: int) -> int:
        """Left child index."""
        return i << 1

    @staticmethod
    def right(i: int) -> int:
        """Right child index."""
        return (i << 1) | 1

    def has_left(self, i: int) -> bool:
        """Check if node at i has a left child."""
        return self.left(i) <= self._size

    def has_right(self, i: int) -> bool:
        """Check if node at i has a right child."""
        return self.right(i) <= self._size

    def is_leaf(self, i: int) -> bool:
        """Check if node at i is a leaf."""
        return not self.has_left(i)

    def is_valid_index(self, i: int) -> bool:
        """Check if index i corresponds to an existing node."""
        return 1 <= i <= self._size

    # =========================================
    # STRUCTURAL OPERATIONS
    # =========================================

    def append(self, value: T) -> int:
        """
        Add value at next position (maintains completeness).
        Returns the index where value was placed.
        """
        self._size += 1
        self._ensure_capacity(self._size)
        self._data[self._size] = value
        return self._size

    def remove_last(self) -> T:
        """Remove and return last element (maintains completeness)."""
        if self.is_empty():
            raise IndexError("Tree is empty")
        value = self._data[self._size]
        self._data[self._size] = None  # Help garbage collection
        self._size -= 1
        return value  # type: ignore

    def get(self, i: int) -> T:
        """Get value at index i."""
        if not self.is_valid_index(i):
            raise IndexError(f"Invalid index: {i}")
        return self._data[i]  # type: ignore

    def set(self, i: int, value: T) -> None:
        """Set value at index i."""
        if not self.is_valid_index(i):
            raise IndexError(f"Invalid index: {i}")
        self._data[i] = value

    def swap(self, i: int, j: int) -> None:
        """Swap values at indices i and j."""
        self._data[i], self._data[j] = self._data[j], self._data[i]

    # =========================================
    # TRAVERSAL
    # =========================================

    def level_order(self) -> Iterator[T]:
        """Iterate in level order (which is just array order!)."""
        for i in range(1, self._size + 1):
            yield self._data[i]  # type: ignore

    def get_level(self, i: int) -> int:
        """Return the level of node at index i (0-based level)."""
        if i < 1:
            raise ValueError("Index must be positive")
        return i.bit_length() - 1

    # =========================================
    # DEBUGGING UTILITIES
    # =========================================

    def _repr_tree(self, i: int = 1, prefix: str = "", is_left: bool = True) -> str:
        """Generate visual tree representation for debugging."""
        if not self.is_valid_index(i):
            return ""
        result = ""
        if self.has_right(i):
            result += self._repr_tree(
                self.right(i), prefix + ("│   " if is_left else "    "), False
            )
        result += prefix + ("└── " if is_left else "┌── ") + str(self._data[i]) + "\n"
        if self.has_left(i):
            result += self._repr_tree(
                self.left(i), prefix + ("    " if is_left else "│   "), True
            )
        return result

    def __repr__(self) -> str:
        if self.is_empty():
            return "ArrayCompleteTree(empty)"
        return f"ArrayCompleteTree(size={self._size}):\n{self._repr_tree()}"

# Demonstration
tree = ArrayCompleteTree[str]()
for char in "ABCDEFGHIJ":
    tree.append(char)

print(f"Size: {len(tree)}")
print(f"Level order: {list(tree.level_order())}")
print(f"Node at 3: {tree.get(3)}, parent: {tree.get(tree.parent(3))}")
print(f"Children of 2: {tree.get(tree.left(2))}, {tree.get(tree.right(2))}")
print()
print(tree)
```

Pattern 2: Bounds Checking
The simplicity of bounds checking is a major benefit:
Node existence is the single comparison 1 ≤ i ≤ n; a left child exists iff 2i ≤ n, a right child iff 2i + 1 ≤ n, and node i is a leaf iff 2i > n. All checks are single comparisons against n—no pointer null checks needed.
Pattern 3: Level Computation
Level of node i: ⌊log₂(i)⌋, which equals bit_length(i) - 1 in most languages. Modern CPUs expose this as a single count-leading-zeros instruction.
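In Python, for example (a small sketch, 1-indexed layout with the root at level 0):

```python
def level(i: int) -> int:
    """0-based level of node i in the 1-indexed layout: equals floor(log2(i))."""
    return i.bit_length() - 1

# Indices 1..8 sit on levels 0, 1, 1, 2, 2, 2, 2, 3
assert [level(i) for i in range(1, 9)] == [0, 1, 1, 2, 2, 2, 2, 3]
```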
The complete binary tree array pattern extends beyond heaps to other important data structures. Understanding this generality reinforces the importance of the pattern.
Segment Trees
Segment trees store aggregate information (sum, min, max, etc.) over array ranges, reusing the 2i/2i+1 child layout: n leaves hold the raw values, n - 1 internal nodes hold the aggregate of their two children, and everything is packed into a single array of about 2n slots.
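To make the connection concrete, here is a minimal iterative sum segment tree sketch (class name and API are ours, not any particular library's), storing leaves at positions n..2n-1 of one array:

```python
class SumSegmentTree:
    """Sum segment tree over a fixed-length sequence, stored in one flat list."""

    def __init__(self, values):
        self.n = len(values)
        self.tree = [0] * (2 * self.n)
        self.tree[self.n:] = values              # leaves at indices n..2n-1
        for i in range(self.n - 1, 0, -1):       # internal node i aggregates 2i, 2i+1
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, pos, value):
        """Set values[pos] = value, refreshing the leaf-to-root path."""
        i = self.n + pos
        self.tree[i] = value
        while i > 1:
            i >>= 1                              # parent via the same shift trick
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def query(self, lo, hi):
        """Sum over the half-open range [lo, hi)."""
        s = 0
        lo += self.n
        hi += self.n
        while lo < hi:
            if lo & 1:                           # lo is a right child: take it, move right
                s += self.tree[lo]
                lo += 1
            if hi & 1:                           # hi is a right child: take hi-1
                hi -= 1
                s += self.tree[hi]
            lo >>= 1
            hi >>= 1
        return s

st = SumSegmentTree([1, 2, 3, 4, 5])
print(st.query(1, 4))    # 2 + 3 + 4 = 9
```

Both `update` and `query` walk at most one root-to-leaf path per endpoint, so they run in O(log n) with zero pointers.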
Tournament Trees
Used for k-way merging and selection: the k leaves hold the current front element of each input run, each internal node records the winner of its children's comparison, and the overall winner sits at the root, all in one implicit array.
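As an illustrative sketch (names are ours), a winner tree for k-way merging can be packed into one array: leaves k..2k-1 hold the current run heads, and each internal node i records the winning leaf among its children 2i and 2i+1. Exhausted runs are represented by an infinity sentinel.

```python
def kway_merge(runs):
    """Merge k sorted runs using an implicit winner (tournament) tree."""
    INF = float('inf')
    k = len(runs)
    iters = [iter(r) for r in runs]
    heads = [next(it, INF) for it in iters]       # current front of each run
    # winner[i] for internal i holds the LEAF index (k..2k-1) of the smaller
    # head in i's subtree; each leaf represents itself.
    winner = list(range(2 * k))

    def key(node):
        return heads[winner[node] - k]

    for i in range(k - 1, 0, -1):                 # build matches bottom-up
        winner[i] = winner[2 * i] if key(2 * i) <= key(2 * i + 1) else winner[2 * i + 1]

    out = []
    while key(1) != INF:                          # overall winner sits at the root
        w = winner[1]
        out.append(heads[w - k])
        heads[w - k] = next(iters[w - k], INF)    # advance the winning run
        i = w >> 1
        while i >= 1:                             # replay matches back to the root
            winner[i] = winner[2 * i] if key(2 * i) <= key(2 * i + 1) else winner[2 * i + 1]
            i >>= 1
    return out

print(kway_merge([[1, 4, 7], [2, 5, 8], [3, 6, 9], [0, 10, 11]]))
```

Each pop replays one leaf-to-root path, so merging n total elements costs O(n log k) comparisons.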
The Common Thread:
All these structures share the insight that completeness enables implicit storage. They differ in what each node stores and in the update and query operations layered on top.
But the structural foundation—complete tree in an array—is the same. Mastering this pattern unlocks understanding of many advanced data structures.
Complete tree array storage is excellent when you only need operations that the structure supports. For BSTs (which need ordered iteration, predecessor/successor, etc.), the complete constraint is incompatible with the ordering requirement, so pointer-based or other representations are necessary.
We've thoroughly explored why the complete binary tree structure enables efficient array-based storage—the key insight that makes heap implementations practical.
What's Next:
With the array representation understood, we turn to the final piece of the complete binary tree story: the height guarantee. We'll prove that complete binary trees always have height O(log n), which directly bounds the time complexity of heap operations.
You now understand the deep connection between complete binary trees and array storage. The gapless index sequence is the key property that enables implicit representation, eliminating pointer overhead while maintaining O(1) access to all structural relationships.