In the previous page, we defined complete binary trees by their structural property: all levels are completely filled except possibly the last, which is filled from left to right. Now we explore this filling process in depth—not just as a static property, but as a dynamic invariant that guides how we build, insert into, and delete from complete binary trees.
Understanding level-by-level filling is crucial because it answers fundamental questions: where does the next node go, which node is removed first, and how do we find any node's parent or children without searching?
The filling pattern isn't just a description—it's an algorithm for locating, inserting, and removing nodes in O(1) time (once we know the size).
By the end of this page, you will understand the precise level-order filling algorithm, how it determines insertion and deletion positions, the relationship between tree size and node position, and how this filling pattern creates the predictable index structure that enables array representation.
Level-order traversal (also called breadth-first traversal) visits tree nodes level by level, from top to bottom and within each level from left to right. For complete binary trees, this traversal order is more than a search strategy—it defines the canonical numbering of nodes.
The Level-Order Numbering System:
If we number nodes starting from 1 in level-order:
Node 1: Root (level 0)
Nodes 2-3: Level 1 (left child first, then right child of root)
Nodes 4-7: Level 2 (children of node 2, then children of node 3)
Nodes 8-15: Level 3
...
Nodes 2^k to 2^(k+1)-1: Level k
This numbering has mathematical elegance: for any node i, its left child is 2i and its right child is 2i+1. Its parent is ⌊i/2⌋.
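These relationships can be sketched as small helper functions (the function names here are ours, chosen only to make the arithmetic concrete):

```python
def parent(i: int) -> int:
    """1-indexed parent position: floor(i/2)."""
    return i // 2

def left(i: int) -> int:
    """1-indexed left-child position."""
    return 2 * i

def right(i: int) -> int:
    """1-indexed right-child position."""
    return 2 * i + 1

def level(i: int) -> int:
    """Level of position i: floor(log2(i)), computed without floats."""
    return i.bit_length() - 1

# Node 6: parent is 3, children are 12 and 13, and it sits on level 2.
print(parent(6), left(6), right(6), level(6))  # 3 12 13 2
```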
```
Level-Order Numbering in a Complete Binary Tree:

Level 0 (2⁰ = 1 node):             [1]
                                  /    \
Level 1 (2¹ = 2 nodes):        [2]      [3]
                               /  \     /  \
Level 2 (2² = 4 nodes):      [4]  [5] [6]  [7]
                             / \  /
Level 3 (partial):         [8][9][10][11] ...

Numbering Properties:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Node i is at level ⌊log₂(i)⌋
• Level k contains nodes 2^k through 2^(k+1) - 1
• Left child of i: 2i
• Right child of i: 2i + 1
• Parent of i: ⌊i/2⌋
• Left sibling of i (if i is a right child): i - 1
• Right sibling of i (if i is a left child): i + 1

This works BECAUSE nodes are filled in this exact order!
```

Level-Order Traversal Algorithm:
The classic BFS approach uses a queue:
1. Initialize queue with root
2. While queue is not empty:
a. Dequeue front node, process it
b. Enqueue left child (if exists)
c. Enqueue right child (if exists)
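The steps above can be sketched in Python as follows (the `Node` class here is a hypothetical minimal linked representation, shown only for illustration):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Node:
    value: str
    left: "Node | None" = None
    right: "Node | None" = None

def level_order(root: "Node | None") -> list[str]:
    """BFS: visit nodes level by level, left to right within each level."""
    order: list[str] = []
    if root is None:
        return order
    queue = deque([root])              # 1. initialize queue with root
    while queue:                       # 2. while queue is not empty:
        node = queue.popleft()         #    a. dequeue front node, process it
        order.append(node.value)
        if node.left is not None:      #    b. enqueue left child (if exists)
            queue.append(node.left)
        if node.right is not None:     #    c. enqueue right child (if exists)
            queue.append(node.right)
    return order

# A 4-node complete tree: A at the root, B and C below, D under B.
tree = Node("A", Node("B", Node("D")), Node("C"))
print(level_order(tree))  # ['A', 'B', 'C', 'D']
```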
However, for complete binary trees with array representation, traversal is trivial—just iterate through the array from index 1 to n. The array is the level-order sequence.
The Filling Invariant:
The level-order numbering creates a fundamental invariant: a complete binary tree with n nodes occupies exactly positions 1 through n in level order, with no gaps.
Think of the level-order numbering as filling seats in a theater, row by row, left to right. If 50 people are seated, they occupy seats 1-50 with no empty seats. The next person takes seat 51. This predictability is why we can use a simple counter (the size n) to track everything about the tree's shape.
Let's trace exactly how a complete binary tree grows as we add nodes one by one. This process illustrates the level-order filling invariant in action.
Starting from Empty:
We'll build a complete binary tree by inserting nodes with values A, B, C, D, E, F, G, H, I, J in sequence.
```
Step-by-step construction (inserting A, B, C, D, E, F, G, H, I, J):

After A (n=1):    [A]                 Array: [_, A]

After B (n=2):    [A]                 Array: [_, A, B]
                  /
                [B]

After C (n=3):    [A]                 Array: [_, A, B, C]
                 /   \
               [B]   [C]

After D (n=4):    [A]                 Array: [_, A, B, C, D]
                 /   \
               [B]   [C]
               /
             [D]

After E (n=5):    [A]                 Array: [_, A, B, C, D, E]
                 /   \
               [B]   [C]
              /  \
            [D]  [E]

After F (n=6):    [A]                 Array: [_, A, B, C, D, E, F]
                 /   \
               [B]   [C]
              /  \   /
            [D]  [E][F]

After G (n=7):    [A]                 Array: [_, A, B, C, D, E, F, G]
                 /   \
               [B]   [C]   ← Level 2 now full!
              /  \   /  \
            [D][E] [F][G]

After H (n=8):    [A]                 Array: [_, A, B, C, D, E, F, G, H]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /
          [H]

After I (n=9):    [A]                 Array: [_, A, B, C, D, E, F, G, H, I]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /  \
          [H]  [I]

After J (n=10):   [A]                 Array: [_, A, B, C, D, E, F, G, H, I, J]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /  \   /
          [H][I] [J]
```

Observations from the Trace:
Each insertion goes to position n+1 — After 6 nodes, the 7th goes to position 7. No calculation of 'where' is needed; we just append.
Levels fill completely before moving to the next — Level 2 (positions 4-7) fills completely before level 3 starts at position 8.
Within a level, filling is strictly left-to-right — Position 8 (leftmost of level 3) fills before position 9, never the reverse.
Parent of new node is deterministic — A node inserted at position n is the child of node ⌊n/2⌋. Position 10 is the child of position 5.
Left vs right child is determined by parity — Even positions are left children; odd positions (except 1) are right children.
Notice how every aspect of insertion is completely deterministic: given only the size n, we know exactly where the next node goes (position n+1), who its parent is (⌊(n+1)/2⌋), and whether it's a left or right child ((n+1) mod 2). This determinism eliminates the need for search during insertion.
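This determinism fits in a few lines (a sketch; the function name is ours):

```python
def next_insertion(n: int) -> dict:
    """Where the next node goes in a complete tree that currently has n nodes."""
    pos = n + 1
    return {
        "position": pos,
        "parent": pos // 2,              # ⌊(n+1)/2⌋
        "is_left_child": pos % 2 == 0,   # even positions are left children
    }

print(next_insertion(6))  # {'position': 7, 'parent': 3, 'is_left_child': False}
print(next_insertion(9))  # {'position': 10, 'parent': 5, 'is_left_child': True}
```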
The level-by-level filling property isn't just for construction—it must be maintained during heap operations. This creates specific constraints on how insertions and deletions work.
Insertion Strategy:
For heaps, insertion works as follows:
1. Append the new value at position n+1 — the only position that preserves completeness.
2. Bubble the value up, swapping it with its parent while it violates the heap order.
The key insight: we don't choose where to insert based on value. We insert at the structurally correct position, then fix ordering.
Deletion Strategy:
Heaps support deletion of the root (extract-max or extract-min). The process:
1. Save the root's value (position 1).
2. Move the last node (position n) into the root's position and shrink the array — completeness is preserved.
3. Sift the moved value down, swapping it with the appropriate child, until heap order is restored.
Again, we don't search for what to delete—we always operate at the extremes (position 1 and position n).
```python
class CompleteTree:
    """
    Demonstrates how completeness is maintained during dynamic operations.
    This is the structural skeleton that heaps build upon.
    """

    def __init__(self):
        # Index 0 unused for cleaner parent/child arithmetic
        self._data = [None]

    def size(self) -> int:
        """Current number of nodes."""
        return len(self._data) - 1

    def _parent(self, i: int) -> int:
        """Parent index of node at index i."""
        return i // 2

    def _left(self, i: int) -> int:
        """Left child index of node at index i."""
        return 2 * i

    def _right(self, i: int) -> int:
        """Right child index of node at index i."""
        return 2 * i + 1

    def _has_left(self, i: int) -> bool:
        """Check if node i has a left child."""
        return self._left(i) <= self.size()

    def _has_right(self, i: int) -> bool:
        """Check if node i has a right child."""
        return self._right(i) <= self.size()

    def insert(self, value) -> int:
        """
        Insert a value and return its position.
        The position is determined by the filling rule, not the value.
        """
        # Step 1: Append to maintain completeness
        self._data.append(value)
        new_position = self.size()
        # Step 2: For heaps, we would bubble up here
        # (omitted for pure structural demonstration)
        return new_position

    def remove_last(self):
        """
        Remove the last node (maintaining completeness).
        For heaps, the root is swapped with last, then this is called.
        """
        if self.size() == 0:
            raise IndexError("Cannot remove from empty tree")
        return self._data.pop()

    def get_insertion_position(self) -> dict:
        """Predict where the next insertion will go."""
        next_pos = self.size() + 1
        parent_pos = next_pos // 2
        is_left_child = (next_pos % 2 == 0)
        level = next_pos.bit_length() - 1  # floor(log2(next_pos))
        return {
            'position': next_pos,
            'parent': parent_pos,
            'is_left_child': is_left_child,
            'level': level
        }

    def get_last_position(self) -> dict:
        """Information about the current last node."""
        if self.size() == 0:
            return {'position': None}
        last_pos = self.size()
        parent_pos = last_pos // 2
        is_left_child = (last_pos % 2 == 0)
        level = last_pos.bit_length() - 1
        return {
            'position': last_pos,
            'parent': parent_pos,
            'is_left_child': is_left_child,
            'level': level,
            'value': self._data[last_pos]
        }

# Demonstration
tree = CompleteTree()
for val in ['A', 'B', 'C', 'D', 'E']:
    pos = tree.insert(val)
    info = tree.get_last_position()
    print(f"Inserted '{val}' at position {pos}, parent={info['parent']}, "
          f"is_left={info['is_left_child']}, level={info['level']}")

# Output:
# Inserted 'A' at position 1, parent=0, is_left=False, level=0
# Inserted 'B' at position 2, parent=1, is_left=True, level=1
# Inserted 'C' at position 3, parent=1, is_left=False, level=1
# Inserted 'D' at position 4, parent=2, is_left=True, level=2
# Inserted 'E' at position 5, parent=2, is_left=False, level=2
```

Why This Approach Works:
The beauty of this system is that maintaining completeness is essentially free: appending to the array is amortized O(1), and popping from the end is O(1).
The actual work in heap operations comes from restoring the heap property (bubbling), not from maintaining completeness. The complete tree structure is a consequence of always operating at the boundaries (append/pop) rather than in the middle.
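To show where that bubbling work happens, here is a minimal min-heap sketch built on the same append/pop skeleton (an illustrative implementation, not a production one):

```python
class MinHeap:
    """Min-heap sketch: completeness via append/pop, order via sifting."""

    def __init__(self):
        self._data = [None]  # index 0 unused, matching the 1-indexed formulas

    def __len__(self):
        return len(self._data) - 1

    def insert(self, value):
        self._data.append(value)  # position n+1: completeness preserved
        i = len(self)
        # Bubble up: swap with parent while smaller than it
        while i > 1 and self._data[i] < self._data[i // 2]:
            self._data[i], self._data[i // 2] = self._data[i // 2], self._data[i]
            i //= 2

    def extract_min(self):
        if len(self) == 0:
            raise IndexError("empty heap")
        root = self._data[1]
        last = self._data.pop()  # remove position n: completeness preserved
        if len(self) > 0:
            self._data[1] = last
            i, n = 1, len(self)
            # Sift down: swap with the smaller child until heap order holds
            while 2 * i <= n:
                child = 2 * i
                if child + 1 <= n and self._data[child + 1] < self._data[child]:
                    child += 1
                if self._data[i] <= self._data[child]:
                    break
                self._data[i], self._data[child] = self._data[child], self._data[i]
                i = child
        return root

h = MinHeap()
for v in [5, 3, 8, 1]:
    h.insert(v)
print([h.extract_min() for _ in range(len(h))])  # [1, 3, 5, 8]
```

Note that the structural moves are exactly the ones above — append at position n+1, pop from position n — and only the sifting loops do value-dependent work.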
Standard heaps don't support inserting at arbitrary positions or deleting arbitrary elements efficiently. If you need these operations, you need a different structure (indexed heap, Fibonacci heap, etc.). The complete tree property is maintained precisely because we only modify at the 'ends' of the level-order sequence.
The level-by-level filling creates a beautiful mathematical structure where the position of every node encodes all its relationships. Let's derive and prove these formulas rigorously.
The Fundamental Relationships (1-indexed):
For a node at position i:
Parent: ⌊i/2⌋
Left child: 2i
Right child: 2i + 1
Level: ⌊log₂(i)⌋
```
PROOF: Left child of node i is at position 2i

Consider level k, which contains nodes 2^k through 2^(k+1) - 1.
The first node of level k is 2^k. Level k has 2^k nodes.
Level k+1 (the children's level) has 2^(k+1) positions.

Node i is at offset (i - 2^k) within level k.
It has two children in level k+1. Children of earlier level-k nodes
precede children of later level-k nodes, so node i's children come
after the children of nodes 2^k through i - 1.

Number of children before node i's children:
    2 × (i - 2^k) = 2i - 2^(k+1)

The first position of level k+1 is 2^(k+1).

Therefore, the left child of node i is at:
    2^(k+1) + 2(i - 2^k) = 2^(k+1) + 2i - 2^(k+1) = 2i  ✓

The right child is simply 2i + 1.  ✓

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

PROOF: Parent of node i is at position ⌊i/2⌋

If i is a left child, i = 2p for some parent p.
Then ⌊i/2⌋ = ⌊2p/2⌋ = p.  ✓

If i is a right child, i = 2p + 1 for some parent p.
Then ⌊i/2⌋ = ⌊(2p+1)/2⌋ = ⌊p + 0.5⌋ = p.  ✓

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

DETERMINING IF NODE i IS A LEFT OR RIGHT CHILD:

i is a left child  ⟺ i is even ⟺ i = 2p
i is a right child ⟺ i is odd  ⟺ i = 2p + 1

Note: The root (i = 1) is a special case with no parent.
```

| Position i | Parent ⌊i/2⌋ | Left 2i | Right 2i+1 | Level | L/R Child |
|---|---|---|---|---|---|
| 1 (root) | — | 2 | 3 | 0 | — |
| 2 | 1 | 4 | 5 | 1 | Left |
| 3 | 1 | 6 | 7 | 1 | Right |
| 4 | 2 | 8 | 9 | 2 | Left |
| 5 | 2 | 10 | 11 | 2 | Right |
| 6 | 3 | 12 | 13 | 2 | Left |
| 7 | 3 | 14 | 15 | 2 | Right |
| 8 | 4 | 16 | 17 | 3 | Left |
| 9 | 4 | 18 | 19 | 3 | Right |
| 10 | 5 | 20 | 21 | 3 | Left |
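The table's formulas can be checked mechanically for any range of positions with a brute-force sanity check:

```python
import math

# Verify parent/child/level/parity relationships for positions 2..100
for i in range(2, 101):
    p = i // 2
    assert i in (2 * p, 2 * p + 1)           # i really is a child of ⌊i/2⌋
    assert (i % 2 == 0) == (i == 2 * p)      # even ⟺ left child
    assert i.bit_length() - 1 == math.floor(math.log2(i))  # level formula
print("all positions check out")
```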
Alternative: 0-indexed Formulas
Some implementations use 0-indexed arrays, with the root at index 0. The formulas adjust slightly: the parent of i is ⌊(i − 1)/2⌋, the left child is 2i + 1, and the right child is 2i + 2.
However, 1-indexed arithmetic is cleaner (parent = i/2, children = 2i and 2i+1), which is why many implementations leave index 0 unused.
For 1-indexed arrays: Parent of i is simply i >> 1 (right shift). Left child is i << 1 (left shift). Right child is (i << 1) | 1. Level is the position of the highest set bit minus one. These bit operations are extremely fast on modern CPUs.
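For example (plain Python, though the same shifts work in C or Java):

```python
i = 11                        # position 11, binary 1011
print(i >> 1)                 # 5  — parent
print(i << 1)                 # 22 — left child
print((i << 1) | 1)           # 23 — right child
print(i.bit_length() - 1)     # 3  — level: index of the highest set bit
```

A bonus of the 1-indexed scheme: the binary representation of i spells out the root-to-node path — after the leading 1 bit, each 0 means "go left" and each 1 means "go right" (for 11 = 0b1011, the path is left, right, right: 1 → 2 → 5 → 11).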
The level-order indexing also makes sibling and subtree relationships easily computable—useful for more advanced heap operations and analysis.
Sibling Relationships:
For node i (where i > 1): if i is even (a left child), its sibling is i + 1, provided i + 1 ≤ n; if i is odd (a right child), its sibling is i − 1, which always exists because levels fill left to right.
Subtree Size Calculation:
For a complete binary tree, we can compute the size of the subtree rooted at node i:
If node i has no children (2i > n, i.e., i > ⌊n/2⌋): subtree size = 1
If node i has one child: subtree size = 1 + subtree_size(left_child)
If node i has two children: subtree size = 1 + subtree_size(left) + subtree_size(right)
However, for a complete tree, there's a more direct formula based on the last node's position.
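One way to realize that direct computation (a sketch; the function name is ours) is to clip each level of the subtree against n — at depth k below node i, the subtree occupies positions i·2^k through (i+1)·2^k − 1:

```python
def subtree_size(i: int, n: int) -> int:
    """Size of the subtree rooted at position i in a complete tree of n nodes.

    At depth k below i, the subtree spans positions i*2^k .. (i+1)*2^k - 1;
    clipping each span against n and summing gives the size in O(log n).
    """
    size = 0
    first, last = i, i
    while first <= n:
        size += min(last, n) - first + 1   # nodes of this level inside the tree
        first, last = 2 * first, 2 * last + 1
    return size

print(subtree_size(1, 10))   # 10 — the whole tree
print(subtree_size(2, 10))   # 6  — nodes 2, 4, 5, 8, 9, 10
print(subtree_size(3, 10))   # 3  — nodes 3, 6, 7
```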
```python
def get_sibling(i: int, n: int) -> int | None:
    """
    Get the sibling of node i in a complete tree of size n.
    Returns None if no sibling exists.
    """
    if i == 1:  # Root has no sibling
        return None
    if i % 2 == 0:  # i is a left child
        sibling = i + 1
        return sibling if sibling <= n else None
    else:  # i is a right child
        sibling = i - 1
        return sibling  # Right child always has a left sibling

def is_leaf(i: int, n: int) -> bool:
    """Check if node i is a leaf in a complete tree of size n."""
    return 2 * i > n  # No left child means no children (complete property)

def is_internal(i: int, n: int) -> bool:
    """Check if node i is an internal node (has at least one child)."""
    return 2 * i <= n

def get_descendants_range(i: int, n: int) -> tuple[int | None, int | None]:
    """
    Get the range of indices of all descendants of node i.
    Returns (first_descendant, last_descendant).
    For a leaf, returns (None, None).
    """
    if is_leaf(i, n):
        return (None, None)
    # First descendant is left child
    first_desc = 2 * i
    # Last descendant: follow rightmost path until exceeding n
    last_desc = i
    while 2 * last_desc + 1 <= n:  # Has right child
        last_desc = 2 * last_desc + 1
    if 2 * last_desc <= n:  # Only has left child
        last_desc = 2 * last_desc
    return (first_desc, last_desc)

def get_ancestors(i: int) -> list[int]:
    """Get all ancestor indices from node i up to root (inclusive of i)."""
    ancestors = []
    while i >= 1:
        ancestors.append(i)
        i //= 2
    return ancestors

# Demonstration
n = 10  # Complete tree with 10 nodes
print(f"Complete tree with {n} nodes:")
print(f"Node 3 sibling: {get_sibling(3, n)}")  # 2
print(f"Node 4 sibling: {get_sibling(4, n)}")  # 5
print(f"Node 5 sibling: {get_sibling(5, n)}")  # 4
print(f"Node 7 sibling: {get_sibling(7, n)}")  # 6
print()
print(f"Leaves: {[i for i in range(1, n+1) if is_leaf(i, n)]}")
print(f"Internal: {[i for i in range(1, n+1) if is_internal(i, n)]}")
print()
print(f"Ancestors of node 10: {get_ancestors(10)}")  # [10, 5, 2, 1]
print(f"Ancestors of node 8: {get_ancestors(8)}")  # [8, 4, 2, 1]
```

Internal Nodes vs Leaves:
In a complete binary tree with n nodes, positions 1 through ⌊n/2⌋ are internal nodes, and positions ⌊n/2⌋ + 1 through n are leaves.
The count is always ⌊n/2⌋ internal nodes and ⌈n/2⌉ leaves.
This means roughly half the nodes are at the bottom level or very near it. This is crucial for the O(n) heap construction algorithm, which processes nodes in reverse level order.
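The counting claim can be made concrete with a small sketch (the function name is ours):

```python
def node_counts(n: int) -> tuple[int, int]:
    """(internal, leaf) counts in a complete binary tree with n nodes."""
    internal = n // 2          # positions 1..⌊n/2⌋ have at least one child
    return internal, n - internal   # leaves = n - ⌊n/2⌋ = ⌈n/2⌉

for n in (1, 7, 10, 15):
    print(n, node_counts(n))
# 1 (0, 1)
# 7 (3, 4)
# 10 (5, 5)
# 15 (7, 8)
```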
The level-by-level filling pattern has profound practical implications for heap implementation efficiency. Let's enumerate the key benefits.
1. O(1) Access to All Node Relationships
With pointer-based trees, following a parent pointer requires a memory access (and a likely cache miss). With array indices, every relationship — parent, child, sibling — is computed by pure arithmetic on i, with no memory access at all.
2. Memory Allocation Simplicity
Pointer-based trees allocate each node separately, leading to per-node allocation overhead, heap fragmentation, and poor cache locality as nodes scatter across memory.
Array-based heaps allocate a single contiguous block. Growth uses amortized O(1) reallocation (doubling strategy).
3. Serialization Triviality
To serialize a pointer-based tree, you need a traversal algorithm and a format for encoding structure. To serialize an array-based heap: write out the array and its length. The structure is implicit in the indices.
4. Copy and Clone Simplicity
Copying a pointer-based tree requires traversal and node allocation. Copying an array-based heap: memcpy the array. Single operation, maximum efficiency.
Modern CPUs load data in cache lines (typically 64 bytes). An array-based heap accessing adjacent indices often finds the data already in cache. Pointer-based trees chase pointers to unpredictable memory locations, causing cache misses. For large heaps, this difference can be 10-100x in actual performance.
We've thoroughly explored the level-by-level filling pattern that characterizes complete binary trees. This pattern is the key that unlocks efficient heap implementations.
What's Next:
With level-by-level filling understood, we now have the conceptual foundation for the next crucial topic: why completeness enables array storage. We'll see how the mathematical relationships we've derived translate directly into an array implementation, eliminating the need for pointers entirely.
You now understand the level-by-level filling process as both a construction algorithm and a dynamic invariant. The deterministic relationship between tree size and node positions is the bridge between abstract tree structure and practical array-based implementation.