In the previous page, we defined complete binary trees by their structural property: all levels are completely filled except possibly the last, which is filled from left to right. Now we explore this filling process in depth—not just as a static property, but as a dynamic invariant that guides how we build, insert into, and delete from complete binary trees.
Understanding level-by-level filling is crucial because it answers fundamental questions: where does the next node go, which node is removed first, and how do we find any node's parent or children without searching?
The filling pattern isn't just a description—it's an algorithm for locating, inserting, and removing nodes in O(1) time (once we know the size).
By the end of this page, you will understand the precise level-order filling algorithm, how it determines insertion and deletion positions, the relationship between tree size and node position, and how this filling pattern creates the predictable index structure that enables array representation.
Level-order traversal (also called breadth-first traversal) visits tree nodes level by level, from top to bottom and within each level from left to right. For complete binary trees, this traversal order is more than a search strategy—it defines the canonical numbering of nodes.
The Level-Order Numbering System:
If we number nodes starting from 1 in level-order:
Node 1: Root (level 0)
Nodes 2-3: Level 1 (left child first, then right child of root)
Nodes 4-7: Level 2 (children of node 2, then children of node 3)
Nodes 8-15: Level 3
...
Nodes 2^k to 2^(k+1)-1: Level k
This numbering has mathematical elegance: for any node i, its left child is 2i and its right child is 2i+1. Its parent is ⌊i/2⌋.
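These relationships can be sketched as small helper functions (the function names here are ours, chosen only to make the arithmetic concrete):

```python
def parent(i: int) -> int:
    """1-indexed parent position: floor(i/2)."""
    return i // 2

def left(i: int) -> int:
    """1-indexed left-child position."""
    return 2 * i

def right(i: int) -> int:
    """1-indexed right-child position."""
    return 2 * i + 1

def level(i: int) -> int:
    """Level of position i: floor(log2(i)), computed without floats."""
    return i.bit_length() - 1

# Node 6: parent is 3, children are 12 and 13, and it sits on level 2.
print(parent(6), left(6), right(6), level(6))  # 3 12 13 2
```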
```
Level-Order Numbering in a Complete Binary Tree:

Level 0 (2⁰ = 1 node):             [1]
                                  /    \
Level 1 (2¹ = 2 nodes):        [2]      [3]
                               /  \     /  \
Level 2 (2² = 4 nodes):      [4]  [5] [6]  [7]
                             / \  /
Level 3 (partial):         [8][9][10][11] ...

Numbering Properties:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
• Node i is at level ⌊log₂(i)⌋
• Level k contains nodes 2^k through 2^(k+1) - 1
• Left child of i: 2i
• Right child of i: 2i + 1
• Parent of i: ⌊i/2⌋
• Left sibling of i (if i is a right child): i - 1
• Right sibling of i (if i is a left child): i + 1

This works BECAUSE nodes are filled in this exact order!
```

Level-Order Traversal Algorithm:
The classic BFS approach uses a queue:
1. Initialize queue with root
2. While queue is not empty:
a. Dequeue front node, process it
b. Enqueue left child (if exists)
c. Enqueue right child (if exists)
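The steps above can be sketched in Python as follows (the `Node` class here is a hypothetical minimal linked representation, shown only for illustration):

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Node:
    value: str
    left: "Node | None" = None
    right: "Node | None" = None

def level_order(root: "Node | None") -> list[str]:
    """BFS: visit nodes level by level, left to right within each level."""
    order: list[str] = []
    if root is None:
        return order
    queue = deque([root])              # 1. initialize queue with root
    while queue:                       # 2. while queue is not empty:
        node = queue.popleft()         #    a. dequeue front node, process it
        order.append(node.value)
        if node.left is not None:      #    b. enqueue left child (if exists)
            queue.append(node.left)
        if node.right is not None:     #    c. enqueue right child (if exists)
            queue.append(node.right)
    return order

# A 4-node complete tree: A at the root, B and C below, D under B.
tree = Node("A", Node("B", Node("D")), Node("C"))
print(level_order(tree))  # ['A', 'B', 'C', 'D']
```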
However, for complete binary trees with array representation, traversal is trivial—just iterate through the array from index 1 to n. The array is the level-order sequence.
The Filling Invariant:
The level-order numbering creates a fundamental invariant: a complete binary tree with n nodes occupies exactly positions 1 through n in level order, with no gaps.
Think of the level-order numbering as filling seats in a theater, row by row, left to right. If 50 people are seated, they occupy seats 1-50 with no empty seats. The next person takes seat 51. This predictability is why we can use a simple counter (the size n) to track everything about the tree's shape.
Let's trace exactly how a complete binary tree grows as we add nodes one by one. This process illustrates the level-order filling invariant in action.
Starting from Empty:
We'll build a complete binary tree by inserting nodes with values A, B, C, D, E, F, G, H, I, J in sequence.
```
Step-by-step construction (inserting A, B, C, D, E, F, G, H, I, J):

After A (n=1):    [A]                 Array: [_, A]

After B (n=2):    [A]                 Array: [_, A, B]
                  /
                [B]

After C (n=3):    [A]                 Array: [_, A, B, C]
                 /   \
               [B]   [C]

After D (n=4):    [A]                 Array: [_, A, B, C, D]
                 /   \
               [B]   [C]
               /
             [D]

After E (n=5):    [A]                 Array: [_, A, B, C, D, E]
                 /   \
               [B]   [C]
              /  \
            [D]  [E]

After F (n=6):    [A]                 Array: [_, A, B, C, D, E, F]
                 /   \
               [B]   [C]
              /  \   /
            [D]  [E][F]

After G (n=7):    [A]                 Array: [_, A, B, C, D, E, F, G]
                 /   \
               [B]   [C]   ← Level 2 now full!
              /  \   /  \
            [D][E] [F][G]

After H (n=8):    [A]                 Array: [_, A, B, C, D, E, F, G, H]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /
          [H]

After I (n=9):    [A]                 Array: [_, A, B, C, D, E, F, G, H, I]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /  \
          [H]  [I]

After J (n=10):   [A]                 Array: [_, A, B, C, D, E, F, G, H, I, J]
                 /   \
               [B]   [C]
              /  \   /  \
            [D][E] [F][G]
            /  \   /
          [H][I] [J]
```

Observations from the Trace:
Each insertion goes to position n+1 — After 6 nodes, the 7th goes to position 7. No calculation of 'where' is needed; we just append.
Levels fill completely before moving to the next — Level 2 (positions 4-7) fills completely before level 3 starts at position 8.
Within a level, filling is strictly left-to-right — Position 8 (leftmost of level 3) fills before position 9, never the reverse.
Parent of new node is deterministic — A node inserted at position n is the child of node ⌊n/2⌋. Position 10 is the child of position 5.
Left vs right child is determined by parity — Even positions are left children; odd positions (except 1) are right children.
Notice how every aspect of insertion is completely deterministic: given only the size n, we know exactly where the next node goes (position n+1), who its parent is (⌊(n+1)/2⌋), and whether it's a left or right child ((n+1) mod 2). This determinism eliminates the need for search during insertion.
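This determinism fits in a few lines (a sketch; the function name is ours):

```python
def next_insertion(n: int) -> dict:
    """Where the next node goes in a complete tree that currently has n nodes."""
    pos = n + 1
    return {
        "position": pos,
        "parent": pos // 2,              # ⌊(n+1)/2⌋
        "is_left_child": pos % 2 == 0,   # even positions are left children
    }

print(next_insertion(6))  # {'position': 7, 'parent': 3, 'is_left_child': False}
print(next_insertion(9))  # {'position': 10, 'parent': 5, 'is_left_child': True}
```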
The level-by-level filling property isn't just for construction—it must be maintained during heap operations. This creates specific constraints on how insertions and deletions work.
Insertion Strategy:
For heaps, insertion works as follows:
1. Append the new value at position n+1 — the only position that preserves completeness.
2. Bubble the value up, swapping it with its parent while it violates the heap order.
The key insight: we don't choose where to insert based on value. We insert at the structurally correct position, then fix ordering.
Deletion Strategy:
Heaps support deletion of the root (extract-max or extract-min). The process:
1. Save the root's value (position 1).
2. Move the last node (position n) into the root's position and shrink the array — completeness is preserved.
3. Sift the moved value down, swapping it with the appropriate child, until heap order is restored.
Again, we don't search for what to delete—we always operate at the extremes (position 1 and position n).
```python
class CompleteTree:
    """
    Demonstrates how completeness is maintained during dynamic operations.
    This is the structural skeleton that heaps build upon.
    """

    def __init__(self):
        # Index 0 unused for cleaner parent/child arithmetic
        self._data = [None]

    def size(self) -> int:
        """Current number of nodes."""
        return len(self._data) - 1

    def _parent(self, i: int) -> int:
        """Parent index of node at index i."""
        return i // 2

    def _left(self, i: int) -> int:
        """Left child index of node at index i."""
        return 2 * i

    def _right(self, i: int) -> int:
        """Right child index of node at index i."""
        return 2 * i + 1

    def _has_left(self, i: int) -> bool:
        """Check if node i has a left child."""
        return self._left(i) <= self.size()

    def _has_right(self, i: int) -> bool:
        """Check if node i has a right child."""
        return self._right(i) <= self.size()

    def insert(self, value) -> int:
        """
        Insert a value and return its position.
        The position is determined by the filling rule, not the value.
        """
        # Step 1: Append to maintain completeness
        self._data.append(value)
        new_position = self.size()
        # Step 2: For heaps, we would bubble up here
        # (omitted for pure structural demonstration)
        return new_position

    def remove_last(self):
        """
        Remove the last node (maintaining completeness).
        For heaps, the root is swapped with last, then this is called.
        """
        if self.size() == 0:
            raise IndexError("Cannot remove from empty tree")
        return self._data.pop()

    def get_insertion_position(self) -> dict:
        """Predict where the next insertion will go."""
        next_pos = self.size() + 1
        parent_pos = next_pos // 2
        is_left_child = (next_pos % 2 == 0)
        level = next_pos.bit_length() - 1  # floor(log2(next_pos))
        return {
            'position': next_pos,
            'parent': parent_pos,
            'is_left_child': is_left_child,
            'level': level
        }

    def get_last_position(self) -> dict:
        """Information about the current last node."""
        if self.size() == 0:
            return {'position': None}
        last_pos = self.size()
        parent_pos = last_pos // 2
        is_left_child = (last_pos % 2 == 0)
        level = last_pos.bit_length() - 1
        return {
            'position': last_pos,
            'parent': parent_pos,
            'is_left_child': is_left_child,
            'level': level,
            'value': self._data[last_pos]
        }

# Demonstration
tree = CompleteTree()
for val in ['A', 'B', 'C', 'D', 'E']:
    pos = tree.insert(val)
    info = tree.get_last_position()
    print(f"Inserted '{val}' at position {pos}, parent={info['parent']}, "
          f"is_left={info['is_left_child']}, level={info['level']}")

# Output:
# Inserted 'A' at position 1, parent=0, is_left=False, level=0
# Inserted 'B' at position 2, parent=1, is_left=True, level=1
# Inserted 'C' at position 3, parent=1, is_left=False, level=1
# Inserted 'D' at position 4, parent=2, is_left=True, level=2
# Inserted 'E' at position 5, parent=2, is_left=False, level=2
```

Why This Approach Works:
The beauty of this system is that maintaining completeness is essentially free: appending to the array is amortized O(1), and popping from the end is O(1).
The actual work in heap operations comes from restoring the heap property (bubbling), not from maintaining completeness. The complete tree structure is a consequence of always operating at the boundaries (append/pop) rather than in the middle.
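To show where that bubbling work happens, here is a minimal min-heap sketch built on the same append/pop skeleton (an illustrative implementation, not a production one):

```python
class MinHeap:
    """Min-heap sketch: completeness via append/pop, order via sifting."""

    def __init__(self):
        self._data = [None]  # index 0 unused, matching the 1-indexed formulas

    def __len__(self):
        return len(self._data) - 1

    def insert(self, value):
        self._data.append(value)  # position n+1: completeness preserved
        i = len(self)
        # Bubble up: swap with parent while smaller than it
        while i > 1 and self._data[i] < self._data[i // 2]:
            self._data[i], self._data[i // 2] = self._data[i // 2], self._data[i]
            i //= 2

    def extract_min(self):
        if len(self) == 0:
            raise IndexError("empty heap")
        root = self._data[1]
        last = self._data.pop()  # remove position n: completeness preserved
        if len(self) > 0:
            self._data[1] = last
            i, n = 1, len(self)
            # Sift down: swap with the smaller child until heap order holds
            while 2 * i <= n:
                child = 2 * i
                if child + 1 <= n and self._data[child + 1] < self._data[child]:
                    child += 1
                if self._data[i] <= self._data[child]:
                    break
                self._data[i], self._data[child] = self._data[child], self._data[i]
                i = child
        return root

h = MinHeap()
for v in [5, 3, 8, 1]:
    h.insert(v)
print([h.extract_min() for _ in range(len(h))])  # [1, 3, 5, 8]
```

Note that the structural moves are exactly the ones above — append at position n+1, pop from position n — and only the sifting loops do value-dependent work.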
Standard heaps don't support inserting at arbitrary positions or deleting arbitrary elements efficiently. If you need these operations, you need a different structure (indexed heap, Fibonacci heap, etc.). The complete tree property is maintained precisely because we only modify at the 'ends' of the level-order sequence.
The level-by-level filling creates a beautiful mathematical structure where the position of every node encodes all its relationships. Let's derive and prove these formulas rigorously.
The Fundamental Relationships (1-indexed):
For a node at position i:
Parent: ⌊i/2⌋
Left child: 2i
Right child: 2i + 1
Level: ⌊log₂(i)⌋
```
PROOF: Left child of node i is at position 2i

Consider level k, which contains nodes 2^k through 2^(k+1) - 1.
The first node of level k is 2^k. Level k has 2^k nodes.
Level k+1 (the children's level) has 2^(k+1) positions.

Node i is at offset (i - 2^k) within level k.
It has two children in level k+1. Children of earlier level-k nodes
precede children of later level-k nodes, so node i's children come
after the children of nodes 2^k through i - 1.

Number of children before node i's children:
    2 × (i - 2^k) = 2i - 2^(k+1)

The first position of level k+1 is 2^(k+1).

Therefore, the left child of node i is at:
    2^(k+1) + 2(i - 2^k) = 2^(k+1) + 2i - 2^(k+1) = 2i  ✓

The right child is simply 2i + 1.  ✓

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

PROOF: Parent of node i is at position ⌊i/2⌋

If i is a left child, i = 2p for some parent p.
Then ⌊i/2⌋ = ⌊2p/2⌋ = p.  ✓

If i is a right child, i = 2p + 1 for some parent p.
Then ⌊i/2⌋ = ⌊(2p+1)/2⌋ = ⌊p + 0.5⌋ = p.  ✓

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

DETERMINING IF NODE i IS A LEFT OR RIGHT CHILD:

i is a left child  ⟺ i is even ⟺ i = 2p
i is a right child ⟺ i is odd  ⟺ i = 2p + 1

Note: The root (i = 1) is a special case with no parent.
```

| Position i | Parent ⌊i/2⌋ | Left 2i | Right 2i+1 | Level | L/R Child |
|---|---|---|---|---|---|
| 1 (root) | — | 2 | 3 | 0 | — |
| 2 | 1 | 4 | 5 | 1 | Left |
| 3 | 1 | 6 | 7 | 1 | Right |
| 4 | 2 | 8 | 9 | 2 | Left |
| 5 | 2 | 10 | 11 | 2 | Right |
| 6 | 3 | 12 | 13 | 2 | Left |
| 7 | 3 | 14 | 15 | 2 | Right |
| 8 | 4 | 16 | 17 | 3 | Left |
| 9 | 4 | 18 | 19 | 3 | Right |
| 10 | 5 | 20 | 21 | 3 | Left |
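The table's formulas can be checked mechanically for any range of positions with a brute-force sanity check:

```python
import math

# Verify parent/child/level/parity relationships for positions 2..100
for i in range(2, 101):
    p = i // 2
    assert i in (2 * p, 2 * p + 1)           # i really is a child of ⌊i/2⌋
    assert (i % 2 == 0) == (i == 2 * p)      # even ⟺ left child
    assert i.bit_length() - 1 == math.floor(math.log2(i))  # level formula
print("all positions check out")
```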
Alternative: 0-indexed Formulas
Some implementations use 0-indexed arrays, with the root at index 0. The formulas adjust slightly: the parent of i is ⌊(i − 1)/2⌋, the left child is 2i + 1, and the right child is 2i + 2.
However, 1-indexed arithmetic is cleaner (parent = i/2, children = 2i and 2i+1), which is why many implementations leave index 0 unused.
For 1-indexed arrays: Parent of i is simply i >> 1 (right shift). Left child is i << 1 (left shift). Right child is (i << 1) | 1. Level is the position of the highest set bit minus one. These bit operations are extremely fast on modern CPUs.
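For example (plain Python, though the same shifts work in C or Java):

```python
i = 11                        # position 11, binary 1011
print(i >> 1)                 # 5  — parent
print(i << 1)                 # 22 — left child
print((i << 1) | 1)           # 23 — right child
print(i.bit_length() - 1)     # 3  — level: index of the highest set bit
```

A bonus of the 1-indexed scheme: the binary representation of i spells out the root-to-node path — after the leading 1 bit, each 0 means "go left" and each 1 means "go right" (for 11 = 0b1011, the path is left, right, right: 1 → 2 → 5 → 11).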
The level-order indexing also makes sibling and subtree relationships easily computable—useful for more advanced heap operations and analysis.
Sibling Relationships:
For node i (where i > 1): if i is even (a left child), its sibling is i + 1, provided i + 1 ≤ n; if i is odd (a right child), its sibling is i − 1, which always exists because levels fill left to right.
Subtree Size Calculation:
For a complete binary tree, we can compute the size of the subtree rooted at node i:
If node i has no children (2i > n, i.e., i > ⌊n/2⌋): subtree size = 1
If node i has one child: subtree size = 1 + subtree_size(left_child)
If node i has two children: subtree size = 1 + subtree_size(left) + subtree_size(right)
However, for a complete tree, there's a more direct formula based on the last node's position.
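One way to realize that direct computation (a sketch; the function name is ours) is to clip each level of the subtree against n — at depth k below node i, the subtree occupies positions i·2^k through (i+1)·2^k − 1:

```python
def subtree_size(i: int, n: int) -> int:
    """Size of the subtree rooted at position i in a complete tree of n nodes.

    At depth k below i, the subtree spans positions i*2^k .. (i+1)*2^k - 1;
    clipping each span against n and summing gives the size in O(log n).
    """
    size = 0
    first, last = i, i
    while first <= n:
        size += min(last, n) - first + 1   # nodes of this level inside the tree
        first, last = 2 * first, 2 * last + 1
    return size

print(subtree_size(1, 10))   # 10 — the whole tree
print(subtree_size(2, 10))   # 6  — nodes 2, 4, 5, 8, 9, 10
print(subtree_size(3, 10))   # 3  — nodes 3, 6, 7
```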
```python
def get_sibling(i: int, n: int) -> int | None:
    """
    Get the sibling of node i in a complete tree of size n.
    Returns None if no sibling exists.
    """
    if i == 1:  # Root has no sibling
        return None
    if i % 2 == 0:  # i is a left child
        sibling = i + 1
        return sibling if sibling <= n else None
    else:  # i is a right child
        sibling = i - 1
        return sibling  # Right child always has a left sibling

def is_leaf(i: int, n: int) -> bool:
    """Check if node i is a leaf in a complete tree of size n."""
    return 2 * i > n  # No left child means no children (complete property)

def is_internal(i: int, n: int) -> bool:
    """Check if node i is an internal node (has at least one child)."""
    return 2 * i <= n

def get_descendants_range(i: int, n: int) -> tuple[int | None, int | None]:
    """
    Get the range of indices of all descendants of node i.
    Returns (first_descendant, last_descendant).
    For a leaf, returns (None, None).
    """
    if is_leaf(i, n):
        return (None, None)
    # First descendant is left child
    first_desc = 2 * i
    # Last descendant: follow rightmost path until exceeding n
    last_desc = i
    while 2 * last_desc + 1 <= n:  # Has right child
        last_desc = 2 * last_desc + 1
    if 2 * last_desc <= n:  # Only has left child
        last_desc = 2 * last_desc
    return (first_desc, last_desc)

def get_ancestors(i: int) -> list[int]:
    """Get all ancestor indices from node i up to root (inclusive of i)."""
    ancestors = []
    while i >= 1:
        ancestors.append(i)
        i //= 2
    return ancestors

# Demonstration
n = 10  # Complete tree with 10 nodes
print(f"Complete tree with {n} nodes:")
print(f"Node 3 sibling: {get_sibling(3, n)}")  # 2
print(f"Node 4 sibling: {get_sibling(4, n)}")  # 5
print(f"Node 5 sibling: {get_sibling(5, n)}")  # 4
print(f"Node 7 sibling: {get_sibling(7, n)}")  # 6
print()
print(f"Leaves: {[i for i in range(1, n+1) if is_leaf(i, n)]}")
print(f"Internal: {[i for i in range(1, n+1) if is_internal(i, n)]}")
print()
print(f"Ancestors of node 10: {get_ancestors(10)}")  # [10, 5, 2, 1]
print(f"Ancestors of node 8: {get_ancestors(8)}")  # [8, 4, 2, 1]
```

Internal Nodes vs Leaves:
In a complete binary tree with n nodes, positions 1 through ⌊n/2⌋ are internal nodes, and positions ⌊n/2⌋ + 1 through n are leaves.
The count is always ⌊n/2⌋ internal nodes and ⌈n/2⌉ leaves.
This means roughly half the nodes are at the bottom level or very near it. This is crucial for the O(n) heap construction algorithm, which processes nodes in reverse level order.
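The counting claim can be made concrete with a small sketch (the function name is ours):

```python
def node_counts(n: int) -> tuple[int, int]:
    """(internal, leaf) counts in a complete binary tree with n nodes."""
    internal = n // 2          # positions 1..⌊n/2⌋ have at least one child
    return internal, n - internal   # leaves = n - ⌊n/2⌋ = ⌈n/2⌉

for n in (1, 7, 10, 15):
    print(n, node_counts(n))
# 1 (0, 1)
# 7 (3, 4)
# 10 (5, 5)
# 15 (7, 8)
```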
The level-by-level filling pattern has profound practical implications for heap implementation efficiency. Let's enumerate the key benefits.
1. O(1) Access to All Node Relationships
With pointer-based trees, following a parent pointer requires a memory access (and a likely cache miss). With array indices, every relationship — parent, child, sibling — is computed by pure arithmetic on i, with no memory access at all.
2. Memory Allocation Simplicity
Pointer-based trees allocate each node separately, leading to per-node allocation overhead, heap fragmentation, and poor cache locality as nodes scatter across memory.
Array-based heaps allocate a single contiguous block. Growth uses amortized O(1) reallocation (doubling strategy).
3. Serialization Triviality
To serialize a pointer-based tree, you need a traversal algorithm and a format for encoding structure. To serialize an array-based heap: write out the array and its length. The structure is implicit in the indices.
4. Copy and Clone Simplicity
Copying a pointer-based tree requires traversal and node allocation. Copying an array-based heap: memcpy the array. Single operation, maximum efficiency.
Modern CPUs load data in cache lines (typically 64 bytes). An array-based heap accessing adjacent indices often finds the data already in cache. Pointer-based trees chase pointers to unpredictable memory locations, causing cache misses. For large heaps, this difference can be 10-100x in actual performance.
We've thoroughly explored the level-by-level filling pattern that characterizes complete binary trees. This pattern is the key that unlocks efficient heap implementations.
What's Next:
With level-by-level filling understood, we now have the conceptual foundation for the next crucial topic: why completeness enables array storage. We'll see how the mathematical relationships we've derived translate directly into an array implementation, eliminating the need for pointers entirely.
You now understand the level-by-level filling process as both a construction algorithm and a dynamic invariant. The deterministic relationship between tree size and node positions is the bridge between abstract tree structure and practical array-based implementation.