We've established the conceptual foundations: non-linear data structures break free from sequential organization to model hierarchical and networked relationships. We understand one-to-many and many-to-many cardinality. Now it's time to meet the concrete structures that embody these concepts.
This page introduces the four foundational non-linear data structures that you'll encounter throughout computer science and software engineering:

- Trees: hierarchical structures with a single root and parent-child relationships
- Graphs: general networks of vertices connected by edges
- Heaps: priority-ordered trees with instant access to the extreme element
- Tries: prefix trees optimized for string operations
Each structure has unique characteristics, optimal use cases, and a rich ecosystem of algorithms. This page provides the conceptual overview; dedicated chapters will explore each in depth.
By the end of this page, you will have clear mental models for trees, graphs, heaps, and tries. You'll understand what makes each structure unique, when to use each, and how they relate to the relationship patterns we've studied. This overview prepares you for deep dives into each structure in subsequent chapters.
Trees are the most fundamental non-linear data structure. They model hierarchical relationships with a single root element, parent-child connections, and no cycles.
Formal definition:
A tree is a connected, acyclic graph with a designated root node. Every node except the root has exactly one parent. Every node can have zero or more children.
Core terminology:
| Term | Definition |
|---|---|
| Root | The topmost node with no parent |
| Parent | A node's direct ancestor |
| Child | A node's direct descendant |
| Sibling | Nodes sharing the same parent |
| Leaf | A node with no children |
| Internal node | A non-leaf node (has children) |
| Depth | Distance from the root (edges on path to root) |
| Height | Longest path from node to any leaf descendant |
| Subtree | A node and all its descendants |
Visualizing a tree:
                Root                  ← depth 0
                 │
        ┌────────┼────────┐
        │        │        │
     Child1   Child2   Child3         ← depth 1
        │        │        │
      ┌─┴─┐    ┌─┴─┐     Leaf
      │   │    │   │
    Leaf Leaf Leaf Leaf               ← depth 2 (some are leaves)
Key observations:

- Every node except the root has exactly one parent, so there is exactly one path from the root to any node.
- Depth is measured from the root down to a node; height is measured from a node down to its deepest leaf.
- Leaves can appear at different depths; any node with children is an internal node.
- Each node roots its own subtree, which is why tree algorithms are naturally recursive.
Trees appear in file systems (directories), HTML/XML (DOM), databases (B-trees), compilers (AST), AI (decision trees), version control (commit history), and countless other domains. Their hierarchical structure naturally models containment, classification, and decomposition—concepts fundamental to how humans organize information.
Trees support a rich set of operations. The efficiency of these operations depends on tree structure and balance.
Core tree operations:
| Operation | Description | General Tree | Balanced BST |
|---|---|---|---|
| Search | Find a node with given value | O(n) | O(log n) |
| Insert | Add a new node | O(1) if position known | O(log n) |
| Delete | Remove a node | O(1) if node known + O(k) for children handling | O(log n) |
| Find parent | Navigate to parent node | O(1) with parent pointer | O(1) |
| Find children | Get all children of a node | O(1) + O(k) to iterate k children | O(1) |
| Find depth | Distance to root | O(d) where d is depth | O(log n) for balanced |
| Traverse (all nodes) | Visit every node | O(n) | O(n) |
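To make these costs concrete, here is a minimal sketch of a general tree node in Python (the language and names like `TreeNode` are illustrative choices, not from any particular library):

```python
class TreeNode:
    """A node in a general (n-ary) tree."""
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent      # O(1) "find parent" via an explicit pointer
        self.children = []        # zero or more children

    def add_child(self, value):
        """Insert: O(1) once the position (this node) is known."""
        child = TreeNode(value, parent=self)
        self.children.append(child)
        return child

    def depth(self):
        """Find depth: walk parent pointers up to the root, O(d)."""
        d, node = 0, self
        while node.parent is not None:
            node = node.parent
            d += 1
        return d

def search(root, target):
    """Search a general tree: O(n), since every node may need a visit."""
    if root.value == target:
        return root
    for child in root.children:
        found = search(child, target)
        if found:
            return found
    return None

# Build a tiny tree:  A → (B, C),  B → (D)
root = TreeNode("A")
b = root.add_child("B")
root.add_child("C")
d = b.add_child("D")

print(search(root, "D") is d)   # True
print(d.depth())                # 2
```

Note how `depth` costs one step per level: the O(d) entry in the table above falls directly out of the parent-pointer walk.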
Tree traversal patterns:
Traversal is fundamental to tree processing. Binary trees have three classic depth-first orders, plus a breadth-first level-order pattern:
Pre-order (Root → Left → Right): Process root first, then recursively traverse subtrees. Used for copying trees, prefix expression evaluation.
In-order (Left → Root → Right): Traverse left subtree, process root, traverse right. For BSTs, this yields sorted order.
Post-order (Left → Right → Root): Traverse subtrees first, then process root. Used for deletion, postfix expression evaluation.
Level-order (BFS): Visit nodes level by level, left to right. Uses a queue rather than recursion.
        1
       / \
      2   3
     / \
    4   5
Pre-order: 1, 2, 4, 5, 3
In-order: 4, 2, 5, 1, 3
Post-order: 4, 5, 2, 3, 1
Level-order: 1, 2, 3, 4, 5
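The four traversals can be sketched in a few lines of Python; this is an illustrative implementation run against the example tree above:

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def pre_order(n):   # Root → Left → Right
    return [n.value] + pre_order(n.left) + pre_order(n.right) if n else []

def in_order(n):    # Left → Root → Right
    return in_order(n.left) + [n.value] + in_order(n.right) if n else []

def post_order(n):  # Left → Right → Root
    return post_order(n.left) + post_order(n.right) + [n.value] if n else []

def level_order(root):  # BFS: an explicit queue instead of recursion
    result, queue = [], deque([root])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        if n.left:
            queue.append(n.left)
        if n.right:
            queue.append(n.right)
    return result

# The example tree: 1 with children 2 (which has children 4, 5) and 3
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))

print(pre_order(tree))    # [1, 2, 4, 5, 3]
print(in_order(tree))     # [4, 2, 5, 1, 3]
print(post_order(tree))   # [4, 5, 2, 3, 1]
print(level_order(tree))  # [1, 2, 3, 4, 5]
```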
An unbalanced tree can degenerate to a linked list—height becomes O(n), and all O(log n) operations become O(n). This is why balanced trees (AVL, Red-Black, B-tree) are crucial for maintaining performance guarantees. We'll explore balancing techniques in depth in later chapters.
Graphs are the most general non-linear data structure. They can model any relationship pattern, including all the structures we've discussed as special cases.
Formal definition:
A graph G = (V, E) consists of a set V of vertices (nodes) and a set E of edges (connections between vertices). An edge connects two vertices and may be directed (one-way) or undirected (bidirectional).
Core terminology:
| Term | Definition |
|---|---|
| Vertex (Node) | A fundamental unit in the graph |
| Edge | A connection between two vertices |
| Adjacent | Two vertices connected by an edge |
| Degree | Number of edges connected to a vertex |
| Path | A sequence of vertices connected by edges |
| Cycle | A path that starts and ends at the same vertex |
| Connected | There exists a path between any two vertices |
| Component | A maximal connected subgraph |
Visualizing graphs:
Undirected Graph:                  Directed Graph (Digraph):

    A ─── B                            A ──→ B
    │ \   │                            │     │
    │  \  │                            ↓     ↓
    │   \ │                            D ←── C
    D ─── C

Edges: A-B, A-C, A-D, B-C, D-C     Edges: A→B, A→D, B→C, C→D
(5 undirected edges)               (4 directed edges)
Key graph variants:

- Directed vs. undirected: edges are one-way arrows or bidirectional links.
- Weighted vs. unweighted: edges may carry costs, distances, or capacities.
- Cyclic vs. acyclic: directed acyclic graphs (DAGs) are central to modeling dependencies.
- Connected vs. disconnected: a disconnected graph splits into multiple components.
A tree is precisely a connected, acyclic, undirected graph with a designated root. Every tree can be processed using graph algorithms, but specialized tree algorithms are often more efficient because they exploit the additional structural constraints.
Graphs can be represented in multiple ways, each with different trade-offs:
Representation 1: Adjacency Matrix
A V×V matrix where entry (i,j) is 1 if there's an edge from vertex i to vertex j, 0 otherwise.
      A  B  C  D
  A [ 0  1  1  1 ]
  B [ 1  0  1  0 ]
  C [ 1  1  0  1 ]
  D [ 1  0  1  0 ]
Representation 2: Adjacency List
Each vertex stores a list of its neighbors.
A → [B, C, D]
B → [A, C]
C → [A, B, D]
D → [A, C]
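A quick sketch of both representations in Python, built from the example graph above (the variable names are illustrative):

```python
# The example undirected graph: edges A-B, A-C, A-D, B-C, C-D
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D")]

# Adjacency matrix: a V×V grid where entry [i][j] = 1 iff an edge exists
index = {v: i for i, v in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1   # mirror the entry: the graph is undirected

# Adjacency list: each vertex maps to the list of its neighbors
adj = {v: [] for v in vertices}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(matrix[index["A"]])              # [0, 1, 1, 1] — matches row A above
print(adj["A"])                        # ['B', 'C', 'D']
print(matrix[index["B"]][index["D"]])  # 0, since there is no B-D edge
```

The O(1) edge check of the matrix versus the O(degree) neighbor iteration of the list is visible here: `matrix[i][j]` is a single lookup, while `adj[u]` hands back exactly the neighbors and nothing else.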
| Operation | Adjacency Matrix | Adjacency List |
|---|---|---|
| Check if edge exists | O(1) | O(degree) |
| Get all neighbors | O(V) | O(degree) |
| Add edge | O(1) | O(1) |
| Remove edge | O(1) | O(degree) |
| Add vertex | O(V²) — rebuild matrix | O(1) |
| Space | O(V²) | O(V + E) |
Fundamental graph algorithms:
| Algorithm | Purpose | Complexity |
|---|---|---|
| BFS | Find shortest path (unweighted), level-order traversal | O(V + E) |
| DFS | Explore all paths, detect cycles, topological sort | O(V + E) |
| Dijkstra | Shortest path in weighted graph (non-negative weights) | O((V + E) log V) |
| Bellman-Ford | Shortest path with negative weights | O(V × E) |
| Topological Sort | Order vertices in DAG by dependencies | O(V + E) |
| Kruskal/Prim | Minimum spanning tree | O(E log V) |
| Floyd-Warshall | All-pairs shortest paths | O(V³) |
These algorithms form the foundation of graph processing and will be explored thoroughly in dedicated chapters.
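As a small taste of these algorithms, here is an illustrative BFS in Python that computes unweighted shortest-path distances over an adjacency list (the graph is the undirected example from earlier):

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph, O(V + E)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit = shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# The undirected example graph as an adjacency list
adj = {"A": ["B", "C", "D"], "B": ["A", "C"],
       "C": ["A", "B", "D"], "D": ["A", "C"]}

print(bfs_distances(adj, "B"))   # {'B': 0, 'A': 1, 'C': 1, 'D': 2}
```

Because BFS expands the frontier one level at a time, the first time a vertex is reached is guaranteed to be along a shortest path, which is exactly why the O(V + E) entry in the table needs no priority queue.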
A heap is a specialized tree structure designed for one purpose: efficient access to the highest (or lowest) priority element.
Formal definition:
A (binary) heap is a complete binary tree that satisfies the heap property: for a max-heap, every parent is greater than or equal to its children; for a min-heap, every parent is less than or equal to its children.
Key characteristics:
Complete binary tree: All levels are fully filled except possibly the last, which is filled left-to-right. This ensures height ≈ log n.
Heap property: The root is always the maximum (max-heap) or minimum (min-heap) element. This enables O(1) access to the extreme value.
Array representation: Due to completeness, heaps can be stored in arrays without explicit pointers: the node at index i has its parent at index (i − 1) / 2 (integer division), its left child at index 2i + 1, and its right child at index 2i + 2.
Visualizing a max-heap:
            90               Array:    [90, 70, 80, 40, 50, 60, 30]
          /    \             Indices:    0   1   2   3   4   5   6
        70      80
       /  \    /  \
     40   50  60   30

Property: every parent ≥ both children
90 ≥ 70, 90 ≥ 80, 70 ≥ 40, 70 ≥ 50, 80 ≥ 60, 80 ≥ 30 ✓
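The index arithmetic that makes the array representation work can be checked directly; a small Python sketch using the max-heap array shown above:

```python
heap = [90, 70, 80, 40, 50, 60, 30]   # the max-heap above, as a plain array

def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

# Children of index 1 (value 70) sit at indices 3 and 4 (values 40 and 50)
print(heap[left(1)], heap[right(1)])   # 40 50
# Parent of index 5 (value 60) sits at index 2 (value 80)
print(heap[parent(5)])                 # 80

# Verify the max-heap property across the whole array
ok = all(heap[parent(i)] >= heap[i] for i in range(1, len(heap)))
print(ok)                              # True
```

No pointers are stored anywhere: the shape of the complete tree is implicit in the indices, which is what makes heaps so compact.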
| Operation | Description | Complexity |
|---|---|---|
| peek/top | Access the max (or min) element | O(1) |
| insert | Add a new element (bubble up) | O(log n) |
| extract-max/min | Remove and return the extreme element | O(log n) |
| heapify | Convert an array to a heap | O(n) |
| increase-key | Increase an element's value (bubble up) | O(log n) |
| delete | Remove arbitrary element | O(log n) |
A priority queue is an abstract data type where the highest-priority element is always accessible. Binary heaps are the most common implementation. Priority queues are crucial for scheduling, Dijkstra's algorithm, Huffman coding, and many other applications where processing order depends on priority, not arrival time.
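Python's standard `heapq` module is one widely used heap implementation (note that it provides a min-heap over a plain list); a brief illustrative sketch:

```python
import heapq

# A priority queue of (priority, task) pairs; lower number = higher priority
tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "clean inbox"))

print(tasks[0])               # (1, 'fix outage') — peek at the min in O(1)
print(heapq.heappop(tasks))   # (1, 'fix outage') — extract-min in O(log n)

# heapify: turn an arbitrary list into a valid heap in O(n)
values = [90, 70, 80, 40, 50, 60, 30]
heapq.heapify(values)
print(values[0])              # 30 — the minimum is now at the root
```

To get max-heap behavior with `heapq`, a common idiom is to push negated priorities; the structure itself is the same.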
A trie (pronounced "try" or "tree", from retrieval) is a specialized tree structure optimized for string operations, particularly prefix-based queries.
Formal definition:
A trie is a tree where each node represents a prefix of stored strings. The path from root to any node spells out that prefix. Edges are labeled with characters, and nodes may be marked as word endings.
Key characteristics:
Prefix-based structure: Common prefixes are stored once. "cat", "car", "card" share the path c-a.
No string duplication: Unlike hash tables, the trie structure itself encodes the strings.
Alphabetical ordering: Traversing a trie's children in alphabetical order yields the stored strings in sorted order.
Prefix operations in O(m): Finding all strings with a given prefix takes O(m) where m is prefix length—independent of dictionary size.
Visualizing a trie:
Storing words: "cat", "car", "card", "care", "do", "dog"
            (root)
           /      \
          c        d
          |        |
          a        o*
         / \       |
        t*  r*     g*
           / \
          d*  e*

* marks the end of a stored word
o* is the node for "do"; its child g* completes "dog"
r* is the node for "car"; its children d* and e* complete "card" and "care"
To find "card": traverse c → a → r → d. Four character steps, regardless of dictionary size.
| Operation | Description | Complexity |
|---|---|---|
| Insert | Add a word to the trie | O(m) where m = word length |
| Search | Check if word exists | O(m) |
| Prefix search | Check if any word has given prefix | O(m) |
| Find all with prefix | Get all words starting with prefix | O(m + k) where k = matching words |
| Delete | Remove a word | O(m) |
| Autocomplete | Suggest completions for prefix | O(m + k) |
Hash tables give O(1) average lookup for exact strings. Tries are slower for exact lookup (O(m)) but excel at prefix operations. If you need 'all words starting with X', tries are essential. If you only need 'does exactly this word exist', hash tables are simpler.
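A compact trie sketch in Python, storing the example words from above (the class and method names are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # character → TrieNode
        self.is_word = False   # marks the end of a stored word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):                  # O(m), m = len(word)
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):                  # O(m) exact lookup
        node = self._walk(word)
        return node is not None and node.is_word

    def words_with_prefix(self, prefix):     # O(m + k), k = matches
        node = self._walk(prefix)
        if node is None:
            return []
        out = []
        def collect(n, path):
            if n.is_word:
                out.append(path)
            for ch in sorted(n.children):    # sorted children → sorted output
                collect(n.children[ch], path + ch)
        collect(node, prefix)
        return out

    def _walk(self, s):
        """Follow s character by character; None if the path breaks off."""
        node = self.root
        for ch in s:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node

t = Trie()
for w in ["cat", "car", "card", "care", "do", "dog"]:
    t.insert(w)

print(t.search("card"))              # True
print(t.search("ca"))                # False — a prefix, not a stored word
print(t.words_with_prefix("car"))    # ['car', 'card', 'care']
```

Notice that `words_with_prefix` never looks at words outside the "car" subtree: the prefix walk prunes the rest of the dictionary, which is exactly the advantage over a hash table.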
Let's systematically compare these four non-linear structures to clarify when to use each.
| Aspect | Trees (General) | Graphs | Heaps | Tries |
|---|---|---|---|---|
| Relationship model | One-to-many (hierarchy) | Many-to-many (network) | One-to-many (priority) | One-to-many (prefixes) |
| Cycles allowed | No | Yes | No | No |
| Primary optimization | Hierarchical access | Path finding | Priority access | Prefix operations |
| Common representation | Node + child pointers | Adjacency list/matrix | Array | Node + character map |
| Space complexity | O(n) | O(V + E) | O(n) | O(ALPHABET × m × n) |
| Typical use case | File systems, parsing | Networks, dependencies | Scheduling | Autocomplete, routing |
The 'best' structure depends entirely on your domain and operations. A social network needs graphs. A file system needs trees. A task scheduler needs heaps. A search engine needs tries (and much more). Match the structure to the problem's inherent relationships and required operations.
Non-linear structures don't exist in isolation—they're built on and interface with linear structures in important ways.
Linear structures as building blocks:
| Non-Linear Structure | Uses These Linear Structures |
|---|---|
| Trees | Arrays (for heap), arrays of children (n-ary) |
| Graphs | Array of adjacency arrays, array-based matrix |
| Heaps | Stored directly as arrays |
| Tries | Arrays or maps at each node for children |
Converting between structures:
Many algorithms convert between linear and non-linear:
Tree traversals produce linear sequences. In-order traversal of a BST produces a sorted array.
Arrays can form trees. Heapify converts an array to a heap. Binary search conceptually treats a sorted array as a BST.
Graphs can be linearized. Topological sort produces a linear ordering of a DAG. BFS produces level-by-level sequences.
Tries can produce sorted outputs. Traversing a trie in alphabetical order produces a sorted list of words.
The spectrum of structure:
Simplest ─────────────────────────────────────────── Most Complex

Arrays → Linked Lists → Trees → DAGs → General Graphs

            ── more relationships, more complexity ──→
Each step adds expressive power at the cost of algorithmic complexity. The skill is choosing the simplest structure that captures your data's true relationships—adding structure for what you need, avoiding complexity you don't.
Always prefer the most constrained structure that faithfully models your domain. Using a graph when a tree suffices wastes algorithmic efficiency. Using a tree when a graph is needed loses essential information. The art is in the precise match.
We've toured the four foundational non-linear data structures. Let's consolidate the essential insights:

- Trees model hierarchy: a single root, no cycles, exactly one parent per node.
- Graphs model arbitrary many-to-many networks; trees, DAGs, and the other structures are special cases.
- Heaps use a complete binary tree to guarantee O(1) access to the highest (or lowest) priority element.
- Tries store shared prefixes once, making prefix queries O(m) in the query length, independent of dictionary size.
What's next:
With this conceptual foundation of non-linear data structures complete, you're prepared to explore each structure in depth. Upcoming chapters will cover:

- Trees: binary trees, binary search trees, and balancing techniques (AVL, Red-Black, B-trees)
- Graphs: representations, BFS and DFS, shortest paths, and minimum spanning trees
- Heaps: array-based implementation, heapify, and priority-queue applications
- Tries: implementation choices and string-processing applications such as autocomplete
Each chapter builds on the conceptual understanding you've developed here.
Congratulations! You've completed the Non-Linear Data Structures module. You now understand what makes a structure non-linear, the distinction between hierarchical and network relationships, cardinality patterns, and the four foundational non-linear structures. This conceptual mastery prepares you for deep dives into each structure's implementation and algorithms.