We've established the conceptual foundations: non-linear data structures break free from sequential organization to model hierarchical and networked relationships. We understand one-to-many and many-to-many cardinality. Now it's time to meet the concrete structures that embody these concepts.
This page introduces the four foundational non-linear data structures that you'll encounter throughout computer science and software engineering:

- Trees: hierarchical structures with a single root and parent-child relationships
- Graphs: general networks of vertices connected by edges
- Heaps: priority-ordered trees with instant access to the extreme element
- Tries: prefix trees optimized for string operations
Each structure has unique characteristics, optimal use cases, and a rich ecosystem of algorithms. This page provides the conceptual overview; dedicated chapters will explore each in depth.
By the end of this page, you will have clear mental models for trees, graphs, heaps, and tries. You'll understand what makes each structure unique, when to use each, and how they relate to the relationship patterns we've studied. This overview prepares you for deep dives into each structure in subsequent chapters.
Trees are the most fundamental non-linear data structure. They model hierarchical relationships with a single root element, parent-child connections, and no cycles.
Formal definition:
A tree is a connected, acyclic graph with a designated root node. Every node except the root has exactly one parent. Every node can have zero or more children.
Core terminology:
| Term | Definition |
|---|---|
| Root | The topmost node with no parent |
| Parent | A node's direct ancestor |
| Child | A node's direct descendant |
| Sibling | Nodes sharing the same parent |
| Leaf | A node with no children |
| Internal node | A non-leaf node (has children) |
| Depth | Distance from the root (edges on path to root) |
| Height | Longest path from node to any leaf descendant |
| Subtree | A node and all its descendants |
Visualizing a tree:
                Root                  ← depth 0
                 │
        ┌────────┼────────┐
        │        │        │
     Child1   Child2   Child3         ← depth 1
        │        │        │
      ┌─┴─┐    ┌─┴─┐     Leaf
      │   │    │   │
    Leaf Leaf Leaf Leaf               ← depth 2 (some are leaves)
Key observations:

- Every node except the root has exactly one parent, so there is exactly one path from the root to any node.
- Depth is measured from the root down to a node; height is measured from a node down to its deepest leaf.
- Leaves can appear at different depths; any node with children is an internal node.
- Each node roots its own subtree, which is why tree algorithms are naturally recursive.
Trees appear in file systems (directories), HTML/XML (DOM), databases (B-trees), compilers (AST), AI (decision trees), version control (commit history), and countless other domains. Their hierarchical structure naturally models containment, classification, and decomposition—concepts fundamental to how humans organize information.
Trees support a rich set of operations. The efficiency of these operations depends on tree structure and balance.
Core tree operations:
| Operation | Description | General Tree | Balanced BST |
|---|---|---|---|
| Search | Find a node with given value | O(n) | O(log n) |
| Insert | Add a new node | O(1) if position known | O(log n) |
| Delete | Remove a node | O(1) if node known + O(k) for children handling | O(log n) |
| Find parent | Navigate to parent node | O(1) with parent pointer | O(1) |
| Find children | Get all children of a node | O(1) + O(k) to iterate k children | O(1) |
| Find depth | Distance to root | O(d) where d is depth | O(log n) for balanced |
| Traverse (all nodes) | Visit every node | O(n) | O(n) |
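To make these costs concrete, here is a minimal sketch of a general tree node in Python (the language and names like `TreeNode` are illustrative choices, not from any particular library):

```python
class TreeNode:
    """A node in a general (n-ary) tree."""
    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent      # O(1) "find parent" via an explicit pointer
        self.children = []        # zero or more children

    def add_child(self, value):
        """Insert: O(1) once the position (this node) is known."""
        child = TreeNode(value, parent=self)
        self.children.append(child)
        return child

    def depth(self):
        """Find depth: walk parent pointers up to the root, O(d)."""
        d, node = 0, self
        while node.parent is not None:
            node = node.parent
            d += 1
        return d

def search(root, target):
    """Search a general tree: O(n), since every node may need a visit."""
    if root.value == target:
        return root
    for child in root.children:
        found = search(child, target)
        if found:
            return found
    return None

# Build a tiny tree:  A → (B, C),  B → (D)
root = TreeNode("A")
b = root.add_child("B")
root.add_child("C")
d = b.add_child("D")

print(search(root, "D") is d)   # True
print(d.depth())                # 2
```

Note how `depth` costs one step per level: the O(d) entry in the table above falls directly out of the parent-pointer walk.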
Tree traversal patterns:
Traversal is fundamental to tree processing. Binary trees have three classic depth-first orders, plus a breadth-first level-order pattern:
Pre-order (Root → Left → Right): Process root first, then recursively traverse subtrees. Used for copying trees, prefix expression evaluation.
In-order (Left → Root → Right): Traverse left subtree, process root, traverse right. For BSTs, this yields sorted order.
Post-order (Left → Right → Root): Traverse subtrees first, then process root. Used for deletion, postfix expression evaluation.
Level-order (BFS): Visit nodes level by level, left to right. Uses a queue rather than recursion.
        1
       / \
      2   3
     / \
    4   5
Pre-order: 1, 2, 4, 5, 3
In-order: 4, 2, 5, 1, 3
Post-order: 4, 5, 2, 3, 1
Level-order: 1, 2, 3, 4, 5
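The four traversals can be sketched in a few lines of Python; this is an illustrative implementation run against the example tree above:

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def pre_order(n):   # Root → Left → Right
    return [n.value] + pre_order(n.left) + pre_order(n.right) if n else []

def in_order(n):    # Left → Root → Right
    return in_order(n.left) + [n.value] + in_order(n.right) if n else []

def post_order(n):  # Left → Right → Root
    return post_order(n.left) + post_order(n.right) + [n.value] if n else []

def level_order(root):  # BFS: an explicit queue instead of recursion
    result, queue = [], deque([root])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        if n.left:
            queue.append(n.left)
        if n.right:
            queue.append(n.right)
    return result

# The example tree: 1 with children 2 (which has children 4, 5) and 3
tree = Node(1, Node(2, Node(4), Node(5)), Node(3))

print(pre_order(tree))    # [1, 2, 4, 5, 3]
print(in_order(tree))     # [4, 2, 5, 1, 3]
print(post_order(tree))   # [4, 5, 2, 3, 1]
print(level_order(tree))  # [1, 2, 3, 4, 5]
```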
An unbalanced tree can degenerate to a linked list—height becomes O(n), and all O(log n) operations become O(n). This is why balanced trees (AVL, Red-Black, B-tree) are crucial for maintaining performance guarantees. We'll explore balancing techniques in depth in later chapters.
Graphs are the most general non-linear data structure. They can model any relationship pattern, including all the structures we've discussed as special cases.
Formal definition:
A graph G = (V, E) consists of a set V of vertices (nodes) and a set E of edges (connections between vertices). An edge connects two vertices and may be directed (one-way) or undirected (bidirectional).
Core terminology:
| Term | Definition |
|---|---|
| Vertex (Node) | A fundamental unit in the graph |
| Edge | A connection between two vertices |
| Adjacent | Two vertices connected by an edge |
| Degree | Number of edges connected to a vertex |
| Path | A sequence of vertices connected by edges |
| Cycle | A path that starts and ends at the same vertex |
| Connected | There exists a path between any two vertices |
| Component | A maximal connected subgraph |
Visualizing graphs:
Undirected Graph:                  Directed Graph (Digraph):

    A ─── B                            A ──→ B
    │ \   │                            │     │
    │  \  │                            ↓     ↓
    │   \ │                            D ←── C
    D ─── C

Edges: A-B, A-C, A-D, B-C, D-C     Edges: A→B, A→D, B→C, C→D
(5 undirected edges)               (4 directed edges)
Key graph variants:

- Directed vs. undirected: edges are one-way arrows or bidirectional links.
- Weighted vs. unweighted: edges may carry costs, distances, or capacities.
- Cyclic vs. acyclic: directed acyclic graphs (DAGs) are central to modeling dependencies.
- Connected vs. disconnected: a disconnected graph splits into multiple components.
A tree is precisely a connected, acyclic, undirected graph with a designated root. Every tree can be processed using graph algorithms, but specialized tree algorithms are often more efficient because they exploit the additional structural constraints.
Graphs can be represented in multiple ways, each with different trade-offs:
Representation 1: Adjacency Matrix
A V×V matrix where entry (i,j) is 1 if there's an edge from vertex i to vertex j, 0 otherwise.
      A  B  C  D
  A [ 0  1  1  1 ]
  B [ 1  0  1  0 ]
  C [ 1  1  0  1 ]
  D [ 1  0  1  0 ]
Representation 2: Adjacency List
Each vertex stores a list of its neighbors.
A → [B, C, D]
B → [A, C]
C → [A, B, D]
D → [A, C]
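A quick sketch of both representations in Python, built from the example graph above (the variable names are illustrative):

```python
# The example undirected graph: edges A-B, A-C, A-D, B-C, C-D
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("C", "D")]

# Adjacency matrix: a V×V grid where entry [i][j] = 1 iff an edge exists
index = {v: i for i, v in enumerate(vertices)}
matrix = [[0] * len(vertices) for _ in vertices]
for u, v in edges:
    matrix[index[u]][index[v]] = 1
    matrix[index[v]][index[u]] = 1   # mirror the entry: the graph is undirected

# Adjacency list: each vertex maps to the list of its neighbors
adj = {v: [] for v in vertices}
for u, v in edges:
    adj[u].append(v)
    adj[v].append(u)

print(matrix[index["A"]])              # [0, 1, 1, 1] — matches row A above
print(adj["A"])                        # ['B', 'C', 'D']
print(matrix[index["B"]][index["D"]])  # 0, since there is no B-D edge
```

The O(1) edge check of the matrix versus the O(degree) neighbor iteration of the list is visible here: `matrix[i][j]` is a single lookup, while `adj[u]` hands back exactly the neighbors and nothing else.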
| Operation | Adjacency Matrix | Adjacency List |
|---|---|---|
| Check if edge exists | O(1) | O(degree) |
| Get all neighbors | O(V) | O(degree) |
| Add edge | O(1) | O(1) |
| Remove edge | O(1) | O(degree) |
| Add vertex | O(V²) — rebuild matrix | O(1) |
| Space | O(V²) | O(V + E) |
Fundamental graph algorithms:
| Algorithm | Purpose | Complexity |
|---|---|---|
| BFS | Find shortest path (unweighted), level-order traversal | O(V + E) |
| DFS | Explore all paths, detect cycles, topological sort | O(V + E) |
| Dijkstra | Shortest path in weighted graph (non-negative weights) | O((V + E) log V) |
| Bellman-Ford | Shortest path with negative weights | O(V × E) |
| Topological Sort | Order vertices in DAG by dependencies | O(V + E) |
| Kruskal/Prim | Minimum spanning tree | O(E log V) |
| Floyd-Warshall | All-pairs shortest paths | O(V³) |
These algorithms form the foundation of graph processing and will be explored thoroughly in dedicated chapters.
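As a small taste of these algorithms, here is an illustrative BFS in Python that computes unweighted shortest-path distances over an adjacency list (the graph is the undirected example from earlier):

```python
from collections import deque

def bfs_distances(adj, source):
    """Shortest-path distances from source in an unweighted graph, O(V + E)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:          # first visit = shortest distance
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# The undirected example graph as an adjacency list
adj = {"A": ["B", "C", "D"], "B": ["A", "C"],
       "C": ["A", "B", "D"], "D": ["A", "C"]}

print(bfs_distances(adj, "B"))   # {'B': 0, 'A': 1, 'C': 1, 'D': 2}
```

Because BFS expands the frontier one level at a time, the first time a vertex is reached is guaranteed to be along a shortest path, which is exactly why the O(V + E) entry in the table needs no priority queue.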
A heap is a specialized tree structure designed for one purpose: efficient access to the highest (or lowest) priority element.
Formal definition:
A (binary) heap is a complete binary tree that satisfies the heap property: for a max-heap, every parent is greater than or equal to its children; for a min-heap, every parent is less than or equal to its children.
Key characteristics:
Complete binary tree: All levels are fully filled except possibly the last, which is filled left-to-right. This ensures height ≈ log n.
Heap property: The root is always the maximum (max-heap) or minimum (min-heap) element. This enables O(1) access to the extreme value.
Array representation: Due to completeness, heaps can be stored in arrays without explicit pointers: the node at index i has its parent at index (i − 1) / 2 (integer division), its left child at index 2i + 1, and its right child at index 2i + 2.
Visualizing a max-heap:
            90               Array:    [90, 70, 80, 40, 50, 60, 30]
          /    \             Indices:    0   1   2   3   4   5   6
        70      80
       /  \    /  \
     40   50  60   30

Property: every parent ≥ both children
90 ≥ 70, 90 ≥ 80, 70 ≥ 40, 70 ≥ 50, 80 ≥ 60, 80 ≥ 30 ✓
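The index arithmetic that makes the array representation work can be checked directly; a small Python sketch using the max-heap array shown above:

```python
heap = [90, 70, 80, 40, 50, 60, 30]   # the max-heap above, as a plain array

def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

# Children of index 1 (value 70) sit at indices 3 and 4 (values 40 and 50)
print(heap[left(1)], heap[right(1)])   # 40 50
# Parent of index 5 (value 60) sits at index 2 (value 80)
print(heap[parent(5)])                 # 80

# Verify the max-heap property across the whole array
ok = all(heap[parent(i)] >= heap[i] for i in range(1, len(heap)))
print(ok)                              # True
```

No pointers are stored anywhere: the shape of the complete tree is implicit in the indices, which is what makes heaps so compact.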
| Operation | Description | Complexity |
|---|---|---|
| peek/top | Access the max (or min) element | O(1) |
| insert | Add a new element (bubble up) | O(log n) |
| extract-max/min | Remove and return the extreme element | O(log n) |
| heapify | Convert an array to a heap | O(n) |
| increase-key | Increase an element's value (bubble up) | O(log n) |
| delete | Remove arbitrary element | O(log n) |
A priority queue is an abstract data type where the highest-priority element is always accessible. Binary heaps are the most common implementation. Priority queues are crucial for scheduling, Dijkstra's algorithm, Huffman coding, and many other applications where processing order depends on priority, not arrival time.
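Python's standard `heapq` module is one widely used heap implementation (note that it provides a min-heap over a plain list); a brief illustrative sketch:

```python
import heapq

# A priority queue of (priority, task) pairs; lower number = higher priority
tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (3, "clean inbox"))

print(tasks[0])               # (1, 'fix outage') — peek at the min in O(1)
print(heapq.heappop(tasks))   # (1, 'fix outage') — extract-min in O(log n)

# heapify: turn an arbitrary list into a valid heap in O(n)
values = [90, 70, 80, 40, 50, 60, 30]
heapq.heapify(values)
print(values[0])              # 30 — the minimum is now at the root
```

To get max-heap behavior with `heapq`, a common idiom is to push negated priorities; the structure itself is the same.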
A trie (pronounced "try" or "tree", from retrieval) is a specialized tree structure optimized for string operations, particularly prefix-based queries.
Formal definition:
A trie is a tree where each node represents a prefix of stored strings. The path from root to any node spells out that prefix. Edges are labeled with characters, and nodes may be marked as word endings.
Key characteristics:
Prefix-based structure: Common prefixes are stored once. "cat", "car", "card" share the path c-a.
No string duplication: Unlike hash tables, the trie structure itself encodes the strings.
Alphabetical ordering: Traversing a trie's children in alphabetical order yields the stored strings in sorted order.
Prefix operations in O(m): Finding all strings with a given prefix takes O(m) where m is prefix length—independent of dictionary size.
Visualizing a trie:
Storing words: "cat", "car", "card", "care", "do", "dog"
            (root)
           /      \
          c        d
          |        |
          a        o*
         / \       |
        t*  r*     g*
           / \
          d*  e*

* marks the end of a stored word
o* is the node for "do"; its child g* completes "dog"
r* is the node for "car"; its children d* and e* complete "card" and "care"
To find "card": traverse c → a → r → d. Four character steps, regardless of dictionary size.
| Operation | Description | Complexity |
|---|---|---|
| Insert | Add a word to the trie | O(m) where m = word length |
| Search | Check if word exists | O(m) |
| Prefix search | Check if any word has given prefix | O(m) |
| Find all with prefix | Get all words starting with prefix | O(m + k) where k = matching words |
| Delete | Remove a word | O(m) |
| Autocomplete | Suggest completions for prefix | O(m + k) |
Hash tables give O(1) average lookup for exact strings. Tries are slower for exact lookup (O(m)) but excel at prefix operations. If you need 'all words starting with X', tries are essential. If you only need 'does exactly this word exist', hash tables are simpler.
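A compact trie sketch in Python, storing the example words from above (the class and method names are illustrative):

```python
class TrieNode:
    def __init__(self):
        self.children = {}     # character → TrieNode
        self.is_word = False   # marks the end of a stored word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):                  # O(m), m = len(word)
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def search(self, word):                  # O(m) exact lookup
        node = self._walk(word)
        return node is not None and node.is_word

    def words_with_prefix(self, prefix):     # O(m + k), k = matches
        node = self._walk(prefix)
        if node is None:
            return []
        out = []
        def collect(n, path):
            if n.is_word:
                out.append(path)
            for ch in sorted(n.children):    # sorted children → sorted output
                collect(n.children[ch], path + ch)
        collect(node, prefix)
        return out

    def _walk(self, s):
        """Follow s character by character; None if the path breaks off."""
        node = self.root
        for ch in s:
            if ch not in node.children:
                return None
            node = node.children[ch]
        return node

t = Trie()
for w in ["cat", "car", "card", "care", "do", "dog"]:
    t.insert(w)

print(t.search("card"))              # True
print(t.search("ca"))                # False — a prefix, not a stored word
print(t.words_with_prefix("car"))    # ['car', 'card', 'care']
```

Notice that `words_with_prefix` never looks at words outside the "car" subtree: the prefix walk prunes the rest of the dictionary, which is exactly the advantage over a hash table.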
Let's systematically compare these four non-linear structures to clarify when to use each.
| Aspect | Trees (General) | Graphs | Heaps | Tries |
|---|---|---|---|---|
| Relationship model | One-to-many (hierarchy) | Many-to-many (network) | One-to-many (priority) | One-to-many (prefixes) |
| Cycles allowed | No | Yes | No | No |
| Primary optimization | Hierarchical access | Path finding | Priority access | Prefix operations |
| Common representation | Node + child pointers | Adjacency list/matrix | Array | Node + character map |
| Space complexity | O(n) | O(V + E) | O(n) | O(ALPHABET × m × n) |
| Typical use case | File systems, parsing | Networks, dependencies | Scheduling | Autocomplete, routing |
The 'best' structure depends entirely on your domain and operations. A social network needs graphs. A file system needs trees. A task scheduler needs heaps. A search engine needs tries (and much more). Match the structure to the problem's inherent relationships and required operations.
Non-linear structures don't exist in isolation—they're built on and interface with linear structures in important ways.
Linear structures as building blocks:
| Non-Linear Structure | Uses These Linear Structures |
|---|---|
| Trees | Arrays (for heap), arrays of children (n-ary) |
| Graphs | Array of adjacency arrays, array-based matrix |
| Heaps | Stored directly as arrays |
| Tries | Arrays or maps at each node for children |
Converting between structures:
Many algorithms convert between linear and non-linear:
Tree traversals produce linear sequences. In-order traversal of a BST produces a sorted array.
Arrays can form trees. Heapify converts an array to a heap. Binary search conceptually treats a sorted array as a BST.
Graphs can be linearized. Topological sort produces a linear ordering of a DAG. BFS produces level-by-level sequences.
Tries can produce sorted outputs. Traversing a trie in alphabetical order produces a sorted list of words.
The spectrum of structure:
Simplest ─────────────────────────────────────────── Most Complex

Arrays → Linked Lists → Trees → DAGs → General Graphs

            ── more relationships, more complexity ──→
Each step adds expressive power at the cost of algorithmic complexity. The skill is choosing the simplest structure that captures your data's true relationships—adding structure for what you need, avoiding complexity you don't.
Always prefer the most constrained structure that faithfully models your domain. Using a graph when a tree suffices wastes algorithmic efficiency. Using a tree when a graph is needed loses essential information. The art is in the precise match.
We've toured the four foundational non-linear data structures. Let's consolidate the essential insights:

- Trees model hierarchy: a single root, no cycles, exactly one parent per node.
- Graphs model arbitrary many-to-many networks; trees, DAGs, and the other structures are special cases.
- Heaps use a complete binary tree to guarantee O(1) access to the highest (or lowest) priority element.
- Tries store shared prefixes once, making prefix queries O(m) in the query length, independent of dictionary size.
What's next:
With this conceptual foundation of non-linear data structures complete, you're prepared to explore each structure in depth. Upcoming chapters will cover:

- Trees: binary trees, binary search trees, and balancing techniques (AVL, Red-Black, B-trees)
- Graphs: representations, BFS and DFS, shortest paths, and minimum spanning trees
- Heaps: array-based implementation, heapify, and priority-queue applications
- Tries: implementation choices and string-processing applications such as autocomplete
Each chapter builds on the conceptual understanding you've developed here.
Congratulations! You've completed the Non-Linear Data Structures module. You now understand what makes a structure non-linear, the distinction between hierarchical and network relationships, cardinality patterns, and the four foundational non-linear structures. This conceptual mastery prepares you for deep dives into each structure's implementation and algorithms.