Loading learning content...
In the physical world, we organize complex information in two fundamentally different ways.
Consider a large corporation:
From one perspective, it's a hierarchy. The CEO sits at the top. Below are vice presidents, then directors, then managers, then individual contributors. Each person has exactly one boss. Information flows up and down through well-defined chains of command. This is hierarchical organization.
From another perspective, the same corporation is a network. Alice in marketing collaborates with Bob in engineering. Carol in sales has meetings with Dave in legal. Teams form and dissolve. Projects create temporary connections across the formal hierarchy. Information flows through informal channels, social relationships, and cross-functional partnerships. This is network-based organization.
Both perspectives are simultaneously true. Both are useful for different purposes. And both have direct analogues in how we structure data in computer science.
By the end of this page, you will deeply understand hierarchical and network-based relationships—not just their definitions, but their essential properties, their natural use cases, and how to recognize which paradigm fits a given problem. This understanding is fundamental to choosing between trees and graphs in any application.
Hierarchical relationships are characterized by a strict structure of containment or authority. One entity is "above" another in some meaningful sense—as a container, as an authority, as a category, or as an ancestor.
A hierarchical relationship establishes a strict parent-child connection where every child has exactly one parent, creating a tree-shaped structure that fans outward from a single root.
The defining properties of hierarchical relationships:
Real-world examples of hierarchical relationships:
| Domain | Hierarchy Example | Root | Parent-Child Relationship |
|---|---|---|---|
| Biology | Taxonomic classification | Life | Domain → Kingdom → Phylum → ... → Species |
| Geography | Administrative divisions | Country | Country → State → County → City → Neighborhood |
| Computing | File systems | Root directory | Folder → Subfolders/Files |
| Organizations | Command structure | CEO | Manager → Direct reports |
| Documents | HTML DOM | <html> element | Element → Child elements |
| Language | Sentence structure | Sentence | Phrase → Words |
Notice that each example shares the same fundamental shape: a single root, branching downward, with each node having exactly one parent.
Hierarchies are powerful precisely because of their constraints. The single-parent rule means there's always a unique path back to the root. Acyclicity means algorithms never get stuck in infinite loops. Well-defined levels mean you can reason about 'how deep' something is. These guarantees enable efficient algorithms that don't exist for general graphs.
Understanding the mathematical properties of hierarchies reveals why they're so computationally tractable.
Key mathematical properties:
Property 1: Edge Count
In any tree with n nodes, there are exactly n-1 edges. This is because each node (except the root) has exactly one incoming edge from its parent. This constraint means trees are minimally connected—removing any edge would disconnect the tree.
Property 2: Path Uniqueness
Between any two nodes in a tree, there is exactly one simple path. This uniqueness is the foundation for efficient algorithms. In general graphs, there might be exponentially many paths between two nodes.
Property 3: Height and Depth
Property 4: Subtree Independence
Every node in a tree is the root of its own subtree. This recursive structure is fundamental—it means operations on trees can often be expressed as "do something to the root, then recursively do it to each subtree."
| Property | Formula/Value | Significance |
|---|---|---|
| Edges in tree of n nodes | n - 1 | Exactly one edge per non-root node |
| Paths between any two nodes | Exactly 1 | No ambiguity in traversal |
| Maximum nodes at depth d (binary) | 2^d | Exponential growth per level |
| Height of balanced binary tree | O(log n) | Many tree operations are O(height) |
| Height of degenerate tree | O(n) | Worst case: tree becomes a list |
Why these properties matter for algorithms:
Unique paths mean deterministic searches. Finding a node doesn't require exploring alternatives.
Minimal edge count means linear space. Storing n nodes requires O(n) space for edges, not O(n²).
Subtree independence enables recursion. Almost every tree algorithm uses the pattern: process root, then recurse on children. This is only possible because subtrees are independent.
Height bounds complexity. Many operations (insert, find, delete) are O(height). In balanced trees, this is O(log n)—the exponential branching means each level eliminates a constant fraction of remaining nodes.
The O(log n) complexity of balanced tree operations comes from the branching factor. A binary tree with height h has at most 2^h nodes. Inversely, a balanced binary tree with n nodes has height at most log₂(n). This logarithmic height is what makes BSTs, heaps, and other tree structures so efficient.
Network-based relationships remove the constraints of hierarchy. They model situations where entities can connect to each other freely, without restrictions on the number or direction of connections.
A network-based relationship allows arbitrary connections between entities, where any element can be connected to any number of other elements, potentially including cycles and multiple paths.
The defining properties of network relationships:
Real-world examples of network relationships:
| Domain | Network Example | Nodes | Connections |
|---|---|---|---|
| Social | Social network | People | Friendships, follows, messages |
| Transportation | Road network | Intersections | Roads connecting intersections |
| Internet | World Wide Web | Web pages | Hyperlinks between pages |
| Computing | Dependency graph | Modules | Module A requires module B |
| Biology | Neural network | Neurons | Synaptic connections |
| Economics | Trade network | Countries | Trade agreements |
Notice that none of these examples fit neatly into a tree. People have multiple friends. Cities connect to many other cities. Web pages link to each other in complex webs.
The freedom of networks comes with computational costs. Without unique paths, finding the best path requires exploring alternatives. With cycles, algorithms must track visited nodes to avoid infinite loops. With arbitrary connectivity, worst-case edge count is O(n²). Networks are more expressive but harder to work with.
Network structures (graphs) have their own mathematical properties that contrast sharply with hierarchies.
Key mathematical properties:
Property 1: Variable Edge Count
A graph with n vertices can have anywhere from 0 to n(n-1)/2 edges (undirected) or n(n-1) edges (directed). This is a massive range:
Property 2: Multiple Paths (or None)
Between any two vertices, there may be:
Property 3: Cycles
Graphs can contain cycles—paths that return to the starting vertex. A simple cycle visits each vertex at most once. The presence of cycles fundamentally changes algorithm design.
Property 4: Connected Components
A graph may consist of multiple disjoint subgraphs. Two vertices are in the same component if there's a path between them. Trees are always fully connected (one component).
| Property | Value/Range | Contrast with Trees |
|---|---|---|
| Max edges (undirected) | n(n-1)/2 = O(n²) | Trees always have exactly n-1 |
| Paths between nodes | 0, 1, or many | Trees always have exactly 1 |
| Cycles | May exist | Never exist in trees |
| Connected components | 1 to n possible | Trees are always 1 component |
| Minimum spanning tree edges | n-1 (if connected) | Tree IS the minimal connected structure |
Graph metrics and concepts:
Why these properties matter for algorithms:
Variable edge count means variable complexity. An algorithm that's O(E) might be O(n) for sparse graphs but O(n²) for dense ones.
Multiple paths require search algorithms. Finding the shortest path needs BFS, Dijkstra, or similar algorithms—not just following pointers.
Cycles require visited tracking. Every graph traversal must mark visited nodes to avoid infinite loops.
Components require connectivity checks. You can't assume you can reach every node from every other node.
Let's systematically compare these two paradigms across every relevant dimension. This comparison is essential for choosing the right model for your data.
| Dimension | Hierarchical (Trees) | Network (Graphs) |
|---|---|---|
| Structural freedom | Low: strict parent-child rules | High: arbitrary connections allowed |
| Cycles | Impossible by definition | May be present |
| Path uniqueness | Guaranteed: exactly one path | Not guaranteed: may be many or none |
| Edge count | Always n-1 for n nodes | 0 to O(n²) edges possible |
| Root/starting point | Always exists (root) | May not have natural starting point |
| Traversal complexity | Simple: recursion works naturally | Complex: must handle cycles, components |
| Space efficiency | O(n) for edges | O(n) to O(n²) for edges |
| Common algorithms | Tree traversals, recursive descent | BFS, DFS, Dijkstra, topological sort |
| Typical problems | Containment, classification, parsing | Connectivity, shortest path, flow |
Many real systems combine both paradigms. A company has an org chart (tree) AND a collaboration network (graph). A website has a page hierarchy (tree) AND hyperlinks (graph). Don't fight this—use trees where structure is hierarchical, graphs where it's networked, and combine them when your domain demands both.
Within both hierarchical and network paradigms, relationships can be directed or undirected. This distinction fundamentally affects how you model and traverse your data.
Undirected relationships are symmetric. If A is connected to B, then B is connected to A. The connection has no inherent direction.
Examples:
Directed relationships are asymmetric. A connection from A to B doesn't imply a connection from B to A.
Examples:
| Aspect | Undirected | Directed |
|---|---|---|
| Connection symmetry | A↔B (bidirectional) | A→B (one direction) |
| Maximum edges | n(n-1)/2 | n(n-1) |
| Degree definition | Single degree per node | In-degree and out-degree separately |
| Path meaning | Can traverse either direction | Must follow edge direction |
| Cycle types | Simple cycles | Directed cycles (different concept) |
| Real-world examples | Friendships, roads (mostly) | Following, dependencies, one-ways |
The impact on algorithms:
Directedness affects almost every graph algorithm:
Traversal: In directed graphs, you can only follow edges in their specified direction. Reaching node B from node A is different from reaching A from B.
Connectivity: Undirected graphs have 'connectivity' (path exists between nodes). Directed graphs have 'weak connectivity' (path exists ignoring directions) and 'strong connectivity' (path exists following directions, in both directions).
Cycles: A directed cycle follows edge directions. A graph can have cycles when considered undirected but be acyclic when directions are respected (DAG—Directed Acyclic Graph).
Shortest paths: Dijkstra and BFS work on both, but semantics differ. The shortest path from A to B may differ from B to A in directed graphs.
Special case: Trees
Trees are typically modeled as directed structures (parent → children) for traversal purposes, though the 'direction' is often implicit. The parent-child relationship is inherently asymmetric, even if the physical links are bidirectional.
Directed Acyclic Graphs (DAGs) combine properties of trees and graphs. Like trees, they have no cycles (when following directions). Unlike trees, nodes can have multiple 'parents.' DAGs are crucial for modeling dependencies, task scheduling, version control (Git), and many other domains. They get their own detailed treatment in later chapters.
Beyond direction, relationships can carry weights—quantitative values that measure some aspect of the connection.
Unweighted relationships are binary: the connection exists or it doesn't. All connections are equivalent.
Weighted relationships carry a value: distance, cost, capacity, strength, probability, or any other metric.
Examples of weighted relationships:
| Domain | Nodes | Edge Weight Meaning |
|---|---|---|
| Road networks | Intersections | Distance in miles/km |
| Airlines | Airports | Flight time or ticket cost |
| Social networks | People | Interaction frequency or closeness |
| Computer networks | Routers | Bandwidth or latency |
| Flow networks | Stations | Maximum capacity |
| Probability graphs | States | Transition probability |
Why weights matter algorithmically:
Shortest path changes meaning. In unweighted graphs, shortest path = fewest edges (BFS). In weighted graphs, shortest path = minimum total weight (Dijkstra, Bellman-Ford).
Minimum spanning tree. Finding the cheapest way to connect all nodes becomes a minimization problem over edge weights.
Maximum flow. How much can flow through a network? Depends on edge capacities (weights).
Threshold-based filtering. You might want to consider only edges above some weight (strong connections) or below some weight (short distances).
Weights in trees:
Trees can also be weighted, though it's less common. Weighted trees arise in:
Some algorithms (like Dijkstra's) don't work correctly with negative edge weights. Negative weights create the possibility of 'negative cycles' where you can reduce total cost indefinitely. If your domain has negative weights, you need specialized algorithms like Bellman-Ford that handle them correctly.
When facing a new problem, use this systematic framework to determine whether your data relationships are hierarchical or network-based:
Step 1: Identify the entities and relationships
List what things exist (nodes) and how they connect (edges). Be explicit about both.
Step 2: Ask the single-parent question
Can each entity have multiple 'parents' or sources of the relationship? If no—hierarchy. If yes—network.
Step 3: Ask the cycle question
Can following relationships return you to the starting point? If no—potentially a tree or DAG. If yes—general graph.
Step 4: Determine direction
Is the relationship symmetric (A↔B) or directional (A→B)? This determines directed vs undirected.
Step 5: Check for weights
Are all connections equal, or do they carry quantitative values? This determines weighted vs unweighted.
1234567891011121314151617181920212223
RELATIONSHIP TYPE DECISION TREE: 1. Does each element have AT MOST ONE parent/source? ├── YES → Consider TREE structures │ └── Are there exactly N-1 connections for N elements? │ ├── YES → It's a TREE │ └── NO → It's a FOREST (multiple trees) │ └── NO → Consider GRAPH structures └── Can following edges create cycles? ├── YES → It's a GENERAL GRAPH │ └── May need cycle detection in algorithms │ └── NO → It's a DAG (Directed Acyclic Graph) └── Topological ordering exists 2. Are relationships symmetric? ├── YES → UNDIRECTED └── NO → DIRECTED 3. Do relationships have quantitative values? ├── YES → WEIGHTED └── NO → UNWEIGHTEDIf you're unsure, try drawing your data as both a tree and a graph. If the tree representation requires awkward workarounds (duplicate nodes, artificial groupings), your data is probably networked. If the graph representation is mostly tree-like with few cross-edges, a tree might suffice.
We've developed a comprehensive understanding of the two fundamental paradigms for non-linear data organization. Let's consolidate the essential insights:
What's next:
Now that we understand the two paradigms, we need to examine them more precisely through the lens of relationship cardinality: one-to-many versus many-to-many. The next page explores how these cardinality patterns manifest in data structures and how to recognize them in problem requirements.
You now have a deep understanding of hierarchical and network-based relationships—the two fundamental ways non-linear data structures organize information. This understanding prepares you to recognize relationship patterns and choose appropriate structures with confidence.