In the previous page, we established what distinguishes non-primitive data structures from primitives: composition, organization, defined operations, references, and abstraction. But why do these characteristics matter? What problems do they actually solve?
The answer lies in understanding that real-world data is never isolated. Information exists in context, grouped with related information, connected to other entities, organized in meaningful ways. A customer has orders. An order has items. Items have prices and quantities. These relationships are as important as the individual values themselves.
Primitive data types—integers, floats, characters, booleans—can store individual values. But they cannot express that this integer belongs with that string and that boolean. They cannot represent that this record connects to five other records. They cannot organize data into meaningful collections that reflect real-world structure.
Non-primitive data structures exist precisely to capture these logical groupings and relationships.
By the end of this page, you will understand how non-primitive data structures enable logical grouping of related elements, how they represent different types of relationships (one-to-one, one-to-many, many-to-many), and why this relational capability is essential for modeling real-world problems. You will see data structures not just as storage containers, but as relationship engines.
Consider the following scenario: you're building a system to manage a library's book collection. For each book, you need to track the title, author, ISBN, publication year, number of copies, and whether a digital edition exists.
Using only primitive data types, you might create six separate variables:
title = "The Pragmatic Programmer"
author = "David Thomas, Andrew Hunt"
isbn = "978-0135957059"
year = 2019
copies = 5
isDigital = true
The Problems Immediately Emerge:
Nothing in the code connects these variables: no mechanism expresses that title, author, and isbn describe the same book. Scale this to a full catalog and you need numbered variables, with nothing preventing title5 from being accidentally paired with author17.

Now imagine the full library system: books have authors, authors have multiple books, books are categorized into genres, books can be borrowed by members, members have borrowing histories, some books have waitlists of member requests...
Attempting to model these interconnected entities with isolated primitive variables is not just difficult—it's effectively impossible. The relationships between entities are the essence of the data model. Without the ability to express these relationships, you cannot meaningfully represent the problem domain.
A book title without its author, ISBN, and availability status is not a 'book'—it's just a string. Real entities are defined by the bundle of their attributes and relationships. Non-primitive data structures provide the bundling mechanism.
The most fundamental capability of non-primitive data structures is logical grouping—the ability to bundle related data elements into a single, coherent unit.
What Logical Grouping Provides:
Cohesion — Related values travel together as a unit. A 'book' containing title, author, and ISBN moves through your system as one entity.
Encapsulation — The internal composition of a group is hidden from code that doesn't need to know. A function that counts books doesn't need to know books contain titles and ISBNs.
Semantic Clarity — Code that operates on 'a book' or 'a list of books' is more readable and meaningful than code juggling title_array[i] and author_array[i].
Type Safety — The type system can verify that you're passing a book where a book is expected, catching errors at compile time.
Collection Operations — You can create collections of groups: a list of books, a set of users, a map from ISBN to book.
Forms of Logical Grouping:
Records/Structs/Classes
The most direct form of grouping bundles named fields into a single type:
Book {
title: String
author: String
isbn: String
year: Integer
copies: Integer
isDigital: Boolean
}
Now 'book' is a first-class entity. You can create a book, pass it to functions, store it in collections, and compare it to other books.
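A minimal sketch of this record in Python, using a dataclass (the field names mirror the pseudocode above):

```python
from dataclasses import dataclass

@dataclass
class Book:
    title: str
    author: str
    isbn: str
    year: int
    copies: int
    is_digital: bool

# The six values now travel together as one first-class entity.
book = Book("The Pragmatic Programmer", "David Thomas, Andrew Hunt",
            "978-0135957059", 2019, 5, True)
```

The type system now knows what a book is: a function declared to take a `Book` cannot accidentally receive a lone string.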
Tuples
A lighter-weight grouping for anonymous combinations:
(title, author, year) = ("1984", "George Orwell", 1949)
Useful when you need quick bundling without defining a named type.
Arrays/Lists of Homogeneous Elements
Grouping multiple items of the same type:
prices = [9.99, 14.99, 7.50, 24.99]
The array groups these prices into a single collection that can be manipulated as a whole.
Once data is grouped, you can abstract over the group. A function calculateTotal(order) doesn't need to receive 20 separate parameters for each order field—it receives one order object. This abstraction dramatically simplifies interfaces and reduces coupling.
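As a sketch of that idea, assuming a hypothetical order made of items (the names `Item`, `Order`, and `calculate_total` are illustrative, not from the source):

```python
from dataclasses import dataclass, field

@dataclass
class Item:
    name: str
    price: float
    quantity: int

@dataclass
class Order:
    items: list = field(default_factory=list)

def calculate_total(order):
    # One parameter carries the whole group; no long parameter lists.
    return sum(item.price * item.quantity for item in order.items)

order = Order(items=[Item("pen", 1.50, 2), Item("pad", 3.00, 1)])
total = calculate_total(order)  # 1.50 * 2 + 3.00 * 1 = 6.0
```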
Beyond simple grouping, non-primitive data structures can express that elements exist in a sequence—that there is a first element, a second element, and so on. This sequential relationship is fundamental to many real-world concepts.
Where Sequence Matters:

Order carries meaning in many domains: the steps of a procedure, the characters in a string, the events in a log, the songs in a playlist. Reorder the elements and the meaning changes.
Linear Data Structures Express Sequence
Structures like arrays, linked lists, stacks, and queues are called 'linear' precisely because they express sequential, one-after-another relationships:
[A] → [B] → [C] → [D] → [E]
Each element has at most one predecessor and one successor. This linear topology provides a well-defined first and last element and a single, unambiguous order of traversal.
The Sequence Relationship Is Not Inherent in Data
Here's a crucial insight: there's nothing about the values 5, 3, 8 that says 5 comes before 3. The sequence is imposed by the data structure. An array [5, 3, 8] places 5 first by virtue of its position at index 0. The same values in a different order [3, 8, 5] constitute a different sequence.
This means the data structure doesn't just store data—it adds information about relationships that the raw data doesn't possess.
In a sequential structure, an element's position is its relationship to other elements. Element at position 3 is 'after' elements at positions 0, 1, 2 and 'before' elements at positions 4, 5, etc. The structure encodes these relationships implicitly through position.
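This can be seen directly in Python: the same values in different orders form different sequences, even though they are the same bag of values (a small illustrative check):

```python
from collections import Counter

a = [5, 3, 8]
b = [3, 8, 5]

# As sequences, the structures differ: position carries information.
same_sequence = (a == b)                  # False
# As bare collections of values, they are identical.
same_values = (Counter(a) == Counter(b))  # True
```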
Many real-world relationships are not sequential but hierarchical—organized in parent-child relationships where one entity 'contains' or 'supervises' multiple sub-entities.
Where Hierarchy Matters:

File systems, organizational charts, document structure, and category taxonomies all organize entities into parent-child relationships, where one entity contains or supervises others.
Tree Structures Express Hierarchy
Tree data structures model hierarchical relationships through parent-child connections:
                [CEO]
               /     \
        [VP-Eng]   [VP-Sales]
         /    \         \
   [Dir-A]  [Dir-B]   [Dir-C]
      |        |         |
   [Team]   [Team]    [Team]
Key properties of hierarchical structures: a single root, exactly one parent for every non-root node, no cycles, and a well-defined depth for every node.
What Hierarchy Enables:
Path-based addressing: a path like /documents/reports/2024/Q1.pdf names a node by the chain of parents leading from the root down to it.

A hierarchy is a strong structural constraint. Every node (except the root) has exactly one parent. This constraint enables many optimizations and guarantees. When you see hierarchical data, you should think of tree structures. When you see tree structures, you should expect O(log n) operations (in balanced trees) or O(height) traversals.
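A tree like the org chart above can be sketched with nodes holding child references (class and function names here are illustrative):

```python
class OrgNode:
    def __init__(self, title, children=None):
        self.title = title
        self.children = children or []  # parent-child links

def count_reports(node):
    # Recursively count every node below this one.
    return sum(1 + count_reports(child) for child in node.children)

ceo = OrgNode("CEO", [
    OrgNode("VP-Eng", [OrgNode("Dir-A"), OrgNode("Dir-B")]),
    OrgNode("VP-Sales", [OrgNode("Dir-C")]),
])
reports = count_reports(ceo)  # 5 nodes sit below the CEO
```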
Some relationships are neither sequential nor strictly hierarchical. They form networks where any entity can connect to any number of other entities, with no restrictions on the connection pattern.
Where Network Relationships Matter:

Social networks, road maps, hyperlinked web pages, and dependency graphs all connect entities with no fixed pattern: anything may link to anything.
Graph Structures Express Networks
Graphs are the most general relationship structure, consisting of vertices (nodes) and edges (connections between pairs of nodes):
[Alice] ←——→ [Bob]
   ↑            ↑
   |            |
   ↓            ↓
[Carol] ←——→ [Dave]
Graph relationships can be directed or undirected, and weighted or unweighted. The diagram above shows an undirected friendship network: each connection runs both ways.
What Network Structures Enable:

Reachability questions (can A reach B?), shortest-path computation, and analysis of connectivity and clusters, none of which a purely linear or hierarchical structure can express directly.
The Generality of Graphs
Here's a key insight: arrays and trees are special cases of graphs. A list is a graph in which each node has at most one successor; a tree is a connected graph with no cycles. Each added constraint narrows the structure and simplifies its algorithms.
Graphs are the most flexible relationship structure, but this flexibility comes with cost: graph algorithms are often more complex and more expensive than array or tree algorithms. When data is sequential or hierarchical, using the more constrained structure enables better performance.
Graphs can represent any relationship, but this generality makes some operations expensive. Finding the shortest path in an unweighted graph takes O(V + E) at best, via breadth-first search. In contrast, finding any element in a balanced tree is O(log n). Use the most constrained structure that fits your data.
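The O(V + E) figure refers to breadth-first search on an unweighted graph; a minimal sketch using an adjacency list for the friendship network above:

```python
from collections import deque

graph = {
    "Alice": ["Bob", "Carol"],
    "Bob": ["Alice", "Dave"],
    "Carol": ["Alice", "Dave"],
    "Dave": ["Bob", "Carol"],
}

def shortest_hops(graph, start, goal):
    # BFS explores neighbors level by level: O(V + E) time.
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal unreachable from start

hops = shortest_hops(graph, "Alice", "Dave")  # 2 (via Bob or Carol)
```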
A critical type of relationship in computing is the associative relationship—where one value (the key) maps to another value (the value). This is the 'lookup' relationship.
Where Associative Relationships Matter:

Looking up a user's email address by name, a word's definition by spelling, a book record by ISBN, a cached result by request key. In each case, one value serves as the handle for finding another.
Maps/Dictionaries Express Association
Associative structures provide direct lookup: given a key, retrieve its associated value.
{
"Alice": "alice@email.com",
"Bob": "bob@email.com",
"Carol": "carol@email.com"
}
The Power of O(1) Lookup
Hash tables (the most common associative structure) provide average-case O(1) insertion, lookup, and deletion.
This is dramatically faster than searching through a list (O(n)) or even a sorted array (O(log n) for search, O(n) for insert).
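The contrast can be sketched in Python, where `dict` is a hash table:

```python
emails = {
    "Alice": "alice@email.com",
    "Bob": "bob@email.com",
    "Carol": "carol@email.com",
}

# Hash lookup: average-case O(1), independent of collection size.
carol = emails["Carol"]

# The same association stored as a flat list forces an O(n) scan.
pairs = list(emails.items())
carol_scan = next(email for name, email in pairs if name == "Carol")
```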
When Association Complements Other Structures
Associative structures often work together with other relationship types: a map from ID to node, for example, gives O(1) entry points into a tree or a graph.
Even when you don't explicitly use a 'map' or 'dictionary' type, associative relationships are omnipresent. An array is a map from integer indices to values. A struct is a map from field names to values. Understanding the associative paradigm helps you see these implicit mappings.
Data structure selection is heavily influenced by the cardinality of relationships—how many entities on each side connect to the other.
One-to-One (1:1)
Each entity on one side relates to exactly one entity on the other side.
Examples: a user and their profile, a country and its capital, a person and their passport.
Structure choice: Records/structs, key-value maps, or embedding directly.
One-to-Many (1:N)
One entity relates to multiple entities, but those entities relate back to only one.
Examples: a parent node and its children, a folder and its files, a customer and their orders.
Structure choice: Trees, hierarchical structures, foreign key references, nested collections.
Many-to-Many (N:M)
Entities on both sides can relate to multiple entities on the other side.
Examples: students and courses, books and authors, actors and films.
Structure choice: Graphs, junction/association tables, adjacency lists.
| Cardinality | Example | Linear Structure? | Tree Structure? | Graph Structure? |
|---|---|---|---|---|
| 1:1 | User ↔ Profile | Yes (pair) | No (overkill) | No (overkill) |
| 1:N | Parent → Children | No (need nesting) | Yes (natural fit) | Possible but unnecessary |
| N:M | Students ↔ Courses | No | No (cycles possible) | Yes (only option) |
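As an illustrative sketch (the container choices and names are assumptions, not the only valid mapping), these cardinalities translate naturally into different containers:

```python
# 1:1 — a plain map: one key, one value.
profile_of = {"alice": "profile-42"}

# 1:N — a map from one parent to a list of children.
children_of = {"folder-A": ["file-1", "file-2", "file-3"]}

# N:M — a set of (left, right) pairs, like a junction table.
enrollment = {("alice", "CS101"), ("alice", "MATH2"), ("bob", "CS101")}

# Querying the N:M relation from either side:
cs101_students = {s for s, c in enrollment if c == "CS101"}
alice_courses = {c for s, c in enrollment if s == "alice"}
```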
Why Cardinality Matters for Structure Selection
Choosing a structure with the wrong cardinality leads to awkward designs: forcing a many-to-many relationship into a tree requires duplicating nodes, while modeling a simple one-to-one pairing as a graph adds machinery you will never use.
Understanding the relationship cardinality of your data is a prerequisite for choosing appropriate structures.
Database design has formalized relationship cardinality thinking (Entity-Relationship models). This wisdom applies equally to in-memory data structure design. Before choosing a structure, draw the relationships and identify their cardinalities.
Data structures can express relationships implicitly (through position or structure) or explicitly (through references or data).
Implicit Relationships
The relationship is encoded in the structure itself, without explicit pointers:
Array: Position encodes sequence
array[0] = 'A' // First
array[1] = 'B' // Second (implicitly after array[0])
array[2] = 'C' // Third (implicitly after array[1])
Binary heap: Position encodes parent-child
heap[0] is root
heap[1] and heap[2] are children of heap[0]
heap[3] and heap[4] are children of heap[1]
// Parent of heap[i] is heap[(i-1)/2]
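The index arithmetic above can be written out directly (a small sketch):

```python
def parent(i):
    # Parent of heap[i] lives at heap[(i - 1) // 2].
    return (i - 1) // 2

def children(i):
    # Children of heap[i] live at heap[2i + 1] and heap[2i + 2].
    return 2 * i + 1, 2 * i + 2
```

No pointers are stored anywhere: the tree shape is recovered entirely from positions.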
Advantages: no memory spent on pointers, contiguous cache-friendly storage, and O(1) arithmetic to find related elements.

Disadvantages: the relationship pattern is fixed by the layout; restructuring (such as inserting in the middle) means physically moving elements.
Explicit Relationships
Relationships are expressed through references/pointers stored in the data:
Linked list: Pointers encode sequence
Node A:
data: 'A'
next: → Node B
Node B:
data: 'B'
next: → Node C
General tree: Pointers encode hierarchy
Node:
data: value
children: [→ Child1, → Child2, → Child3]
Graph: Adjacency lists or edge sets
Node A:
neighbors: [→ B, → C, → D]
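These pointer-based encodings can be sketched in Python; the linked list, for example (names are illustrative):

```python
class Node:
    def __init__(self, data, next=None):
        self.data = data
        self.next = next  # explicit reference to the successor

# Build A -> B -> C by storing the relationship inside each node.
head = Node("A", Node("B", Node("C")))

def to_list(node):
    # Follow the explicit next-references to recover the sequence.
    out = []
    while node is not None:
        out.append(node.data)
        node = node.next
    return out
```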
Advantages: relationships can be arbitrary and can change dynamically; adding or removing a link is a reference update, not a bulk move of elements.

Disadvantages: extra memory for the references themselves, and pointer-chasing traversals that are unfriendly to CPU caches.
Modern systems often favor implicit relationships where possible because of cache efficiency. 'Data-oriented design' specifically advocates for flat arrays over pointer-chasing structures. But complex, dynamic relationships often require explicit representation. The best engineers know when each is appropriate.
Here's a profound insight: algorithms operate on relationships, not just data.
Consider sorting. What does 'sort' mean? It means rearranging elements so that their sequential relationship matches their ordering relationship. The algorithm manipulates both the data values (to compare them) and the structural relationships (to reposition them).
Consider graph traversal. Breadth-first search doesn't just visit nodes—it follows edges, explores relationships. The algorithm is fundamentally about relationships.
Consider tree rotations in AVL trees. The data doesn't change at all—only the parent-child relationships are restructured to maintain balance.
The Algorithm-Structure Connection:

Every algorithm assumes certain relationships are cheap to follow: binary search assumes random access into a sorted sequence, tree traversal assumes parent-child links, breadth-first search assumes enumerable neighbors. This is why data structure choice matters so much for algorithm efficiency.
If your data is structured as a hash table (associative relationship), binary search cannot apply. If your data is an array (sequential), you cannot directly perform tree traversal. The structure determines which algorithms are possible and efficient.
Choosing a data structure is choosing which algorithms will be efficient.
A common mistake is choosing a structure without considering needed algorithms. If you'll need to repeatedly find the minimum element, don't use an unsorted array (O(n) each time). Use a heap or sorted structure. The relationship your structure provides must support the operations your algorithms require.
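The repeated-minimum point can be sketched with Python's heapq module:

```python
import heapq

prices = [9.99, 14.99, 7.50, 24.99]

# Unsorted list: every minimum query is an O(n) scan.
cheapest_scan = min(prices)

# Heap: O(n) to build once, then O(1) to peek the minimum
# (and O(log n) to pop it).
heap = list(prices)
heapq.heapify(heap)
cheapest_heap = heap[0]
```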
Let's consolidate what we've learned about how non-primitive data structures handle logical grouping and relationships:

Grouping: records, tuples, and collections bundle related values into coherent units.
Sequence: arrays, lists, stacks, and queues impose a first-to-last order.
Hierarchy: trees capture parent-child containment.
Networks: graphs capture arbitrary many-to-many connections.
Association: maps and dictionaries capture key-to-value lookup.
Cardinality: knowing whether a relationship is 1:1, 1:N, or N:M guides structure choice.
Encoding: relationships can be implicit (position) or explicit (references).
What's Next:
We've seen that non-primitive structures enable logical grouping and relationship expression. But why do we need these abstractions? The next page explores the necessity of abstraction itself—how abstracting away complexity is essential for building and reasoning about sophisticated systems.
You now understand how non-primitive data structures serve as relationship engines—bundling related data, expressing sequences, hierarchies, networks, and associations. This relational perspective is crucial for selecting structures that match your data's inherent relationships.