We've explored the adjacency list representation in depth: its structure, O(V + E) space complexity, and O(degree) edge lookup time. Now comes the practical question every engineer faces: When should I use an adjacency list instead of an adjacency matrix?
This isn't a theoretical exercise—the choice profoundly affects memory usage, algorithm speed, and even whether your solution is feasible at all. A wrong choice might mean the difference between a solution that runs in milliseconds and one that crashes with out-of-memory errors, or between an elegant implementation and a convoluted workaround.
Let's synthesize everything we've learned into actionable decision criteria.
By the end of this page, you will be able to confidently choose between adjacency lists and matrices for any graph problem. You'll understand the key decision factors, recognize common scenarios favoring each representation, and know how to adapt when requirements change.
When choosing a graph representation, evaluate these factors in order of importance:
Factor 1: Graph Density
This is often the deciding factor. Density = |E| / |E_max| where |E_max| = V(V-1)/2 for undirected graphs.
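For concreteness, the density formula is a one-liner (a small helper for illustration, not part of any library):

```python
def density(V: int, E: int) -> float:
    """Fraction of possible undirected edges present: |E| / (V(V-1)/2)."""
    max_edges = V * (V - 1) // 2
    return E / max_edges if max_edges > 0 else 0.0

print(density(1000, 5000))  # 5000 / 499500 ≈ 0.01 → clearly sparse
```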
Factor 2: Dominant Operations
What will your algorithm do most?
Factor 3: Memory Constraints
What can your system handle?
```python
def recommend_representation(
    num_vertices: int,
    num_edges: int,
    available_memory_mb: int,
    dominant_op: str  # 'traversal', 'edge_query', 'modification'
) -> str:
    """
    Recommend a graph representation based on problem characteristics.

    Returns: 'adjacency_list', 'hash_set_list', or 'adjacency_matrix'
    """
    max_edges = num_vertices * (num_vertices - 1) // 2
    density = num_edges / max_edges if max_edges > 0 else 0

    # Calculate memory requirements (rough estimates)
    list_memory_mb = (num_vertices * 24 + num_edges * 2 * 8) / 1e6
    matrix_memory_mb = (num_vertices ** 2) / 1e6  # 1 byte per cell

    # Memory check first - if the matrix doesn't fit, we must use a list
    if matrix_memory_mb > available_memory_mb:
        if dominant_op == 'edge_query':
            return 'hash_set_list'  # O(1) lookup with list space
        return 'adjacency_list'

    # Both fit in memory - choose by density and operations
    if density > 0.5:
        # Dense graph - matrix likely better
        if dominant_op == 'traversal':
            # Even for dense graphs, list traversal is O(E) vs O(V²)
            return 'adjacency_list' if density < 0.7 else 'adjacency_matrix'
        return 'adjacency_matrix'

    # Sparse graph
    if dominant_op == 'edge_query':
        return 'hash_set_list'
    elif dominant_op == 'modification':
        return 'adjacency_list'  # Easy to add/remove edges
    else:
        return 'adjacency_list'  # Default for sparse + traversal

# Examples
print(recommend_representation(1000, 5000, 500, 'traversal'))
# → 'adjacency_list' (sparse, traversal-heavy)

print(recommend_representation(1000, 400000, 500, 'edge_query'))
# → 'adjacency_matrix' (dense, query-heavy)

print(recommend_representation(100000, 500000, 500, 'traversal'))
# → 'adjacency_list' (matrix would need 10GB!)

print(recommend_representation(1000, 50000, 500, 'edge_query'))
# → 'hash_set_list' (moderate density, query-heavy)
```

Let's examine specific scenarios where adjacency lists are the clear winner:
Scenario 1: Large Sparse Graphs
The canonical use case. When V is large and E is proportional to V (not V²), adjacency lists are the only viable option.
| Domain | Vertices (V) | Edges (E) | Avg Degree | Matrix Memory | List Memory |
|---|---|---|---|---|---|
| Facebook social | 3 billion | ~200 billion | ~130 | ~9 exabytes | ~1.6 TB |
| US road network | 24 million | ~58 million | ~4.8 | ~576 TB | ~700 MB |
| Wikipedia links | ~60 million | ~500 million | ~16 | ~3.6 PB | ~6 GB |
| Web graph (sample) | 100 million | ~1 billion | ~20 | ~10 PB | ~12 GB |
| Protein interactions | ~20,000 | ~200,000 | ~20 | ~400 MB | ~3 MB |
Scenario 2: Traversal-Heavy Algorithms
Algorithms that iterate through neighbors benefit from O(degree) iteration rather than O(V) scanning. BFS, DFS, Dijkstra, Bellman-Ford, and topological sort all fall into this category.
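A minimal BFS sketch (assuming a plain dict-of-lists adjacency structure) shows why: the inner loop visits only actual neighbors, so the whole traversal costs O(V + E) rather than the O(V²) a row-by-row matrix scan would:

```python
from collections import deque

def bfs_distances(adj: dict, start):
    """BFS over an adjacency list; each edge is examined once,
    so the full traversal is O(V + E)."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:  # O(degree(u)), not O(V)
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

adj = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
print(bfs_distances(adj, 0))  # {0: 0, 1: 1, 2: 1, 3: 2}
```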
Scenario 3: Dynamic Graphs
When edges are frequently added or removed, adjacency lists handle modifications gracefully:
Matrices also support O(1) modification, but if the graph is already sparse, you're paying the O(V²) space penalty for nothing.
```python
class DynamicGraph:
    """
    Graph supporting efficient edge modifications.
    Uses hash sets for O(1) edge operations
    while maintaining O(V + E) space.
    """

    def __init__(self):
        self.adj = {}  # vertex → set of neighbors

    def add_vertex(self, v):
        if v not in self.adj:
            self.adj[v] = set()

    def add_edge(self, u, v):
        """O(1) average time."""
        self.add_vertex(u)
        self.add_vertex(v)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def remove_edge(self, u, v):
        """O(1) average time (vs O(degree) with lists!)."""
        if u in self.adj:
            self.adj[u].discard(v)
        if v in self.adj:
            self.adj[v].discard(u)

    def remove_vertex(self, v):
        """O(degree(v)) time to clean up edges."""
        if v not in self.adj:
            return
        # Remove v from all neighbors' sets
        for neighbor in self.adj[v]:
            self.adj[neighbor].discard(v)
        # Remove v's entry
        del self.adj[v]

    def contract_edge(self, u, v):
        """
        Merge vertex v into u, removing edge (u,v).
        Common in graph algorithms like Karger's min-cut.
        """
        if u not in self.adj or v not in self.adj:
            return
        # Move all of v's edges to u
        for neighbor in self.adj[v]:
            if neighbor != u:
                self.adj[u].add(neighbor)
                self.adj[neighbor].discard(v)
                self.adj[neighbor].add(u)
        # Remove v
        del self.adj[v]
        self.adj[u].discard(v)

# Dynamic graphs appear in:
# - Social network updates (friend/unfriend)
# - Network routing (link up/down)
# - Incremental algorithms (streaming graphs)
# - Game AI (navigable regions changing)
```

To fully appreciate when lists are appropriate, we must understand when they're NOT. Here are scenarios where adjacency matrices excel:
Scenario 1: Dense Graphs
When most possible edges exist, the matrix wastes little space and provides O(1) lookups:
```python
def create_complete_bipartite(m: int, n: int):
    """
    Complete bipartite graph K_{m,n}:
    Every vertex in set A (size m) connects to every vertex in set B (size n).

    Edges: m × n
    Density: 2mn / ((m + n)(m + n - 1)) → approaches 1/2 as m = n grows

    A matrix makes sense here:
    - Space: (m + n)² vs (m + n) + 2(m × n) for a list
    - Edge queries are frequent in bipartite matching algorithms
    """
    V = m + n
    matrix = [[0] * V for _ in range(V)]
    for i in range(m):
        for j in range(m, V):
            matrix[i][j] = 1
            matrix[j][i] = 1
    return matrix

# For K_{100,100}:
# Matrix: 40,000 cells = 40 KB
# List: 200 vertices + 10,000 edges (20,000 entries) = similar space
# But the matrix gives O(1) edge lookup!
```

Scenario 2: Small Graphs
When V is small (say, < 1000), the O(V²) space is negligible and the constant-time operations are convenient.
For these sizes, programmer convenience often outweighs asymptotic concerns.
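As an illustrative sketch, a small graph's matrix takes a few lines to build and answers edge queries in constant time:

```python
# For small V, a matrix is trivial to build and gives O(1) edge checks.
V = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # a 5-cycle
matrix = [[0] * V for _ in range(V)]
for u, v in edges:
    matrix[u][v] = matrix[v][u] = 1

print(matrix[1][2])  # 1 — O(1) lookup
print(matrix[0][3])  # 0 — no edge
```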
Scenario 3: Matrix-Based Algorithms
Some algorithms are naturally expressed in matrix form:
Floyd-Warshall finds all-pairs shortest paths in O(V³) time. For this algorithm, the matrix representation is natural and may even be faster than list-based alternatives due to cache-friendly sequential access patterns. If you're running Floyd-Warshall, use a matrix.
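A sketch of Floyd-Warshall over a distance matrix (using `float('inf')` for missing edges) illustrates the natural fit — the triple loop sweeps matrix rows sequentially:

```python
INF = float('inf')

def floyd_warshall(dist):
    """All-pairs shortest paths in O(V³). dist is a V × V matrix where
    dist[i][j] is the edge weight (INF if absent, 0 on the diagonal).
    The matrix is updated in place and returned."""
    V = len(dist)
    for k in range(V):
        for i in range(V):
            for j in range(V):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

d = [[0, 3, INF],
     [3, 0, 1],
     [INF, 1, 0]]
print(floyd_warshall(d))  # d[0][2] becomes 4 via vertex 1
```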
Sometimes neither pure lists nor pure matrices are optimal. Consider these hybrid approaches:
1. Hash Set Adjacency List (Already discussed)
Combines O(V + E) space with O(1) average edge lookup. The go-to when you need both.
2. Compressed Sparse Row (CSR)
Stores all neighbors in a single contiguous array with an index array marking boundaries. Better cache performance for static graphs:
Index array: [0, 2, 5, 7, 9, 10] ← where each vertex's neighbors start
Neighbors array: [1,4, 0,2,3, 1,3, 1,2,4, 3] ← all neighbors contiguous
3. Adjacency List + Matrix for Specific Subgraph
For graphs with a small dense core and large sparse periphery, use a matrix for the core and lists for the rest.
```python
import bisect

class HybridPowerLawGraph:
    """
    Specialized for power-law graphs (social networks, web graphs).

    Observation: Few 'hub' vertices have very high degree.
    Strategy: Use different representations for hubs vs regular vertices.

    Hubs: Store neighbors as a sorted array for O(log d) binary search
    Regular: Store neighbors as an unsorted list (small d ≈ O(1))
    """

    def __init__(self, num_vertices: int, hub_threshold: int = 1000):
        self.V = num_vertices
        self.hub_threshold = hub_threshold
        self.regular = [[] for _ in range(num_vertices)]  # Unsorted
        self.hubs = {}  # vertex → sorted list of neighbors
        self.is_hub = [False] * num_vertices

    def _promote_to_hub(self, v: int):
        """Convert a regular vertex to the hub representation."""
        if not self.is_hub[v]:
            self.hubs[v] = sorted(self.regular[v])
            self.regular[v] = None  # Free memory
            self.is_hub[v] = True

    def add_edge(self, u: int, v: int):
        """Add an edge, promoting to hub if the threshold is exceeded."""
        for x, y in [(u, v), (v, u)]:
            if self.is_hub[x]:
                # Insert into the sorted array
                bisect.insort(self.hubs[x], y)
            else:
                self.regular[x].append(y)
                if len(self.regular[x]) > self.hub_threshold:
                    self._promote_to_hub(x)

    def has_edge(self, u: int, v: int) -> bool:
        """O(log d) for hubs, O(d) for regular vertices."""
        if self.is_hub[u]:
            neighbors = self.hubs[u]
            pos = bisect.bisect_left(neighbors, v)
            return pos < len(neighbors) and neighbors[pos] == v
        else:
            return v in self.regular[u]

    def neighbors(self, v: int):
        """Get neighbors of v."""
        if self.is_hub[v]:
            return self.hubs[v]
        return self.regular[v]

# This hybrid approach gives:
# - O(log d) lookup for high-degree hubs (instead of O(d))
# - O(d) lookup for regular vertices (where d is small)
# - Same asymptotic space as a regular adjacency list
```

Hybrid representations add complexity.
Use them only when: (1) Profiling shows clear bottlenecks in standard representations, (2) The graph structure is known to be pathological (e.g., extreme power-law), or (3) You're building production infrastructure where micro-optimizations matter.
Here's a flowchart-style decision process you can follow for any graph problem:
Step 1: Check Memory Feasibility
Calculate V². If V² × cell_size exceeds available memory, use an adjacency list. End of discussion.
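A back-of-the-envelope check (assuming 1 byte per cell) makes this concrete:

```python
def matrix_fits(num_vertices: int, available_bytes: int, cell_size: int = 1) -> bool:
    """Check whether a V × V adjacency matrix fits in memory."""
    return num_vertices * num_vertices * cell_size <= available_bytes

# A 100,000-vertex graph needs 10 GB for a 1-byte-per-cell matrix:
print(matrix_fits(100_000, 8 * 2**30))  # 8 GB available → False
print(matrix_fits(1_000, 8 * 2**30))    # 1 MB matrix → True
```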
Step 2: Assess Graph Density
Step 3: Identify Dominant Operations
| Graph Size | Density | Primary Operation | Recommendation |
|---|---|---|---|
| Small (V < 500) | Any | Any | Either works; prefer lists |
| Medium (V < 5000) | < 10% | Traversal | Adjacency List |
| Medium (V < 5000) | < 10% | Edge queries | Hash Set List |
| Medium (V < 5000) | 50% | Edge queries | Adjacency Matrix |
| Large (V > 10000) | < 1% | Any | Adjacency List (only option) |
| Large (V > 10000) | High | N/A | Matrix won't fit; use list |
| Any | Any | Floyd-Warshall | Adjacency Matrix |
| Any | Any | BFS/DFS/Dijkstra | Adjacency List |
```python
def choose_graph_representation(
    V: int,
    E: int,
    available_memory_bytes: int,
    operations: list[str]  # e.g., ['bfs', 'edge_query', 'add_edge']
) -> dict:
    """
    Comprehensive graph representation recommendation.

    Returns a dict with:
    - 'recommendation': str
    - 'reasoning': list of factors that influenced the decision
    - 'alternatives': other viable options
    """
    result = {
        'recommendation': None,
        'reasoning': [],
        'alternatives': []
    }

    # Calculate metrics
    max_edges = V * (V - 1) // 2
    density = E / max_edges if max_edges > 0 else 0
    matrix_size = V * V  # cells (at least 1 byte each)
    list_size = V * 8 + E * 2 * 8  # rough estimate in bytes

    # Step 1: Memory check
    if matrix_size > available_memory_bytes:
        result['recommendation'] = 'adjacency_list'
        result['reasoning'].append(
            f'Matrix would require {matrix_size/1e9:.1f}GB, exceeds available memory'
        )
        # Refine the list type based on operations
        if 'edge_query' in operations:
            result['recommendation'] = 'hash_set_list'
            result['reasoning'].append('Edge queries benefit from O(1) hash set lookup')
        return result

    result['reasoning'].append('Both representations fit in memory')

    # Step 2: Density analysis
    if density < 0.10:
        result['recommendation'] = 'adjacency_list'
        result['reasoning'].append(f'Graph is sparse ({density:.1%} density)')
    elif density > 0.50:
        result['recommendation'] = 'adjacency_matrix'
        result['reasoning'].append(f'Graph is dense ({density:.1%} density)')
    else:
        result['reasoning'].append(
            f'Medium density ({density:.1%}), considering operations'
        )

    # Step 3: Operation-based refinement
    op_scores = {'list': 0, 'matrix': 0, 'hash_set': 0}
    for op in operations:
        if op in ['bfs', 'dfs', 'dijkstra', 'bellman_ford', 'topological_sort']:
            op_scores['list'] += 2
        elif op in ['edge_query', 'has_edge']:
            op_scores['matrix'] += 1
            op_scores['hash_set'] += 2
        elif op in ['add_edge', 'remove_edge', 'dynamic']:
            op_scores['list'] += 1
            op_scores['hash_set'] += 1
        elif op in ['floyd_warshall', 'matrix_multiply', 'spectral']:
            op_scores['matrix'] += 3

    # Final decision
    best = max(op_scores, key=op_scores.get)
    if best == 'hash_set':
        result['recommendation'] = 'hash_set_list'
    elif best == 'matrix':
        result['recommendation'] = 'adjacency_matrix'
    else:
        result['recommendation'] = 'adjacency_list'
    result['reasoning'].append(f'Operations favor {best} (score: {op_scores[best]})')

    # Document alternatives
    for rep, score in sorted(op_scores.items(), key=lambda x: -x[1]):
        if rep != best:
            result['alternatives'].append(f'{rep} (score: {score})')

    return result

# Example usage
rec = choose_graph_representation(
    V=10000,
    E=50000,
    available_memory_bytes=100_000_000,
    operations=['bfs', 'dfs', 'edge_query']
)
print(f"Recommendation: {rec['recommendation']}")
for reason in rec['reasoning']:
    print(f"  - {reason}")
```

In competitive programming, constraints often dictate the representation choice. Let's see how to read contest constraints:
Typical Constraint Patterns:
| Constraint | Interpretation | Representation |
|---|---|---|
| V ≤ 1000, E ≤ 10⁵ | Sparse | Adjacency List |
| V ≤ 500, E ≤ V² | Possibly dense | Either works |
| V ≤ 2000, all-pairs query | Floyd-Warshall | Matrix |
| V ≤ 10⁵, BFS/DFS | Standard traversal | Adjacency List |
| V ≤ 10⁶, Dijkstra | Large sparse | Adjacency List |
```cpp
#include <bits/stdc++.h>
using namespace std;

// Standard adjacency list (unweighted)
vector<int> adj[100005];  // Static array of vectors

// Weighted adjacency list
vector<pair<int, int>> wAdj[100005];  // (neighbor, weight)

// Reading a graph from input
void readGraph(int n, int m, bool weighted = false, bool directed = false) {
    for (int i = 0; i < m; i++) {
        int u, v;
        cin >> u >> v;
        if (weighted) {
            int w;
            cin >> w;
            wAdj[u].push_back({v, w});
            if (!directed) wAdj[v].push_back({u, w});
        } else {
            adj[u].push_back(v);
            if (!directed) adj[v].push_back(u);
        }
    }
}

// BFS with an adjacency list
vector<int> bfs(int start, int n) {
    vector<int> dist(n + 1, -1);
    queue<int> q;
    q.push(start);
    dist[start] = 0;
    while (!q.empty()) {
        int u = q.front(); q.pop();
        for (int v : adj[u]) {  // O(degree) iteration
            if (dist[v] == -1) {
                dist[v] = dist[u] + 1;
                q.push(v);
            }
        }
    }
    return dist;
}

// Dijkstra with an adjacency list
vector<long long> dijkstra(int start, int n) {
    vector<long long> dist(n + 1, LLONG_MAX);
    priority_queue<pair<long long, int>,
                   vector<pair<long long, int>>,
                   greater<pair<long long, int>>> pq;
    dist[start] = 0;
    pq.push({0, start});
    while (!pq.empty()) {
        auto [d, u] = pq.top(); pq.pop();
        if (d > dist[u]) continue;
        for (auto [v, w] : wAdj[u]) {  // O(degree) iteration
            if (dist[u] + w < dist[v]) {
                dist[v] = dist[u] + w;
                pq.push({dist[v], v});
            }
        }
    }
    return dist;
}
```

In contests, default to adjacency lists unless: (1) The problem explicitly requires O(1) edge lookup (rare), (2) V ≤ 500 and you're doing all-pairs operations, or (3) The problem involves matrix operations. Lists work for 95%+ of graph problems.
We've completed our deep dive into the adjacency list representation. Let's consolidate everything into a final set of principles:
Module Complete:
You've now mastered the adjacency list representation.
The next module compares both representations head-to-head, providing a comprehensive analysis of their trade-offs for different scenarios.
You now have complete mastery of the adjacency list representation. You understand its structure, space and time characteristics, and can confidently choose between lists and matrices for any graph problem. This knowledge is foundational for all graph algorithms you'll encounter.