When you decide to represent a graph in your program, you're not just choosing a data structure—you're committing to a specific memory allocation pattern that will determine how your program scales. The difference between adjacency matrices and adjacency lists isn't merely theoretical; it can mean the difference between a program that runs smoothly on a laptop and one that crashes with an out-of-memory error.
The core question: How much memory does each representation actually consume, and when does one become dramatically more efficient than the other?
This page provides a rigorous, mathematical analysis of space consumption for both representations, arming you with the precise knowledge needed to make informed engineering decisions.
By the end of this page, you will understand exactly how to calculate memory usage for both adjacency matrices and adjacency lists. You'll master the mathematical formulas, understand the constants behind asymptotic notation, and develop intuition for when each representation becomes prohibitively expensive or surprisingly efficient.
An adjacency matrix represents a graph G = (V, E) as a two-dimensional array M where M[i][j] indicates the presence (and possibly weight) of an edge from vertex i to vertex j. Let's dissect its space requirements with surgical precision.
The fundamental space requirement:
For a graph with |V| vertices, the adjacency matrix requires exactly |V|² cells. This is an inescapable mathematical fact—every pair of vertices needs a cell to record whether an edge exists between them.
```typescript
// For an unweighted graph, each cell stores a boolean (or 0/1)
// For a weighted graph, each cell stores a number (weight) or null/infinity

interface AdjacencyMatrix {
  // |V| × |V| matrix
  matrix: number[][]; // or boolean[][] for unweighted
  vertexCount: number;
}

// Example: Graph with 5 vertices
const graph: AdjacencyMatrix = {
  vertexCount: 5,
  matrix: [
    // 0  1  2  3  4  (destination vertices)
    [0, 1, 1, 0, 0], // edges from vertex 0
    [1, 0, 0, 1, 1], // edges from vertex 1
    [1, 0, 0, 1, 0], // edges from vertex 2
    [0, 1, 1, 0, 1], // edges from vertex 3
    [0, 1, 0, 1, 0], // edges from vertex 4
  ]
};
```

Precise memory calculation:
Let's compute the exact memory consumption, not just the asymptotic complexity:
For an unweighted graph (boolean entries): each cell occupies 1 byte, so the matrix consumes |V|² bytes.

For a weighted graph (numeric entries): each cell occupies 4 bytes (int32) or 8 bytes (float64), so the matrix consumes 4|V|² or 8|V|² bytes.
| Vertices \|V\| | Cells \|V\|² | Unweighted (bytes) | Weighted int32 (bytes) | Weighted float64 (bytes) |
|---|---|---|---|---|
| 100 | 10,000 | 10 KB | 40 KB | 80 KB |
| 1,000 | 1,000,000 | 1 MB | 4 MB | 8 MB |
| 10,000 | 100,000,000 | 100 MB | 400 MB | 800 MB |
| 100,000 | 10,000,000,000 | 10 GB | 40 GB | 80 GB |
| 1,000,000 | 1,000,000,000,000 | 1 TB | 4 TB | 8 TB |
Notice how rapidly memory consumption explodes. A graph with just 100,000 vertices requires 10 GB for an unweighted matrix—and this memory is allocated regardless of how many edges exist. A sparse graph with 100,000 vertices and only 200,000 edges would still consume 10 GB, even though the actual edge information could fit in kilobytes.
The independence from edge count:
This is the defining characteristic of adjacency matrix space complexity:
Space = O(|V|²)
The number of edges |E| does not appear in this formula. Whether your graph has 0 edges or |V|² edges (complete graph), the matrix consumes the same amount of memory. This independence is both the matrix's greatest strength (predictable allocation) and its most significant weakness (wasteful for sparse graphs).
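The table above falls out of a one-line estimate. A minimal sketch (the helper name `matrixBytes` is illustrative, using the per-cell sizes discussed above):

```typescript
// Estimated adjacency-matrix memory: |V|² cells × bytes per cell.
// bytesPerCell: 1 for a boolean/byte entry, 4 for int32 weights, 8 for float64.
function matrixBytes(vertexCount: number, bytesPerCell: number): number {
  return vertexCount * vertexCount * bytesPerCell;
}

// The edge count never appears as a parameter: the cost is the same
// whether the graph is empty or complete.
console.log(matrixBytes(100_000, 1)); // 10000000000 (10 GB, unweighted)
console.log(matrixBytes(10_000, 8)); //  800000000 (800 MB, float64 weights)
```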
An adjacency list represents a graph G = (V, E) as an array of lists, where the i-th list contains all vertices adjacent to vertex i. This representation is edge-sensitive—its memory consumption directly reflects the graph's structure.
The fundamental space requirement:
For a graph with |V| vertices and |E| edges:
```typescript
// For an unweighted graph
interface AdjacencyListUnweighted {
  // Array of |V| lists, each containing neighbor indices
  adjacency: number[][]; // adjacency[v] = list of neighbors of v
  vertexCount: number;
}

// For a weighted graph
interface WeightedEdge {
  to: number;
  weight: number;
}

interface AdjacencyListWeighted {
  adjacency: WeightedEdge[][];
  vertexCount: number;
}

// Example: Same 5-vertex graph as before
const graphList: AdjacencyListUnweighted = {
  vertexCount: 5,
  adjacency: [
    [1, 2],    // vertex 0 connects to 1, 2
    [0, 3, 4], // vertex 1 connects to 0, 3, 4
    [0, 3],    // vertex 2 connects to 0, 3
    [1, 2, 4], // vertex 3 connects to 1, 2, 4
    [1, 3],    // vertex 4 connects to 1, 3
  ]
};

// Total storage: 5 arrays + 12 integers (each edge appears twice)
```

Precise memory calculation:
The memory analysis for adjacency lists is more nuanced because it involves both fixed and variable components:
For an undirected unweighted graph:
Fixed overhead (array of pointers): roughly 16 bytes per vertex for each list's header (pointer plus length/capacity), totaling 16|V| bytes.

Variable storage (edge data): each undirected edge appears in two lists (once per endpoint); at 4 bytes per neighbor index, that's 8 bytes per edge, totaling 8|E| bytes.

Total: 16|V| + 8|E| bytes (approximate)
For a directed unweighted graph:
Fixed overhead: Same as undirected
Variable storage: each directed edge appears in exactly one list; at 4 bytes per neighbor index, that's 4|E| bytes.

Total: 16|V| + 4|E| bytes (approximate)
For weighted graphs:
Each edge entry now includes both destination and weight. A directed graph with packed int32 pairs costs 8 bytes per entry; an undirected graph storing a 4-byte index plus an 8-byte float64 weight twice (once per endpoint) costs 24 bytes per edge:

Total: 16|V| + 8|E| to 16|V| + 24|E| bytes
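These formulas wrap neatly into a small estimator. A sketch (the helper name `listBytes` is illustrative; the 16/4/8-byte constants are the assumptions stated above):

```typescript
// Estimated adjacency-list memory: 16 bytes of per-vertex list overhead,
// 4 bytes per stored neighbor index, plus 8 extra bytes per entry for a
// float64 weight. Undirected graphs store each edge twice (once per endpoint).
function listBytes(
  vertexCount: number,
  edgeCount: number,
  opts: { directed?: boolean; weighted?: boolean } = {}
): number {
  const entries = opts.directed ? edgeCount : 2 * edgeCount;
  const bytesPerEntry = opts.weighted ? 4 + 8 : 4; // index (+ weight)
  return 16 * vertexCount + bytesPerEntry * entries;
}

// Undirected, unweighted: 16|V| + 8|E|
console.log(listBytes(100_000, 200_000)); // 3200000 (3.2 MB)
// Undirected, weighted: 16|V| + 24|E|
console.log(listBytes(1_000, 10_000, { weighted: true })); // 256000 (256 KB)
```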
| Vertices | Edges | Edge Density | Unweighted (bytes) | Weighted (bytes) |
|---|---|---|---|---|
| 1,000 | 2,000 | 0.2% (very sparse) | 32 KB | 64 KB |
| 1,000 | 10,000 | 1% (sparse) | 96 KB | 256 KB |
| 1,000 | 100,000 | 10% | 816 KB | 2.4 MB |
| 1,000 | 499,500 | 50% (dense) | 4 MB | 12 MB |
| 10,000 | 20,000 | 0.02% (sparse) | 320 KB | 640 KB |
| 10,000 | 5,000,000 | 5% | 40 MB | 120 MB |
| 100,000 | 200,000 | 0.002% | 3.2 MB | 6.4 MB |
| 100,000 | 10,000,000 | 0.1% | 81.6 MB | 241.6 MB |
The crucial insight is that adjacency list space scales with actual edges, not potential edges. A graph with 100,000 vertices and 200,000 edges uses only 3.2 MB with lists—versus 10 GB with a matrix. That's a 3,000× difference in memory for the same graph!
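That ratio is easy to verify directly from the two formulas:

```typescript
// Sparse graph from the text: 100,000 vertices, 200,000 undirected edges.
const V = 100_000;
const E = 200_000;

const matrixBytes = V * V;        // 1 byte per cell: 10 GB, edges irrelevant
const listBytes = 16 * V + 8 * E; // 16|V| + 8|E|: 3.2 MB

console.log(matrixBytes / listBytes); // 3125 — the ~3,000× gap
```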
Now let's establish a rigorous mathematical framework for comparing these representations. Understanding when one beats the other requires expressing the break-even point precisely.
Asymptotic space complexity:
| Representation | Space Complexity |
|---|---|
| Adjacency Matrix | O(\|V\|²) |
| Adjacency List | O(\|V\| + \|E\|) |
But asymptotic notation hides the constants. Let's derive the exact break-even point.
Deriving the break-even point:
Let's find when the matrix and list consume equal memory.
Matrix space: S_matrix = c₁ × |V|² List space: S_list = c₂ × |V| + c₃ × |E|
For typical implementations: c₁ ≈ 1 byte per matrix cell (unweighted), c₂ ≈ 16 bytes of per-vertex list overhead, and c₃ ≈ 8 bytes per edge entry.
Setting S_matrix = S_list:
c₁ × |V|² = c₂ × |V| + c₃ × |E|
Solving for |E|:
|E| = (c₁ × |V|² - c₂ × |V|) / c₃ |E| ≈ (c₁/c₃) × |V|² - (c₂/c₃) × |V|
For large |V|, the dominant term is: |E| ≈ (c₁/c₃) × |V|²
```typescript
/**
 * Calculate the break-even edge count where matrix and list
 * consume approximately equal memory.
 *
 * @param vertexCount Number of vertices in the graph
 * @param matrixBytesPerCell Bytes per matrix cell (1 for bool, 4-8 for numbers)
 * @param listOverheadPerVertex Bytes overhead per vertex list header
 * @param listBytesPerEdge Bytes per edge entry in the list
 * @returns The edge count where both representations use similar memory
 */
function calculateBreakEvenEdges(
  vertexCount: number,
  matrixBytesPerCell: number = 1,
  listOverheadPerVertex: number = 16,
  listBytesPerEdge: number = 8
): number {
  const matrixSpace = matrixBytesPerCell * vertexCount * vertexCount;
  // Solve: matrixSpace = listOverheadPerVertex * V + listBytesPerEdge * E
  const breakEvenEdges =
    (matrixSpace - listOverheadPerVertex * vertexCount) / listBytesPerEdge;
  return Math.max(0, Math.floor(breakEvenEdges));
}

// Examples
console.log("Break-even edges for various vertex counts:");
console.log(`  100 vertices: ${calculateBreakEvenEdges(100)} edges`);
console.log(`  1,000 vertices: ${calculateBreakEvenEdges(1000)} edges`);
console.log(`  10,000 vertices: ${calculateBreakEvenEdges(10000)} edges`);

// Output:
// Break-even edges for various vertex counts:
//   100 vertices: 1050 edges      (max possible: 4,950 for undirected)
//   1,000 vertices: 123000 edges  (max possible: 499,500)
//   10,000 vertices: 12480000 edges (max possible: 49,995,000)
```

The break-even typically occurs around 10-25% edge density for unweighted graphs. For weighted graphs (where matrix cells are larger), the break-even shifts to even lower densities, making adjacency lists even more favorable for most real-world graphs.
Edge density is the ratio of actual edges to maximum possible edges. It's the single most important factor in choosing between representations.
Definition:
For an undirected graph without self-loops:

density = 2|E| / (|V| × (|V| − 1))

For a directed graph:

density = |E| / (|V| × (|V| − 1))

Density ranges from 0 (no edges) to 1 (complete graph).
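Expressed in code (a minimal sketch; the helper name `density` is illustrative):

```typescript
// Edge density: actual edges over the maximum possible (no self-loops).
function density(vertexCount: number, edgeCount: number, directed: boolean): number {
  const maxEdges = directed
    ? vertexCount * (vertexCount - 1)        // ordered pairs
    : (vertexCount * (vertexCount - 1)) / 2; // unordered pairs
  return edgeCount / maxEdges;
}

console.log(density(5, 10, false)); // 1 — complete undirected graph on 5 vertices
console.log(density(4, 3, true));   // 0.25 — 3 of 12 possible directed edges
```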
| Density Range | Classification | Typical Examples | Recommended Representation |
|---|---|---|---|
| 0% - 1% | Very Sparse | Social networks, web graphs, road networks | Adjacency List (strongly) |
| 1% - 10% | Sparse | Collaboration networks, citation graphs | Adjacency List |
| 10% - 25% | Moderate | Some communication networks, smaller graphs | Context-dependent |
| 25% - 50% | Semi-Dense | Certain biological networks, correlation matrices | Consider both |
| 50% - 100% | Dense | Complete or near-complete graphs, cliques | Adjacency Matrix often better |
Real-world graph densities:
Most real-world graphs are remarkably sparse. Understanding typical densities helps guide default choices:
Social Networks:
Infrastructure Networks:
Information Networks:
Biological Networks:
The vast majority of real-world graphs have densities well below 1%. This means adjacency lists are the appropriate default for nearly all practical applications. Adjacency matrices are the exception, not the rule—reserved for dense graphs, small graphs, or algorithms that specifically benefit from O(1) edge lookup.
Raw byte counts don't tell the complete story. How data is arranged in memory affects actual performance through cache effects. Modern CPUs rely heavily on cache hierarchies, and memory access patterns dramatically impact real-world speed.
Adjacency Matrix Memory Layout:
Matrices are stored as contiguous 2D arrays (typically row-major in C-like languages):
Memory Address: [0,0][0,1][0,2]...[0,V-1][1,0][1,1]...[V-1,V-1]
|---- Row 0 ----|---- Row 1 ----| ... |
Cache implications:
```typescript
// CACHE-FRIENDLY: Row-major access (sequential memory)
function iterateRowMajor(matrix: number[][], V: number): void {
  for (let i = 0; i < V; i++) {
    for (let j = 0; j < V; j++) {
      // Accesses memory sequentially
      // Each cache line (64 bytes) serves 16 consecutive int32 values
      process(matrix[i][j]);
    }
  }
}
// Cache behavior: ~V²/16 cache misses (excellent)

// CACHE-HOSTILE: Column-major access (strided memory)
function iterateColumnMajor(matrix: number[][], V: number): void {
  for (let j = 0; j < V; j++) {
    for (let i = 0; i < V; i++) {
      // Jumps V elements between accesses
      // If V > cache_line_size/element_size, every access is a miss
      process(matrix[i][j]);
    }
  }
}
// Cache behavior: ~V² cache misses if V > 16 (terrible)
```

Adjacency List Memory Layout:
Adjacency lists have more fragmented memory:
Main Array: [ptr_0][ptr_1][ptr_2]...[ptr_V-1]
↓ ↓ ↓ ↓
Neighbor Lists: [2,5] [0,3,7,9] [1] ... [0,2,4]
Cache implications:
For sparse graphs, the adjacency list's working set (active data in cache) is proportional to edges being processed. The matrix must potentially load entire rows even if most entries are zeros. When the graph is sparse enough that the adjacency list fits in L3 cache but the matrix doesn't, the list can be dramatically faster despite pointer indirection.
Both representations admit various optimizations that can significantly reduce memory footprint. Understanding these techniques is essential for production systems.
Matrix Optimizations:
```typescript
class BitPackedMatrix {
  private bits: Uint32Array;
  private vertices: number;

  constructor(vertexCount: number) {
    this.vertices = vertexCount;
    // Each Uint32 holds 32 edges
    const totalBits = vertexCount * vertexCount;
    const uint32Count = Math.ceil(totalBits / 32);
    this.bits = new Uint32Array(uint32Count);
  }

  private getIndex(from: number, to: number): { word: number; bit: number } {
    const flatIndex = from * this.vertices + to;
    return { word: Math.floor(flatIndex / 32), bit: flatIndex % 32 };
  }

  hasEdge(from: number, to: number): boolean {
    const { word, bit } = this.getIndex(from, to);
    return (this.bits[word] & (1 << bit)) !== 0;
  }

  setEdge(from: number, to: number, exists: boolean): void {
    const { word, bit } = this.getIndex(from, to);
    if (exists) {
      this.bits[word] |= (1 << bit);
    } else {
      this.bits[word] &= ~(1 << bit);
    }
  }

  memoryUsage(): number {
    return this.bits.byteLength;
  }
}

// For 10,000 vertices:
// Regular matrix: 100,000,000 bytes = 95.4 MB
// Bit-packed: 12,500,000 bytes = 11.9 MB (8× reduction!)
```

Symmetric Matrix Optimization:
For undirected graphs, the matrix is symmetric (M[i][j] = M[j][i]). We can store only the upper or lower triangle: |V|(|V| − 1)/2 cells instead of |V|² (excluding the diagonal), roughly halving memory.
The trade-off is slightly more complex indexing, but for large graphs, the memory savings are substantial.
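One possible indexing scheme for the packed triangle (a sketch, assuming a strict upper triangle with no self-loops; the function name `triangleIndex` is illustrative):

```typescript
// Map an unordered pair (i, j), i ≠ j, into a flat array holding only the
// strict upper triangle of a |V| × |V| symmetric matrix.
// Row i contributes (V - 1 - i) entries; entry (i, j) with i < j sits at
// row offset i*(V-1) - i*(i-1)/2, plus (j - i - 1) within the row.
function triangleIndex(i: number, j: number, vertexCount: number): number {
  if (i === j) throw new Error("no self-loops in a strict triangle");
  if (i > j) [i, j] = [j, i]; // symmetry: (i, j) and (j, i) share a cell
  const rowStart = i * (vertexCount - 1) - (i * (i - 1)) / 2;
  return rowStart + (j - i - 1);
}

// For V = 4 the packed array has 4*3/2 = 6 cells:
console.log(triangleIndex(0, 1, 4)); // 0
console.log(triangleIndex(2, 3, 4)); // 5
console.log(triangleIndex(3, 2, 4)); // 5 (same cell — symmetric)
```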
Adjacency List Optimizations:
1. Compressed Sparse Row (CSR) Format:
Instead of an array of dynamic arrays, use two static arrays:

- `edges[]`: All neighbors concatenated
- `offsets[]`: Starting position of each vertex's neighbors

This eliminates per-list overhead and improves cache performance.
```typescript
class CSRGraph {
  // All neighbors stored consecutively
  private edges: Uint32Array;
  // offsets[i] = start index of vertex i's neighbors in edges[]
  // offsets[V] = total number of edge entries (sentinel)
  private offsets: Uint32Array;

  constructor(vertexCount: number, edgeList: [number, number][]) {
    // Count outgoing edges per vertex
    const outDegree = new Uint32Array(vertexCount);
    for (const [from, _to] of edgeList) {
      outDegree[from]++;
    }

    // Compute prefix sums for offsets
    this.offsets = new Uint32Array(vertexCount + 1);
    for (let i = 0; i < vertexCount; i++) {
      this.offsets[i + 1] = this.offsets[i] + outDegree[i];
    }

    // Fill edges array
    this.edges = new Uint32Array(this.offsets[vertexCount]);
    const currentPos = new Uint32Array(vertexCount);
    for (const [from, to] of edgeList) {
      const pos = this.offsets[from] + currentPos[from];
      this.edges[pos] = to;
      currentPos[from]++;
    }
  }

  *neighbors(vertex: number): Generator<number> {
    const start = this.offsets[vertex];
    const end = this.offsets[vertex + 1];
    for (let i = start; i < end; i++) {
      yield this.edges[i];
    }
  }

  degree(vertex: number): number {
    return this.offsets[vertex + 1] - this.offsets[vertex];
  }

  memoryUsage(): number {
    return this.edges.byteLength + this.offsets.byteLength;
  }
}

// Memory comparison for 1M vertices, 10M edges:
// Array of arrays: ~16MB (vertex overhead) + ~40MB (edges) = ~56MB
// CSR format: ~4MB (offsets) + ~40MB (edges) = ~44MB
// Plus: Better cache locality, no allocation fragmentation
```

High-performance graph libraries like SNAP, NetworkX (for certain backends), and GraphBLAS use CSR or similar compressed formats. When processing billion-edge graphs, these optimizations can reduce memory usage by 30-50% compared to naive implementations while simultaneously improving cache performance.
We've conducted a thorough analysis of memory requirements for both graph representations. Let's consolidate the essential takeaways:

- Adjacency matrix: O(|V|²) space, independent of edge count — predictable allocation, but wasteful for sparse graphs.
- Adjacency list: O(|V| + |E|) space, scaling with actual edges rather than potential edges.
- The break-even point typically falls around 10-25% edge density; below it, lists win decisively.
- Most real-world graphs have densities well below 1%, making adjacency lists the sensible default.
- Bit-packing, triangular storage, and CSR can shrink either representation substantially while improving cache behavior.
What's next:
Space is only half the story. The next page examines time complexity for common operations—edge lookup, neighbor iteration, and edge insertion/deletion. You'll see how the space-time trade-off plays out in practice, and why the 'best' representation depends not just on the graph, but on which operations your algorithm performs most frequently.
You now have a rigorous understanding of the memory requirements for adjacency matrices and adjacency lists. You can calculate exact memory usage, identify break-even points, and understand why adjacency lists are the default choice for most real-world graphs. Next, we'll analyze the time complexity trade-offs to complete the picture.