One of the most fundamental operations on any graph is answering the question: "Is there an edge between vertex u and vertex v?" This operation—called edge lookup or edge query—has dramatically different performance characteristics depending on your graph representation.
With an adjacency matrix, the answer is immediate: check matrix[u][v] in O(1) time. With a standard adjacency list, the answer requires searching through vertex u's neighbor list—taking O(degree(u)) time.
This seemingly simple difference has profound implications for algorithm design, representation choice, and real-world system performance. Let's explore why.
By the end of this page, you will understand why edge lookup takes O(degree) time in adjacency lists, how this affects algorithm performance, when this limitation matters (and when it doesn't), and techniques to achieve O(1) lookup when needed.
Before analyzing lookup complexity, let's establish a precise understanding of degree—the central concept in adjacency list performance.
Degree Definitions:
Degree of vertex v (undirected graph): The number of edges incident to v, i.e., the size of v's neighbor list.
Out-degree of vertex v (directed graph): The number of edges leaving v.
In-degree of vertex v (directed graph): The number of edges entering v.
Average degree: Total edges × 2 / Total vertices = 2|E| / |V| for undirected graphs.
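As a quick sanity check of the formula, consider a path graph on 4 vertices, which has 3 edges:

```python
# Worked example of average degree = 2|E| / |V| for an undirected graph.
edges = [(0, 1), (1, 2), (2, 3)]  # a path: 0-1-2-3
V = 4
avg_degree = 2 * len(edges) / V  # each edge contributes to two vertices
print(avg_degree)  # 1.5
```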
```python
class GraphWithDegreeAnalysis:
    """Adjacency list with degree analysis utilities."""

    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        """Add undirected edge."""
        self.adj[u].append(v)
        self.adj[v].append(u)

    def degree(self, v: int) -> int:
        """Time: O(1) - just return list length."""
        return len(self.adj[v])

    def average_degree(self) -> float:
        """Time: O(V) - iterate through all lists."""
        total_degree = sum(len(neighbors) for neighbors in self.adj)
        return total_degree / self.V

    def max_degree(self) -> int:
        """Time: O(V) - find maximum list length."""
        return max(len(neighbors) for neighbors in self.adj)

    def degree_distribution(self) -> dict[int, int]:
        """Time: O(V) - count vertices by degree."""
        dist = {}
        for v in range(self.V):
            d = self.degree(v)
            dist[d] = dist.get(d, 0) + 1
        return dist


# Example: Social Network Analysis
g = GraphWithDegreeAnalysis(1000)
# Add edges... (simulating connections)

# In real social networks:
# - Most users have low degree (few connections)
# - Some "hub" users have very high degree (celebrities, influencers)
# - This is called a "power-law" or "scale-free" distribution

# Implications for edge lookup:
# - Checking if regular users are friends: O(small constant) ≈ O(1)
# - Checking if you're friends with a celebrity: O(millions) = slow!
```

Degree in Real-World Graphs:
| Graph Type | Typical Avg Degree | Max Degree | Distribution |
|---|---|---|---|
| Road network | 2-4 | ~20 | Very uniform |
| Social network | 100-500 | 10M+ | Power-law |
| Web graph | 10-50 | 1M+ | Power-law |
| Citation graph | 20-50 | 10K+ | Power-law |
| Protein interaction | 3-10 | 1000+ | Power-law |
| Random graph | ~log(n) | ~2log(n) | Concentrated |
Power-law degree distributions mean most vertices have O(average) degree, but some 'hubs' have O(V) degree. Edge lookups touching these hubs can be unexpectedly slow with standard adjacency lists.
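To see how extreme this can get, here is a small illustrative sketch (the builder function is ours, not from the text): a single star-shaped hub has degree V − 1 while the average degree stays near 2.

```python
# Hypothetical star graph: one hub connected to every other vertex.
def build_star(num_vertices: int) -> list[list[int]]:
    adj = [[] for _ in range(num_vertices)]
    for v in range(1, num_vertices):
        adj[0].append(v)  # the hub collects every edge
        adj[v].append(0)
    return adj

adj = build_star(1000)
hub_degree = len(adj[0])
avg = sum(len(neighbors) for neighbors in adj) / len(adj)
print(hub_degree)  # 999 - a has_edge scan on the hub touches up to 999 entries
print(avg)         # 1.998 - yet the "typical" vertex has degree 1
```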
In a standard adjacency list (array of arrays/lists), checking if edge (u, v) exists requires searching through u's neighbor list. Let's trace through this process:
The Search Process:
To check has_edge(u, v):
1. Access adj[u] → O(1)
2. Search for v in adj[u] → O(degree(u))
3. Return found/not found
The bottleneck is step 2: we must potentially examine every neighbor of u before concluding v is not present.
```python
class AdjacencyListGraph:
    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)

    def has_edge(self, u: int, v: int) -> bool:
        """
        Check if edge (u, v) exists.

        Time Complexity: O(degree(u))
        Best case: O(1) if v is the first neighbor
        Worst case: O(degree(u)) if v is last or not present
        Average case: O(degree(u) / 2) ≈ O(degree(u))
        """
        # Linear search through u's neighbor list
        for neighbor in self.adj[u]:  # Loop runs degree(u) times in worst case
            if neighbor == v:
                return True
        return False

    def has_edge_optimized(self, u: int, v: int) -> bool:
        """
        Slight optimization: search the smaller list.

        Time: O(min(degree(u), degree(v)))
        Helpful when one vertex is a hub and the other isn't.
        """
        # Choose the vertex with fewer neighbors
        if len(self.adj[u]) > len(self.adj[v]):
            u, v = v, u
        return v in self.adj[u]  # Python 'in' is O(len) for lists


# Performance demonstration
import time

g = AdjacencyListGraph(10000)

# Create a hub: vertex 0 connected to all others
for i in range(1, 10000):
    g.add_edge(0, i)

# Add some edges between regular vertices
for i in range(1, 100):
    g.add_edge(i, i + 1)

# Lookup involving hub: SLOW
start = time.perf_counter()
for _ in range(1000):
    g.has_edge(0, 9999)  # Check hub's last neighbor
hub_time = time.perf_counter() - start

# Lookup between regular vertices: FAST
start = time.perf_counter()
for _ in range(1000):
    g.has_edge(50, 51)  # Check adjacent regular vertices
regular_time = time.perf_counter() - start

print(f"Hub lookup time: {hub_time:.4f}s")        # Significant
print(f"Regular lookup time: {regular_time:.4f}s")  # Negligible
print(f"Ratio: {hub_time/regular_time:.1f}x slower")  # Could be 1000x+!
```

Why This Matters:
The O(degree) lookup isn't always a problem—for graphs with bounded or low average degree, it's effectively O(1). But it becomes critical for edge-query-heavy algorithms (such as triangle counting) and in power-law graphs, where lookups that touch high-degree hubs pay the full cost.
This is perhaps the most fundamental tradeoff between the two main graph representations:
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Edge lookup has_edge(u,v) | O(degree) | O(1) |
| Iterate neighbors | O(degree) | O(V) |
| Add edge | O(1) amortized | O(1) |
| Remove edge | O(degree) | O(1) |
| Space | O(V + E) | O(V²) |
The matrix trades space for constant-time lookup. When is each appropriate?
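For comparison, here is a minimal matrix-based sketch (the class name is ours, for illustration) showing where the table's O(1) lookup comes from: a single indexing operation, regardless of degree.

```python
# Minimal adjacency-matrix sketch illustrating O(1) edge lookup
# at the cost of O(V^2) space.
class AdjacencyMatrixGraph:
    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.matrix = [[0] * num_vertices for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.matrix[u][v] = 1
        self.matrix[v][u] = 1  # undirected: mirror the entry

    def has_edge(self, u: int, v: int) -> bool:
        """O(1): one array indexing operation, no search."""
        return self.matrix[u][v] == 1

g = AdjacencyMatrixGraph(5)
g.add_edge(0, 3)
print(g.has_edge(0, 3))  # True
print(g.has_edge(1, 2))  # False
```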
```python
# Example: Counting triangles in a graph
# A triangle is 3 vertices all connected to each other: (u, v, w)

def count_triangles_list(adj: list[list[int]], V: int) -> int:
    """
    Count triangles using adjacency list.

    For each edge (u, v) where u < v:
        For each neighbor w of u where w > v:
            Check if (v, w) is an edge ← This is O(degree(v))!

    Time: O(E × D) where D = max degree
    For power-law graphs, this can approach O(E × V) in worst case.
    """
    count = 0
    for u in range(V):
        for v in adj[u]:
            if v > u:
                for w in adj[u]:
                    if w > v:
                        if w in adj[v]:  # O(degree(v)) check!
                            count += 1
    return count


def count_triangles_matrix(matrix: list[list[int]], V: int) -> int:
    """
    Count triangles using adjacency matrix.

    For each triple (u, v, w) where u < v < w:
        Check if all three edges exist in O(1) each

    Time: O(V³) or O(V × E) with optimization
    Each edge check is O(1).
    """
    count = 0
    for u in range(V):
        for v in range(u + 1, V):
            if matrix[u][v]:  # O(1)
                for w in range(v + 1, V):
                    if matrix[u][w] and matrix[v][w]:  # O(1) each
                        count += 1
    return count


# The O(degree) lookup in adjacency lists makes triangle counting
# potentially much slower for high-degree vertices.
```

What if you need both the space efficiency of adjacency lists AND constant-time edge lookup? The solution is to use hash sets for neighbor storage instead of lists/arrays.
The Hybrid Approach:
Replace each vertex's neighbor list with a hash set. This gives O(1) average-time edge lookup and deletion while keeping O(V + E) space and O(degree) neighbor iteration.
```python
class HashSetGraph:
    """
    Adjacency list using hash sets for O(1) edge lookup.

    Best of both worlds:
    - O(V + E) space like regular adjacency list
    - O(1) average edge lookup like adjacency matrix
    - O(degree) iteration like regular adjacency list

    Trade-offs:
    - Higher constant factor in space (hash table overhead)
    - Iteration order is not deterministic
    - Slightly higher constant in edge insertion
    """

    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.adj = [set() for _ in range(num_vertices)]
        self.edge_count = 0

    def add_edge(self, u: int, v: int):
        """O(1) average time to add edge."""
        if v not in self.adj[u]:  # Avoid duplicate counting
            self.adj[u].add(v)
            self.adj[v].add(u)
            self.edge_count += 1

    def remove_edge(self, u: int, v: int):
        """O(1) average time to remove edge (vs O(degree) for lists!)."""
        if v in self.adj[u]:
            self.adj[u].discard(v)
            self.adj[v].discard(u)
            self.edge_count -= 1

    def has_edge(self, u: int, v: int) -> bool:
        """O(1) average time instead of O(degree)!"""
        return v in self.adj[u]

    def neighbors(self, v: int) -> set:
        """O(1) to get reference, O(degree) to fully iterate."""
        return self.adj[v]

    def degree(self, v: int) -> int:
        """O(1) time."""
        return len(self.adj[v])


# Performance comparison
import time
import random


def compare_lookup_performance(num_vertices: int, edges_per_vertex: int):
    """Compare list-based vs hash set-based edge lookup."""
    # Build both graphs with same edges
    list_adj = [[] for _ in range(num_vertices)]
    set_adj = [set() for _ in range(num_vertices)]

    for u in range(num_vertices):
        for _ in range(edges_per_vertex):
            v = random.randint(0, num_vertices - 1)
            if v != u:
                list_adj[u].append(v)
                set_adj[u].add(v)

    # Generate random edge queries
    queries = [(random.randint(0, num_vertices - 1),
                random.randint(0, num_vertices - 1))
               for _ in range(10000)]

    # Time list-based lookup
    start = time.perf_counter()
    for u, v in queries:
        _ = v in list_adj[u]
    list_time = time.perf_counter() - start

    # Time set-based lookup
    start = time.perf_counter()
    for u, v in queries:
        _ = v in set_adj[u]
    set_time = time.perf_counter() - start

    print(f"Vertices: {num_vertices}, Edges/vertex: {edges_per_vertex}")
    print(f"  List lookup: {list_time*1000:.2f}ms")
    print(f"  Set lookup: {set_time*1000:.2f}ms")
    print(f"  Speedup: {list_time/set_time:.1f}x")


# As edges_per_vertex grows, speedup increases dramatically
compare_lookup_performance(1000, 10)    # Small speedup
compare_lookup_performance(1000, 100)   # Medium speedup
compare_lookup_performance(1000, 1000)  # Large speedup
```

Use hash sets when: (1) You need both edge existence queries AND traversals, (2) Memory overhead is acceptable (~2-3x more than arrays), (3) You need O(1) edge deletion, or (4) Working with high-degree vertices. Stick with arrays when: Traversal is the only operation, memory is tight, or you need deterministic iteration order.
The O(degree) lookup complexity shapes how we design and analyze graph algorithms. Let's examine several important cases:
Case 1: Graph Traversals (BFS/DFS)
These algorithms iterate through neighbors, not check specific edges. They're unaffected by lookup complexity:
BFS/DFS: For each vertex v, iterate through all neighbors
Time: O(V + E) with adjacency list
Each edge is visited at most twice (once from each endpoint)
The O(degree) per-vertex iteration is exactly what we want.
```python
from collections import deque


def bfs(adj: list[list[int]], start: int) -> list[int]:
    """
    BFS visits neighbors, doesn't query specific edges.

    Time: O(V + E) - perfect match for adjacency list
    Each vertex is enqueued once: O(V)
    Each edge is traversed once (or twice for undirected): O(E)
    """
    visited = [False] * len(adj)
    order = []
    queue = deque([start])
    visited[start] = True

    while queue:
        v = queue.popleft()
        order.append(v)
        # Iterate through neighbors - O(degree(v)) total for this vertex
        for neighbor in adj[v]:  # No edge lookup needed!
            if not visited[neighbor]:
                visited[neighbor] = True
                queue.append(neighbor)

    return order
```

Case 2: Shortest Path Algorithms
Dijkstra's and Bellman-Ford iterate through neighbors for relaxation—again matching adjacency list strengths:
Dijkstra: For each extracted vertex, relax all neighbors
Time: O((V + E) log V) with binary heap
Edge iteration is the dominant pattern
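A minimal Dijkstra sketch illustrates this pattern. It assumes a weighted adjacency list where `adj[u]` holds `(neighbor, weight)` pairs—a convention we introduce here, not one used elsewhere on this page:

```python
import heapq

# Sketch of Dijkstra with a binary heap: the inner loop only iterates
# over neighbors for relaxation; it never asks "does edge (u, v) exist?"
def dijkstra(adj: list[list[tuple[int, int]]], start: int) -> list[float]:
    dist = [float('inf')] * len(adj)
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, skip it
        for v, w in adj[u]:  # O(degree(u)) iteration - no edge lookups
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# adj[u] = list of (neighbor, weight) pairs
adj = [[(1, 4), (2, 1)], [(3, 1)], [(1, 2), (3, 5)], []]
print(dijkstra(adj, 0))  # [0, 3, 1, 4]
```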
Case 3: Edge-Centric Algorithms
Some algorithms explicitly query edge existence—triangle counting and clique detection are typical examples. Here O(degree) matters: each query pays the full search cost.
For these, consider hash sets or careful algorithm design to minimize lookups.
```python
def count_triangles_optimized(adj: list[set[int]], V: int) -> int:
    """
    Optimized triangle counting with hash set adjacency list.

    Key insight: Order vertices by degree, iterate in that order.
    Only check 'forward' edges to avoid counting each triangle 3×.

    Time: O(E × sqrt(E)) for many real-world graphs
    With lists: O(E × D_max) where D_max can be O(V)
    With sets: Each edge check is O(1)
    """
    # Get degrees for all vertices
    degrees = [(len(adj[v]), v) for v in range(V)]
    degrees.sort()
    rank = {v: i for i, (_, v) in enumerate(degrees)}

    count = 0
    for u in range(V):
        # Only consider neighbors with higher rank
        for v in adj[u]:
            if rank[v] > rank[u]:
                for w in adj[u]:
                    if rank[w] > rank[v]:
                        # O(1) check with hash set!
                        if w in adj[v]:
                            count += 1
    return count


# Without O(1) lookup (using lists):
# Each 'w in adj[v]' would be O(degree(v))
# Total: O(sum over all edges of degree) = O(E × D_avg)
# Can be O(E × V) for hub-heavy graphs

# With O(1) lookup (using sets):
# Each check is O(1)
# Total: O(sum of min degrees) = O(E × sqrt(E)) typically
```

When designing graph algorithms with adjacency lists, prefer iteration over neighbors (O(degree) total) rather than explicit edge queries (O(degree) each). Restructure algorithms to minimize edge existence checks, or use hash sets when checks are unavoidable.
When we say edge lookup is O(degree), we're stating a worst-case bound. The actual performance can vary significantly:
Best Case: O(1)
The target vertex is the first neighbor checked. With unsorted lists, this occurs with probability 1/degree.
Average Case: O(degree/2) = O(degree)
Assuming the target is equally likely to be at any position, a search for a present edge examines degree/2 neighbors on average, while a search for an absent edge must examine all degree neighbors before concluding the edge doesn't exist.
Worst Case: O(degree)
Target is last in the list, or not present at all.
| Scenario | Comparisons | When This Happens |
|---|---|---|
| Best case | 1 | Target is first neighbor |
| Average (edge exists) | degree/2 | Target uniformly distributed |
| Average (edge absent) | degree | Must check all neighbors |
| Worst case | degree | Target last or absent |
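These counts are easy to verify empirically. The helper below is our own illustration: it counts how many neighbors a linear scan examines before stopping.

```python
# Count comparisons made by a linear scan for `target` in a neighbor list.
def comparisons(neighbors: list[int], target: int) -> int:
    count = 0
    for n in neighbors:
        count += 1
        if n == target:
            break  # found: stop early
    return count  # absent targets cost a full scan

neighbors = list(range(100))        # a vertex with degree 100
print(comparisons(neighbors, 0))    # 1   - best case: first neighbor
print(comparisons(neighbors, 99))   # 100 - worst case: last neighbor
print(comparisons(neighbors, 500))  # 100 - absent: full scan
avg = sum(comparisons(neighbors, t) for t in neighbors) / len(neighbors)
print(avg)  # 50.5 ≈ degree/2 for uniformly distributed present edges
```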
Amortized Analysis for Specific Patterns:
Some access patterns can improve average-case behavior:
Move-to-front heuristic: After finding a neighbor, move it to the front of the list. Frequently accessed edges become O(1).
Sorted neighbor lists: Enable binary search for O(log degree) lookup. Trade-off: O(degree) insertion to maintain sortedness.
Probabilistic skip lists: O(log degree) expected lookup with O(degree) space overhead.
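The move-to-front heuristic above can be sketched as follows (the class is a minimal illustration of the idea, not a production structure):

```python
# Move-to-front: after a successful lookup, the found neighbor is moved
# to the front of the list, so repeated queries for it become O(1).
class MTFAdjacencyList:
    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)

    def has_edge(self, u: int, v: int) -> bool:
        lst = self.adj[u]
        for i, neighbor in enumerate(lst):
            if neighbor == v:
                lst.insert(0, lst.pop(i))  # promote the hit to the front
                return True
        return False

g = MTFAdjacencyList(5)
for v in (1, 2, 3, 4):
    g.add_edge(0, v)
print(g.has_edge(0, 4))  # True; 4 was last, now moved to the front
print(g.adj[0][0])       # 4 - the next has_edge(0, 4) hits immediately
```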
```python
import bisect


class SortedAdjacencyList:
    """
    Maintain sorted neighbor lists for O(log degree) lookup.

    Trade-offs:
    - Lookup: O(log degree) instead of O(degree)
    - Insert: O(degree) instead of O(1) amortized
    - Space: Same as regular list

    Best for: Static graphs or infrequent modifications.
    """

    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        """O(degree) insertion to maintain sorted order."""
        # Binary search to find insertion point
        pos_u = bisect.bisect_left(self.adj[u], v)
        if pos_u == len(self.adj[u]) or self.adj[u][pos_u] != v:
            self.adj[u].insert(pos_u, v)  # O(degree) shift
        pos_v = bisect.bisect_left(self.adj[v], u)
        if pos_v == len(self.adj[v]) or self.adj[v][pos_v] != u:
            self.adj[v].insert(pos_v, u)

    def has_edge(self, u: int, v: int) -> bool:
        """O(log degree) lookup using binary search."""
        adj_u = self.adj[u]
        pos = bisect.bisect_left(adj_u, v)
        return pos < len(adj_u) and adj_u[pos] == v

    def neighbors(self, v: int) -> list[int]:
        """Neighbors returned in sorted order."""
        return self.adj[v]


# Comparison:
# Regular list lookup: O(degree) = O(1000) for hub vertex
# Sorted list lookup: O(log degree) = O(10) for same hub
# Hash set lookup: O(1) average
# Trade-off: Sorted lists maintain order, hash sets don't
```

The O(degree) edge lookup time in adjacency lists is both a limitation and an acceptable trade-off, depending on your use case.
What's Next:
Now that we understand the time and space characteristics of adjacency lists, the final page synthesizes everything: when are adjacency lists appropriate, and how do they compare holistically to adjacency matrices?
You now understand that O(degree) edge lookup is the key time complexity trade-off of adjacency lists. This limitation is acceptable for traversal-heavy algorithms and can be mitigated with hash sets when edge queries are frequent.