One of the most fundamental operations on any graph is answering the question: "Is there an edge between vertex u and vertex v?" This operation—called edge lookup or edge query—has dramatically different performance characteristics depending on your graph representation.
With an adjacency matrix, the answer is immediate: check matrix[u][v] in O(1) time. With a standard adjacency list, the answer requires searching through vertex u's neighbor list—taking O(degree(u)) time.
This seemingly simple difference has profound implications for algorithm design, representation choice, and real-world system performance. Let's explore why.
By the end of this page, you will understand why edge lookup takes O(degree) time in adjacency lists, how this affects algorithm performance, when this limitation matters (and when it doesn't), and techniques to achieve O(1) lookup when needed.
Before analyzing lookup complexity, let's establish a precise understanding of degree—the central concept in adjacency list performance.
Degree Definitions:
Degree of vertex v (undirected graph): The number of edges incident to v, i.e., the size of v's neighbor list.
Out-degree of vertex v (directed graph): The number of edges leaving v.
In-degree of vertex v (directed graph): The number of edges entering v.
Average degree: Total edges × 2 / Total vertices = 2|E| / |V| for undirected graphs.
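As a quick sanity check of the formula, consider a path graph on 4 vertices, which has 3 edges:

```python
# Worked example of average degree = 2|E| / |V| for an undirected graph.
edges = [(0, 1), (1, 2), (2, 3)]  # a path: 0-1-2-3
V = 4
avg_degree = 2 * len(edges) / V  # each edge contributes to two vertices
print(avg_degree)  # 1.5
```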
```python
class GraphWithDegreeAnalysis:
    """Adjacency list with degree analysis utilities."""

    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        """Add undirected edge."""
        self.adj[u].append(v)
        self.adj[v].append(u)

    def degree(self, v: int) -> int:
        """Time: O(1) - just return list length."""
        return len(self.adj[v])

    def average_degree(self) -> float:
        """Time: O(V) - iterate through all lists."""
        total_degree = sum(len(neighbors) for neighbors in self.adj)
        return total_degree / self.V

    def max_degree(self) -> int:
        """Time: O(V) - find maximum list length."""
        return max(len(neighbors) for neighbors in self.adj)

    def degree_distribution(self) -> dict[int, int]:
        """Time: O(V) - count vertices by degree."""
        dist = {}
        for v in range(self.V):
            d = self.degree(v)
            dist[d] = dist.get(d, 0) + 1
        return dist


# Example: Social Network Analysis
g = GraphWithDegreeAnalysis(1000)
# Add edges... (simulating connections)

# In real social networks:
# - Most users have low degree (few connections)
# - Some "hub" users have very high degree (celebrities, influencers)
# - This is called a "power-law" or "scale-free" distribution

# Implications for edge lookup:
# - Checking if regular users are friends: O(small constant) ≈ O(1)
# - Checking if you're friends with a celebrity: O(millions) = slow!
```

Degree in Real-World Graphs:
| Graph Type | Typical Avg Degree | Max Degree | Distribution |
|---|---|---|---|
| Road network | 2-4 | ~20 | Very uniform |
| Social network | 100-500 | 10M+ | Power-law |
| Web graph | 10-50 | 1M+ | Power-law |
| Citation graph | 20-50 | 10K+ | Power-law |
| Protein interaction | 3-10 | 1000+ | Power-law |
| Random graph | ~log(n) | ~2log(n) | Concentrated |
Power-law degree distributions mean most vertices have O(average) degree, but some 'hubs' have O(V) degree. Edge lookups touching these hubs can be unexpectedly slow with standard adjacency lists.
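To see how extreme this can get, here is a small illustrative sketch (the builder function is ours, not from the text): a single star-shaped hub has degree V − 1 while the average degree stays near 2.

```python
# Hypothetical star graph: one hub connected to every other vertex.
def build_star(num_vertices: int) -> list[list[int]]:
    adj = [[] for _ in range(num_vertices)]
    for v in range(1, num_vertices):
        adj[0].append(v)  # the hub collects every edge
        adj[v].append(0)
    return adj

adj = build_star(1000)
hub_degree = len(adj[0])
avg = sum(len(neighbors) for neighbors in adj) / len(adj)
print(hub_degree)  # 999 - a has_edge scan on the hub touches up to 999 entries
print(avg)         # 1.998 - yet the "typical" vertex has degree 1
```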
In a standard adjacency list (array of arrays/lists), checking if edge (u, v) exists requires searching through u's neighbor list. Let's trace through this process:
The Search Process:
To check has_edge(u, v):
1. Access adj[u] → O(1)
2. Search for v in adj[u] → O(degree(u))
3. Return found/not found
The bottleneck is step 2: we must potentially examine every neighbor of u before concluding v is not present.
```python
class AdjacencyListGraph:
    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)

    def has_edge(self, u: int, v: int) -> bool:
        """
        Check if edge (u, v) exists.

        Time Complexity: O(degree(u))
        Best case: O(1) if v is the first neighbor
        Worst case: O(degree(u)) if v is last or not present
        Average case: O(degree(u) / 2) ≈ O(degree(u))
        """
        # Linear search through u's neighbor list
        for neighbor in self.adj[u]:  # Loop runs degree(u) times in worst case
            if neighbor == v:
                return True
        return False

    def has_edge_optimized(self, u: int, v: int) -> bool:
        """
        Slight optimization: search the smaller list.

        Time: O(min(degree(u), degree(v)))
        Helpful when one vertex is a hub and the other isn't.
        """
        # Choose the vertex with fewer neighbors
        if len(self.adj[u]) > len(self.adj[v]):
            u, v = v, u
        return v in self.adj[u]  # Python 'in' is O(len) for lists


# Performance demonstration
import time

g = AdjacencyListGraph(10000)

# Create a hub: vertex 0 connected to all others
for i in range(1, 10000):
    g.add_edge(0, i)

# Add some edges between regular vertices
for i in range(1, 100):
    g.add_edge(i, i + 1)

# Lookup involving hub: SLOW
start = time.perf_counter()
for _ in range(1000):
    g.has_edge(0, 9999)  # Check hub's last neighbor
hub_time = time.perf_counter() - start

# Lookup between regular vertices: FAST
start = time.perf_counter()
for _ in range(1000):
    g.has_edge(50, 51)  # Check adjacent regular vertices
regular_time = time.perf_counter() - start

print(f"Hub lookup time: {hub_time:.4f}s")        # Significant
print(f"Regular lookup time: {regular_time:.4f}s")  # Negligible
print(f"Ratio: {hub_time/regular_time:.1f}x slower")  # Could be 1000x+!
```

Why This Matters:
The O(degree) lookup isn't always a problem—for graphs with bounded or low average degree, it's effectively O(1). But it becomes critical for edge-query-heavy algorithms (such as triangle counting) and in power-law graphs, where lookups that touch high-degree hubs pay the full cost.
This is perhaps the most fundamental tradeoff between the two main graph representations:
| Operation | Adjacency List | Adjacency Matrix |
|---|---|---|
| Edge lookup has_edge(u,v) | O(degree) | O(1) |
| Iterate neighbors | O(degree) | O(V) |
| Add edge | O(1) amortized | O(1) |
| Remove edge | O(degree) | O(1) |
| Space | O(V + E) | O(V²) |
The matrix trades space for constant-time lookup. When is each appropriate?
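For comparison, here is a minimal matrix-based sketch (the class name is ours, for illustration) showing where the table's O(1) lookup comes from: a single indexing operation, regardless of degree.

```python
# Minimal adjacency-matrix sketch illustrating O(1) edge lookup
# at the cost of O(V^2) space.
class AdjacencyMatrixGraph:
    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.matrix = [[0] * num_vertices for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.matrix[u][v] = 1
        self.matrix[v][u] = 1  # undirected: mirror the entry

    def has_edge(self, u: int, v: int) -> bool:
        """O(1): one array indexing operation, no search."""
        return self.matrix[u][v] == 1

g = AdjacencyMatrixGraph(5)
g.add_edge(0, 3)
print(g.has_edge(0, 3))  # True
print(g.has_edge(1, 2))  # False
```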
```python
# Example: Counting triangles in a graph
# A triangle is 3 vertices all connected to each other: (u, v, w)

def count_triangles_list(adj: list[list[int]], V: int) -> int:
    """
    Count triangles using adjacency list.

    For each edge (u, v) where u < v:
        For each neighbor w of u where w > v:
            Check if (v, w) is an edge ← This is O(degree(v))!

    Time: O(E × D) where D = max degree
    For power-law graphs, this can approach O(E × V) in worst case.
    """
    count = 0
    for u in range(V):
        for v in adj[u]:
            if v > u:
                for w in adj[u]:
                    if w > v:
                        if w in adj[v]:  # O(degree(v)) check!
                            count += 1
    return count


def count_triangles_matrix(matrix: list[list[int]], V: int) -> int:
    """
    Count triangles using adjacency matrix.

    For each triple (u, v, w) where u < v < w:
        Check if all three edges exist in O(1) each

    Time: O(V³) or O(V × E) with optimization
    Each edge check is O(1).
    """
    count = 0
    for u in range(V):
        for v in range(u + 1, V):
            if matrix[u][v]:  # O(1)
                for w in range(v + 1, V):
                    if matrix[u][w] and matrix[v][w]:  # O(1) each
                        count += 1
    return count


# The O(degree) lookup in adjacency lists makes triangle counting
# potentially much slower for high-degree vertices.
```

What if you need both the space efficiency of adjacency lists AND constant-time edge lookup? The solution is to use hash sets for neighbor storage instead of lists/arrays.
The Hybrid Approach:
Replace each vertex's neighbor list with a hash set. This gives O(1) average-time edge lookup and deletion while keeping O(V + E) space and O(degree) neighbor iteration.
```python
class HashSetGraph:
    """
    Adjacency list using hash sets for O(1) edge lookup.

    Best of both worlds:
    - O(V + E) space like regular adjacency list
    - O(1) average edge lookup like adjacency matrix
    - O(degree) iteration like regular adjacency list

    Trade-offs:
    - Higher constant factor in space (hash table overhead)
    - Iteration order is not deterministic
    - Slightly higher constant in edge insertion
    """

    def __init__(self, num_vertices: int):
        self.V = num_vertices
        self.adj = [set() for _ in range(num_vertices)]
        self.edge_count = 0

    def add_edge(self, u: int, v: int):
        """O(1) average time to add edge."""
        if v not in self.adj[u]:  # Avoid duplicate counting
            self.adj[u].add(v)
            self.adj[v].add(u)
            self.edge_count += 1

    def remove_edge(self, u: int, v: int):
        """O(1) average time to remove edge (vs O(degree) for lists!)."""
        if v in self.adj[u]:
            self.adj[u].discard(v)
            self.adj[v].discard(u)
            self.edge_count -= 1

    def has_edge(self, u: int, v: int) -> bool:
        """O(1) average time instead of O(degree)!"""
        return v in self.adj[u]

    def neighbors(self, v: int) -> set:
        """O(1) to get reference, O(degree) to fully iterate."""
        return self.adj[v]

    def degree(self, v: int) -> int:
        """O(1) time."""
        return len(self.adj[v])


# Performance comparison
import time
import random


def compare_lookup_performance(num_vertices: int, edges_per_vertex: int):
    """Compare list-based vs hash set-based edge lookup."""
    # Build both graphs with same edges
    list_adj = [[] for _ in range(num_vertices)]
    set_adj = [set() for _ in range(num_vertices)]

    for u in range(num_vertices):
        for _ in range(edges_per_vertex):
            v = random.randint(0, num_vertices - 1)
            if v != u:
                list_adj[u].append(v)
                set_adj[u].add(v)

    # Generate random edge queries
    queries = [(random.randint(0, num_vertices - 1),
                random.randint(0, num_vertices - 1))
               for _ in range(10000)]

    # Time list-based lookup
    start = time.perf_counter()
    for u, v in queries:
        _ = v in list_adj[u]
    list_time = time.perf_counter() - start

    # Time set-based lookup
    start = time.perf_counter()
    for u, v in queries:
        _ = v in set_adj[u]
    set_time = time.perf_counter() - start

    print(f"Vertices: {num_vertices}, Edges/vertex: {edges_per_vertex}")
    print(f"  List lookup: {list_time*1000:.2f}ms")
    print(f"  Set lookup: {set_time*1000:.2f}ms")
    print(f"  Speedup: {list_time/set_time:.1f}x")


# As edges_per_vertex grows, speedup increases dramatically
compare_lookup_performance(1000, 10)    # Small speedup
compare_lookup_performance(1000, 100)   # Medium speedup
compare_lookup_performance(1000, 1000)  # Large speedup
```

Use hash sets when: (1) You need both edge existence queries AND traversals, (2) Memory overhead is acceptable (~2-3x more than arrays), (3) You need O(1) edge deletion, or (4) Working with high-degree vertices. Stick with arrays when: Traversal is the only operation, memory is tight, or you need deterministic iteration order.
The O(degree) lookup complexity shapes how we design and analyze graph algorithms. Let's examine several important cases:
Case 1: Graph Traversals (BFS/DFS)
These algorithms iterate through neighbors, not check specific edges. They're unaffected by lookup complexity:
BFS/DFS: For each vertex v, iterate through all neighbors
Time: O(V + E) with adjacency list
Each edge is visited at most twice (once from each endpoint)
The O(degree) per-vertex iteration is exactly what we want.
```python
from collections import deque


def bfs(adj: list[list[int]], start: int) -> list[int]:
    """
    BFS visits neighbors, doesn't query specific edges.

    Time: O(V + E) - perfect match for adjacency list
    Each vertex is enqueued once: O(V)
    Each edge is traversed once (or twice for undirected): O(E)
    """
    visited = [False] * len(adj)
    order = []
    queue = deque([start])
    visited[start] = True

    while queue:
        v = queue.popleft()
        order.append(v)
        # Iterate through neighbors - O(degree(v)) total for this vertex
        for neighbor in adj[v]:  # No edge lookup needed!
            if not visited[neighbor]:
                visited[neighbor] = True
                queue.append(neighbor)

    return order
```

Case 2: Shortest Path Algorithms
Dijkstra's and Bellman-Ford iterate through neighbors for relaxation—again matching adjacency list strengths:
Dijkstra: For each extracted vertex, relax all neighbors
Time: O((V + E) log V) with binary heap
Edge iteration is the dominant pattern
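A minimal Dijkstra sketch illustrates this pattern. It assumes a weighted adjacency list where `adj[u]` holds `(neighbor, weight)` pairs—a convention we introduce here, not one used elsewhere on this page:

```python
import heapq

# Sketch of Dijkstra with a binary heap: the inner loop only iterates
# over neighbors for relaxation; it never asks "does edge (u, v) exist?"
def dijkstra(adj: list[list[tuple[int, int]]], start: int) -> list[float]:
    dist = [float('inf')] * len(adj)
    dist[start] = 0
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry, skip it
        for v, w in adj[u]:  # O(degree(u)) iteration - no edge lookups
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

# adj[u] = list of (neighbor, weight) pairs
adj = [[(1, 4), (2, 1)], [(3, 1)], [(1, 2), (3, 5)], []]
print(dijkstra(adj, 0))  # [0, 3, 1, 4]
```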
Case 3: Edge-Centric Algorithms
Some algorithms explicitly query edge existence—triangle counting and clique detection are typical examples. Here O(degree) matters: each query pays the full search cost.
For these, consider hash sets or careful algorithm design to minimize lookups.
```python
def count_triangles_optimized(adj: list[set[int]], V: int) -> int:
    """
    Optimized triangle counting with hash set adjacency list.

    Key insight: Order vertices by degree, iterate in that order.
    Only check 'forward' edges to avoid counting each triangle 3×.

    Time: O(E × sqrt(E)) for many real-world graphs
    With lists: O(E × D_max) where D_max can be O(V)
    With sets: Each edge check is O(1)
    """
    # Get degrees for all vertices
    degrees = [(len(adj[v]), v) for v in range(V)]
    degrees.sort()
    rank = {v: i for i, (_, v) in enumerate(degrees)}

    count = 0
    for u in range(V):
        # Only consider neighbors with higher rank
        for v in adj[u]:
            if rank[v] > rank[u]:
                for w in adj[u]:
                    if rank[w] > rank[v]:
                        # O(1) check with hash set!
                        if w in adj[v]:
                            count += 1
    return count


# Without O(1) lookup (using lists):
# Each 'w in adj[v]' would be O(degree(v))
# Total: O(sum over all edges of degree) = O(E × D_avg)
# Can be O(E × V) for hub-heavy graphs

# With O(1) lookup (using sets):
# Each check is O(1)
# Total: O(sum of min degrees) = O(E × sqrt(E)) typically
```

When designing graph algorithms with adjacency lists, prefer iteration over neighbors (O(degree) total) rather than explicit edge queries (O(degree) each). Restructure algorithms to minimize edge existence checks, or use hash sets when checks are unavoidable.
When we say edge lookup is O(degree), we're stating a worst-case bound. The actual performance can vary significantly:
Best Case: O(1)
The target vertex is the first neighbor checked. With unsorted lists, this occurs with probability 1/degree.
Average Case: O(degree/2) = O(degree)
Assuming the target is equally likely to be at any position, a search for a present edge examines degree/2 neighbors on average, while a search for an absent edge must examine all degree neighbors before concluding the edge doesn't exist.
Worst Case: O(degree)
Target is last in the list, or not present at all.
| Scenario | Comparisons | When This Happens |
|---|---|---|
| Best case | 1 | Target is first neighbor |
| Average (edge exists) | degree/2 | Target uniformly distributed |
| Average (edge absent) | degree | Must check all neighbors |
| Worst case | degree | Target last or absent |
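These counts are easy to verify empirically. The helper below is our own illustration: it counts how many neighbors a linear scan examines before stopping.

```python
# Count comparisons made by a linear scan for `target` in a neighbor list.
def comparisons(neighbors: list[int], target: int) -> int:
    count = 0
    for n in neighbors:
        count += 1
        if n == target:
            break  # found: stop early
    return count  # absent targets cost a full scan

neighbors = list(range(100))        # a vertex with degree 100
print(comparisons(neighbors, 0))    # 1   - best case: first neighbor
print(comparisons(neighbors, 99))   # 100 - worst case: last neighbor
print(comparisons(neighbors, 500))  # 100 - absent: full scan
avg = sum(comparisons(neighbors, t) for t in neighbors) / len(neighbors)
print(avg)  # 50.5 ≈ degree/2 for uniformly distributed present edges
```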
Amortized Analysis for Specific Patterns:
Some access patterns can improve average-case behavior:
Move-to-front heuristic: After finding a neighbor, move it to the front of the list. Frequently accessed edges become O(1).
Sorted neighbor lists: Enable binary search for O(log degree) lookup. Trade-off: O(degree) insertion to maintain sortedness.
Probabilistic skip lists: O(log degree) expected lookup with O(degree) space overhead.
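The move-to-front heuristic above can be sketched as follows (the class is a minimal illustration of the idea, not a production structure):

```python
# Move-to-front: after a successful lookup, the found neighbor is moved
# to the front of the list, so repeated queries for it become O(1).
class MTFAdjacencyList:
    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        self.adj[u].append(v)
        self.adj[v].append(u)

    def has_edge(self, u: int, v: int) -> bool:
        lst = self.adj[u]
        for i, neighbor in enumerate(lst):
            if neighbor == v:
                lst.insert(0, lst.pop(i))  # promote the hit to the front
                return True
        return False

g = MTFAdjacencyList(5)
for v in (1, 2, 3, 4):
    g.add_edge(0, v)
print(g.has_edge(0, 4))  # True; 4 was last, now moved to the front
print(g.adj[0][0])       # 4 - the next has_edge(0, 4) hits immediately
```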
```python
import bisect


class SortedAdjacencyList:
    """
    Maintain sorted neighbor lists for O(log degree) lookup.

    Trade-offs:
    - Lookup: O(log degree) instead of O(degree)
    - Insert: O(degree) instead of O(1) amortized
    - Space: Same as regular list

    Best for: Static graphs or infrequent modifications.
    """

    def __init__(self, num_vertices: int):
        self.adj = [[] for _ in range(num_vertices)]

    def add_edge(self, u: int, v: int):
        """O(degree) insertion to maintain sorted order."""
        # Binary search to find insertion point
        pos_u = bisect.bisect_left(self.adj[u], v)
        if pos_u == len(self.adj[u]) or self.adj[u][pos_u] != v:
            self.adj[u].insert(pos_u, v)  # O(degree) shift
        pos_v = bisect.bisect_left(self.adj[v], u)
        if pos_v == len(self.adj[v]) or self.adj[v][pos_v] != u:
            self.adj[v].insert(pos_v, u)

    def has_edge(self, u: int, v: int) -> bool:
        """O(log degree) lookup using binary search."""
        adj_u = self.adj[u]
        pos = bisect.bisect_left(adj_u, v)
        return pos < len(adj_u) and adj_u[pos] == v

    def neighbors(self, v: int) -> list[int]:
        """Neighbors returned in sorted order."""
        return self.adj[v]


# Comparison:
# Regular list lookup: O(degree) = O(1000) for hub vertex
# Sorted list lookup: O(log degree) = O(10) for same hub
# Hash set lookup: O(1) average
# Trade-off: Sorted lists maintain order, hash sets don't
```

The O(degree) edge lookup time in adjacency lists is both a limitation and an acceptable trade-off, depending on your use case.
What's Next:
Now that we understand the time and space characteristics of adjacency lists, the final page synthesizes everything: when are adjacency lists appropriate, and how do they compare holistically to adjacency matrices?
You now understand that O(degree) edge lookup is the key time complexity trade-off of adjacency lists. This limitation is acceptable for traversal-heavy algorithms and can be mitigated with hash sets when edge queries are frequent.