System Design (LLD)Iterator Pattern

Iterator Pattern: Uniform Collection Traversal

LevelIntermediate

Duration60 mins

TopicIterator Pattern

1 / 4

Problem: Traversing Collections Uniformly

The Universal Challenge of Collection Traversal

Every software system manipulates collections. Users in a database. Products in a catalog. Messages in a queue. Files in a directory. At the heart of nearly every application lies the fundamental need to traverse collections—to visit each element, process it, and move to the next.

Yet collections come in wildly different forms. Arrays store elements contiguously in memory. Linked lists chain elements through pointers. Trees organize elements hierarchically. Graphs connect elements through arbitrary relationships. Hash tables scatter elements across buckets.

The problem: How do you write code that can traverse any collection without knowing (or caring about) its internal structure?

What You Will Learn

By the end of this page, you will understand why exposing collection internals for traversal creates fragile, tightly-coupled code. You'll see how different traversal requirements (forward, backward, filtered, transformed) compound the problem. Most importantly, you'll understand why we need a pattern that separates 'how to traverse' from 'what to traverse.'

The Naive Approach: Exposing Collection Internals

When developers first encounter the need to traverse a collection, the most intuitive approach is to expose the collection's internal structure directly. If you have an array, you expose the array. If you have a linked list, you expose the head node. Let the client code figure out how to traverse it.

This seems perfectly reasonable for simple cases. Consider a notification system that needs to send alerts to all users:

naive_traversal.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
# Naive Approach: Exposing Internal Structure
 
class UserRepository:
    """Repository that stores users in an internal array"""
    
    def __init__(self):
        self._users: list = []  # Internal storage - an array
    
    def add_user(self, user: User) -> None:
        self._users.append(user)
    
    def get_users(self) -> list:
        """Exposes internal array directly"""
        return self._users  # ⚠️ Exposes implementation detail
 
 
class NotificationService:
    """Service that sends notifications to all users"""
    
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository
    
    def notify_all(self, message: str) -> None:
        # Client code must know how to traverse an array
        users = self.user_repository.get_users()
        
        for i in range(len(users)):  # Array-specific traversal
            users[i].send_notification(message)
 
 
# Usage
repository = UserRepository()
repository.add_user(User("alice@example.com"))
repository.add_user(User("bob@example.com"))
 
notification_service = NotificationService(repository)
notification_service.notify_all("System maintenance tonight!")

This works. But it creates a hidden dependency: NotificationService now knows that UserRepository stores users in a list. This knowledge is embedded in the for i in range(len(users)) loop.

What happens when requirements change?

The Ripple Effect of Internal Exposure

When you expose a collection's internal structure, every piece of client code becomes coupled to that structure. Change the structure, and you must change every client. In large systems, this creates cascading changes that can affect dozens or hundreds of files.

The Change That Breaks Everything

Suppose your system grows. You now have millions of users, and array operations become performance bottlenecks. The team decides to switch to a linked list for O(1) insertions and deletions. Or perhaps you adopt a tree structure for sorted access. Or a hash table for O(1) lookups by ID.

Suddenly, every client that traverses users must change:

breaking_change.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
# After switching from array to linked list...
 
class UserNode:
    """Node in a linked list of users"""
    def __init__(self, user: User):
        self.user = user
        self.next: Optional['UserNode'] = None
 
 
class UserRepository:
    """Now stores users in a linked list"""
    
    def __init__(self):
        self._head: Optional[UserNode] = None  # Changed internal structure!
        self._tail: Optional[UserNode] = None
    
    def add_user(self, user: User) -> None:
        new_node = UserNode(user)
        if self._tail:
            self._tail.next = new_node
        else:
            self._head = new_node
        self._tail = new_node
    
    def get_head(self) -> Optional[UserNode]:
        """New API - exposes linked list structure"""
        return self._head
 
 
class NotificationService:
    """Service that sends notifications to all users"""
    
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository
    
    def notify_all(self, message: str) -> None:
        # ❌ Old array traversal code is BROKEN
        # users = self.user_repository.get_users()
        # for i in range(len(users)):
        #     users[i].send_notification(message)
        
        # ✅ Must rewrite for linked list traversal
        current = self.user_repository.get_head()
        while current is not None:  # Linked list traversal
            current.user.send_notification(message)
            current = current.next

The damage is extensive:

NotificationService must be rewritten
Any analytics code that counted users must be rewritten
Any export code that serialized users must be rewritten
Any UI code that displayed user lists must be rewritten
Any test code that iterated over users must be rewritten

Every single place in your codebase that traversed the user collection is now broken. And this is for one collection in one repository.

Violation of Encapsulation

This scenario violates a fundamental principle of object-oriented design: encapsulation. The internal representation of an object should be hidden from clients. By exposing whether users are stored in an array, linked list, tree, or any other structure, we've leaked implementation details that clients should never know.

The Problem Multiplies: Diverse Collection Types

Real systems don't have just one type of collection. They have many—often with different structures optimized for different access patterns. Consider an e-commerce platform:

Collection Diversity in Real Systems
Collection	Optimal Structure	Access Pattern	Traversal Method
Products	Hash Map (by ID)	O(1) lookup	Iterate over values()
Categories	Tree (hierarchical)	Parent-child navigation	DFS/BFS traversal
Order Queue	Linked List / Deque	FIFO processing	Pop from head
Price Tiers	Sorted Array	Range queries	Binary search bounds
User Sessions	LRU Cache	Recent access	Custom eviction order
Product Tags	Set	Membership test	Unordered iteration

Now imagine writing a reporting module that needs to generate summaries across all these collections. Without abstraction, the reporting code becomes a maze of collection-specific traversal logic:

report_generator_mess.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
# Without Iterator Pattern: Collection-Specific Chaos
 
class ReportGenerator:
    """Generates reports across multiple collection types"""
    
    def __init__(
        self,
        product_repo: ProductRepository,     # Uses HashMap
        category_repo: CategoryRepository,   # Uses Tree
        order_queue: OrderQueue,             # Uses LinkedList
        price_tiers: PriceTierRepository,    # Uses SortedArray
    ):
        self.product_repo = product_repo
        self.category_repo = category_repo
        self.order_queue = order_queue
        self.price_tiers = price_tiers
    
    def generate_daily_summary(self) -> Report:
        report = Report()
        
        # Traverse products (HashMap) - must know it's a dict
        products = self.product_repo.get_product_map()
        for product_id, product in products.items():
            report.add_product_metric(product)
        
        # Traverse categories (Tree) - must know it's a tree
        root = self.category_repo.get_root_category()
        self._traverse_category_tree(root, report)
        
        # Traverse orders (LinkedList) - must know it's a linked list
        current_order = self.order_queue.get_head()
        while current_order:
            report.add_order_metric(current_order.data)
            current_order = current_order.next
        
        # Traverse price tiers (SortedArray) - must know it's an array
        tiers = self.price_tiers.get_tiers_array()
        for i in range(len(tiers)):
            report.add_tier_metric(tiers[i])
        
        return report
    
    def _traverse_category_tree(self, node: CategoryNode, report: Report):
        """Recursive tree traversal - tree-specific logic"""
        if node is None:
            return
        
        report.add_category_metric(node.category)
        
        for child in node.children:
            self._traverse_category_tree(child, report)

Cognitive Overload

The report generator must understand four completely different traversal mechanisms. Developers working on this code must mentally context-switch between array indices, dictionary iteration, linked list pointers, and recursive tree traversal—all in one method. This cognitive load increases bugs and slows development.

Traversal Isn't Just 'Visit Every Element'

The problem deepens when you consider that traversal itself has many variations. It's not just about visiting every element—it's about how you visit them:

Traversal Variations

•Forward traversal — Visit elements from beginning to end (the most common case)
•Reverse traversal — Visit elements from end to beginning (for undo operations, reverse chronological display)
•Filtered traversal — Visit only elements matching certain criteria (active users, pending orders)
•Transformed traversal — Transform elements during iteration (project specific fields, compute derived values)
•Paginated traversal — Visit elements in chunks (for memory efficiency, API pagination)
•Random access traversal — Jump to specific positions (for binary search, sampling)
•Concurrent traversal — Multiple traversals happening simultaneously (multi-threaded processing)
•Snapshot traversal — Traverse a consistent view even as the collection changes (MVCC)

Without proper abstraction, each traversal variation requires collection-specific implementations. For a system with 5 collection types and 8 traversal variations, you potentially need 40 different traversal implementations—each tightly coupled to both the collection structure and the traversal type.

traversal_explosion.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# The Combinatorial Explosion of Traversal Implementations
 
# For Array-based collection:
def traverse_array_forward(arr): ...
def traverse_array_reverse(arr): ...
def traverse_array_filtered(arr, predicate): ...
def traverse_array_paginated(arr, page_size): ...
 
# For LinkedList-based collection:
def traverse_linked_list_forward(head): ...
def traverse_linked_list_reverse(head): ...  # Requires O(n) per step or doubly-linked
def traverse_linked_list_filtered(head, predicate): ...
def traverse_linked_list_paginated(head, page_size): ...
 
# For Tree-based collection:
def traverse_tree_preorder(root): ...
def traverse_tree_inorder(root): ...
def traverse_tree_postorder(root): ...
def traverse_tree_level_order(root): ...
def traverse_tree_filtered(root, predicate): ...
def traverse_tree_paginated(root, page_size): ...
 
# For Graph-based collection:
def traverse_graph_dfs(start_node): ...
def traverse_graph_bfs(start_node): ...
def traverse_graph_topological(graph): ...
def traverse_graph_filtered(start_node, predicate): ...
 
# And so on for every combination...
# This is UNMAINTAINABLE!

Maintenance Nightmare

Every time you add a new collection type, you must implement all traversal variations. Every time you add a new traversal variation, you must implement it for all collection types. The maintenance burden grows multiplicatively, not additively.

The Polymorphism Problem

One of the most powerful features of object-oriented programming is polymorphism—the ability to treat different types uniformly through a common interface. But without proper iteration abstraction, collections break polymorphism.

Consider a function that should work with any collection of items:

broken_polymorphism.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
# Broken Polymorphism: Cannot Write Generic Code
 
def calculate_total_value(collection) -> Decimal:
    """
    Should calculate total value of any collection of items.
    But HOW do we traverse the collection?
    """
    total = Decimal("0")
    
    # We must check the collection type and use specific traversal!
    if isinstance(collection, list):
        for i in range(len(collection)):
            total += collection[i].get_value()
    
    elif isinstance(collection, dict):
        for key, item in collection.items():
            total += item.get_value()
    
    elif isinstance(collection, LinkedList):
        current = collection.head
        while current:
            total += current.data.get_value()
            current = current.next
    
    elif isinstance(collection, BinaryTree):
        # Need recursive traversal
        def traverse(node):
            nonlocal total
            if node:
                total += node.data.get_value()
                traverse(node.left)
                traverse(node.right)
        traverse(collection.root)
    
    else:
        raise TypeError(f"Unknown collection type: {type(collection)}")
    
    return total
 
 
# The function "works" but violates every principle:
# ❌ Open/Closed Principle: Must modify function for new collection types
# ❌ Single Responsibility: One function handles multiple traversal algorithms
# ❌ Liskov Substitution: Cannot substitute collections freely
# ❌ Dependency Inversion: Depends on concrete collection implementations

The isinstance checks are a code smell. They indicate that our abstraction is incomplete. We're forced to peek inside the collection to figure out how to traverse it, which defeats the purpose of having different collection types in the first place.

What we really want:

desired_polymorphism.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# What We Want: True Polymorphic Iteration
 
def calculate_total_value(collection: Iterable[Item]) -> Decimal:
                                    """
    Calculate total value of ANY collection of items.
    We don't care HOW the collection stores items.
    We only care that we can iterate over them.
    """
    total = Decimal("0")
    
    # Single, uniform traversal mechanism
    for item in collection:
        total += item.get_value()
    
    return total
 
 
# This function works with:
# - Arrays/Lists
# - Linked Lists
# - Trees (any traversal order)
# - Hash Maps (over values)
# - Sets
# - Queues
# - Database result sets
# - File line iterators
# - Network stream chunks
# - ANY collection that provides iteration!

The Goal of Iterator Pattern

The Iterator Pattern gives us exactly this: a uniform interface for traversing any collection, without exposing how the collection is structured internally. Client code simply asks for elements one at a time, and the iterator handles all the details of navigation.

Real-World Systems That Suffer Without Iterators

The iteration problem isn't theoretical—it causes real pain in production systems. Here are patterns we see repeatedly:

Production System Pain Points

•ORM Result Sets — Database queries return results in various forms (arrays, lazy cursors, paginated streams). Without iterator abstraction, business logic becomes coupled to specific ORM return types.
•API Pagination — External APIs return paginated results. Client code must handle 'has more pages' logic, next-page tokens, and rate limiting—mixing traversal concerns with business logic.
•File System Operations — Directory contents can be listed eagerly (into memory) or lazily (streaming). Without abstraction, code that processes files becomes tied to specific listing strategies.
•Message Queue Consumption — Messages are consumed from queues in batches or streams. Consumer code that's tightly coupled to queue implementation can't be reused for different queue systems.
•Search Results — Search engines return results with scores, highlights, and pagination. Without iteration abstraction, display logic becomes tangled with search-specific data structures.
•Configuration Systems — Configuration can come from files, environment variables, databases, or remote services. Code that iterates over settings shouldn't care about the source.

In each case, the fundamental issue is the same: business logic becomes polluted with traversal mechanics. The code that should focus on 'what to do with each element' is instead preoccupied with 'how to get the next element.'

Without Iterator Pattern

•Business logic mixed with traversal logic
•Coupled to specific collection types
•Changes propagate across codebase
•Can't test traversal independently
•Can't easily change collection implementations
•Duplicate traversal code everywhere

With Iterator Pattern

•Clean separation of concerns
•Works with any iterable collection
•Changes isolated to iterator implementations
•Iterators can be tested in isolation
•Swap implementations without touching clients
•Single, reusable iteration interface

Defining the Core Problem

Let's crystallize the problem the Iterator Pattern solves. We need to satisfy multiple requirements simultaneously:

Requirements for Collection Traversal

•Encapsulation — The internal structure of a collection (array, linked list, tree, graph) must remain hidden from clients
•Uniformity — Clients should traverse all collections using the same interface, regardless of the collection's internal structure
•Multiple Traversals — It should be possible to have multiple simultaneous traversals of the same collection (different positions, different directions)
•Traversal Variations — The same collection should support different traversal orders (forward, reverse, filtered, etc.) without modifying the collection class
•Polymorphism — Generic algorithms should work with any collection type without type-specific code
•Separation of Concerns — Traversal logic should be separate from both collection logic and client business logic

The formal problem statement:

Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.

This is exactly what the Gang of Four identified in their seminal work on design patterns. The Iterator Pattern provides a solution that satisfies all six requirements above while maintaining clean object-oriented design.

Historical Context

The Iterator Pattern is one of the original 23 patterns documented by the Gang of Four in 1994. It has become so fundamental that most modern programming languages build iteration directly into their syntax (Python's 'for...in', Java's enhanced for loop, JavaScript's 'for...of'). Understanding the pattern helps you leverage these features more effectively.

Towards a Solution: The Iterator Abstraction

Before we dive into the full pattern implementation in the next page, let's preview the conceptual solution.

The key insight is to extract the traversal mechanism into its own object. Instead of the collection exposing its internals, the collection provides an 'iterator' object that knows how to traverse it. Clients interact only with the iterator, never with the collection's internal structure.

conceptual_solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# Conceptual Preview of the Iterator Solution
 
# The Iterator abstracts traversal
class Iterator:
    def has_next(self) -> bool:
        """Returns True if more elements exist"""
        ...
    
    def next(self) -> Any:
        """Returns the next element and advances position"""
        ...
 
 
# Collections provide iterators instead of exposing internals
class Collection:
    def create_iterator(self) -> Iterator:
        """Factory method: creates an iterator for this collection"""
        ...
 
 
# Client code uses only the Iterator interface
def process_items(collection: Collection):
    iterator = collection.create_iterator()
    
    while iterator.has_next():
        item = iterator.next()
        # Process item - no knowledge of collection structure!
        process(item)

Why this works:

Encapsulation preserved: The collection never exposes its internal structure
Uniform interface: Every collection provides the same create_iterator() method
Polymorphism enabled: Client code works with any collection that provides an iterator
Separation achieved: Traversal logic lives in the iterator, not in clients
Multiple traversals: Each call to create_iterator() returns a fresh iterator with independent state

The Power of Indirection

The Iterator Pattern introduces a level of indirection between clients and collections. This indirection is the source of all its benefits. Clients depend on the abstract Iterator interface, not on concrete collection implementations. This is the Dependency Inversion Principle in action.

Summary: The Problem Landscape

We've explored the fundamental challenges of collection traversal that motivate the Iterator Pattern:

Key Takeaways

•Exposing internals creates coupling — When collections expose their internal structure, every client becomes dependent on that structure
•Changes propagate explosively — Switching from array to linked list (or any other structure) breaks all client code
•Multiple collection types multiply the problem — Real systems have many collection types, each requiring different traversal code
•Traversal variations compound complexity — Forward, reverse, filtered, paginated—each variation needs implementation per collection type
•Polymorphism breaks down — Without abstraction, generic algorithms devolve into type-checking messes
•Business logic gets polluted — Code that should focus on processing mixes with navigation mechanics

What's next:

In the next page, we'll implement the Iterator Pattern in full detail. You'll see the complete structure—Iterator interface, Concrete Iterators, Aggregate interfaces, and Concrete Aggregates. We'll build iterators for multiple collection types and demonstrate how client code becomes simple, uniform, and completely decoupled from collection internals.

Page Complete

You now understand the problem of uniform collection traversal—why exposing collection internals creates fragile, coupled systems, and why we need an abstraction layer for iteration. Next, we'll build the solution: the Iterator Pattern's complete architecture.

1 / 4

Loading learning content...

System Design (LLD)Iterator Pattern

Iterator Pattern: Uniform Collection Traversal

LevelIntermediate

Duration60 mins

TopicIterator Pattern

1 / 4

Problem: Traversing Collections Uniformly

The Universal Challenge of Collection Traversal

The problem: How do you write code that can traverse any collection without knowing (or caring about) its internal structure?

What You Will Learn

The Naive Approach: Exposing Collection Internals

This seems perfectly reasonable for simple cases. Consider a notification system that needs to send alerts to all users:

naive_traversal.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
# Naive Approach: Exposing Internal Structure
 
class UserRepository:
    """Repository that stores users in an internal array"""
    
    def __init__(self):
        self._users: list = []  # Internal storage - an array
    
    def add_user(self, user: User) -> None:
        self._users.append(user)
    
    def get_users(self) -> list:
        """Exposes internal array directly"""
        return self._users  # ⚠️ Exposes implementation detail
 
 
class NotificationService:
    """Service that sends notifications to all users"""
    
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository
    
    def notify_all(self, message: str) -> None:
        # Client code must know how to traverse an array
        users = self.user_repository.get_users()
        
        for i in range(len(users)):  # Array-specific traversal
            users[i].send_notification(message)
 
 
# Usage
repository = UserRepository()
repository.add_user(User("alice@example.com"))
repository.add_user(User("bob@example.com"))
 
notification_service = NotificationService(repository)
notification_service.notify_all("System maintenance tonight!")

This works. But it creates a hidden dependency: NotificationService now knows that UserRepository stores users in a list. This knowledge is embedded in the for i in range(len(users)) loop.

What happens when requirements change?

The Ripple Effect of Internal Exposure

The Change That Breaks Everything

Suddenly, every client that traverses users must change:

breaking_change.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
# After switching from array to linked list...
 
class UserNode:
    """Node in a linked list of users"""
    def __init__(self, user: User):
        self.user = user
        self.next: Optional['UserNode'] = None
 
 
class UserRepository:
    """Now stores users in a linked list"""
    
    def __init__(self):
        self._head: Optional[UserNode] = None  # Changed internal structure!
        self._tail: Optional[UserNode] = None
    
    def add_user(self, user: User) -> None:
        new_node = UserNode(user)
        if self._tail:
            self._tail.next = new_node
        else:
            self._head = new_node
        self._tail = new_node
    
    def get_head(self) -> Optional[UserNode]:
        """New API - exposes linked list structure"""
        return self._head
 
 
class NotificationService:
    """Service that sends notifications to all users"""
    
    def __init__(self, user_repository: UserRepository):
        self.user_repository = user_repository
    
    def notify_all(self, message: str) -> None:
        # ❌ Old array traversal code is BROKEN
        # users = self.user_repository.get_users()
        # for i in range(len(users)):
        #     users[i].send_notification(message)
        
        # ✅ Must rewrite for linked list traversal
        current = self.user_repository.get_head()
        while current is not None:  # Linked list traversal
            current.user.send_notification(message)
            current = current.next

The damage is extensive:

NotificationService must be rewritten
Any analytics code that counted users must be rewritten
Any export code that serialized users must be rewritten
Any UI code that displayed user lists must be rewritten
Any test code that iterated over users must be rewritten

Every single place in your codebase that traversed the user collection is now broken. And this is for one collection in one repository.

Violation of Encapsulation

The Problem Multiplies: Diverse Collection Types

Real systems don't have just one type of collection. They have many—often with different structures optimized for different access patterns. Consider an e-commerce platform:

Collection Diversity in Real Systems
Collection	Optimal Structure	Access Pattern	Traversal Method
Products	Hash Map (by ID)	O(1) lookup	Iterate over values()
Categories	Tree (hierarchical)	Parent-child navigation	DFS/BFS traversal
Order Queue	Linked List / Deque	FIFO processing	Pop from head
Price Tiers	Sorted Array	Range queries	Binary search bounds
User Sessions	LRU Cache	Recent access	Custom eviction order
Product Tags	Set	Membership test	Unordered iteration

Now imagine writing a reporting module that needs to generate summaries across all these collections. Without abstraction, the reporting code becomes a maze of collection-specific traversal logic:

report_generator_mess.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
# Without Iterator Pattern: Collection-Specific Chaos
 
class ReportGenerator:
    """Generates reports across multiple collection types"""
    
    def __init__(
        self,
        product_repo: ProductRepository,     # Uses HashMap
        category_repo: CategoryRepository,   # Uses Tree
        order_queue: OrderQueue,             # Uses LinkedList
        price_tiers: PriceTierRepository,    # Uses SortedArray
    ):
        self.product_repo = product_repo
        self.category_repo = category_repo
        self.order_queue = order_queue
        self.price_tiers = price_tiers
    
    def generate_daily_summary(self) -> Report:
        report = Report()
        
        # Traverse products (HashMap) - must know it's a dict
        products = self.product_repo.get_product_map()
        for product_id, product in products.items():
            report.add_product_metric(product)
        
        # Traverse categories (Tree) - must know it's a tree
        root = self.category_repo.get_root_category()
        self._traverse_category_tree(root, report)
        
        # Traverse orders (LinkedList) - must know it's a linked list
        current_order = self.order_queue.get_head()
        while current_order:
            report.add_order_metric(current_order.data)
            current_order = current_order.next
        
        # Traverse price tiers (SortedArray) - must know it's an array
        tiers = self.price_tiers.get_tiers_array()
        for i in range(len(tiers)):
            report.add_tier_metric(tiers[i])
        
        return report
    
    def _traverse_category_tree(self, node: CategoryNode, report: Report):
        """Recursive tree traversal - tree-specific logic"""
        if node is None:
            return
        
        report.add_category_metric(node.category)
        
        for child in node.children:
            self._traverse_category_tree(child, report)

Cognitive Overload

Traversal Isn't Just 'Visit Every Element'

The problem deepens when you consider that traversal itself has many variations. It's not just about visiting every element—it's about how you visit them:

Traversal Variations

•Forward traversal — Visit elements from beginning to end (the most common case)
•Reverse traversal — Visit elements from end to beginning (for undo operations, reverse chronological display)
•Filtered traversal — Visit only elements matching certain criteria (active users, pending orders)
•Transformed traversal — Transform elements during iteration (project specific fields, compute derived values)
•Paginated traversal — Visit elements in chunks (for memory efficiency, API pagination)
•Random access traversal — Jump to specific positions (for binary search, sampling)
•Concurrent traversal — Multiple traversals happening simultaneously (multi-threaded processing)
•Snapshot traversal — Traverse a consistent view even as the collection changes (MVCC)

traversal_explosion.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
# The Combinatorial Explosion of Traversal Implementations
 
# For Array-based collection:
def traverse_array_forward(arr): ...
def traverse_array_reverse(arr): ...
def traverse_array_filtered(arr, predicate): ...
def traverse_array_paginated(arr, page_size): ...
 
# For LinkedList-based collection:
def traverse_linked_list_forward(head): ...
def traverse_linked_list_reverse(head): ...  # Requires O(n) per step or doubly-linked
def traverse_linked_list_filtered(head, predicate): ...
def traverse_linked_list_paginated(head, page_size): ...
 
# For Tree-based collection:
def traverse_tree_preorder(root): ...
def traverse_tree_inorder(root): ...
def traverse_tree_postorder(root): ...
def traverse_tree_level_order(root): ...
def traverse_tree_filtered(root, predicate): ...
def traverse_tree_paginated(root, page_size): ...
 
# For Graph-based collection:
def traverse_graph_dfs(start_node): ...
def traverse_graph_bfs(start_node): ...
def traverse_graph_topological(graph): ...
def traverse_graph_filtered(start_node, predicate): ...
 
# And so on for every combination...
# This is UNMAINTAINABLE!

Maintenance Nightmare

The Polymorphism Problem

Consider a function that should work with any collection of items:

broken_polymorphism.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
# Broken Polymorphism: Cannot Write Generic Code
 
def calculate_total_value(collection) -> Decimal:
    """
    Should calculate total value of any collection of items.
    But HOW do we traverse the collection?
    """
    total = Decimal("0")
    
    # We must check the collection type and use specific traversal!
    if isinstance(collection, list):
        for i in range(len(collection)):
            total += collection[i].get_value()
    
    elif isinstance(collection, dict):
        for key, item in collection.items():
            total += item.get_value()
    
    elif isinstance(collection, LinkedList):
        current = collection.head
        while current:
            total += current.data.get_value()
            current = current.next
    
    elif isinstance(collection, BinaryTree):
        # Need recursive traversal
        def traverse(node):
            nonlocal total
            if node:
                total += node.data.get_value()
                traverse(node.left)
                traverse(node.right)
        traverse(collection.root)
    
    else:
        raise TypeError(f"Unknown collection type: {type(collection)}")
    
    return total
 
 
# The function "works" but violates every principle:
# ❌ Open/Closed Principle: Must modify function for new collection types
# ❌ Single Responsibility: One function handles multiple traversal algorithms
# ❌ Liskov Substitution: Cannot substitute collections freely
# ❌ Dependency Inversion: Depends on concrete collection implementations

What we really want:

desired_polymorphism.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# What We Want: True Polymorphic Iteration
 
def calculate_total_value(collection: Iterable[Item]) -> Decimal:
                                    """
    Calculate total value of ANY collection of items.
    We don't care HOW the collection stores items.
    We only care that we can iterate over them.
    """
    total = Decimal("0")
    
    # Single, uniform traversal mechanism
    for item in collection:
        total += item.get_value()
    
    return total
 
 
# This function works with:
# - Arrays/Lists
# - Linked Lists
# - Trees (any traversal order)
# - Hash Maps (over values)
# - Sets
# - Queues
# - Database result sets
# - File line iterators
# - Network stream chunks
# - ANY collection that provides iteration!

The Goal of Iterator Pattern

Real-World Systems That Suffer Without Iterators

The iteration problem isn't theoretical—it causes real pain in production systems. Here are patterns we see repeatedly:

Production System Pain Points

•ORM Result Sets — Database queries return results in various forms (arrays, lazy cursors, paginated streams). Without iterator abstraction, business logic becomes coupled to specific ORM return types.
•API Pagination — External APIs return paginated results. Client code must handle 'has more pages' logic, next-page tokens, and rate limiting—mixing traversal concerns with business logic.
•File System Operations — Directory contents can be listed eagerly (into memory) or lazily (streaming). Without abstraction, code that processes files becomes tied to specific listing strategies.
•Message Queue Consumption — Messages are consumed from queues in batches or streams. Consumer code that's tightly coupled to queue implementation can't be reused for different queue systems.
•Search Results — Search engines return results with scores, highlights, and pagination. Without iteration abstraction, display logic becomes tangled with search-specific data structures.
•Configuration Systems — Configuration can come from files, environment variables, databases, or remote services. Code that iterates over settings shouldn't care about the source.

Without Iterator Pattern

•Business logic mixed with traversal logic
•Coupled to specific collection types
•Changes propagate across codebase
•Can't test traversal independently
•Can't easily change collection implementations
•Duplicate traversal code everywhere

With Iterator Pattern

•Clean separation of concerns
•Works with any iterable collection
•Changes isolated to iterator implementations
•Iterators can be tested in isolation
•Swap implementations without touching clients
•Single, reusable iteration interface

Defining the Core Problem

Let's crystallize the problem the Iterator Pattern solves. We need to satisfy multiple requirements simultaneously:

Requirements for Collection Traversal

•Encapsulation — The internal structure of a collection (array, linked list, tree, graph) must remain hidden from clients
•Uniformity — Clients should traverse all collections using the same interface, regardless of the collection's internal structure
•Multiple Traversals — It should be possible to have multiple simultaneous traversals of the same collection (different positions, different directions)
•Traversal Variations — The same collection should support different traversal orders (forward, reverse, filtered, etc.) without modifying the collection class
•Polymorphism — Generic algorithms should work with any collection type without type-specific code
•Separation of Concerns — Traversal logic should be separate from both collection logic and client business logic

The formal problem statement:

Provide a way to access the elements of an aggregate object sequentially without exposing its underlying representation.

Historical Context

Towards a Solution: The Iterator Abstraction

Before we dive into the full pattern implementation in the next page, let's preview the conceptual solution.

conceptual_solution.py
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
# Conceptual Preview of the Iterator Solution
 
# The Iterator abstracts traversal
class Iterator:
    def has_next(self) -> bool:
        """Returns True if more elements exist"""
        ...
    
    def next(self) -> Any:
        """Returns the next element and advances position"""
        ...
 
 
# Collections provide iterators instead of exposing internals
class Collection:
    def create_iterator(self) -> Iterator:
        """Factory method: creates an iterator for this collection"""
        ...
 
 
# Client code uses only the Iterator interface
def process_items(collection: Collection):
    iterator = collection.create_iterator()
    
    while iterator.has_next():
        item = iterator.next()
        # Process item - no knowledge of collection structure!
        process(item)

Why this works:

Encapsulation preserved: The collection never exposes its internal structure
Uniform interface: Every collection provides the same create_iterator() method
Polymorphism enabled: Client code works with any collection that provides an iterator
Separation achieved: Traversal logic lives in the iterator, not in clients
Multiple traversals: Each call to create_iterator() returns a fresh iterator with independent state

The Power of Indirection

Summary: The Problem Landscape

We've explored the fundamental challenges of collection traversal that motivate the Iterator Pattern:

Key Takeaways

•Exposing internals creates coupling — When collections expose their internal structure, every client becomes dependent on that structure
•Changes propagate explosively — Switching from array to linked list (or any other structure) breaks all client code
•Multiple collection types multiply the problem — Real systems have many collection types, each requiring different traversal code
•Traversal variations compound complexity — Forward, reverse, filtered, paginated—each variation needs implementation per collection type
•Polymorphism breaks down — Without abstraction, generic algorithms devolve into type-checking messes
•Business logic gets polluted — Code that should focus on processing mixes with navigation mechanics

What's next:

Page Complete

1 / 4