The Iterator Pattern is everywhere in production systems. Every time you query a database, read a file line by line, consume messages from a queue, or page through API results, you're using iterators.
This final page brings the pattern to life with concrete examples from real-world systems. You'll see how the abstract concepts translate into practical solutions for common engineering challenges.
By the end of this page, you will understand how to apply the Iterator Pattern to database cursors, file system operations, network streams, API pagination, and custom composite data structures. You'll also learn advanced patterns including bidirectional iteration, cursor-based pagination, and iterator decoration.
Database cursors are the canonical example of the Iterator Pattern. A query might return millions of rows, but you don't want to load all of them into memory at once. Instead, you get a cursor—an iterator that fetches rows on demand.
```python
from typing import Iterator, Optional, Dict, Any, TypeVar
from dataclasses import dataclass
import sqlite3

T = TypeVar('T')


class DatabaseCursor(Iterator[Dict[str, Any]]):
    """
    Iterator over database query results.

    Implements the Iterator Pattern to provide:
    - Memory-efficient streaming of large result sets
    - Uniform interface regardless of query complexity
    - Lazy fetching (rows fetched only when needed)
    """

    def __init__(
        self,
        connection: sqlite3.Connection,
        query: str,
        params: tuple = (),
        batch_size: int = 1000
    ):
        self._connection = connection
        self._cursor = connection.cursor()
        self._query = query
        self._params = params
        self._batch_size = batch_size

        # State for iteration
        self._buffer: list = []
        self._buffer_index = 0
        self._exhausted = False
        self._row_count = 0

        # Execute query (but don't fetch yet)
        self._cursor.execute(query, params)
        self._columns = [desc[0] for desc in self._cursor.description]

    def __iter__(self):
        return self

    def __next__(self) -> Dict[str, Any]:
        """
        Return next row as dictionary.

        Fetches rows in batches for efficiency, but presents
        them one at a time to the client.
        """
        if self._buffer_index >= len(self._buffer):
            # Need to fetch more rows
            if self._exhausted:
                raise StopIteration

            self._buffer = self._cursor.fetchmany(self._batch_size)
            self._buffer_index = 0

            if not self._buffer:
                self._exhausted = True
                raise StopIteration

        row = self._buffer[self._buffer_index]
        self._buffer_index += 1
        self._row_count += 1

        # Return as dictionary for easier access
        return dict(zip(self._columns, row))

    def close(self) -> None:
        """Release database resources."""
        self._cursor.close()

    @property
    def rows_processed(self) -> int:
        """Number of rows iterated so far."""
        return self._row_count

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.close()


class Database:
    """
    Database wrapper that returns cursors (iterators) for queries.
    """

    def __init__(self, path: str):
        self._connection = sqlite3.connect(path)

    def query(
        self,
        sql: str,
        params: tuple = ()
    ) -> DatabaseCursor:
        """
        Execute query and return cursor iterator.

        Usage:
            for row in db.query("SELECT * FROM users WHERE active = ?", (True,)):
                process(row)
        """
        return DatabaseCursor(self._connection, sql, params)

    def query_one(
        self,
        sql: str,
        params: tuple = ()
    ) -> Optional[Dict[str, Any]]:
        """Execute query and return first row or None."""
        with self.query(sql, params) as cursor:
            try:
                return next(cursor)
            except StopIteration:
                return None


# Practical usage example
def process_large_result_set():
    """
    Process millions of rows without loading all into memory.

    The cursor fetches batches internally but presents
    a simple iteration interface to the client.
    """
    db = Database("app.db")

    # This query might match 10 million rows,
    # but we only have a few thousand in memory at once
    query = """
        SELECT orders.*, customers.name
        FROM orders
        JOIN customers ON orders.customer_id = customers.id
        WHERE orders.date > ?
    """

    total_revenue = 0
    order_count = 0

    with db.query(query, ("2024-01-01",)) as cursor:
        for row in cursor:
            total_revenue += row['amount']
            order_count += 1

            # Can process each row individually
            if row['amount'] > 10000:
                flag_large_order(row)

            # Progress logging every 10000 rows
            if order_count % 10000 == 0:
                print(f"Processed {order_count} orders...")

    print(f"Total: {order_count} orders, ${total_revenue:,.2f} revenue")
```

The cursor internally fetches rows in batches (e.g., 1000 at a time) but presents them individually. This balances memory usage against round-trip overhead. The client doesn't need to know about batching—it just iterates.
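In Python, the same batched-cursor idea often collapses into a generator function, since `yield` handles all the buffer-index bookkeeping for you. A minimal sketch (the `users` table and its columns are illustrative):

```python
import sqlite3
from typing import Any, Dict, Iterator

def query_rows(
    connection: sqlite3.Connection,
    sql: str,
    params: tuple = (),
    batch_size: int = 1000,
) -> Iterator[Dict[str, Any]]:
    """Generator equivalent of DatabaseCursor: fetch in batches, yield row dicts."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        columns = [desc[0] for desc in cursor.description]
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            for row in batch:
                yield dict(zip(columns, row))
    finally:
        cursor.close()  # cleanup runs even if the caller stops early

# Demo against an in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

rows = list(query_rows(conn, "SELECT * FROM users ORDER BY id"))
# rows == [{'id': 1, 'name': 'Ada'}, {'id': 2, 'name': 'Grace'}]
```

The class-based version earns its keep when you need extra surface area (`rows_processed`, context-manager support); the generator wins when iteration is all you need.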
File systems are naturally hierarchical—directories contain files and subdirectories. The Iterator Pattern lets us traverse this hierarchy uniformly, hiding the complexity of recursive directory walking.
```python
from pathlib import Path
from typing import Iterator, Optional, Callable
from dataclasses import dataclass
from enum import Enum, auto


class TraversalOrder(Enum):
    """How to traverse the directory tree."""
    BREADTH_FIRST = auto()  # Level by level
    DEPTH_FIRST = auto()    # Deep then back up


@dataclass
class FileInfo:
    """Information about a file or directory."""
    path: Path
    is_dir: bool
    size: int
    name: str
    extension: str

    @classmethod
    def from_path(cls, path: Path) -> 'FileInfo':
        stat = path.stat()
        return cls(
            path=path,
            is_dir=path.is_dir(),
            size=stat.st_size if not path.is_dir() else 0,
            name=path.name,
            extension=path.suffix.lower()
        )


class FileSystemIterator(Iterator[FileInfo]):
    """
    Iterator over file system entries.

    Traverses directories recursively, yielding files and
    directories according to specified order and filters.
    """

    def __init__(
        self,
        root: Path,
        order: TraversalOrder = TraversalOrder.DEPTH_FIRST,
        include_dirs: bool = True,
        file_filter: Optional[Callable[[FileInfo], bool]] = None,
        max_depth: Optional[int] = None
    ):
        self._root = root
        self._order = order
        self._include_dirs = include_dirs
        self._file_filter = file_filter or (lambda f: True)
        self._max_depth = max_depth

        # Initialize traversal state
        self._queue: list[tuple[Path, int]] = [(root, 0)]
        self._current_depth = 0

    def __iter__(self):
        return self

    def __next__(self) -> FileInfo:
        """
        Return next file/directory in traversal.
        """
        while self._queue:
            if self._order == TraversalOrder.DEPTH_FIRST:
                current_path, depth = self._queue.pop()   # LIFO
            else:
                current_path, depth = self._queue.pop(0)  # FIFO

            self._current_depth = depth

            try:
                file_info = FileInfo.from_path(current_path)
            except (PermissionError, OSError):
                # Skip inaccessible entries
                continue

            # Add children to queue if directory
            if file_info.is_dir:
                if self._max_depth is None or depth < self._max_depth:
                    try:
                        children = list(current_path.iterdir())
                        # Add children to appropriate end of queue
                        for child in children:
                            self._queue.append((child, depth + 1))
                    except (PermissionError, OSError):
                        pass

            # Apply filters and skip criteria
            if file_info.is_dir and not self._include_dirs:
                continue
            if not self._file_filter(file_info):
                continue

            return file_info

        raise StopIteration


class FileSystem:
    """
    File system navigator that provides iterator access.
    """

    @staticmethod
    def walk(
        root: Path,
        **kwargs
    ) -> FileSystemIterator:
        """
        Walk directory tree, yielding file info.

        This is the Aggregate's create_iterator() method.
        """
        return FileSystemIterator(root, **kwargs)

    @staticmethod
    def find_files(
        root: Path,
        extensions: list[str],
        max_depth: Optional[int] = None
    ) -> Iterator[FileInfo]:
        """
        Find files with specific extensions.

        Convenience method that configures the iterator.
        """
        return FileSystemIterator(
            root,
            include_dirs=False,
            file_filter=lambda f: f.extension in extensions,
            max_depth=max_depth
        )

    @staticmethod
    def find_large_files(
        root: Path,
        min_size_bytes: int
    ) -> Iterator[FileInfo]:
        """
        Find files larger than specified size.
        """
        return FileSystemIterator(
            root,
            include_dirs=False,
            file_filter=lambda f: f.size >= min_size_bytes
        )


# Practical usage
def cleanup_project():
    """
    Clean up a project directory using the file system iterator.
    """
    project_root = Path("/home/user/project")

    # Find and remove build artifacts
    removable_patterns = {'.pyc', '.pyo', '.class', '.o', '.obj'}
    for file_info in FileSystem.find_files(project_root, list(removable_patterns)):
        print(f"Removing: {file_info.path}")
        file_info.path.unlink()

    # Find large log files
    log_extensions = ['.log', '.txt']
    one_megabyte = 1024 * 1024
    for file_info in FileSystem.walk(
        project_root,
        file_filter=lambda f: (
            f.extension in log_extensions
            and f.size > one_megabyte
        )
    ):
        print(f"Large log: {file_info.path} ({file_info.size / one_megabyte:.1f} MB)")

    # Calculate total size of Python files
    total_python_size = sum(
        f.size
        for f in FileSystem.find_files(project_root, ['.py'])
    )
    print(f"Total Python code: {total_python_size / 1024:.1f} KB")
```

Depth-first traversal uses less memory (stack depth = tree depth) and finds deeply nested files first. Breadth-first uses more memory (queue size = level width) but finds shallow files first. Choose based on your use case.
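The standard library already ships this pattern: `Path.rglob` returns a lazy iterator over a directory tree, much like `FileSystem.find_files` above. A small self-contained sketch using a temporary directory:

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a tiny tree: root/readme.md and root/pkg/mod.py
    root = Path(tmp)
    (root / "pkg").mkdir()
    (root / "pkg" / "mod.py").write_text("x = 1")
    (root / "readme.md").write_text("docs")

    # rglob is a recursive iterator, filtered by glob pattern
    py_files = [p.name for p in root.rglob("*.py")]
    # py_files == ["mod.py"]
```

Rolling your own `FileSystemIterator` still pays off when you need traversal-order control, depth limits, or arbitrary predicate filters, which `rglob`'s pattern matching can't express.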
Network data arrives as streams—you don't know how much is coming or when it will end. The Iterator Pattern provides a natural interface for consuming streamed data chunk by chunk.
```python
from typing import Iterator
import json
import socket
from abc import ABC, abstractmethod


class StreamEndedException(Exception):
    """Raised when stream has no more data."""
    pass


class NetworkStreamIterator(Iterator[bytes], ABC):
    """
    Abstract iterator for network streams.

    Provides buffered reading from network connections
    with a simple iteration interface.
    """

    @abstractmethod
    def _read_chunk(self) -> bytes:
        """Read next chunk from network. Returns empty bytes on EOF."""
        pass

    @abstractmethod
    def close(self) -> None:
        """Close the underlying connection."""
        pass


class TCPStreamIterator(NetworkStreamIterator):
    """
    Iterator over TCP socket data.

    Yields chunks of bytes as they arrive.
    """

    def __init__(
        self,
        host: str,
        port: int,
        chunk_size: int = 4096
    ):
        self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._socket.connect((host, port))
        self._chunk_size = chunk_size
        self._exhausted = False

    def __iter__(self):
        return self

    def __next__(self) -> bytes:
        if self._exhausted:
            raise StopIteration

        chunk = self._read_chunk()
        if not chunk:
            self._exhausted = True
            raise StopIteration

        return chunk

    def _read_chunk(self) -> bytes:
        return self._socket.recv(self._chunk_size)

    def close(self) -> None:
        self._socket.close()


class JSONLinesIterator(Iterator[dict]):
    """
    Iterator over newline-delimited JSON stream.

    Many APIs stream data as JSON Lines (one JSON object per line).
    This iterator parses each line as it arrives.
    """

    def __init__(self, byte_iterator: Iterator[bytes]):
        self._byte_iterator = byte_iterator
        self._buffer = ""
        self._closed = False

    def __iter__(self):
        return self

    def __next__(self) -> dict:
        while True:
            # Check if we have a complete line in buffer
            newline_pos = self._buffer.find('\n')
            if newline_pos != -1:
                line = self._buffer[:newline_pos]
                self._buffer = self._buffer[newline_pos + 1:]
                if line.strip():
                    return json.loads(line)
                continue

            # Need more data
            try:
                chunk = next(self._byte_iterator)
                self._buffer += chunk.decode('utf-8')
            except StopIteration:
                # End of stream - return remaining data if any
                if self._buffer.strip():
                    remaining = self._buffer
                    self._buffer = ""
                    return json.loads(remaining)
                raise


class ServerSentEventIterator(Iterator[dict]):
    """
    Iterator for Server-Sent Events (SSE).

    SSE is a common pattern for streaming updates:
    - Live sports scores
    - Stock prices
    - Chat messages
    - System monitoring
    """

    def __init__(self, http_response_iterator: Iterator[bytes]):
        self._response = http_response_iterator
        self._buffer = ""
        self._current_event = {}

    def __iter__(self):
        return self

    def __next__(self) -> dict:
        """
        Parse and return next SSE event.

        SSE format:
            event: eventType
            data: {"json": "payload"}
            (blank line separates events)
        """
        while True:
            # Look for complete events (double newline) already buffered
            while '\n\n' in self._buffer:
                event_str, self._buffer = self._buffer.split('\n\n', 1)
                event = self._parse_event(event_str)
                if event:
                    return event

            # Need more data
            try:
                chunk = next(self._response)
                self._buffer += chunk.decode('utf-8')
            except StopIteration:
                raise

    def _parse_event(self, event_str: str) -> dict:
        event = {'type': 'message', 'data': None}
        for line in event_str.split('\n'):
            if line.startswith('event:'):
                event['type'] = line[6:].strip()
            elif line.startswith('data:'):
                data_str = line[5:].strip()
                try:
                    event['data'] = json.loads(data_str)
                except json.JSONDecodeError:
                    event['data'] = data_str
        return event if event['data'] is not None else None


# Real-world usage: Live stock price stream
def monitor_stock_prices():
    """
    Consume live stock prices via SSE.

    The iterator abstracts away:
    - Network buffering
    - SSE protocol parsing
    - JSON decoding

    Client code just iterates over price updates.
    """
    # Hypothetical SSE stream
    stream = ServerSentEventIterator(
        connect_to_sse_endpoint("wss://market.example.com/prices")
    )

    for event in stream:
        if event['type'] == 'price_update':
            price = event['data']
            symbol = price['symbol']
            current = price['price']
            change = price['change_percent']

            print(f"{symbol}: ${current:.2f} ({change:+.1f}%)")

            # Can still break early
            if current > 1000:
                alert_high_price(symbol, current)
                break
```

Notice how iterators can wrap other iterators: JSONLinesIterator wraps a byte stream iterator. Each layer handles one concern (networking, framing, parsing). This is the Decorator Pattern applied to iterators.
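The wrapping idea fits in a few lines as a generator. This sketch parses JSON Lines from a byte-chunk iterator (plain lists of bytes stand in for network reads), and deliberately splits one JSON object across two chunks to show that the framing layer handles arbitrary chunk boundaries:

```python
import json
from typing import Iterator

def json_lines(chunks: Iterator[bytes]) -> Iterator[dict]:
    """Decorate a byte-chunk iterator with JSON Lines framing + parsing."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk.decode("utf-8")
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():  # trailing object with no final newline
        yield json.loads(buffer)

# Chunk boundary falls in the middle of the second object:
chunks = [b'{"id": 1}\n{"i', b'd": 2}\n']
records = list(json_lines(chunks))
# records == [{'id': 1}, {'id': 2}]
```

Because the parser only sees an iterator of bytes, the same function works over a socket, an HTTP response body, or a test fixture like the list above.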
REST APIs typically paginate large result sets. The Iterator Pattern can hide pagination complexity, letting clients iterate as if they had all results in memory.
```python
from typing import Iterator, TypeVar, Generic, Optional
from dataclasses import dataclass
import requests

T = TypeVar('T')


@dataclass
class PageInfo:
    """Pagination metadata."""
    page_number: int
    page_size: int
    total_items: int
    total_pages: int
    has_next: bool
    next_cursor: Optional[str] = None


class PaginatedAPIIterator(Iterator[T], Generic[T]):
    """
    Iterator over paginated API results.

    Automatically fetches subsequent pages as iteration progresses.
    Client code sees a seamless stream of items.

    Supports both:
    - Offset-based pagination (page=1, page=2)
    - Cursor-based pagination (cursor=abc123)
    """

    def __init__(
        self,
        base_url: str,
        headers: Optional[dict] = None,
        page_size: int = 100,
        cursor_param: str = "cursor",
        results_key: str = "items",
        next_cursor_key: str = "next_cursor"
    ):
        self._base_url = base_url
        self._headers = headers or {}
        self._page_size = page_size
        self._cursor_param = cursor_param
        self._results_key = results_key
        self._next_cursor_key = next_cursor_key

        # Iteration state
        self._current_page: list = []
        self._page_index = 0
        self._next_cursor: Optional[str] = None
        self._exhausted = False
        self._first_request = True

    def __iter__(self):
        return self

    def __next__(self) -> T:
        # If current page exhausted, fetch next
        if self._page_index >= len(self._current_page):
            if self._exhausted:
                raise StopIteration

            self._fetch_next_page()

            if not self._current_page:
                self._exhausted = True
                raise StopIteration

        item = self._current_page[self._page_index]
        self._page_index += 1
        return item

    def _fetch_next_page(self) -> None:
        """
        Fetch next page from API.
        """
        params = {"limit": self._page_size}

        if self._next_cursor:
            params[self._cursor_param] = self._next_cursor
        elif not self._first_request:
            # No cursor and not first request = done
            self._current_page = []
            return

        self._first_request = False

        response = requests.get(
            self._base_url,
            params=params,
            headers=self._headers
        )
        response.raise_for_status()

        data = response.json()
        self._current_page = data.get(self._results_key, [])
        self._page_index = 0
        self._next_cursor = data.get(self._next_cursor_key)

        if not self._next_cursor and not self._current_page:
            self._exhausted = True


class GitHubRepoIterator(Iterator[dict]):
    """
    Iterator over GitHub repositories.

    GitHub uses Link headers for pagination.
    This iterator handles the header parsing internally.
    """

    def __init__(self, username: str, token: Optional[str] = None):
        self._username = username
        self._headers = {"Accept": "application/vnd.github.v3+json"}
        if token:
            self._headers["Authorization"] = f"token {token}"

        self._next_url = f"https://api.github.com/users/{username}/repos"
        self._current_page: list = []
        self._page_index = 0

    def __iter__(self):
        return self

    def __next__(self) -> dict:
        if self._page_index >= len(self._current_page):
            if not self._next_url:
                raise StopIteration

            self._fetch_next_page()

            if not self._current_page:
                raise StopIteration

        repo = self._current_page[self._page_index]
        self._page_index += 1
        return repo

    def _fetch_next_page(self) -> None:
        response = requests.get(self._next_url, headers=self._headers)
        response.raise_for_status()

        self._current_page = response.json()
        self._page_index = 0

        # Parse Link header for next page
        self._next_url = self._parse_next_link(
            response.headers.get('Link', '')
        )

    def _parse_next_link(self, link_header: str) -> Optional[str]:
        for part in link_header.split(','):
            if 'rel="next"' in part:
                url = part.split(';')[0].strip()
                return url.strip('<>')
        return None


# Clean client code
def analyze_repositories():
    """
    Analyze all repositories for a GitHub user.

    The iterator handles pagination transparently.
    We just iterate as if all repos were in memory.
    """
    total_stars = 0
    repo_count = 0
    languages = {}

    # Iterate through ALL repos (auto-paginated)
    for repo in GitHubRepoIterator("microsoft"):
        repo_count += 1
        total_stars += repo['stargazers_count']

        lang = repo.get('language')
        if lang:
            languages[lang] = languages.get(lang, 0) + 1

        # Can still break early
        if repo_count >= 100:
            print("Analyzed first 100 repos")
            break

    print(f"Repositories: {repo_count}")
    print(f"Total stars: {total_stars:,}")
    print(f"Top languages: {sorted(languages.items(), key=lambda x: -x[1])[:5]}")
```

Client code simply iterates. It doesn't know about pages, cursors, or API rate limits. The iterator encapsulates all pagination logic, providing a clean abstraction over a complex reality.
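The core cursor-following loop can also be written as a generator, with the HTTP call abstracted behind a callable. In this sketch, `fetch_page` is a hypothetical stand-in for a real request; a dict simulates a two-page API so the logic is testable offline:

```python
from typing import Callable, Iterator, Optional

def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator:
    """Yield items across pages. fetch_page(cursor) must return
    {"items": [...], "next_cursor": str-or-None}."""
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]        # stream this page's items
        cursor = page.get("next_cursor")
        if not cursor:                  # no cursor means last page
            break

# Simulated two-page API keyed by cursor (None = first page):
pages = {
    None: {"items": [1, 2], "next_cursor": "p2"},
    "p2": {"items": [3], "next_cursor": None},
}
result = list(paginate(lambda c: pages[c]))
# result == [1, 2, 3]
```

Injecting the fetch function also makes retry logic, rate limiting, and testing straightforward: each concern wraps the callable rather than the iterator.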
The Composite Pattern (trees of objects) pairs naturally with the Iterator Pattern. Complex hierarchies—org charts, file systems, UI component trees—can all be traversed uniformly.
```python
from typing import Iterator, List, Optional
from dataclasses import dataclass, field
from abc import ABC, abstractmethod


class OrganizationComponent(ABC):
    """
    Component in organizational hierarchy.

    Both individuals and teams implement this interface.
    """

    @property
    @abstractmethod
    def name(self) -> str:
        pass

    @property
    @abstractmethod
    def salary(self) -> float:
        pass

    @abstractmethod
    def create_iterator(self) -> Iterator['OrganizationComponent']:
        """Return iterator over this component and its children."""
        pass


@dataclass
class Employee(OrganizationComponent):
    """
    Leaf in organizational hierarchy.
    """
    _name: str
    _salary: float
    role: str

    @property
    def name(self) -> str:
        return self._name

    @property
    def salary(self) -> float:
        return self._salary

    def create_iterator(self) -> Iterator['OrganizationComponent']:
        """Employee has no children, just yields self."""
        yield self


@dataclass
class Team(OrganizationComponent):
    """
    Composite in organizational hierarchy.

    Contains both employees and sub-teams.
    """
    _name: str
    members: List[OrganizationComponent] = field(default_factory=list)

    @property
    def name(self) -> str:
        return self._name

    @property
    def salary(self) -> float:
        """Total salary of all members."""
        return sum(m.salary for m in self._iterate_all())

    def add(self, component: OrganizationComponent) -> None:
        self.members.append(component)

    def _iterate_all(self) -> Iterator[OrganizationComponent]:
        """Internal iterator over all leaf members."""
        for member in self.members:
            yield from member.create_iterator()

    def create_iterator(self) -> Iterator['OrganizationComponent']:
        """
        Depth-first iteration over entire subtree.

        Yields this team, then recursively yields all members.
        """
        yield self
        for member in self.members:
            yield from member.create_iterator()

    def create_employees_only_iterator(self) -> Iterator[Employee]:
        """Iterate over employees only (skip teams)."""
        for component in self.create_iterator():
            if isinstance(component, Employee):
                yield component

    def create_direct_reports_iterator(self) -> Iterator['OrganizationComponent']:
        """Iterate over direct members only (shallow)."""
        return iter(self.members)


class DepthAnnotatedIterator(Iterator[tuple[int, OrganizationComponent]]):
    """
    Iterator that yields (depth, component) tuples.

    Useful for indented display or level-based processing.
    """

    def __init__(self, root: OrganizationComponent):
        self._stack: List[tuple[int, OrganizationComponent]] = [(0, root)]

    def __iter__(self):
        return self

    def __next__(self) -> tuple[int, OrganizationComponent]:
        if not self._stack:
            raise StopIteration

        depth, component = self._stack.pop()

        if isinstance(component, Team):
            # Add children in reverse order (so first child pops first)
            for member in reversed(component.members):
                self._stack.append((depth + 1, member))

        return depth, component


# Build an organization
def build_org() -> Team:
    ceo = Team("Executive")
    ceo.add(Employee("Alice", 500000, "CEO"))

    engineering = Team("Engineering")
    engineering.add(Employee("Bob", 200000, "VP Engineering"))

    platform = Team("Platform Team")
    platform.add(Employee("Charlie", 150000, "Staff Engineer"))
    platform.add(Employee("Diana", 140000, "Senior Engineer"))
    platform.add(Employee("Eve", 120000, "Engineer"))

    product = Team("Product Team")
    product.add(Employee("Frank", 145000, "Staff Engineer"))
    product.add(Employee("Grace", 130000, "Senior Engineer"))

    engineering.add(platform)
    engineering.add(product)

    sales = Team("Sales")
    sales.add(Employee("Henry", 180000, "VP Sales"))
    sales.add(Employee("Ivy", 100000, "Account Executive"))

    ceo.add(engineering)
    ceo.add(sales)

    return ceo


# Use iterators for various analyses
def demonstrate_composite_iteration():
    org = build_org()

    # Print organization chart with indentation
    print("Organization Chart:")
    for depth, component in DepthAnnotatedIterator(org):
        indent = "  " * depth
        if isinstance(component, Team):
            print(f"{indent}📁 {component.name}")
        else:
            print(f"{indent}👤 {component.name} ({component.role})")

    # Calculate total salary
    total = sum(e.salary for e in org.create_employees_only_iterator())
    print(f"Total Salary: ${total:,.2f}")

    # Find high earners
    print("Employees earning > $150k:")
    for emp in org.create_employees_only_iterator():
        if emp.salary > 150000:
            print(f"  {emp.name}: ${emp.salary:,.2f}")

    # Count employees per team
    print("Team Sizes:")
    for component in org.create_iterator():
        if isinstance(component, Team):
            emp_count = sum(1 for _ in component.create_employees_only_iterator())
            print(f"  {component.name}: {emp_count} employees")
```

The same composite structure supports multiple iteration strategies: depth-first with annotations, employees only, direct reports only. Each iterator implementation provides a different view of the same underlying data.
Beyond basic iteration, there are several advanced patterns that solve specific challenges:
```python
from typing import Iterator, TypeVar, Generic, Optional, Callable

T = TypeVar('T')


# Pattern 1: Bidirectional Iterator
class BidirectionalIterator(Iterator[T], Generic[T]):
    """
    Iterator that can move both forward and backward.

    Useful for:
    - Text editors (cursor movement)
    - Media players (previous/next track)
    - Undo/redo in any sequence
    """

    def __init__(self, items: list[T]):
        self._items = items
        self._index = -1  # Before first element

    def has_next(self) -> bool:
        return self._index < len(self._items) - 1

    def has_previous(self) -> bool:
        return self._index > 0

    def __next__(self) -> T:
        if not self.has_next():
            raise StopIteration
        self._index += 1
        return self._items[self._index]

    def previous(self) -> T:
        if not self.has_previous():
            raise StopIteration
        self._index -= 1
        return self._items[self._index]

    def next_index(self) -> int:
        return self._index + 1

    def previous_index(self) -> int:
        return self._index - 1

    def current(self) -> Optional[T]:
        if 0 <= self._index < len(self._items):
            return self._items[self._index]
        return None


# Pattern 2: Filtering Iterator (Decorator)
class FilteringIterator(Iterator[T], Generic[T]):
    """
    Wraps another iterator, yielding only matching elements.

    This is the Decorator Pattern applied to iterators.
    """

    def __init__(
        self,
        source: Iterator[T],
        predicate: Callable[[T], bool]
    ):
        self._source = source
        self._predicate = predicate

    def __iter__(self):
        return self

    def __next__(self) -> T:
        while True:
            item = next(self._source)  # May raise StopIteration
            if self._predicate(item):
                return item


# Pattern 3: Transforming Iterator (Decorator)
class MappingIterator(Iterator[T], Generic[T]):
    """
    Wraps another iterator, transforming elements.
    """

    def __init__(
        self,
        source: Iterator,
        transform: Callable[..., T]
    ):
        self._source = source
        self._transform = transform

    def __iter__(self):
        return self

    def __next__(self) -> T:
        return self._transform(next(self._source))


# Pattern 4: Buffered Iterator (lookahead)
class LookaheadIterator(Iterator[T], Generic[T]):
    """
    Iterator with lookahead capability.

    Useful for parsers that need to peek at upcoming tokens.
    """

    def __init__(self, source: Iterator[T], lookahead_count: int = 1):
        self._source = source
        self._buffer: list[T] = []
        self._lookahead_count = lookahead_count
        self._exhausted = False

        # Pre-fill buffer
        self._fill_buffer()

    def _fill_buffer(self) -> None:
        while len(self._buffer) < self._lookahead_count and not self._exhausted:
            try:
                self._buffer.append(next(self._source))
            except StopIteration:
                self._exhausted = True

    def peek(self, offset: int = 0) -> Optional[T]:
        """Look ahead without consuming."""
        if offset < len(self._buffer):
            return self._buffer[offset]
        return None

    def __iter__(self):
        return self

    def __next__(self) -> T:
        if not self._buffer:
            raise StopIteration
        item = self._buffer.pop(0)
        self._fill_buffer()
        return item


# Pattern 5: Null/Empty Iterator
class EmptyIterator(Iterator[T], Generic[T]):
    """
    Iterator that yields nothing.

    Useful for:
    - Avoiding None checks in iterator-returning methods
    - Default return value
    - Testing
    """

    def __iter__(self):
        return self

    def __next__(self) -> T:
        raise StopIteration


# Pattern 6: Chaining Iterator
class ChainedIterator(Iterator[T], Generic[T]):
    """
    Chains multiple iterators sequentially.

    Exhausts first iterator, then second, then third, etc.
    """

    def __init__(self, *iterators: Iterator[T]):
        self._iterators = list(iterators)
        self._current_index = 0

    def __iter__(self):
        return self

    def __next__(self) -> T:
        while self._current_index < len(self._iterators):
            try:
                return next(self._iterators[self._current_index])
            except StopIteration:
                self._current_index += 1
        raise StopIteration


# Using advanced patterns together
def demonstrate_advanced_patterns():
    numbers = list(range(20))

    # Chain decorator iterators
    iterator = ChainedIterator(
        MappingIterator(
            FilteringIterator(
                iter(numbers),
                lambda x: x % 2 == 0  # Only evens
            ),
            lambda x: x ** 2  # Square them
        ),
        iter([1000, 2000, 3000])  # Then these
    )

    print("Squared evens, then extras:")
    for item in iterator:
        print(f"  {item}")

    # Lookahead for parsing
    tokens = LookaheadIterator(iter(["if", "(", "x", ">", "0", ")", "{"]))
    while True:
        try:
            current = next(tokens)
            next_token = tokens.peek()
            print(f"Current: {current}, Next: {next_token}")
        except StopIteration:
            break
```

While the Iterator Pattern is powerful, there are common mistakes that can cause bugs or performance issues:
```python
# Anti-Pattern Examples

# ❌ WRONG: Modifying during iteration
items = [1, 2, 3, 4, 5]
for item in items:
    if item % 2 == 0:
        items.remove(item)  # Skips the next element!
# Result: [1, 3, 5] seems right but try [1, 2, 4, 3, 5] - broken!

# ✅ CORRECT: Collect modifications
items = [1, 2, 3, 4, 5]
items = [item for item in items if item % 2 != 0]


# ❌ WRONG: Assuming generator reusability
def get_numbers():
    yield 1
    yield 2
    yield 3

gen = get_numbers()
print(list(gen))  # [1, 2, 3]
print(list(gen))  # [] - generator exhausted!

# ✅ CORRECT: Create new generator each time
print(list(get_numbers()))  # [1, 2, 3]
print(list(get_numbers()))  # [1, 2, 3]


# ❌ WRONG: No cleanup
def process_files(paths):
    for path in paths:
        f = open(path)  # Never closed!
        for line in f:
            process(line)

# ✅ CORRECT: Use context manager
def process_files(paths):
    for path in paths:
        with open(path) as f:
            for line in f:
                process(line)
```

Many languages throw explicit errors when you modify a collection during iteration (Java's ConcurrentModificationException, Python's RuntimeError for dict changes during iteration). This is intentional—the behavior is undefined and the iterator can't compensate.
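You can see Python's guard directly: structurally changing a dict while iterating over it raises `RuntimeError`. A minimal demonstration:

```python
d = {"a": 1, "b": 2}
error = None
try:
    for key in d:
        d["c"] = 3  # structural change mid-iteration
except RuntimeError as exc:
    error = str(exc)

# error contains "changed size during iteration"
```

Note the guard is for *structural* changes (adding/removing keys); updating the value of an existing key during iteration is allowed.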
Throughout this module, we've explored the Iterator Pattern comprehensively—from the problem it solves to real-world applications. Let's consolidate our learning:
| When to Use | Benefits | Watch Out For |
|---|---|---|
| Uniform collection traversal | Encapsulation preserved | Modification during iteration |
| Hide collection structure | Polymorphic algorithms | Resource cleanup |
| Multiple simultaneous traversals | Single Responsibility | One-shot iterators |
| Lazy/streaming data | Open/Closed extensible | State leakage on reset |
| Paginated or remote data | Memory efficiency | Inefficient has_next() |
Where to go from here:
The Iterator Pattern is a gateway to functional programming concepts: lazy evaluation, generator pipelines, and composable transformations such as map, filter, and chain.
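In Python, the standard library's itertools module and generator expressions package these ideas ready-made. As a sketch, the hand-built "squared evens, then extras" pipeline from the advanced patterns section becomes:

```python
from itertools import chain

numbers = range(20)
pipeline = chain(
    (x ** 2 for x in numbers if x % 2 == 0),  # filter + map, evaluated lazily
    [1000, 2000, 3000],                       # then the extras
)
result = list(pipeline)
# starts with the squared evens (0, 4, 16, ...) and ends with 3000
```

Nothing is computed until `list()` drives the iteration, the same lazy behavior the custom `FilteringIterator`/`MappingIterator`/`ChainedIterator` classes provide, in three lines instead of three classes.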
The Iterator Pattern is fundamental. Mastering it improves your ability to design APIs, reason about data flow, and write code that scales.
Congratulations! You've mastered the Iterator Pattern—from understanding the problem of uniform collection traversal, through the solution architecture, to real-world applications and advanced patterns. You can now design and implement iterators for any collection type, choose appropriately between internal and external iteration, and recognize iterator opportunities in production systems.