The Iterator Pattern we've explored so far follows a specific model: the client controls iteration. The client asks 'is there more?', the client requests the next element, the client decides when to stop. This is called an external iterator—the iteration mechanism is external to the collection, controlled by the client.
But there's another approach. What if the collection controlled iteration? What if the client simply said 'here's what to do with each element' and the collection handled the traversal? This is an internal iterator—the iteration mechanism is internal to the collection.
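To make the contrast concrete, here is a minimal sketch of the same task (summing a list) driven both ways, using only plain Python:

```python
from functools import reduce

items = [1, 2, 3]

# External iteration: the client pulls each element via next()
# and controls the loop itself
it = iter(items)
total_external = 0
while True:
    try:
        total_external += next(it)
    except StopIteration:
        break

# Internal iteration: the client hands an operation to reduce(),
# which drives the traversal on the client's behalf
total_internal = reduce(lambda acc, x: acc + x, items, 0)

assert total_external == total_internal == 6
```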
This distinction is fundamental. It affects API design, control flow, error handling, and even which programming paradigm (imperative vs functional) feels more natural. Let's explore both approaches in depth.
By the end of this page, you will understand the difference between internal and external iterators, when to choose each approach, how control flow differs between them, and how modern functional programming patterns (map, filter, reduce) relate to internal iteration.
An external iterator (also called an active iterator or cursor) is what we've been building throughout this module. The client code explicitly requests elements one at a time and decides when to advance.
Key characteristics:
- The client explicitly requests each element (e.g., by calling next())
- The client decides when to advance, when to stop, and what happens between steps
- Iteration state lives in a cursor object the client holds
- Control flow is imperative: loops, break, continue, and early return all work naturally
```python
from typing import Iterator, List

# External Iterator: Client controls everything
# (should_pause, save_progress, wait_for_resume, needs_lookup,
#  process_lookup_collection, apply_result, and complex_processing
#  are illustrative placeholders, not real functions)

def process_with_external_iterator(items: List[str]) -> None:
    """
    Using an external iterator.

    The CLIENT (this function) controls:
    - When to get the iterator
    - When to advance (call next())
    - When to check for more elements
    - When to stop iterating
    - What to do between iterations
    """
    iterator = iter(items)  # Get the iterator

    while True:
        try:
            # CLIENT requests next element
            item = next(iterator)

            # CLIENT decides what to do
            print(f"Processing: {item}")

            # CLIENT can do anything between iterations
            if item == "STOP":
                print("Found stop signal, exiting early")
                break  # CLIENT controls when to stop

            # CLIENT can pause for external reasons
            if should_pause():
                save_progress(iterator)
                wait_for_resume()

            # CLIENT can interleave with other iterators
            if needs_lookup(item):
                # Process a completely different collection
                lookup_result = process_lookup_collection()
                apply_result(item, lookup_result)

        except StopIteration:
            break

    print("Iteration complete")


# More typical Python style
def external_with_for_loop(items: List[str]) -> Iterator[object]:
    """
    Python's for loop is still an external iterator.

    The loop body (client code) runs between each iterator step.
    Client can still break, continue, or return.
    """
    for item in items:
        if item.startswith("_"):
            continue  # Skip this one

        if item == "STOP":
            break  # Exit early

        result = complex_processing(item)
        if not result.success:
            return  # Ends the generator

        yield result  # Can even be part of a generator!
```

Think of an external iterator as a cursor or bookmark. It marks a position in the collection. The client moves the cursor forward (or backward, for bidirectional iterators). The collection itself is unaware of where the cursor points—it just provides the mechanism for movement.
An internal iterator (also called a passive iterator or callback-based iterator) inverts the control. Instead of the client pulling elements, the collection pushes elements to a callback function provided by the client.
Key characteristics:
- The client supplies a callback (often a lambda) describing what to do with each element
- The collection decides how and when to traverse its elements
- The client cannot directly pause, resume, or break out of the traversal
- Control flow is declarative: operations like for_each, map, filter, and reduce compose naturally
```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class InternalIteratorCollection(Generic[T]):
    """
    Collection with internal iteration support.

    The COLLECTION controls traversal.
    Client just provides the operation to perform.
    """

    def __init__(self):
        self._items: List[T] = []

    def add(self, item: T) -> None:
        self._items.append(item)

    def for_each(self, action: Callable[[T], None]) -> None:
        """
        Internal iterator: collection controls traversal.

        Client provides a callback function.
        Collection applies it to every element.

        The client has NO control over:
        - When iteration starts
        - The order of traversal
        - When to pause or resume
        - Early termination (without exceptions)
        """
        for item in self._items:
            action(item)  # Collection calls client's function

    def map(self, transform: Callable[[T], R]) -> 'InternalIteratorCollection[R]':
        """
        Transform each element, return new collection.

        Another form of internal iteration: client provides
        the transformation, collection handles traversal.
        """
        result = InternalIteratorCollection[R]()
        for item in self._items:
            result.add(transform(item))
        return result

    def filter(self, predicate: Callable[[T], bool]) -> 'InternalIteratorCollection[T]':
        """
        Keep only elements matching predicate.
        Collection iterates and decides what to include.
        """
        result = InternalIteratorCollection[T]()
        for item in self._items:
            if predicate(item):
                result.add(item)
        return result

    def reduce(self, initial: R, reducer: Callable[[R, T], R]) -> R:
        """
        Combine all elements into single value.
        Classic internal iteration pattern.
        """
        accumulator = initial
        for item in self._items:
            accumulator = reducer(accumulator, item)
        return accumulator


# Using internal iterators
def demo_internal_iteration():
    numbers = InternalIteratorCollection[int]()
    for n in [1, 2, 3, 4, 5]:
        numbers.add(n)

    # Internal iteration: just provide the action
    print("All numbers:")
    numbers.for_each(lambda x: print(f"  {x}"))

    # Chained internal iterations (functional style)
    result = (numbers
              .map(lambda x: x * 2)            # Double each: 2, 4, 6, 8, 10
              .filter(lambda x: x > 4)         # Keep if > 4: 6, 8, 10
              .reduce(0, lambda a, b: a + b))  # Sum them

    print(f"Result: {result}")  # Output: 24 (6 + 8 + 10)


# Python's built-in internal iterators
def python_internal_iterators():
    numbers = [1, 2, 3, 4, 5]

    # map() - internal iteration with transformation
    doubled = list(map(lambda x: x * 2, numbers))

    # filter() - internal iteration with selection
    evens = list(filter(lambda x: x % 2 == 0, numbers))

    # reduce() (from functools) - internal iteration with accumulation
    from functools import reduce
    total = reduce(lambda acc, x: acc + x, numbers, 0)

    # all() and any() - internal iteration with short-circuit
    all_positive = all(x > 0 for x in numbers)
    any_negative = any(x < 0 for x in numbers)
```

The most significant difference between internal and external iterators is who controls the loop. This has profound implications for what operations are easy or difficult.
```python
# External Iterator Control Flow
# (pseudocode sketch: imagine this inside a client function)

# Client is in a loop they control
iterator = collection.iterator()

while iterator.has_next():
    item = iterator.next()

    # Client can:

    # 1. Break early
    if found_what_i_need(item):
        break

    # 2. Skip items
    if not relevant(item):
        continue

    # 3. Track state across iterations
    count += 1
    if item > max_so_far:
        max_so_far = item

    # 4. Interleave with other iterators
    for related in lookup(item):
        process(item, related)

    # 5. Return from the function
    if error_condition(item):
        return None
```

```python
# Internal Iterator Control Flow

# Collection controls the loop
# Client provides callback
collection.for_each(
    lambda item: process(item))

# Client CANNOT easily:
# 1. Break early
#    (need exception or flag)
# 2. Skip items
#    (need filter() before)
# 3. Track state across calls
#    (need closure or class)
# 4. Interleave iterators
#    (callbacks don't nest well)
# 5. Return from outer function
#    (return only exits lambda)

# But CAN chain operations:
collection \
    .filter(relevant) \
    .map(transform) \
    .for_each(process)
```

The trade-off is clear: external iteration gives the client fine-grained control over every step of the loop, while internal iteration trades that control away for concise, composable operations.
```python
# Challenges with Internal Iterators

# Challenge 1: Early termination
# External - easy
for item in items:
    if is_target(item):
        result = item
        break

# Internal - awkward
result = None

def find_target(item):
    global result
    if is_target(item) and result is None:
        result = item

items.for_each(find_target)
# Still processes ALL items even after finding!

# Better: use next() with a generator expression
result = next((item for item in items if is_target(item)), None)

# Challenge 2: Stateful operations
# External - easy
running_total = 0
for item in items:
    running_total += item.value
    item.running_total = running_total

# Internal - needs closure
running_total = [0]  # Mutable container for closure

def add_running_total(item):
    running_total[0] += item.value
    item.running_total = running_total[0]

items.for_each(add_running_total)

# Challenge 3: Nested iteration
# External - natural
for order in orders:
    for item in order.items:
        if item.needs_restock:
            restock(item)

# Internal - non-obvious
def process_order(order):
    order.items.filter(lambda i: i.needs_restock) \
        .for_each(restock)

orders.for_each(process_order)

# Or with flat_map
orders.flat_map(lambda o: o.items) \
    .filter(lambda i: i.needs_restock) \
    .for_each(restock)
```

Internal iteration is an example of Inversion of Control (IoC). The collection 'calls back' into client code. This is powerful for frameworks but can make debugging harder—stack traces show library code calling your code, not the other way around.
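The "exception or flag" workaround for early termination can be sketched concretely. This is an illustrative pattern, not a standard library facility; the `for_each` helper here stands in for any callback-driven traversal like the ones above:

```python
# Private exception used purely as a control-flow signal
class _StopTraversal(Exception):
    pass

def for_each(items, action):
    """Minimal internal iterator: applies action to every element."""
    for item in items:
        action(item)

def find_first(items, predicate):
    """Find the first match by aborting the traversal with an exception."""
    found = []

    def check(item):
        if predicate(item):
            found.append(item)
            raise _StopTraversal()  # abort the traversal early

    try:
        for_each(items, check)
    except _StopTraversal:
        pass
    return found[0] if found else None

assert find_first([1, 3, 8, 2], lambda x: x > 5) == 8
assert find_first([1, 2], lambda x: x > 5) is None
```

The exception escapes the callback and unwinds through the collection's loop, which is exactly the kind of control inversion that makes this feel awkward compared to a plain `break`.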
Neither approach is universally better. The right choice depends on your specific needs:
| Requirement | External Iterator | Internal Iterator |
|---|---|---|
| Process every element | ✓ Works well | ✓ Ideal - simpler code |
| Early termination | ✓ Break statement | ⚠ Awkward, needs workaround |
| Track state across elements | ✓ Local variables | ⚠ Needs closures or classes |
| Nested iteration | ✓ Nested loops | ⚠ Nested callbacks complex |
| Chain transformations | ⚠ Intermediate collections | ✓ Fluent API chains |
| Parallel execution | ⚠ Client must parallelize | ✓ Collection can parallelize |
| Lazy evaluation | ✓ Generators work well | ✓ Can be lazy too |
| Compare across collections | ✓ Multiple cursors | ⚠ Complex with callbacks |
| Functional programming style | ⚠ Imperative by nature | ✓ Natural fit |
| Debugging | ✓ Linear control flow | ⚠ Callback stack traces |
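The "compare across collections" row deserves a quick illustration. With external iterators, lock-step comparison just means holding one cursor per collection and advancing both together (a sketch in plain Python; doing the same with callbacks would require buffering one side):

```python
def lists_equal(a, b):
    """Compare two iterables element-by-element using two cursors."""
    it_a, it_b = iter(a), iter(b)
    sentinel = object()
    while True:
        x = next(it_a, sentinel)
        y = next(it_b, sentinel)
        if x is sentinel or y is sentinel:
            # Equal only if both ran out at the same time
            return x is sentinel and y is sentinel
        if x != y:
            return False

assert lists_equal([1, 2, 3], [1, 2, 3])
assert not lists_equal([1, 2], [1, 2, 3])
```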
Internal iterators are the foundation of functional programming's approach to collections. The classic functional trio—map, filter, and reduce—are all internal iteration patterns.
```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Callable, List, Optional, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class FunctionalCollection(list):
    """
    Collection with functional (internal iteration) methods.

    These methods are internal iterators: the collection
    controls traversal, client provides behavior.
    """

    def map(self, transform: Callable[[T], R]) -> 'FunctionalCollection[R]':
        """
        MAP: Transform each element.

        Mathematical: f: A → B applied to each element

        Collection iterates and applies transform.
        Returns new collection with transformed elements.

        Map preserves structure: same number of elements,
        possibly different types.
        """
        return FunctionalCollection(transform(x) for x in self)

    def filter(self, predicate: Callable[[T], bool]) -> 'FunctionalCollection[T]':
        """
        FILTER: Keep elements matching predicate.

        Mathematical: {x ∈ A | P(x)}

        Collection iterates and applies predicate.
        Returns subset of original elements.

        Filter preserves types: same element type,
        possibly fewer elements.
        """
        return FunctionalCollection(x for x in self if predicate(x))

    def reduce(self, reducer: Callable[[R, T], R], initial: R) -> R:
        """
        REDUCE: Combine all elements into single value.
        Also called: fold, aggregate, accumulate

        Mathematical: repeated application of binary operation

        Collection iterates and accumulates.
        Returns single value (any type).
        """
        return reduce(reducer, self, initial)

    # Additional functional methods

    def flat_map(
        self,
        transform: Callable[[T], 'FunctionalCollection[R]']
    ) -> 'FunctionalCollection[R]':
        """
        FLAT_MAP: Map then flatten.

        Each element transforms to a collection.
        Results are concatenated into single collection.

        Also called: bind, chain, selectMany
        """
        result = FunctionalCollection()
        for item in self:
            result.extend(transform(item))
        return result

    def find(self, predicate: Callable[[T], bool]) -> Optional[T]:
        """
        FIND: Return first matching element.

        Like filter, but returns single element or None.
        Short-circuits: stops at first match.
        """
        for item in self:
            if predicate(item):
                return item
        return None

    def all(self, predicate: Callable[[T], bool]) -> bool:
        """
        ALL: Check if all elements match predicate.
        Short-circuits: returns False on first non-match.
        """
        return all(predicate(x) for x in self)

    def any(self, predicate: Callable[[T], bool]) -> bool:
        """
        ANY: Check if any element matches predicate.
        Short-circuits: returns True on first match.
        """
        return any(predicate(x) for x in self)

    def partition(
        self,
        predicate: Callable[[T], bool]
    ) -> tuple['FunctionalCollection[T]', 'FunctionalCollection[T]']:
        """
        PARTITION: Split into matching and non-matching.
        Single pass that returns two collections.
        """
        matching = FunctionalCollection()
        non_matching = FunctionalCollection()
        for item in self:
            if predicate(item):
                matching.append(item)
            else:
                non_matching.append(item)
        return matching, non_matching


# Minimal stand-in types so the demo below runs
@dataclass
class Item:
    name: str
    price: float

@dataclass
class Order:
    id: int
    items: List[Item] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(i.price for i in self.items)


# Power of chaining internal iterators
def demo_functional_pipeline():
    """
    Functional pipelines read like a description of the transformation.

    Each step is an internal iteration. The collection handles
    how to traverse; the client just describes the transformation.
    """
    orders = FunctionalCollection([
        Order(id=1, items=[Item("Book", 29.99), Item("Pen", 2.99)]),
        Order(id=2, items=[Item("Laptop", 999.00)]),
        Order(id=3, items=[Item("Coffee", 4.99), Item("Mug", 12.99)]),
    ])

    # Complex query expressed as a pipeline
    result = (orders
              .filter(lambda o: o.total > 20)   # Orders over $20
              .flat_map(lambda o: o.items)      # All items from those orders
              .filter(lambda i: i.price > 10)   # Items over $10
              .map(lambda i: i.name.upper())    # Get uppercase names
              .reduce(lambda acc, n: f"{acc}, {n}" if acc else n, ""))  # Join

    print(f"Expensive items: {result}")
```

Notice how the pipeline reads almost like English: 'Filter orders over $20, get their items, filter items over $10, get uppercase names, join into string.' This declarative style is why internal iterators dominate in functional programming.
A common misconception is that internal iterators are always eager—that they process all elements immediately. Modern implementations often use lazy evaluation, which delays processing until results are actually needed.
Lazy internal iteration provides the declarative style of internal iteration with the efficiency benefits of external iteration (specifically, the ability to stop early).
```python
from typing import Callable, Generator, Iterator, Optional, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class LazyCollection:
    """
    Collection with lazy internal iteration.

    Operations return a new LazyCollection without computing results.
    Computation happens only when a terminal operation is called.
    """

    def __init__(self, source: Iterator):
        self._source = source
        self._operations: list = []  # Queued operations

    def map(self, transform: Callable[[T], R]) -> 'LazyCollection':
        """Queue a map operation (doesn't execute yet)."""
        def apply_map(iterator: Iterator) -> Generator:
            for item in iterator:
                yield transform(item)

        new_collection = LazyCollection(self._materialize())
        new_collection._operations.append(('map', apply_map))
        return new_collection

    def filter(self, predicate: Callable[[T], bool]) -> 'LazyCollection':
        """Queue a filter operation (doesn't execute yet)."""
        def apply_filter(iterator: Iterator) -> Generator:
            for item in iterator:
                if predicate(item):
                    yield item

        new_collection = LazyCollection(self._materialize())
        new_collection._operations.append(('filter', apply_filter))
        return new_collection

    def _materialize(self) -> Generator:
        """Create a generator that applies all queued operations lazily."""
        current: Iterator = self._source
        for op_name, op_func in self._operations:
            current = op_func(current)
        yield from current

    # Terminal operations - these force evaluation

    def first(self) -> Optional[T]:
        """
        Get first element (terminal - forces evaluation).
        Only computes elements until the first is found.
        """
        for item in self._materialize():
            return item
        return None

    def take(self, n: int) -> list:
        """
        Get first n elements (terminal).
        Only computes n elements, not the entire collection.
        """
        result = []
        for item in self._materialize():
            result.append(item)
            if len(result) >= n:
                break
        return result

    def to_list(self) -> list:
        """Materialize entire collection (terminal)."""
        return list(self._materialize())

    def for_each(self, action: Callable[[T], None]) -> None:
        """Apply action to each element (terminal)."""
        for item in self._materialize():
            action(item)

    def count(self) -> int:
        """Count elements (terminal)."""
        return sum(1 for _ in self._materialize())


# Demonstration of laziness
def demo_lazy_evaluation():
    """
    Lazy evaluation means work is only done when absolutely necessary.
    """
    def expensive_transform(x):
        print(f"  Transforming {x}...")
        return x * 2

    def check_predicate(x):
        print(f"  Checking {x}...")
        return x > 10

    source = iter(range(1, 1000000))  # A million numbers

    # Build the pipeline - NO WORK DONE YET
    print("Building pipeline...")
    lazy = (LazyCollection(source)
            .map(expensive_transform)
            .filter(check_predicate))
    print("Pipeline built (no computation yet)")

    # Get first matching element - minimal work
    print("Getting first result:")
    first = lazy.first()
    print(f"First matching: {first}")

    # Only processes elements until the first value > 10 is found!
    # Compare to eager: would process ALL million elements
    # before returning the first result


# Java Streams are a perfect example
java_example = """
// Java Stream API: Lazy internal iteration

List<String> result = orders.stream()     // Lazy source
    .filter(o -> o.getTotal() > 100)      // Lazy filter
    .map(Order::getCustomerName)          // Lazy map
    .sorted()                             // Lazy sort
    .limit(10)                            // Lazy limit
    .collect(Collectors.toList());        // Terminal - forces evaluation

// Only the minimum work is done to produce 10 results.
// If the source has millions of orders, we don't process all of them.
"""
```

Lazy internal iteration solves the early termination problem! Operations like first(), take(n), or anyMatch() only process elements until they have their answer. This gives you declarative syntax with practical efficiency.
One of the most powerful advantages of internal iteration is that the collection controls traversal. This means the collection can choose to traverse in parallel without any changes to client code.
```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar
import multiprocessing

T = TypeVar('T')
R = TypeVar('R')

class ParallelCollection(list):
    """
    Collection that can execute internal iterations in parallel.

    Client code doesn't change - the collection decides
    how to traverse.
    """

    def __init__(self, items=None, parallel=False, workers=None):
        super().__init__(items or [])
        self.parallel = parallel
        self.workers = workers or multiprocessing.cpu_count()

    def sequential(self) -> 'ParallelCollection':
        """Return sequential version of this collection."""
        return ParallelCollection(self, parallel=False)

    def parallel_stream(self) -> 'ParallelCollection':
        """Return parallel version of this collection."""
        return ParallelCollection(self, parallel=True, workers=self.workers)

    def map(self, transform: Callable[[T], R]) -> 'ParallelCollection':
        """
        Map with optional parallelization.
        Same API - client doesn't know if it's parallel!
        """
        if self.parallel:
            # Parallel execution
            with ThreadPoolExecutor(max_workers=self.workers) as executor:
                results = list(executor.map(transform, self))
        else:
            # Sequential execution
            results = [transform(x) for x in self]

        return ParallelCollection(results, parallel=self.parallel)

    def filter(self, predicate: Callable[[T], bool]) -> 'ParallelCollection':
        """Filter with optional parallelization."""
        if self.parallel:
            with ThreadPoolExecutor(max_workers=self.workers) as executor:
                # Check predicates in parallel
                mask = list(executor.map(predicate, self))
            results = [item for item, keep in zip(self, mask) if keep]
        else:
            results = [x for x in self if predicate(x)]

        return ParallelCollection(results, parallel=self.parallel)

    def reduce(self, reducer: Callable[[R, T], R], initial: R) -> R:
        """
        Reduce with optional parallelization.

        Parallel reduce is more complex: the reducer must be
        associative and `initial` must be an identity element
        (e.g., 0 for addition), because chunks are reduced
        independently and the partial results then combined.
        """
        if not self.parallel or len(self) < 1000:
            # Sequential for small collections or non-parallel
            result = initial
            for item in self:
                result = reducer(result, item)
            return result

        # Parallel: divide, reduce chunks, combine
        chunk_size = max(1, len(self) // self.workers)
        chunks = [
            self[i:i + chunk_size]
            for i in range(0, len(self), chunk_size)
        ]

        def reduce_chunk(chunk):
            result = initial
            for item in chunk:
                result = reducer(result, item)
            return result

        with ThreadPoolExecutor(max_workers=self.workers) as executor:
            partial_results = list(executor.map(reduce_chunk, chunks))

        # Combine partial results
        final = initial
        for partial in partial_results:
            final = reducer(final, partial)
        return final


# The power: same client code, parallel execution
def demo_parallel():
    items = ParallelCollection(range(1000000))

    # Sequential processing
    seq_result = (items
                  .sequential()
                  .map(lambda x: x ** 2)
                  .filter(lambda x: x % 7 == 0)
                  .reduce(lambda a, b: a + b, 0))

    # Parallel processing - SAME CODE, just .parallel_stream()
    par_result = (items
                  .parallel_stream()  # Only change!
                  .map(lambda x: x ** 2)
                  .filter(lambda x: x % 7 == 0)
                  .reduce(lambda a, b: a + b, 0))

    assert seq_result == par_result  # Same result!


# Java example - this is exactly how Java Streams work
java_parallel = """
// Java: Sequential to parallel is ONE method call

// Sequential
int sum = orders.stream()
    .filter(o -> o.getTotal() > 100)
    .mapToInt(Order::getItemCount)
    .sum();

// Parallel - SAME code, just add .parallel()
int sum = orders.stream()
    .parallel()  // <-- That's it!
    .filter(o -> o.getTotal() > 100)
    .mapToInt(Order::getItemCount)
    .sum();

// The collection handles thread pools, work distribution,
// and result aggregation. Client code is unchanged.
"""
```

With external iteration, the client controls the loop, so the client would need to explicitly manage parallelization. With internal iteration, the collection controls traversal, so it can parallelize transparently. This is a fundamental advantage of internal iterators.
Different languages emphasize different iteration styles. Understanding these preferences helps you write idiomatic code in each language.
| Language | Primary Style | Internal Iteration | Notes |
|---|---|---|---|
| Python | External (for loops) | map/filter/comprehensions | Comprehensions preferred over map/filter |
| Java (pre-8) | External (for loops) | Limited | Enhanced for loop was revolutionary |
| Java 8+ | Both | Streams API | Streams enable functional style |
| JavaScript | Both | Array methods | forEach/map/filter/reduce are idiomatic |
| Ruby | Internal | Blocks everywhere | each, map, select are idiomatic |
| Scala | Internal | Collections API | Functional style is default |
| Haskell | Internal | Functor/Monad | Everything is internal iteration |
| C++ | External | STL algorithms | Ranges in C++20 add internal style |
| Rust | Internal | Iterator trait | Iterators are lazy and chainable |
| Go | External | for range loops | No generics until 1.18 limited internal |
```typescript
// TypeScript/JavaScript: Both styles are common

const numbers = [1, 2, 3, 4, 5];

// External iteration (imperative)
for (const num of numbers) {
  console.log(num * 2);
}

// Internal iteration (functional)
numbers
  .filter(n => n % 2 === 0)
  .map(n => n * 2)
  .forEach(n => console.log(n));

// Modern JavaScript strongly favors internal iteration
// for data transformations, external for control-heavy logic
```

When in doubt, follow the conventions of your language. Use Ruby blocks in Ruby, Streams in Java 8+, and choose based on context in polyglot languages like Python and JavaScript.
We've explored the fundamental distinction between internal and external iteration:

- External iterators put the client in control: the client pulls elements, tracks state, and decides when to stop.
- Internal iterators invert that control: the client supplies a callback and the collection drives the traversal.
- External iteration makes early termination, stateful loops, and nested iteration easy; internal iteration makes chained transformations and transparent parallelism easy.
- The functional trio of map, filter, and reduce are internal iteration patterns, and lazy evaluation recovers early termination for internal iterators.
What's next:
In the final page, we'll explore real-world use cases and examples of the Iterator Pattern. You'll see how iterators are used in database cursors, file system traversal, network streams, and more. We'll also cover advanced patterns like bidirectional iterators, cursor-based pagination, and iterator decoration.
You now understand the fundamental distinction between internal and external iterators, when to choose each, and how functional programming patterns relate to internal iteration. Next, we'll see the Iterator Pattern applied to real-world systems.