The Iterator Pattern we've explored so far follows a specific model: the client controls iteration. The client asks 'is there more?', the client requests the next element, the client decides when to stop. This is called an external iterator—the iteration mechanism is external to the collection, controlled by the client.
But there's another approach. What if the collection controlled iteration? What if the client simply said 'here's what to do with each element' and the collection handled the traversal? This is an internal iterator—the iteration mechanism is internal to the collection.
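To make the contrast concrete, here is a minimal sketch of the same task (summing a list) driven both ways, using only plain Python:

```python
from functools import reduce

items = [1, 2, 3]

# External iteration: the client pulls each element via next()
# and controls the loop itself
it = iter(items)
total_external = 0
while True:
    try:
        total_external += next(it)
    except StopIteration:
        break

# Internal iteration: the client hands an operation to reduce(),
# which drives the traversal on the client's behalf
total_internal = reduce(lambda acc, x: acc + x, items, 0)

assert total_external == total_internal == 6
```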
This distinction is fundamental. It affects API design, control flow, error handling, and even which programming paradigm (imperative vs functional) feels more natural. Let's explore both approaches in depth.
By the end of this page, you will understand the difference between internal and external iterators, when to choose each approach, how control flow differs between them, and how modern functional programming patterns (map, filter, reduce) relate to internal iteration.
An external iterator (also called an active iterator or cursor) is what we've been building throughout this module. The client code explicitly requests elements one at a time and decides when to advance.
Key characteristics:
- The client explicitly requests each element (e.g., by calling next())
- The client decides when to advance, when to stop, and what happens between steps
- Iteration state lives in a cursor object the client holds
- Control flow is imperative: loops, break, continue, and early return all work naturally
```python
from typing import Iterator, List

# External Iterator: Client controls everything
# (should_pause, save_progress, wait_for_resume, needs_lookup,
#  process_lookup_collection, apply_result, and complex_processing
#  are illustrative placeholders, not real functions)

def process_with_external_iterator(items: List[str]) -> None:
    """
    Using an external iterator.

    The CLIENT (this function) controls:
    - When to get the iterator
    - When to advance (call next())
    - When to check for more elements
    - When to stop iterating
    - What to do between iterations
    """
    iterator = iter(items)  # Get the iterator

    while True:
        try:
            # CLIENT requests next element
            item = next(iterator)

            # CLIENT decides what to do
            print(f"Processing: {item}")

            # CLIENT can do anything between iterations
            if item == "STOP":
                print("Found stop signal, exiting early")
                break  # CLIENT controls when to stop

            # CLIENT can pause for external reasons
            if should_pause():
                save_progress(iterator)
                wait_for_resume()

            # CLIENT can interleave with other iterators
            if needs_lookup(item):
                # Process a completely different collection
                lookup_result = process_lookup_collection()
                apply_result(item, lookup_result)

        except StopIteration:
            break

    print("Iteration complete")


# More typical Python style
def external_with_for_loop(items: List[str]) -> Iterator[object]:
    """
    Python's for loop is still an external iterator.

    The loop body (client code) runs between each iterator step.
    Client can still break, continue, or return.
    """
    for item in items:
        if item.startswith("_"):
            continue  # Skip this one

        if item == "STOP":
            break  # Exit early

        result = complex_processing(item)
        if not result.success:
            return  # Ends the generator

        yield result  # Can even be part of a generator!
```

Think of an external iterator as a cursor or bookmark. It marks a position in the collection. The client moves the cursor forward (or backward, for bidirectional iterators). The collection itself is unaware of where the cursor points—it just provides the mechanism for movement.
An internal iterator (also called a passive iterator or callback-based iterator) inverts the control. Instead of the client pulling elements, the collection pushes elements to a callback function provided by the client.
Key characteristics:
- The client supplies a callback (often a lambda) describing what to do with each element
- The collection decides how and when to traverse its elements
- The client cannot directly pause, resume, or break out of the traversal
- Control flow is declarative: operations like for_each, map, filter, and reduce compose naturally
```python
from typing import Callable, Generic, List, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class InternalIteratorCollection(Generic[T]):
    """
    Collection with internal iteration support.

    The COLLECTION controls traversal.
    Client just provides the operation to perform.
    """

    def __init__(self):
        self._items: List[T] = []

    def add(self, item: T) -> None:
        self._items.append(item)

    def for_each(self, action: Callable[[T], None]) -> None:
        """
        Internal iterator: collection controls traversal.

        Client provides a callback function.
        Collection applies it to every element.

        The client has NO control over:
        - When iteration starts
        - The order of traversal
        - When to pause or resume
        - Early termination (without exceptions)
        """
        for item in self._items:
            action(item)  # Collection calls client's function

    def map(self, transform: Callable[[T], R]) -> 'InternalIteratorCollection[R]':
        """
        Transform each element, return new collection.

        Another form of internal iteration: client provides
        the transformation, collection handles traversal.
        """
        result = InternalIteratorCollection[R]()
        for item in self._items:
            result.add(transform(item))
        return result

    def filter(self, predicate: Callable[[T], bool]) -> 'InternalIteratorCollection[T]':
        """
        Keep only elements matching predicate.
        Collection iterates and decides what to include.
        """
        result = InternalIteratorCollection[T]()
        for item in self._items:
            if predicate(item):
                result.add(item)
        return result

    def reduce(self, initial: R, reducer: Callable[[R, T], R]) -> R:
        """
        Combine all elements into single value.
        Classic internal iteration pattern.
        """
        accumulator = initial
        for item in self._items:
            accumulator = reducer(accumulator, item)
        return accumulator


# Using internal iterators
def demo_internal_iteration():
    numbers = InternalIteratorCollection[int]()
    for n in [1, 2, 3, 4, 5]:
        numbers.add(n)

    # Internal iteration: just provide the action
    print("All numbers:")
    numbers.for_each(lambda x: print(f"  {x}"))

    # Chained internal iterations (functional style)
    result = (numbers
              .map(lambda x: x * 2)            # Double each: 2, 4, 6, 8, 10
              .filter(lambda x: x > 4)         # Keep if > 4: 6, 8, 10
              .reduce(0, lambda a, b: a + b))  # Sum them

    print(f"Result: {result}")  # Output: 24 (6 + 8 + 10)


# Python's built-in internal iterators
def python_internal_iterators():
    numbers = [1, 2, 3, 4, 5]

    # map() - internal iteration with transformation
    doubled = list(map(lambda x: x * 2, numbers))

    # filter() - internal iteration with selection
    evens = list(filter(lambda x: x % 2 == 0, numbers))

    # reduce() (from functools) - internal iteration with accumulation
    from functools import reduce
    total = reduce(lambda acc, x: acc + x, numbers, 0)

    # all() and any() - internal iteration with short-circuit
    all_positive = all(x > 0 for x in numbers)
    any_negative = any(x < 0 for x in numbers)
```

The most significant difference between internal and external iterators is who controls the loop. This has profound implications for what operations are easy or difficult.
```python
# External Iterator Control Flow
# (pseudocode sketch: imagine this inside a client function)

# Client is in a loop they control
iterator = collection.iterator()

while iterator.has_next():
    item = iterator.next()

    # Client can:

    # 1. Break early
    if found_what_i_need(item):
        break

    # 2. Skip items
    if not relevant(item):
        continue

    # 3. Track state across iterations
    count += 1
    if item > max_so_far:
        max_so_far = item

    # 4. Interleave with other iterators
    for related in lookup(item):
        process(item, related)

    # 5. Return from the function
    if error_condition(item):
        return None
```

```python
# Internal Iterator Control Flow

# Collection controls the loop
# Client provides callback
collection.for_each(
    lambda item: process(item))

# Client CANNOT easily:
# 1. Break early
#    (need exception or flag)
# 2. Skip items
#    (need filter() before)
# 3. Track state across calls
#    (need closure or class)
# 4. Interleave iterators
#    (callbacks don't nest well)
# 5. Return from outer function
#    (return only exits lambda)

# But CAN chain operations:
collection \
    .filter(relevant) \
    .map(transform) \
    .for_each(process)
```

The trade-off is clear: external iteration gives the client fine-grained control over every step of the loop, while internal iteration trades that control away for concise, composable operations.
```python
# Challenges with Internal Iterators

# Challenge 1: Early termination
# External - easy
for item in items:
    if is_target(item):
        result = item
        break

# Internal - awkward
result = None

def find_target(item):
    global result
    if is_target(item) and result is None:
        result = item

items.for_each(find_target)
# Still processes ALL items even after finding!

# Better: use next() with a generator expression
result = next((item for item in items if is_target(item)), None)

# Challenge 2: Stateful operations
# External - easy
running_total = 0
for item in items:
    running_total += item.value
    item.running_total = running_total

# Internal - needs closure
running_total = [0]  # Mutable container for closure

def add_running_total(item):
    running_total[0] += item.value
    item.running_total = running_total[0]

items.for_each(add_running_total)

# Challenge 3: Nested iteration
# External - natural
for order in orders:
    for item in order.items:
        if item.needs_restock:
            restock(item)

# Internal - non-obvious
def process_order(order):
    order.items.filter(lambda i: i.needs_restock) \
        .for_each(restock)

orders.for_each(process_order)

# Or with flat_map
orders.flat_map(lambda o: o.items) \
    .filter(lambda i: i.needs_restock) \
    .for_each(restock)
```

Internal iteration is an example of Inversion of Control (IoC). The collection 'calls back' into client code. This is powerful for frameworks but can make debugging harder—stack traces show library code calling your code, not the other way around.
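The "exception or flag" workaround for early termination can be sketched concretely. This is an illustrative pattern, not a standard library facility; the `for_each` helper here stands in for any callback-driven traversal like the ones above:

```python
# Private exception used purely as a control-flow signal
class _StopTraversal(Exception):
    pass

def for_each(items, action):
    """Minimal internal iterator: applies action to every element."""
    for item in items:
        action(item)

def find_first(items, predicate):
    """Find the first match by aborting the traversal with an exception."""
    found = []

    def check(item):
        if predicate(item):
            found.append(item)
            raise _StopTraversal()  # abort the traversal early

    try:
        for_each(items, check)
    except _StopTraversal:
        pass
    return found[0] if found else None

assert find_first([1, 3, 8, 2], lambda x: x > 5) == 8
assert find_first([1, 2], lambda x: x > 5) is None
```

The exception escapes the callback and unwinds through the collection's loop, which is exactly the kind of control inversion that makes this feel awkward compared to a plain `break`.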
Neither approach is universally better. The right choice depends on your specific needs:
| Requirement | External Iterator | Internal Iterator |
|---|---|---|
| Process every element | ✓ Works well | ✓ Ideal - simpler code |
| Early termination | ✓ Break statement | ⚠ Awkward, needs workaround |
| Track state across elements | ✓ Local variables | ⚠ Needs closures or classes |
| Nested iteration | ✓ Nested loops | ⚠ Nested callbacks complex |
| Chain transformations | ⚠ Intermediate collections | ✓ Fluent API chains |
| Parallel execution | ⚠ Client must parallelize | ✓ Collection can parallelize |
| Lazy evaluation | ✓ Generators work well | ✓ Can be lazy too |
| Compare across collections | ✓ Multiple cursors | ⚠ Complex with callbacks |
| Functional programming style | ⚠ Imperative by nature | ✓ Natural fit |
| Debugging | ✓ Linear control flow | ⚠ Callback stack traces |
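The "compare across collections" row deserves a quick illustration. With external iterators, lock-step comparison just means holding one cursor per collection and advancing both together (a sketch in plain Python; doing the same with callbacks would require buffering one side):

```python
def lists_equal(a, b):
    """Compare two iterables element-by-element using two cursors."""
    it_a, it_b = iter(a), iter(b)
    sentinel = object()
    while True:
        x = next(it_a, sentinel)
        y = next(it_b, sentinel)
        if x is sentinel or y is sentinel:
            # Equal only if both ran out at the same time
            return x is sentinel and y is sentinel
        if x != y:
            return False

assert lists_equal([1, 2, 3], [1, 2, 3])
assert not lists_equal([1, 2], [1, 2, 3])
```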
Internal iterators are the foundation of functional programming's approach to collections. The classic functional trio—map, filter, and reduce—are all internal iteration patterns.
```python
from dataclasses import dataclass, field
from functools import reduce
from typing import Callable, List, Optional, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class FunctionalCollection(list):
    """
    Collection with functional (internal iteration) methods.

    These methods are internal iterators: the collection
    controls traversal, client provides behavior.
    """

    def map(self, transform: Callable[[T], R]) -> 'FunctionalCollection[R]':
        """
        MAP: Transform each element.

        Mathematical: f: A → B applied to each element

        Collection iterates and applies transform.
        Returns new collection with transformed elements.

        Map preserves structure: same number of elements,
        possibly different types.
        """
        return FunctionalCollection(transform(x) for x in self)

    def filter(self, predicate: Callable[[T], bool]) -> 'FunctionalCollection[T]':
        """
        FILTER: Keep elements matching predicate.

        Mathematical: {x ∈ A | P(x)}

        Collection iterates and applies predicate.
        Returns subset of original elements.

        Filter preserves types: same element type,
        possibly fewer elements.
        """
        return FunctionalCollection(x for x in self if predicate(x))

    def reduce(self, reducer: Callable[[R, T], R], initial: R) -> R:
        """
        REDUCE: Combine all elements into single value.
        Also called: fold, aggregate, accumulate

        Mathematical: repeated application of binary operation

        Collection iterates and accumulates.
        Returns single value (any type).
        """
        return reduce(reducer, self, initial)

    # Additional functional methods

    def flat_map(
        self,
        transform: Callable[[T], 'FunctionalCollection[R]']
    ) -> 'FunctionalCollection[R]':
        """
        FLAT_MAP: Map then flatten.

        Each element transforms to a collection.
        Results are concatenated into single collection.

        Also called: bind, chain, selectMany
        """
        result = FunctionalCollection()
        for item in self:
            result.extend(transform(item))
        return result

    def find(self, predicate: Callable[[T], bool]) -> Optional[T]:
        """
        FIND: Return first matching element.

        Like filter, but returns single element or None.
        Short-circuits: stops at first match.
        """
        for item in self:
            if predicate(item):
                return item
        return None

    def all(self, predicate: Callable[[T], bool]) -> bool:
        """
        ALL: Check if all elements match predicate.
        Short-circuits: returns False on first non-match.
        """
        return all(predicate(x) for x in self)

    def any(self, predicate: Callable[[T], bool]) -> bool:
        """
        ANY: Check if any element matches predicate.
        Short-circuits: returns True on first match.
        """
        return any(predicate(x) for x in self)

    def partition(
        self,
        predicate: Callable[[T], bool]
    ) -> tuple['FunctionalCollection[T]', 'FunctionalCollection[T]']:
        """
        PARTITION: Split into matching and non-matching.
        Single pass that returns two collections.
        """
        matching = FunctionalCollection()
        non_matching = FunctionalCollection()
        for item in self:
            if predicate(item):
                matching.append(item)
            else:
                non_matching.append(item)
        return matching, non_matching


# Minimal stand-in types so the demo below runs
@dataclass
class Item:
    name: str
    price: float

@dataclass
class Order:
    id: int
    items: List[Item] = field(default_factory=list)

    @property
    def total(self) -> float:
        return sum(i.price for i in self.items)


# Power of chaining internal iterators
def demo_functional_pipeline():
    """
    Functional pipelines read like a description of the transformation.

    Each step is an internal iteration. The collection handles
    how to traverse; the client just describes the transformation.
    """
    orders = FunctionalCollection([
        Order(id=1, items=[Item("Book", 29.99), Item("Pen", 2.99)]),
        Order(id=2, items=[Item("Laptop", 999.00)]),
        Order(id=3, items=[Item("Coffee", 4.99), Item("Mug", 12.99)]),
    ])

    # Complex query expressed as a pipeline
    result = (orders
              .filter(lambda o: o.total > 20)   # Orders over $20
              .flat_map(lambda o: o.items)      # All items from those orders
              .filter(lambda i: i.price > 10)   # Items over $10
              .map(lambda i: i.name.upper())    # Get uppercase names
              .reduce(lambda acc, n: f"{acc}, {n}" if acc else n, ""))  # Join

    print(f"Expensive items: {result}")
```

Notice how the pipeline reads almost like English: 'Filter orders over $20, get their items, filter items over $10, get uppercase names, join into string.' This declarative style is why internal iterators dominate in functional programming.
A common misconception is that internal iterators are always eager—that they process all elements immediately. Modern implementations often use lazy evaluation, which delays processing until results are actually needed.
Lazy internal iteration provides the declarative style of internal iteration with the efficiency benefits of external iteration (specifically, the ability to stop early).
```python
from typing import Callable, Generator, Iterator, Optional, TypeVar

T = TypeVar('T')
R = TypeVar('R')

class LazyCollection:
    """
    Collection with lazy internal iteration.

    Operations return a new LazyCollection without computing results.
    Computation happens only when a terminal operation is called.
    """

    def __init__(self, source: Iterator):
        self._source = source
        self._operations: list = []  # Queued operations

    def map(self, transform: Callable[[T], R]) -> 'LazyCollection':
        """Queue a map operation (doesn't execute yet)."""
        def apply_map(iterator: Iterator) -> Generator:
            for item in iterator:
                yield transform(item)

        new_collection = LazyCollection(self._materialize())
        new_collection._operations.append(('map', apply_map))
        return new_collection

    def filter(self, predicate: Callable[[T], bool]) -> 'LazyCollection':
        """Queue a filter operation (doesn't execute yet)."""
        def apply_filter(iterator: Iterator) -> Generator:
            for item in iterator:
                if predicate(item):
                    yield item

        new_collection = LazyCollection(self._materialize())
        new_collection._operations.append(('filter', apply_filter))
        return new_collection

    def _materialize(self) -> Generator:
        """Create a generator that applies all queued operations lazily."""
        current: Iterator = self._source
        for op_name, op_func in self._operations:
            current = op_func(current)
        yield from current

    # Terminal operations - these force evaluation

    def first(self) -> Optional[T]:
        """
        Get first element (terminal - forces evaluation).
        Only computes elements until the first is found.
        """
        for item in self._materialize():
            return item
        return None

    def take(self, n: int) -> list:
        """
        Get first n elements (terminal).
        Only computes n elements, not the entire collection.
        """
        result = []
        for item in self._materialize():
            result.append(item)
            if len(result) >= n:
                break
        return result

    def to_list(self) -> list:
        """Materialize entire collection (terminal)."""
        return list(self._materialize())

    def for_each(self, action: Callable[[T], None]) -> None:
        """Apply action to each element (terminal)."""
        for item in self._materialize():
            action(item)

    def count(self) -> int:
        """Count elements (terminal)."""
        return sum(1 for _ in self._materialize())


# Demonstration of laziness
def demo_lazy_evaluation():
    """
    Lazy evaluation means work is only done when absolutely necessary.
    """
    def expensive_transform(x):
        print(f"  Transforming {x}...")
        return x * 2

    def check_predicate(x):
        print(f"  Checking {x}...")
        return x > 10

    source = iter(range(1, 1000000))  # A million numbers

    # Build the pipeline - NO WORK DONE YET
    print("Building pipeline...")
    lazy = (LazyCollection(source)
            .map(expensive_transform)
            .filter(check_predicate))
    print("Pipeline built (no computation yet)")

    # Get first matching element - minimal work
    print("Getting first result:")
    first = lazy.first()
    print(f"First matching: {first}")

    # Only processes elements until the first value > 10 is found!
    # Compare to eager: would process ALL million elements
    # before returning the first result


# Java Streams are a perfect example
java_example = """
// Java Stream API: Lazy internal iteration

List<String> result = orders.stream()     // Lazy source
    .filter(o -> o.getTotal() > 100)      // Lazy filter
    .map(Order::getCustomerName)          // Lazy map
    .sorted()                             // Lazy sort
    .limit(10)                            // Lazy limit
    .collect(Collectors.toList());        // Terminal - forces evaluation

// Only the minimum work is done to produce 10 results.
// If the source has millions of orders, we don't process all of them.
"""
```

Lazy internal iteration solves the early termination problem! Operations like first(), take(n), or anyMatch() only process elements until they have their answer. This gives you declarative syntax with practical efficiency.
One of the most powerful advantages of internal iteration is that the collection controls traversal. This means the collection can choose to traverse in parallel without any changes to client code.
```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar
import multiprocessing

T = TypeVar('T')
R = TypeVar('R')

class ParallelCollection(list):
    """
    Collection that can execute internal iterations in parallel.

    Client code doesn't change - the collection decides
    how to traverse.
    """

    def __init__(self, items=None, parallel=False, workers=None):
        super().__init__(items or [])
        self.parallel = parallel
        self.workers = workers or multiprocessing.cpu_count()

    def sequential(self) -> 'ParallelCollection':
        """Return sequential version of this collection."""
        return ParallelCollection(self, parallel=False)

    def parallel_stream(self) -> 'ParallelCollection':
        """Return parallel version of this collection."""
        return ParallelCollection(self, parallel=True, workers=self.workers)

    def map(self, transform: Callable[[T], R]) -> 'ParallelCollection':
        """
        Map with optional parallelization.
        Same API - client doesn't know if it's parallel!
        """
        if self.parallel:
            # Parallel execution
            with ThreadPoolExecutor(max_workers=self.workers) as executor:
                results = list(executor.map(transform, self))
        else:
            # Sequential execution
            results = [transform(x) for x in self]

        return ParallelCollection(results, parallel=self.parallel)

    def filter(self, predicate: Callable[[T], bool]) -> 'ParallelCollection':
        """Filter with optional parallelization."""
        if self.parallel:
            with ThreadPoolExecutor(max_workers=self.workers) as executor:
                # Check predicates in parallel
                mask = list(executor.map(predicate, self))
            results = [item for item, keep in zip(self, mask) if keep]
        else:
            results = [x for x in self if predicate(x)]

        return ParallelCollection(results, parallel=self.parallel)

    def reduce(self, reducer: Callable[[R, T], R], initial: R) -> R:
        """
        Reduce with optional parallelization.

        Parallel reduce is more complex: the reducer must be
        associative and `initial` must be an identity element
        (e.g., 0 for addition), because chunks are reduced
        independently and the partial results then combined.
        """
        if not self.parallel or len(self) < 1000:
            # Sequential for small collections or non-parallel
            result = initial
            for item in self:
                result = reducer(result, item)
            return result

        # Parallel: divide, reduce chunks, combine
        chunk_size = max(1, len(self) // self.workers)
        chunks = [
            self[i:i + chunk_size]
            for i in range(0, len(self), chunk_size)
        ]

        def reduce_chunk(chunk):
            result = initial
            for item in chunk:
                result = reducer(result, item)
            return result

        with ThreadPoolExecutor(max_workers=self.workers) as executor:
            partial_results = list(executor.map(reduce_chunk, chunks))

        # Combine partial results
        final = initial
        for partial in partial_results:
            final = reducer(final, partial)
        return final


# The power: same client code, parallel execution
def demo_parallel():
    items = ParallelCollection(range(1000000))

    # Sequential processing
    seq_result = (items
                  .sequential()
                  .map(lambda x: x ** 2)
                  .filter(lambda x: x % 7 == 0)
                  .reduce(lambda a, b: a + b, 0))

    # Parallel processing - SAME CODE, just .parallel_stream()
    par_result = (items
                  .parallel_stream()  # Only change!
                  .map(lambda x: x ** 2)
                  .filter(lambda x: x % 7 == 0)
                  .reduce(lambda a, b: a + b, 0))

    assert seq_result == par_result  # Same result!


# Java example - this is exactly how Java Streams work
java_parallel = """
// Java: Sequential to parallel is ONE method call

// Sequential
int sum = orders.stream()
    .filter(o -> o.getTotal() > 100)
    .mapToInt(Order::getItemCount)
    .sum();

// Parallel - SAME code, just add .parallel()
int sum = orders.stream()
    .parallel()  // <-- That's it!
    .filter(o -> o.getTotal() > 100)
    .mapToInt(Order::getItemCount)
    .sum();

// The collection handles thread pools, work distribution,
// and result aggregation. Client code is unchanged.
"""
```

With external iteration, the client controls the loop, so the client would need to explicitly manage parallelization. With internal iteration, the collection controls traversal, so it can parallelize transparently. This is a fundamental advantage of internal iterators.
Different languages emphasize different iteration styles. Understanding these preferences helps you write idiomatic code in each language.
| Language | Primary Style | Internal Iteration | Notes |
|---|---|---|---|
| Python | External (for loops) | map/filter/comprehensions | Comprehensions preferred over map/filter |
| Java (pre-8) | External (for loops) | Limited | Enhanced for loop was revolutionary |
| Java 8+ | Both | Streams API | Streams enable functional style |
| JavaScript | Both | Array methods | forEach/map/filter/reduce are idiomatic |
| Ruby | Internal | Blocks everywhere | each, map, select are idiomatic |
| Scala | Internal | Collections API | Functional style is default |
| Haskell | Internal | Functor/Monad | Everything is internal iteration |
| C++ | External | STL algorithms | Ranges in C++20 add internal style |
| Rust | Internal | Iterator trait | Iterators are lazy and chainable |
| Go | External | for range loops | No generics until 1.18 limited internal |
```typescript
// TypeScript/JavaScript: Both styles are common

const numbers = [1, 2, 3, 4, 5];

// External iteration (imperative)
for (const num of numbers) {
  console.log(num * 2);
}

// Internal iteration (functional)
numbers
  .filter(n => n % 2 === 0)
  .map(n => n * 2)
  .forEach(n => console.log(n));

// Modern JavaScript strongly favors internal iteration
// for data transformations, external for control-heavy logic
```

When in doubt, follow the conventions of your language. Use Ruby blocks in Ruby, Streams in Java 8+, and choose based on context in polyglot languages like Python and JavaScript.
We've explored the fundamental distinction between internal and external iteration:

- External iterators put the client in control: the client pulls elements, tracks state, and decides when to stop.
- Internal iterators invert that control: the client supplies a callback and the collection drives the traversal.
- External iteration makes early termination, stateful loops, and nested iteration easy; internal iteration makes chained transformations and transparent parallelism easy.
- The functional trio of map, filter, and reduce are internal iteration patterns, and lazy evaluation recovers early termination for internal iterators.
What's next:
In the final page, we'll explore real-world use cases and examples of the Iterator Pattern. You'll see how iterators are used in database cursors, file system traversal, network streams, and more. We'll also cover advanced patterns like bidirectional iterators, cursor-based pagination, and iterator decoration.
You now understand the fundamental distinction between internal and external iterators, when to choose each, and how functional programming patterns relate to internal iteration. Next, we'll see the Iterator Pattern applied to real-world systems.