Link State Routing - Learning Module

Loading content...

0/228

SPF Calculation

From Topology to Routes: The SPF Pipeline

The Shortest Path First (SPF) calculation is where theory meets practice in link state routing. After LSAs are flooded and the Link State Database is synchronized, each router independently executes Dijkstra's algorithm to compute the shortest path tree rooted at itself. This tree directly translates into the routing table entries that determine how packets are forwarded.

The term "SPF" is used interchangeably with "Dijkstra's algorithm" in routing contexts, though SPF specifically refers to the complete process: triggering, scheduling, graph construction, Dijkstra execution, and routing table population. Understanding this end-to-end pipeline is essential for optimizing convergence time and troubleshooting routing behavior.

What You Will Learn

By the end of this page, you will understand the complete SPF calculation pipeline from trigger to routing table, the SPF scheduling mechanism and throttling, how the shortest path tree is built and used, routing table population from SPF results, incremental SPF optimizations, and Equal-Cost Multi-Path (ECMP) computation.

The Complete SPF Calculation Pipeline

SPF calculation is not a simple function call—it's a carefully orchestrated process with multiple stages, each designed to balance computational efficiency against convergence speed.

Converting Mermaid diagram...

SPF Pipeline Stages
Stage	Input	Process	Output	Typical Time
Trigger Detection	LSA update	Compare with LSDB, detect change	SPF trigger event	<1 ms
Scheduling	Trigger event	Apply delays, check throttling	Scheduled SPF run	0-10 seconds
Graph Construction	LSDB contents	Parse LSAs, build adjacency list	Topology graph	5-50 ms
Dijkstra Execution	Topology graph	Run SPF algorithm	Shortest path tree	10-500 ms
Route Calculation	SPF tree	Extract next-hops, costs	RIB updates	5-50 ms
FIB Update	RIB changes	Program forwarding hardware	Active routes	10-100 ms

Total Convergence Impact:

The SPF calculation itself (Dijkstra's algorithm) is typically fast—measured in tens of milliseconds even for large networks. However, the scheduling delays and FIB programming often dominate total convergence time:

Small network (100 routers): SPF algorithm < 10 ms
Medium network (1,000 routers): SPF algorithm ≈ 50-100 ms
Large network (10,000 routers): SPF algorithm ≈ 200-500 ms

But with default SPF scheduling (5-second holdtime), actual convergence after a link failure is often 5-10 seconds regardless of network size.

SPF Triggering and Scheduling Mechanisms

Not every LSDB change requires a full SPF calculation. OSPF distinguishes between changes that affect the topology (require SPF) and changes that don't (can use faster partial recalculation).

Events That Trigger Full SPF:

New Router-LSA installed
Router-LSA modified (link added, removed, or cost changed)
Network-LSA added, removed, or modified
Changes affecting the SPF tree structure

Events That Don't Require Full SPF:

Type 3/4/5/7 LSA changes (only affect routes, not topology)
LSA age refresh without content change
Sequence number increment without topology change

OSPF SPF Scheduling Parameters
Parameter	Default	Tunable Range	Purpose
spf-delay (initial)	0 ms	0-600,000 ms	Time before first SPF after trigger
spf-holdtime	5000 ms	0-600,000 ms	Minimum interval between SPF runs
spf-max-wait	10,000 ms	0-600,000 ms	Maximum delay before SPF must run
lsa-arrival	1000 ms	0-600,000 ms	Min interval to accept same LSA
lsa-group-pacing	240 seconds	10-1,800 s	Interval for LSA group refresh

Exponential Backoff Scheduling:

Modern OSPF implementations use exponential backoff to balance fast initial convergence against CPU protection during network instability:

First trigger: SPF runs after spf-delay (often 0)
Second trigger within holdtime: SPF delays until spf-holdtime expires
Subsequent triggers: Double the wait time each time, up to spf-max-wait
Quiet period: After no triggers for spf-max-wait, reset to initial behavior

Example Timeline:

Time 0.0s: Link fails, first trigger
Time 0.0s: SPF runs immediately (spf-delay = 0)
Time 0.5s: Another link flaps, trigger
Time 5.0s: SPF runs (held for spf-holdtime = 5s)
Time 5.5s: More flapping, trigger
Time 15.0s: SPF runs (held for 10s = 2 × holdtime)
Time 15.5s: Flapping continues, trigger
Time 35.0s: SPF runs (held for 20s, capped at spf-max-wait)
... eventually stabilizes ...

Fast Convergence vs. Stability Trade-off

Reducing SPF timers (e.g., spf-delay 0, spf-holdtime 50ms) enables sub-second convergence but increases CPU usage during instability. A single flapping link can cause hundreds of SPF runs. Always tune timers based on your specific requirements and test thoroughly.

spf_scheduler.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
import time
import threading
from dataclasses import dataclass
from typing import Callable, Optional
from enum import Enum
 
class SPFState(Enum):
    """SPF scheduler state machine"""
    QUIET = "quiet"           # No recent triggers
    WAITING = "waiting"       # Trigger received, waiting to run
    RUNNING = "running"       # SPF currently executing
    HOLDDOWN = "holddown"     # Waiting for holdtime after run
 
@dataclass
class SPFTimers:
    """Configurable SPF timing parameters"""
    initial_delay_ms: int = 0        # spf-delay
    holdtime_ms: int = 5000          # spf-holdtime  
    max_wait_ms: int = 10000         # spf-max-wait
    
class SPFScheduler:
    """
    Implements OSPF SPF scheduling with exponential backoff.
    
    This scheduler ensures:
    1. Fast initial convergence (immediate first SPF)
    2. Protection against rapid re-computation (holdtime)
    3. Bounded backoff (max-wait)
    4. Guaranteed eventual convergence (pending triggers run)
    """
    
    def __init__(self, 
                 spf_callback: Callable[[], None],
                 timers: Optional[SPFTimers] = None):
        self.spf_callback = spf_callback
        self.timers = timers or SPFTimers()
        
        self.state = SPFState.QUIET
        self.last_spf_time: Optional[float] = None
        self.pending_trigger = False
        self.current_holdtime_ms = self.timers.holdtime_ms
        
        self._lock = threading.Lock()
        self._timer: Optional[threading.Timer] = None
        
        # Statistics
        self.stats = {
            "triggers_received": 0,
            "spf_runs": 0,
            "triggers_coalesced": 0,  # Triggers that didn't cause new SPF
        }
    
    def trigger_spf(self, reason: str = "unknown") -> None:
        """
        Signal that SPF calculation is needed.
        
        The actual SPF run is scheduled according to timing parameters,
        not executed immediately.
        """
        with self._lock:
            self.stats["triggers_received"] += 1
            print(f"[SPF] Trigger: {reason}")
            
            if self.state == SPFState.QUIET:
                # First trigger after quiet period - schedule immediately
                self._schedule_spf(self.timers.initial_delay_ms)
                self.current_holdtime_ms = self.timers.holdtime_ms
                
            elif self.state == SPFState.WAITING:
                # Already scheduled - this trigger will be handled
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
                print(f"[SPF] Coalescing trigger (already scheduled)")
                
            elif self.state == SPFState.HOLDDOWN:
                # In holddown - mark trigger pending for when holddown expires
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
                print(f"[SPF] Trigger pending (in holddown)")
                
            elif self.state == SPFState.RUNNING:
                # Currently running - will need to re-run
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
    
    def _schedule_spf(self, delay_ms: int) -> None:
        """Schedule SPF to run after specified delay"""
        self.state = SPFState.WAITING
        delay_seconds = delay_ms / 1000.0
        
        if self._timer:
            self._timer.cancel()
        
        print(f"[SPF] Scheduling SPF in {delay_ms}ms")
        self._timer = threading.Timer(delay_seconds, self._run_spf)
        self._timer.start()
    
    def _run_spf(self) -> None:
        """Execute SPF calculation"""
        with self._lock:
            self.state = SPFState.RUNNING
            self.pending_trigger = False
        
        start_time = time.time()
        print(f"[SPF] Starting SPF calculation...")
        
        try:
            # Execute the actual SPF
            self.spf_callback()
        except Exception as e:
            print(f"[SPF] Error during SPF: {e}")
        
        elapsed_ms = (time.time() - start_time) * 1000
        
        with self._lock:
            self.stats["spf_runs"] += 1
            self.last_spf_time = time.time()
            
            print(f"[SPF] SPF completed in {elapsed_ms:.1f}ms")
            
            # Transition to holddown
            self.state = SPFState.HOLDDOWN
            self._schedule_holddown()
    
    def _schedule_holddown(self) -> None:
        """Start holddown timer"""
        delay_seconds = self.current_holdtime_ms / 1000.0
        
        print(f"[SPF] Entering holddown ({self.current_holdtime_ms}ms)")
        
        self._timer = threading.Timer(delay_seconds, self._holddown_expired)
        self._timer.start()
        
        # Increase holdtime for next run (exponential backoff)
        self.current_holdtime_ms = min(
            self.current_holdtime_ms * 2,
            self.timers.max_wait_ms
        )
    
    def _holddown_expired(self) -> None:
        """Handle holddown timer expiration"""
        with self._lock:
            if self.pending_trigger:
                # Triggers arrived during holddown - run SPF again
                print(f"[SPF] Holddown expired, pending trigger exists")
                self._schedule_spf(0)  # Run immediately
            else:
                # No triggers - return to quiet state
                print(f"[SPF] Holddown expired, no pending triggers")
                self.state = SPFState.QUIET
                self.current_holdtime_ms = self.timers.holdtime_ms  # Reset
    
    def get_stats(self) -> dict:
        """Return scheduling statistics"""
        with self._lock:
            return dict(self.stats)
 
# Example usage
def mock_spf_calculation():
    """Simulated SPF calculation"""
    time.sleep(0.05)  # Simulate 50ms computation
 
scheduler = SPFScheduler(
    spf_callback=mock_spf_calculation,
    timers=SPFTimers(
        initial_delay_ms=0,
        holdtime_ms=1000,  # 1 second for demo
        max_wait_ms=5000
    )
)
 
# Simulate rapid topology changes
scheduler.trigger_spf("link_failure")
time.sleep(0.1)
scheduler.trigger_spf("neighbor_lost")
time.sleep(0.1)
scheduler.trigger_spf("lsa_update")
 
# Wait for processing
time.sleep(3)
print(f"\n[SPF] Statistics: {scheduler.get_stats()}")

Building the Shortest Path Tree

The output of Dijkstra's algorithm is the Shortest Path Tree (SPT), a spanning tree rooted at the computing router where every path from the root to any node represents the shortest path to that node. This tree is the fundamental structure from which routing table entries are derived.

Converting Mermaid diagram...

Key SPT Properties:

Rooted at Computing Router: The SPT is unique to each router; R1's tree differs from R2's tree.
Includes All Reachable Nodes: Every router and network in the LSDB appears in the tree (if reachable).
Path Optimality: The path from root to any node is guaranteed to be optimal (shortest cost).
Predecessor-Based Structure: Each node (except root) has exactly one parent, representing the next hop back toward the root.

Building the SPT:

During Dijkstra's algorithm execution, the predecessor relationship is recorded:

When updating d[v] through u:
    predecessor[v] = u

After completion, following predecessors from any node back to the root traces the shortest path. For routing, we need the reverse—the first hop away from the root toward each destination.

SPT Analysis for Router R1
Destination	Total Cost	Path	Next-Hop from R1	Outgoing Interface
R1	0	R1
R2	10	R1 → R2	R2	eth0
R3	15	R1 → R2 → R3	R2	eth0
R4	25	R1 → R2 → R3 → R4	R2	eth0
R5	30	R1 → R2 → R3 → R4 → R5	R2	eth0
10.1.0.0/24	11	R1 → R2 → [stub]	R2	eth0

Next-Hop Derivation

To find the next-hop for destination D from router S:

Start at D in the SPT
Follow predecessors back toward S
Stop when the predecessor is S itself
The current node is the next-hop

In the example, for R1 to reach R5: R5 ← R4 ← R3 ← R2 ← R1. The next-hop is R2 (predecessor of R3 whose predecessor is R1).

Populating the Routing Table from SPT

The Shortest Path Tree is an intermediate result—the ultimate goal is populating the routing table (RIB - Routing Information Base) with entries that determine how packets are forwarded. OSPF processes the SPT to generate routes for all reachable destinations.

Route Types from SPF:

OSPF generates different route types based on the LSA source:

Route Type	Source	Priority	Example
Intra-Area (O)	Type 1, 2 LSAs	Highest	Routes within the area
Inter-Area (O IA)	Type 3 LSAs	Medium	Routes from other areas via ABR
External Type 1 (E1)	Type 5 LSA	Lower	External routes, metric added to OSPF cost
External Type 2 (E2)	Type 5 LSA	Lowest	External routes, fixed external metric
NSSA Type 1 (N1)	Type 7 LSA	Variable	NSSA external, metric added
NSSA Type 2 (N2)	Type 7 LSA	Variable	NSSA external, fixed metric

Intra-area routes are always preferred over inter-area, which are preferred over external.

route_calculation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from enum import IntEnum
 
class OSPFRouteType(IntEnum):
    """OSPF route types in preference order"""
    INTRA_AREA = 1      # O     - Within area (Type 1, 2 LSAs)
    INTER_AREA = 2      # O IA  - From other areas (Type 3 LSAs)
    EXTERNAL_TYPE1 = 3  # O E1  - External with additive metric
    EXTERNAL_TYPE2 = 4  # O E2  - External with fixed metric
    NSSA_TYPE1 = 5      # O N1  - NSSA external additive
    NSSA_TYPE2 = 6      # O N2  - NSSA external fixed
 
@dataclass
class OSPFRoute:
    """Represents an OSPF route in the routing table"""
    destination: str        # Network prefix (CIDR notation)
    next_hops: List[str]    # Next-hop IP addresses (multiple for ECMP)
    interfaces: List[str]   # Outgoing interfaces
    cost: int               # Total cost to destination
    route_type: OSPFRouteType
    area_id: Optional[str]  # Source area for inter-area routes
    external_metric: Optional[int]  # For external routes
    tag: Optional[int]      # Route tag for policy
    
    def __lt__(self, other: 'OSPFRoute') -> bool:
        """Route comparison for preference (lower is better)"""
        # First: route type preference
        if self.route_type != other.route_type:
            return self.route_type < other.route_type
        
        # Second: cost preference
        if self.route_type in (OSPFRouteType.EXTERNAL_TYPE2, 
                                OSPFRouteType.NSSA_TYPE2):
            # E2/N2: compare external metric, then OSPF cost to ASBR
            if self.external_metric != other.external_metric:
                return self.external_metric < other.external_metric
        
        return self.cost < other.cost
 
class RoutingTableBuilder:
    """
    Builds OSPF routing table from SPF results.
    
    Takes the Shortest Path Tree and LSDB as input,
    processes all reachable destinations, and generates
    routing table entries with proper preference.
    """
    
    def __init__(self, router_id: str, area_id: str):
        self.router_id = router_id
        self.area_id = area_id
        self.routes: Dict[str, OSPFRoute] = {}
        
    def add_intra_area_routes(self,
                               spt: Dict[str, Tuple[int, str, str]],
                               stub_networks: Dict[str, Tuple[str, int, str]]
                               ) -> None:
        """
        Add intra-area routes from SPT.
        
        Args:
            spt: {destination: (cost, next_hop, interface)}
            stub_networks: {network: (attached_router, cost, mask)}
        """
        # Routes to routers (transit to reach stub networks)
        for dest_router, (cost, next_hop, interface) in spt.items():
            if dest_router == self.router_id:
                continue
            
            # The router's loopback/RID is reachable
            route = OSPFRoute(
                destination=f"{dest_router}/32",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=cost,
                route_type=OSPFRouteType.INTRA_AREA,
                area_id=self.area_id,
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
        
        # Routes to stub networks advertised by routers
        for network, (attached_router, stub_cost, mask) in stub_networks.items():
            if attached_router not in spt:
                continue  # Router not reachable
            
            router_cost, next_hop, interface = spt[attached_router]
            total_cost = router_cost + stub_cost
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop] if attached_router != self.router_id else [],
                interfaces=[interface] if attached_router != self.router_id else ["local"],
                cost=total_cost,
                route_type=OSPFRouteType.INTRA_AREA,
                area_id=self.area_id,
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
    
    def add_inter_area_routes(self,
                               summary_lsas: List[dict],
                               spt: Dict[str, Tuple[int, str, str]]
                               ) -> None:
        """
        Add inter-area routes from Type 3 Summary LSAs.
        
        For each Summary LSA:
        1. Find the advertising ABR in SPT
        2. Add ABR cost + advertised metric
        3. Use ABR as next-hop
        """
        for lsa in summary_lsas:
            abr_id = lsa['advertising_router']
            network = lsa['link_state_id']
            mask = lsa['netmask']
            metric = lsa['metric']
            
            if abr_id not in spt:
                continue  # ABR not reachable
            
            abr_cost, next_hop, interface = spt[abr_id]
            total_cost = abr_cost + metric
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=total_cost,
                route_type=OSPFRouteType.INTER_AREA,
                area_id=lsa.get('source_area'),
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
    
    def add_external_routes(self,
                            external_lsas: List[dict],
                            spt: Dict[str, Tuple[int, str, str]],
                            asbr_paths: Dict[str, Tuple[int, str, str]]
                            ) -> None:
        """
        Add external routes from Type 5 AS-External LSAs.
        
        External Type 1: total cost = cost to ASBR + external metric
        External Type 2: total cost = external metric only (ASBR cost as tiebreaker)
        """
        for lsa in external_lsas:
            asbr_id = lsa['advertising_router']
            network = lsa['link_state_id']
            mask = lsa['netmask']
            external_metric = lsa['metric']
            metric_type = lsa.get('metric_type', 2)  # 1 or 2
            forwarding_address = lsa.get('forwarding_address')
            tag = lsa.get('tag')
            
            # Find path to ASBR (may be in different area)
            if asbr_id in spt:
                asbr_cost, next_hop, interface = spt[asbr_id]
            elif asbr_id in asbr_paths:
                asbr_cost, next_hop, interface = asbr_paths[asbr_id]
            else:
                continue  # ASBR not reachable
            
            # Calculate cost based on metric type
            if metric_type == 1:
                cost = asbr_cost + external_metric
                route_type = OSPFRouteType.EXTERNAL_TYPE1
            else:
                cost = asbr_cost  # For tiebreaking
                route_type = OSPFRouteType.EXTERNAL_TYPE2
            
            # Handle forwarding address if set
            if forwarding_address and forwarding_address != "0.0.0.0":
                # Route through forwarding address instead of ASBR
                # (requires forwarding address to be reachable intra-area)
                pass  # Simplified: use ASBR path
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=cost,
                route_type=route_type,
                area_id=None,
                external_metric=external_metric,
                tag=tag
            )
            self._add_or_update_route(route)
    
    def _add_or_update_route(self, new_route: OSPFRoute) -> None:
        """Add route, handling preference and ECMP"""
        existing = self.routes.get(new_route.destination)
        
        if existing is None:
            self.routes[new_route.destination] = new_route
            
        elif new_route < existing:
            # New route is better - replace
            self.routes[new_route.destination] = new_route
            
        elif not (existing < new_route):
            # Equal cost - add for ECMP
            if new_route.next_hops[0] not in existing.next_hops:
                existing.next_hops.extend(new_route.next_hops)
                existing.interfaces.extend(new_route.interfaces)
    
    def _mask_to_prefix(self, mask: str) -> int:
        """Convert subnet mask to prefix length"""
        octets = [int(x) for x in mask.split('.')]
        binary = ''.join(format(o, '08b') for o in octets)
        return binary.count('1')
    
    def get_routing_table(self) -> List[OSPFRoute]:
        """Return sorted routing table"""
        return sorted(self.routes.values(), 
                     key=lambda r: (r.route_type, r.cost))
 
# Example output format
print("""
OSPF Routing Table for Router 1.1.1.1:
 
Codes: O - OSPF intra-area, IA - OSPF inter-area
       E1 - OSPF external type 1, E2 - OSPF external type 2
       N1 - OSPF NSSA type 1, N2 - OSPF NSSA type 2
 
Destination        Next-Hop       Interface  Cost  Type
--------------     -----------    ---------  ----  ----
10.0.0.0/8         10.1.1.2       eth0       11    O
10.1.0.0/24        10.1.1.2       eth0       10    O
10.2.0.0/24        10.1.1.2       eth0       15    O
192.168.0.0/16     10.1.1.2       eth0       110   O IA
172.16.0.0/12      10.1.1.2       eth0       150   O E1
0.0.0.0/0          10.1.1.2       eth0       1     O E2
""")

Equal-Cost Multi-Path (ECMP)

When multiple paths to a destination have identical costs, Equal-Cost Multi-Path (ECMP) allows routers to load-balance traffic across all optimal paths. This is a powerful feature of link state routing that distance vector protocols traditionally couldn't support.

Detecting Equal-Cost Paths:

During Dijkstra's algorithm, when we find a path to node v with the same cost as the current best:

if new_distance == distances[neighbor]:
    # Equal cost path found!
    predecessors[neighbor].append(current_node)  # Track multiple
elif new_distance < distances[neighbor]:
    predecessors[neighbor] = [current_node]  # Replace with new best

ECMP in Practice:

ECMP Configuration and Behavior
Aspect	Description	Typical Value
Maximum Paths	Maximum ECMP paths per destination	4-16 (platform dependent)
Load Balancing	How traffic is distributed	Per-flow hash (5-tuple)
Hash Algorithm	Determines flow-to-path mapping	CRC, XOR, or proprietary
Unequal Bandwidth	Handling paths with different capacity	May need traffic engineering

Converting Mermaid diagram...

Per-Flow vs. Per-Packet Load Balancing

Modern routers use per-flow (not per-packet) load balancing to preserve packet ordering within TCP sessions. A hash of the 5-tuple (src IP, dst IP, protocol, src port, dst port) determines which ECMP path each flow uses. This ensures all packets of a single connection traverse the same path.

Benefits of ECMP:

Increased Bandwidth: Aggregate capacity of multiple paths
Resilience: Failure of one path doesn't require reconvergence—traffic shifts automatically
Better Utilization: Prevents single-path bottlenecks
Faster Failover: Hardware can detect failure and redirect within milliseconds

Limitations:

Unequal Path Capacity: If paths have different bandwidths, equal load distribution wastes capacity
Polarization: Poor hash functions can cause uneven distribution
Troubleshooting Complexity: Packets may take different paths, complicating trace analysis

Incremental SPF and Optimizations

Full SPF recalculation examines the entire topology graph—wasteful when only a small portion has changed. Incremental SPF (iSPF) optimizes for partial topology changes by recomputing only affected portions of the tree.

SPF Optimization Techniques
Technique	When Applied	Savings	Complexity
Full SPF	Always valid	None (baseline)	O((V+E) log V)
Partial Route Calc (PRC)	Leaf/stub network changes	Skip Dijkstra, update routes only	O(changed routes)
Incremental SPF (iSPF)	Limited topology changes	Recompute affected subtree only	O(affected portion × log V)
Incremental RIB Update	SPT unchanged, costs changed	Skip tree building	O(affected routes)

Partial Route Calculation (PRC):

When changes affect only "leaf" information (stub networks, external routes) without changing the SPT structure, OSPF can skip Dijkstra entirely:

Type 3/4/5/7 LSA changes → PRC only
Stub network addition/removal from Router-LSA → PRC if router is in SPT
Metric change for stub network → PRC

Incremental SPF (iSPF):

For topology changes that affect the SPT but in a localized way:

Identify the "affected area" of the tree
Prune and regrow only that portion
Leave unaffected subtrees intact

Example:

If link R3-R4 fails in a 1000-router network where only 50 routers have routes affected:

Full SPF: Examine all 1000 routers and their links
iSPF: Examine only the 50-router subtree rooted at R3

Potential speedup: 20x for this scenario.

When iSPF Helps Most

Incremental SPF provides the largest benefits when:

• Network is large (1000+ routers) • Changes are localized (single link/node failures) • SPT structure is mostly stable

For small networks or core topology changes, full SPF is often equally fast due to optimization overhead.

SPF Performance Monitoring

Monitoring SPF performance is essential for maintaining network health and meeting convergence SLAs. Excessive SPF runs or long computation times indicate problems requiring attention.

spf_monitoring_commands

! View SPF statistics
Router# show ip ospf statistics
 
OSPF Router with ID (1.1.1.1) (Process ID 1)
 
SPF Statistics:
  SPF algorithm executed 47 times
  Last SPF ran 00:05:23 ago
  Last SPF duration: 15 msec
  
SPF calculation timers:
  Current holdtime: 5000 msec
  Last delay: 0 msec
 
SPF Log (last 10 runs):
=================================================
Time            Duration   Reason
=================================================
Jan 16 15:30:00    15ms   Router LSA received
Jan 16 15:29:55    12ms   Link up (GigabitEthernet0/1)
Jan 16 15:25:00    18ms   Neighbor 2.2.2.2 Full
Jan 16 15:24:55    10ms   Router LSA aged out
Jan 16 14:55:00    14ms   Periodic refresh
...
 
! View OSPF process CPU usage
Router# show processes cpu | include OSPF
  45    12.0%   5.0%  OSPF-1 Hello
  46     3.2%   1.5%  OSPF-1 Router
  
! View SPF scheduling
Router# show ip ospf spf-schedule
 
OSPF Router with ID 1.1.1.1 (Process ID 1)
 
SPF schedule delay: 0 msec
SPF Hold time: 5000 msec
SPF Max wait: 10000 msec
 
Current SPF phase: QUIET
Time since last SPF: 325 seconds
SPF runs in last minute: 0
SPF runs in last hour: 3

SPF Health Indicators
Metric	Healthy Range	Warning Signs	Action
SPF Duration	< 100ms	500ms consistently	Check LSDB size, CPU
SPF Runs/Hour	< 10	100	Investigate topology instability
Holdtime Escalation	Returns to base	Stuck at max-wait	Link flapping, fix physical
CPU Usage	< 30% during SPF	80%	Tune timers, add areas

Summary: SPF Calculation

We have explored the complete SPF calculation pipeline—from trigger detection through Dijkstra execution to routing table population. This understanding is essential for optimizing convergence time and troubleshooting routing behavior.

Key Takeaways

•SPF is a complete pipeline — Triggering, scheduling, graph construction, Dijkstra execution, and route installation work together.
•Scheduling protects stability — Exponential backoff prevents CPU exhaustion during network instability while allowing fast initial convergence.
•The SPT is the core output — Dijkstra produces the Shortest Path Tree from which all routes are derived.
•Route types have preferences — Intra-area routes are preferred over inter-area, which are preferred over external.
•ECMP provides resilience and bandwidth — Equal-cost paths enable load balancing and fast failover without reconvergence.
•Incremental SPF optimizes for partial changes — PRC and iSPF skip unnecessary computation when topology changes are localized.
•Monitoring enables proactive management — SPF statistics reveal network health and convergence performance.

What's Next:

The final page examines Convergence—the measure of how quickly the network stabilizes after topology changes. We'll explore convergence components, optimization techniques, and the trade-offs between fast convergence and network stability.

Page Complete

You now understand the SPF calculation pipeline in depth—from the mathematical algorithm through practical implementation details. This knowledge enables you to tune OSPF for optimal convergence, troubleshoot slow convergence issues, and design networks that meet stringent recovery time objectives.

SPF Calculation

From Topology to Routes: The SPF Pipeline

What You Will Learn

The Complete SPF Calculation Pipeline

SPF calculation is not a simple function call—it's a carefully orchestrated process with multiple stages, each designed to balance computational efficiency against convergence speed.

Converting Mermaid diagram...

SPF Pipeline Stages
Stage	Input	Process	Output	Typical Time
Trigger Detection	LSA update	Compare with LSDB, detect change	SPF trigger event	<1 ms
Scheduling	Trigger event	Apply delays, check throttling	Scheduled SPF run	0-10 seconds
Graph Construction	LSDB contents	Parse LSAs, build adjacency list	Topology graph	5-50 ms
Dijkstra Execution	Topology graph	Run SPF algorithm	Shortest path tree	10-500 ms
Route Calculation	SPF tree	Extract next-hops, costs	RIB updates	5-50 ms
FIB Update	RIB changes	Program forwarding hardware	Active routes	10-100 ms

Total Convergence Impact:

Small network (100 routers): SPF algorithm < 10 ms
Medium network (1,000 routers): SPF algorithm ≈ 50-100 ms
Large network (10,000 routers): SPF algorithm ≈ 200-500 ms

But with default SPF scheduling (5-second holdtime), actual convergence after a link failure is often 5-10 seconds regardless of network size.

SPF Triggering and Scheduling Mechanisms

Not every LSDB change requires a full SPF calculation. OSPF distinguishes between changes that affect the topology (require SPF) and changes that don't (can use faster partial recalculation).

Events That Trigger Full SPF:

New Router-LSA installed
Router-LSA modified (link added, removed, or cost changed)
Network-LSA added, removed, or modified
Changes affecting the SPF tree structure

Events That Don't Require Full SPF:

Type 3/4/5/7 LSA changes (only affect routes, not topology)
LSA age refresh without content change
Sequence number increment without topology change

OSPF SPF Scheduling Parameters
Parameter	Default	Tunable Range	Purpose
spf-delay (initial)	0 ms	0-600,000 ms	Time before first SPF after trigger
spf-holdtime	5000 ms	0-600,000 ms	Minimum interval between SPF runs
spf-max-wait	10,000 ms	0-600,000 ms	Maximum delay before SPF must run
lsa-arrival	1000 ms	0-600,000 ms	Min interval to accept same LSA
lsa-group-pacing	240 seconds	10-1,800 s	Interval for LSA group refresh

Exponential Backoff Scheduling:

Modern OSPF implementations use exponential backoff to balance fast initial convergence against CPU protection during network instability:

First trigger: SPF runs after spf-delay (often 0)
Second trigger within holdtime: SPF delays until spf-holdtime expires
Subsequent triggers: Double the wait time each time, up to spf-max-wait
Quiet period: After no triggers for spf-max-wait, reset to initial behavior

Example Timeline:

Time 0.0s: Link fails, first trigger
Time 0.0s: SPF runs immediately (spf-delay = 0)
Time 0.5s: Another link flaps, trigger
Time 5.0s: SPF runs (held for spf-holdtime = 5s)
Time 5.5s: More flapping, trigger
Time 15.0s: SPF runs (held for 10s = 2 × holdtime)
Time 15.5s: Flapping continues, trigger
Time 35.0s: SPF runs (held for 20s, capped at spf-max-wait)
... eventually stabilizes ...

Fast Convergence vs. Stability Trade-off

spf_scheduler.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
import time
import threading
from dataclasses import dataclass
from typing import Callable, Optional
from enum import Enum
 
class SPFState(Enum):
    """SPF scheduler state machine"""
    QUIET = "quiet"           # No recent triggers
    WAITING = "waiting"       # Trigger received, waiting to run
    RUNNING = "running"       # SPF currently executing
    HOLDDOWN = "holddown"     # Waiting for holdtime after run
 
@dataclass
class SPFTimers:
    """Configurable SPF timing parameters"""
    initial_delay_ms: int = 0        # spf-delay
    holdtime_ms: int = 5000          # spf-holdtime  
    max_wait_ms: int = 10000         # spf-max-wait
    
class SPFScheduler:
    """
    Implements OSPF SPF scheduling with exponential backoff.
    
    This scheduler ensures:
    1. Fast initial convergence (immediate first SPF)
    2. Protection against rapid re-computation (holdtime)
    3. Bounded backoff (max-wait)
    4. Guaranteed eventual convergence (pending triggers run)
    """
    
    def __init__(self, 
                 spf_callback: Callable[[], None],
                 timers: Optional[SPFTimers] = None):
        self.spf_callback = spf_callback
        self.timers = timers or SPFTimers()
        
        self.state = SPFState.QUIET
        self.last_spf_time: Optional[float] = None
        self.pending_trigger = False
        self.current_holdtime_ms = self.timers.holdtime_ms
        
        self._lock = threading.Lock()
        self._timer: Optional[threading.Timer] = None
        
        # Statistics
        self.stats = {
            "triggers_received": 0,
            "spf_runs": 0,
            "triggers_coalesced": 0,  # Triggers that didn't cause new SPF
        }
    
    def trigger_spf(self, reason: str = "unknown") -> None:
        """
        Signal that SPF calculation is needed.
        
        The actual SPF run is scheduled according to timing parameters,
        not executed immediately.
        """
        with self._lock:
            self.stats["triggers_received"] += 1
            print(f"[SPF] Trigger: {reason}")
            
            if self.state == SPFState.QUIET:
                # First trigger after quiet period - schedule immediately
                self._schedule_spf(self.timers.initial_delay_ms)
                self.current_holdtime_ms = self.timers.holdtime_ms
                
            elif self.state == SPFState.WAITING:
                # Already scheduled - this trigger will be handled
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
                print(f"[SPF] Coalescing trigger (already scheduled)")
                
            elif self.state == SPFState.HOLDDOWN:
                # In holddown - mark trigger pending for when holddown expires
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
                print(f"[SPF] Trigger pending (in holddown)")
                
            elif self.state == SPFState.RUNNING:
                # Currently running - will need to re-run
                self.pending_trigger = True
                self.stats["triggers_coalesced"] += 1
    
    def _schedule_spf(self, delay_ms: int) -> None:
        """Schedule SPF to run after specified delay"""
        self.state = SPFState.WAITING
        delay_seconds = delay_ms / 1000.0
        
        if self._timer:
            self._timer.cancel()
        
        print(f"[SPF] Scheduling SPF in {delay_ms}ms")
        self._timer = threading.Timer(delay_seconds, self._run_spf)
        self._timer.start()
    
    def _run_spf(self) -> None:
        """Execute SPF calculation"""
        with self._lock:
            self.state = SPFState.RUNNING
            self.pending_trigger = False
        
        start_time = time.time()
        print(f"[SPF] Starting SPF calculation...")
        
        try:
            # Execute the actual SPF
            self.spf_callback()
        except Exception as e:
            print(f"[SPF] Error during SPF: {e}")
        
        elapsed_ms = (time.time() - start_time) * 1000
        
        with self._lock:
            self.stats["spf_runs"] += 1
            self.last_spf_time = time.time()
            
            print(f"[SPF] SPF completed in {elapsed_ms:.1f}ms")
            
            # Transition to holddown
            self.state = SPFState.HOLDDOWN
            self._schedule_holddown()
    
    def _schedule_holddown(self) -> None:
        """Start holddown timer"""
        delay_seconds = self.current_holdtime_ms / 1000.0
        
        print(f"[SPF] Entering holddown ({self.current_holdtime_ms}ms)")
        
        self._timer = threading.Timer(delay_seconds, self._holddown_expired)
        self._timer.start()
        
        # Increase holdtime for next run (exponential backoff)
        self.current_holdtime_ms = min(
            self.current_holdtime_ms * 2,
            self.timers.max_wait_ms
        )
    
    def _holddown_expired(self) -> None:
        """Handle holddown timer expiration"""
        with self._lock:
            if self.pending_trigger:
                # Triggers arrived during holddown - run SPF again
                print(f"[SPF] Holddown expired, pending trigger exists")
                self._schedule_spf(0)  # Run immediately
            else:
                # No triggers - return to quiet state
                print(f"[SPF] Holddown expired, no pending triggers")
                self.state = SPFState.QUIET
                self.current_holdtime_ms = self.timers.holdtime_ms  # Reset
    
    def get_stats(self) -> dict:
        """Return scheduling statistics"""
        with self._lock:
            return dict(self.stats)
 
# Example usage
def mock_spf_calculation():
    """Simulated SPF calculation"""
    time.sleep(0.05)  # Simulate 50ms computation
 
scheduler = SPFScheduler(
    spf_callback=mock_spf_calculation,
    timers=SPFTimers(
        initial_delay_ms=0,
        holdtime_ms=1000,  # 1 second for demo
        max_wait_ms=5000
    )
)
 
# Simulate rapid topology changes
scheduler.trigger_spf("link_failure")
time.sleep(0.1)
scheduler.trigger_spf("neighbor_lost")
time.sleep(0.1)
scheduler.trigger_spf("lsa_update")
 
# Wait for processing
time.sleep(3)
print(f"\n[SPF] Statistics: {scheduler.get_stats()}")

Building the Shortest Path Tree

Converting Mermaid diagram...

Key SPT Properties:

Rooted at Computing Router: The SPT is unique to each router; R1's tree differs from R2's tree.
Includes All Reachable Nodes: Every router and network in the LSDB appears in the tree (if reachable).
Path Optimality: The path from root to any node is guaranteed to be optimal (shortest cost).
Predecessor-Based Structure: Each node (except root) has exactly one parent, representing the next hop back toward the root.

Building the SPT:

During Dijkstra's algorithm execution, the predecessor relationship is recorded:

When updating d[v] through u:
    predecessor[v] = u

After completion, following predecessors from any node back to the root traces the shortest path. For routing, we need the reverse—the first hop away from the root toward each destination.

SPT Analysis for Router R1
Destination	Total Cost	Path	Next-Hop from R1	Outgoing Interface
R1	0	R1
R2	10	R1 → R2	R2	eth0
R3	15	R1 → R2 → R3	R2	eth0
R4	25	R1 → R2 → R3 → R4	R2	eth0
R5	30	R1 → R2 → R3 → R4 → R5	R2	eth0
10.1.0.0/24	11	R1 → R2 → [stub]	R2	eth0

Next-Hop Derivation

To find the next-hop for destination D from router S:

Start at D in the SPT
Follow predecessors back toward S
Stop when the predecessor is S itself
The current node is the next-hop

In the example, for R1 to reach R5: R5 ← R4 ← R3 ← R2 ← R1. The next-hop is R2 (predecessor of R3 whose predecessor is R1).

Populating the Routing Table from SPT

Route Types from SPF:

OSPF generates different route types based on the LSA source:

Route Type	Source	Priority	Example
Intra-Area (O)	Type 1, 2 LSAs	Highest	Routes within the area
Inter-Area (O IA)	Type 3 LSAs	Medium	Routes from other areas via ABR
External Type 1 (E1)	Type 5 LSA	Lower	External routes, metric added to OSPF cost
External Type 2 (E2)	Type 5 LSA	Lowest	External routes, fixed external metric
NSSA Type 1 (N1)	Type 7 LSA	Variable	NSSA external, metric added
NSSA Type 2 (N2)	Type 7 LSA	Variable	NSSA external, fixed metric

Intra-area routes are always preferred over inter-area, which are preferred over external.

route_calculation.py
Python
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple
from enum import IntEnum
 
class OSPFRouteType(IntEnum):
    """OSPF route types in preference order"""
    INTRA_AREA = 1      # O     - Within area (Type 1, 2 LSAs)
    INTER_AREA = 2      # O IA  - From other areas (Type 3 LSAs)
    EXTERNAL_TYPE1 = 3  # O E1  - External with additive metric
    EXTERNAL_TYPE2 = 4  # O E2  - External with fixed metric
    NSSA_TYPE1 = 5      # O N1  - NSSA external additive
    NSSA_TYPE2 = 6      # O N2  - NSSA external fixed
 
@dataclass
class OSPFRoute:
    """Represents an OSPF route in the routing table"""
    destination: str        # Network prefix (CIDR notation)
    next_hops: List[str]    # Next-hop IP addresses (multiple for ECMP)
    interfaces: List[str]   # Outgoing interfaces
    cost: int               # Total cost to destination
    route_type: OSPFRouteType
    area_id: Optional[str]  # Source area for inter-area routes
    external_metric: Optional[int]  # For external routes
    tag: Optional[int]      # Route tag for policy
    
    def __lt__(self, other: 'OSPFRoute') -> bool:
        """Route comparison for preference (lower is better)"""
        # First: route type preference
        if self.route_type != other.route_type:
            return self.route_type < other.route_type
        
        # Second: cost preference
        if self.route_type in (OSPFRouteType.EXTERNAL_TYPE2, 
                                OSPFRouteType.NSSA_TYPE2):
            # E2/N2: compare external metric, then OSPF cost to ASBR
            if self.external_metric != other.external_metric:
                return self.external_metric < other.external_metric
        
        return self.cost < other.cost
 
class RoutingTableBuilder:
    """
    Builds OSPF routing table from SPF results.
    
    Takes the Shortest Path Tree and LSDB as input,
    processes all reachable destinations, and generates
    routing table entries with proper preference.
    """
    
    def __init__(self, router_id: str, area_id: str):
        self.router_id = router_id
        self.area_id = area_id
        self.routes: Dict[str, OSPFRoute] = {}
        
    def add_intra_area_routes(self,
                               spt: Dict[str, Tuple[int, str, str]],
                               stub_networks: Dict[str, Tuple[str, int, str]]
                               ) -> None:
        """
        Add intra-area routes from SPT.
        
        Args:
            spt: {destination: (cost, next_hop, interface)}
            stub_networks: {network: (attached_router, cost, mask)}
        """
        # Routes to routers (transit to reach stub networks)
        for dest_router, (cost, next_hop, interface) in spt.items():
            if dest_router == self.router_id:
                continue
            
            # The router's loopback/RID is reachable
            route = OSPFRoute(
                destination=f"{dest_router}/32",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=cost,
                route_type=OSPFRouteType.INTRA_AREA,
                area_id=self.area_id,
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
        
        # Routes to stub networks advertised by routers
        for network, (attached_router, stub_cost, mask) in stub_networks.items():
            if attached_router not in spt:
                continue  # Router not reachable
            
            router_cost, next_hop, interface = spt[attached_router]
            total_cost = router_cost + stub_cost
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop] if attached_router != self.router_id else [],
                interfaces=[interface] if attached_router != self.router_id else ["local"],
                cost=total_cost,
                route_type=OSPFRouteType.INTRA_AREA,
                area_id=self.area_id,
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
    
    def add_inter_area_routes(self,
                               summary_lsas: List[dict],
                               spt: Dict[str, Tuple[int, str, str]]
                               ) -> None:
        """
        Add inter-area routes from Type 3 Summary LSAs.
        
        For each Summary LSA:
        1. Find the advertising ABR in SPT
        2. Add ABR cost + advertised metric
        3. Use ABR as next-hop
        """
        for lsa in summary_lsas:
            abr_id = lsa['advertising_router']
            network = lsa['link_state_id']
            mask = lsa['netmask']
            metric = lsa['metric']
            
            if abr_id not in spt:
                continue  # ABR not reachable
            
            abr_cost, next_hop, interface = spt[abr_id]
            total_cost = abr_cost + metric
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=total_cost,
                route_type=OSPFRouteType.INTER_AREA,
                area_id=lsa.get('source_area'),
                external_metric=None,
                tag=None
            )
            self._add_or_update_route(route)
    
    def add_external_routes(self,
                            external_lsas: List[dict],
                            spt: Dict[str, Tuple[int, str, str]],
                            asbr_paths: Dict[str, Tuple[int, str, str]]
                            ) -> None:
        """
        Add external routes from Type 5 AS-External LSAs.
        
        External Type 1: total cost = cost to ASBR + external metric
        External Type 2: total cost = external metric only (ASBR cost as tiebreaker)
        """
        for lsa in external_lsas:
            asbr_id = lsa['advertising_router']
            network = lsa['link_state_id']
            mask = lsa['netmask']
            external_metric = lsa['metric']
            metric_type = lsa.get('metric_type', 2)  # 1 or 2
            forwarding_address = lsa.get('forwarding_address')
            tag = lsa.get('tag')
            
            # Find path to ASBR (may be in different area)
            if asbr_id in spt:
                asbr_cost, next_hop, interface = spt[asbr_id]
            elif asbr_id in asbr_paths:
                asbr_cost, next_hop, interface = asbr_paths[asbr_id]
            else:
                continue  # ASBR not reachable
            
            # Calculate cost based on metric type
            if metric_type == 1:
                cost = asbr_cost + external_metric
                route_type = OSPFRouteType.EXTERNAL_TYPE1
            else:
                cost = asbr_cost  # For tiebreaking
                route_type = OSPFRouteType.EXTERNAL_TYPE2
            
            # Handle forwarding address if set
            if forwarding_address and forwarding_address != "0.0.0.0":
                # Route through forwarding address instead of ASBR
                # (requires forwarding address to be reachable intra-area)
                pass  # Simplified: use ASBR path
            
            route = OSPFRoute(
                destination=f"{network}/{self._mask_to_prefix(mask)}",
                next_hops=[next_hop],
                interfaces=[interface],
                cost=cost,
                route_type=route_type,
                area_id=None,
                external_metric=external_metric,
                tag=tag
            )
            self._add_or_update_route(route)
    
    def _add_or_update_route(self, new_route: OSPFRoute) -> None:
        """Add route, handling preference and ECMP"""
        existing = self.routes.get(new_route.destination)
        
        if existing is None:
            self.routes[new_route.destination] = new_route
            
        elif new_route < existing:
            # New route is better - replace
            self.routes[new_route.destination] = new_route
            
        elif not (existing < new_route):
            # Equal cost - add for ECMP
            if new_route.next_hops[0] not in existing.next_hops:
                existing.next_hops.extend(new_route.next_hops)
                existing.interfaces.extend(new_route.interfaces)
    
    def _mask_to_prefix(self, mask: str) -> int:
        """Convert subnet mask to prefix length"""
        octets = [int(x) for x in mask.split('.')]
        binary = ''.join(format(o, '08b') for o in octets)
        return binary.count('1')
    
    def get_routing_table(self) -> List[OSPFRoute]:
        """Return sorted routing table"""
        return sorted(self.routes.values(), 
                     key=lambda r: (r.route_type, r.cost))
 
# Example output format
print("""
OSPF Routing Table for Router 1.1.1.1:
 
Codes: O - OSPF intra-area, IA - OSPF inter-area
       E1 - OSPF external type 1, E2 - OSPF external type 2
       N1 - OSPF NSSA type 1, N2 - OSPF NSSA type 2
 
Destination        Next-Hop       Interface  Cost  Type
--------------     -----------    ---------  ----  ----
10.0.0.0/8         10.1.1.2       eth0       11    O
10.1.0.0/24        10.1.1.2       eth0       10    O
10.2.0.0/24        10.1.1.2       eth0       15    O
192.168.0.0/16     10.1.1.2       eth0       110   O IA
172.16.0.0/12      10.1.1.2       eth0       150   O E1
0.0.0.0/0          10.1.1.2       eth0       1     O E2
""")

Equal-Cost Multi-Path (ECMP)

Detecting Equal-Cost Paths:

During Dijkstra's algorithm, when we find a path to node v with the same cost as the current best:

if new_distance == distances[neighbor]:
    # Equal cost path found!
    predecessors[neighbor].append(current_node)  # Track multiple
elif new_distance < distances[neighbor]:
    predecessors[neighbor] = [current_node]  # Replace with new best

ECMP in Practice:

ECMP Configuration and Behavior
Aspect	Description	Typical Value
Maximum Paths	Maximum ECMP paths per destination	4-16 (platform dependent)
Load Balancing	How traffic is distributed	Per-flow hash (5-tuple)
Hash Algorithm	Determines flow-to-path mapping	CRC, XOR, or proprietary
Unequal Bandwidth	Handling paths with different capacity	May need traffic engineering

Converting Mermaid diagram...

Per-Flow vs. Per-Packet Load Balancing

Benefits of ECMP:

Increased Bandwidth: Aggregate capacity of multiple paths
Resilience: Failure of one path doesn't require reconvergence—traffic shifts automatically
Better Utilization: Prevents single-path bottlenecks
Faster Failover: Hardware can detect failure and redirect within milliseconds

Limitations:

Unequal Path Capacity: If paths have different bandwidths, equal load distribution wastes capacity
Polarization: Poor hash functions can cause uneven distribution
Troubleshooting Complexity: Packets may take different paths, complicating trace analysis

Incremental SPF and Optimizations

SPF Optimization Techniques
Technique	When Applied	Savings	Complexity
Full SPF	Always valid	None (baseline)	O((V+E) log V)
Partial Route Calc (PRC)	Leaf/stub network changes	Skip Dijkstra, update routes only	O(changed routes)
Incremental SPF (iSPF)	Limited topology changes	Recompute affected subtree only	O(affected portion × log V)
Incremental RIB Update	SPT unchanged, costs changed	Skip tree building	O(affected routes)

Partial Route Calculation (PRC):

When changes affect only "leaf" information (stub networks, external routes) without changing the SPT structure, OSPF can skip Dijkstra entirely:

Type 3/4/5/7 LSA changes → PRC only
Stub network addition/removal from Router-LSA → PRC if router is in SPT
Metric change for stub network → PRC

Incremental SPF (iSPF):

For topology changes that affect the SPT but in a localized way:

Identify the "affected area" of the tree
Prune and regrow only that portion
Leave unaffected subtrees intact

Example:

If link R3-R4 fails in a 1000-router network where only 50 routers have routes affected:

Full SPF: Examine all 1000 routers and their links
iSPF: Examine only the 50-router subtree rooted at R3

Potential speedup: 20x for this scenario.

When iSPF Helps Most

Incremental SPF provides the largest benefits when:

• Network is large (1000+ routers) • Changes are localized (single link/node failures) • SPT structure is mostly stable

For small networks or core topology changes, full SPF is often equally fast due to optimization overhead.

SPF Performance Monitoring

Monitoring SPF performance is essential for maintaining network health and meeting convergence SLAs. Excessive SPF runs or long computation times indicate problems requiring attention.

spf_monitoring_commands

! View SPF statistics
Router# show ip ospf statistics
 
OSPF Router with ID (1.1.1.1) (Process ID 1)
 
SPF Statistics:
  SPF algorithm executed 47 times
  Last SPF ran 00:05:23 ago
  Last SPF duration: 15 msec
  
SPF calculation timers:
  Current holdtime: 5000 msec
  Last delay: 0 msec
 
SPF Log (last 10 runs):
=================================================
Time            Duration   Reason
=================================================
Jan 16 15:30:00    15ms   Router LSA received
Jan 16 15:29:55    12ms   Link up (GigabitEthernet0/1)
Jan 16 15:25:00    18ms   Neighbor 2.2.2.2 Full
Jan 16 15:24:55    10ms   Router LSA aged out
Jan 16 14:55:00    14ms   Periodic refresh
...
 
! View OSPF process CPU usage
Router# show processes cpu | include OSPF
  45    12.0%   5.0%  OSPF-1 Hello
  46     3.2%   1.5%  OSPF-1 Router
  
! View SPF scheduling
Router# show ip ospf spf-schedule
 
OSPF Router with ID 1.1.1.1 (Process ID 1)
 
SPF schedule delay: 0 msec
SPF Hold time: 5000 msec
SPF Max wait: 10000 msec
 
Current SPF phase: QUIET
Time since last SPF: 325 seconds
SPF runs in last minute: 0
SPF runs in last hour: 3

SPF Health Indicators
Metric	Healthy Range	Warning Signs	Action
SPF Duration	< 100ms	500ms consistently	Check LSDB size, CPU
SPF Runs/Hour	< 10	100	Investigate topology instability
Holdtime Escalation	Returns to base	Stuck at max-wait	Link flapping, fix physical
CPU Usage	< 30% during SPF	80%	Tune timers, add areas

Summary: SPF Calculation

Key Takeaways

•SPF is a complete pipeline — Triggering, scheduling, graph construction, Dijkstra execution, and route installation work together.
•Scheduling protects stability — Exponential backoff prevents CPU exhaustion during network instability while allowing fast initial convergence.
•The SPT is the core output — Dijkstra produces the Shortest Path Tree from which all routes are derived.
•Route types have preferences — Intra-area routes are preferred over inter-area, which are preferred over external.
•ECMP provides resilience and bandwidth — Equal-cost paths enable load balancing and fast failover without reconvergence.
•Incremental SPF optimizes for partial changes — PRC and iSPF skip unnecessary computation when topology changes are localized.
•Monitoring enables proactive management — SPF statistics reveal network health and convergence performance.

What's Next:

Page Complete