We've explored SDN's principles, programmability, and benefits. Now we synthesize everything into a complete architectural view—the blueprint for how SDN systems are designed, deployed, and operated.
SDN architecture isn't a single specification but a set of architectural patterns and components that can be combined in various ways. Understanding these patterns is essential for designing SDN solutions, evaluating products, or operating SDN deployments.
This page provides the comprehensive architectural reference: the components and their responsibilities, the interfaces connecting them, deployment models, scalability considerations, and design patterns for production systems.
By the end of this page, you will understand: the complete SDN component architecture; all interface types and their protocols; controller architecture internals; deployment models and high availability patterns; and design considerations for production SDN systems.
SDN architecture organizes network components into distinct layers with well-defined interfaces between them. This layering enables independent evolution, interoperability, and clear separation of concerns.
Application Plane
The topmost layer contains network applications that consume SDN controller services to implement network functionality:
Control Plane
The middle layer, the SDN controller, provides:
Data Plane (Infrastructure Layer)
The bottom layer contains the physical and virtual network devices:
Southbound Interface (SBI)
Northbound Interface (NBI)
East-West Interface (EWI)
Management Interface
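As a rough illustration of how the four interface types relate to the three layers, the sketch below records which endpoints each interface joins. The `Interface` class, the `interfaces_touching` helper, and the endpoints listed for the management interface are assumptions made for this example, not part of any SDN specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Interface:
    """One interface type and the two endpoints it joins (illustrative)."""
    name: str
    connects: Tuple[str, str]


INTERFACES = [
    Interface("Northbound (NBI)", ("application plane", "control plane")),
    Interface("Southbound (SBI)", ("control plane", "data plane")),
    Interface("East-West (EWI)", ("control plane", "control plane")),
    # Assumption: management tooling is modelled as talking to the controller.
    Interface("Management", ("management systems", "control plane")),
]


def interfaces_touching(layer: str) -> List[str]:
    """Return the names of all interfaces with an endpoint in `layer`."""
    return [i.name for i in INTERFACES if layer in i.connects]


print(interfaces_touching("data plane"))
```

In this model every interface terminates at the control plane, which is one way to see why the controller is the architectural hub.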
The SDN controller is the most complex component in the architecture. Understanding its internal structure is essential for designing applications, troubleshooting issues, and making informed architectural decisions.
```python
"""
SDN Controller Architecture: Internal Component Model

This module illustrates the internal architecture of an SDN controller,
showing how components interact to provide controller functionality.
"""

import asyncio
import time
from abc import ABC, abstractmethod
from collections import defaultdict
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List, Optional, Set


# ==================================================
# SOUTHBOUND PLUGIN FRAMEWORK
# ==================================================

class SouthboundProtocol(ABC):
    """
    Abstract base for southbound protocol plugins.

    The plugin framework allows supporting multiple protocols
    through a common abstraction.
    """

    @abstractmethod
    async def connect_device(self, device_id: str, address: str):
        """Establish connection to a device."""

    @abstractmethod
    async def install_flow(self, device_id: str, flow: dict):
        """Install a flow rule on a device."""

    @abstractmethod
    async def delete_flow(self, device_id: str, flow_id: str):
        """Delete a flow rule from a device."""

    @abstractmethod
    async def get_statistics(self, device_id: str) -> dict:
        """Retrieve statistics from a device."""

    async def send_packet_out(self, device_id: str, port: int, packet: bytes):
        """Inject a packet out of a device port (not every protocol can)."""
        raise NotImplementedError("packet-out is not supported by this plugin")


class OpenFlowPlugin(SouthboundProtocol):
    """OpenFlow southbound plugin implementation."""

    def __init__(self):
        self.connections: Dict[str, object] = {}  # device_id -> connection

    async def connect_device(self, device_id: str, address: str):
        # Establish OpenFlow connection
        # Handle HELLO, FEATURES_REQUEST, etc.
        print(f"OpenFlow: Connecting to {device_id} at {address}")

    async def install_flow(self, device_id: str, flow: dict):
        # Build and send FLOW_MOD message
        print(f"OpenFlow: Installing flow on {device_id}")

    async def delete_flow(self, device_id: str, flow_id: str):
        # Build and send FLOW_MOD (delete) message
        print(f"OpenFlow: Deleting flow {flow_id} on {device_id}")

    async def get_statistics(self, device_id: str) -> dict:
        # Send STATS_REQUEST and collect STATS_REPLY
        return {'flows': [], 'ports': []}

    async def send_packet_out(self, device_id: str, port: int, packet: bytes):
        # Build and send PACKET_OUT message
        print(f"OpenFlow: Packet-out on {device_id} port {port}")


class NETCONFPlugin(SouthboundProtocol):
    """NETCONF southbound plugin implementation."""

    async def connect_device(self, device_id: str, address: str):
        print(f"NETCONF: Connecting to {device_id}")

    async def install_flow(self, device_id: str, flow: dict):
        # Build XML configuration and send via NETCONF
        print(f"NETCONF: Configuring {device_id}")

    async def delete_flow(self, device_id: str, flow_id: str):
        print(f"NETCONF: Removing config from {device_id}")

    async def get_statistics(self, device_id: str) -> dict:
        # Use NETCONF get-data operation
        return {'interfaces': []}


class SouthboundManager:
    """
    Manages southbound plugins and provides unified device access.

    Applications interact with devices through this abstraction
    without knowing which protocol is used underneath.
    """

    def __init__(self):
        self.plugins: Dict[str, SouthboundProtocol] = {
            'openflow': OpenFlowPlugin(),
            'netconf': NETCONFPlugin(),
        }
        self.device_protocols: Dict[str, str] = {}  # device_id -> protocol

    def get_plugin(self, device_id: str) -> Optional[SouthboundProtocol]:
        protocol = self.device_protocols.get(device_id)
        return self.plugins.get(protocol)

    async def install_flow(self, device_id: str, flow: dict):
        plugin = self.get_plugin(device_id)
        if plugin:
            await plugin.install_flow(device_id, flow)

    async def delete_flow(self, device_id: str, flow_id: str):
        # Needed by FlowManager.delete_flow
        plugin = self.get_plugin(device_id)
        if plugin:
            await plugin.delete_flow(device_id, flow_id)

    async def send_packet_out(self, device_id: str, port: int, packet: bytes):
        # Needed by TopologyManager for LLDP injection
        plugin = self.get_plugin(device_id)
        if plugin:
            await plugin.send_packet_out(device_id, port, packet)


# ==================================================
# EVENT BUS
# ==================================================

class EventType(Enum):
    """Types of events distributed through the event bus."""
    DEVICE_CONNECTED = "device_connected"
    DEVICE_DISCONNECTED = "device_disconnected"
    LINK_DISCOVERED = "link_discovered"
    LINK_DOWN = "link_down"
    HOST_DISCOVERED = "host_discovered"
    PACKET_IN = "packet_in"
    FLOW_REMOVED = "flow_removed"
    TOPOLOGY_CHANGED = "topology_changed"


@dataclass
class Event:
    """An event distributed through the event bus."""
    type: EventType
    data: dict
    timestamp: float = field(default_factory=time.time)


class EventBus:
    """
    Internal publish-subscribe event bus.

    Enables loose coupling between controller components.
    Components subscribe to events they care about;
    publishers don't need to know about subscribers.
    """

    def __init__(self):
        self.subscribers: Dict[EventType, List[Callable]] = defaultdict(list)
        self.event_queue: asyncio.Queue = asyncio.Queue()

    def subscribe(self, event_type: EventType, handler: Callable):
        """Register a handler for an event type."""
        self.subscribers[event_type].append(handler)

    async def publish(self, event: Event):
        """Queue an event for delivery to all subscribers."""
        await self.event_queue.put(event)

    async def dispatch_loop(self):
        """Main event dispatch loop."""
        while True:
            event = await self.event_queue.get()
            handlers = self.subscribers.get(event.type, [])
            for handler in handlers:
                try:
                    await handler(event)
                except Exception as e:
                    # One failing handler must not stop the others
                    print(f"Error in event handler: {e}")


# ==================================================
# TOPOLOGY MANAGER
# ==================================================

@dataclass
class Device:
    """A network device registered with the controller."""
    id: str
    dpid: str
    ports: Dict[int, dict]
    capabilities: Set[str]
    connected: bool = True


@dataclass
class Link:
    """A link between two network devices."""
    src_device: str
    src_port: int
    dst_device: str
    dst_port: int
    bandwidth_bps: int
    latency_us: int


@dataclass
class Host:
    """An end host discovered in the network."""
    mac: str
    ip: Optional[str]
    device_id: str
    port: int


class TopologyManager:
    """
    Discovers and maintains network topology.

    Core responsibilities:
    - Device inventory management
    - Link discovery (via LLDP injection)
    - Host tracking (via packet-in analysis)
    - Topology graph computation
    """

    def __init__(self, event_bus: EventBus, southbound: SouthboundManager):
        self.event_bus = event_bus
        self.southbound = southbound
        self.devices: Dict[str, Device] = {}
        self.links: Dict[str, Link] = {}
        self.hosts: Dict[str, Host] = {}

        # Subscribe to relevant events
        event_bus.subscribe(EventType.DEVICE_CONNECTED, self.handle_device_connected)
        event_bus.subscribe(EventType.PACKET_IN, self.handle_packet_in)

    async def handle_device_connected(self, event: Event):
        """Handle new device connection."""
        device = Device(
            id=event.data['device_id'],
            dpid=event.data['dpid'],
            ports=event.data.get('ports', {}),
            capabilities=set(event.data.get('capabilities', [])),
        )
        self.devices[device.id] = device

        # Initiate link discovery
        await self.start_link_discovery(device.id)

    async def start_link_discovery(self, device_id: str):
        """
        Initiate LLDP-based link discovery.

        Process:
        1. Send LLDP packet out each port
        2. LLDP arrives at connected device
        3. Connected device sends LLDP to controller (packet-in)
        4. Controller correlates LLDP to discover link
        """
        device = self.devices.get(device_id)
        if not device:
            return

        for port_num in device.ports:
            lldp_packet = self._build_lldp_packet(device_id, port_num)
            await self.southbound.send_packet_out(device_id, port_num, lldp_packet)

    async def handle_packet_in(self, event: Event):
        """Handle packet-in events for topology discovery."""
        packet = event.data.get('packet')

        if self._is_lldp(packet):
            # Extract LLDP info and discover link
            src_device = self._extract_lldp_device(packet)
            src_port = self._extract_lldp_port(packet)
            dst_device = event.data['device_id']
            dst_port = event.data['in_port']

            link = Link(
                src_device=src_device,
                src_port=src_port,
                dst_device=dst_device,
                dst_port=dst_port,
                bandwidth_bps=0,  # Would come from port info
                latency_us=0,
            )

            link_id = f"{src_device}:{src_port}-{dst_device}:{dst_port}"
            self.links[link_id] = link

            await self.event_bus.publish(Event(
                type=EventType.LINK_DISCOVERED,
                data={'link': link}
            ))

        elif self._is_arp_or_ip(packet):
            # Extract host info
            host = self._extract_host_info(
                packet, event.data['device_id'], event.data['in_port']
            )
            if host:
                self.hosts[host.mac] = host
                await self.event_bus.publish(Event(
                    type=EventType.HOST_DISCOVERED,
                    data={'host': host}
                ))

    def get_topology_graph(self):
        """Return topology as a graph for path computation."""
        # Imported lazily so the rest of the module works without networkx
        import networkx as nx

        graph = nx.DiGraph()

        for device in self.devices.values():
            graph.add_node(device.id, type='switch')

        for link in self.links.values():
            graph.add_edge(
                link.src_device, link.dst_device,
                port=link.src_port,
                bandwidth=link.bandwidth_bps
            )

        for host in self.hosts.values():
            graph.add_node(host.mac, type='host', ip=host.ip)
            graph.add_edge(host.mac, host.device_id, port=host.port)
            graph.add_edge(host.device_id, host.mac, port=host.port)

        return graph

    # Helper methods (stubs)
    def _build_lldp_packet(self, device_id, port):
        return b''

    def _is_lldp(self, packet):
        return False

    def _is_arp_or_ip(self, packet):
        return False

    def _extract_lldp_device(self, packet):
        return ''

    def _extract_lldp_port(self, packet):
        return 0

    def _extract_host_info(self, packet, device_id, port):
        return None


# ==================================================
# FLOW MANAGER
# ==================================================

@dataclass
class InstalledFlow:
    """Tracks a flow installed in the network."""
    flow_id: str
    device_id: str
    match: dict
    actions: list
    priority: int
    installed_by: str  # Application that installed it
    install_time: float


class FlowManager:
    """
    Manages flow rules across all devices.

    Responsibilities:
    - Flow installation, modification, deletion
    - Flow tracking and lifecycle management
    - Conflict detection and resolution
    - Flow table capacity management
    """

    def __init__(self, southbound: SouthboundManager, event_bus: EventBus):
        self.southbound = southbound
        self.event_bus = event_bus
        self.installed_flows: Dict[str, InstalledFlow] = {}
        self.flow_counter = 0

    async def install_flow(
        self,
        device_id: str,
        match: dict,
        actions: list,
        priority: int,
        app_id: str,
        idle_timeout: int = 0,
        hard_timeout: int = 0
    ) -> str:
        """
        Install a flow rule on a device.

        Returns flow_id for tracking.
        """
        # Generate flow ID
        self.flow_counter += 1
        flow_id = f"flow_{self.flow_counter}"

        # Check for conflicts
        conflict = self._check_conflicts(device_id, match, priority, app_id)
        if conflict:
            raise ValueError(f"Flow conflicts with existing flow: {conflict}")

        # Build flow specification
        flow_spec = {
            'match': match,
            'actions': actions,
            'priority': priority,
            'idle_timeout': idle_timeout,
            'hard_timeout': hard_timeout,
        }

        # Install via southbound
        await self.southbound.install_flow(device_id, flow_spec)

        # Track installation
        self.installed_flows[flow_id] = InstalledFlow(
            flow_id=flow_id,
            device_id=device_id,
            match=match,
            actions=actions,
            priority=priority,
            installed_by=app_id,
            install_time=time.time(),
        )

        return flow_id

    async def delete_flow(self, flow_id: str):
        """Delete a previously installed flow."""
        flow = self.installed_flows.get(flow_id)
        if not flow:
            return

        await self.southbound.delete_flow(flow.device_id, flow_id)
        del self.installed_flows[flow_id]

    def _check_conflicts(
        self,
        device_id: str,
        match: dict,
        priority: int,
        app_id: str
    ) -> Optional[str]:
        """
        Check if new flow conflicts with existing flows.

        Conflict resolution strategies vary by controller:
        - First wins: Earlier flow takes precedence
        - Priority-based: Higher priority app wins
        - Composition: Combine actions from both flows
        """
        for flow in self.installed_flows.values():
            if flow.device_id != device_id:
                continue
            if flow.priority == priority and self._matches_overlap(flow.match, match):
                if flow.installed_by != app_id:
                    return flow.flow_id
        return None

    def _matches_overlap(self, match1: dict, match2: dict) -> bool:
        """Check if two match specifications overlap."""
        # Simplified: check for identical matches
        # Full implementation would check field-by-field overlap
        return match1 == match2


# ==================================================
# COMPLETE CONTROLLER
# ==================================================

class SDNController:
    """
    Complete SDN controller integrating all components.

    This represents the full controller architecture with
    all internal components working together.
    """

    def __init__(self):
        # Initialize event bus first (other components depend on it)
        self.event_bus = EventBus()

        # Initialize southbound
        self.southbound = SouthboundManager()

        # Initialize core services
        self.topology = TopologyManager(self.event_bus, self.southbound)
        self.flows = FlowManager(self.southbound, self.event_bus)

        # Application registry
        self.applications: Dict[str, object] = {}

        # Reference to the dispatch task so it is not garbage-collected
        self._dispatch_task: Optional[asyncio.Task] = None

    async def start(self):
        """Start the controller."""
        # Start event dispatch loop
        self._dispatch_task = asyncio.create_task(self.event_bus.dispatch_loop())

        # Start device listeners (OpenFlow, NETCONF servers)
        # In real implementation, each protocol plugin starts its listener
        print("SDN Controller started")

    def register_application(self, app_id: str, app: object):
        """Register a network application with the controller."""
        self.applications[app_id] = app

    # Northbound API methods (exposed to applications)

    def get_topology(self) -> dict:
        """NBI: Get current network topology."""
        return {
            'devices': list(self.topology.devices.values()),
            'links': list(self.topology.links.values()),
            'hosts': list(self.topology.hosts.values()),
        }

    async def add_flow(self, device_id: str, flow: dict, app_id: str) -> str:
        """NBI: Install a flow rule."""
        return await self.flows.install_flow(
            device_id=device_id,
            match=flow.get('match', {}),
            actions=flow.get('actions', []),
            priority=flow.get('priority', 32768),
            app_id=app_id,
            idle_timeout=flow.get('idle_timeout', 0),
            hard_timeout=flow.get('hard_timeout', 0),
        )
```

The southbound interface connects the controller to network devices. Multiple protocols exist, each with different capabilities and use cases.
OpenFlow is the original SDN southbound protocol, designed specifically for flow-based forwarding control.
Capabilities:
Versions:
Limitations:
| Protocol | Primary Use | Key Features | Typical Deployment |
|---|---|---|---|
| OpenFlow | Flow programming | Match-action flow tables, packet I/O | SDN datacenters, research |
| P4Runtime | Programmable forwarding | Runtime control of a P4-defined parser and pipeline | Programmable ASICs |
| NETCONF/YANG | Device configuration | XML-based config, transactions | Traditional + SDN hybrid |
| OVSDB | OVS management | Database operations on OVS tables | Virtual switching (OVS) |
| gNMI | Streaming telemetry | Subscribe to real-time state | Network monitoring |
| gRPC | General RPC | High-performance, typed interfaces | Various SDN implementations |
Use OpenFlow when:
Use P4Runtime when:
Use NETCONF when:
Use gNMI when:
Multi-Protocol Deployments
Most production SDN deployments use multiple southbound protocols:
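One way to picture a multi-protocol deployment is a controller that picks a protocol per task, following the "Primary Use" column of the comparison table. The sketch below is a minimal illustration; the task names, the `TASK_TO_PROTOCOL` mapping, and the `choose_protocol` helper are hypothetical and not taken from any particular controller.

```python
from typing import Set

# Hypothetical task names mapped to protocols per the table's "Primary Use".
TASK_TO_PROTOCOL = {
    "flow_programming": "openflow",
    "device_configuration": "netconf",
    "streaming_telemetry": "gnmi",
    "ovs_management": "ovsdb",
}


def choose_protocol(task: str, supported: Set[str]) -> str:
    """Pick the southbound protocol for `task` on a device.

    `supported` is the set of protocols the target device speaks.
    """
    protocol = TASK_TO_PROTOCOL.get(task)
    if protocol is None:
        raise ValueError(f"unknown task: {task}")
    if protocol not in supported:
        raise LookupError(f"device does not support {protocol} for {task}")
    return protocol


# A device that speaks OpenFlow for forwarding and NETCONF for configuration.
device_protocols = {"openflow", "netconf"}
print(choose_protocol("flow_programming", device_protocols))
print(choose_protocol("device_configuration", device_protocols))
```

A real controller would usually also need a fallback policy when the preferred protocol is unavailable; this sketch simply raises an error.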
SDN can be deployed in various models depending on requirements, existing infrastructure, and organizational capabilities.
Single Controller
The simplest deployment: one controller instance manages the entire network.
Controller Cluster (Active-Active)
Multiple controller instances share the workload.
Controller Cluster (Active-Standby)
One controller is active; standbys are ready for failover.
Hierarchical Controllers
Multiple domains, each with a local controller, are coordinated by a global controller.
Production SDN deployments require high availability (HA). The centralized controller must not become a single point of failure. SDN HA design addresses multiple failure scenarios.
Controller Clustering
Multiple controller instances form a cluster:
Switch-Controller Connection Redundancy
Switches connect to multiple controllers:
| Role | Flow Mods | Packet-In Handling | Failure Behavior |
|---|---|---|---|
| MASTER | Full access | Receives all | Another controller promoted |
| SLAVE | Read-only | None | Can be promoted to master |
| EQUAL | Full access | Receives all | Other equals continue operating |
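The failure behavior in the role table can be sketched as a small function that recomputes role assignments when a controller disappears. This is an illustration only: the `promote_on_failure` helper is hypothetical, and promoting the alphabetically first slave is an arbitrary assumption standing in for whatever leader election a real cluster uses.

```python
from enum import Enum
from typing import Dict


class Role(Enum):
    MASTER = "master"
    SLAVE = "slave"
    EQUAL = "equal"


def promote_on_failure(roles: Dict[str, Role], failed: str) -> Dict[str, Role]:
    """Return the role assignment after controller `failed` is lost."""
    remaining = {c: r for c, r in roles.items() if c != failed}
    if roles.get(failed) is Role.MASTER:
        # Assumption: pick the alphabetically first slave as the new master.
        slaves = sorted(c for c, r in remaining.items() if r is Role.SLAVE)
        if slaves:
            remaining[slaves[0]] = Role.MASTER
    # EQUAL controllers need no promotion: the survivors keep operating.
    return remaining


cluster = {"c1": Role.MASTER, "c2": Role.SLAVE, "c3": Role.SLAVE}
print(promote_on_failure(cluster, "c1"))
```

Losing a slave or an equal leaves the other assignments unchanged, which matches the table: only the loss of a master triggers a promotion.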
Controller Disconnection Handling
What happens when a switch loses its connection to all controllers?
Fail-Secure Mode:
Fail-Standalone Mode:
Fail-Open Mode:
Persistent Flow Rules
Critical flows can be installed with no timeout:
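In OpenFlow, a timeout value of 0 disables that timer, so a flow with both `idle_timeout` and `hard_timeout` set to 0 stays in the table until it is explicitly deleted. The following sketch models that expiry rule; the `FlowEntry` class and `is_expired` function are illustrative names, not switch or controller APIs.

```python
from dataclasses import dataclass


@dataclass
class FlowEntry:
    installed_at: float     # time the flow was installed (seconds)
    last_matched_at: float  # time a packet last matched the flow (seconds)
    idle_timeout: int = 0   # 0 = never expire for inactivity
    hard_timeout: int = 0   # 0 = never expire by age


def is_expired(flow: FlowEntry, now: float) -> bool:
    """Apply hard and idle timeouts; a value of 0 disables that timer."""
    if flow.hard_timeout and now - flow.installed_at >= flow.hard_timeout:
        return True
    if flow.idle_timeout and now - flow.last_matched_at >= flow.idle_timeout:
        return True
    return False


persistent = FlowEntry(installed_at=0.0, last_matched_at=0.0)
temporary = FlowEntry(installed_at=0.0, last_matched_at=0.0, idle_timeout=10)

print(is_expired(persistent, now=1_000_000.0))  # persistent flow never expires
print(is_expired(temporary, now=10.0))          # idle for 10 s, so it expires
```

Because persistent flows survive a controller outage, they suit traffic that must keep forwarding while the switch is disconnected, at the cost of occupying table space until removed.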
Distributed controller state introduces consistency challenges. During network partitions or controller failures, different controllers may have different views of network state. CAP theorem tradeoffs apply: choose between consistency (pause operations until state converges) and availability (continue operating with possibly stale state). Controller design significantly impacts behavior during failures.
SDN deployments must scale to meet network size and traffic characteristics. Several dimensions of scale affect architecture design.
Proactive vs Reactive Flow Installation
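The difference between the two installation styles can be shown with a toy switch that counts how often it has to consult the controller. This is a simplified sketch under stated assumptions: the `Switch` class, `preinstall` helper, and dictionary-based policy are hypothetical, and real flow tables match on far more than a destination name.

```python
from typing import Dict


class Switch:
    """Toy switch whose flow table maps a destination to an output port."""

    def __init__(self):
        self.flow_table: Dict[str, int] = {}
        self.packet_ins = 0  # packets punted to the controller

    def forward(self, dst: str, controller_policy: Dict[str, int]) -> int:
        if dst not in self.flow_table:
            # Table miss: the reactive path asks the controller, then caches.
            self.packet_ins += 1
            self.flow_table[dst] = controller_policy[dst]
        return self.flow_table[dst]


def preinstall(switch: Switch, policy: Dict[str, int]) -> None:
    """Proactive installation: push all rules before traffic arrives."""
    switch.flow_table.update(policy)


policy = {"h1": 1, "h2": 2}
traffic = ["h1", "h2", "h1", "h1"]

reactive = Switch()
for dst in traffic:
    reactive.forward(dst, policy)

proactive = Switch()
preinstall(proactive, policy)
for dst in traffic:
    proactive.forward(dst, policy)

print(reactive.packet_ins, proactive.packet_ins)
```

The reactive switch pays one controller round trip per new destination and then serves later packets from its table; the proactive switch never contacts the controller but holds rules for destinations that may never receive traffic.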
Flow Aggregation
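As a minimal sketch of aggregation, the example below merges per-host IPv4 rules that share an output port into the fewest covering prefixes, using the standard library's `ipaddress.collapse_addresses`. The `aggregate_routes` function and the rule format (address string to port number) are assumptions made for illustration.

```python
import ipaddress
from collections import defaultdict
from typing import Dict


def aggregate_routes(host_routes: Dict[str, int]) -> Dict[str, int]:
    """Merge per-host rules with the same output port into prefixes.

    host_routes maps an IPv4 address string to an output port.
    Returns a mapping of CIDR prefix string to output port.
    """
    by_port = defaultdict(list)
    for ip, port in host_routes.items():
        by_port[port].append(ipaddress.ip_network(ip))  # becomes a /32

    aggregated = {}
    for port, networks in by_port.items():
        # collapse_addresses only merges networks that exactly fill a larger
        # prefix, so no address outside the input gains a rule.
        for net in ipaddress.collapse_addresses(networks):
            aggregated[str(net)] = port
    return aggregated


routes = {f"10.0.0.{i}": 1 for i in range(4)}
routes["10.0.1.5"] = 2
print(aggregate_routes(routes))
```

Here four host rules collapse into a single `/30` rule, which is the kind of reduction that keeps flow tables within hardware limits.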
Controller Horizontal Scaling
Switch-Side Processing
Sampling and Summarization
| Metric | Small | Medium | Large | Very Large |
|---|---|---|---|---|
| Switches | 10-50 | 100-500 | 500-2000 | 2000+ |
| Total flows | 10K | 100K | 1M | 10M+ |
| Flow setup/sec | 1K | 10K | 100K | 500K+ |
| Controller instances | 1 | 3 | 5-7 | 10+ |
| Typical use case | Enterprise | Campus | Cloud DC | Hyperscale |
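The table's figures can be turned into a rough per-instance load estimate. The sketch below assumes flow setups are spread evenly across live controller instances, which real clusters only approximate; the `per_instance_load` function is a hypothetical helper for this calculation.

```python
def per_instance_load(total_setups_per_sec: float, instances: int, failed: int = 0) -> float:
    """Flow setups per second each live instance must absorb.

    Assumes an even spread of load across the surviving instances.
    """
    alive = instances - failed
    if alive <= 0:
        raise ValueError("no live controller instances")
    return total_setups_per_sec / alive


# "Large" column: 100K flow setups/sec across 5 instances.
print(per_instance_load(100_000, 5))
# The same cluster after losing one instance.
print(per_instance_load(100_000, 5, failed=1))
```

Sizing each instance for the degraded case rather than the healthy one leaves headroom for a failover.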
Successful SDN deployments follow established design patterns that address common challenges. Understanding these patterns helps in designing and troubleshooting SDN systems.
Controller in Every Packet Path
Sending every packet to the controller doesn't scale. Use reactive installation with flow caching, not packet-by-packet control.
Unbounded Reactive State
Reactive flow installation can fill flow tables. Implement timeouts, rate limits, and capacity monitoring.
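One common way to bound reactive state is to rate-limit packet-in handling with a token bucket. The sketch below is one possible implementation, not a feature of any specific controller; the `TokenBucket` class is hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class TokenBucket:
    """Allow up to `rate` events per second with bursts of up to `burst`."""

    def __init__(self, rate: float, burst: int):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # time of the previous call

    def allow(self, now: float) -> bool:
        """Return True if an event at time `now` is within the limit."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


limiter = TokenBucket(rate=1.0, burst=2)
decisions = [limiter.allow(0.0), limiter.allow(0.0), limiter.allow(0.0), limiter.allow(1.0)]
print(decisions)
```

Packet-ins that exceed the limit would be dropped or queued rather than turned into new flow rules, so a burst of unknown traffic cannot exhaust a switch's table.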
Single Controller Without HA
Production deployments need controller redundancy. A single controller is a single point of failure.
Ignoring Switch Heterogeneity
Different switches have different capabilities (table sizes, protocols, performance). The design must accommodate the weakest link.
Stateless Controller Design
Controllers need persistent state for consistency across failures. Ephemeral state causes problems during failover.
We've completed a comprehensive tour of SDN architecture—from high-level component organization to detailed design patterns. Let's consolidate the key architectural insights:
Module Complete:
This concludes the SDN Concepts module. You now have a comprehensive understanding of Software-Defined Networking—from the fundamental paradigm shift away from traditional distributed networking, through control/data plane separation, network programmability, SDN benefits, to detailed architecture and design patterns.
The next module will explore OpenFlow in depth—the protocol that made SDN practical, covering flow tables, messages, pipeline processing, and implementation details.