Consider a retail corporation with 500 stores across the country, each with point-of-sale systems, inventory management computers, and employee workstations. Every transaction must reach the central data center for processing. Every inventory update must synchronize with corporate systems. Every employee in every store needs access to corporate email, HR systems, and training portals.
Now imagine covering those 500 stores with individual remote access VPN connections: managing thousands of client configurations, handling user credentials for tens of thousands of employees, and troubleshooting connectivity whenever a laptop's VPN settings are misconfigured. The operational burden would be crushing.
Site-to-site VPN solves this problem by shifting the VPN termination point from individual user devices to network infrastructure. Instead of every employee device in 500 stores running its own VPN client, each store has a single VPN gateway that secures all traffic from that location. The network itself becomes the VPN participant, transparently protecting every device without requiring per-device configuration.
This architectural pattern—connecting networks rather than users—is fundamental to enterprise infrastructure. It enables unified corporate networks spanning continents, disaster recovery sites that mirror primary data centers, and cloud environments that extend on-premises networks into AWS, Azure, or GCP regions.
By the end of this page, you will understand site-to-site VPN architecture in depth: the topologies that connect multiple sites, routing protocols that work over VPN tunnels, high availability designs that prevent single points of failure, MTU considerations that affect performance, and practical implementation details that distinguish reliable deployments from fragile ones.
Site-to-site VPN connects two or more networks through encrypted tunnels, making geographically distributed locations appear as a single logical network. Unlike remote access VPN, where individual users initiate connections, site-to-site VPN tunnels are typically established between VPN gateways: dedicated devices or software services that handle encryption, encapsulation, and routing for all traffic between sites.
Core Architectural Components
VPN Gateway/Concentrator: At each site, a VPN gateway device terminates the VPN tunnel. This could be:
Tunnel Configuration: Each tunnel requires matching configuration on both endpoints:
Routing: Site-to-site VPN requires routing configuration so that traffic destined for remote networks is directed into the VPN tunnel:
Traffic Flow in Site-to-Site VPN
Let's trace a packet from a workstation in Branch Office 1 (IP: 10.2.1.50) to a server at Headquarters (IP: 10.1.1.100):
Return traffic follows the same process in reverse, with HQ encrypting replies and Branch decrypting them.
A key advantage of site-to-site VPN: end devices (workstations, servers, printers) require no special configuration. They're unaware that traffic crosses an encrypted tunnel. This drastically simplifies deployment compared to installing VPN clients on every device.
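The forwarding decision at the branch gateway can be sketched as a longest-prefix match. This is only an illustration: the routing table below is hypothetical (the 10.2.0.0/16 branch range and 10.1.0.0/16 HQ range are assumed from the example addresses), and a real gateway makes this decision in its forwarding plane, not in Python.

```python
import ipaddress

# Hypothetical routing table for the Branch Office 1 gateway.
# The prefixes and next-hop labels are assumptions for illustration.
ROUTES = [
    (ipaddress.ip_network("10.2.0.0/16"), "local LAN"),
    (ipaddress.ip_network("10.1.0.0/16"), "VPN tunnel to HQ"),
    (ipaddress.ip_network("0.0.0.0/0"), "Internet (unencrypted)"),
]

def next_hop(destination: str) -> str:
    """Return the next hop of the most specific route matching the destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in ROUTES if address in net]
    # Longest-prefix match: the route with the largest prefix length wins.
    _, best_hop = max(matches, key=lambda item: item[0].prefixlen)
    return best_hop

print(next_hop("10.1.1.100"))    # the HQ server from the packet trace
print(next_hop("10.2.1.77"))     # another host inside the branch
print(next_hop("198.51.100.9"))  # an arbitrary Internet host
```

The workstation at 10.2.1.50 never sees this table. It simply sends the packet to its default gateway, which is why end devices need no VPN configuration.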
When connecting multiple sites, the topology—how tunnels interconnect sites—has profound implications for performance, resilience, management complexity, and cost. Understanding topology options is essential for designing effective VPN architectures.
Hub-and-Spoke Topology
In hub-and-spoke (also called star topology), all branch sites connect only to a central hub site. Inter-branch traffic routes through the hub:
[Branch A]----\
               \
                [Hub]----[Branch B]
               /
[Branch C]----/
Characteristics:
Advantages:
Disadvantages:
Full Mesh Topology
In full mesh topology, every site has a direct tunnel to every other site:
[Branch A]----[Branch B]
    |   \    /   |
    |    \  /    |
    |     \/     |
    |     /\     |
    |    /  \    |
[Branch C]----[Hub D]
Characteristics:
Advantages:
Disadvantages:
Partial Mesh Topology
Partial mesh is a pragmatic compromise—some sites connect directly while others route through hubs:
This approach balances tunnel count, performance, and management complexity.
| Topology | Tunnels for N Sites | 10 Sites | 50 Sites | 100 Sites | Complexity | Resilience |
|---|---|---|---|---|---|---|
| Hub-and-Spoke | N - 1 | 9 | 49 | 99 | Low | Single hub failure = total outage |
| Full Mesh | N×(N-1)/2 | 45 | 1,225 | 4,950 | Very High | Highly resilient |
| Partial Mesh | Varies | ~15-25 | ~100-300 | ~200-500 | Medium | Regional failures possible |
| Dual Hub | ~2×(N-1) | 18 | 98 | 198 | Medium | Good with hub redundancy |
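The tunnel counts in the table follow directly from the formulas in its second column. A minimal sketch that reproduces them (the function names are illustrative, and the dual-hub figure uses the same 2×(N-1) approximation as the table):

```python
def hub_and_spoke_tunnels(sites: int) -> int:
    """One tunnel from every spoke to the single hub."""
    return sites - 1

def full_mesh_tunnels(sites: int) -> int:
    """One tunnel for every unordered pair of sites."""
    return sites * (sites - 1) // 2

def dual_hub_tunnels(sites: int) -> int:
    """Approximation used in the table: roughly two hub-and-spoke overlays."""
    return 2 * (sites - 1)

for n in (10, 50, 100):
    print(n, hub_and_spoke_tunnels(n), full_mesh_tunnels(n), dual_hub_tunnels(n))
```

Full mesh grows quadratically while the hub-based designs grow linearly, which is the scalability problem the dynamic mesh technologies discussed next set out to solve.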
DMVPN: Dynamic Multipoint VPN
Cisco's Dynamic Multipoint VPN (DMVPN) technology addresses the mesh scalability problem through dynamic tunnel creation:
DMVPN provides full mesh connectivity benefits with hub-and-spoke configuration simplicity:
Similar solutions exist from other vendors, such as ADVPN implementations and SD-WAN products, all solving the same fundamental problem of mesh scalability.
Choose hub-and-spoke for small deployments or when central policy enforcement is critical. Use full mesh only for a handful of sites that genuinely need optimal direct paths. For most enterprise deployments, partial mesh or dynamic mesh solutions (DMVPN, SD-WAN) provide the best balance of performance, manageability, and resilience.
Site-to-site VPN requires routing configuration that directs traffic destined for remote networks into VPN tunnels. The choice between static and dynamic routing has significant operational implications.
Static Routing
With static routing, administrators manually configure routes pointing remote subnets to the VPN tunnel interface:
! Example Cisco IOS static route
ip route 10.2.0.0 255.255.0.0 Tunnel0
ip route 10.3.0.0 255.255.0.0 Tunnel0
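With many remote subnets, hand-typing these lines invites typos. One possible approach is to generate them from a list of prefixes. The sketch below is an assumption-laden illustration: the helper name `ios_static_routes` is hypothetical, and it only formats text in the style shown above without pushing anything to a device.

```python
import ipaddress

def ios_static_routes(remote_subnets, interface="Tunnel0"):
    """Format one IOS-style static route line per remote subnet (CIDR strings)."""
    lines = []
    for cidr in remote_subnets:
        # ip_network() raises ValueError if host bits are set (e.g. 10.2.1.5/16),
        # which catches a common class of copy-paste mistakes early.
        net = ipaddress.ip_network(cidr)
        lines.append(f"ip route {net.network_address} {net.netmask} {interface}")
    return lines

for line in ios_static_routes(["10.2.0.0/16", "10.3.0.0/16"]):
    print(line)
```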
Advantages:
Disadvantages:
Dynamic Routing Over VPN
Dynamic routing protocols can run over VPN tunnels, treating the tunnel as just another network link:
OSPF over VPN:
BGP over VPN:
EIGRP over VPN (Cisco proprietary):
Routing Protocol Selection
Different protocols suit different scenarios:
OSPF:
BGP:
Static with Object Tracking:
Route Summarization
With dynamic routing, careful summarization reduces routing table size and control plane overhead:
Example: Instead of advertising 10.2.1.0/24, 10.2.2.0/24, ... 10.2.50.0/24 from Branch 1, advertise a single 10.2.0.0/16 summary.
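The summarization example can be checked with Python's standard `ipaddress` module. This sketch assumes the branch uses exactly the fifty /24 networks named above. It confirms that the /16 covers all of them and shows how many prefixes an exact aggregation (no extra address space) would still require.

```python
import ipaddress

# The fifty branch subnets from the example: 10.2.1.0/24 through 10.2.50.0/24.
branch_subnets = [ipaddress.ip_network(f"10.2.{i}.0/24") for i in range(1, 51)]
summary = ipaddress.ip_network("10.2.0.0/16")

# Every specific subnet must fall inside the advertised summary.
all_covered = all(subnet.subnet_of(summary) for subnet in branch_subnets)

# Exact aggregation merges only adjacent, aligned blocks, so it cannot
# reach a single prefix for this range.
exact = list(ipaddress.collapse_addresses(branch_subnets))

print("covered by summary:", all_covered)
print("prefixes after exact aggregation:", len(exact))
print("addresses in use vs. summarized:",
      sum(s.num_addresses for s in branch_subnets), "vs.", summary.num_addresses)
```

The single /16 is shorter to advertise but also claims address space the branch does not use, so traffic to unassigned 10.2.x.x addresses is drawn toward the branch before being dropped there. Whether that trade-off is acceptable depends on the addressing plan.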
When running routing protocols over VPN, ensure tunnels are stable before protocol adjacencies form. Flapping tunnels cause routing instability. Also consider routing protocol timers—fast hello intervals on unstable Internet connections can cause unnecessary reconvergence. Tune timers appropriately for the underlying transport reliability.
Site-to-site VPN often carries business-critical traffic. Designing for high availability requires redundancy at multiple levels.
VPN Gateway Redundancy
Single VPN gateways create single points of failure. Redundancy approaches include:
Active/Passive Clustering:
Active/Active Clustering:
Independent Gateways with Routing Failover:
Tunnel Redundancy
Even with redundant gateways, a single ISP failure can disable connectivity. Tunnel-level redundancy options:
Multiple ISPs:
Multiple Tunnels per ISP:
Dead Peer Detection (DPD)
VPN gateways need to detect when remote peers become unreachable. Dead Peer Detection is an IKE mechanism that:
DPD settings balance responsiveness against false positives:
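The trade-off can be made concrete with a deliberately simplified timing model. It assumes the gateway waits `idle_interval` seconds of silence before probing, then sends `retries` probes spaced `retry_interval` seconds apart before declaring the peer dead. These parameter names are hypothetical and the numbers are arbitrary; actual implementations count and time probes differently, so treat this as a sketch rather than any vendor's behavior.

```python
def dpd_detection_seconds(idle_interval: float, retry_interval: float, retries: int) -> float:
    """Approximate worst-case time to declare a peer dead under the simplified model."""
    return idle_interval + retry_interval * retries

# Aggressive settings detect failures quickly but tolerate little packet loss.
aggressive = dpd_detection_seconds(idle_interval=10, retry_interval=2, retries=3)

# Conservative settings ride out brief loss at the cost of slower failover.
conservative = dpd_detection_seconds(idle_interval=30, retry_interval=10, retries=5)

print(f"aggressive: ~{aggressive:.0f}s, conservative: ~{conservative:.0f}s")
```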
Path MTU and Fragmentation
VPN encapsulation adds overhead (typically 50-80+ bytes for IPSec ESP), reducing the maximum transmission unit (MTU) for payload data:
Solution approaches:
Pre-fragmentation:
Post-fragmentation:
Path MTU Discovery:
Best practice: Configure TCP MSS clamping at VPN gateways to 1360-1380 bytes, ensuring TCP-based traffic fits without fragmentation.
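The recommended MSS range follows from simple subtraction. The sketch below assumes a 1500-byte path MTU and 20-byte IPv4 and TCP headers without options. The overhead values are illustrative points around the range quoted above, not measurements of a particular cipher suite.

```python
IPV4_HEADER = 20  # bytes, assuming no IP options
TCP_HEADER = 20   # bytes, assuming no TCP options

def clamped_mss(path_mtu: int, vpn_overhead: int) -> int:
    """Largest TCP payload that fits in one packet after VPN encapsulation."""
    return path_mtu - vpn_overhead - IPV4_HEADER - TCP_HEADER

for overhead in (50, 80, 100):
    print(f"overhead {overhead:>3} bytes -> MSS {clamped_mss(1500, overhead)}")
```

An MSS of 1380 corresponds to budgeting 80 bytes of encapsulation overhead and 1360 to budgeting 100 bytes, which leaves headroom for additions such as NAT traversal or an extra tunnel header.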
Failover mechanisms should be tested regularly, not just during initial deployment. Schedule maintenance windows to deliberately fail primary components and verify that redundancy works as expected. Document failover times and compare against SLA requirements.
IPSec is the dominant protocol for site-to-site VPN due to its standardization, security strength, and wide vendor support. Understanding IPSec configuration for site-to-site scenarios requires grasping several interconnected concepts.
IKE (Internet Key Exchange) Phases
IPSec tunnel establishment occurs in two phases:
Phase 1 (IKE SA Establishment):
Phase 2 (IPSec SA Establishment):
IKEv1 vs. IKEv2
IKEv2 is the modern standard, offering improvements over IKEv1:
| Feature | IKEv1 | IKEv2 |
|---|---|---|
| Message exchanges | 6-9 | 4 |
| NAT traversal | Add-on | Built-in |
| Dead peer detection | Add-on | Built-in |
| Reliability | None | Retransmission, sequence numbers |
| Authentication methods | Limited | Extensible (EAP support) |
| Rekeying | Disruptive | Seamless |
| MOBIKE | No | Yes (mobility support) |
New deployments should use IKEv2 unless compatibility with legacy equipment mandates IKEv1.
Policy-Based vs. Route-Based VPN
IPSec implementations fall into two architectural approaches:
Policy-Based VPN:
Route-Based VPN:
Route-based is generally preferred for modern deployments because:
Example Route-Based Configuration (Cisco IOS):
! Phase 1 (IKEv2)
crypto ikev2 proposal PROP-IKEV2
 encryption aes-cbc-256
 integrity sha384
 group 19
!
crypto ikev2 policy POL-IKEV2
 proposal PROP-IKEV2
!
crypto ikev2 keyring KEYRING
 peer HQ
  address 203.0.113.1
  pre-shared-key SECURE_KEY_HERE
!
crypto ikev2 profile IKEV2-PROFILE
 match identity remote address 203.0.113.1
 authentication local pre-share
 authentication remote pre-share
 keyring local KEYRING
!
! Phase 2 (IPSec)
crypto ipsec transform-set TS esp-aes 256 esp-sha384-hmac
 mode tunnel
!
crypto ipsec profile IPSEC-PROFILE
 set transform-set TS
 set ikev2-profile IKEV2-PROFILE
!
! Tunnel Interface
interface Tunnel0
 ip address 10.255.0.2 255.255.255.252
 tunnel source GigabitEthernet0/0
 tunnel destination 203.0.113.1
 tunnel mode ipsec ipv4
 tunnel protection ipsec profile IPSEC-PROFILE
!
! Routing
ip route 10.1.0.0 255.255.0.0 Tunnel0
For new deployments: Use AES-256-GCM (authenticated encryption), SHA-384 or SHA-512 for integrity where needed, and Diffie-Hellman Group 19 (256-bit ECC), 20 (384-bit ECC), or 21 (521-bit ECC). Avoid SHA-1, DES, 3DES, and DH Groups 1, 2, and 5, all of which are considered weak or deprecated.
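A guideline like this is easy to turn into an automated check during configuration review. The sketch below is hypothetical: the function name and the lowercase algorithm labels are made up for illustration and do not correspond to any vendor's configuration syntax. It flags only the algorithms the guidance above names as weak.

```python
# Algorithms the guidance above says to avoid (labels are illustrative).
WEAK_ENCRYPTION = {"des", "3des"}
WEAK_INTEGRITY = {"sha1"}
WEAK_DH_GROUPS = {1, 2, 5}

def audit_proposal(encryption, integrity, dh_group):
    """Return a list of findings for one IKE/IPSec proposal; empty means no known-weak choice."""
    findings = []
    if encryption.lower() in WEAK_ENCRYPTION:
        findings.append(f"weak encryption: {encryption}")
    # integrity may be None when an AEAD cipher such as AES-GCM is used.
    if integrity is not None and integrity.lower() in WEAK_INTEGRITY:
        findings.append(f"weak integrity: {integrity}")
    if dh_group in WEAK_DH_GROUPS:
        findings.append(f"weak DH group: {dh_group}")
    return findings

print(audit_proposal("aes-256-gcm", None, 19))
print(audit_proposal("3des", "sha1", 2))
```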
Extending corporate networks to cloud environments is a critical use case for site-to-site VPN. All major cloud providers offer VPN gateway services that integrate with on-premises VPN infrastructure.
AWS VPN Solutions
AWS Site-to-Site VPN:
AWS Transit Gateway with VPN:
Azure VPN Gateway
VpnGw SKUs:
Azure Virtual WAN:
| Provider | Service | Max Throughput | BGP Support | HA Mechanism |
|---|---|---|---|---|
| AWS | Site-to-Site VPN | 1.25 Gbps/tunnel (ECMP for more) | Yes | Dual tunnels to different AZs |
| Azure | VPN Gateway | 10 Gbps (VpnGw5) | Yes | Active-Active, Zone-redundant |
| GCP | Cloud VPN | 3 Gbps/tunnel (8 tunnels = 24 Gbps) | Yes | HA VPN with 99.99% SLA |
| Oracle Cloud | IPSec VPN | 250 Mbps - 10 Gbps | Yes | Dual tunnel redundancy |
Google Cloud Platform (GCP) VPN
Classic VPN:
HA VPN:
Multi-Cloud VPN Connectivity
Organizations using multiple cloud providers need to interconnect them:
Option 1: On-premises hub:
Option 2: Direct cloud-to-cloud:
Option 3: Cloud interconnect services:
Design Considerations for Cloud VPN
Cloud VPN gateways have throughput limits that may surprise architects accustomed to high-capacity on-premises equipment. An AWS VPN connection's 1.25 Gbps per tunnel may be insufficient for data-intensive workloads. For bulk data transfer or high-throughput requirements, consider AWS Direct Connect, Azure ExpressRoute, or GCP Dedicated Interconnect.
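A quick capacity estimate shows how fast tunnel counts grow when ECMP is used to exceed the per-tunnel limit. This is a rough sketch under optimistic assumptions: it treats the 1.25 Gbps figure as fully achievable and assumes traffic spreads evenly across tunnels, which real ECMP rarely does because a single flow stays on a single tunnel.

```python
import math

def tunnels_needed(required_gbps: float, per_tunnel_gbps: float = 1.25) -> int:
    """Minimum number of equal-cost tunnels to carry the required aggregate throughput."""
    return math.ceil(required_gbps / per_tunnel_gbps)

for demand in (1.0, 3.0, 5.0, 10.0):
    print(f"{demand:>4} Gbps aggregate -> at least {tunnels_needed(demand)} tunnel(s)")
```

When the estimate reaches more than a handful of tunnels, the dedicated connectivity options mentioned above are usually worth evaluating instead.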
Deploying site-to-site VPN is only the beginning. Ongoing operations require attention to monitoring, troubleshooting, and lifecycle management.
Monitoring and Alerting
Critical metrics to monitor:
Tunnel State:
Performance Metrics:
Security Events:
Troubleshooting Methodology
When tunnels fail, systematic troubleshooting is essential:
Lifecycle Management
Certificate Rotation:
Algorithm Upgrades:
Firmware/Software Updates:
Change Management
VPN changes require careful coordination:
Site-to-site VPN configurations should be meticulously documented: IP addresses, subnet ranges, authentication details (without storing actual secrets in documentation), routing configuration, and contact information for remote site administrators. During outages, good documentation reduces troubleshooting time dramatically.
Site-to-site VPN is the backbone of enterprise network connectivity, securely linking geographically distributed locations into unified infrastructure. Let's consolidate the key knowledge from this page:
What's Next:
The next page explores Remote Access VPN—the counterpart to site-to-site VPN that connects individual users rather than networks. You'll learn about the unique challenges of remote access scenarios: user authentication, client provisioning, split tunneling policies, and the security considerations that arise when endpoints are outside corporate control.
You now have a deep understanding of site-to-site VPN: the architectures that connect enterprise networks, the topologies that scale to hundreds of sites, the routing that enables seamless inter-site communication, and the operational practices that keep VPN infrastructure reliable. This knowledge is essential for designing and maintaining enterprise network connectivity.