When a user in Tokyo accesses your application, do they connect to a server in Tokyo or one in Virginia? The answer to this question, and the sophisticated systems that determine it, is the domain of Global Server Load Balancing (GSLB), also known as global load balancing or geographic load balancing.
Global load balancing operates at a fundamentally different layer than the load balancers we've discussed so far. While traditional load balancers distribute traffic within a data center or region, global load balancers distribute traffic across data centers and regions, often spanning continents.
This page explores the architectures, protocols, and strategies that enable global services to route billions of requests daily to the optimal location, providing low latency, high availability, and intelligent traffic management across the entire planet.
By the end of this page, you will understand: how DNS-based global load balancing works, how BGP anycast provides geographic routing, the tradeoffs between different GSLB approaches, multi-region architecture patterns, failover and disaster recovery strategies, and performance optimization techniques used by global services.
Global load balancing addresses four fundamental challenges that single-region architectures cannot solve:
Challenge 1: Latency
Data travels at the speed of light, but even light is slow across continents:
| Route | Distance | Light Speed Latency | Real-World RTT |
|---|---|---|---|
| NYC → London | 5,585 km | ~19 ms | 70-90 ms |
| NYC → Tokyo | 10,850 km | ~36 ms | 150-200 ms |
| NYC → Sydney | 16,000 km | ~53 ms | 200-250 ms |
RTT = Round Trip Time. Real-world is higher due to routing, switching, and processing.
For a user in Sydney accessing a US-only service, every request adds 200+ ms of latency. If a page requires 10 sequential API calls, that's 2+ seconds of added delay, often the difference between engagement and abandonment.
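The table's light-speed column can be reproduced in a couple of lines. The ~1.47x fiber slowdown factor is an assumption here, not from the table:

```python
# Back-of-envelope check of the latency table above: theoretical one-way
# propagation delay. Fiber is roughly 1.5x slower than vacuum.
C_VACUUM_KM_S = 299_792  # speed of light in vacuum, km/s

def one_way_latency_ms(distance_km, slowdown=1.0):
    """One-way propagation delay in milliseconds."""
    return distance_km / (C_VACUUM_KM_S / slowdown) * 1000

print(round(one_way_latency_ms(5_585)))        # NYC -> London: ~19 ms
print(round(one_way_latency_ms(10_850)))       # NYC -> Tokyo:  ~36 ms
print(round(one_way_latency_ms(5_585, 1.47)))  # in fiber: ~27 ms one way
```

The gap between these theoretical numbers and the real-world RTT column is routing, switching, and processing overhead.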
Challenge 2: Availability
No single data center provides 100% availability. Hardware fails. Networks have outages. Natural disasters occur.
Single Region Availability:
99.9% uptime = 8.76 hours/year downtime
Multi-Region with Failover:
If regions fail independently:
99.99% uptime = 52.6 minutes/year downtime
(Achieved when either region can serve all traffic)
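The availability arithmetic above can be made concrete (using a 365-day year):

```python
# Downtime per year for a given availability fraction.
HOURS_PER_YEAR = 8_760  # 365 days

def downtime_per_year(availability):
    """Hours of downtime per year."""
    return (1 - availability) * HOURS_PER_YEAR

print(round(downtime_per_year(0.999), 2))        # 8.76 hours (99.9%)
print(round(downtime_per_year(0.9999) * 60, 1))  # 52.6 minutes (99.99%)

# Two regions failing independently, either able to carry all traffic:
combined = 1 - (1 - 0.999) ** 2
print(round(combined, 6))  # 0.999999 in theory; shared dependencies
                           # (DNS, deploy pipelines, bad config pushes)
                           # make ~99.99% a more realistic target
```

The theoretical combined number is better than 99.99%; the page's figure is conservative because real regions rarely fail fully independently.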
Challenge 3: Regulatory Compliance
Many regulations require data to remain in specific geographic regions:
| Regulation | Requirement |
|---|---|
| GDPR (EU) | EU citizen data processed in EU |
| CCPA (California) | California consumer data protections |
| PDPA (Singapore) | Personal data processed in Singapore |
| Data Localization Laws | Various countries require in-country processing |
Global load balancing enables routing users to compliant regions automatically.
Challenge 4: Cost Optimization
Data egress costs vary by region. Traffic between regions is expensive:
AWS Data Transfer Costs (approximate):
Within AZ: Free
Between AZs: $0.01/GB
Between Regions: $0.02-0.09/GB
To Internet: $0.05-0.09/GB
Keeping traffic in-region reduces costs significantly at scale.
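A rough monthly egress bill can be sketched from the approximate rates above. The traffic volume and per-path rates here are illustrative, not a quote:

```python
# Rough egress-cost comparison using the approximate AWS rates above.
RATES_USD_PER_GB = {
    "within-az": 0.00,
    "between-az": 0.01,
    "between-region": 0.02,  # low end of the $0.02-0.09 range
    "to-internet": 0.09,
}

def monthly_cost(gb, path):
    """Monthly egress cost in USD for a given traffic path."""
    return gb * RATES_USD_PER_GB[path]

monthly_gb = 100_000  # 100 TB/month, illustrative
for path in RATES_USD_PER_GB:
    print(f"{path:>15}: ${monthly_cost(monthly_gb, path):,.0f}")
```

At 100 TB/month, the same traffic costs $0 in-AZ, $2,000 cross-region (at the low end), and $9,000 to the internet, which is why keeping traffic in-region matters at scale.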
Research shows that every 100ms of latency reduces conversions by ~1%. For e-commerce at scale, this translates to millions in lost revenue. Global load balancing isn't just a technical optimization—it's a business imperative.
The most widely used approach to global load balancing leverages the Domain Name System (DNS). Since every connection begins with a DNS lookup, DNS provides a natural interception point for geographic routing.
How DNS-Based GSLB Works:
1. User wants to access app.example.com
2. Browser queries DNS resolver for app.example.com
3. Resolver contacts authoritative DNS (GSLB-enabled)
4. GSLB DNS determines best endpoint based on:
- Geographic location of resolver
- Health of backends in each region
- Load/capacity in each region
- Custom routing policies
5. GSLB returns IP address of optimal region
6. User connects to returned IP
User (Tokyo)
│
▼
DNS Resolver
│
▼
GSLB DNS ─────► Determines: Tokyo is closest
│
▼
Returns: 52.198.x.x (Tokyo region IP)
│
▼
User connects to Tokyo datacenter
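The decision in step 4 can be sketched as a toy lookup. All latencies, region names, and IPs below are illustrative:

```python
# Toy sketch of the GSLB decision: return the closest healthy region
# for a given resolver. All data here is invented for illustration.
RESOLVER_LATENCY_MS = {  # measured resolver -> region latency
    "tokyo-resolver": {"ap-northeast-1": 8, "us-east-1": 160, "eu-west-1": 220},
}
HEALTHY = {"ap-northeast-1": True, "us-east-1": True, "eu-west-1": False}
REGION_IP = {
    "ap-northeast-1": "52.198.0.10",
    "us-east-1": "54.90.0.10",
    "eu-west-1": "34.240.0.10",
}

def resolve(resolver):
    candidates = [(ms, region)
                  for region, ms in RESOLVER_LATENCY_MS[resolver].items()
                  if HEALTHY[region]]
    _, best = min(candidates)  # lowest latency among healthy regions
    return REGION_IP[best]

print(resolve("tokyo-resolver"))  # 52.198.0.10 (Tokyo region IP)
```

Real GSLB systems add weights, capacity, and policy on top, but the core loop is the same: filter unhealthy regions, then rank the rest.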
Key Caveat: Resolver Location ≠ User Location
DNS queries come from resolvers, not users. A user might use a public resolver (such as 8.8.8.8 or 1.1.1.1) or a corporate resolver hosted far from their actual location, so routing on the resolver's IP alone can pick the wrong region.
EDNS Client Subnet (ECS):
ECS solves the resolver location problem by including client subnet in DNS queries:
DNS Query (without ECS):
"What is app.example.com?" (from resolver 8.8.8.8)
DNS Query (with ECS):
"What is app.example.com?"
"Asking on behalf of client in 203.0.113.0/24"
The authoritative DNS can route based on actual user location, not resolver location.
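The difference ECS makes can be sketched with a toy lookup. The subnet-to-region mapping below is invented:

```python
# Toy sketch of the ECS idea: route on the client subnet when present,
# falling back to the resolver's address. The mapping is invented.
import ipaddress

SUBNET_TO_REGION = {
    "203.0.113.0/24": "ap-northeast-1",  # example client network (Tokyo)
    "8.8.8.0/24": "us-east-1",           # a resolver hosted in the US
}

def route(resolver_ip, ecs_subnet=None):
    target = (ipaddress.ip_network(ecs_subnet) if ecs_subnet
              else ipaddress.ip_network(f"{resolver_ip}/24", strict=False))
    for net, region in SUBNET_TO_REGION.items():
        if target.subnet_of(ipaddress.ip_network(net)):
            return region
    return "us-east-1"  # default region

print(route("8.8.8.8"))                    # us-east-1 (resolver location)
print(route("8.8.8.8", "203.0.113.0/24"))  # ap-northeast-1 (actual user)
```

Without ECS, the Tokyo user behind a US-hosted resolver gets routed to the US; with ECS, the authoritative server sees the client subnet and routes correctly.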
DNS-Based GSLB Configuration (AWS Route 53 Example):
{
"HealthCheck": {
"Type": "HTTPS",
"FullyQualifiedDomainName": "app.us-east-1.example.com"
}
}
{
"RecordSet": {
"Name": "app.example.com",
"Type": "A",
"SetIdentifier": "us-east-1",
"Region": "us-east-1",
"HealthCheckId": "abc123",
"AliasTarget": {
"DNSName": "alb-us-east-1.elb.amazonaws.com"
}
}
}
// Similar records for us-west-2, eu-west-1, ap-northeast-1...
Routing Policies:
| Policy | Description | Use Case |
|---|---|---|
| Geolocation | Route based on user location | Compliance, localization |
| Latency | Route to lowest-latency region | Performance |
| Weighted | Distribute by percentage | Gradual migration |
| Failover | Primary-secondary | Disaster recovery |
| Geoproximity | Route to nearest with bias | Balance latency/capacity |
DNS responses are cached based on TTL (Time To Live). If TTL is 300 seconds (5 minutes), failover takes up to 5 minutes as caches expire. Lower TTLs mean faster failover but more DNS queries. Common compromise: 60 seconds for critical services.
Anycast is a network addressing and routing method where the same IP address is announced from multiple locations. Routers automatically direct traffic to the nearest (in network terms) location.
How Anycast Works:
Data Centers:
Tokyo DC announces: 203.0.113.0/24
London DC announces: 203.0.113.0/24
NYC DC announces: 203.0.113.0/24
(All the same IP range)
Routing:
User in Tokyo → Internet → Route to Tokyo DC (closest)
User in Paris → Internet → Route to London DC (closest)
User in Brazil → Internet → Route to NYC DC (closest)
┌──────────────────────────────────────────┐
│ Internet (BGP) │
│ │
│ User Tokyo ─────────► Tokyo DC │
│ (shortest AS path) │
│ │
│ User Paris ─────────► London DC │
│ (shortest AS path) │
│ │
│ User Brazil ────────► NYC DC │
│ (shortest AS path) │
└──────────────────────────────────────────┘
Anycast vs. DNS-Based:
| Aspect | DNS-Based | Anycast |
|---|---|---|
| Routing granularity | Per DNS resolver | Per packet |
| Failover speed | TTL-dependent (seconds to minutes) | Seconds (BGP convergence) |
| Connection affinity | Good (same IP for connection) | Variable (routing changes mid-connection possible) |
| Caching impact | DNS caches delay changes (TTL) | None |
| Implementation | DNS configuration | BGP/network configuration |
| Best for | HTTP/HTTPS (application layer) | DNS, UDP, stateless services |
Anycast and TCP Connections:
Anycast works best with stateless protocols because routing changes can occur mid-connection:
1. User starts TCP connection to 203.0.113.1 (routed to NYC)
2. During connection, BGP route changes
3. Next packets are routed to London
4. London doesn't have connection state → Connection fails
Mitigations:
- BGP routes are typically stable for minutes to hours, so most connections complete before a route change
- Keep connections short-lived, or terminate TCP at the edge PoP and proxy to the origin
- Share or synchronize connection state between nearby PoPs (complex, rarely worth the cost)
Where Anycast is Used:
- DNS infrastructure, including root servers and public resolvers (8.8.8.8, 1.1.1.1)
- CDNs and DDoS mitigation services
- Global HTTP(S) entry points operated by large providers
Cloudflare's Architecture:
Cloudflare announces the same IPs from 300+ data centers worldwide.
User in Tokyo:
1. DNS for example.com → Returns Cloudflare anycast IP
2. TCP connection → Anycast routes to nearest Cloudflare PoP
3. Cloudflare PoP proxies to origin (if needed)
Implementing anycast requires announcing BGP routes, which typically requires having your own AS number and IP space, or using a provider (cloud, CDN) that offers anycast. It's not something you can do with standard hosting.
Global load balancing integrates with broader multi-region architecture. Several patterns exist, each with distinct tradeoffs.
Pattern 1: Active-Passive (Primary-Secondary)
Primary (US-East)
┌─────────────────┐
All traffic ─►│ Full Application │
│ Full Database │
└─────────────────┘
│
│ Replication
▼
Secondary (EU-West)
┌─────────────────┐
On failover ─►│ Full Application │
│ DB Replica │
└─────────────────┘
Characteristics:
- Low complexity and cost; the secondary sits mostly idle
- RTO measured in minutes (failover detection, DNS change, replica promotion)
- All users pay the latency to the primary region
Pattern 2: Active-Active (All Regions Serve Traffic)
US-East EU-West
┌─────────────────┐ ┌─────────────────┐
US Users ─►│ Application │ │ Application │◄─ EU Users
│ Database (Primary)│◄──────►│ Database (Replica)│
└─────────────────┘ └─────────────────┘
Characteristics:
- Every user is served from a nearby region, minimizing latency
- Failover in seconds, since all regions already carry live traffic
- High complexity and cost: cross-region replication and write-conflict handling
Pattern 3: Follow-the-Sun
Timeframe Active Region
00:00-08:00 UTC Asia-Pacific
08:00-16:00 UTC Europe
16:00-24:00 UTC Americas
GSLB routes based on time of day + user location.
Characteristics:
- Capacity concentrates where users are awake, reducing idle cost
- Medium complexity; the traffic handoff between regions must be rehearsed
- RTO varies depending on which region is active when a failure occurs
Pattern 4: Data Sovereignty (Geo-Sharded)
EU Users → EU Region (EU data only)
US Users → US Region (US data only)
APAC Users → APAC Region (APAC data only)
No cross-region data replication (compliance requirement).
Characteristics:
- Satisfies data-residency requirements by design
- No cross-region failover for user data, since data cannot leave its region
- Medium complexity; each user is pinned to their home region
Pattern 5: CDN with Origin Shield
Edge PoPs (100s worldwide)
│
▼
Origin Shield (1-2 per region)
│
▼
Origin (Single or Multi-Region)
GSLB routes to CDN edge, CDN handles global distribution.
Characteristics:
- Low complexity for the application team; the CDN handles global routing
- The origin shield collapses many edge cache misses into few origin requests
- Best for content-heavy, cache-friendly workloads
| Pattern | Complexity | Cost | RTO | Best For |
|---|---|---|---|---|
| Active-Passive | Low | Low | Minutes | Basic DR, cost-sensitive |
| Active-Active | High | High | Seconds | Low latency, high availability |
| Follow-the-Sun | Medium | Medium | Varies | Regional user bases |
| Geo-Sharded | Medium | Medium | N/A | Compliance requirements |
| CDN + Origin | Low | Varies | N/A | Content-heavy applications |
Global load balancing enables automatic failover when regions fail. However, failover involves multiple components that must work together.
Failover Detection:
GSLB systems detect failures through health checks:
1. GSLB health check to US-East fails
2. Health check fails 3 consecutive times (30 seconds)
3. US-East marked unhealthy
4. DNS queries now return EU-West only
5. DNS TTL expires on clients (60 seconds)
6. Clients reconnect to EU-West
Total failover time: ~90 seconds (30s detection + 60s TTL)
Reducing Failover Time:
| Component | Optimization | Tradeoff |
|---|---|---|
| Health check interval | Reduce 10s → 5s | More health check traffic |
| Unhealthy threshold | Reduce 3 → 2 | More false positives |
| DNS TTL | Reduce 300s → 60s | More DNS queries (cost) |
| Connection timeout | Reduce on clients | Faster retry to new region |
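The detection-plus-TTL arithmetic from the sequence above can be expressed as a quick calculator; all numbers are illustrative:

```python
# Worst-case time until clients reach the healthy region after a failure:
# (health-check interval x unhealthy threshold) + DNS TTL.
def failover_seconds(check_interval_s, unhealthy_threshold, dns_ttl_s):
    """Worst-case DNS-based failover time in seconds."""
    detection = check_interval_s * unhealthy_threshold
    return detection + dns_ttl_s

print(failover_seconds(10, 3, 60))  # 90 s, matching the sequence above
print(failover_seconds(5, 2, 30))   # 40 s with tighter (costlier) settings
```

The second call shows the effect of the optimizations in the table: halving the check interval, threshold, and TTL more than halves worst-case failover time, at the cost of more health-check and DNS traffic.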
Failover Types:
1. Automated Failover: GSLB automatically reroutes when health checks fail.
2. Manual Failover: Operators explicitly trigger failover.
3. Hybrid (Automated with Safeguards):
If (US-East unhealthy) AND (EU-West healthy) AND (not in maintenance window):
Failover automatically
Else:
Page on-call for manual decision
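The hybrid policy above can be written as a plain function; the inputs are booleans that would come from health checks and a maintenance calendar (all names here are invented):

```python
# The hybrid failover policy: automate the safe case, page a human
# for anything ambiguous.
def failover_decision(us_east_healthy, eu_west_healthy, in_maintenance):
    if not us_east_healthy and eu_west_healthy and not in_maintenance:
        return "failover automatically"
    return "page on-call for manual decision"

print(failover_decision(False, True, False))  # failover automatically
print(failover_decision(False, True, True))   # page on-call for manual decision
print(failover_decision(False, False, False)) # page on-call for manual decision
```

Note the third case: if both regions look unhealthy, automation should never "fail over" blindly; that pattern usually indicates a monitoring problem, not two simultaneous outages.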
Data Consistency During Failover:
The hardest part of failover is data:
Scenario:
User writes data to US-East
US-East fails before replication to EU-West
User is failed over to EU-West
User's recent data is missing!
Mitigation Strategies:
Synchronous Replication: Write confirmed only when replicated
Asynchronous with Replay: Accept some data lag, handle on failover
Multi-Primary Databases: Write to any region
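The synchronous-replication strategy can be sketched minimally: a write is acknowledged only after the remote region confirms it. All classes and names below are invented for illustration:

```python
# Minimal sketch of synchronous replication across two regions.
class Region:
    def __init__(self, name):
        self.name, self.store = name, {}

    def apply(self, key, value):
        self.store[key] = value
        return True  # ack to the caller

def write_sync(primary, replica, key, value):
    primary.apply(key, value)
    if not replica.apply(key, value):  # block until the replica acks
        raise RuntimeError("replication failed; reject the write")
    return "committed"

us, eu = Region("us-east-1"), Region("eu-west-1")
print(write_sync(us, eu, "user:42", "Ada"))  # committed
# If us-east-1 now fails, eu-west-1 already has the write:
print(eu.store["user:42"])  # Ada
```

The tradeoff is latency: every write pays a cross-region round trip (the 70-200 ms from the table at the top of this page), which is why many systems accept asynchronous replication plus replay instead.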
Failback After Recovery:
1. US-East recovers
2. Verify US-East is fully healthy (data sync complete)
3. Gradually shift traffic back (10% → 50% → 100%)
4. Monitor for issues
5. Return to normal state
DO NOT:
- Immediately failback (may oscillate)
- Failback without verifying data sync
- Failback during business peak
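The gradual shift in step 3 can be sketched as weighted routing moved through stages; region names and percentages follow the failback plan above, and the seed just makes the demo reproducible:

```python
# Weighted coin-flip router: shift the us-east-1 weight in stages.
import random

def pick_region(weight_us_east):
    """Route a single request by weight."""
    return "us-east-1" if random.random() < weight_us_east else "eu-west-1"

random.seed(42)
for stage in (0.10, 0.50, 1.00):
    hits = sum(pick_region(stage) == "us-east-1" for _ in range(10_000))
    print(f"weight {stage:.0%}: {hits / 100:.1f}% of requests to us-east-1")
```

In practice this weight lives in the GSLB configuration (e.g., a weighted routing policy), and each stage is held long enough to watch error rates and latency before advancing.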
The only way to know if failover works is to test it. Regularly (quarterly or monthly) perform failover drills. Chaos engineering (deliberately causing failures) reveals gaps before real outages find them. If you can't test it in production, test in staging with production-like data.
Global load balancing is not just about routing to the closest region—it's about optimizing the entire user experience.
Technique 1: Latency-Based Routing
Rather than geographic proximity, route based on actual measured latency:
AWS Route 53 Latency Routing:
Continuously measures latency from resolver locations to regions
Returns IP of region with lowest measured latency
Result: User in São Paulo might be routed to US-East (lower latency)
instead of EU-West (geographically closer but higher latency)
Technique 2: Edge/CDN Integration
Request Path with CDN:
User → CDN Edge (nearest PoP)
│
▼ cached? ──yes──► Return from cache
│
no
│
▼
Origin Shield (regional cache)
│
▼ cached? ──yes──► Return from cache
│
no
│
▼
Origin (Application server)
CDN reduces load on origin and improves latency for cacheable content.
Technique 3: Connection Optimization
TLS Session Resumption: reuse keys from a previous session to skip the full handshake on reconnect, saving a round trip.
0-RTT TLS (TLS 1.3): returning clients send application data in the first flight, eliminating handshake latency entirely (with replay-attack caveats for non-idempotent requests).
TCP Fast Open: carry data in the SYN packet to save a round trip on connection setup (limited by middlebox support).
Technique 4: Protocol Optimization
| Protocol | Benefit |
|---|---|
| HTTP/2 | Multiplexing (reduce connections), header compression |
| HTTP/3 (QUIC) | No head-of-line blocking, 0-RTT, faster handshake |
| gRPC | Efficient binary protocol, multiplexed |
| WebSocket | Persistent connection (reduce connection setup) |
Technique 5: Regional Pre-fetching
<!-- Hint browser to establish connection early -->
<link rel="preconnect" href="https://api.example.com">
<link rel="dns-prefetch" href="https://cdn.example.com">
Browser establishes connections before they're needed, hiding latency from user perception.
Use Real User Monitoring (RUM) to measure actual latency experienced by users. Server-side metrics miss network and rendering time. Tools like Google Analytics, Datadog RUM, or New Relic provide user-centric performance data.
Major cloud providers offer managed GSLB services with varying capabilities.
AWS Global Accelerator + Route 53:
Global Accelerator
(Anycast IPs)
│
┌────────────┼────────────┐
▼ ▼ ▼
Edge PoP Edge PoP Edge PoP
(Americas) (Europe) (Asia)
│ │ │
│ AWS Backbone │
└────────────┬────────────┘
│
┌────────────┼────────────┐
▼ ▼ ▼
Region 1 Region 2 Region 3
(ALB/NLB) (ALB/NLB) (ALB/NLB)
Features:
- Two static anycast IPs as a fixed, global entry point
- Traffic enters the AWS backbone at the nearest edge PoP instead of traversing the public internet
- Health-check-driven failover in seconds, independent of DNS TTLs
Google Cloud Global Load Balancer:
Single anycast IP serves entire globe:
Client → Google Edge → Google Network → Backend (any region)
Backend selection based on:
- Geographic proximity
- Backend capacity
- Health status
Features:
- Single global anycast IP; no per-region DNS records to manage
- Automatic overflow to the next-closest region when a backend reaches capacity
- Integrates with Google's CDN and edge security services
| Feature | AWS (GA + R53) | GCP (Global LB) | Azure (Traffic Manager) |
|---|---|---|---|
| Routing Method | Anycast + DNS | Anycast | DNS |
| Static IP | Yes (GA) | Yes | No (DNS-based) |
| Failover Speed | ~30s | Seconds | TTL-dependent |
| Protocol Support | TCP/UDP (GA), HTTP (ALB) | HTTP(S), TCP, UDP | HTTP, TCP |
| CDN Integration | CloudFront separate | Cloud CDN integrated | Azure CDN separate |
| Health Checks | Multi-layer | Multi-layer | Multi-layer |
| Pricing Model | Per accelerator + data | Per forwarding rule + data | Per DNS query + endpoint |
CDN Providers:
CDN providers offer sophisticated GSLB as a core feature:
| Provider | Approach | Edge PoPs |
|---|---|---|
| Cloudflare | Anycast | 300+ |
| Akamai | DNS + Anycast | 4000+ |
| Fastly | Anycast | 70+ |
| AWS CloudFront | DNS | 450+ |
Configuration Example: Cloudflare Load Balancer
{
"name": "example-lb",
"default_pools": ["pool-us-east", "pool-eu-west"],
"fallback_pool": "pool-us-east",
"region_pools": {
"WNAM": ["pool-us-west"],
"ENAM": ["pool-us-east"],
"WEU": ["pool-eu-west"],
"EEU": ["pool-eu-west"],
"OC": ["pool-ap-syd"]
},
"steering_policy": "geo"
}
Cloud-native GSLB solutions work best with same-cloud resources. For multi-cloud or hybrid, consider CDN-based GSLB (Cloudflare, Akamai) or self-managed DNS (NS1, Dyn) that can route to any backend regardless of cloud provider.
We've comprehensively explored global load balancing, from DNS-based routing to BGP anycast, and from multi-region architecture patterns to cloud provider solutions.
Module Complete:
Congratulations! You've completed the Load Balancing module. You now understand: how DNS-based global load balancing and BGP anycast route users to the best region, the tradeoffs between the major multi-region architecture patterns, failover and disaster recovery strategies, and the performance techniques and managed services that tie it all together.
This knowledge equips you to design, implement, and operate load balancing at any scale, from a simple web application to a globally distributed platform serving millions of users. Apply these concepts to build resilient, performant, and scalable network architectures.