When a user in Tokyo requests content from your application, where should that request go? To a server in California, 8,500 kilometers away with 150ms of round-trip latency? Or to a nearby server in Singapore, cutting the round trip to roughly 70ms? When your European data center experiences an outage at 3 AM, how do you seamlessly redirect millions of users to healthy infrastructure without manual intervention?
These questions define the domain of Global Server Load Balancing (GSLB)—a critical architectural discipline that extends load balancing from a single data center to a worldwide, geographically distributed infrastructure. GSLB represents the pinnacle of traffic management sophistication, combining network engineering, DNS infrastructure, health monitoring, and intelligent routing policies to deliver seamless global user experiences.
By the end of this page, you will have mastered: the fundamental concepts and architectural patterns of GSLB; how GSLB differs from traditional load balancing; the critical role of DNS in global traffic distribution; health-based routing and disaster recovery patterns; latency optimization and geographic affinity strategies; and real-world implementation approaches used by hyperscale internet companies.
Traditional load balancers operate within a single data center or availability zone, distributing requests across a pool of servers connected to the same network fabric. While essential, this approach has fundamental limitations that become critical as organizations scale globally:
The Single-Region Problem:
Consider an e-commerce platform with all infrastructure in the US-East region. Users in Asia experience 250-400ms of network latency before the first byte even reaches the application. During peak Asian shopping hours, users experience degraded performance precisely when engagement matters most. A regional power outage or network partition renders the entire service unavailable worldwide—the dreaded single point of failure at continental scale.
| User Location | Server Location | Typical RTT | User Experience Impact |
|---|---|---|---|
| New York | US-East (Virginia) | ~20ms | Excellent: imperceptible delay |
| London | US-East (Virginia) | ~80ms | Good: minor delay noticeable |
| Tokyo | US-East (Virginia) | ~180ms | Poor: visible loading delays |
| Sydney | US-East (Virginia) | ~250ms | Degraded: frustrating experience |
| Mumbai | US-East (Virginia) | ~220ms | Degraded: high bounce rates |
The Business Imperative:
Research consistently demonstrates that latency directly impacts business metrics. Amazon famously reported that every 100ms of latency costs 1% in sales. Google found that a 500ms delay in search results caused a 20% drop in traffic. For global businesses, achieving sub-100ms response times for users worldwide isn't just a technical goal—it's a business necessity.
What GSLB Solves:
Global Server Load Balancing addresses these challenges by intelligently routing user requests to the optimal data center based on multiple factors: geographic proximity for latency minimization, server health for availability, capacity for load distribution, and business policies for regulatory compliance or cost optimization.
While traditional load balancing asks 'which server in this data center should handle this request?', GSLB asks 'which data center on the planet should handle this request?' This elevation in scope requires fundamentally different mechanisms, primarily DNS-based routing rather than network-layer packet manipulation.
Understanding GSLB requires grasping how it leverages DNS infrastructure to make routing decisions at a global scale. Unlike traditional load balancers that operate at the network or transport layer, GSLB primarily functions at the application layer through intelligent DNS resolution.
The DNS-Based Approach:
When a user requests api.example.com, their device queries the DNS system for the IP address of that hostname. In a GSLB-enabled architecture, this DNS resolution becomes an intelligent routing decision point. The GSLB system responds with the IP address of the data center deemed optimal for that particular user at that particular moment.
Core Components of a GSLB System:
A GSLB deployment combines four pieces that recur throughout the resolution flow below: an authoritative DNS nameserver that answers queries for your hostnames, a geographic (GeoIP) database for locating clients, a health monitoring subsystem that tracks the state of every data center, and a policy engine that ranks candidates according to the configured routing rules.
The Resolution Flow:
Query Reception: User's recursive resolver sends a DNS query for your hostname to the authoritative GSLB nameserver.
User Identification: GSLB extracts the client's IP address (or EDNS Client Subnet if available) and queries the geographic database to determine approximate location.
Health Assessment: GSLB checks the current health status of all candidate data centers, eliminating any that are unhealthy or at capacity.
Policy Evaluation: The policy engine applies configured rules—geographic proximity, latency measurements, capacity weights, cost optimizations—to rank available data centers.
Response Generation: GSLB returns the IP address(es) of the selected data center(s), typically with a relatively short TTL (30-300 seconds) to enable rapid traffic redistribution.
Connection Establishment: The user's application connects directly to the selected data center, bypassing the GSLB for actual data transfer.
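To make the flow concrete, here is a minimal sketch of the decision a GSLB nameserver makes for each query. Everything in it (the `DATACENTERS` table, `geo_lookup`, the health map) is a hypothetical stand-in, not the API of any particular product:

```python
# Minimal sketch of the per-query GSLB decision. All names and data are
# illustrative placeholders, not a real product's API.
DATACENTERS = {
    "tokyo-dc":    {"ip": "203.0.113.10", "region": "asia-pacific"},
    "london-dc":   {"ip": "203.0.113.20", "region": "europe"},
    "virginia-dc": {"ip": "203.0.113.30", "region": "americas"},
}
HEALTH = {"tokyo-dc": True, "london-dc": True, "virginia-dc": True}

def geo_lookup(client_ip):
    """Map a client IP (or EDNS Client Subnet) to a region; stubbed here."""
    return "asia-pacific"  # a real system consults a GeoIP database

def resolve(client_ip, ttl=60):
    """Return (ip, ttl) for the data center selected for this client."""
    region = geo_lookup(client_ip)                                   # user identification
    healthy = {n: dc for n, dc in DATACENTERS.items() if HEALTH[n]}  # health assessment
    # policy evaluation: prefer the client's region, else any healthy data center
    candidates = [dc for dc in healthy.values() if dc["region"] == region]
    chosen = (candidates or list(healthy.values()))[0]
    return chosen["ip"], ttl                                         # response generation

print(resolve("198.51.100.7"))   # the client then connects to this IP directly
```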
DNS TTL (Time To Live) creates a fundamental tradeoff. Short TTLs (30-60 seconds) enable rapid failover but increase DNS query load and expose users to more frequent resolution latency. Long TTLs (300+ seconds) reduce DNS load but slow failover response and may leave users connecting to failed endpoints. Most GSLB deployments use TTLs of 60-300 seconds, balanced against their failover requirements.
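A quick back-of-the-envelope illustration of this tradeoff, using health-check figures like those in the configuration later on this page (all values are illustrative):

```python
# Worst-case window during which a client with a cached answer may keep
# connecting to a failed data center: detection time plus the DNS TTL
# handed out just before the failure.
check_interval_s = 10      # health-check interval
unhealthy_threshold = 3    # consecutive failures before the DC is pulled
ttl_s = 60                 # TTL on the DNS answer

detection_s = check_interval_s * unhealthy_threshold
print(f"worst-case stale routing: ~{detection_s + ttl_s}s")   # ~90s at TTL=60
```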
The intelligence of a GSLB system lies in its routing policies—the rules that determine which data center receives each user's traffic. Production GSLB deployments typically employ multiple policies in sophisticated combinations, though understanding each policy individually is essential.
Geographic Routing (GeoIP):
The most common GSLB strategy routes users to the physically closest data center based on their IP address geolocation. This approach minimizes network latency for the majority of requests while being computationally simple to implement.
```yaml
# Example GSLB Geographic Routing Policy
gslb_policy:
  name: "geographic-proximity"
  type: "geo-proximity"
  regions:
    - name: "asia-pacific"
      countries: ["JP", "KR", "AU", "SG", "IN", "ID", "PH", "TH", "VN", "MY"]
      target_datacenter: "tokyo-dc"
      fallback: "singapore-dc"
    - name: "europe"
      countries: ["GB", "DE", "FR", "IT", "ES", "NL", "SE", "PL", "CH", "AT"]
      target_datacenter: "london-dc"
      fallback: "frankfurt-dc"
    - name: "americas"
      countries: ["US", "CA", "MX", "BR", "AR", "CO", "CL"]
      target_datacenter: "virginia-dc"
      fallback: "oregon-dc"
  default:
    target_datacenter: "virginia-dc"
  health_check:
    interval: 30s
    timeout: 5s
    unhealthy_threshold: 3
    endpoints:
      - path: "/health"
        expected_status: 200
```
Latency-Based Routing:
Going beyond static geographic mappings, latency-based routing uses actual network performance measurements to make routing decisions. The GSLB system continuously measures round-trip time from its DNS servers (or distributed probes) to each data center and directs users to the lowest-latency option.
This approach handles edge cases that geographic routing misses—a user in South Africa might have lower latency to a London data center than to one in São Paulo, despite geographic proximity suggesting otherwise, due to submarine cable routing and peering arrangements.
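A minimal sketch of the selection step, assuming per-data-center RTT medians have already been collected by probes; the measurements and names below are invented to mirror the South Africa example:

```python
# Hypothetical RTT medians (ms) from a resolver's vantage point in South Africa.
# Geographic "nearest" is not always fastest: cable routes and peering matter.
rtt_ms = {"sao-paulo-dc": 210, "london-dc": 165, "virginia-dc": 190}
healthy = {"sao-paulo-dc", "london-dc", "virginia-dc"}

def lowest_latency(rtt, healthy):
    """Pick the healthy data center with the smallest measured RTT."""
    return min((dc for dc in rtt if dc in healthy), key=rtt.get)

print(lowest_latency(rtt_ms, healthy))   # "london-dc", despite the geography
```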
Weighted Routing:
When data centers have different capacities or cost structures, weighted routing distributes traffic proportionally. A primary data center might receive 70% of traffic, with a secondary handling 30%. This enables gradual migrations, A/B testing of infrastructure, and cost optimization across clouds.
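Weighted selection can be sketched with the standard library; the 70/30 split mirrors the example above:

```python
import random
from collections import Counter

# Configured traffic weights: primary 70%, secondary 30% (illustrative values).
weights = {"virginia-dc": 70, "oregon-dc": 30}

def pick_weighted(weights):
    """Choose a data center with probability proportional to its weight."""
    names = list(weights)
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

# Over many resolutions the observed split converges on the configured ratio.
print(Counter(pick_weighted(weights) for _ in range(10_000)))
```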
| Policy Type | Decision Basis | Best For | Limitations |
|---|---|---|---|
| Geographic | User IP geolocation | Simple global distribution | Assumes proximity = low latency |
| Latency-Based | Measured RTT | Performance optimization | Measurement overhead; unstable in congestion |
| Weighted | Configured ratios | Capacity management, migrations | Doesn't adapt to conditions |
| Health-Based | Endpoint availability | High availability | Binary decision; no optimization |
| Hybrid | Multiple factors combined | Production deployments | Complexity in tuning |
Health-Aware Routing:
Regardless of other policies, health-aware routing is table stakes for production GSLB. Unhealthy data centers are removed from the pool automatically, preventing users from being directed to failed infrastructure. Health checks typically verify basic TCP reachability of the service VIP, HTTP/HTTPS endpoint responses (expected status codes and body content), and deep application health such as database, cache, and dependency connectivity.
Failover Routing:
Active-passive failover configurations maintain a primary data center that receives all traffic under normal conditions, with a standby data center activated only when the primary fails. This pattern is common for regulatory compliance (keeping data in a specific region) or cost optimization (minimizing expensive secondary capacity).
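Failover routing reduces to a priority list: the highest-priority healthy target always wins. A small sketch (the data-center names and priorities are invented for illustration):

```python
# Priority-ordered targets: the primary serves all traffic while healthy,
# the warm standby only during a primary outage.
targets = [
    {"name": "frankfurt-dc", "priority": 1},   # primary (e.g. data residency)
    {"name": "dublin-dc",    "priority": 2},   # warm standby
]
health = {"frankfurt-dc": False, "dublin-dc": True}   # simulate a primary outage

def pick_failover(targets, health):
    """Return the highest-priority healthy target, or None if all are down."""
    for t in sorted(targets, key=lambda t: t["priority"]):
        if health.get(t["name"]):
            return t["name"]
    return None

print(pick_failover(targets, health))   # "dublin-dc" until frankfurt recovers
```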
Production GSLB deployments rarely use a single policy in isolation. A typical chain might be: (1) eliminate unhealthy data centers via health checks, (2) select candidates based on geographic affinity, (3) among candidates, choose the lowest-latency option, (4) apply weights for capacity balancing. The policy engine evaluates this chain for every DNS query.
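That chain can be sketched as a small pipeline in which each stage narrows the candidate set; every structure here (the candidate list, RTT table, weights) is a hypothetical input the policy engine would maintain:

```python
import random

datacenters = [
    {"name": "tokyo-dc",     "region": "asia-pacific"},
    {"name": "singapore-dc", "region": "asia-pacific"},
    {"name": "virginia-dc",  "region": "americas"},
]
health  = {"tokyo-dc": True, "singapore-dc": True, "virginia-dc": True}
rtt_ms  = {"tokyo-dc": 35, "singapore-dc": 38, "virginia-dc": 180}
weights = {"tokyo-dc": 60, "singapore-dc": 40, "virginia-dc": 100}

def choose(client_region):
    """Health filter -> geographic affinity -> latency -> weighted tie-break."""
    pool = [dc for dc in datacenters if health[dc["name"]]]                   # 1
    regional = [dc for dc in pool if dc["region"] == client_region] or pool   # 2
    best = min(rtt_ms[dc["name"]] for dc in regional)                         # 3
    fastest = [dc for dc in regional if rtt_ms[dc["name"]] <= best * 1.1]
    names = [dc["name"] for dc in fastest]                                    # 4
    return random.choices(names, weights=[weights[n] for n in names], k=1)[0]

print(choose("asia-pacific"))   # usually tokyo-dc, sometimes singapore-dc
```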
The reliability of a GSLB system depends critically on its health monitoring architecture. Incorrect health assessments can be catastrophic—a false positive (marking healthy as unhealthy) causes unnecessary failovers, while a false negative (marking unhealthy as healthy) sends users to broken infrastructure.
Distributed Health Probing:
Effective health monitoring requires probes distributed globally, not just from the GSLB controller location. A data center might be reachable from the GSLB controller in Frankfurt but unreachable from users in Asia due to a routing problem. Distributed probing provides multiple perspectives on health.
```yaml
# Comprehensive GSLB Health Check Configuration
health_monitoring:
  global_probes:
    locations:
      - region: "us-east"
        endpoints: ["probe-ue1.example.net", "probe-ue2.example.net"]
      - region: "eu-west"
        endpoints: ["probe-ew1.example.net", "probe-ew2.example.net"]
      - region: "ap-northeast"
        endpoints: ["probe-an1.example.net", "probe-an2.example.net"]
    consensus:
      quorum: "majority"  # Or "all" for strict checking

  checks:
    - name: "tcp-vip"
      type: "tcp"
      port: 443
      interval: 10s
      timeout: 3s
      unhealthy_threshold: 3
      healthy_threshold: 2

    - name: "http-healthz"
      type: "http"
      path: "/healthz"
      host: "api.example.com"
      port: 443
      tls: true
      interval: 15s
      timeout: 5s
      unhealthy_threshold: 3
      healthy_threshold: 3
      expected_codes: [200]
      expected_body_contains: "healthy"

    - name: "deep-health"
      type: "http"
      path: "/health/deep"
      port: 443
      tls: true
      interval: 30s
      timeout: 10s
      unhealthy_threshold: 2
      healthy_threshold: 5
      expected_codes: [200]
      # Deep checks verify database, cache, dependencies

  failover:
    cooldown_period: 300s  # 5 minutes before fail-back
    notification:
      - type: "pagerduty"
        severity: "critical"
      - type: "slack"
        channel: "#infrastructure-alerts"
```
Active vs. Passive Health Checks:
Active health checks (probes initiated by the GSLB system) provide predictable, controllable monitoring but add load to target services. Passive health checks (analyzing real user traffic) have no additional load but depend on having sufficient traffic for statistical significance. Hybrid approaches use active checks as the primary mechanism with passive analysis for anomaly detection.
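The passive side can be sketched as an error-rate signal derived from real traffic, only trusted once there are enough samples; the thresholds below are arbitrary illustrations:

```python
def passive_health(samples, min_samples=100, max_error_rate=0.05):
    """samples: (status_code, latency_ms) tuples from recent real requests."""
    if len(samples) < min_samples:
        return "insufficient-data"     # too little traffic: rely on active checks
    errors = sum(1 for status, _ in samples if status >= 500)
    return "suspect" if errors / len(samples) > max_error_rate else "ok"

recent = [(200, 42)] * 950 + [(503, 900)] * 80
print(passive_health(recent))   # "suspect": a cue for deeper active probing
```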
The Flapping Problem:
Intermittent failures can cause rapid oscillation between data centers, a phenomenon called flapping. Users are routed to DC1 until it fails, so subsequent users go to DC2; DC1 then recovers, traffic shifts back, and it fails again. This cycle creates an inconsistent user experience and strains the infrastructure.
Mitigation strategies include hysteresis (requiring longer healthy periods before recovery than unhealthy periods before failover), dampening (limiting failover frequency), and graduated recovery (slowly shifting traffic back to a recovered data center rather than instant failback).
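A sketch of hysteresis plus dampening: removal needs fewer consecutive bad checks than restoration needs good ones, and a cooldown blocks an immediate fail-back. The thresholds echo the earlier configuration but are otherwise illustrative:

```python
class EndpointState:
    """Asymmetric thresholds: quick to fail over, slow and dampened to fail back."""
    UNHEALTHY_THRESHOLD = 3    # consecutive failures before removal
    HEALTHY_THRESHOLD = 5      # consecutive successes before restoration
    COOLDOWN_CHECKS = 20       # dampening period after a failover

    def __init__(self):
        self.healthy, self.fail_streak, self.ok_streak, self.cooldown = True, 0, 0, 0

    def observe(self, check_passed):
        if self.cooldown > 0:
            self.cooldown -= 1
        if check_passed:
            self.ok_streak, self.fail_streak = self.ok_streak + 1, 0
            if (not self.healthy and self.cooldown == 0
                    and self.ok_streak >= self.HEALTHY_THRESHOLD):
                self.healthy = True   # graduated recovery could ramp weight instead
        else:
            self.fail_streak, self.ok_streak = self.fail_streak + 1, 0
            if self.healthy and self.fail_streak >= self.UNHEALTHY_THRESHOLD:
                self.healthy, self.cooldown = False, self.COOLDOWN_CHECKS
        return self.healthy

state = EndpointState()
for result in [False, False, False] + [True] * 30:
    state.observe(result)
print(state.healthy)   # restored only after the cooldown and a healthy streak
```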
When one data center fails, GSLB redirects its traffic to remaining data centers. If those data centers are already near capacity, this surge can trigger a cascade failure—overload causes the second DC to fail, pushing all traffic to the third, which also fails. Capacity planning must account for N-1 or N-2 scenarios where traffic is redistributed during failures.
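The arithmetic behind this is worth writing down. A simple N-1 check, with made-up traffic and capacity figures, verifies that the survivors can absorb any single data center's load:

```python
# Peak traffic (requests/s) and provisioned capacity per data center (illustrative).
traffic  = {"tokyo-dc": 40_000, "london-dc": 35_000, "virginia-dc": 50_000}
capacity = {"tokyo-dc": 70_000, "london-dc": 65_000, "virginia-dc": 90_000}

def survives_n_minus_1(traffic, capacity):
    """Can the remaining DCs absorb any one DC's traffic, split evenly?"""
    for failed in traffic:
        survivors = [dc for dc in traffic if dc != failed]
        extra = traffic[failed] / len(survivors)     # simplistic even redistribution
        if any(traffic[dc] + extra > capacity[dc] for dc in survivors):
            return False
    return True

print(survives_n_minus_1(traffic, capacity))   # True only if headroom is adequate
```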
Understanding GSLB from a theoretical perspective is essential, but examining how major organizations implement global traffic distribution provides critical practical insights. Let's analyze common architectural patterns found in production deployments.
Pattern 1: Active-Active Multi-Region
The gold standard for global services, active-active runs fully independent application stacks in multiple regions, each serving traffic continuously. Users are routed to the nearest healthy region, and all regions operate at similar utilization levels.
Pattern 2: Active-Passive with Geographic Affinity
For applications with strict data residency requirements or where active-active complexity isn't justified, active-passive maintains warm standby regions that only receive traffic during primary region failures. Traffic is normally restricted to a single region, with GSLB only redirecting during outages.
Pattern 3: Anycast + Regional Load Balancing
Many CDNs and hyperscale services combine Anycast (explored in a later page) with traditional GSLB. Anycast provides initial geographic routing at the network layer, while application-layer GSLB handles more sophisticated decisions. This hybrid approach is common at companies like Cloudflare, Fastly, and AWS CloudFront.
Pattern 4: Multi-Cloud GSLB
Organizations operating across multiple cloud providers use GSLB to abstract cloud-specific infrastructure. Users are routed to AWS, GCP, or Azure based on regions, pricing, or provider-specific outages. This pattern is increasingly common for resilience against cloud provider failures.
| Pattern | Complexity | Resilience | Data Consistency | Cost |
|---|---|---|---|---|
| Active-Active Multi-Region | Very High | Excellent | Challenging (conflicts) | High (full infra everywhere) |
| Active-Passive | Medium | Good | Simple (single primary) | Medium (idle standby) |
| Anycast + Regional LB | High | Excellent | Regional scope | High (anycast infra) |
| Multi-Cloud GSLB | Very High | Maximum | Cloud-dependent | Variable (multi-cloud overhead) |
Most organizations shouldn't start with active-active multi-region. Begin with a single region, add a passive DR region, then evolve to active-active as traffic and resilience requirements grow. Each step adds significant operational complexity that must be matched by organizational capability.
Implementing GSLB requires selecting from a range of technologies, from managed cloud services to self-operated infrastructure. Understanding the landscape helps in making appropriate architectural decisions.
Cloud Provider GSLB Services:
| Provider | Service Name | Key Features | Integration |
|---|---|---|---|
| AWS | Route 53 | Latency, geo, weighted, failover routing; health checks; alias records | Deep AWS integration; works with ALB/NLB/CloudFront |
| Google Cloud | Cloud DNS + Traffic Director | Geo routing; cross-region load balancing; anycast VIPs | Native GCP integration; global HTTP(S) LB |
| Azure | Traffic Manager + Front Door | Performance, geographic, priority, weighted routing | Azure integration; Front Door for edge caching |
| Cloudflare | Load Balancing | Geo steering, health checks, proximity, random; edge compute | CDN integration; Workers for programmable routing |
Self-Managed GSLB Options:
For organizations requiring full control, GSLB can be self-managed by running authoritative DNS servers extended with custom health-check and geolocation logic, at the cost of operating that DNS infrastructure reliably.
Hybrid Approaches:
Many organizations use cloud DNS as the authoritative layer (for reliability and DDoS protection) while implementing custom logic in the application layer. For example, using Route 53 for basic geographic routing, but having the application redirect users to alternate regions based on real-time conditions observed at the application layer.
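A sketch of the application-layer half of that hybrid: DNS places the user in a region, but the application can still redirect based on conditions it observes in real time. The endpoint names and the error-rate threshold below are invented for illustration:

```python
# Hypothetical per-region endpoints; DNS (e.g. basic geo routing) chose one,
# but the application may still bounce the client elsewhere.
REGION_ENDPOINTS = {
    "us-east": "https://us-east.api.example.com",
    "eu-west": "https://eu-west.api.example.com",
}

def maybe_redirect(local_region, local_error_rate, candidate_regions):
    """Return an alternate region's URL if local conditions look unhealthy."""
    if local_error_rate <= 0.10:           # arbitrary illustrative threshold
        return None                        # keep serving locally
    for region in candidate_regions:
        if region != local_region and region in REGION_ENDPOINTS:
            return REGION_ENDPOINTS[region]
    return None

print(maybe_redirect("us-east", 0.25, ["us-east", "eu-west"]))
```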
GSLB makes your DNS infrastructure a critical dependency. Provider-managed DNS (Route 53, Cloud DNS, Cloudflare) typically offers 100% SLA with globally distributed infrastructure. Self-managed DNS requires significant investment in reliability engineering. Most organizations should use managed DNS services and focus engineering effort on application-level resilience.
Global Server Load Balancing represents a fundamental capability for any organization serving users worldwide. To consolidate the core concepts we've explored: GSLB elevates load balancing from the data center to the planet by turning DNS resolution into a routing decision point; routing policies (geographic, latency-based, weighted, health-aware, and failover) are combined into evaluation chains in production; distributed health monitoring with hysteresis and dampening keeps failover fast without flapping; and deployment patterns range from active-passive disaster recovery through active-active multi-region to anycast hybrids and multi-cloud.
What's Next:
With a solid understanding of GSLB fundamentals, we'll next explore DNS-Based Load Balancing in greater depth—examining how DNS mechanics impact routing decisions, the role of recursive resolvers, EDNS Client Subnet for improved accuracy, and advanced DNS patterns for traffic management.
You've mastered Global Server Load Balancing—the architectural discipline enabling intelligent traffic distribution across worldwide infrastructure. You understand the DNS-based approach, routing policies, health monitoring architectures, and real-world patterns. Next, we'll dive deeper into DNS-based load balancing mechanics.