While NGINX and HAProxy were born in the era of monolithic applications and physical servers, Envoy emerged from a fundamentally different context: the cloud-native, microservices-first world of modern distributed systems. Created by Matt Klein at Lyft in 2016 and open-sourced shortly after, Envoy was designed specifically to address the challenges of dynamic, containerized, ephemeral infrastructure.
Envoy's creation was motivated by a critical observation: traditional proxies were designed for static configuration files and long-lived servers. In a Kubernetes world where pods spin up and down constantly, where services auto-scale in response to load, and where deployments happen dozens of times per day, the reload-based configuration model becomes a significant operational burden.
Today, Envoy serves as the foundation for major service mesh implementations including Istio, AWS App Mesh, Google Cloud Traffic Director, and the Consul Connect data plane. Its adoption spans organizations from startups to hyperscalers, making it arguably the most important networking software of the cloud-native era.
By completing this page, you will understand Envoy's architecture and threading model, master its xDS dynamic configuration APIs, comprehend its observability and debugging capabilities, and recognize optimal deployment patterns including sidecar proxies and edge deployments.
Envoy's architecture diverges from traditional proxies in several fundamental ways, all driven by the requirements of cloud-native infrastructure.
Core Architectural Principles:
Threading Model:
Envoy employs a multi-threaded model with careful attention to thread-local storage (TLS) to minimize lock contention:
When configuration updates arrive via xDS, the main thread atomically swaps shared pointers in each worker's TLS. Workers pick up changes on their next event loop iteration without any locking on the hot path.
Connection Draining:
Unlike NGINX, which typically needs external orchestration for fully graceful shutdowns, and HAProxy, which handles draining well at the process level, Envoy builds sophisticated connection draining directly into its configuration model. When a configuration update removes a cluster or listener, the affected listeners drain gracefully: existing connections are allowed to complete while new connections are refused.
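A rough sketch of the knobs involved (a fragment only, assuming the rest of the listener and its HTTP connection manager are defined as usual):

```yaml
# Fragment: drain-related settings at the listener level (rest of the listener elided)
listeners:
  - name: ingress
    # DEFAULT drains on listener modification/removal, hot restart, and
    # health-check failure; MODIFY_ONLY skips the health-check case.
    drain_type: DEFAULT
    # Inside the HTTP connection manager, drain_timeout bounds the graceful
    # drain window for in-flight HTTP connections, e.g.:
    #   drain_timeout: 30s
```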
Envoy's 'hot restart' capability enables binary upgrades without dropping connections. The new process starts, inherits listening sockets from the old process, and takes over new connections while the old process drains existing ones. This enables true zero-downtime upgrades in production.
Envoy's xDS (x Discovery Services) APIs represent a paradigm shift in proxy configuration. Rather than static configuration files that require reloads, xDS enables a control plane to stream configuration updates to Envoy instances in real-time.
The xDS API Family:
| API | Purpose | Configuration Managed |
|---|---|---|
| LDS (Listener DS) | Configure listeners | Ports, protocols, TLS settings, filter chains |
| RDS (Route DS) | Configure routing | Virtual hosts, routes, traffic policies |
| CDS (Cluster DS) | Configure clusters | Backend groups, load balancing, health checks |
| EDS (Endpoint DS) | Configure endpoints | Individual backend addresses and weights |
| SDS (Secret DS) | Configure secrets | TLS certificates, keys, trusted CAs |
| ECDS (Extension Config DS) | Configure extensions | Dynamic Wasm filters, custom extensions |
| ADS (Aggregated DS) | Unified stream | All xDS resources over single gRPC stream |
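Most of these APIs appear in the examples throughout this page; to illustrate SDS specifically, here is a hedged sketch of a listener transport socket that pulls its serving certificate from the control plane over the aggregated stream. The secret name `server_cert` is a placeholder known only to the control plane.

```yaml
# Sketch: TLS certificate delivered via SDS instead of a file on disk
# (fragment; attaches to a listener's filter chain)
transport_socket:
  name: envoy.transport_sockets.tls
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.DownstreamTlsContext
    common_tls_context:
      tls_certificate_sds_secret_configs:
        - name: server_cert          # placeholder secret name served by the control plane
          sds_config:
            resource_api_version: V3
            ads: {}                  # fetch the secret over the aggregated xDS stream
```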
Configuration Hierarchy:
Envoy's configuration model follows a logical hierarchy:
```
Listeners (LDS)
└── Filter Chains
    └── HTTP Connection Manager
        └── Routes (RDS)
            └── Virtual Hosts
                └── Routes
                    └── Clusters (CDS)
                        └── Endpoints (EDS)
```
This separation enables independent updates to different configuration layers. For example, when a new service instance starts, only EDS updates are needed—the listener, routes, and cluster configuration remain unchanged.
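As a concrete, minimal sketch of these layers, the following statically defined configuration inlines every level. The names `backend_service` and `backend.example.internal` are placeholders; in a dynamic setup the route configuration and endpoints would be served by RDS and EDS instead of being inlined.

```yaml
# Minimal sketch of the Listener → Filter Chain → HCM → Route → Cluster layers
# (static form; with xDS, route_config and endpoints come from RDS/EDS)
static_resources:
  listeners:
    - name: ingress                          # Listener (LDS layer)
      address:
        socket_address: { address: 0.0.0.0, port_value: 8080 }
      filter_chains:                         # Filter Chain
        - filters:
            - name: envoy.filters.network.http_connection_manager
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
                stat_prefix: ingress_http
                route_config:                # Routes (RDS layer, inlined here)
                  virtual_hosts:
                    - name: default          # Virtual Host
                      domains: ["*"]
                      routes:
                        - match: { prefix: "/" }
                          route: { cluster: backend_service }   # points at a Cluster
                http_filters:
                  - name: envoy.filters.http.router
                    typed_config:
                      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
    - name: backend_service                  # Cluster (CDS layer)
      type: STRICT_DNS
      connect_timeout: 2s
      load_assignment:                       # Endpoints (EDS layer, inlined here)
        cluster_name: backend_service
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address: { address: backend.example.internal, port_value: 8080 }
```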
```yaml
# Envoy bootstrap configuration
# Defines admin interface and how to connect to xDS control plane
admin:
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 9901

# Static resources (bootstrap-time configuration)
static_resources:
  clusters:
    # Control plane cluster for xDS
    - name: xds_cluster
      type: STRICT_DNS
      connect_timeout: 5s
      load_assignment:
        cluster_name: xds_cluster
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: control-plane.example.com
                      port_value: 18000
      typed_extension_protocol_options:
        envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
          "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
          explicit_http_config:
            http2_protocol_options: {}

# Dynamic resources via xDS
dynamic_resources:
  ads_config:
    api_type: GRPC
    transport_api_version: V3
    grpc_services:
      - envoy_grpc:
          cluster_name: xds_cluster
  # Get listeners from LDS
  lds_config:
    resource_api_version: V3
    ads: {}
  # Get clusters from CDS
  cds_config:
    resource_api_version: V3
    ads: {}

# Node identification for control plane
node:
  cluster: my-cluster
  id: my-node-id
  metadata:
    role: sidecar
    namespace: production
    service: api-gateway
```

xDS Protocol Details:
xDS operates over gRPC streaming connections: the control plane pushes configuration resources, and Envoy explicitly acknowledges (ACK) or rejects (NACK) each update. This enables incremental, reload-free configuration changes with clear feedback to the control plane.
Popular xDS control planes include Istio Pilot, Consul Connect, and Gloo Edge, as well as custom control planes built with go-control-plane or similar libraries.

Envoy requires a bootstrap configuration file to start: it configures the admin interface and tells Envoy how to reach the xDS control plane. All other configuration can then be managed dynamically. In Kubernetes, the bootstrap is typically generated by an init container or mutating webhook.
Envoy implements sophisticated load balancing capabilities designed for microservices environments where dozens of upstream services, each with variable instance counts, must be managed dynamically.
| Policy | Behavior | Use Case |
|---|---|---|
| Round Robin | Cycles through endpoints sequentially | Default, general-purpose |
| Least Request | Routes to endpoint with fewest active requests | Variable latency backends |
| Ring Hash | Consistent hashing on configurable keys | Cache affinity, sharding |
| Maglev | Google's consistent hash variant | Large scale, minimal disruption |
| Random | Random selection (weighted) | Simple, high-scale scenarios |
| Original DST | Routes to original destination IP | Transparent proxying |
```yaml
# Cluster with least request balancing
clusters:
  - name: api_cluster
    type: EDS
    connect_timeout: 5s
    lb_policy: LEAST_REQUEST
    least_request_lb_config:
      choice_count: 2  # Power of two random choices

    # Health checking
    health_checks:
      - timeout: 5s
        interval: 10s
        healthy_threshold: 2
        unhealthy_threshold: 3
        http_health_check:
          path: /health
          expected_statuses:
            - start: 200
              end: 299

    # Circuit breaker settings
    circuit_breakers:
      thresholds:
        - priority: DEFAULT
          max_connections: 1000
          max_pending_requests: 1000
          max_requests: 1000
          max_retries: 10

    # Outlier detection (automatic ejection of failing endpoints)
    outlier_detection:
      consecutive_5xx: 5
      interval: 10s
      base_ejection_time: 30s
      max_ejection_percent: 50
      enforcing_consecutive_5xx: 100
      enforcing_success_rate: 100
      success_rate_minimum_hosts: 5
      success_rate_request_volume: 100
      success_rate_stdev_factor: 1900

    # Connection pooling
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        common_http_protocol_options:
          max_connections_per_host: 100
          idle_timeout: 60s
        explicit_http_config:
          http2_protocol_options: {}

  # Ring hash for cache distribution
  - name: cache_cluster
    type: EDS
    connect_timeout: 5s
    lb_policy: RING_HASH
    ring_hash_lb_config:
      minimum_ring_size: 1024
      maximum_ring_size: 8388608
      hash_function: XX_HASH
    # Configure what to hash on
    # (done at route level with hash_policy)
```

Advanced Traffic Management:
Envoy's route configuration enables sophisticated traffic manipulation beyond basic load balancing:
```yaml
# Route configuration with traffic management
routes:
  - match:
      prefix: /api/
    route:
      # Weighted cluster routing (canary deployments)
      weighted_clusters:
        clusters:
          - name: api_v1
            weight: 90
          - name: api_v2
            weight: 10

      # Retry policy
      retry_policy:
        retry_on: "5xx,reset,connect-failure"
        num_retries: 3
        per_try_timeout: 5s
        retry_back_off:
          base_interval: 0.1s
          max_interval: 1s
        retriable_status_codes:
          - 503
          - 504

      # Timeout configuration
      timeout: 60s
      idle_timeout: 30s

      # Hash policy for consistent routing
      hash_policy:
        - header:
            header_name: "x-user-id"
        - cookie:
            name: session_id
            ttl: 3600s
        - connection_properties:
            source_ip: true

      # Request hedging (send duplicate requests)
      hedge_policy:
        hedge_on_per_try_timeout: true
        initial_requests: 2

      # Traffic mirroring (shadow testing)
      request_mirror_policies:
        - cluster: api_shadow
          runtime_fraction:
            default_value:
              numerator: 10
              denominator: HUNDRED

  # Header-based routing for A/B testing
  - match:
      prefix: /
      headers:
        - name: x-experiment
          exact_match: "treatment"
    route:
      cluster: experiment_treatment

  - match:
      prefix: /
    route:
      cluster: experiment_control
```

Envoy's request mirroring sends duplicate requests to a shadow cluster without affecting primary traffic. This enables testing new backend versions with production traffic patterns while completely isolating the shadow responses, a powerful technique for validating performance and correctness before promotion.
Envoy was designed with the principle that observability should be built-in, not bolted-on. Every proxied connection generates rich telemetry without requiring application changes—a critical capability for debugging distributed systems.
```yaml
# Stats configuration
stats_config:
  stats_tags:
    - tag_name: cluster_name
      regex: "^cluster\.((.+?)\.)"
    - tag_name: route_name
      regex: "^http\.route\.((.+?)\.)"
  use_all_default_tags: true

# Access logging with detailed timing breakdown
http_filters:
  - name: envoy.filters.http.router
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router

# In listener/route configuration:
access_log:
  - name: envoy.access_loggers.stream
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.access_loggers.stream.v3.StdoutAccessLog
      log_format:
        json_format:
          # Request info
          timestamp: "%START_TIME%"
          method: "%REQ(:METHOD)%"
          path: "%REQ(X-ENVOY-ORIGINAL-PATH?:PATH)%"
          protocol: "%PROTOCOL%"
          # Response info
          response_code: "%RESPONSE_CODE%"
          response_flags: "%RESPONSE_FLAGS%"
          # Timing breakdown (microseconds)
          duration: "%DURATION%"
          time_to_first_byte: "%RESPONSE_DURATION%"
          time_to_last_rx_byte: "%REQUEST_DURATION%"
          upstream_response_time: "%RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)%"
          # Connection info
          downstream_address: "%DOWNSTREAM_REMOTE_ADDRESS%"
          upstream_host: "%UPSTREAM_HOST%"
          upstream_cluster: "%UPSTREAM_CLUSTER%"
          # Tracing
          request_id: "%REQ(X-REQUEST-ID)%"
          trace_id: "%REQ(X-B3-TRACEID)%"
          # TLS info
          tls_version: "%DOWNSTREAM_TLS_VERSION%"
          tls_cipher: "%DOWNSTREAM_TLS_CIPHER%"

# Distributed tracing configuration
tracing:
  http:
    name: envoy.tracers.zipkin
    typed_config:
      "@type": type.googleapis.com/envoy.config.trace.v3.ZipkinConfig
      collector_cluster: zipkin_cluster
      collector_endpoint: /api/v2/spans
      collector_endpoint_version: HTTP_JSON
      shared_span_context: true
      trace_id_128bit: true
```

Response Flags:
Envoy's access logs include RESPONSE_FLAGS that provide crucial debugging information. Common flags include:
- `UH`: No healthy upstream (all endpoints unavailable)
- `UF`: Upstream connection failure
- `UO`: Upstream overflow (circuit breaker triggered)
- `NR`: No route configured for the request
- `URX`: Request rejected because the upstream retry limit or maximum connect attempts were exceeded
- `DC`: Downstream connection termination
- `LH`: Local service health check failure
- `UT`: Upstream request timeout
- `UC`: Upstream connection termination

These flags enable rapid root cause identification when investigating failures.
Envoy natively exposes the four golden signals of SRE: Latency (histogram metrics), Traffic (request rate), Errors (response codes, connection failures), and Saturation (connection pool usage, pending requests). This enables comprehensive service-level monitoring without application instrumentation.
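Because the admin interface exposes all of these statistics in Prometheus format at /stats/prometheus, collecting them is often just a matter of pointing a scrape job at the admin port configured in the bootstrap above. The job name and target below are illustrative:

```yaml
# Prometheus scrape job collecting Envoy's built-in metrics from the admin
# interface (port 9901 in the bootstrap example above); target is a placeholder.
scrape_configs:
  - job_name: envoy-sidecars
    metrics_path: /stats/prometheus          # admin endpoint serving Prometheus-format stats
    static_configs:
      - targets: ['envoy-sidecar.example.internal:9901']
```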
Envoy's defining deployment pattern is as a sidecar proxy in service mesh architectures. In this model, every application instance is paired with its own Envoy instance, creating a distributed network of proxies that handle all service-to-service communication.
Service Mesh Architecture:
```
┌───────────────────────────────────────────────────────────────┐
│                         Control Plane                          │
│   ┌─────────────┐     ┌─────────────┐     ┌─────────────┐      │
│   │    Istio    │     │   Consul    │     │    Gloo     │      │
│   │    Pilot    │     │   Connect   │     │    Edge     │      │
│   └──────┬──────┘     └──────┬──────┘     └──────┬──────┘      │
│          │                   │                   │             │
│          └───────────────────┼───────────────────┘             │
│                              │  xDS (gRPC)                     │
└──────────────────────────────┼──────────────────────────────────┘
                               │
                               ▼
┌────────────────────────────────────────────────────────────────┐
│                           Data Plane                            │
│                                                                  │
│  ┌───────────────────────────┐     ┌───────────────────────────┐ │
│  │           Pod A           │     │           Pod B           │ │
│  │ ┌─────────┐ ┌──────────┐  │     │  ┌──────────┐ ┌─────────┐ │ │
│  │ │ Service │─│  Envoy   │──┼─────┼──│  Envoy   │─│ Service │ │ │
│  │ │    A    │ │ Sidecar  │  │     │  │ Sidecar  │ │    B    │ │ │
│  │ └─────────┘ └──────────┘  │     │  └──────────┘ └─────────┘ │ │
│  └───────────────────────────┘     └───────────────────────────┘ │
└──────────────────────────────────────────────────────────────────┘
```

Sidecar Injection:
In Kubernetes environments, sidecar injection happens automatically via mutating admission webhooks. When a pod is created, the mesh's admission controller modifies the pod specification to add an init container that installs the traffic-redirection rules and the Envoy sidecar container itself, roughly as sketched below.
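A hedged sketch of what the mutated pod spec looks like after injection; container names and images are illustrative, not the exact output of any particular mesh's webhook:

```yaml
# Hypothetical pod spec after sidecar injection (names/images are placeholders)
apiVersion: v1
kind: Pod
metadata:
  name: api-gateway
spec:
  initContainers:
    - name: proxy-init                      # installs the iptables REDIRECT rules shown below
      image: example.com/proxy-init:latest
      securityContext:
        capabilities:
          add: ["NET_ADMIN", "NET_RAW"]
  containers:
    - name: app                             # the original application container
      image: example.com/api-gateway:latest
    - name: envoy-sidecar                   # injected Envoy proxy
      image: envoyproxy/envoy:v1.30-latest  # illustrative image tag
      args: ["-c", "/etc/envoy/bootstrap.yaml"]
```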
Traffic Interception:
Envoy as a sidecar intercepts both inbound and outbound traffic using iptables REDIRECT rules:
```bash
# Inbound traffic: redirect incoming connections to Envoy (port 15006)
iptables -t nat -A PREROUTING -p tcp -j REDIRECT --to-port 15006

# Exclude Envoy's own traffic to prevent loops
# (must be appended before the OUTPUT redirect so it matches first)
iptables -t nat -A OUTPUT -m owner --uid-owner 1337 -j RETURN

# Outbound traffic: redirect outgoing connections to Envoy (port 15001)
iptables -t nat -A OUTPUT -p tcp -j REDIRECT --to-port 15001

# Applications connect to localhost:$PORT but actually talk to the remote service
#
# Flow: App → iptables → Envoy (15001) → Remote Envoy (15006) → Remote App
```

Sidecars add resource overhead: ~50MB memory per instance, 5-10% CPU increase, and 1-3ms latency per hop. At scale (thousands of pods), this translates into significant infrastructure cost. Evaluate whether the benefits (mTLS, observability, traffic control) justify the overhead for your use case.
Envoy's extensibility model enables sophisticated customization without forking the core proxy. The primary extension mechanisms are filters, WebAssembly (Wasm), and external processors.
```yaml
http_filters:
  # Rate limiting filter
  - name: envoy.filters.http.ratelimit
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ratelimit.v3.RateLimit
      domain: "api_gateway"
      rate_limit_service:
        grpc_service:
          envoy_grpc:
            cluster_name: ratelimit_cluster
        transport_api_version: V3

  # JWT authentication
  - name: envoy.filters.http.jwt_authn
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.jwt_authn.v3.JwtAuthentication
      providers:
        auth0:
          issuer: https://your-domain.auth0.com/
          audiences:
            - your-api-identifier
          remote_jwks:
            http_uri:
              uri: https://your-domain.auth0.com/.well-known/jwks.json
              cluster: auth0_cluster
              timeout: 5s
            cache_duration: 600s
          forward: true
          forward_payload_header: x-jwt-payload
      rules:
        - match:
            prefix: /api/
          requires:
            provider_name: auth0

  # External authorization
  - name: envoy.filters.http.ext_authz
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
      grpc_service:
        envoy_grpc:
          cluster_name: authz_cluster
        timeout: 2s
      include_peer_certificate: true
      transport_api_version: V3

  # Fault injection for chaos testing
  - name: envoy.filters.http.fault
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.fault.v3.HTTPFault
      delay:
        header_delay: {}
        fixed_delay: 5s
        percentage:
          numerator: 10
          denominator: HUNDRED
      abort:
        header_abort: {}
        http_status: 503
        percentage:
          numerator: 5
          denominator: HUNDRED

  # WASM extension
  - name: envoy.filters.http.wasm
    typed_config:
      "@type": type.googleapis.com/envoy.extensions.filters.http.wasm.v3.Wasm
      config:
        name: my_custom_filter
        vm_config:
          runtime: envoy.wasm.runtime.v8
          code:
            local:
              filename: /etc/envoy/wasm/custom_filter.wasm
        configuration:
          "@type": type.googleapis.com/google.protobuf.StringValue
          value: |
            {"setting": "value"}

  # Router must be last
  - name: envoy.filters.http.router
```

WebAssembly Extensions:
Wasm enables writing custom filters in Rust, Go, C++, or AssemblyScript that run within Envoy's sandbox. This provides near-native performance with memory safety and isolated execution.
Use cases include custom authentication and authorization logic, request and response transformation, custom metrics and logging, and integration with internal systems without maintaining a fork of Envoy.
Wasm filters can be dynamically loaded via ECDS (Extension Config Discovery Service), enabling runtime updates without Envoy restarts.
The proxy-wasm project provides standard SDKs for building Envoy-compatible Wasm filters. The Rust SDK (proxy-wasm-rust-sdk) is most mature. Filters built with proxy-wasm are portable across Envoy, NGINX (with ngx_wasm_module), and other compatible proxies.
Envoy's performance profile reflects its design priorities: feature richness and observability, with acceptable overhead rather than maximum raw throughput.
| Aspect | Typical Value | Notes |
|---|---|---|
| Latency overhead | 1-3ms per hop | Higher with complex filter chains, TLS |
| Memory per instance | 50-100MB baseline | Increases with connection count, config size |
| CPU usage | 5-15% of app CPU | Higher with tracing, complex routing |
| Throughput | 10,000-50,000 RPS/core | Depends on request size, filter complexity |
| Connection overhead | ~8KB per connection | Includes TLS state if enabled |
Comparison with HAProxy/NGINX:
Envoy typically shows 10-30% higher latency than HAProxy for equivalent workloads. This is the cost of its richer per-request processing: built-in telemetry and tracing on every request, dynamic xDS configuration handling, and longer, extensible filter chains.
For edge/ingress deployments where every microsecond matters, HAProxy or NGINX may be preferable. For sidecar deployments where developer experience and observability are primary concerns, Envoy's overhead is acceptable.
Tuning Recommendations:
- Set `--concurrency` to the number of CPU cores allocated to Envoy

In Kubernetes, configure sidecar resource requests/limits based on observed usage. Start with 100m CPU / 128Mi memory for light sidecars and 250m CPU / 256Mi for heavy ones; use the Vertical Pod Autoscaler to optimize over time.
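As a starting point, a light sidecar's container stanza might look like the following (values taken from the guidance above; the container name and image tag are illustrative):

```yaml
# Starting-point resources for an injected Envoy sidecar container
# (adjust from observed usage or let VPA tune over time)
containers:
  - name: envoy-sidecar
    image: envoyproxy/envoy:v1.30-latest   # illustrative image tag
    resources:
      requests:
        cpu: 100m
        memory: 128Mi
      limits:
        cpu: 250m
        memory: 256Mi
```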
Envoy's unique capabilities make it the clear choice for certain architectures while being overkill for others.
Summary:
Envoy represents the next generation of service proxies, designed from the ground up for cloud-native, dynamic, microservices architectures. Its combination of xDS-based dynamic configuration, rich observability, and extensibility via Wasm makes it the foundation of modern service mesh infrastructure.
The trade-off is complexity: Envoy's configuration model is more sophisticated than NGINX or HAProxy, and its operational overhead (memory, CPU, latency) is higher. For organizations investing in platform engineering and service mesh infrastructure, this complexity is manageable and the benefits are substantial. For simpler deployments, NGINX or HAProxy remains preferable.
In the next page, we'll examine AWS ALB/NLB—managed load balancing services that eliminate operational overhead entirely.
You now possess comprehensive knowledge of Envoy as a cloud-native service proxy—from its xDS configuration APIs to observability features, service mesh integration, and extensibility. Next, we'll explore AWS ALB and NLB, understanding when managed cloud solutions are the optimal choice.