Kubernetes has fundamentally changed how we think about service discovery. Rather than bolting discovery onto applications through libraries or sidecars, Kubernetes provides service discovery as a platform primitive. When you deploy an application to Kubernetes, discovery is available out of the box—no additional infrastructure, no client libraries, no configuration beyond declaring a Service resource.
This platform-native approach has made Kubernetes the de facto standard for container orchestration, and understanding its discovery mechanisms is essential for any engineer building cloud-native applications. Kubernetes' discovery model combines DNS-based resolution with a real-time endpoints API, providing both simplicity for basic use cases and flexibility for advanced scenarios.
This page provides a comprehensive exploration of Kubernetes service discovery: how it works, the components involved, configuration options, and patterns for production environments.
By the end of this page, you will understand:

- how Kubernetes Services and Endpoints work,
- how CoreDNS provides DNS-based discovery,
- the difference between ClusterIP, NodePort, LoadBalancer, and headless Services,
- ExternalName Services for external service integration, and
- advanced patterns including EndpointSlices and topology-aware routing.
In Kubernetes, a Service is an abstraction that defines a logical set of Pods and a policy for accessing them. Services enable loose coupling between dependent components—a consumer doesn't need to know which specific Pods implement the service, only the Service's stable identity.
The Core Problem Services Solve:
Pods in Kubernetes are ephemeral. They come and go as deployments roll out, nodes fail, or scaling events occur. Each Pod gets a unique IP address, but that address exists only for the Pod's lifetime. If you hardcode a Pod's IP address, your application will break as soon as that Pod is replaced.
Services provide a stable abstraction over the dynamic set of Pods:
my-service.my-namespace.svc.cluster.local

How Services Track Pods:
Services use label selectors to identify which Pods belong to the service. When you create a Service with a selector, Kubernetes automatically creates and maintains an Endpoints object that lists the IP addresses and ports of all Pods matching the selector.
```yaml
# Deployment creates Pods with labels
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
      version: v2
  template:
    metadata:
      labels:
        app: order-service        # Service will select these
        version: v2               # Can be used for canary routing
        team: commerce
    spec:
      containers:
        - name: order-service
          image: order-service:2.3.1
          ports:
            - containerPort: 8080
              name: http
            - containerPort: 9090
              name: grpc
          readinessProbe:         # Critical: determines endpoint inclusion
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health/live
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 20
---
# Service selects Pods by label
apiVersion: v1
kind: Service
metadata:
  name: order-service
  namespace: production
  labels:
    app: order-service
spec:
  type: ClusterIP                 # Default type
  selector:                       # Matches Pod labels
    app: order-service
    version: v2                   # Could target specific version
  ports:
    - name: http
      port: 80                    # Service port (what clients connect to)
      targetPort: 8080            # Container port (where traffic goes)
      protocol: TCP
    - name: grpc
      port: 9090
      targetPort: 9090
      protocol: TCP

# Kubernetes automatically creates Endpoints:
# kubectl get endpoints order-service -n production
# NAME            ENDPOINTS
# order-service   10.244.1.5:8080,10.244.2.8:8080,10.244.3.12:8080
```

A Pod is only added to the Endpoints list when its readiness probe passes. Without a readiness probe, Pods are added immediately upon starting—before they're actually ready to serve traffic. Always configure readiness probes for any Pod that will receive traffic through a Service.
Kubernetes supports several Service types, each designed for different networking scenarios. Understanding these types is essential for designing accessible applications.
ClusterIP is the default Service type. It exposes the Service on a cluster-internal IP address. The Service is only reachable from within the cluster.
Use Cases: internal service-to-service communication, backing databases and caches, and any component that should not be reachable from outside the cluster.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service
  namespace: production
spec:
  type: ClusterIP          # Default, can be omitted
  selector:
    app: user-service
  ports:
    - port: 8080
      targetPort: 8080

# Accessible within cluster only:
# - DNS: user-service.production.svc.cluster.local
# - Or just: user-service (from same namespace)
# - ClusterIP: e.g., 10.96.50.25
```

The table below compares all five Service types:

| Type | Internal Access | External Access | IP Allocation | Use Case |
|---|---|---|---|---|
| ClusterIP | Yes - ClusterIP + DNS | No | Virtual ClusterIP | Internal services |
| NodePort | Yes - ClusterIP + DNS | Yes - `<NodeIP>:<NodePort>` | ClusterIP + NodePort | Dev/testing, simple external |
| LoadBalancer | Yes - ClusterIP + DNS | Yes - External LB IP | ClusterIP + External IP | Production external access |
| Headless (None) | Yes - DNS returns Pod IPs | No | No ClusterIP | Client-side LB, StatefulSets |
| ExternalName | Yes - CNAME to external | N/A (external) | No IP - CNAME only | External service abstraction |
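Headless Services appear in the table above but not in the examples so far. Setting `clusterIP: None` skips the virtual IP entirely, and DNS returns the individual Pod IPs, which is what StatefulSets and client-side load balancers rely on. A minimal sketch (the `database-cluster` name matches the DNS examples later on this page; the labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: database-cluster
  namespace: production
spec:
  clusterIP: None          # Headless: no virtual IP is allocated
  selector:
    app: postgresql        # assumed Pod label; match your StatefulSet
  ports:
    - port: 5432
      name: postgres

# DNS for this Service returns every ready Pod IP, and StatefulSet Pods
# additionally get stable per-Pod records such as
# postgresql-0.database-cluster.production.svc.cluster.local
```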
For HTTP/HTTPS traffic, consider Ingress resources instead of LoadBalancer services. Ingress provides path-based routing, name-based virtual hosting, TLS termination, and more—typically backed by a single LoadBalancer rather than one per service.
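As a hedged sketch of that approach (hostnames, secret names, and the `ingressClassName` are illustrative and depend on which ingress controller you run):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: commerce-ingress
  namespace: production
spec:
  ingressClassName: nginx            # assumes an NGINX ingress controller is installed
  tls:
    - hosts:
        - shop.example.com
      secretName: shop-tls           # hypothetical TLS certificate secret
  rules:
    - host: shop.example.com
      http:
        paths:
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service  # the ClusterIP Service defined earlier
                port:
                  number: 80
          - path: /users
            pathType: Prefix
            backend:
              service:
                name: user-service
                port:
                  number: 8080
```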
CoreDNS is the default DNS server for Kubernetes clusters (replacing kube-dns since Kubernetes 1.11). It provides DNS-based service discovery, allowing Pods to discover services using standard DNS queries.
How DNS Discovery Works:

With the default `ClusterFirst` DNS policy, each Pod's /etc/resolv.conf points at the cluster DNS service (CoreDNS). When an application resolves a Service name, CoreDNS, which watches the Kubernetes API, answers with the Service's ClusterIP, or with the individual Pod IPs for headless services.
DNS Naming Convention:
Kubernetes follows a standard naming convention for service DNS:
<service-name>.<namespace>.svc.<cluster-domain>
For example: order-service.production.svc.cluster.local
DNS Resolution Shortcuts:
- order-service (shortest form)
- order-service.other-namespace
- order-service.other-namespace.svc.cluster.local
```bash
# From a Pod in the 'production' namespace

# Same namespace - shortest form
$ nslookup order-service
Server:    10.96.0.10
Address:   10.96.0.10:53

Name:      order-service.production.svc.cluster.local
Address:   10.96.100.50

# Cross-namespace - specify namespace
$ nslookup redis.cache
Name:      redis.cache.svc.cluster.local
Address:   10.96.200.30

# Fully qualified domain name (FQDN)
$ nslookup order-service.production.svc.cluster.local
Name:      order-service.production.svc.cluster.local
Address:   10.96.100.50

# SRV records for port discovery
$ nslookup -type=SRV _http._tcp.order-service.production.svc.cluster.local
_http._tcp.order-service.production.svc.cluster.local  service = 0 100 80 order-service.production.svc.cluster.local

# Headless service returns all Pod IPs
$ nslookup database-cluster.production.svc.cluster.local
Name:      database-cluster.production.svc.cluster.local
Address:   10.244.1.5
Address:   10.244.2.8
Address:   10.244.3.12

# StatefulSet Pod-specific DNS
$ nslookup postgresql-0.database-cluster.production.svc.cluster.local
Name:      postgresql-0.database-cluster.production.svc.cluster.local
Address:   10.244.1.5
```

CoreDNS Configuration:
CoreDNS is configured via a ConfigMap that defines its behavior. The default configuration watches the Kubernetes API and responds to DNS queries for cluster services.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
            lameduck 5s
        }
        ready

        # Kubernetes plugin - handles cluster DNS
        kubernetes cluster.local in-addr.arpa ip6.arpa {
            pods insecure
            fallthrough in-addr.arpa ip6.arpa
            ttl 30                      # DNS TTL for records
        }

        prometheus :9153                # Metrics endpoint

        # Forward external queries to upstream DNS
        forward . /etc/resolv.conf {
            max_concurrent 1000
        }

        cache 30                        # Cache responses for 30s
        loop
        reload
        loadbalance                     # Round-robin DNS responses
    }

    # Custom zone for internal services
    # internal.company.com:53 {
    #     file /etc/coredns/internal.company.com.zone
    # }
```

CoreDNS uses a default TTL of 30 seconds. This means Pod changes take up to 30 seconds to propagate to all clients' DNS caches. For latency-sensitive applications, consider using headless services with client-side load balancing, or a service mesh that provides real-time endpoint updates.
While DNS provides a simple discovery interface, the underlying mechanism is the Endpoints (and in newer clusters, EndpointSlices) API. Understanding these resources is essential for debugging, advanced integrations, and understanding how kube-proxy routes traffic.
Endpoints vs EndpointSlices:
```yaml
# Endpoints (auto-generated from Service selector)
# kubectl get endpoints order-service -n production -o yaml

apiVersion: v1
kind: Endpoints
metadata:
  name: order-service
  namespace: production
  labels:
    app: order-service
subsets:
  - addresses:                      # Ready pods
      - ip: 10.244.1.5
        nodeName: node-1
        targetRef:
          kind: Pod
          name: order-service-abc123
          namespace: production
      - ip: 10.244.2.8
        nodeName: node-2
        targetRef:
          kind: Pod
          name: order-service-def456
          namespace: production
      - ip: 10.244.3.12
        nodeName: node-3
        targetRef:
          kind: Pod
          name: order-service-ghi789
          namespace: production
    notReadyAddresses:              # Pods failing readiness probe
      - ip: 10.244.4.20
        nodeName: node-4
        targetRef:
          kind: Pod
          name: order-service-jkl012
          namespace: production
    ports:
      - name: http
        port: 8080
        protocol: TCP
      - name: grpc
        port: 9090
        protocol: TCP
---
# EndpointSlice (newer, more scalable)
# kubectl get endpointslice -l kubernetes.io/service-name=order-service -n production

apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
  name: order-service-abc12
  namespace: production
  labels:
    kubernetes.io/service-name: order-service
  ownerReferences:
    - apiVersion: v1
      kind: Service
      name: order-service
addressType: IPv4
endpoints:
  - addresses:
      - "10.244.1.5"
    conditions:
      ready: true
      serving: true
      terminating: false
    nodeName: node-1
    targetRef:
      kind: Pod
      name: order-service-abc123
      namespace: production
  - addresses:
      - "10.244.2.8"
    conditions:
      ready: true
      serving: true
      terminating: false
    nodeName: node-2
    targetRef:
      kind: Pod
      name: order-service-def456
      namespace: production
ports:
  - name: http
    port: 8080
    protocol: TCP
```

Why EndpointSlices?
With large services (thousands of endpoints), the Endpoints API becomes problematic: every endpoint for the Service lives in a single object, so any Pod change forces a rewrite of the entire object, and that increasingly large object must be re-sent to every node watching it, putting substantial load on the API server and the cluster network.
EndpointSlices address these problems by splitting a Service's endpoints across multiple smaller objects (at most 100 endpoints per slice by default), so a change touches only the affected slice, and by carrying richer per-endpoint information such as serving/terminating conditions and topology hints.
| Aspect | Endpoints | EndpointSlices |
|---|---|---|
| Default in Kubernetes | < 1.21 | ≥ 1.21 |
| Endpoints per object | All endpoints in one object | Max 100 per slice (default) |
| Update granularity | Full object rewrite | Per-slice updates |
| Topology support | No | Yes (topology hints) |
| Condition tracking | ready/notReady only | ready, serving, terminating |
| Recommended for | Small services, legacy | Production, large services |
Service meshes and load balancers watch the Endpoints or EndpointSlices API rather than relying solely on DNS. This provides real-time updates when Pods become ready or fail, avoiding DNS TTL delays. If you're building custom integrations, use EndpointSlices for better scalability.
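For a quick look at this in action, you can stream EndpointSlice changes from the command line; a small sketch using the order-service examples above (the `kubernetes.io/service-name` label is set automatically by the EndpointSlice controller):

```bash
# Stream EndpointSlice updates as Pods become ready, fail, or terminate
$ kubectl get endpointslices -n production \
    -l kubernetes.io/service-name=order-service --watch

# Inspect per-endpoint conditions (ready / serving / terminating)
$ kubectl get endpointslices -n production \
    -l kubernetes.io/service-name=order-service -o yaml
```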
DNS tells clients which IP to connect to, but how does traffic actually reach the right Pod? kube-proxy is the component responsible for implementing Service-level load balancing.
kube-proxy Modes:
kube-proxy can operate in different modes, each with different performance characteristics:
iptables mode (default in most clusters) uses Linux iptables rules to intercept and redirect traffic to backend Pods.
How it works:
```bash
# View iptables rules for a service (simplified)
$ iptables -t nat -L KUBE-SERVICES -n

Chain KUBE-SERVICES
target                  prot opt source      destination
KUBE-SVC-ORDER-SERVICE  tcp  --  0.0.0.0/0   10.96.100.50   /* order-service */

# Service chain randomly selects a backend
$ iptables -t nat -L KUBE-SVC-ORDER-SERVICE -n

Chain KUBE-SVC-ORDER-SERVICE
target         prot opt source      destination
KUBE-SEP-AAAA  --   --  0.0.0.0/0   0.0.0.0/0    probability 0.333
KUBE-SEP-BBBB  --   --  0.0.0.0/0   0.0.0.0/0    probability 0.500
KUBE-SEP-CCCC  --   --  0.0.0.0/0   0.0.0.0/0    # last one gets remainder

# Each SEP chain DNATs to a specific Pod
$ iptables -t nat -L KUBE-SEP-AAAA -n

Chain KUBE-SEP-AAAA
target  prot opt source      destination
DNAT    tcp  --  0.0.0.0/0   0.0.0.0/0    to:10.244.1.5:8080
```

If you have more than 1000 services, consider switching from iptables to IPVS mode or using Cilium with eBPF. iptables rule evaluation is linear (O(n)), which can add significant latency with many services. IPVS uses hash tables for O(1) lookups.
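If kube-proxy is driven by a configuration file (as in kubeadm-based clusters, where it lives in the kube-proxy ConfigMap), switching modes is a small change; a hedged sketch of the relevant KubeProxyConfiguration fields:

```yaml
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"            # an empty value falls back to iptables mode
ipvs:
  scheduler: "rr"       # round-robin; other IPVS schedulers (lc, sh, ...) exist
# Nodes need the IPVS kernel modules loaded, and the kube-proxy Pods must be
# restarted after the change for the new mode to take effect.
```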
Beyond basic service discovery, Kubernetes supports several advanced patterns for sophisticated routing requirements.
Topology-Aware Routing: keeps traffic within the client's zone when capacity allows, reducing cross-zone hops. It is enabled with the `service.kubernetes.io/topology-aware-hints: Auto` annotation:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: order-service
  namespace: production
  annotations:
    # Enable topology-aware routing
    service.kubernetes.io/topology-aware-hints: "Auto"
spec:
  selector:
    app: order-service
  ports:
    - port: 8080

# With topology hints enabled:
# - Traffic from zone-a prefers pods in zone-a
# - Falls back to other zones if zone-a has insufficient capacity
# - Reduces cross-zone network costs in cloud environments
```

Internal Traffic Policy: `internalTrafficPolicy` can be set to Cluster (default, any node) or Local (only same node):
```yaml
apiVersion: v1
kind: Service
metadata:
  name: node-local-cache
spec:
  selector:
    app: cache-agent
  ports:
    - port: 6379
  # Only route to pods on the same node as the client
  internalTrafficPolicy: Local

# Use case: A DaemonSet runs a cache on every node
# Applications connect to the local cache, not a random node's cache
# Reduces network hops and latency
```
Services Without Selectors: omit the selector and create the Endpoints object manually to route in-cluster traffic to IPs outside the cluster:

```yaml
# Service without selector
apiVersion: v1
kind: Service
metadata:
  name: external-database
  namespace: production
spec:
  # No selector! Endpoints won't be auto-created
  ports:
    - port: 5432
      targetPort: 5432
---
# Manually created Endpoints
apiVersion: v1
kind: Endpoints
metadata:
  name: external-database   # Must match Service name
  namespace: production
subsets:
  - addresses:
      - ip: 10.100.50.10    # External DB IP
      - ip: 10.100.50.11    # Replica IP
    ports:
      - port: 5432

# Applications connect to external-database:5432
# Kubernetes routes to the manually specified IPs
# Useful for:
# - External databases (RDS, Cloud SQL)
# - Services in other clusters
# - Gradual migration from external to internal
```

When using custom Endpoints, you are responsible for keeping them current. If the external service's IP changes and you don't update the Endpoints, discovery will fail. Consider using ExternalName services for external DNS names, or automation to sync external IPs to Endpoints.
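An ExternalName Service, mentioned above as the alternative for external DNS names, publishes a CNAME record instead of proxying traffic; a minimal sketch (the external hostname is illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: payments-api
  namespace: production
spec:
  type: ExternalName
  externalName: payments.vendor.example.com   # external DNS name to alias

# Pods resolving payments-api.production.svc.cluster.local receive a CNAME
# pointing at payments.vendor.example.com. No ports, load balancing, or
# health checking are involved; resolution happens purely in DNS.
```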
Service discovery issues are among the most common problems in Kubernetes. Here's a systematic approach to debugging discovery failures.
```bash
# 1. Verify Service exists and has the right selector
$ kubectl get svc order-service -n production -o yaml
# Check: selector matches Pod labels?
# Check: ports configured correctly?

# 2. Check Endpoints - are backend Pods registered?
$ kubectl get endpoints order-service -n production
NAME            ENDPOINTS
order-service   10.244.1.5:8080,10.244.2.8:8080
# If empty: Pods don't match selector, or failing readiness probes

# 3. Check Pod labels match Service selector
$ kubectl get pods -n production -l app=order-service
NAME                       READY   STATUS    RESTARTS
order-service-abc123-xyz   1/1     Running   0
order-service-def456-xyz   1/1     Running   0
# No pods? Labels don't match selector

# 4. Check Pod readiness
$ kubectl get pods -n production -l app=order-service -o wide
# READY column should be 1/1
# If 0/1: readiness probe is failing

$ kubectl describe pod order-service-abc123-xyz -n production
# Look for: Readiness probe failed messages

# 5. Test DNS resolution from a Pod
$ kubectl run debug --rm -it --image=busybox -- /bin/sh
/ # nslookup order-service.production.svc.cluster.local
# Should return ClusterIP

/ # nslookup order-service
# Works if you're in the same namespace

# 6. Test actual connectivity
/ # wget -qO- http://order-service:8080/health
# Or: curl if using a different image

# 7. Check CoreDNS logs
$ kubectl logs -n kube-system -l k8s-app=kube-dns
# Look for: errors, failed queries, upstream issues

# 8. Check kube-proxy logs
$ kubectl logs -n kube-system -l k8s-app=kube-proxy
# Look for: iptables errors, sync failures

# 9. Verify iptables rules exist (on a node)
$ iptables -t nat -L KUBE-SERVICES | grep order-service
# Should show a chain for the service
```

| Symptom | Likely Cause | Solution |
|---|---|---|
| Empty Endpoints | No Pods match selector | Verify Pod labels match Service selector |
| Empty Endpoints | Pods failing readiness | Fix readiness probe or application issues |
| DNS resolution fails | CoreDNS not running | Check CoreDNS pods in kube-system |
| DNS resolution fails | Wrong DNS search domain | Check Pod's /etc/resolv.conf |
| Connection refused | No Pods in Endpoints | Check Pod status and readiness |
| Connection timeout | Network policy blocking | Review NetworkPolicy resources |
| Intermittent failures | Some Pods unhealthy | Check individual Pod health |
| Slow performance | Cross-zone traffic | Enable topology-aware routing |
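For the NetworkPolicy and per-Pod health rows above, a couple of quick checks (namespace and labels follow the earlier examples):

```bash
# Is a NetworkPolicy selecting the backend or client Pods and blocking traffic?
$ kubectl get networkpolicy -n production
$ kubectl describe networkpolicy -n production

# Are individual Pods healthy and passing their readiness probes?
$ kubectl get pods -n production -l app=order-service
$ kubectl describe pod <pod-name> -n production   # check Events for probe failures
```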
Keep a debugging container image handy (e.g., `nicolaka/netshoot`) that includes DNS tools, curl, wget, and network utilities. Spawn temporary debug Pods to test discovery from within the cluster: `kubectl run debug --rm -it --image=nicolaka/netshoot -- /bin/bash`
We've comprehensively explored Kubernetes service discovery, from fundamental concepts to advanced patterns. Let's consolidate the key insights:

- Services give an ephemeral, label-selected set of Pods a stable ClusterIP and DNS name.
- Readiness probes determine whether a Pod appears in the Endpoints/EndpointSlices behind a Service.
- CoreDNS answers <service>.<namespace>.svc.<cluster-domain> queries; kube-proxy implements the actual load balancing via iptables or IPVS.
- Choose the Service type deliberately: ClusterIP for internal traffic, NodePort/LoadBalancer (or Ingress) for external access, headless for client-side load balancing, ExternalName for external DNS names.
- For large services and real-time endpoint updates, prefer the EndpointSlices API over legacy Endpoints.
Module Complete!
You've now completed the Service Discovery module. You understand why service discovery is needed, DNS-based and registry-based approaches, client-side vs server-side discovery patterns, and Kubernetes' platform-native implementation. This knowledge forms the foundation for building resilient, scalable microservices architectures.
What's Next:
With discovery mastered, the next module explores Circuit Breaker Pattern—how to prevent cascade failures when discovered services become degraded or unavailable. Understanding circuit breakers is essential for building truly resilient distributed systems.
Congratulations! You now have comprehensive knowledge of service discovery—from fundamental concepts to Kubernetes implementation details. You understand why discovery is needed, how DNS and registries work, client-side vs server-side patterns, and how Kubernetes provides discovery as a platform primitive. This knowledge is essential for designing and operating resilient microservices architectures.