In the world of containerized applications, networking is fundamentally different from traditional server-based deployments. Containers are ephemeral—they spin up, die, migrate across nodes, and scale horizontally in seconds. Their IP addresses change constantly. Yet clients need stable, reliable ways to reach these ever-shifting targets.
This is the problem Kubernetes Services solve. A Service is an abstraction that provides a stable network identity and load balancing for a dynamic set of Pods. It's one of the most critical primitives in Kubernetes, and understanding Service types is essential for any engineer designing production systems.
By the end of this page, you will understand the three primary Kubernetes Service types—ClusterIP, NodePort, and LoadBalancer—their architecture, use cases, trade-offs, and implementation details. You'll be able to select the right Service type for any scenario and troubleshoot networking issues with confidence.
Before diving into Service types, let's understand the fundamental problem they solve. In Kubernetes, every Pod receives its own IP address from the cluster's Pod network (CNI). This seems convenient—Pods can communicate directly using these IPs. But there's a critical issue:
Pods are ephemeral. When a Pod dies and is recreated by a ReplicaSet or Deployment, it receives a new IP address. Any client that cached the old IP is now pointing at nothing.
Consider a simple microservices architecture:
```yaml
# A Deployment creates 3 replicas of a backend service
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: backend
        image: myapp/backend:v1.0

# The Pods might have IPs like:
# - backend-pod-1: 10.244.1.15
# - backend-pod-2: 10.244.2.23
# - backend-pod-3: 10.244.3.18

# After a rolling update or node failure:
# - backend-pod-1-new: 10.244.1.42  # Different IP!
# - backend-pod-2-new: 10.244.3.55  # Different IP!
# - backend-pod-3-new: 10.244.2.31  # Different IP!

# Any frontend service that cached the old IPs is now broken.
```

Beyond IP instability, there are additional challenges:

- Service discovery: clients need a way to find the current Pod IPs without hardcoding them
- Load balancing: traffic must be spread across all healthy replicas
- Health awareness: requests should only reach Pods that are ready to serve
Kubernetes Services solve all of these problems by providing a stable virtual IP (ClusterIP), automatic Pod discovery via label selectors, built-in load balancing, and integration with the cluster's DNS system.
A Service creates a level of indirection. Clients connect to the Service's stable IP/DNS name, and Kubernetes handles routing to healthy Pod instances. This decoupling is foundational to building resilient, scalable systems.
A Kubernetes Service is defined by a YAML manifest that specifies:

- A selector matching the labels of the Pods to target
- One or more ports (the Service's port and the Pod's targetPort)
- A type (ClusterIP, NodePort, LoadBalancer, or ExternalName)
When you create a Service, Kubernetes automatically:

- Allocates a stable virtual IP (ClusterIP) from the service CIDR
- Creates an Endpoints object tracking the IPs of matching, ready Pods
- Registers a DNS name (my-service.my-namespace.svc.cluster.local)
- Programs routing rules via kube-proxy on every node
```yaml
apiVersion: v1
kind: Service
metadata:
  name: backend-service
  namespace: production
spec:
  selector:
    app: backend        # Selects all Pods with label app=backend
    tier: api
  ports:
  - name: http
    port: 80            # Port the Service listens on
    targetPort: 8080    # Port on the Pod containers
    protocol: TCP
  - name: grpc
    port: 9090
    targetPort: 9090
    protocol: TCP
  type: ClusterIP       # Default type - internal only

---
# The corresponding Endpoints object (auto-created)
apiVersion: v1
kind: Endpoints
metadata:
  name: backend-service  # Must match Service name
  namespace: production
subsets:
- addresses:
  - ip: 10.244.1.42      # Healthy Pod IPs
  - ip: 10.244.3.55
  - ip: 10.244.2.31
  ports:
  - name: http
    port: 8080
  - name: grpc
    port: 9090
```

Key Concepts:
Selector Matching: The Service continuously watches for Pods matching its selector. When Pods are added or removed, the Endpoints object is updated automatically.
Port vs TargetPort: The port is what clients connect to on the Service. The targetPort is what the Pod container listens on. They don't have to match.
Named Ports: Using named ports allows flexibility—if a Pod exposes port 8080 with name http, the Service can reference http instead of hardcoding 8080.
Session Affinity: By default, Services use random load balancing. You can enable sessionAffinity: ClientIP to route all requests from the same client IP to the same Pod (sticky sessions).
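The named-port and session-affinity options described above can be combined in a single pair of manifests. A minimal sketch, with illustrative names and timeouts:

```yaml
# Pod exposing a named container port
apiVersion: v1
kind: Pod
metadata:
  name: backend
  labels:
    app: backend
spec:
  containers:
  - name: backend
    image: myapp/backend:v1.0
    ports:
    - name: http          # The name the Service will reference
      containerPort: 8080

---
# Service referencing the port by name, with sticky sessions enabled
apiVersion: v1
kind: Service
metadata:
  name: backend
spec:
  selector:
    app: backend
  ports:
  - name: http
    port: 80
    targetPort: http      # Resolves to containerPort 8080 by name
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 1800   # Route the same client IP to the same Pod for 30 min
```

If the container's port later moves from 8080 to another number, only the Pod spec changes; the Service keeps referencing `http`.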
ClusterIP is the default Service type and the foundation upon which other types are built. It creates a stable virtual IP address that is only reachable from within the cluster network.
When you create a ClusterIP Service:
IP Allocation: The API server allocates an IP from the cluster's service CIDR range (configured at cluster creation, e.g., 10.96.0.0/12)
DNS Registration: CoreDNS creates an A record mapping `service-name.namespace.svc.cluster.local` to the ClusterIP.

iptables/IPVS Rules: kube-proxy (in iptables or IPVS mode) programs rules on every node that rewrite traffic destined for the ClusterIP to one of the backing Pod IPs.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: user-service
  namespace: default
spec:
  type: ClusterIP            # Default, can be omitted
  clusterIP: 10.96.100.50    # Optional: specify exact IP; usually auto-allocated
  selector:
    app: user-service
  ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: metrics
    port: 9090
    targetPort: 9090

---
# Headless Service (clusterIP: None)
# Used when you want direct Pod access without load balancing
apiVersion: v1
kind: Service
metadata:
  name: user-service-headless
  namespace: default
spec:
  type: ClusterIP
  clusterIP: None            # Makes this a headless service
  selector:
    app: user-service
  ports:
  - name: http
    port: 80
    targetPort: 8080

# DNS resolution for a headless service returns all Pod IPs directly:
# user-service-headless.default.svc.cluster.local → 10.244.1.42, 10.244.3.55, 10.244.2.31
```

Setting clusterIP: None creates a headless Service. Instead of getting a virtual IP with load balancing, DNS queries return the IPs of all matching Pods directly. This is essential for:
- StatefulSets, where each Pod needs a stable per-Pod DNS identity (e.g., `mysql-0.mysql-headless.namespace.svc`)
- Client-side load balancing, where the client chooses among Pod IPs itself
- Peer discovery in clustered systems such as databases and message brokers

| Use Case | Why ClusterIP | Example |
|---|---|---|
| Microservice communication | Internal services don't need external exposure | API Gateway → User Service → Auth Service |
| Database connections | Databases should never be exposed externally | Application → PostgreSQL Service |
| Internal caching | Cache clusters are internal infrastructure | Application → Redis Cluster Service |
| Background workers | Job processors don't serve external traffic | API → Worker Queue Service |
| Metrics collection | Prometheus scrapes internal endpoints | Prometheus → Service metrics endpoint |
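The Application → PostgreSQL row above works because the application addresses the database by Service DNS name rather than by Pod IP. A sketch with assumed names (`postgres` Service, `DATABASE_HOST` env var the app reads):

```yaml
# Deployment snippet reaching a ClusterIP Service by DNS name.
# "postgres.production.svc.cluster.local" resolves to the Service's
# stable virtual IP regardless of Pod churn behind it.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
  namespace: production
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: myapp/app:v1.0
        env:
        - name: DATABASE_HOST   # Assumed env var; adapt to your application
          value: postgres.production.svc.cluster.local
        - name: DATABASE_PORT
          value: "5432"
```

Within the same namespace the short name `postgres` would also resolve; the full FQDN is unambiguous across namespaces.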
- Always use ClusterIP for internal services—it's the most secure option
- Use headless services for StatefulSets and when you need stable network identities
- Configure readinessProbes on Pods to ensure only healthy instances receive traffic
- Consider using named ports for flexibility during port changes
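A readinessProbe like the one recommended above might look like this; the `/healthz` endpoint path is an assumption about the application:

```yaml
# Illustrative Deployment snippet: only Pods whose readinessProbe passes
# are added to the Service's Endpoints object and receive traffic.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend
spec:
  replicas: 3
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      containers:
      - name: backend
        image: myapp/backend:v1.0
        ports:
        - name: http            # Named port; a Service can use targetPort: http
          containerPort: 8080
        readinessProbe:
          httpGet:
            path: /healthz      # Assumed health endpoint on the app
            port: http
          initialDelaySeconds: 5
          periodSeconds: 10
```

A Pod that fails the probe is removed from the Endpoints list until it passes again, so clients never see it.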
NodePort extends ClusterIP by exposing the Service on a static port across all nodes in the cluster. Any traffic sent to <NodeIP>:<NodePort> is forwarded to the Service.
Port Allocation: Kubernetes allocates a port from the NodePort range (default: 30000-32767, configurable via --service-node-port-range)
ClusterIP Creation: A ClusterIP is still created—NodePort builds on top of ClusterIP
iptables Rules: kube-proxy configures rules on every node to accept traffic on the NodePort and forward it to the Service
Node Binding: Every node in the cluster listens on the NodePort, regardless of whether it runs Pods for that Service
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-frontend
  namespace: production
spec:
  type: NodePort
  selector:
    app: web-frontend
  ports:
  - name: http
    port: 80            # ClusterIP port (internal)
    targetPort: 8080    # Pod container port
    nodePort: 31080     # Optional: specify exact port (30000-32767)
                        # If omitted, Kubernetes auto-allocates
  - name: https
    port: 443
    targetPort: 8443
    nodePort: 31443

# Access patterns:
# - Internal: http://web-frontend.production.svc:80
# - External: http://<any-node-ip>:31080
#   Examples: http://192.168.1.10:31080
#             http://192.168.1.11:31080
#             http://192.168.1.12:31080  (all work!)
```

Understanding the traffic path is essential for troubleshooting:
```
Client → Node1:31080 → iptables intercepts
       → DNAT to ClusterIP:80 or directly to Pod IP
       → Load balance across all healthy Pods
       → Pod on any node responds
       → Return traffic via SNAT back to Node1
       → Node1 → Client
```
Important: Traffic arriving at Node1 may be routed to a Pod running on Node3. This is the default behavior and adds a network hop. You can change this with externalTrafficPolicy.
```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-frontend
spec:
  type: NodePort
  selector:
    app: web-frontend

  # externalTrafficPolicy controls routing behavior
  externalTrafficPolicy: Cluster   # Default: load balance across all nodes
                                   # Pros: Even distribution, high availability
                                   # Cons: Extra network hop, client IP is SNAT'd
  # OR
  # externalTrafficPolicy: Local   # Only route to Pods on the receiving node
                                   # Pros: No extra hop, preserves client IP
                                   # Cons: Uneven distribution, 503 if no local Pod

  ports:
  - port: 80
    targetPort: 8080
    nodePort: 31080
```

NodePort is primarily useful for development, testing, or non-production environments. For production workloads, prefer LoadBalancer (with cloud providers) or Ingress controllers. NodePort exposes ports on every node, which may conflict with security policies.
LoadBalancer is the production-grade solution for exposing services externally. It extends NodePort by provisioning an external load balancer from the cloud provider (AWS, GCP, Azure) and configuring it to route traffic to the Service.
NodePort and ClusterIP Created: LoadBalancer builds on both—you get a ClusterIP, a NodePort, and the external LB
Cloud Controller Manager: When the Service is created, Kubernetes' cloud controller manager (CCM) contacts the cloud provider's API
External LB Provisioned: An actual load balancer resource is created in your cloud account (e.g., AWS NLB/ALB, GCP Load Balancer, Azure LB)
Health Checks Configured: The LB is configured to health-check nodes on the NodePort
External IP Assigned: An external IP or DNS name is allocated and displayed in the Service status
```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-gateway
  namespace: production
  annotations:
    # Cloud-specific annotations for LB configuration

    # AWS Network Load Balancer
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-internal: "false"
    service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"

    # OR for AWS Application Load Balancer (ALB)
    # service.beta.kubernetes.io/aws-load-balancer-type: "alb"

    # GCP annotations
    # cloud.google.com/load-balancer-type: "External"
    # networking.gke.io/load-balancer-type: "External"

    # Azure annotations
    # service.beta.kubernetes.io/azure-load-balancer-internal: "false"
spec:
  type: LoadBalancer
  selector:
    app: api-gateway
  ports:
  - name: http
    port: 80
    targetPort: 8080
  - name: https
    port: 443
    targetPort: 8443

  # Optional: Preserve client IP (recommended for production)
  externalTrafficPolicy: Local

  # Optional: Restrict external access to specific IP ranges
  loadBalancerSourceRanges:
  - 10.0.0.0/8        # Internal networks
  - 203.0.113.0/24    # Trusted external range

---
# After creation, check status:
# kubectl get svc api-gateway -o wide
# NAME          TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)
# api-gateway   LoadBalancer   10.96.50.100   34.120.155.78   80:31234/TCP,443:31567/TCP
```

The traffic path through the cloud load balancer:

```
Client → Cloud LB (34.120.155.78:80)
       → Health check: which nodes are healthy?
       → Forward to Node2:31234 (round-robin across healthy nodes)
       → iptables/IPVS on Node2
       → [If externalTrafficPolicy: Cluster] Load balance to any Pod
       → [If externalTrafficPolicy: Local] Only Pods on Node2
       → Pod processes request
       → Response returns via same path
```
Each cloud provider offers different load balancer types with varying capabilities:
| Cloud | LB Type | Layer | Features | Use Case |
|---|---|---|---|---|
| AWS | NLB (Network) | L4 | Ultra-low latency, millions of requests/sec, static IP | High-performance TCP/UDP, gaming, IoT |
| AWS | ALB (Application) | L7 | Path-based routing, host-based routing, WAF integration | HTTP/HTTPS APIs, microservices |
| GCP | Network LB | L4 | Regional, passthrough, preserves client IP | TCP/UDP with low latency |
| GCP | HTTP(S) LB | L7 | Global, CDN integration, managed SSL | Global HTTP applications |
| Azure | Standard LB | L4 | Zone-redundant, outbound rules, HA ports | Enterprise TCP/UDP workloads |
| Azure | App Gateway | L7 | WAF, SSL termination, URL routing | Web applications requiring WAF |
Each LoadBalancer Service provisions a separate cloud load balancer, which incurs additional cost (typically $15-20/month per LB on AWS/GCP). For multiple services, consider using a single Ingress controller with one LoadBalancer instead of individual LoadBalancer Services.
Choosing the right Service type requires understanding your access patterns, environment constraints, and operational requirements. Here's a comprehensive decision framework:
| Aspect | ClusterIP | NodePort | LoadBalancer |
|---|---|---|---|
| Accessibility | Internal only | External via node IPs | External via LB IP |
| Port Range | Any port | 30000-32767 | Any port |
| Load Balancing | Built-in (kube-proxy) | Built-in + external LB needed | Cloud LB + kube-proxy |
| Client IP Preservation | N/A (internal) | With Local policy | With Local policy |
| Cloud Dependency | None | None | Requires cloud provider |
| Cost | None | None | Cloud LB charges |
| Production Readiness | High | Low (dev/test) | High |
| TLS Termination | Application-level | Application-level | LB or application |
| Health Checks | readinessProbe | readinessProbe + node checks | LB health checks |
Question 1: Does this service need external access?
- No: use ClusterIP. It is the default and the most secure option.
- Yes: continue to Question 2.

Question 2: Are you running on a cloud provider with CCM?
- Yes: LoadBalancer is the production-grade choice.
- No (bare metal): use NodePort behind your own external load balancer, or a bare-metal LB implementation such as MetalLB.

Question 3: Is this a TCP/UDP service requiring direct external access?
- Yes: an L4 LoadBalancer Service fits.
- No (HTTP/HTTPS): an Ingress controller usually serves it better.

Question 4: Do you need multiple external services?
- Yes: prefer a single Ingress controller behind one LoadBalancer over one cloud load balancer per Service, to control cost.
- No: a single LoadBalancer Service is fine.
There's a fourth Service type: ExternalName. It doesn't create iptables rules or a ClusterIP. Instead, it creates a CNAME DNS record pointing to an external domain. Use it to give cluster-internal names to external services (e.g., a managed database outside the cluster).
Example: externalName: my-database.us-east-1.rds.amazonaws.com
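A complete ExternalName manifest using the RDS hostname from the example above; the Service and namespace names are illustrative:

```yaml
# ExternalName Service: no ClusterIP, no kube-proxy rules.
# CoreDNS answers queries for this Service with a CNAME record.
apiVersion: v1
kind: Service
metadata:
  name: my-database        # Cluster-internal name clients will use
  namespace: production
spec:
  type: ExternalName
  externalName: my-database.us-east-1.rds.amazonaws.com

# Inside the cluster, resolving
#   my-database.production.svc.cluster.local
# returns a CNAME to the external hostname above.
```

If the managed database later moves, only `externalName` changes; application code keeps using the cluster-internal name.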
Production Services often require additional configuration beyond the basics. Here are critical settings every engineer should understand:
```yaml
apiVersion: v1
kind: Service
metadata:
  name: advanced-api
  namespace: production
  annotations:
    # Health check configuration (AWS NLB example)
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-interval: "10"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-healthy-threshold: "2"
    service.beta.kubernetes.io/aws-load-balancer-healthcheck-unhealthy-threshold: "2"

    # Connection draining
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-enabled: "true"
    service.beta.kubernetes.io/aws-load-balancer-connection-draining-timeout: "60"
spec:
  type: LoadBalancer
  selector:
    app: advanced-api
  ports:
  - name: http
    port: 80
    targetPort: 8080

  # Session affinity - route same client to same Pod
  sessionAffinity: ClientIP
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 3600  # Affinity timeout (1 hour)

  # Traffic policy for client IP preservation
  externalTrafficPolicy: Local

  # Internal traffic policy (Kubernetes 1.21+)
  internalTrafficPolicy: Local  # Prefer local Pods for internal traffic too

  # Allocate specific cluster IP (optional, usually auto-allocated)
  # clusterIP: 10.96.100.50

  # IP families for dual-stack (IPv4 + IPv6)
  ipFamilies:
  - IPv4
  - IPv6
  ipFamilyPolicy: PreferDualStack

---
# Service Account for fine-grained RBAC if needed
apiVersion: v1
kind: ServiceAccount
metadata:
  name: advanced-api-sa
  namespace: production
```

Tip: setting publishNotReadyAddresses: true on the Service keeps Pod addresses in DNS even when they are not ready; combine it with a preStop hook on the Pod for graceful shutdown.

Service networking issues are among the most common problems in Kubernetes. Here's a systematic troubleshooting approach:
```bash
# Step 1: Verify Service exists and has correct configuration
kubectl get svc my-service -o yaml
kubectl describe svc my-service

# Step 2: Check Endpoints - are healthy Pods registered?
kubectl get endpoints my-service
# Expected: List of Pod IPs
# Problem: Empty means no Pods match selector OR Pods aren't ready

# Step 3: Verify Pod selector matches
kubectl get pods -l app=my-app            # Replace with your selector
kubectl get pods -l app=my-app -o wide    # Check IPs match endpoints

# Step 4: Check Pod readiness
kubectl describe pod <pod-name>
# Look for:
# - Ready condition: True
# - readinessProbe status
# - Container status

# Step 5: Test DNS resolution
kubectl run -it --rm debug --image=busybox -- nslookup my-service
kubectl run -it --rm debug --image=busybox -- nslookup my-service.namespace.svc.cluster.local

# Step 6: Test connectivity from within cluster
kubectl run -it --rm debug --image=curlimages/curl -- curl -v http://my-service:80

# Step 7: Check iptables rules (on a node)
sudo iptables -t nat -L KUBE-SERVICES -n | grep my-service

# Step 8: For LoadBalancer, check cloud controller logs
kubectl logs -n kube-system -l component=cloud-controller-manager

# Step 9: Check events
kubectl get events --field-selector involvedObject.name=my-service
```

| Symptom | Likely Cause | Solution |
|---|---|---|
| Endpoints is empty | No Pods match selector OR Pods not ready | Fix label selector; add/fix readinessProbe |
| DNS doesn't resolve | CoreDNS issue or wrong domain | Check CoreDNS pods; use full FQDN (.svc.cluster.local) |
| Connection refused | Pod not listening on targetPort | Verify container is listening; check targetPort vs actual port |
| Intermittent failures | Some Pods unhealthy | Check readinessProbes; review Pod logs |
| LoadBalancer EXTERNAL-IP pending | Cloud controller issue or quota | Check CCM logs; verify cloud permissions/quotas |
| Client IP shows internal IP | SNAT masking with Cluster policy | Switch to externalTrafficPolicy: Local |
| 503 errors with Local policy | No Pods on receiving node | Ensure Pod affinity or use Cluster policy |
Install a debug Pod with networking tools for faster troubleshooting:
```bash
kubectl run netdebug --image=nicolaka/netshoot -it --rm -- bash
```
This image includes curl, nslookup, dig, ping, traceroute, tcpdump, and more—everything you need for network debugging.
We've covered the foundational Kubernetes Service types in depth. Let's consolidate the key takeaways:

- ClusterIP: the default; a stable, internal-only virtual IP for service-to-service traffic.
- NodePort: exposes the Service on a static port (30000-32767) on every node; best suited to dev/test environments.
- LoadBalancer: builds on ClusterIP and NodePort and provisions a cloud load balancer; the production choice for direct external exposure.
- ExternalName: a CNAME record mapping a cluster-internal name to an external hostname.
- externalTrafficPolicy: Local preserves the client IP but can distribute traffic unevenly; Cluster balances evenly but adds a hop and SNATs the client IP.
- When a Service doesn't respond, check the Endpoints object first: empty Endpoints means no ready Pods match the selector.
What's Next:
Services solve the problem of Pod-level connectivity, but they don't address HTTP routing, TLS termination, or path-based traffic splitting. In the next page, we'll explore Ingress Controllers—the L7 solution that builds on Services to provide sophisticated HTTP/HTTPS traffic management.
You now have a comprehensive understanding of Kubernetes Service types—ClusterIP, NodePort, and LoadBalancer. You can select the appropriate type for any scenario, configure advanced settings, and troubleshoot connectivity issues systematically. Next, we'll explore Ingress controllers for L7 traffic management.