In 2013, Docker introduced a paradigm that would fundamentally transform how we build, deploy, and operate software. Containers—lightweight, portable, and isolated execution environments—enabled developers to package applications with their dependencies and run them consistently across any infrastructure. But beneath this deceptively simple abstraction lies a sophisticated networking layer that makes it all possible.
Container networking is where operating systems, network protocols, and distributed systems converge. Understanding it deeply is essential for anyone operating modern infrastructure, from single-host Docker deployments to globe-spanning Kubernetes clusters serving billions of requests.
By the end of this page, you will understand the fundamental building blocks of container networking: Linux namespaces for isolation, virtual Ethernet pairs for connectivity, bridge networks for host-local communication, and the challenges that arise when containers need to communicate across hosts. You'll see how containers are not virtual machines—and why that distinction creates unique networking requirements.
Before diving into container networking specifics, we must understand what makes containers fundamentally different from virtual machines—and why this difference creates unique networking challenges and opportunities.
Virtual Machines (VMs) emulate complete hardware environments. Each VM runs its own operating system kernel, has its own network stack implementation, and appears to the hypervisor as an independent machine with dedicated virtual network interface cards (vNICs). VM networking leverages decades of established paradigms: assign IP addresses, configure virtual switches, apply firewall rules—the same concepts used for physical servers.
Containers share the host operating system's kernel. They don't emulate hardware; instead, they use kernel features to create isolated process environments. This fundamental architectural difference means container networking cannot simply replicate VM networking patterns. Containers require new abstractions.
| Aspect | Virtual Machine | Container | Networking Implication |
|---|---|---|---|
| Kernel | Own kernel per VM | Shared host kernel | Containers use kernel networking features directly |
| Network Stack | Full stack per VM | Virtualized via namespaces | Containers share kernel's TCP/IP implementation |
| Startup Time | Minutes | Milliseconds | Network setup must be instantaneous |
| Density | 10-20 per host | 100-1000+ per host | IP address exhaustion becomes critical concern |
| Isolation | Hardware-level (hypervisor) | Process-level (kernel) | Security boundaries require careful network isolation |
| Network Identity | Persistent MAC/IP | Ephemeral by default | Traditional networking assumptions break |
Containers are designed to be ephemeral—created and destroyed in milliseconds. Traditional networking assumes relatively static endpoints with persistent addresses. This mismatch creates profound challenges: how do you route traffic to something that might not exist a second from now? How do you maintain connections when containers are recycled? These questions drive the evolution of container networking solutions.
The density challenge:
A single physical host might run 500 or more containers simultaneously, each requiring network connectivity. The traditional approach of assigning one IP address per workload would exhaust address space rapidly. Container networking solutions must therefore handle dense, fast-changing populations of endpoints without exhausting addresses or slowing container startup.
These requirements drove the development of container-specific networking primitives and patterns we'll explore throughout this module.
Container isolation is built upon Linux namespaces—a kernel feature that partitions system resources so that one set of processes sees a different view than another. Namespaces create the illusion that a container has its own isolated environment, even though it shares the kernel with all other containers on the host.
Linux provides several namespace types, each isolating a different aspect of the system. For container networking, the network namespace is paramount, but understanding the full picture reveals how containers achieve comprehensive isolation.
| Namespace | Isolates | Container Impact | Networking Relevance |
|---|---|---|---|
| Network (net) | Network stack, interfaces, routing | Each container has own network identity | Critical - Core of container networking |
| Mount (mnt) | Filesystem mount points | Containers see own filesystem view | Low - But affects /etc/hosts, /etc/resolv.conf |
| PID | Process IDs | Container processes numbered from 1 | Low - Network processes appear isolated |
| UTS | Hostname and domain | Container has own hostname | Medium - Affects network identity/DNS |
| IPC | Inter-process communication | Shared memory isolated per container | Low - But may affect local communication patterns |
| User | User and group IDs | Container can have root without host root | Medium - Affects network capability permissions |
| Cgroup | Cgroup root directory | Resource limits isolated | Low - But cgroups limit network bandwidth |
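To make this concrete, you can list the namespaces that exist on a host with util-linux's lsns (assuming it is installed); each running container typically contributes its own set of entries:

```bash
# List all namespaces visible to the current user
lsns

# Restrict the listing to network namespaces only
lsns -t net
```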
Deep Dive: The Network Namespace
The network namespace is the cornerstone of container networking. Each network namespace contains:
Network interfaces: Each namespace has its own set of network interfaces, including the loopback (lo) interface. A container's eth0 is completely separate from another container's eth0.
IP addresses: IP addresses are bound to interfaces within a namespace. The same IP can exist in different namespaces simultaneously without conflict.
Routing tables: Each namespace maintains its own routing table, determining how packets are forwarded.
Firewall rules (iptables/nftables): Packet filtering rules are namespace-specific, enabling per-container firewalling.
Socket bindings: Ports are namespace-scoped. Two containers can both bind to port 80 without conflict.
Network-related sysctls: Kernel tuning parameters like TCP settings can be namespace-specific.
```bash
# List named network namespaces (processes start in the default 'init' namespace)
ip netns list

# Create a new network namespace
sudo ip netns add my_container_ns

# Execute commands inside the namespace
sudo ip netns exec my_container_ns ip addr show
# Output: only shows loopback (lo), in DOWN state
# Notice: the namespace is completely isolated - no eth0, no connectivity

# Compare to the host namespace
ip addr show
# Output: shows all host interfaces (eth0, lo, docker0, etc.)

# Each namespace has independent routing
sudo ip netns exec my_container_ns ip route show
# Output: empty - no routes configured

# And independent iptables rules
sudo ip netns exec my_container_ns iptables -L
# Output: empty chains - no rules

# Clean up
sudo ip netns delete my_container_ns
```

Network namespaces are implemented in the kernel via the CLONE_NEWNET flag to the clone() or unshare() system calls. When Docker creates a container, it calls unshare(CLONE_NEWNET) to create a new network namespace. The container's processes then run within this namespace, completely isolated from the host's network stack by default.
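You can observe the same syscall path from the shell with util-linux's unshare wrapper; this is a rough sketch of what a container runtime does before it configures any interfaces:

```bash
# unshare(CLONE_NEWNET) via the unshare(1) CLI: run a command in a fresh network namespace
sudo unshare --net bash -c 'ip addr show; ip route show'
# Output: only a DOWN loopback interface and an empty routing table -
# the new namespace starts with no connectivity at all
```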
The isolation-connectivity paradox:
Here lies the central challenge: a freshly created network namespace is completely isolated. It can't communicate with the host, other containers, or the outside world. The namespace has achieved perfect isolation—but also perfect uselessness.
Container networking must solve this paradox: how do we provide connectivity while maintaining isolation? The answer lies in virtual network devices that can bridge namespaces while preserving security boundaries.
Virtual Ethernet pairs (veth) are the primary mechanism for connecting network namespaces. A veth pair consists of two virtual network interfaces that act as a tunnel: packets sent through one end emerge from the other, regardless of which namespace each end resides in.
Think of a veth pair as a virtual Ethernet cable with two ends. You can place each end in a different namespace, creating a dedicated communication channel between them. This is how containers connect to the host network and, through it, to the outside world.
```bash
# Create a network namespace (simulating a container)
sudo ip netns add container_ns

# Create a veth pair
# veth-host: will stay in the host namespace
# veth-container: will be moved into the container namespace
sudo ip link add veth-host type veth peer name veth-container

# Move one end into the container namespace
sudo ip link set veth-container netns container_ns

# Verify: veth-container is now GONE from the host namespace
ip link show | grep veth
# Only shows: veth-host

# But it exists in the container namespace
sudo ip netns exec container_ns ip link show
# Shows: lo, veth-container

# Configure IP addresses
# Host side
sudo ip addr add 172.17.0.1/24 dev veth-host
sudo ip link set veth-host up

# Container side
sudo ip netns exec container_ns ip addr add 172.17.0.2/24 dev veth-container
sudo ip netns exec container_ns ip link set veth-container up
sudo ip netns exec container_ns ip link set lo up

# Test connectivity!
sudo ip netns exec container_ns ping -c 2 172.17.0.1
# SUCCESS: container can reach host via veth pair

ping -c 2 172.17.0.2
# SUCCESS: host can reach container

# The veth pair creates a Layer 2 tunnel between namespaces
```

How veth pairs work internally:
Kernel datastructure linking: When you create a veth pair, the kernel creates two net_device structures that point to each other. Each has a peer pointer to its partner.
Packet transmission: When the network stack calls dev_queue_xmit() on one veth interface, instead of going to hardware, the driver's xmit function directly calls netif_rx() on the peer interface in the other namespace.
No copying overhead: The packet's sk_buff structure (Linux kernel's representation of a network packet) is passed directly between namespaces. There's minimal copying—just pointer manipulation.
Bidirectional by nature: veth pairs are inherently bidirectional—they act as a pipe, not a one-way valve.
Veth pairs add minimal latency (typically <5 microseconds) compared to physical interfaces. However, every packet traverses both interfaces and the bridge (if used), which means CPU overhead scales with packet count rather than byte count. High packet-per-second (PPS) workloads are more affected than high-throughput, large-packet workloads.
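If you want to observe this overhead yourself, here is a rough sketch, assuming the container_ns/veth setup from the example above is still in place and iperf3 is installed on the host:

```bash
# Start an iperf3 server inside the namespace (daemonized)
sudo ip netns exec container_ns iperf3 -s -D

# Throughput test across the veth pair from the host side (large packets)
iperf3 -c 172.17.0.2 -t 5

# Round-trip latency across the pair (typically well under a millisecond on an idle host)
sudo ip netns exec container_ns ping -c 5 172.17.0.1
```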
With veth pairs, we can connect individual containers to the host namespace. But what about connecting containers to each other? And what about connecting them to the outside world?
The Linux bridge (software-defined switch) solves this. A bridge acts as a Layer 2 (Ethernet) switch: it learns MAC addresses, forwards frames between connected interfaces, and maintains a forwarding table exactly like a physical switch.
Docker's default docker0 bridge is the canonical example. Every container (in bridge mode) gets a veth pair: one end attached to docker0, the other placed inside the container as its eth0.
Bridge operation details:
MAC learning: When a frame arrives on a bridge port, the bridge records which MAC address lives behind which port. This is the forwarding database (FDB).
Frame forwarding: When a frame arrives, the bridge looks up the destination MAC in the FDB. If found, it forwards only to that port. If unknown, it floods to all ports (except the source).
Local addresses: The bridge itself has a MAC and IP address (172.17.0.1 for docker0). Frames destined for this address are delivered to the host's network stack.
External connectivity: For containers to reach the internet, packets must be NAT'd (masqueraded) by iptables. The source IP (container's private address) is replaced with the host's public IP.
```bash
# Create a bridge (similar to what Docker creates)
sudo ip link add docker0 type bridge
sudo ip addr add 172.17.0.1/16 dev docker0
sudo ip link set docker0 up

# Enable IP forwarding (required for routing between containers and outside)
sudo sysctl -w net.ipv4.ip_forward=1

# Add iptables rules for container internet access (NAT/masquerade)
sudo iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE

# Allow forwarding for container traffic
sudo iptables -A FORWARD -i docker0 -o eth0 -j ACCEPT
sudo iptables -A FORWARD -i eth0 -o docker0 -m state --state RELATED,ESTABLISHED -j ACCEPT

# Now containers connected to docker0 can reach the internet!

# View bridge ports (interfaces attached to bridges)
bridge link show
brctl show docker0   # legacy command

# View MAC forwarding database
bridge fdb show br docker0
```

Linux bridges work well for tens to hundreds of containers on a single host. However, bridge broadcast domains don't scale well—MAC flooding and broadcast storms become problematic at large scale. Additionally, bridges are host-local; containers on different hosts cannot communicate via bridges alone. This limitation drives the need for overlay networks and more sophisticated solutions covered later in this module.
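Before moving on to external access, it's worth tying the veth and bridge examples together: a namespace gets onto the bridge by plugging the host-side end of its veth pair into docker0, which is essentially what Docker does for every container on the default network. A minimal sketch, using a hypothetical bridged_ns namespace and an address from the 172.17.0.0/16 subnet configured above:

```bash
# Create a namespace to stand in for a container
sudo ip netns add bridged_ns

# Create a veth pair: one end for the bridge, one end for the namespace
sudo ip link add veth-br type veth peer name veth-ctr
sudo ip link set veth-ctr netns bridged_ns

# Attach the host-side end to docker0 and bring it up
sudo ip link set veth-br master docker0
sudo ip link set veth-br up

# Configure the namespace side with an address in the bridge's subnet
sudo ip netns exec bridged_ns ip addr add 172.17.0.3/16 dev veth-ctr
sudo ip netns exec bridged_ns ip link set veth-ctr up
sudo ip netns exec bridged_ns ip link set lo up

# A default route via the bridge IP gives the namespace a path off the host
sudo ip netns exec bridged_ns ip route add default via 172.17.0.1

# Verify L2 connectivity to the bridge; with IP forwarding and the MASQUERADE
# rule above, external addresses become reachable as well
sudo ip netns exec bridged_ns ping -c 2 172.17.0.1
```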
Containers on a bridge network have private IP addresses (like 172.17.0.x) that are not routable from the outside world. For external clients to reach containerized services, we need port mapping (also called port publishing or port forwarding).
When you run docker run -p 8080:80 nginx, you're instructing Docker to accept connections on host port 8080 and forward them to port 80 inside the nginx container.
This is implemented using Linux iptables NAT rules, specifically DNAT (Destination NAT) for incoming traffic.
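A quick way to see the result, assuming Docker is installed locally and using a hypothetical container name of web:

```bash
# Publish host port 8080 to container port 80
docker run -d --name web -p 8080:80 nginx
sleep 1   # give nginx a moment to start

# The request hits host:8080 and is DNAT'd to the container's port 80
curl -s http://localhost:8080 | head -n 4

# Ask Docker which host ports are mapped to the container
docker port web
# 80/tcp -> 0.0.0.0:8080
```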
```bash
# When you run: docker run -p 8080:80 nginx
# Docker creates rules equivalent to:

# 1. DNAT rule in nat table PREROUTING chain
# Packets arriving at host:8080 get their destination rewritten to container:80
sudo iptables -t nat -A PREROUTING -p tcp --dport 8080 \
  -j DNAT --to-destination 172.17.0.2:80

# 2. Also needed: DNAT for localhost access
# (PREROUTING doesn't catch traffic from the host itself)
sudo iptables -t nat -A OUTPUT -p tcp --dport 8080 \
  -j DNAT --to-destination 172.17.0.2:80

# 3. Masquerade rule for container replies (already set up for general access)
# Ensures return traffic has the correct source IP
sudo iptables -t nat -A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE

# View Docker's actual NAT rules
sudo iptables -t nat -L -n -v | grep -i docker

# The DOCKER chain contains the port mapping rules
sudo iptables -t nat -L DOCKER -n -v

# Example output:
# Chain DOCKER (2 references)
#  pkts bytes target prot opt in       out  source     destination
#     0     0 RETURN all  --  docker0  *    0.0.0.0/0  0.0.0.0/0
#     5   260 DNAT   tcp  --  !docker0 *    0.0.0.0/0  0.0.0.0/0    tcp dpt:8080 to:172.17.0.2:80
```

The NAT packet flow:
Inbound request (client → container):
The client connects to HOST_IP:8080. The DNAT rule rewrites the destination to 172.17.0.2:80, and the packet is forwarded across the bridge to the container.

Outbound response (container → client):

The container replies from 172.17.0.2. Connection tracking rewrites the source back to HOST_IP:8080, so the client sees the response come from the address it originally contacted.

Linux's connection tracking subsystem (conntrack) maintains state for every connection, enabling stateful NAT. It remembers the original destination of DNAT'd packets so return traffic can be correctly translated. View connection tracking entries with: sudo conntrack -L
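As a sketch of what conntrack records for the port mapping above (assuming the hypothetical web container from the earlier example is still running and the conntrack tool is installed):

```bash
# Generate a connection through the published port
curl -s http://localhost:8080 > /dev/null

# Inspect tracked TCP connections; the DNAT'd entry shows both tuples:
# the original destination (host:8080) and the reply source (172.17.0.2:80)
sudo conntrack -L -p tcp | grep 8080
```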
Container runtimes like Docker support multiple network modes, each with different isolation, performance, and connectivity characteristics. Understanding these modes is essential for choosing the right networking approach for your use case.
| Mode | Isolation | Performance | Use Case | Command Example |
|---|---|---|---|---|
| bridge (default) | High | Good | Most applications; inter-container communication with external access | docker run --network bridge ... |
| host | None | Best | Performance-critical apps; when port flexibility isn't needed | docker run --network host ... |
| none | Complete | N/A | Security-sensitive workloads requiring no network access | docker run --network none ... |
| container:id | Shared with specified container | Good | Sidecar patterns; tightly coupled containers | docker run --network container:nginx ... |
| macvlan | L2 isolation | Excellent | When containers need real network identities | docker run --network my_macvlan ... |
| ipvlan | L3 isolation | Excellent | L3-aware environments; avoids MAC issues | docker run --network my_ipvlan ... |
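As a quick illustration of the container: mode from the table above, here is a sketch of the sidecar pattern, assuming Docker is installed and using the public curlimages/curl image; the second container shares the first one's network namespace, so nginx is reachable on localhost:

```bash
# Start an nginx container on the default bridge network
docker run -d --name web-sidecar-demo nginx
sleep 2   # give nginx a moment to start

# Attach a one-off curl container to web-sidecar-demo's network namespace
docker run --rm --network container:web-sidecar-demo curlimages/curl \
  -s -o /dev/null -w '%{http_code}\n' http://localhost:80
# 200 - curl reached nginx over the shared loopback interface
```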
In host mode, the container shares the host's network namespace entirely. There's no network isolation—the container sees all host interfaces, uses the host's IP address, and binds directly to host ports.
When to use it: performance-critical workloads where the per-packet cost of veth, bridge, and NAT traversal matters, and where binding directly to the host's ports (with no port remapping) is acceptable.
```bash
# Run nginx in host mode - no network isolation
docker run -d --network host nginx

# nginx now binds directly to the host's port 80
# (no -p flag needed)
curl localhost:80
# Served by nginx

# Inside the container, you see the host's interfaces
docker exec <container_id> ip addr
# Shows eth0, docker0, etc. (same as the host)

# Performance benefit: no veth crossing, no bridge, no NAT
# But: complete loss of network isolation
```

We've covered the foundational building blocks that make container networking possible. These primitives—namespaces, veth pairs, bridges, and NAT—form the substrate upon which all container networking solutions are built.
What's next:
In the next page, we'll take a deep dive into Docker Networking specifically—exploring Docker's network drivers, custom bridge networks, DNS-based service discovery, and the nuances of container networking in Docker Compose and Swarm environments.
You now understand the fundamental Linux primitives that power container networking. These concepts—namespaces, veth pairs, bridges, and NAT—appear in every container networking solution, from simple Docker deployments to complex Kubernetes clusters. Master these, and the advanced topics will follow naturally.