Every TCP connection begins in a CLOSED state—nothing exists, no resources allocated, no communication possible. Through establishment, data transfer, and termination, the connection traverses various states, ultimately returning to CLOSED. This circular journey is the fundamental rhythm of TCP communication.
In this final page of our TCP state diagram exploration, we'll complete the picture by examining the passive closer's termination states (CLOSE_WAIT and LAST_ACK), understand what the CLOSED state truly represents, and synthesize our understanding into a comprehensive view of the complete TCP state machine. We'll also address common issues that arise from state mismanagement and establish best practices for robust connection handling.
By the end of this page, you'll have a complete mental model of TCP connection lifecycles—from birth through death and back to potential rebirth—enabling you to debug, optimize, and reason about TCP behavior in any context.
By the end of this page, you will understand: (1) The CLOSE_WAIT and LAST_ACK states (passive closer's perspective), (2) What CLOSED state means and resource cleanup, (3) The complete TCP state machine and all transitions, (4) Common pitfalls and the CLOSE_WAIT leak problem, (5) Best practices for connection lifecycle management, and (6) How to visualize and monitor the complete state machine in production.
We've examined the active closer's journey (ESTABLISHED → FIN_WAIT_1 → FIN_WAIT_2 → TIME_WAIT → CLOSED). Now let's examine the other side—the endpoint that receives the first FIN.
Entry: Receive FIN from peer while in ESTABLISHED; send ACK
Meaning: The peer has finished sending, but our application hasn't called close() yet.
Duration: Entirely controlled by our application—how long until it calls close()?
This state is fundamentally different from the active closer's states because:
Active Closer Passive Closer (You)
───────────── ───────────────────
ESTABLISHED ESTABLISHED
│ │
│ ────── FIN ──────────────────▶ │
FIN_WAIT_1 │
│ │ recv() returns 0
│ │ (EOF indicator)
│ ◀────── ACK ───────────────── │
FIN_WAIT_2 CLOSE_WAIT
│ │
│ (Application processes
│ remaining data)
│ │
│ close() called
│ ◀────── FIN ───────────────── │
│ LAST_ACK
TIME_WAIT │
│ ────── ACK ──────────────────▶ │
│ CLOSED
From the application's perspective:
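A minimal runnable sketch (a loopback connection inside one process; in real code the two endpoints would be separate programs) of what the passive closer observes—the only signal is recv() returning an empty result:

```python
import socket

# Loopback pair: listener + client standing in for a remote peer
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(('127.0.0.1', 0))
srv.listen(1)
client = socket.create_connection(srv.getsockname())
conn, _ = srv.accept()

client.close()           # peer sends FIN -> our side enters CLOSE_WAIT
data = conn.recv(1024)   # recv() returns b'' (EOF indicator)
assert data == b''
conn.close()             # CLOSE_WAIT -> LAST_ACK -> CLOSED
srv.close()
```

Until that final `conn.close()` runs, the kernel keeps the socket parked in CLOSE_WAIT, however long the application dawdles.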
If the application ignores the EOF and never calls close(), the socket remains stuck in CLOSE_WAIT indefinitely.
CLOSE_WAIT accumulation is almost always an application bug. The kernel is waiting for your application to call close(), but the application never does. This leaks file descriptors and socket memory. Unlike FIN_WAIT_2 (which can timeout for orphaned sockets), CLOSE_WAIT has no kernel timeout because the application is still in control.
Entry: Application calls close(); kernel sends FIN
Meaning: We've sent our FIN, waiting for peer's ACK
Duration: Brief—typically one RTT
| Aspect | CLOSE_WAIT | LAST_ACK |
|---|---|---|
| Entered after | Receiving FIN | Sending FIN |
| Waiting for | Application close() | ACK from peer |
| Duration control | Application | Network (RTT) |
| Risk | Application leak | Brief, no risk |
| Exit on | close() called | ACK received |
| Next state | LAST_ACK | CLOSED |
When the final ACK arrives, the connection transitions from LAST_ACK directly to CLOSED and all resources are released.
Note: The passive closer goes directly to CLOSED—no TIME_WAIT needed. TIME_WAIT is only on the side that sends the first FIN (the active closer). The passive closer's final segment was FIN, and the peer's ACK is sufficient confirmation.
The CLOSED state is a bit of a philosophical concept in TCP—it represents the absence of a connection rather than a connection in a particular state. A socket in CLOSED state isn't really a socket at all; it's the condition before creation or after destruction.
Before a connection, CLOSED is the implicit starting point—no state is tracked anywhere. After a connection, CLOSED means every trace of it has been removed.
When a connection reaches CLOSED, the kernel frees the Transmission Control Block (TCB), releases the send and receive buffers, and releases the port binding; once the descriptor is closed, the file descriptor slot is freed as well.
You generally cannot observe CLOSED state:
# This shows existing connections, not CLOSED ones
ss -tan
netstat -tan
# CLOSED connections are gone - nothing to show
The only way to "see" CLOSED is by not seeing the connection in state listings.
Not all paths to CLOSED are graceful:
RST (Reset) Reception:
Any State ─────── Receive RST ─────────▶ CLOSED
An RST immediately terminates the connection: no FIN exchange, no TIME_WAIT—the socket drops straight to CLOSED and any buffered data is discarded.
Common RST causes include: (1) a SYN arriving at a port with no listener (connection refused), (2) a segment arriving for a connection the peer no longer knows about (e.g., after a crash and reboot), (3) an application aborting via SO_LINGER with a zero timeout, and (4) firewalls or middleboxes injecting resets.
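The connection-refused case is easy to demonstrate: a SYN sent to a port with no listener is answered by an RST, which Python surfaces as ConnectionRefusedError (a sketch; the port-probing trick is illustrative and has a tiny reuse race):

```python
import socket

# Bind then close a listener to find a port that is currently closed
probe = socket.socket()
probe.bind(('127.0.0.1', 0))
port = probe.getsockname()[1]
probe.close()

try:
    # SYN to a closed port -> peer answers with RST -> connect() fails
    socket.create_connection(('127.0.0.1', port), timeout=2)
    refused = False
except ConnectionRefusedError:
    refused = True
print(refused)  # True: the RST aborted the handshake immediately
```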
/**
 * Demonstrating different closure behaviors
 */
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>   /* for close() */

/**
 * Graceful close: FIN sent, TIME_WAIT on active closer
 */
void graceful_close(int sockfd) {
    // Normal close: FIN exchange, TIME_WAIT applies
    close(sockfd);
}

/**
 * Abortive close: RST sent, no TIME_WAIT
 * Use with caution - may cause data loss
 */
void abortive_close(int sockfd) {
    struct linger so_linger;
    so_linger.l_onoff = 1;    // Enable linger
    so_linger.l_linger = 0;   // But with zero timeout = abort
    setsockopt(sockfd, SOL_SOCKET, SO_LINGER,
               &so_linger, sizeof(so_linger));
    // close() now sends RST instead of FIN
    close(sockfd);
    // Connection goes directly to CLOSED, no TIME_WAIT
    // WARNING: Any unsent data is lost!
    // WARNING: Peer receives RST, not graceful close!
}

/**
 * Half-close: Close write side, keep reading
 */
void half_close_write(int sockfd) {
    // Send FIN, but still receive data
    shutdown(sockfd, SHUT_WR);

    // Can still read from socket
    char buf[1024];
    while (recv(sockfd, buf, sizeof(buf), 0) > 0) {
        // Process remaining data from peer
    }

    // Now fully close
    close(sockfd);
}

Using SO_LINGER with zero timeout to avoid TIME_WAIT seems attractive but is dangerous: (1) Any data in the send buffer is discarded, (2) The peer receives RST, which may trigger error handling, (3) You lose the duplicate-segment protection TIME_WAIT provides, (4) Some middleware/firewalls may misbehave. Only use it for intentional abort scenarios.
Let's now synthesize everything into the complete TCP finite state machine. This diagram represents the canonical view from RFC 793, with all possible states and transitions.
| State | Description | Category |
|---|---|---|
| CLOSED | No connection exists | Baseline |
| LISTEN | Waiting for connection | Server passive open |
| SYN_SENT | SYN sent, awaiting SYN+ACK | Client connection |
| SYN_RECEIVED | SYN received, SYN+ACK sent | Server connection |
| ESTABLISHED | Connection open, data transfer | Stable |
| FIN_WAIT_1 | FIN sent, awaiting ACK | Active close |
| FIN_WAIT_2 | FIN ACKed, awaiting peer FIN | Active close |
| CLOSING | Both FINs sent, awaiting ACKs | Simultaneous close |
| TIME_WAIT | Waiting for delayed packets | Active close cleanup |
| CLOSE_WAIT | FIN received, awaiting app close | Passive close |
| LAST_ACK | FIN sent, awaiting final ACK | Passive close |
Connection Establishment (3 states):
Data Transfer (1 state):
Active Close (4 states):
Passive Close (2 states):
Baseline (1 state):
┌─────────────────────────────────────┐
│ │
│ CLOSED │
│ │
└───────────────┬───────────────────┬─┘
│ │
Passive Open │ │ Active Open
create socket│ │ send SYN
▼ ▼
┌─────────────┐ ┌─────────────┐
│ LISTEN │ │ SYN_SENT │
└──────┬──────┘ └──────┬──────┘
│ │
Rcv SYN │ │ Rcv SYN+ACK
Snd SYN+ACK│ │ Snd ACK
▼ │
┌─────────────┐ │
│SYN_RECEIVED │◀─────────────────────┤
└──────┬──────┘ Rcv SYN, Snd SYN+ACK │
│ Rcv ACK │
│ │
└──────────────┬──────────────┘
│
▼
┌─────────────┐
│ ESTABLISHED │
└──────┬──────┘
┌───────────────┴───────────────┐
Close │ │ Rcv FIN
Snd FIN │ │ Snd ACK
▼ ▼
┌─────────────┐ ┌─────────────┐
│ FIN_WAIT_1 │ │ CLOSE_WAIT │
└──────┬──────┘ └──────┬──────┘
┌─────────────┼─────────────┐ │
Rcv ACK│ Rcv FIN Rcv FIN+ACK │ Close
│ Snd ACK Snd ACK │ Snd FIN
▼ │ │ ▼
┌─────────────┐ │ │ ┌─────────────┐
│ FIN_WAIT_2 │ │ │ │ LAST_ACK │
└──────┬──────┘ │ │ └──────┬──────┘
│ ▼ │ │
Rcv FIN│ ┌─────────────┐ │ │ Rcv ACK
Snd ACK│ │ CLOSING │ │ │
│ └──────┬──────┘ │ │
│ │ Rcv ACK │ │
│ ▼ ▼ ▼
│ ┌─────────────────────┐ ┌─────────────┐
└─────▶│ TIME_WAIT │ │ CLOSED │
└──────────┬──────────┘ └─────────────┘
│ 2MSL Timeout
▼
┌─────────────┐
│ CLOSED │
└─────────────┘
Think of the state machine as two parallel tracks that meet at ESTABLISHED: (1) The setup track (CLOSED → LISTEN/SYN_SENT → SYN_RECEIVED → ESTABLISHED), (2) The teardown track splits into active closer (FIN_WAIT_1 → FIN_WAIT_2 → TIME_WAIT → CLOSED) and passive closer (CLOSE_WAIT → LAST_ACK → CLOSED). CLOSING is a brief detour for simultaneous close.
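One way to internalize the two tracks is to encode the transitions as data. A toy model of the diagram above (event names are informal labels, not any kernel API):

```python
# (state, event) -> next state; a subset of the RFC 793 transitions
TRANSITIONS = {
    ('CLOSED', 'passive_open'): 'LISTEN',
    ('CLOSED', 'active_open/snd_syn'): 'SYN_SENT',
    ('LISTEN', 'rcv_syn/snd_synack'): 'SYN_RECEIVED',
    ('SYN_SENT', 'rcv_synack/snd_ack'): 'ESTABLISHED',
    ('SYN_RECEIVED', 'rcv_ack'): 'ESTABLISHED',
    # Active close track
    ('ESTABLISHED', 'close/snd_fin'): 'FIN_WAIT_1',
    ('FIN_WAIT_1', 'rcv_ack'): 'FIN_WAIT_2',
    ('FIN_WAIT_1', 'rcv_fin/snd_ack'): 'CLOSING',
    ('FIN_WAIT_2', 'rcv_fin/snd_ack'): 'TIME_WAIT',
    ('CLOSING', 'rcv_ack'): 'TIME_WAIT',
    ('TIME_WAIT', '2msl_timeout'): 'CLOSED',
    # Passive close track
    ('ESTABLISHED', 'rcv_fin/snd_ack'): 'CLOSE_WAIT',
    ('CLOSE_WAIT', 'close/snd_fin'): 'LAST_ACK',
    ('LAST_ACK', 'rcv_ack'): 'CLOSED',
}

def run(start, events):
    """Walk the transition table; raises KeyError on an invalid transition."""
    state = start
    for ev in events:
        state = TRANSITIONS[(state, ev)]
    return state

# Full client lifecycle as the active closer: back to CLOSED
path = ['active_open/snd_syn', 'rcv_synack/snd_ack',
        'close/snd_fin', 'rcv_ack', 'rcv_fin/snd_ack', '2msl_timeout']
print(run('CLOSED', path))  # CLOSED
```

Walking both tracks with this table makes the symmetry obvious: every legal path starts and ends at CLOSED.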
Understanding the state machine helps diagnose real-world connection problems. Here are the most common issues and their state-based explanations.
Symptom: Thousands of connections in CLOSE_WAIT, growing over time
Cause: Application reads EOF (recv returns 0) but never calls close()
Diagnosis:
# Count CLOSE_WAIT sockets
ss -tan state close-wait | wc -l
# Find responsible process
ss -tanp state close-wait
# Watch growth rate
watch -n5 'ss -tan state close-wait | wc -l'
Common Code Patterns That Cause This:
# BUG: Never closes on clean disconnect
def handle_client(sock):
while True:
data = sock.recv(1024)
if not data:
break # EOF received, but socket never closed!
process(data)
# Missing: sock.close()
# FIX: Always close socket
def handle_client_fixed(sock):
try:
while True:
data = sock.recv(1024)
if not data:
break
process(data)
finally:
sock.close() # Always close, even on break
Solution: Audit code for socket lifecycle management. Use context managers or finally blocks to ensure close().
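In Python, the simplest fix is the socket's own context-manager support—close() runs on every exit path, including break and exceptions (a sketch; process() is a placeholder):

```python
import socket

def handle_client(sock):
    with sock:                # __exit__ calls sock.close() on any exit path
        while True:
            data = sock.recv(1024)
            if not data:      # EOF: peer sent FIN, we are in CLOSE_WAIT
                break         # leaving the with-block calls close()
            process(data)

def process(data):
    pass  # placeholder for application logic

# Demonstration with a socketpair standing in for a TCP peer
a, b = socket.socketpair()
b.close()                     # peer closes -> recv(a) returns b''
handle_client(a)
print(a.fileno())             # -1: the with-block closed the socket
```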
| State | Normal Count | Accumulation Cause | Fix |
|---|---|---|---|
| SYN_RECEIVED | Low (brief) | SYN flood attack | Enable SYN cookies, rate limit |
| ESTABLISHED | Varies | Slow clients, long-lived connections | Usually normal; add timeouts if needed |
| FIN_WAIT_1 | Very low | Network issues, ACKs not returning | Check network path |
| FIN_WAIT_2 | Low-medium | Slow peer close | Lower tcp_fin_timeout |
| CLOSE_WAIT | Very low | Application bug (missing close) | Fix application code |
| TIME_WAIT | Can be high | High connection rate (active closer) | Often normal; use tcp_tw_reuse |
| LAST_ACK | Very low | Final ACK lost | Usually brief; network issue if persisting |
Symptom: EMFILE or ENFILE errors, cannot accept new connections
Root cause: File descriptors exhausted—sockets are files in Unix
Often related to: CLOSE_WAIT leaks (handlers that never call close()), connections that are opened per request but never released, or a per-process fd limit (ulimit -n) set too low for the workload.
Diagnosis:
# Check file descriptor limits
ulimit -n # Per-process limit
cat /proc/sys/fs/file-max # System-wide limit
# Count open fds for a process
ls /proc/<PID>/fd | wc -l
# Count total open sockets
ss -s
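From inside a process you can compare current fd usage against the soft limit (a Linux-specific sketch reading /proc; the helper name and 80% threshold are illustrative):

```python
import os
import resource

def fd_usage():
    """Return (open_fds, soft_limit) for the current process (Linux only)."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    open_fds = len(os.listdir('/proc/self/fd'))
    return open_fds, soft

open_fds, soft = fd_usage()
print(f"{open_fds}/{soft} file descriptors in use")
if open_fds > 0.8 * soft:
    print("WARNING: approaching fd limit - check for CLOSE_WAIT leaks")
```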
Symptom: Many SYN_RECEIVED sockets, server unresponsive
Cause: SYN flood attack or misconfigured load balancer
Solution:
# Enable SYN cookies
sysctl -w net.ipv4.tcp_syncookies=1
# Increase SYN backlog
sysctl -w net.ipv4.tcp_max_syn_backlog=4096
# Reduce SYN+ACK retries (faster cleanup)
sysctl -w net.ipv4.tcp_synack_retries=2
CLOSE_WAIT is the only state where the 'problem' is definitively in your application code and not the network, peer, or kernel. The peer has properly closed; you haven't responded. If you see CLOSE_WAIT accumulating, look at your application's socket handling, not system tuning.
Managing TCP connections correctly requires attention to the entire lifecycle. Here are best practices derived from our understanding of the state machine.
1. Set options before bind()/listen()—SO_REUSEADDR in particular must be set before bind() to take effect
// Right: SO_REUSEADDR set before bind
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
bind(sockfd, (struct sockaddr *)&addr, sizeof(addr));
listen(sockfd, backlog);
// Wrong: setting SO_REUSEADDR after bind is too late
bind(sockfd, (struct sockaddr *)&addr, sizeof(addr));
listen(sockfd, backlog);
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)); // Too late!
2. Use SO_REUSEADDR for servers
int opt = 1;
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
3. Handle partial reads/writes
# Don't assume recv gets everything
data = b''
while len(data) < expected_length:
chunk = sock.recv(expected_length - len(data))
if not chunk:
raise ConnectionError("Peer closed prematurely")
data += chunk
4. Set appropriate timeouts
sock.settimeout(30) # 30 second timeout for all operations
5. Monitor for EOF in receive loops
while True:
data = sock.recv(4096)
if not data: # EOF - peer closed
break # Exit gracefully, then close
process(data)
"""Robust socket handling with proper lifecycle management"""import socketimport logging class ManagedSocket: """Context manager for proper socket lifecycle""" def __init__(self, sock_or_tuple): if isinstance(sock_or_tuple, socket.socket): self.sock = sock_or_tuple else: # Create and connect host, port = sock_or_tuple self.sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM) self.sock.settimeout(30) # Connection and I/O timeout self.sock.connect((host, port)) def __enter__(self): return self.sock def __exit__(self, exc_type, exc_val, exc_tb): try: # Graceful shutdown: send FIN, wait for peer's FIN self.sock.shutdown(socket.SHUT_WR) # Drain any remaining data self.sock.settimeout(5) while self.sock.recv(4096): pass except (socket.error, OSError): pass # Already closed or error - acceptable finally: self.sock.close() # Always close return False # Don't suppress exceptions def handle_client_robustly(client_sock, addr): """Properly handle a client connection""" logging.info(f"Connection from {addr}") try: with ManagedSocket(client_sock) as sock: while True: # Receive with timeout sock.settimeout(60) # Idle timeout data = sock.recv(4096) if not data: # EOF: peer sent FIN logging.info(f"{addr} closed connection") break # Process and respond response = process_request(data) # Handle partial sends total_sent = 0 while total_sent < len(response): sent = sock.send(response[total_sent:]) if sent == 0: raise ConnectionError("Socket connection broken") total_sent += sent except socket.timeout: logging.warning(f"{addr} timed out") except ConnectionError as e: logging.warning(f"{addr} connection error: {e}") except Exception as e: logging.error(f"{addr} unexpected error: {e}") # Socket automatically closed by context manager logging.info(f"{addr} handler complete") def process_request(data): # Application logic here return b"OK"6. Always close sockets (use finally or context managers)
try:
# Use socket
pass
finally:
sock.close() # ALWAYS executes
7. Handle both graceful and abrupt closure
try:
data = sock.recv(4096)
if not data: # Graceful close
handle_peer_close()
except ConnectionResetError: # Abrupt close (RST)
handle_reset()
finally:
sock.close()
8. Consider half-close for protocols that need it
sock.shutdown(socket.SHUT_WR) # Send FIN, keep receiving
while True:
response = sock.recv(4096)
if not response:
break
process(response)
sock.close()
In production systems, monitoring TCP states provides early warning of problems and helps with capacity planning.
Connection Counts by State:
# One-liner for all states
ss -tan | awk 'NR>1 {states[$1]++} END {for (s in states) print s, states[s]}'
# Specific states to track:
# - ESTABLISHED: Active connections (capacity indicator)
# - TIME_WAIT: Recent closures (normal at high traffic)
# - CLOSE_WAIT: Application bugs (should be near zero)
# - SYN_RECV: Connection queue depth (attack indicator)
Connection Rate:
# Cumulative active opens (column 6 of the Tcp: line); watch the delta for the rate
watch -n1 'cat /proc/net/snmp | grep Tcp: | tail -1 | awk "{print \$6}"'
Error Counters:
# See retransmits, resets, failures
netstat -s | grep -i "retransmit\|reset\|failed\|overflow"
#!/usr/bin/env python3
"""TCP State Metrics Exporter
Outputs Prometheus-compatible metrics for monitoring

Run as: python tcp_state_exporter.py
Scrape at: http://localhost:9100/metrics
"""

from http.server import HTTPServer, BaseHTTPRequestHandler
import subprocess

LISTEN_PORT = 9100

def get_tcp_states():
    """Get TCP connection counts by state"""
    result = subprocess.run(
        ['ss', '-tan'],
        capture_output=True, text=True
    )
    states = {}
    for line in result.stdout.strip().split('\n')[1:]:
        parts = line.split()
        if parts:
            state = parts[0]
            states[state] = states.get(state, 0) + 1
    return states

def get_tcp_stats():
    """Get TCP statistics from /proc/net/snmp"""
    with open('/proc/net/snmp', 'r') as f:
        content = f.read()
    # Parse the Tcp: header/value line pair
    lines = content.strip().split('\n')
    for i, line in enumerate(lines):
        if line.startswith('Tcp:') and i + 1 < len(lines):
            headers = line.split()[1:]
            values = lines[i + 1].split()[1:]
            return dict(zip(headers, map(int, values)))
    return {}

class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != '/metrics':
            self.send_response(404)
            self.end_headers()
            return

        # Collect metrics
        states = get_tcp_states()
        stats = get_tcp_stats()

        # Format as Prometheus metrics
        lines = []

        # Connection states
        lines.append('# HELP tcp_connections TCP connections by state')
        lines.append('# TYPE tcp_connections gauge')
        for state, count in states.items():
            lines.append(f'tcp_connections{{state="{state}"}} {count}')

        # TCP statistics
        if stats:
            lines.append('# HELP tcp_active_opens Active connection openings')
            lines.append('# TYPE tcp_active_opens counter')
            lines.append(f'tcp_active_opens {stats.get("ActiveOpens", 0)}')

            lines.append('# HELP tcp_passive_opens Passive connection openings')
            lines.append('# TYPE tcp_passive_opens counter')
            lines.append(f'tcp_passive_opens {stats.get("PassiveOpens", 0)}')

            lines.append('# HELP tcp_retransmits Segment retransmissions')
            lines.append('# TYPE tcp_retransmits counter')
            lines.append(f'tcp_retransmits {stats.get("RetransSegs", 0)}')

        # Send response
        self.send_response(200)
        self.send_header('Content-Type', 'text/plain')
        self.end_headers()
        self.wfile.write('\n'.join(lines).encode())

if __name__ == '__main__':
    server = HTTPServer(('', LISTEN_PORT), MetricsHandler)
    print(f"TCP metrics available at http://localhost:{LISTEN_PORT}/metrics")
    server.serve_forever()

| Metric | Warning Threshold | Critical Threshold | Notes |
|---|---|---|---|
| CLOSE_WAIT count | 100 | 1000 | Almost always a bug |
| CLOSE_WAIT growth | Any sustained growth | — | Investigate immediately |
| TIME_WAIT count | 50% of ephemeral range | 80% | May need tcp_tw_reuse |
| SYN_RECV count | 1000 | 5000 | Possible SYN flood |
| ESTABLISHED count | Depends on capacity | Near ulimit | Scale up |
| Retransmit rate | 1% of segments | 5% | Network issues |
Before setting alert thresholds, establish baselines for your specific workload. A load balancer may normally have 10,000+ TIME_WAIT sockets; a database server may have only a few. What's 'normal' depends entirely on your application's connection patterns.
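A monitoring check against thresholds like those above can be written as a pure function over the per-state counts (state names follow `ss` output conventions; the specific thresholds are illustrative, not a standard):

```python
# Illustrative per-state thresholds: state -> (warning, critical)
THRESHOLDS = {
    'CLOSE-WAIT': (100, 1000),
    'SYN-RECV': (1000, 5000),
}

def alert_level(state, count, thresholds=THRESHOLDS):
    """Return 'ok', 'warning', or 'critical' for a state's connection count."""
    warn, crit = thresholds.get(state, (None, None))
    if warn is None:
        return 'ok'          # no threshold defined for this state
    if count >= crit:
        return 'critical'
    if count >= warn:
        return 'warning'
    return 'ok'

# Example checks against counts as reported by `ss -tan`
print(alert_level('CLOSE-WAIT', 5))     # ok
print(alert_level('CLOSE-WAIT', 250))   # warning
print(alert_level('SYN-RECV', 9000))    # critical
```

Feeding this from the state counts your exporter already collects gives a simple first line of alerting, to be tuned once you know your baseline.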
We've completed our comprehensive exploration of the TCP state machine—from the emptiness of CLOSED through establishment, data transfer, and termination, back to CLOSED.
| State | Phase | Key Characteristic |
|---|---|---|
| CLOSED | Baseline | No connection exists |
| LISTEN | Establishment | Server waiting for connections |
| SYN_SENT | Establishment | Client sent SYN, awaiting SYN+ACK |
| SYN_RECEIVED | Establishment | Server received SYN, sent SYN+ACK |
| ESTABLISHED | Data Transfer | Stable connection, bidirectional data |
| FIN_WAIT_1 | Active Close | Sent FIN, awaiting ACK |
| FIN_WAIT_2 | Active Close | FIN ACKed, awaiting peer's FIN |
| CLOSING | Simultaneous Close | Both FINs sent, awaiting ACKs |
| TIME_WAIT | Active Close | 2MSL wait for packet safety |
| CLOSE_WAIT | Passive Close | Received FIN, awaiting app close |
| LAST_ACK | Passive Close | Sent FIN, awaiting final ACK |
You've now mastered the TCP state diagram—one of the most important concepts in network programming. This knowledge enables you to:
The TCP state machine may seem complex at first, but it's actually a beautifully systematic design that handles all the edge cases of reliable communication over unreliable networks. Each state has a purpose; each transition has meaning. This systematic approach is why TCP has been the foundation of the Internet for over four decades.
Congratulations! You've completed the TCP State Diagram module. You now possess a comprehensive understanding of all 11 TCP states, their transitions, and their practical implications. This knowledge is fundamental to network engineering, systems programming, and debugging production issues. The state machine you've learned is the same one running in billions of devices worldwide, enabling the reliable communication that underlies modern digital life.