If the three-way handshake is TCP's greeting, the closing sequence is its farewell—and as in life, saying goodbye properly is just as important as saying hello. TCP cannot simply "hang up" because the network is unreliable and both directions of communication must close independently. The result is a choreographed four-way handshake for graceful termination.
The FIN_WAIT states (FIN_WAIT_1 and FIN_WAIT_2) and the notorious TIME_WAIT state represent the active closer's side of this termination sequence. These states are often misunderstood, sometimes blamed for "too many connections," and occasionally bypassed incorrectly. Understanding them deeply is essential for operating high-connection-rate servers, debugging connection issues, and designing robust network applications.
This page explores the termination sequence from the perspective of the endpoint that initiates the close. We'll examine each state, understand why it exists, learn about the infamous TIME_WAIT delay and why it's actually crucial, and see how improper handling of these states can destabilize systems.
By the end of this page, you will understand: (1) The four-way handshake for TCP connection termination, (2) The FIN_WAIT_1 and FIN_WAIT_2 states, (3) The critical purpose of TIME_WAIT and the 2MSL timeout, (4) Why TIME_WAIT accumulation isn't always a problem, (5) When and how to use SO_REUSEADDR and SO_REUSEPORT, and (6) Debugging connection termination issues.
While connection establishment uses a three-way handshake, termination uses a four-way handshake (sometimes called four-segment exchange). This asymmetry exists because TCP connections are full-duplex—each direction must be closed independently.
The three-way handshake works because connection establishment is symmetric—both sides start with nothing and agree to connect together. Termination is different: when one side finishes sending, the other may still have data to deliver, so each direction must be shut down on its own schedule. This requires four segments: FIN from A, ACK from B, FIN from B, ACK from A.
Active Closer (A)                          Passive Closer (B)
─────────────────                          ──────────────────
ESTABLISHED                                ESTABLISHED
     │                                          │
close() called                                  │
     │                                          │
     │ ────────── FIN (seq=x) ──────────▶       │
FIN_WAIT_1                                      │
     │                                          │ Receive FIN
     │                                          │ (can still send)
     │ ◀───────── ACK (ack=x+1) ────────        │
FIN_WAIT_2                                 CLOSE_WAIT
     │                                          │
     │        (B sends remaining data)          │
     │                                          │
     │                                     close() called
     │ ◀───────── FIN (seq=y) ──────────        │
     │                                     LAST_ACK
     │ ────────── ACK (ack=y+1) ────────▶       │
TIME_WAIT                                  CLOSED
     │
[2MSL wait]
     │
CLOSED
In FIN_WAIT_2—after our FIN has been acknowledged but before the peer sends its own—something interesting can happen: the half-close. At this point we can no longer send, but we can still receive; the peer's direction remains fully open. This is a legitimate, fully supported state:
// Application A: half-close the connection
char buf[4096];
ssize_t n;

shutdown(sockfd, SHUT_WR);   // Send FIN; our outbound direction is now closed

// The inbound direction stays open—drain whatever B still sends
while ((n = recv(sockfd, buf, sizeof(buf), 0)) > 0) {
    process(buf);
}

close(sockfd);               // recv() returned 0: B has closed; release the fd
The connection is "half-closed"—one direction is closed, the other is open. This is useful when a sender needs to signal end-of-input while still reading the peer's response: for example, a client streams a request, half-closes to say "that's everything," then reads the complete reply.
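The same half-close pattern can be exercised end to end on loopback. Below is a minimal Python sketch (the `server` helper and the loopback setup are illustrative, not from the C example above): the client sends a request, half-closes with `shutdown(SHUT_WR)`, and still reads the server's reply.

```python
import socket
import threading

def server(listener: socket.socket) -> None:
    conn, _ = listener.accept()
    data = b""
    while chunk := conn.recv(1024):   # Reads until the client's FIN (recv returns b"")
        data += chunk
    conn.sendall(b"got %d bytes" % len(data))  # Reply AFTER the client half-closed
    conn.close()

listener = socket.socket()
listener.bind(("127.0.0.1", 0))       # Port 0: let the kernel pick a free port
listener.listen(1)
t = threading.Thread(target=server, args=(listener,))
t.start()

client = socket.socket()
client.connect(listener.getsockname())
client.sendall(b"hello")
client.shutdown(socket.SHUT_WR)        # Send FIN; our sending direction closes
reply = client.recv(1024)              # Receive direction is still open
print(reply.decode())                  # -> got 5 bytes
client.close()
t.join()
listener.close()
```

Note that `recv()` on the server keeps working after the client's `shutdown`: the FIN only ends the client-to-server direction.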
If both sides call close() at approximately the same time, both enter FIN_WAIT_1 simultaneously, then both transition to CLOSING state (not FIN_WAIT_2), then both enter TIME_WAIT after receiving the other's FIN. This is similar to simultaneous open but for closing—rare but fully supported by the state machine.
The FIN_WAIT states represent the active closer waiting for the connection to fully terminate. Let's examine each in detail.
Entry: Application calls close() or shutdown(SHUT_WR); kernel sends FIN
Waiting for: ACK of our FIN (and possibly peer's FIN)
Exit conditions:
| Received | Transition to | Notes |
|---|---|---|
| ACK only | FIN_WAIT_2 | Normal case, peer hasn't closed yet |
| FIN + ACK | TIME_WAIT | Peer ACKs our FIN and closes in the same segment |
| FIN only | CLOSING | Simultaneous close (rare) |
Important characteristics: FIN_WAIT_1 is normally brief—about one RTT—and if the ACK never arrives, the kernel retransmits the FIN rather than timing out. The possible transitions:
FIN_WAIT_1
│
├── Receive ACK ────────────▶ FIN_WAIT_2
│
├── Receive FIN ────────────▶ CLOSING
│ (send ACK)
│
└── Receive FIN+ACK ────────▶ TIME_WAIT
(send ACK)
Entry: Our FIN has been acknowledged
Waiting for: Peer's FIN (closing the other direction)
Exit condition: Receive FIN → send ACK → transition to TIME_WAIT
The FIN_WAIT_2 Danger:
Unlike FIN_WAIT_1, FIN_WAIT_2 can persist indefinitely if the peer never sends FIN:
This creates a resource leak. The socket consumes kernel memory forever.
# Linux defense against orphaned FIN_WAIT_2

# View current setting
sysctl net.ipv4.tcp_fin_timeout
# Default: 60 (seconds)

# This timeout ONLY applies to FIN_WAIT_2 sockets that are:
# - Orphaned (not attached to any process)
# - The socket fd was closed, not just shutdown()

# Lower the timeout if you have FIN_WAIT_2 accumulation
sudo sysctl -w net.ipv4.tcp_fin_timeout=30

# Check for FIN_WAIT_2 accumulation
ss -tan state fin-wait-2 | wc -l

# Watch for patterns (remote host or port)
ss -tan state fin-wait-2 | awk '{print $4}' | sort | uniq -c | sort -rn

# Note: tcp_fin_timeout is NOT related to TIME_WAIT timeout
# TIME_WAIT is hardcoded to 2*MSL in most systems

Different situations have different timeout behaviors:
Socket attached to process (not closed yet): tcp_fin_timeout does not apply; the kernel waits indefinitely for the peer's FIN.
Socket orphaned (close() called, process detached): tcp_fin_timeout applies; the kernel discards the socket after 60 seconds by default.
Socket with shutdown() but not close(): still attached, so tcp_fin_timeout does not apply—pair shutdown() with an application-level timeout before calling close().
| Aspect | FIN_WAIT_1 | FIN_WAIT_2 |
|---|---|---|
| Waiting for | ACK of our FIN | Peer's FIN |
| Duration | Brief (1 RTT) | Variable (peer's decision) |
| Can receive data? | Yes | Yes (until peer's FIN) |
| Default timeout | None (retransmit FIN) | 60s for orphaned sockets |
| Resource risk | Low | Medium (can accumulate) |
| Typical cause | Packet in flight | Peer still sending or slow close |
Many FIN_WAIT_2 connections usually indicate a misbehaving remote—it received and ACKed your FIN but never closes its own side in a reasonable time. Check whether specific remote hosts are responsible. This is the peer's fault, not yours, but you can defend with tcp_fin_timeout and application-level timeouts.
TIME_WAIT is simultaneously one of TCP's most important mechanisms and one of the most frequently complained about. Servers accumulate thousands of TIME_WAIT sockets. Developers try to "optimize" it away. Often, these attempts cause more problems than they solve.
Understanding TIME_WAIT means understanding why it exists and when (if ever) it can be safely modified.
The active closer enters TIME_WAIT after its own FIN has been acknowledged and it has received and ACKed the peer's FIN. At this point, the connection is logically closed—no more data in either direction. But the socket doesn't immediately disappear.
MSL = Maximum Segment Lifetime
The MSL is the maximum time a TCP segment can exist in the network. RFC 793 recommends 2 minutes, but implementations vary:
| System | MSL | TIME_WAIT Duration (2×MSL) |
|---|---|---|
| Linux | 30 seconds | 60 seconds |
| BSD/macOS | 30 seconds | 60 seconds |
| Windows | 120 seconds (2 min) | 240 seconds (4 min) |
| Solaris | 60 seconds | 120 seconds |
The TIME_WAIT state lasts exactly 2×MSL before transitioning to CLOSED.
Purpose 1: Reliable Termination
The final ACK in the four-way handshake can be lost. If it is:
Active Closer                     Passive Closer
──────────────                    ──────────────
TIME_WAIT                         LAST_ACK
     │                                 │
     │ ──── ACK ────▶ [LOST]           │
     │                                 │
     │                            (retransmit FIN)
     │ ◀──────────── FIN ──────────    │
     │                                 │
     │ ──── ACK (retransmit) ──────▶   │
     │                            CLOSED
Without TIME_WAIT: the active closer would already be CLOSED; the retransmitted FIN would be answered with a RST, and the passive closer would abort instead of terminating cleanly.
With TIME_WAIT: the active closer is still around to retransmit the final ACK, so the passive closer can reach CLOSED gracefully.
Purpose 2: Prevent Old Duplicate Segments
The network may have old packets from this connection still floating around:
Scenario without TIME_WAIT:
1. Connection A:1234 ↔ B:80 closed
2. Packet from old connection still in network (delayed by congestion)
3. New connection uses same ports: A:1234 ↔ B:80
4. Old packet arrives, sequence number happens to be valid
5. Old data accepted as part of new connection → DATA CORRUPTION
With 2MSL TIME_WAIT, any old packets from the previous connection will have exceeded their maximum segment lifetime and been discarded by the network before the same port pair can be reused.
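A toy model (illustrative only—real TCP acceptance checks are more involved) shows the failure mode: a delayed segment from the old incarnation is accepted whenever its sequence number happens to fall inside the new connection's receive window.

```python
def accepted(old_seq: int, rcv_nxt: int, rcv_wnd: int) -> bool:
    """Simplified in-window check: would this stale segment be accepted?"""
    return rcv_nxt <= old_seq < rcv_nxt + rcv_wnd

# The new connection on the same 4-tuple happens to be near the old sequence space:
print(accepted(old_seq=1_000_500, rcv_nxt=1_000_000, rcv_wnd=65_535))  # True -> corruption
# A segment far outside the window is safely rejected:
print(accepted(old_seq=5_000_000, rcv_nxt=1_000_000, rcv_wnd=65_535))  # False
```

TIME_WAIT removes the race entirely: by the time the port pair can be reused, no old segment can still be in flight.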
TIME_WAIT may look wasteful—sockets doing nothing for 60+ seconds. But removing or shortening it risks corrupting subsequent connections. The "fix" can cause bugs far worse than the "problem." Think carefully before attempting to bypass TIME_WAIT.
High-traffic servers—especially those making many outbound connections—can accumulate large numbers of TIME_WAIT sockets. Let's understand when this is a problem and what to do about it.
Server closes connections (active closer): when the server initiates the close—common for HTTP servers without keep-alive—the TIME_WAIT sockets accumulate on the server side.
Client-like pattern: hosts that open many outbound connections (proxies, crawlers, service-to-service callers) accumulate TIME_WAIT on their ephemeral ports—the riskier case, since those ports can run out.
The Math:
Connections per second × TIME_WAIT duration = accumulation
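Concretely (a back-of-the-envelope sketch; the close rate and port-range figures below are assumptions for illustration):

```python
def time_wait_backlog(closes_per_second: float, time_wait_seconds: float = 60.0) -> float:
    """Steady-state TIME_WAIT count = close rate x TIME_WAIT duration (Little's law)."""
    return closes_per_second * time_wait_seconds

backlog = time_wait_backlog(1_000)      # e.g., a proxy actively closing 1,000 conns/s
print(backlog)                          # 60000.0
# Against the default Linux ephemeral range (32768-60999, ~28,000 ports per destination):
print(backlog > 28_000)                 # True -> port-exhaustion risk to a single host
```

The same 1,000 closes/s spread across many destinations is harmless; concentrated on one (destination IP, port) pair, it exhausts the 4-tuple space.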
Usually no: a TIME_WAIT socket holds only a small kernel structure—no buffers, no process—so even tens of thousands cost little memory and no CPU.
Sometimes yes: when making many outbound connections to a single destination, each TIME_WAIT pins a local ephemeral port for the full 2MSL; the default range (32768–60999 on Linux, roughly 28,000 ports) can be exhausted, and new connect() calls fail with EADDRNOTAVAIL.
#!/bin/bash
# Analyze TIME_WAIT socket situation

echo "=== TIME_WAIT Analysis ==="

# Count TIME_WAIT sockets
TW_COUNT=$(ss -tan state time-wait | wc -l)
echo "Current TIME_WAIT count: $TW_COUNT"

# Context: available ephemeral ports
EPH_LOW=$(sysctl -n net.ipv4.ip_local_port_range | awk '{print $1}')
EPH_HIGH=$(sysctl -n net.ipv4.ip_local_port_range | awk '{print $2}')
EPH_RANGE=$((EPH_HIGH - EPH_LOW))
echo "Ephemeral port range: $EPH_LOW-$EPH_HIGH ($EPH_RANGE ports)"

# TIME_WAIT per remote destination (if accumulating with specific host)
echo ""
echo "=== TIME_WAIT by Remote Address ==="
ss -tan state time-wait | awk '{print $4}' | sort | uniq -c | sort -rn | head -10

# Risk assessment
if [ $TW_COUNT -gt $((EPH_RANGE / 2)) ]; then
    echo ""
    echo "⚠️  WARNING: TIME_WAIT count is high relative to available ports"
    echo "   Consider: SO_REUSEADDR, connection pooling, or multiple source IPs"
else
    echo ""
    echo "✓ TIME_WAIT count is within safe range"
fi

# Show sysctl settings
echo ""
echo "=== Relevant Sysctl Settings ==="
sysctl net.ipv4.tcp_tw_reuse 2>/dev/null || echo "tcp_tw_reuse: not available"
sysctl net.ipv4.tcp_max_tw_buckets 2>/dev/null || echo "tcp_max_tw_buckets: not available"

1. SO_REUSEADDR (Most Common)
int opt = 1;
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
Allows binding to an address that still has sockets in TIME_WAIT. This is safe, standard practice for listening servers: without it, a restarted server's bind() fails with EADDRINUSE until every old connection's TIME_WAIT expires.
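The same option in Python looks like this (a minimal sketch; the helper name `make_listener` is invented for illustration). The key detail is that the option must be set before bind():

```python
import socket

def make_listener(host: str, port: int) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)  # Must precede bind()
    s.bind((host, port))    # Succeeds even if old connections linger in TIME_WAIT
    s.listen(128)
    return s

srv = make_listener("127.0.0.1", 0)     # Port 0: kernel assigns a free port
print(srv.getsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR) != 0)  # True
```

Setting it after bind() has no effect on the bind that already happened, which is a common mistake when servers fail to restart cleanly.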
2. tcp_tw_reuse (Linux Only)
sysctl -w net.ipv4.tcp_tw_reuse=1
Allows reusing TIME_WAIT sockets for outbound connections when safe: the kernel relies on TCP timestamps to guarantee that segments from the old incarnation can't be mistaken for the new one, and the setting applies only to the connecting (client) side.
3. Connection Pooling
Reuse established connections instead of opening new ones—HTTP keep-alive, database connection pools, and long-lived RPC channels all amortize one handshake (and one eventual TIME_WAIT) across many requests.
This is the best solution—avoids the problem entirely.
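A minimal pooling sketch (the `ConnectionPool` class and its method names are invented for illustration; production code would also need liveness checks and thread-safety around socket use):

```python
import socket
from queue import Queue, Empty, Full

class ConnectionPool:
    """Reuse TCP connections instead of paying handshake + TIME_WAIT per request."""
    def __init__(self, host: str, port: int, size: int = 4):
        self.addr = (host, port)
        self.idle: Queue = Queue(maxsize=size)

    def acquire(self) -> socket.socket:
        try:
            return self.idle.get_nowait()                # Reuse an idle connection
        except Empty:
            return socket.create_connection(self.addr)   # Open only when none idle

    def release(self, conn: socket.socket) -> None:
        try:
            self.idle.put_nowait(conn)                   # Park it for the next request
        except Full:
            conn.close()                                 # Pool full: this close -> TIME_WAIT

# Demo against a local listener:
listener = socket.socket()
listener.bind(("127.0.0.1", 0))
listener.listen(4)
pool = ConnectionPool(*listener.getsockname())
c1 = pool.acquire()
pool.release(c1)
c2 = pool.acquire()
print(c1 is c2)   # True: one connection served both "requests"—no new handshake
listener.close()
```

Every reuse is one handshake and one TIME_WAIT that never happens, which is why pooling attacks the accumulation at its source.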
4. Multiple Source IPs
For heavy outbound traffic to a single destination: each additional source IP multiplies the number of usable 4-tuples, because the ephemeral port limit applies per (source IP, destination IP, destination port) combination.
tcp_tw_recycle (REMOVED from Linux 4.12+)
A former setting that aggressively recycled TIME_WAIT sockets using per-host TCP timestamps. It broke clients behind NAT—many devices sharing one IP present inconsistent timestamps, so their connections were silently dropped—and the option was removed in Linux 4.12.
Reducing MSL/TIME_WAIT Duration
Some systems allow this, but shortening the wait directly weakens the protection against old duplicate segments and lost final ACKs—you trade correctness for a smaller socket table.
In order of preference: (1) Use connection pooling/keep-alive to reuse connections, (2) Enable tcp_tw_reuse for outbound-heavy workloads, (3) Add source IPs if connecting to single destination at high rate, (4) Accept TIME_WAIT accumulation if it's not causing actual problems. Don't chase low TIME_WAIT counts as a goal in itself.
The CLOSING state is a rare but valid state that occurs during simultaneous close—when both endpoints send FIN at approximately the same time.
Host A                            Host B
  │                                 │
close() called                 close() called
  │                                 │
  │ ───────── FIN ──────────▶       │
FIN_WAIT_1                          │
  │ ◀───────────── FIN ──────       │
  │                            FIN_WAIT_1
  │                                 │
CLOSING                        CLOSING
  │ ───────── ACK ──────────▶       │
  │ ◀───────────── ACK ──────       │
  │                                 │
TIME_WAIT                      TIME_WAIT
| From State | Receive | Action | To State |
|---|---|---|---|
| FIN_WAIT_1 | FIN (no ACK) | Send ACK | CLOSING |
| CLOSING | ACK | — | TIME_WAIT |
In normal close: the active closer's FIN is ACKed before the peer's FIN arrives, so it passes through FIN_WAIT_2.
In simultaneous close: each side receives the other's FIN before the ACK of its own FIN, so both pass through CLOSING instead. Both paths converge on TIME_WAIT.
Simultaneous close is uncommon because applications usually have asymmetric close patterns—one side (typically the client, or a server enforcing a timeout) decides to hang up first.
However, it can occur when both endpoints use the same idle timeout, or when a shutdown signal reaches both processes at nearly the same moment.
                     ESTABLISHED
                          │
                   Application close()
                       Send FIN
                          │
                          ▼
                     FIN_WAIT_1
                          │
   ┌──────────────────────┼──────────────────────┐
   │                      │                      │
Rcv ACK                Rcv FIN              Rcv FIN+ACK
   │                   Snd ACK                Snd ACK
   │                      │                      │
   ▼                      ▼                      │
FIN_WAIT_2             CLOSING                   │
   │                      │                      │
Rcv FIN                Rcv ACK                   │
Snd ACK                   │                      │
   │                      │                      │
   └──────────────────────┴──────────────────────┘
                          │
                          ▼
                     TIME_WAIT
                          │
                     2MSL timeout
                          │
                          ▼
                       CLOSED
Every path leads to TIME_WAIT, ensuring the 2MSL protection applies regardless of how termination occurred.
The CLOSING state typically lasts only one RTT—the time for the peer's ACK to arrive. If you observe many sockets in CLOSING state, it suggests the peer isn't sending ACKs (perhaps crashed or network issues). Unlike TIME_WAIT, CLOSING accumulation indicates a problem.
Connection termination issues manifest in various ways. Here's how to diagnose and resolve them.
# Count connections by state
ss -tan | awk 'NR>1 {print $1}' | sort | uniq -c | sort -rn
# Example output:
# 4523 TIME-WAIT
# 843 ESTABLISHED
# 127 FIN-WAIT-2
# 23 CLOSE-WAIT
# 12 FIN-WAIT-1
# Watch states in real-time
watch -n1 'ss -tan | awk "NR>1 {print \$1}" | sort | uniq -c | sort -rn'
# Find which processes have closing connections
ss -tanp state fin-wait-1
ss -tanp state fin-wait-2
Issue: Many FIN_WAIT_1 sockets
Symptom: FIN not being ACKed
Possible causes: the peer host is down or unreachable, a firewall is dropping the FIN or the returning ACK, or severe packet loss on the path.
Diagnosis:
# Are connections to specific host?
ss -tan state fin-wait-1 | awk '{print $4}' | sort | uniq -c   # $4 = peer address (the state filter drops the State column)
# Capture traffic to see if FIN is sent and ACK returns
tcpdump -i any 'tcp[tcpflags] & (tcp-fin|tcp-ack) != 0' -nn
Issue: Many FIN_WAIT_2 sockets
Symptom: Peer not sending FIN
Possible causes: the peer application never calls close() (a socket leak on the remote side), or the peer is still intentionally sending data before closing its half.
Diagnosis:
# How old are these connections?
ss -tan state fin-wait-2 -o
# Are they orphaned (no process attached)?
# Orphaned FIN_WAIT_2 will timeout after tcp_fin_timeout
lsof -i TCP | grep FIN_WAIT_2
Remedy:
# Lower the orphan timeout
sysctl -w net.ipv4.tcp_fin_timeout=30
| State | High Count Indicates | Resolution |
|---|---|---|
| FIN_WAIT_1 | FINs not being ACKed | Check network path, peer health |
| FIN_WAIT_2 | Peer not sending FIN | Lower tcp_fin_timeout; check peer |
| CLOSING | Simultaneous close, ACKs delayed | Usually brief; network issue if persisting |
| TIME_WAIT | Many closed connections (normal) | Usually not a problem; see remedies above |
| LAST_ACK | Our final FIN not being ACKed by the peer | Peer (the active closer) or network issue, not you |
#!/usr/bin/env python3
"""TCP connection state monitor with alerts
Run with elevated privileges for full visibility"""

import subprocess
import time
from collections import Counter

# Alert thresholds
THRESHOLDS = {
    'FIN-WAIT-1': 100,    # Should be brief
    'FIN-WAIT-2': 500,    # May accumulate if peer is slow
    'CLOSE-WAIT': 100,    # Our app not closing sockets
    'TIME-WAIT': 50000,   # Normal to be high
    'CLOSING': 50,        # Should be brief
    'LAST-ACK': 100,      # Peer not acknowledging our FIN
}

def get_connection_states():
    """Get count of connections by state"""
    result = subprocess.run(
        ['ss', '-tan'],
        capture_output=True, text=True
    )
    states = []
    for line in result.stdout.strip().splitlines()[1:]:  # Skip header
        parts = line.split()
        if parts:
            states.append(parts[0])
    return Counter(states)

def check_alerts(states):
    """Check if any state exceeds its threshold"""
    alerts = []
    for state, threshold in THRESHOLDS.items():
        count = states.get(state, 0)
        if count > threshold:
            alerts.append(f"⚠️  {state}: {count} (threshold: {threshold})")
    return alerts

def main():
    print("TCP Connection State Monitor")
    print("=" * 50)
    while True:
        states = get_connection_states()
        print("\033[2J\033[H")  # Clear screen, cursor to top-left
        print(f"Time: {time.strftime('%H:%M:%S')}")
        print("-" * 30)
        for state, count in sorted(states.items(), key=lambda x: -x[1]):
            indicator = ""
            if state in THRESHOLDS and count > THRESHOLDS[state]:
                indicator = " ⚠️"
            print(f"{state:15s}: {count:6d}{indicator}")
        alerts = check_alerts(states)
        if alerts:
            print("\n" + "=" * 30)
            print("ALERTS:")
            for alert in alerts:
                print(f"  {alert}")
        time.sleep(1)

if __name__ == "__main__":
    main()

When diagnosing termination issues, tcpdump reveals exactly what's happening: tcpdump -i any 'tcp[tcpflags] & (tcp-fin|tcp-rst) != 0' shows FIN and RST packets. You'll see who sends FIN, who ACKs, and how long between events. This removes guesswork and shows the actual problem.
We've explored the active closer's journey through connection termination—the states that handle graceful disconnection and protect the network from confusion.
We've examined termination from the active closer's perspective. But what about the other side—the endpoint that receives the first FIN? That endpoint enters CLOSE_WAIT and LAST_ACK states, and has its own set of characteristics and potential issues. In the final page, we'll complete the picture by examining these states and the CLOSED state that all connections ultimately reach.
You now understand the FIN_WAIT and TIME_WAIT states in depth—from the four-way handshake mechanics, through the crucial purpose of TIME_WAIT, to practical debugging of termination issues. This knowledge is essential for operating high-traffic servers and understanding why connections don't just "disappear" when you call close().