TCP sockets are the backbone of the Internet's most critical applications. Every web request you make, every email you send, every file you download relies on TCP sockets providing reliable, ordered, bidirectional byte streams between applications.
From the perspective of a network programmer, TCP sockets offer a powerful abstraction: the illusion of a perfect communication channel. You write bytes on one end, and they appear—in order, without duplicates, and without corruption—on the other end. The socket and the TCP stack handle all the complexity of retransmissions, acknowledgments, flow control, and congestion management.
But this abstraction has nuances. Understanding TCP socket programming means understanding when data is actually sent, how connections are established and torn down, what happens during network failures, and how to build applications that are both correct and performant.
By the end of this page, you will master TCP socket programming: the complete connection lifecycle, socket states from a programming perspective, blocking and non-blocking operations, proper error handling, connection patterns for clients and servers, and production considerations for building robust TCP applications.
TCP (Transmission Control Protocol) sockets provide a stream-oriented, connection-based, reliable communication service. Let's understand exactly what each of these characteristics means for the programmer:
Stream-Oriented Communication:
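TCP provides a continuous byte stream, not discrete messages. When you write 100 bytes followed by 200 bytes, the receiver may see a single 300-byte read, the original 100-byte and 200-byte chunks, or any other split that preserves byte order.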
There are no message boundaries. If your application protocol requires messages, you must implement your own framing:
Common framing strategies:
1. Fixed-length messages: Every message is exactly N bytes
2. Length-prefix: First 4 bytes indicate message length
3. Delimiter-based: Messages end with \r\n or similar
4. Self-describing: Protocol specifies how to determine length (HTTP headers)
Never assume that one send() corresponds to one recv(). A common bug is calling recv() once and expecting to receive exactly what was sent. Always loop until you've received the expected amount of data or implement proper message framing.
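As an illustration of delimiter-based framing, here is a minimal sketch that scans an accumulation buffer for a complete "\r\n"-terminated message. The helper names next_message_len() and consume() are hypothetical, and the sketch assumes the application appends every recv() result to one buffer before scanning it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of the first complete message in
 * buf, including its trailing "\r\n", or 0 if no complete message has
 * arrived yet. */
size_t next_message_len(const char *buf, size_t len) {
    for (size_t i = 0; i + 1 < len; i++) {
        if (buf[i] == '\r' && buf[i + 1] == '\n') {
            return i + 2;
        }
    }
    return 0;
}

/* Hypothetical helper: drops the first n bytes of buf (n <= len) and
 * returns the number of bytes that remain. */
size_t consume(char *buf, size_t len, size_t n) {
    memmove(buf, buf + n, len - n);
    return len - n;
}
```

A receive loop would call recv() into the free tail of the buffer, then repeatedly call next_message_len() and consume() until it returns 0, leaving any partial message in place for the next read.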
Creating a TCP socket and establishing a connection involves several system calls that must be performed in the correct order. Let's examine each step in detail.
Step 1: Create the Socket
int sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0) {
    perror("socket creation failed");
    exit(EXIT_FAILURE);
}
The socket() call creates an endpoint but doesn't associate it with any address or establish any connection. Parameters:
AF_INET: IPv4 Internet domain (use AF_INET6 for IPv6)
SOCK_STREAM: Stream socket (TCP)
0: Let system choose protocol (TCP for SOCK_STREAM)
Step 2: Prepare the Address
struct sockaddr_in server_addr;
memset(&server_addr, 0, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_port = htons(8080); // Convert to network byte order!
// Convert IP string to binary
if (inet_pton(AF_INET, "192.168.1.100", &server_addr.sin_addr) <= 0) {
    perror("Invalid address");
    exit(EXIT_FAILURE);
}
Rather than manually filling sockaddr structures, use getaddrinfo() which handles DNS resolution, IPv4/IPv6 compatibility, and returns properly formatted addresses. It's the modern, protocol-agnostic approach to address handling.
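A minimal sketch of the getaddrinfo() approach follows. The function name tcp_connect() is hypothetical; it tries each returned address in order and keeps the first one that connects, which is one common policy rather than the only one.

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: resolve host/port and connect to the first address
 * that accepts the connection. Returns a connected socket, or -1. */
int tcp_connect(const char *host, const char *port) {
    struct addrinfo hints = {0};
    struct addrinfo *results, *ai;
    int fd = -1;

    hints.ai_family = AF_UNSPEC;     /* allow IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* TCP */

    if (getaddrinfo(host, port, &hints, &results) != 0) {
        return -1;
    }
    for (ai = results; ai != NULL; ai = ai->ai_next) {
        fd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
        if (fd < 0) continue;
        if (connect(fd, ai->ai_addr, ai->ai_addrlen) == 0) break;
        close(fd); /* this address failed, try the next one */
        fd = -1;
    }
    freeaddrinfo(results);
    return fd;
}
```

A caller might write tcp_connect("example.com", "80"); the port is passed as a string because getaddrinfo() also accepts service names.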
Step 3: Connect to the Server (Client Side)
if (connect(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
    perror("Connection failed");
    close(sockfd);
    exit(EXIT_FAILURE);
}
The connect() call initiates TCP's three-way handshake:
Client                        Server
|                                  |
|-------- SYN (seq=x) ------------>|
|                                  |
|<----- SYN-ACK (seq=y, ack=x+1) --|
|                                  |
|-------- ACK (ack=y+1) ---------->|
|                                  |
|      Connection Established      |
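connect() blocks until the handshake completes, the server refuses the connection, the attempt times out, or a signal interrupts the call.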
Important connect() behaviors:
| Scenario | Result | errno |
|---|---|---|
| Server accepts | Returns 0 (success) | — |
| Server not listening | Returns -1 | ECONNREFUSED |
| No route to host | Returns -1 | EHOSTUNREACH |
| Network unreachable | Returns -1 | ENETUNREACH |
| Connection timeout | Returns -1 | ETIMEDOUT |
| Interrupted by signal | Returns -1 | EINTR |
Setting up a TCP server involves three critical steps after socket creation: bind, listen, and accept. Each serves a distinct purpose.
Step 1: Bind—Assign a Local Address
struct sockaddr_in server_addr;
memset(&server_addr, 0, sizeof(server_addr));
server_addr.sin_family = AF_INET;
server_addr.sin_addr.s_addr = INADDR_ANY; // Accept on any interface
server_addr.sin_port = htons(8080);
// Essential: Allow address reuse
int opt = 1;
setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
if (bind(sockfd, (struct sockaddr *)&server_addr, sizeof(server_addr)) < 0) {
    perror("Bind failed");
    exit(EXIT_FAILURE);
}
bind() assigns the local address (interface) and port to the socket. For servers, you typically bind to INADDR_ANY to accept connections on all network interfaces.
Without SO_REUSEADDR, if your server exits and you try to restart it, bind() can fail with EADDRINUSE because connections from the previous run are still in TIME_WAIT on that port. SO_REUSEADDR allows binding to a port that has connections in TIME_WAIT. This is essential for server development and deployment.
Step 2: Listen—Mark Socket as Passive
if (listen(sockfd, SOMAXCONN) < 0) {
    perror("Listen failed");
    exit(EXIT_FAILURE);
}
listen() transforms the socket from an active socket (used for connecting) to a passive socket (used for accepting connections). The second argument is the backlog—the maximum length of the pending connections queue.
Understanding the Backlog:
The backlog queue holds connections that have completed TCP's handshake but haven't been accept()ed yet:
Connection States from Server Perspective:
┌───────────────────────────────────────────────────────────────┐
│ SYN Queue (Incomplete) │
│ Connections where we received SYN and sent SYN-ACK, awaiting ACK │
└───────────────────────────────────────────────────────────────┘
↓
┌───────────────────────────────────────────────────────────────┐
│ Accept Queue (Complete) │
│ Connections fully established, waiting for accept() call │
│ This is what the 'backlog' parameter controls │
└───────────────────────────────────────────────────────────────┘
↓
┌───────────────────────────────────────────────────────────────┐
│ accept() called │
│ New socket returned, connection removed from queue │
└───────────────────────────────────────────────────────────────┘
If the accept queue is full, new connection attempts may be dropped or clients may experience connection refused errors.
Step 3: Accept—Accept Incoming Connections
struct sockaddr_in client_addr;
socklen_t client_len = sizeof(client_addr);
int client_fd = accept(sockfd, (struct sockaddr *)&client_addr, &client_len);
if (client_fd < 0) {
    perror("Accept failed");
    continue; // In server loop, continue accepting
}
// Now have client_fd for this specific client
// Original sockfd still accepting new connections
char client_ip[INET_ADDRSTRLEN];
inet_ntop(AF_INET, &client_addr.sin_addr, client_ip, sizeof(client_ip));
printf("Connection from %s:%d\n", client_ip, ntohs(client_addr.sin_port));
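Critical points about accept(): it returns a new file descriptor dedicated to that one client, the listening socket stays open and keeps accepting further connections, and in blocking mode the call waits until a completed connection is available.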
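The bind, listen, and accept steps can be combined into a small sketch. The names create_listener() and accept_client() are hypothetical, and the choice to retry accept() on EINTR is an assumption about how the application wants to treat signals.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket listening on the given port on
 * all IPv4 interfaces. Port 0 asks the kernel to pick a free port.
 * Returns the listening socket, or -1 on failure. */
int create_listener(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int opt = 1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt)) < 0 ||
        bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, SOMAXCONN) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical helper: accept one client, retrying if a signal interrupts
 * the call. Returns the per-client socket, or -1 on failure. */
int accept_client(int listen_fd) {
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd >= 0 || errno != EINTR) return fd;
    }
}
```

A server loop would call create_listener() once and then accept_client() repeatedly, handing each returned descriptor to its own handler.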
Once a TCP connection is established, data transfer uses send() and recv() (or write() and read()). However, these operations have subtleties that every network programmer must understand.
Sending Data:
const char *message = "Hello, Server!";
size_t total_sent = 0;
size_t msg_len = strlen(message);
while (total_sent < msg_len) {
    // send() returns ssize_t: bytes accepted by the kernel, or -1 on error
    ssize_t bytes_sent = send(sockfd, message + total_sent, msg_len - total_sent, 0);
    if (bytes_sent < 0) {
        if (errno == EINTR) continue; // Interrupted, retry
        perror("Send failed");
        break;
    }
    if (bytes_sent == 0) {
        // No progress was made; stop rather than spin
        break;
    }
    total_sent += (size_t)bytes_sent;
}
Why loop on send()?
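send() may not send all data in one call if the socket send buffer has less free space than the message needs, the socket is non-blocking, or a signal interrupts the call after part of the data was copied.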
send() and write() are nearly identical for connected TCP sockets. send() allows flags (like MSG_NOSIGNAL, MSG_DONTWAIT), while write() doesn't. For simple cases, they're interchangeable. Use send() when you need flag control.
Receiving Data:
char buffer[4096];
size_t total_received = 0;
size_t expected_length = 100; // Application knows expected message size
while (total_received < expected_length) {
    // recv() returns ssize_t: bytes read, 0 on orderly close, or -1 on error
    ssize_t bytes_received = recv(sockfd, buffer + total_received,
                                  expected_length - total_received, 0);
    if (bytes_received < 0) {
        if (errno == EINTR) continue; // Retry on interrupt
        perror("Receive failed");
        break;
    }
    if (bytes_received == 0) {
        // Connection closed by peer
        printf("Connection closed by peer\n");
        break;
    }
    total_received += (size_t)bytes_received;
}
Understanding recv() Return Values:
| Return Value | Meaning | Action |
|---|---|---|
| > 0 | Bytes received | Process data, may need more |
| 0 | Connection gracefully closed | Clean up and close socket |
| -1 with EINTR | Interrupted by signal | Retry recv() |
| -1 with EAGAIN/EWOULDBLOCK | No data (non-blocking) | Wait and retry |
| -1 with ECONNRESET | Connection reset by peer | Handle error, close socket |
When recv() returns 0, the peer has performed an orderly shutdown—they called close() or shutdown(). This is NOT an error; it's how TCP signals end-of-stream. Your code must check for this and close your side of the connection.
Message Framing Example:
Since TCP is a byte stream, applications must implement their own message boundaries. Here's a common length-prefix pattern:
// send_all() and recv_all() are helpers (not shown) that loop until exactly
// n bytes are transferred and return n, or -1 on failure.

// Sending a length-prefixed message
int send_message(int sockfd, const char *msg, int msg_len) {
    if (msg_len < 0) return -1;
    // Send 4-byte length header (network byte order)
    uint32_t net_len = htonl((uint32_t)msg_len);
    if (send_all(sockfd, &net_len, 4) != 4) return -1;
    // Send message body
    return send_all(sockfd, msg, msg_len);
}
// Receiving a length-prefixed message
int recv_message(int sockfd, char *buffer, int max_len) {
    // Read 4-byte length header
    uint32_t net_len;
    if (recv_all(sockfd, &net_len, 4) != 4) return -1;
    // Keep the length unsigned so a huge value cannot wrap to a negative int
    uint32_t msg_len = ntohl(net_len);
    if (max_len < 0 || msg_len > (uint32_t)max_len) return -1; // Message too large
    // Read message body
    return recv_all(sockfd, buffer, msg_len);
}
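The framing functions above call send_all() and recv_all() without defining them. One possible sketch is shown here; the signatures and the convention of returning the full length on success and -1 on any failure (including the peer closing early) are assumptions chosen to match how the framing code checks the results.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Sends exactly len bytes. Returns len on success, -1 on error. */
ssize_t send_all(int sockfd, const void *data, size_t len) {
    const char *p = data;
    size_t sent = 0;
    while (sent < len) {
        ssize_t n = send(sockfd, p + sent, len - sent, 0);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted, retry */
            return -1;
        }
        sent += (size_t)n;
    }
    return (ssize_t)sent;
}

/* Receives exactly len bytes. Returns len on success, -1 on error or if
 * the peer closed the connection before len bytes arrived. */
ssize_t recv_all(int sockfd, void *data, size_t len) {
    char *p = data;
    size_t got = 0;
    while (got < len) {
        ssize_t n = recv(sockfd, p + got, len - got, 0);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted, retry */
            return -1;
        }
        if (n == 0) return -1; /* end-of-stream in the middle of a message */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Treating an early close as an error keeps the caller simple; a protocol that allows a clean close between messages would need to distinguish "zero bytes read" from "partial message read".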
Properly closing TCP connections is more nuanced than simply calling close(). Understanding the shutdown process is essential for avoiding data loss and socket resource leaks.
close() vs shutdown():
// close() — Close the file descriptor
close(sockfd); // Closes both directions, releases resources
// shutdown() — Granular connection shutdown
shutdown(sockfd, SHUT_RD); // Stop receiving
shutdown(sockfd, SHUT_WR); // Stop sending (sends FIN)
shutdown(sockfd, SHUT_RDWR); // Stop both directions
The Four-Way Handshake:
TCP connection termination involves a four-way handshake:
Client                         Server
|                                   |
|--- FIN (I'm done sending) ------->| Client: FIN_WAIT_1
|                                   |
|<-- ACK (I acknowledge) -----------| Client: FIN_WAIT_2
|                                   | Server: CLOSE_WAIT
|                                   |
|<-- FIN (I'm done sending) --------| Server: LAST_ACK
|                                   |
|--- ACK (I acknowledge) ---------->| Client: TIME_WAIT
|                                   | Server: CLOSED
|                                   |
|          [2*MSL timeout]          |
|                                   | Client: CLOSED
For graceful shutdown: (1) Call shutdown(SHUT_WR) to send FIN, (2) Continue reading until recv() returns 0, (3) Then call close(). This ensures you receive any data the peer sent before closing and properly complete the four-way handshake.
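The three-step graceful shutdown can be sketched as a single function. The name graceful_close() is hypothetical, and this version simply discards whatever the peer still sends; an application that needs that data would process it instead.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: half-close, drain until end-of-stream, then close.
 * Returns the number of bytes drained, or -1 on error. The socket is
 * closed in either case. */
long graceful_close(int sockfd) {
    char discard[1024];
    long drained = 0;

    if (shutdown(sockfd, SHUT_WR) < 0) { /* step 1: send FIN */
        close(sockfd);
        return -1;
    }
    for (;;) {                           /* step 2: read until recv() == 0 */
        ssize_t n = recv(sockfd, discard, sizeof(discard), 0);
        if (n > 0) {
            drained += n;
            continue;
        }
        if (n < 0 && errno == EINTR) continue;
        close(sockfd);                   /* step 3: release the descriptor */
        return n == 0 ? drained : -1;
    }
}
```

Note that the drain loop blocks for as long as the peer keeps its side open, so a production version would likely add a receive timeout.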
Half-Close:
TCP supports half-close—closing one direction while keeping the other open. This is useful for protocols where one side finishes sending but still needs to receive:
// Client sends all data, then signals done
send(sockfd, request_data, data_len, 0);
shutdown(sockfd, SHUT_WR); // No more writes, but can still read
// Continue reading server's response until the server closes its side
ssize_t n;
while ((n = recv(sockfd, buffer, sizeof(buffer), 0)) > 0) {
    process_response(buffer, n);
}
close(sockfd); // Now fully close
TIME_WAIT State:
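After an active close, the socket enters TIME_WAIT for 2×MSL (twice the Maximum Segment Lifetime; Linux uses a fixed 60 seconds in total). During this time the same pair of addresses and ports cannot be reused for a new connection by default, delayed segments from the old connection are discarded, and the final ACK can be retransmitted if the peer resends its FIN.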
The SO_LINGER Option:
SO_LINGER controls what happens when you close() with unsent data:
struct linger ling;
ling.l_onoff = 1; // Enable linger
ling.l_linger = 10; // Wait up to 10 seconds
setsockopt(sockfd, SOL_SOCKET, SO_LINGER, &ling, sizeof(ling));
| l_onoff | l_linger | Behavior |
|---|---|---|
| 0 | — | Default: close() returns immediately, data sent in background |
| 1 | 0 | Hard close: RST sent, data discarded, socket immediately reusable |
| 1 | >0 | close() blocks up to l_linger seconds until data sent |
Robust TCP applications must handle numerous error conditions and edge cases. Networks are unreliable, peers may crash, and systems may run out of resources.
Connection Reset (RST):
A connection reset occurs when the peer sends a RST packet, indicating the connection is invalid:
int n = recv(sockfd, buffer, sizeof(buffer), 0);
if (n < 0) {
    if (errno == ECONNRESET) {
        // Peer sent RST: connection abruptly terminated
        // Reasons: peer crashed, peer's socket no longer exists,
        // or peer explicitly reset connection
        close(sockfd);
        // Handle reconnection if appropriate
    }
}
SIGPIPE Signal:
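Writing to a connection that the peer has closed raises SIGPIPE, which by default terminates your process. With TCP, the first write after the peer closes usually succeeds locally and draws a RST; it is a later write that triggers the signal: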
// Solution 1: Ignore SIGPIPE globally
signal(SIGPIPE, SIG_IGN);
// Solution 2: Use MSG_NOSIGNAL flag
send(sockfd, data, len, MSG_NOSIGNAL);
// With SIGPIPE ignored, send() returns -1 with errno = EPIPE
If you don't handle SIGPIPE, your server process will die the first time a client disconnects unexpectedly. Every production TCP server must either ignore SIGPIPE or use MSG_NOSIGNAL. This is not optional.
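A minimal sketch of the MSG_NOSIGNAL approach follows. The wrapper name send_nosignal() is hypothetical, and MSG_NOSIGNAL is assumed to be available (it is on Linux; some other systems use a socket option instead).

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send without ever raising SIGPIPE. Returns the
 * number of bytes sent, or -1 with errno set (EPIPE if the peer is gone). */
ssize_t send_nosignal(int sockfd, const void *data, size_t len) {
    ssize_t n;
    do {
        n = send(sockfd, data, len, MSG_NOSIGNAL);
    } while (n < 0 && errno == EINTR); /* retry if interrupted */
    return n;
}
```

The caller can then treat EPIPE like any other connection error: close the socket and clean up the per-client state.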
| errno | Cause | Appropriate Response |
|---|---|---|
| ECONNREFUSED | No server listening on port | Retry with backoff, or report to user |
| ETIMEDOUT | Connection or operation timed out | Retry or abort |
| ECONNRESET | Connection reset by peer | Close socket, optionally reconnect |
| EPIPE | Write to closed connection | Close socket, log error |
| ENOTCONN | Socket not connected | Programming error—fix code |
| EADDRINUSE | Address already in use | Use SO_REUSEADDR or wait |
| EINTR | Interrupted by signal | Retry the operation |
| EAGAIN/EWOULDBLOCK | Non-blocking would block | Use poll/select/epoll and retry |
Detecting Dead Connections:
TCP doesn't automatically detect dead connections. If a peer crashes without sending FIN, you won't know until you try to send data (and the send eventually times out). Use TCP keepalives:
int enable = 1;
int idle = 60; // Start keepalive after 60 seconds idle
int interval = 10; // Send keepalive every 10 seconds
int count = 3; // Close after 3 failed keepalives
setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &enable, sizeof(enable));
setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle));
setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval));
setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
With these settings, a dead connection is detected within 60 + (10 × 3) = 90 seconds.
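The keepalive setup can be wrapped with error checking, as in this sketch. The function names are hypothetical, and TCP_KEEPIDLE, TCP_KEEPINTVL, and TCP_KEEPCNT are assumed to be available as on Linux; other platforms may name or scope these options differently.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper: enable keepalive probes on a TCP socket.
 * Returns 0 on success, -1 if any option could not be set. */
int enable_keepalive(int sockfd, int idle, int interval, int count) {
    int enable = 1;
    if (setsockopt(sockfd, SOL_SOCKET, SO_KEEPALIVE, &enable, sizeof(enable)) < 0 ||
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0 ||
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0 ||
        setsockopt(sockfd, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count)) < 0) {
        return -1;
    }
    return 0;
}

/* Approximate time to detect a dead peer: the idle period plus one
 * interval for each unanswered probe. */
int keepalive_detection_seconds(int idle, int interval, int count) {
    return idle + interval * count;
}
```

With the values from the text, keepalive_detection_seconds(60, 10, 3) evaluates to 90.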
By default, socket operations block—they don't return until the operation completes. This is simple but limits concurrency. Non-blocking mode enables building high-performance servers that handle thousands of connections.
Setting Non-Blocking Mode:
#include <fcntl.h>
// Method 1: fcntl
int flags = fcntl(sockfd, F_GETFL, 0);
fcntl(sockfd, F_SETFL, flags | O_NONBLOCK);
// Method 2: At socket creation (Linux)
int sockfd = socket(AF_INET, SOCK_STREAM | SOCK_NONBLOCK, 0);
Behavior Differences:
| Operation | Blocking Mode | Non-Blocking Mode |
|---|---|---|
| accept() | Waits for connection | Returns -1, errno=EAGAIN if none pending |
| connect() | Waits for handshake | Returns -1, errno=EINPROGRESS (check later) |
| recv() | Waits for data | Returns -1, errno=EAGAIN if no data |
| send() | Waits for buffer space | Partial write, or -1 with errno=EAGAIN if buffer full |
In non-blocking mode, connect() returns immediately with errno=EINPROGRESS. Use poll/select/epoll to wait for the socket to become writable. Then call getsockopt(SO_ERROR) to check if connection succeeded—a zero value means success.
Multiplexing with poll()/select()/epoll:
Non-blocking sockets are typically used with I/O multiplexing to handle multiple connections in a single thread:
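#include <poll.h>
// MAX_CLIENTS, set_nonblocking(), add_to_pollfd(), and handle_client()
// are application-defined and not shown here.
struct pollfd fds[MAX_CLIENTS];
int nfds = 0;
// Add listening socket
fds[nfds].fd = listen_fd;
fds[nfds].events = POLLIN;
nfds++;
while (1) {
    int ready = poll(fds, nfds, -1); // Wait indefinitely
    if (ready < 0) {
        if (errno == EINTR) continue; // Interrupted, retry
        perror("poll failed");
        break;
    }
    int current = nfds; // Entries added during this pass have no revents yet
    for (int i = 0; i < current; i++) {
        if (!(fds[i].revents & POLLIN)) continue;
        if (fds[i].fd == listen_fd) {
            // New connection
            int client_fd = accept(listen_fd, NULL, NULL);
            if (client_fd < 0) continue;
            set_nonblocking(client_fd);
            add_to_pollfd(fds, &nfds, client_fd);
        } else {
            // Data from existing client
            handle_client(fds[i].fd);
        }
    }
}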
Efficiency Comparison:
| Method | Scalability | Best For |
|---|---|---|
| select() | Limited (FD_SETSIZE, often 1024) | Simple, portable code |
| poll() | Good (no FD limit) | Moderate connections |
| epoll (Linux) | Excellent (O(1) for ready events) | High-performance servers |
| kqueue (BSD/macOS) | Excellent | High-performance on BSD systems |
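TCP socket programming provides reliable, ordered communication but requires careful attention to its stream nature and connection lifecycle. The key insights: TCP is a byte stream, so frame your own messages and loop on send() and recv(); clients call socket() and connect() while servers call socket(), bind(), listen(), and accept(); a recv() return of 0 signals an orderly close; use shutdown(SHUT_WR) followed by a drain for graceful termination; handle SIGPIPE, ECONNRESET, and EINTR explicitly; and pair non-blocking sockets with poll() or epoll to serve many connections from one thread.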
What's Next:
Having mastered TCP sockets, we'll explore UDP sockets—the connectionless alternative that trades reliability for speed and simplicity. UDP is essential for real-time applications like gaming, streaming, and DNS.
You now understand TCP socket programming at a professional level: connection establishment, data transfer patterns, shutdown procedures, error handling, and blocking modes. This knowledge prepares you to build reliable networked applications.