Writing a TCP server is a ritual. You perform the same steps every time, in the same order. It is akin to a pilot's pre-flight checklist. You cannot skip a step, and you must check every return value.
The lifecycle follows a clear progression: socket, bind, listen, and finally accept. Let us walk through what happens at the kernel level during each phase.
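The whole ritual can be sketched as a single helper. This is a minimal sketch, not a production server: `make_listener` is a name chosen for this illustration, it binds to all interfaces, and error handling is reduced to returning -1.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: create a passive IPv4 socket bound to the given port.
 * Passing port 0 asks the kernel for an ephemeral port.
 * Returns the listening fd, or -1 on failure. */
int make_listener(unsigned short port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* 1. create the endpoint */
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);   /* any local interface */
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {  /* 2. attach address */
        close(fd);
        return -1;
    }
    if (listen(fd, 128) < 0) {                  /* 3. switch to passive mode */
        close(fd);
        return -1;
    }
    return fd;  /* 4. accept() happens later, in the server loop */
}
```

Every call is checked, in keeping with the checklist discipline above.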
We have already discussed socket() and bind(). socket() creates the endpoint, and bind() attaches it to an address. One tricky aspect of binding is the SO_REUSEADDR option.
When you stop your server, recently closed connections linger in the TIME_WAIT state for a while, holding the local port. If you try to restart your server immediately, bind() will fail with EADDRINUSE because the address is technically still in use. By setting SO_REUSEADDR, you are telling the kernel "I know what I am doing; let me bind to this port even if it is pending closure." For development, this is mandatory. Without it, you will spend half your day waiting for timeouts.
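The incantation itself is short. A sketch (the function name is ours; the key detail is that it must run after socket() and before bind()):

```c
#include <sys/socket.h>

/* Sketch: opt in to address reuse so bind() succeeds even while
 * a previous incarnation's connections sit in TIME_WAIT.
 * Must be called after socket() and before bind().
 * Returns 0 on success, -1 on failure. */
int enable_reuseaddr(int fd)
{
    int yes = 1;
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);
}
```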
The listen() call is interesting. It transforms the socket from an active socket (one that can connect to things) into a passive socket (one that waits for connections).
It also takes a parameter called backlog.
This backlog bounds the queue of pending connections. When a client sends a SYN, the kernel handles the handshake automatically. It puts the connection into a "half-open" state (on Linux, a separate SYN queue). Once the handshake is complete, the connection moves to a fully connected accept queue, waiting for your application to pick it up.
So, what should the backlog value be? In the old days, we used 5. Today, on high-traffic servers, we might use 1024 or higher. If this queue fills up, new clients will receive a connection reset (RST) or, more commonly on Linux, have their SYNs silently dropped. We want to avoid that.
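A common dodge is to not pick a number at all. A sketch, assuming a socket that has already been bound: pass SOMAXCONN and let the platform decide. Note that on Linux the value you request is silently capped by /proc/sys/net/core/somaxconn anyway.

```c
#include <sys/socket.h>

/* Sketch: put a bound socket into the listening state using the
 * platform's "reasonable maximum" backlog constant, SOMAXCONN.
 * Returns 0 on success, -1 on failure. */
int start_listening(int fd)
{
    return listen(fd, SOMAXCONN);
}
```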
accept() is where your application finally meets the user. It pulls the first connection off the queue and gives you a brand new file descriptor.
This is a critical distinction. The listening socket remains open and unchanged. It continues to listen. The new file descriptor represents the specific connection to that one client.
If you are writing a simple iterative server, you might handle the client request right then and there. You read, you write, you close. But while you are doing that, nobody is calling accept(). The queue fills up. Other clients wait. This is blocking.
By default, sockets are blocking. If you call accept() and there are no pending connections, your thread goes to sleep. It waits until someone connects.
If you call read() and there is no data, your thread sleeps.
This is fine for simple tools, but it is death for a high performance server. Imagine your server is handling Client A. Client A decides to take a nap and stops sending data. Your server calls read() and gets blocked. Meanwhile, Client B connects. But your server is asleep, waiting for A. Client B gets no service.
To solve this, we set our sockets to non-blocking mode using fcntl() with O_NONBLOCK. Now, if we call read() and there is no data, the function returns immediately with errno set to EAGAIN (or EWOULDBLOCK). It says "not right now, try again later."
This frees us. We can move on to check other connections. But now we have a new problem. If we just loop through all connections checking for data, we burn 100% of the CPU doing nothing. We need a way to sleep until something interesting happens on any connection.
This leads us to I/O Multiplexing.