Problem statement
Upon shutdown, we stop listening for new requests while allowing existing requests to finish. The first part, "stop listening for new requests", does not play well with the way Kubernetes handles rolling upgrades.
During a Kubernetes Deployment rolling upgrade, Kubernetes performs two actions concurrently:
Initiating removal of the Pod from the load balancer.
Sending SIGTERM to the container.
Problem: removal of the Pod from the load balancer can take an arbitrary amount of time. Until it's done, we should continue to listen for new requests. Otherwise, requests may still be routed to the (terminating) Pod, resulting in HTTP errors.
See also: Delaying Shutdown to Wait for Pod Deletion Propagation

Existing solutions
A common existing solution is to introduce a pre-stop hook that sleeps for a short amount of time. The termination flow then looks like this:
Concurrently:
Initiate removal of the Pod from the load balancer.
Sequentially:
Run the pre-stop hook and wait until it finishes.
Send SIGTERM to the container.
The hope is that the pre-stop hook sleeps long enough for the removal from the load balancer to finish.
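To make the trade-off concrete, here is a minimal Go sketch of this fixed-delay approach (the 15-second delay, the port, and the shutdown timeout are illustrative assumptions, not values from the project):

```go
package main

import (
	"context"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":8080"}
	go func() { _ = srv.ListenAndServe() }()

	// Wait for Kubernetes to deliver SIGTERM.
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGTERM)
	<-sigs

	// Fixed-delay equivalent of the sleep-based pre-stop hook: keep serving
	// for a guessed duration, hoping the Pod has been removed from the load
	// balancer by then. Too short causes errors; too long wastes resources.
	time.Sleep(15 * time.Second)

	// Then stop listening and drain in-flight requests.
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	_ = srv.Shutdown(ctx)
}
```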
Problems with this approach:
It's not guaranteed that by the time the pre-stop hook finishes, the Pod has actually been removed from the load balancer (the sleep is too short).
Conversely, the Pod may be removed from the load balancer well before the pre-stop hook finishes (the sleep is too long), which wastes cluster resources.
Proposed solution
Enterprise-only feature.
Modify the shutdown behavior as follows. There are two parts that run sequentially:
Part 1 (optional): Wait until Pod actually removed from load balancer
Keep serving requests as normal until we're actually removed from the load balancer.
This can probably be checked by querying the Kubernetes API and waiting until the corresponding EndpointSlice is either gone, or at least no longer references our Pod. Needs more investigation.
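For illustration, a rough Go sketch of such a check using client-go, polling instead of watching (the function name, the one-second poll interval, and the use of in-cluster config are all assumptions):

```go
package main

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// waitForLBRemoval blocks until no EndpointSlice of the given Service
// references this Pod anymore. Sketch only: a real implementation would
// likely use a watch rather than polling.
func waitForLBRemoval(ctx context.Context, namespace, serviceName, podName string) error {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return err
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		return err
	}
	for {
		slices, err := client.DiscoveryV1().EndpointSlices(namespace).List(ctx, metav1.ListOptions{
			// EndpointSlices carry a label naming the Service that owns them.
			LabelSelector: "kubernetes.io/service-name=" + serviceName,
		})
		if err != nil {
			return err
		}
		referenced := false
		for _, slice := range slices.Items {
			for _, ep := range slice.Endpoints {
				if ep.TargetRef != nil && ep.TargetRef.Name == podName {
					referenced = true
				}
			}
		}
		if !referenced {
			return nil // slices are gone, or none mention our Pod anymore
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(time.Second):
		}
	}
}
```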
Part 2: Incoming-request-based shutdown delay
Even when we're removed from the load balancer, there may still be in-flight requests from two places:
The kernel socket backlog may still have new connections that we haven't accept()ed yet.
Already accept()ed sockets may still have unread requests due to HTTP pipelining.
To deal with the kernel socket backlog: don't immediately stop accepting new socket connections. Instead, only do so after a configurable amount of time has passed during which no requests have arrived (on either new or existing sockets).
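As a sketch of what this could look like in Go (the idleShutdown type and its Trigger method are hypothetical names, not an existing API):

```go
package main

import (
	"context"
	"net/http"
	"sync/atomic"
	"time"
)

// idleShutdown wraps a handler so that, once Trigger is called (e.g. after
// Part 1 has confirmed removal from the load balancer), the server keeps
// accepting connections until no request has arrived for the configured
// quiet period, and only then stops listening and drains.
type idleShutdown struct {
	srv      *http.Server
	next     http.Handler
	quiet    time.Duration // the configurable value from the proposal
	timer    *time.Timer
	draining atomic.Bool
}

func (s *idleShutdown) Trigger() {
	s.timer = time.AfterFunc(s.quiet, func() {
		// No request for a full quiet period: the socket backlog is
		// presumably drained, so stop accepting and finish in-flight work.
		_ = s.srv.Shutdown(context.Background())
	})
	s.draining.Store(true) // publish the timer before readers can observe it
}

func (s *idleShutdown) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	if s.draining.Load() {
		s.timer.Reset(s.quiet) // a request arrived: postpone the listener close
	}
	s.next.ServeHTTP(w, r)
}
```

Resetting the timer on every request, whether it came in on a freshly accepted connection or an existing one, covers both sources of in-flight work listed above.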
We don't need to worry about unread pipelined requests. According to the HTTP/1.1 spec, "Clients MUST also be prepared to resend their requests if the server closes the connection before sending all of the corresponding responses".