Currently tokio-postgres exposes two knobs for keeping connections healthy: a connect timeout and keepalive settings that apply directly to the TCP socket. These cover connection establishment and idle connections, but not an active, established socket that goes a long time without hearing a response from its peer. By default it can take 15-20 minutes for a connection to be killed under these circumstances (15 retries with exponential backoff; the number of retries is controlled by `tcp_retries2`).

The generally recommended solution to this problem is to set `TCP_USER_TIMEOUT` to cap the total amount of time a socket waits to receive a response after it is established. https://blog.cloudflare.com/when-tcp-sockets-refuse-to-die/ has a great writeup of this case under "Busy ESTAB socket is not forever".

I haven't found a super satisfying way of testing this yet, but staging it here for now.
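
For reference, the 15-20 minute figure can be roughly reproduced with a small sketch of the backoff arithmetic. It assumes the Linux defaults of a 200 ms minimum and 120 s maximum retransmission timeout, and that the timeout doubles after every unacknowledged send. `retransmit_timeout` is a hypothetical helper for illustration only and is not part of tokio-postgres. The real timeout depends on the measured round-trip time, so treat the result as a lower bound.

```rust
use std::time::Duration;

/// Hypothetical helper: approximate lower bound on how long an established
/// connection keeps retransmitting unacknowledged data before it is killed.
/// Assumes the retransmission timeout (RTO) starts at `rto_min` and doubles
/// after each attempt until it reaches `rto_max`.
fn retransmit_timeout(retries: u32, rto_min: Duration, rto_max: Duration) -> Duration {
    let mut total = Duration::ZERO;
    let mut rto = rto_min;
    // The original transmission plus each of the `retries` retransmissions
    // waits one RTO before the next step, hence `retries + 1` intervals.
    for _ in 0..=retries {
        total += rto;
        rto = (rto * 2).min(rto_max);
    }
    total
}

fn main() {
    // Assumed Linux defaults: tcp_retries2 = 15, RTO between 200 ms and 120 s.
    let total = retransmit_timeout(15, Duration::from_millis(200), Duration::from_secs(120));
    let secs = total.as_secs_f64();
    println!("{:.1} s (~{:.1} min)", secs, secs / 60.0); // prints "924.6 s (~15.4 min)"
}
```

With `TCP_USER_TIMEOUT` set, this retry-based bound no longer governs how long unacknowledged data may stay outstanding; the configured timeout does.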