Client - Server model: Server loses connection to clients after a few hours #306

nfonseca · 2024-03-25T09:47:25Z

Hi,

I have been using warp to test an S3-compatible object store, and after 3 to 4 hours the server always loses its connection to the clients, with the messages below:

warp: Running benchmark on all clients...
warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: Connecting to ws://172.28.5.169:7761/ws

The test runs on OpenShift with one server pod and 4 client replicas. I have assigned ample resources to the server and client pods (unlimited CPU and a 16GB memory limit), but the connection is always lost.
This only seems to happen with small blocks (10KB).
I tried multiple combinations, with different numbers of warp client pods and increased CPU and memory resources, but the end result is always the same.
Is there any workaround for this issue?
When I connect to the client pods, I can see that the service is no longer listening:

root@warp-client-0:/# cat /mnt/data/warp.log
warp: Listening on :7761
warp: Accepting connection from server: eE0Ut7hqJzs8Udy1WVfy
warp: Executing put benchmark.
warp: Starting stage prepare in 999.721676ms
warp: prepare done...
warp: Waiting
warp: Starting stage benchmark in 3.000068548s
warp: Starting
root@warp-client-0:/# curl localhost:7761
curl: (7) Failed to connect to localhost port 7761 after 0 ms: Connection refused

I have to restart the client to recover the connection:

root@warp-client-0:/# nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log
nohup: ignoring input and redirecting stderr to stdout



^Z
[1]+  Stopped                 nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log
root@warp-client-0:/# bg
[1]+ nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log &
root@warp-client-0:/# curl localhost:7761
404 page not found
root@warp-client-0:/#

Is there a way to configure a keep-alive setting on the client side, or to run a watchdog?
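
For reference, the kind of watchdog I have in mind is roughly the following. This is only a minimal sketch and not a warp feature: the function names, the `RUN_WATCHDOG` switch, and the 10-second interval are hypothetical, and it assumes the client process has already exited when the port stops answering (as the `curl` output above suggests). It checks whether port 7761 accepts connections and, if not, starts `warp client` again with the same binary and log path as in my manual restart.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch for a warp client pod (not part of warp itself).
# Assumption: when the port stops answering, the old client process is gone,
# so starting a new one does not conflict with it.

WARP_BIN="${WARP_BIN:-/usr/bin/warp}"        # binary path taken from the transcript above
WARP_LOG="${WARP_LOG:-/mnt/data/warp.log}"   # log path taken from the transcript above
WARP_PORT="${WARP_PORT:-7761}"               # port the client listens on
CHECK_INTERVAL="${CHECK_INTERVAL:-10}"       # seconds between checks (arbitrary choice)

# Return 0 if something accepts TCP connections on localhost:$1.
# Uses bash's /dev/tcp pseudo-device, so no extra tools are needed.
port_is_open() {
  (exec 3<>"/dev/tcp/127.0.0.1/$1") 2>/dev/null
}

# Start the client in the background, appending stdout and stderr to the log.
start_client() {
  nohup "$WARP_BIN" client >>"$WARP_LOG" 2>&1 &
}

# Poll the port forever and restart the client whenever it is not listening.
watchdog_loop() {
  while true; do
    if ! port_is_open "$WARP_PORT"; then
      echo "$(date -u +%FT%TZ) watchdog: port $WARP_PORT closed, restarting client" >>"$WARP_LOG"
      start_client
      sleep 2   # give the client a moment to bind the port
    fi
    sleep "$CHECK_INTERVAL"
  done
}

# Only loop when explicitly asked, so the file can be sourced safely.
if [ "${RUN_WATCHDOG:-0}" = "1" ]; then
  watchdog_loop
fi
```

It would be started with something like `RUN_WATCHDOG=1 bash warp-watchdog.sh &` (the file name is made up). This only brings the listener back; a benchmark that was already running when the client went away would still be lost.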

Thanks

klauspost self-assigned this Mar 25, 2024
