Client - Server model: Server loses connection to clients after a few hours #306

Open
nfonseca opened this issue Mar 25, 2024 · 0 comments
@nfonseca

Hi,

I have been using warp to test an S3-compatible object store, and after 3 to 4 hours the server always loses its connection to the clients, with the messages below:

warp: Running benchmark on all clients...
warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> Connection failed:dial tcp 172.28.1.147:7761: connect: connection refused, retrying...


warp: Connecting to ws://172.28.1.147:7761/ws
warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: <ERROR> websocket: close 1006 (abnormal closure): unexpected EOF

warp: Connecting to ws://172.28.5.169:7761/ws

The test runs on OpenShift with one server pod and 4 client replicas. I have assigned generous resources to the server and client pods (unlimited CPU and a 16GB memory limit), but the connection is always lost.
This only seems to happen with small blocks (10KB).
I tried multiple combinations, with different numbers of warp client pods and increased CPU and memory resources, but the end result is always the same.
Is there any workaround for this issue?
When I connect to the client pods, I can see that the service is no longer listening:

root@warp-client-0:/# cat /mnt/data/warp.log
warp: Listening on :7761
warp: Accepting connection from server: eE0Ut7hqJzs8Udy1WVfy
warp: Executing put benchmark.
warp: Starting stage prepare in 999.721676ms
warp: prepare done...
warp: Waiting
warp: Starting stage benchmark in 3.000068548s
warp: Starting
root@warp-client-0:/# curl localhost:7761
curl: (7) Failed to connect to localhost port 7761 after 0 ms: Connection refused

I have to restart the client manually to recover the connection:

root@warp-client-0:/# nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log
nohup: ignoring input and redirecting stderr to stdout



^Z
[1]+  Stopped                 nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log
root@warp-client-0:/# bg
[1]+ nohup /usr/bin/warp client 2>&1 > /mnt/data/warp.log &
root@warp-client-0:/# curl localhost:7761
404 page not found
root@warp-client-0:/#

Is there a way to configure a keep-alive setting on the client side, or a watchdog?
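
To make the watchdog idea concrete, here is a minimal sketch of a wrapper loop that re-runs a command every time it exits. It assumes the `warp client` process actually exits when it stops listening, which the logs above do not confirm. The function name `run_with_restart` and its parameters are hypothetical and not part of warp.

```shell
#!/bin/sh
# Hypothetical watchdog helper (not part of warp): re-run a command each
# time it exits.
# Usage: run_with_restart MAX_RUNS DELAY_SECONDS COMMAND [ARGS...]
#   MAX_RUNS = 0 means "restart forever".
run_with_restart() {
    max_runs=$1
    delay=$2
    shift 2
    count=0
    while :; do
        # Run the supervised command in the foreground and record its status.
        "$@"
        status=$?
        count=$((count + 1))
        echo "watchdog: command exited with status $status (run $count)" >&2
        # Stop once the run limit is reached, returning the last exit status.
        if [ "$max_runs" -gt 0 ] && [ "$count" -ge "$max_runs" ]; then
            return "$status"
        fi
        sleep "$delay"
    done
}

# Example invocation (assumed paths taken from the session above):
#   run_with_restart 0 5 /usr/bin/warp client >> /mnt/data/warp.log 2>&1
```

A loop like this only helps if the process terminates. If the client hangs without exiting, an external check against port 7761 would be needed instead.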

Thanks

klauspost self-assigned this Mar 25, 2024