2021/03/15 16:14:33 [ERR] memberlist: Failed to send ping: write udp 192.168.2.17:7946->192.168.2.18:7946: sendto: network is unreachable
2021/03/15 16:14:34 [ERR] memberlist: Push/Pull with agent-three failed: dial tcp 192.168.2.18:7946: connect: network is unreachable
I'm testing out serf and it seems like a great project. I've hit one issue in my testing so far.
I set up a 3-node serf cluster on 3 VMs. All nodes report alive on all nodes, as expected. Tags update. All seems healthy.
Then I disconnected the VM's network adaptor on agent-two. Agents one and three report agent two is failed as I'd expect:
However, agent two only reports agent-one as having failed where I'd have expected it to report both one and three as failed:
In the monitor logs on agent two I can see:
Which suggests to me that agent-two knows it can't communicate with agent-three, so I'm wondering why it reports agent-three as alive rather than failed.
I believe this is a bug in the sense that agent-two falsely believes (or reports) it can communicate with at least one other node when in fact it is entirely isolated.
When I reconnect the network adaptor, after a few seconds all nodes report they are all alive again.
FWIW, if I disconnect the network adaptors on agents one and three, and then check agent two, agent two correctly reports one and three are failed.
My config is:
During this time, the snapshot file on agent-two looks like this:
Does anyone have any suggestions?
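For anyone trying to reproduce this, the inconsistency can be flagged mechanically. The snippet below is only a rough sketch: it assumes the member listing is whitespace-separated text with name, address and status as the first three columns, and the sample text is made up to match the description above rather than captured output (agent-one's address in particular is a placeholder, since it does not appear in the logs). The helper names are hypothetical.

```python
from typing import Dict, List, Set


def parse_members(output: str) -> Dict[str, str]:
    """Map node name -> status from member-listing text.

    Assumption: each line has at least three whitespace-separated
    columns (name, address, status); anything after that is ignored.
    """
    members: Dict[str, str] = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _addr, status = fields[:3]
        members[name] = status
    return members


def unexpected_alive(members: Dict[str, str], local_name: str,
                     unreachable: Set[str]) -> List[str]:
    """Return peers still marked alive although known to be unreachable."""
    return sorted(
        name
        for name, status in members.items()
        if name != local_name and name in unreachable and status == "alive"
    )


# Illustrative text only, reconstructed from the description above.
# 192.168.2.16 for agent-one is a placeholder, not taken from the logs.
SAMPLE = """\
agent-one    192.168.2.16:7946  failed
agent-two    192.168.2.17:7946  alive
agent-three  192.168.2.18:7946  alive
"""

if __name__ == "__main__":
    view = parse_members(SAMPLE)
    # agent-two is isolated, so both peers are unreachable from it.
    print(unexpected_alive(view, "agent-two", {"agent-one", "agent-three"}))
```

With agent-two isolated, any name this returns is a peer the local agent still considers alive even though no traffic can reach it.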
Platform details
Ubuntu Focal 20.04.1 LTS VMs on VMware Fusion 12.1 Pro on macOS Big Sur 11.2.1 using NAT networking.