[bug] Etcd cluster status reported incorrectly. #742

Open
1 task done
TimJones opened this issue Nov 21, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@TimJones
Member

Is there an existing issue for this?

  • I have searched the existing issues

Current Behavior

In the Omni Cluster Overview window, the right-hand pane showing 'Control Plane' status reports Etcd as 'OK' even when etcd is unhealthy. On all three control plane nodes, `talosctl services` shows the etcd health check failing:

image

❯ talosctl -n ip-10-0-0-148 services    
NODE         SERVICE      STATE     HEALTH   LAST CHANGE     LAST EVENT
10.0.0.148   apid         Running   OK       2h36m52s ago    Health check successful
10.0.0.148   containerd   Running   OK       23h42m55s ago   Health check successful
10.0.0.148   cri          Running   OK       2h36m52s ago    Health check successful
10.0.0.148   dashboard    Running   ?        23h42m53s ago   Process Process(["/sbin/dashboard"]) started with PID 4094
10.0.0.148   etcd         Running   Fail     2h35m8s ago     Health check failed: context deadline exceeded
10.0.0.148   kubelet      Running   OK       2h36m46s ago    Health check successful
10.0.0.148   machined     Running   OK       23h42m55s ago   Health check successful
10.0.0.148   syslogd      Running   OK       23h42m54s ago   Health check successful
10.0.0.148   trustd       Running   OK       2h36m51s ago    Health check successful
10.0.0.148   udevd        Running   OK       23h42m55s ago   Health check successful

❯ talosctl -n ip-10-0-1-152 services           
NODE         SERVICE      STATE     HEALTH   LAST CHANGE   LAST EVENT
10.0.1.152   apid         Running   OK       5m24s ago     Health check successful
10.0.1.152   containerd   Running   OK       5m25s ago     Health check successful
10.0.1.152   cri          Running   OK       5m23s ago     Health check successful
10.0.1.152   dashboard    Running   ?        5m25s ago     Process Process(["/sbin/dashboard"]) started with PID 4103
10.0.1.152   etcd         Running   Fail     5m3s ago      Health check failed: context deadline exceeded
10.0.1.152   kubelet      Running   OK       5m21s ago     Health check successful
10.0.1.152   machined     Running   OK       5m25s ago     Health check successful
10.0.1.152   syslogd      Running   OK       5m25s ago     Health check successful
10.0.1.152   trustd       Running   OK       5m23s ago     Health check successful
10.0.1.152   udevd        Running   OK       5m25s ago     Health check successful

❯ talosctl -n ip-10-0-2-176 services  
NODE         SERVICE      STATE     HEALTH   LAST CHANGE     LAST EVENT
10.0.2.176   apid         Running   OK       2h37m17s ago    Health check successful
10.0.2.176   containerd   Running   OK       23h43m20s ago   Health check successful
10.0.2.176   cri          Running   OK       2h37m17s ago    Health check successful
10.0.2.176   dashboard    Running   ?        23h43m18s ago   Process Process(["/sbin/dashboard"]) started with PID 4093
10.0.2.176   etcd         Running   Fail     2h35m14s ago    Health check failed: context deadline exceeded
10.0.2.176   kubelet      Running   OK       2h37m11s ago    Health check successful
10.0.2.176   machined     Running   OK       23h43m20s ago   Health check successful
10.0.2.176   syslogd      Running   OK       23h43m19s ago   Health check successful
10.0.2.176   trustd       Running   OK       2h37m17s ago    Health check successful
10.0.2.176   udevd        Running   OK       23h43m19s ago   Health check successful

Expected Behavior

The Etcd status shown in Omni should reflect the actual etcd cluster status.
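
For illustration only, here is a minimal Python sketch of the kind of roll-up the expected behavior implies: read the etcd row from each node's `talosctl services` text output and report 'OK' only when every node reports 'OK'. This is not how Omni computes the status. The function names `etcd_health` and `aggregate`, the 'Unknown' fallback, and the "all nodes must be OK" rule are assumptions made for this example, and the parsing assumes the whitespace-separated column layout shown in the output above.

```python
from typing import Dict

# Trimmed sample in the same column layout as the `talosctl services` output above.
SAMPLE = """\
NODE         SERVICE      STATE     HEALTH   LAST CHANGE     LAST EVENT
10.0.0.148   apid         Running   OK       2h36m52s ago    Health check successful
10.0.0.148   etcd         Running   Fail     2h35m8s ago     Health check failed: context deadline exceeded
"""


def etcd_health(services_output: str) -> Dict[str, str]:
    """Map node address to the HEALTH column of its etcd row (hypothetical helper)."""
    result: Dict[str, str] = {}
    for line in services_output.splitlines():
        fields = line.split()
        # Data rows look like: NODE SERVICE STATE HEALTH ...
        # The header row is skipped because its second field is "SERVICE".
        if len(fields) >= 4 and fields[1] == "etcd":
            result[fields[0]] = fields[3]
    return result


def aggregate(per_node: Dict[str, str]) -> str:
    """Roll up per-node health; the all-must-be-OK rule is an assumption."""
    if not per_node:
        return "Unknown"
    return "OK" if all(h == "OK" for h in per_node.values()) else "Fail"


if __name__ == "__main__":
    print(aggregate(etcd_health(SAMPLE)))
```

With the sample above, the single etcd row has HEALTH 'Fail', so the roll-up is 'Fail' rather than 'OK'.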

Steps To Reproduce

  1. Register 3 nodes in Omni that can connect to each other
  2. Form them into a single cluster control plane
  3. Review etcd status

What browsers are you seeing the problem on?

Firefox

Anything else?

No response

TimJones added the bug (Something isn't working) label on Nov 21, 2024