Description
Describe the bug
We have OpenShift cluster A with a single namespace, xyz, which acts as our main site: it issues the token that links it to multiple namespaces in OpenShift cluster B (roughly 40 namespaces).
When we initially deployed Skupper with HA in cluster A under namespace xyz, everything worked fine. At some point, the skupper-routers in the cluster B namespaces started restarting with the error `Failed to release lock: Operation cannot be fulfilled on leases.coordination.k8s.io "skupper-site-leader": the object has been modified; please apply your changes to the latest version and try again`.
Below are the logs from one of the cluster B namespaces, isd-br-dev:
```
2025/08/13 15:58:42 INFO updating network status info component=kube.flow.statusSync configmap=skupper-network-status
E0813 15:58:43.047087 1 leaderelection.go:436] error retrieving resource lock isd-br-dev/skupper-site-leader: Get "https://172.30.0.1:443/apis/coordination.k8s.io/v1/namespaces/isd-br-dev/leases/skupper-site-leader": context deadline exceeded
I0813 15:58:43.047126 1 leaderelection.go:297] failed to renew lease isd-br-dev/skupper-site-leader: context deadline exceeded
E0813 15:58:43.053052 1 leaderelection.go:322] Failed to release lock: Operation cannot be fulfilled on leases.coordination.k8s.io "skupper-site-leader": the object has been modified; please apply your changes to the latest version and try again
2025/08/13 15:58:43 COLLECTOR: Lost leader lock after 15.920086783s
```
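For context, the "the object has been modified" part of the error is the Kubernetes optimistic-concurrency check: an update (or, here, the release of the Lease) is rejected when it was built from a stale `resourceVersion`, i.e. another writer touched the Lease between the read and the write. A minimal Python sketch of that mechanism (toy class and names, not Skupper or client-go code):

```python
# Toy stand-in for the API server's optimistic-concurrency check on a Lease.
# Hypothetical names; illustrates the mechanism only, not Skupper internals.

class Conflict(Exception):
    """Models the 409 Conflict the API server returns on a stale write."""

class LeaseStore:
    def __init__(self):
        self.resource_version = 1
        self.holder = "pod-a"

    def update(self, expected_version, new_holder):
        # Reject writes based on a stale read, as the API server does.
        if expected_version != self.resource_version:
            raise Conflict(
                'Operation cannot be fulfilled on leases.coordination.k8s.io '
                '"skupper-site-leader": the object has been modified'
            )
        self.resource_version += 1
        self.holder = new_holder

store = LeaseStore()
stale = store.resource_version               # pod A reads the lease...
store.update(store.resource_version, "pod-b")  # ...but pod B writes it first
try:
    store.update(stale, "")                  # pod A's release uses the stale version
except Conflict as exc:
    print("conflict:", exc)
```

Given the preceding `context deadline exceeded` lines, the sequence in the logs is consistent with the renew call timing out, another replica taking the lease, and the old holder's release then failing this version check.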
Because of these frequent restarts, connections between our microservices keep failing.
How To Reproduce
Deploy Skupper with HA enabled in one namespace and link it to multiple namespaces in another cluster.
Expected behavior
Skupper sites should elect and retain an HA leader and route traffic accordingly, without the routers restarting.
Environment details
- Skupper CLI: 2.1.0
- Skupper Operator (if applicable): [e.g. 1.5.0, 1.4.3]
- Platform: OpenShift