Availability when killing all pods in a cluster #228

jonatanwulcan · 2019-09-11T13:25:18Z

Hey,
I'm playing around with kubemci to figure out if it's a good match for the product I'm currently working on. I tried the zone-printer demo and then manually deleted the pod that was running in the cluster closest to me.

The result was that the service went down until the pod had restarted. Is this expected behaviour? I was hoping that the traffic would fail over to another cluster.

nikhiljindal · 2019-09-11T17:33:32Z

Yes, traffic should fail over to another cluster. Maybe the pod was restarted before GCLB detected that it was down?

Can you try changing the health check configuration so that it detects failures faster?
You cannot use kubemci to modify it, but you can use gcloud or the Google Cloud Console directly to update the health check created by kubemci.
#135 has some relevant discussion about this.
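
For anyone who wants to script the gcloud route, here is a minimal sketch that only builds the command line and prints it, so nothing is changed until you run the printed command yourself. The health check name `my-mci-health-check` is a made-up placeholder, and the `http` protocol is an assumption; look up the real name and protocol of the health check kubemci created in your project first. The example values are the ones reported later in this thread.

```python
import shlex


def build_update_command(name, interval_s, timeout_s, unhealthy, healthy, protocol="http"):
    """Build (but do not run) a gcloud command that updates a health check.

    `name` and `protocol` are assumptions for illustration; use the values
    of the health check that kubemci actually created in your project.
    """
    return [
        "gcloud", "compute", "health-checks", "update", protocol, name,
        f"--check-interval={interval_s}s",
        f"--timeout={timeout_s}s",
        f"--unhealthy-threshold={unhealthy}",
        f"--healthy-threshold={healthy}",
    ]


# Hypothetical name; 5s interval, 5s timeout, fail/succeed on 1 probe.
cmd = build_update_command("my-mci-health-check", 5, 5, 1, 1)
print(" ".join(shlex.quote(arg) for arg in cmd))
```

Printing the command instead of executing it keeps the sketch safe to run anywhere; paste the output into a shell once the name and protocol are confirmed.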

Many customers run multiple replicas in their cluster to mitigate this issue. Setting up Cluster autoscaling and Pod autoscaling will help as well.

jonatanwulcan · 2019-09-11T18:01:50Z

Thanks for your reply, Nikhil. I'll look into updating the health check configuration and report back on whether this solves the problem.

How fast can I expect failover to happen when a cluster goes down?

Also, I was wondering about cluster autoscaling and kubemci. Since you're recommending it, I suppose it's supported. How fast will GCLB discover new nodes added to the cluster by the autoscaler?

jonatanwulcan · 2019-09-11T20:58:43Z

I tried changing the health check configuration. I set it to a 5s check interval and a 5s timeout, with an unhealthy threshold of 1 consecutive failure and a healthy threshold of 1 consecutive success.
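
As a rough way to reason about these numbers, the sketch below estimates an upper bound on how long it takes for a backend to be marked unhealthy. It assumes a simplified model: one probe starts every interval, each failing probe may take up to the full timeout, and the pod dies right after a successful probe. It does not account for how the load balancer actually schedules its probes or how long it takes to shift traffic afterwards, so treat the result as a ballpark figure only. The function name is made up for this example.

```python
def worst_case_detection_seconds(interval_s, timeout_s, unhealthy_threshold):
    """Rough upper bound on the time to mark a backend unhealthy.

    Simplified model (an assumption, not documented GCLB behaviour):
    the backend fails just after a successful probe, so the N-th failing
    probe starts N intervals later and may take up to `timeout_s` to fail.
    """
    if min(interval_s, timeout_s, unhealthy_threshold) <= 0:
        raise ValueError("all parameters must be positive")
    return unhealthy_threshold * interval_s + timeout_s


# Settings from this comment: 5s interval, 5s timeout, fail on 1 probe.
estimate = worst_case_detection_seconds(5, 5, 1)
print(estimate)  # prints 10
```

Under this model, raising the unhealthy threshold or the interval lengthens the window in which requests can still be routed to the dead pod, which is why the lower values helped here.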

For others reading this: you can find the health check configuration in the Google Cloud Console under Compute Engine -> Health Checks.

Works just as expected now! Thanks for the help!

jonatanwulcan closed this as completed Sep 16, 2019
