
Deploying your own VPA with leader election enabled in GKE conflicts with the GKE system component #7461

Open
raywainman opened this issue Nov 4, 2024 · 17 comments
Labels
area/vertical-pod-autoscaler kind/bug Categorizes issue or PR as related to a bug.

Comments

@raywainman
Contributor

raywainman commented Nov 4, 2024

Which component are you using?:

vertical-pod-autoscaler

What version of the component are you using?:

Component version: vertical-pod-autoscaler-1.2.0+

Leader election functionality was added in #6985 and is turned off by default

What k8s version are you using (kubectl version)?:

Any version 1.27+

What environment is this in?:

GKE

What did you expect to happen?:

I expected both the self-deployed VPA recommender and the GKE implementation of HPA to continue working.

What happened instead?:

Both the self-deployed VPA recommender and the GKE version use a lease called vpa-recommender in kube-system.

If you deploy your own VPA recommender, it might "steal" the lease and prevent the GKE implementation of HPA from working.

How to reproduce it (as minimally and precisely as possible):

  • Create a cluster
  • Deploy your own vpa-recommender (e.g. using vpa-up.sh). Make sure leader election is enabled (--leader-elect=true).
  • Do something to disrupt the control plane (for example upgrade the version).
  • Check whether your self-deployed vpa-recommender has grabbed the lease (a quick check is sketched below this list); if it has, HPA could stop working.
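
A minimal sketch of that last check, assuming kubectl is pointed at the affected cluster (the lease name and namespace are the defaults shown in the --help output later in this thread):

  # Print the current holder of the shared lease; if it is one of your own pods
  # rather than the GKE system component, you have hit this collision.
  % kubectl get lease vpa-recommender -n kube-system -o jsonpath='{.spec.holderIdentity}{"\n"}'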

Anything else?

This is due to the unfortunate naming collision between GKE's system controller (also called vpa-recommender) and the one provided here.

@raywainman added the kind/bug label Nov 4, 2024
@raywainman
Contributor Author

cc @adrianmoisey

Some thoughts on mitigating this:

  • Document the flags very carefully to encourage folks not to use the default when deploying on GKE.
  • Change the default here, which could possibly break some users.

@adrianmoisey
Member

I see two paths forward to fixing this:

  1. GKE stops using vpa-recommender as a lease name
  2. VPA stops using vpa-recommender as a lease name

I can't speak to what that lease is being used for in GKE, but I assume that changing it on the GKE side is difficult or impossible.

Given that the lease(s) in VPA are only used for VPA components, and running multiple recommenders and updaters for a brief period isn't the worst thing in the world, my vote is that we change the default lease name in the VPA.

Any VPA configured with the lease enabled will only be running multiple pods for a short period of time, which should be fine.

It's obviously not an amazing path forward, but may be worth doing.

I'm curious what @voelzmo and @kwiesmueller think, as they may be the ones approving that controversial PR.

@adrianmoisey
Member

adrianmoisey commented Nov 5, 2024

my vote is that we change the default lease name in the VPA.

If we do go down this path, I suggest we also make PRs into third-party Helm charts to ensure they support the new default name. Some of them hardcode the lease name:

@kwiesmueller
Member

I'm not sure if changing the default in OSS is the right approach because GKE claims it too.
Having documentation (GKE should maybe have a docs page for that too) seems like it should be enough.
Is there anything preventing GKE users from deploying a custom VPA with a different lease name?

cc. @raywainman

@adrianmoisey
Member

I'm not sure if changing the default in OSS is the right approach because GKE claims it too.

My concern is that if someone follows the happy path of deployment and enables the lease without reading the documentation, GKE breaks in ways that are not obvious.

It took us four days and countless messages with GCP support before we found out what GCP was doing with the vpa-recommender lease (namely, that the GKE HPA uses it).
It's also not possible (from what I understand) for a user to self-troubleshoot this problem (i.e. GKE users can't see the logs of the HPA controller).

The pain of renaming the default OSS lease is pretty small.

@adrianmoisey
Member

/area vertical-pod-autoscaler

@raywainman
Contributor Author

Just thinking out loud - what is the worst that can happen when changing a lease name?

On upgrade, we would have two vpa-recommender pods (one from the old version, one from the new version) that both believe they are the leader.

For a short period of time, both of them will try to compute and apply VPA recommendations. Off the top of my head, these should generally be the same recommendations; we would just have the possibility of racing/duplication.

Is there any other risk anyone else can think of?

@adrianmoisey
Member

Just thinking out loud - what is the worst that can happen when changing a lease name?

Yup! That is my thought too.

For a short period of time, both of them will try to compute and apply VPA recommendations. Off the top of my head, these should generally be the same recommendations; we would just have the possibility of racing/duplication.

Correct! My understanding is that the worst that could happen is that two VPA recommenders give roughly the same recommendation for a very short period of time, after which only one will be active.

Also, the VPA recommender's --recommender-interval flag is set to 1 minute by default, which lowers that risk. There may be a few other parts I'm forgetting, such as the OOM detector, but I don't think that's a big deal either.

@raywainman
Contributor Author

@ialidzhikov what do you think?

My vote is to change it now before the next release (we could even consider a cherrypick here) given that:

  • This feature is fairly new, we have an opportunity to fix this before it gains wider adoption.
  • Leader election is off by default.
  • We can try and be loud in the release notes about this change to warn users in case they see strange behavior.
  • The theoretical impact of having two leaders running temporarily seems quite minimal.

@raywainman
Contributor Author

cc @voelzmo as well

@raywainman
Contributor Author

raywainman commented Nov 6, 2024

Could something go wrong with the checkpoint?

Perhaps if the new-version leader writes a checkpoint that the old-version leader doesn't understand?

EDIT: Looks like we have support for this case.
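
For anyone who wants to eyeball this during an upgrade window, a rough sanity check (a sketch, assuming the default CRD names are installed) is to watch the checkpoint objects both leaders would be writing:

  # List the VerticalPodAutoscalerCheckpoint objects and watch for unexpected churn
  # while the old and new recommender briefly both act as leader.
  % kubectl get verticalpodautoscalercheckpoints --all-namespaces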

@adrianmoisey
Member

@ialidzhikov what do you think?

My vote is to change it now before the next release (we could even consider a cherrypick here) given that:

  • This feature is fairly new, we have an opportunity to fix this before it gains wider adoption.
  • Leader election is off by default.
  • We can try and be loud in the release notes about this change to warn users in case they see strange behavior.
  • The theoretical impact of having two leaders running temporarily seems quite minimal.

I agree with all these points.
Also to note: the VPA documentation isn't very good at telling people how to run the VPA; the upside is that no documentation was really updated to explain the lease feature (besides the update to the parameters: 211ecdc). I recently added #7463, but that's only a few days old.

If we're going to change the lease name, I think now is the perfect time to do so.

@ialidzhikov
Contributor

ialidzhikov commented Nov 7, 2024

Hi folks,

% docker run registry.k8s.io/autoscaling/vpa-recommender:1.2.1 --help

      --leader-elect                                         Start a leader election client and gain leadership before executing the main loop. Enable this when running replicated components for high availability.
      --leader-elect-lease-duration duration                 The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. (default 15s)
      --leader-elect-renew-deadline duration                 The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than the lease duration. This is only applicable if leader election is enabled. (default 10s)
      --leader-elect-resource-lock string                    The type of resource object that is used for locking during leader election. Supported options are 'leases', 'endpointsleases' and 'configmapsleases'. (default "leases")
      --leader-elect-resource-name string                    The name of resource object that is used for locking during leader election. (default "vpa-recommender")
      --leader-elect-resource-namespace string               The namespace of resource object that is used for locking during leader election. (default "kube-system")
      --leader-elect-retry-period duration                   The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. (default 2s)

The leader election resource name and namespace are configurable. If you are installing a custom VPA to a GKE cluster, please use the --leader-elect-resource-namespace and --leader-elect-resource-name flags to avoid using the same lease that GKE's VPA is using.
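
For example (a sketch only: my-vpa-recommender is just an illustrative name, and in a real install these flags would go into the container args of your recommender Deployment rather than a docker run):

  # Keep leader election on, but lock on a lease that GKE's own recommender does not use.
  % docker run registry.k8s.io/autoscaling/vpa-recommender:1.2.1 \
      --leader-elect=true \
      --leader-elect-resource-name=my-vpa-recommender \
      --leader-elect-resource-namespace=kube-system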

@ialidzhikov
Contributor

I am also open to renaming the default lease in the VPA repo to avoid this issue.

@kwiesmueller
Member

Alright, as it's still new and this is an easy way to avoid issues before too many people use it, I guess changing it now is fine.
I am a little worried about breaking people who already use it, so I'm not sure about backporting it (it doesn't seem correct under semantic versioning).

@raywainman
Contributor Author

We'll need to update this: https://github.com/cowboysysop/charts/blob/master/charts/vertical-pod-autoscaler/templates/recommender/clusterrole.yaml#L141-L142, which was added in cowboysysop/charts#705.

If we combine it with a release, this should be less painful.
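
A quick way to spot charts that would break after a rename is to ask the API server whether the recommender's ServiceAccount may update a lease under the new name (a sketch; the kube-system/vpa-recommender ServiceAccount is the upstream default and may differ per chart, and <new-lease-name> is a placeholder for whatever default we pick):

  # Returns "no" if the chart's ClusterRole still pins resourceNames to the old lease.
  % kubectl auth can-i update leases.coordination.k8s.io/<new-lease-name> \
      -n kube-system --as=system:serviceaccount:kube-system:vpa-recommender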

@raywainman
Contributor Author

Opened cowboysysop/charts#781 to update the Helm chart.
