Difficulties with keda and helm during keda->standard hpa 'upgrade' #6250

Closed
SleepyBrett opened this issue Oct 18, 2024 · 3 comments
Labels
bug (Something isn't working), stale (All issues that are marked as stale due to inactivity)

Comments

SleepyBrett commented Oct 18, 2024

Report

I have created a helm chart that allows users to define standard hpas or opt into keda. When keda is disabled in the values we create a standard hpa, but when keda is enabled we do not render the hpa and instead render a scaled object that specifies an hpa name. Because of how helm does its install, this is causing us some issues.

A quick overview of how helm installs/upgrades things:

  1. Helm renders all the templates for the current values and produces a number of k8s objects.
  2. Those k8s objects are then created/updated on the cluster (as long as the current ones are owned by helm (have certain labels/annotations)).
  3. Helm then looks for any objects that are 'owned' by the helm release but were not defined in step 1 and it deletes those objects as they are now orphaned.
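
For readers following along, the three steps above can be sketched as a toy Python model. This is only an illustration of the ordering described in this issue, not Helm's real implementation; `helm_upgrade`, `cluster`, and the `(kind, name)` keys are hypothetical names, and the ownership check from step 2 is left out.

```python
from typing import Dict, Set, Tuple

Key = Tuple[str, str]  # (kind, name); a hypothetical way to identify an object


def helm_upgrade(cluster: Dict[Key, dict], previous: Set[Key],
                 rendered: Dict[Key, dict]) -> Set[Key]:
    """Apply `rendered` to `cluster`, then prune what the release no longer defines.

    `previous` holds the keys the release owned before this upgrade.
    Returns the keys the release owns afterwards.
    """
    # Step 1 (rendering) is assumed to have produced `rendered` already.
    # Step 2: create or update every rendered object.
    for key, manifest in rendered.items():
        cluster[key] = manifest
    # Step 3: delete objects the release owned that were not rendered this time.
    for key in previous - set(rendered):
        cluster.pop(key, None)
    return set(rendered)


# Enabling keda: the release used to own an HPA and now renders a ScaledObject.
cluster = {("HorizontalPodAutoscaler", "web"): {}}
owned = helm_upgrade(
    cluster,
    previous={("HorizontalPodAutoscaler", "web")},
    rendered={("ScaledObject", "web"): {}},
)
```

The point of the sketch is the ordering: the new objects are applied before the orphaned ones are deleted, so for a moment both the old hpa and the new scaled object exist.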

In the current chart the name of the hpa that is created when keda is disabled is the same as the hpa name we place into the scaled object when keda is enabled. We use the transfer-hpa-ownership annotation to smooth this over.
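
A rough Python sketch of what the chart toggle produces, in case it helps. The function `render`, its parameters, and the replica counts are hypothetical; the full annotation key `scaledobject.keda.sh/transfer-hpa-ownership` and the `advanced.horizontalPodAutoscalerConfig.name` field are written as I understand them from the KEDA docs, so please check them against your KEDA version.

```python
from typing import List


def render(name: str, keda_enabled: bool) -> List[dict]:
    """Return the autoscaling manifest the chart would render for one workload."""
    if not keda_enabled:
        # Plain HPA branch; min/max replicas are placeholder values.
        return [{
            "apiVersion": "autoscaling/v2",
            "kind": "HorizontalPodAutoscaler",
            "metadata": {"name": name},
            "spec": {
                "scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": name},
                "minReplicas": 1,
                "maxReplicas": 5,
            },
        }]
    # KEDA branch: no HPA is rendered, only a ScaledObject that names the HPA.
    return [{
        "apiVersion": "keda.sh/v1alpha1",
        "kind": "ScaledObject",
        "metadata": {
            "name": name,
            "annotations": {"scaledobject.keda.sh/transfer-hpa-ownership": "true"},
        },
        "spec": {
            "scaleTargetRef": {"name": name},
            # Same HPA name as the non-KEDA branch, so the existing HPA is taken over.
            "advanced": {"horizontalPodAutoscalerConfig": {"name": name}},
            "triggers": [],  # placeholder; scaler details are not part of this issue
        },
    }]


hpa_only = render("kedatest-microservice", keda_enabled=False)
keda_only = render("kedatest-microservice", keda_enabled=True)
```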

So from a helm point of view, when we enable keda:

  1. helm renders a scaled object and not an hpa
  2. helm applies the rendered objects:
     1. the keda validation webhook sees the current hpa (not owned by keda) and, because of the annotation and because the names match, does not care and moves on
     2. the scaledobject is created
  3. helm removes the current hpa, which is now orphaned
  4. the keda controller reconciles the scaled object and, since the hpa does not exist, creates it

So far so good. We now have an hpa owned by the scaled object.

Now when we then disable keda:

  1. helm renders templates and generates an HPA object but no scaledobject
  2. apply: helm updates the hpa that keda created (we think; this is an odd one because I would expect helm to choke here on non-ownership. Perhaps helm does not remove the current hpa when 'upgrading to keda' because of the keda ownership block? Audit logs could tell us, I suppose)
  3. helm removes the scaled object
  4. keda/or k8s controller manager removes the hpa because it was owned by the scaled object
  5. we are left with a deployment with no hpa
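
The downgrade sequence above can be modelled in a few lines of Python. This is a toy model of owner-reference garbage collection, not the Kubernetes implementation; the string keys and the single `ownerReference` field are simplifications, and it assumes (as we suspect above) that the owner reference survives Helm's update of the hpa.

```python
from typing import Dict


def garbage_collect(cluster: Dict[str, dict]) -> None:
    """Delete every object whose owner no longer exists, until nothing changes."""
    changed = True
    while changed:
        changed = False
        for key, obj in list(cluster.items()):
            owner = obj.get("ownerReference")
            if owner is not None and owner not in cluster:
                del cluster[key]
                changed = True


# State after the upgrade to keda: the hpa is owned by the scaled object.
cluster = {
    "ScaledObject/web": {},
    "HorizontalPodAutoscaler/web": {"ownerReference": "ScaledObject/web"},
}

# Steps 1-2: helm updates the hpa in place (assumption: the owner reference stays).
cluster["HorizontalPodAutoscaler/web"]["spec"] = {"maxReplicas": 5}
# Step 3: helm prunes the scaled object it no longer renders.
del cluster["ScaledObject/web"]
# Step 4: the hpa's owner is gone, so it is collected as well.
garbage_collect(cluster)

remaining = sorted(cluster)  # step 5: nothing is left to scale the deployment
```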

So then we think: ok, what if the name of the hpa created by a non-keda install and the hpa referenced by a keda install are different? We make the changes but find that when we go to upgrade from non-keda -> keda the validation webhook rejects us, because at the time we are applying the scaledobject the old hpa still exists. We get this error: failed to create resource: admission webhook "vscaledobject.kb.io" denied the request: the workload 'kedatest-microservice' of type 'apps/v1.Deployment' is already managed by the hpa 'kedatest-microservice'
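
To make the two cases concrete, here is a sketch of the rule as it appears to behave from the outside. It is inferred only from the behaviour described in this issue and is not KEDA's actual webhook code; `validate_scaled_object` and the dict fields are hypothetical, and the full annotation key is assumed to be `scaledobject.keda.sh/transfer-hpa-ownership`.

```python
from typing import List

TRANSFER = "scaledobject.keda.sh/transfer-hpa-ownership"  # assumed full key


def validate_scaled_object(so: dict, existing_hpas: List[dict]) -> None:
    """Raise ValueError if another hpa already manages the scaled object's workload."""
    for hpa in existing_hpas:
        if (hpa["target_kind"], hpa["target_name"]) != (so["target_kind"], so["target_name"]):
            continue  # this hpa scales some other workload
        if hpa.get("owner") == so["name"]:
            continue  # the hpa already belongs to this scaled object
        transfer = so.get("annotations", {}).get(TRANSFER) == "true"
        if transfer and hpa["name"] == so["hpa_name"]:
            continue  # same name plus the annotation: ownership is handed over
        raise ValueError(
            f"the workload '{so['target_name']}' of type '{so['target_kind']}' "
            f"is already managed by the hpa '{hpa['name']}'"
        )


existing = [{"name": "kedatest-microservice",
             "target_kind": "apps/v1.Deployment",
             "target_name": "kedatest-microservice"}]
same_name = {"name": "kedatest-microservice", "hpa_name": "kedatest-microservice",
             "target_kind": "apps/v1.Deployment", "target_name": "kedatest-microservice",
             "annotations": {TRANSFER: "true"}}
other_name = dict(same_name, hpa_name="kedatest-microservice-keda")

validate_scaled_object(same_name, existing)  # accepted: names match
try:
    validate_scaled_object(other_name, existing)
    rejected = None
except ValueError as err:
    rejected = str(err)  # rejected: the old hpa still exists under a different name
```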

Is there any way around this? We have fiddled a bit with the scaled object annotations, but they are, frankly, pretty poorly documented, specifically validations.keda.sh/hpa-ownership.

Expected Behavior

I expect to be able to toggle back and forth between a standard hpa and a keda hpa using the standard helm upgrade -i method in a single-step process.

Actual Behavior

Deletion of the scaled object deletes the underlying hpa, leaving a service that has downgraded from keda to a standard hpa with no hpa at all.

Steps to Reproduce the Problem

I've described the steps above, but I can provide a slimmed-down helm chart on request.

Logs from KEDA operator

logs are unimportant

KEDA Version

2.14.0

Kubernetes Version

1.29

Platform

Amazon Web Services

Scaler Details

unimportant

Anything else?

It seems to me that you could implement a new annotation that would, on scaledobject deletion, instead of marking for delete immediately, first remove the ownership claim from the hpa, thus leaving the hpa intact. Thoughts?
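
A minimal sketch of the idea, in the same toy style as before. Nothing here exists in keda today; `delete_scaled_object` and the `orphan_hpa` flag are hypothetical stand-ins for the proposed annotation, and the cascade at the end is a simplified model of owner-reference cleanup.

```python
from typing import Dict


def delete_scaled_object(cluster: Dict[str, dict], so_key: str, orphan_hpa: bool) -> None:
    """Delete a scaled object, optionally releasing its hpa first."""
    if orphan_hpa:
        # Proposed behaviour: drop the ownership claim before deleting.
        for obj in cluster.values():
            if obj.get("ownerReference") == so_key:
                del obj["ownerReference"]
    del cluster[so_key]
    # Simplified cascade: anything still owned by the deleted object goes with it.
    for key in [k for k, o in cluster.items() if o.get("ownerReference") == so_key]:
        del cluster[key]


def fresh_cluster() -> Dict[str, dict]:
    return {
        "ScaledObject/web": {},
        "HorizontalPodAutoscaler/web": {"ownerReference": "ScaledObject/web"},
    }


today = fresh_cluster()
delete_scaled_object(today, "ScaledObject/web", orphan_hpa=False)

proposed = fresh_cluster()
delete_scaled_object(proposed, "ScaledObject/web", orphan_hpa=True)
```

With `orphan_hpa=False` the hpa disappears together with the scaled object; with `orphan_hpa=True` it stays behind without an owner, which is the state the helm downgrade needs.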

We realize that this is a bit of an edge case, but it is one that would bite pretty hard and it concerns us.

SleepyBrett added the bug label Oct 18, 2024
@JorTurFer
Member

Hello
I get the reproduction steps, but the problem there is the ownership of the HPA. When KEDA deploys the HPA, it registers the ScaledObject as an owner reference on the HPA, and Kubernetes is then responsible for removing the HPA. This mechanism is there to remove orphaned resources, so we can't disable it.
Currently, we don't support disabling specific rules one by one, so the best workaround I can suggest is removing the admission webhooks (or just scaling them to 0) and using different names for your HPA and KEDA's HPA.
In the midterm, I think that supporting more annotations to disable specific admission rules would be a nice feature. If you are willing to add this as the first one, it'd be nice :)

stale bot commented Jan 3, 2025

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Jan 3, 2025

stale bot commented Jan 10, 2025

This issue has been automatically closed due to inactivity.

stale bot closed this as completed Jan 10, 2025