@nemacysts

This will ensure that pods in CrashLoopBackOff don't impact pool scaledowns (otherwise, Karpenter will refuse to consolidate nodes due to PDBs).

This is not enabled by default due to our legacy clusters, but will be toggled at a per-cluster level through SystemPaastaConfig.

This PR is also somewhat spicy due to these legacy clusters, so we'll definitely need to do some testing to make sure that the upgraded k8s clientlib doesn't cause chaos there (or we'll have to take more drastic action).

That said, I've upgraded the k8s clientlib to a somewhat more modern version to match the oldest non-legacy cluster we have, and therefore had to make a number of changes (e.g., V1Subject was renamed to RbacV1Subject).
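
For illustration only, here is one way code could tolerate a rename like V1Subject to RbacV1Subject while legacy and non-legacy clusters are still on different clientlib versions. This is a minimal sketch, not what this PR does: `resolve_renamed` is a hypothetical helper, and the stand-in namespaces below take the place of the real `kubernetes.client` module so the snippet runs without the clientlib installed.

```python
from types import SimpleNamespace
from typing import Any


def resolve_renamed(module: Any, *candidate_names: str) -> Any:
    """Return the first attribute found on `module` among `candidate_names`.

    Hypothetical helper: list the newest name first and the legacy name last.
    """
    for name in candidate_names:
        attr = getattr(module, name, None)
        if attr is not None:
            return attr
    raise AttributeError(
        f"none of {candidate_names!r} found on {getattr(module, '__name__', module)!r}"
    )


# Stand-ins for an older and a newer clientlib (assumption: only the class
# name differs between the two versions).
old_client = SimpleNamespace(V1Subject="old-subject-class")
new_client = SimpleNamespace(RbacV1Subject="new-subject-class")

# The same call works against either version.
subject_old = resolve_renamed(old_client, "RbacV1Subject", "V1Subject")
subject_new = resolve_renamed(new_client, "RbacV1Subject", "V1Subject")
```

With the real library, the first argument would be the imported `kubernetes.client` module. Whether a shim like this is preferable to pinning an older PaaSTA version on legacy clusters is a separate question.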

TODO

  • Fix HPA v2beta2 imports (how does this code work atm?!)
  • Test this in an actual cluster (legacy and non-legacy)

if load_system_paasta_config().get_enable_unhealthy_pod_eviction():
    spec = V1PodDisruptionBudgetSpec(
        max_unavailable=max_unavailable,
        unhealthy_pod_eviction_policy="AlwaysAllow",  # XXX: should this be configurable?

I would always set unhealthy_pod_eviction_policy, reading the value from config with a default of "IfHealthyBudget".

But... I forgot about k8s clusters running older versions. Do we pin an old PaaSTA version there?
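
The two review comments above could be reconciled roughly as sketched below: the policy comes from config with a default of "IfHealthyBudget", and passing `None` omits the field entirely for clusters too old to accept it. This is a sketch under assumptions, not the PR's implementation. `build_pdb_spec` is a hypothetical helper that builds a plain dict instead of a `V1PodDisruptionBudgetSpec`, so it runs without the k8s clientlib.

```python
from typing import Any, Dict, Optional

# The two values Kubernetes accepts for a PDB's unhealthyPodEvictionPolicy.
VALID_POLICIES = {"IfHealthyBudget", "AlwaysAllow"}


def build_pdb_spec(
    max_unavailable: str,
    selector_labels: Dict[str, str],
    unhealthy_pod_eviction_policy: Optional[str] = "IfHealthyBudget",
) -> Dict[str, Any]:
    """Build a PodDisruptionBudget spec as a plain dict (hypothetical helper).

    Pass None for the policy to leave the field out, e.g. on legacy clusters
    whose API server does not know about it.
    """
    spec: Dict[str, Any] = {
        "maxUnavailable": max_unavailable,
        "selector": {"matchLabels": dict(selector_labels)},
    }
    if unhealthy_pod_eviction_policy is None:
        return spec
    if unhealthy_pod_eviction_policy not in VALID_POLICIES:
        raise ValueError(
            f"unknown unhealthy pod eviction policy: {unhealthy_pod_eviction_policy!r}"
        )
    spec["unhealthyPodEvictionPolicy"] = unhealthy_pod_eviction_policy
    return spec


# Assumed usage: the labels and percentages are illustrative only.
default_spec = build_pdb_spec("10%", {"app": "example"})
legacy_spec = build_pdb_spec("10%", {"app": "example"}, None)
karpenter_spec = build_pdb_spec("10%", {"app": "example"}, "AlwaysAllow")
```

In this shape, the per-cluster value would come from SystemPaastaConfig, with legacy clusters configured to omit the field.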

@ajayOO8

ajayOO8 commented Nov 4, 2025

Forked this into a new PR, #4146, with some changes (resolving the git conflicts was a pain).
Closing this one.

@ajayOO8 closed this Nov 4, 2025