Feature enforceExpireAfter - respect to expireAfter ttl #1789

Open
ArieLevs opened this issue Oct 30, 2024 · 7 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@ArieLevs

Description

Karpenter has the expireAfter feature.
Let's assume I've configured a 24h value for expireAfter.
24 hours have passed and the node should be terminated, but Karpenter instead reports:

  Type     Reason             Age                  From       Message
  ----     ------             ----                 ----       -------
  Warning  FailedDraining     27m (x533 over 18h)  karpenter  Failed to drain node, 7 pods are waiting to be evicted
  Normal   DisruptionBlocked  67s (x525 over 18h)  karpenter  Cannot disrupt Node: state node is marked for deletion

This happens because one or more of the pods on that node carry the karpenter.sh/do-not-disrupt: true annotation.
The result is a node tainted with karpenter.sh/disrupted:NoSchedule: no new pods will schedule onto it, and it is in a "stuck" state.
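For reference, the blocking annotation sits in the pod's metadata, e.g. (pod name is illustrative):

  apiVersion: v1
  kind: Pod
  metadata:
    name: my-app
    annotations:
      karpenter.sh/do-not-disrupt: "true"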

I would like a way to force Karpenter to spin up new nodes even when this annotation is used.
Would it be reasonable to add an enforceExpireAfter: true|false (default false) feature, so that if true is set, Karpenter will ignore/remove the do-not-disrupt annotation and just delete the node?
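A rough sketch of how this could look on the NodePool, assuming the v1 schema where expireAfter lives under spec.template.spec (enforceExpireAfter is only the proposed field and does not exist today):

  apiVersion: karpenter.sh/v1
  kind: NodePool
  metadata:
    name: default
  spec:
    template:
      spec:
        expireAfter: 24h
        # proposed: when true, ignore/remove karpenter.sh/do-not-disrupt
        # and terminate the node once expireAfter elapses
        enforceExpireAfter: true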

What problem are you trying to solve?
Forcefully delete nodes after the expireAfter TTL has passed.

How important is this feature to you?
Very; the lack of this feature results in underutilized nodes that cannot be automatically deleted.


  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@ArieLevs ArieLevs added the kind/feature Categorizes issue or PR as related to a new feature. label Oct 30, 2024
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Oct 30, 2024
@ArieLevs ArieLevs changed the title enforceExpireAfter feature - respect to expireAfter ttl Feature enforceExpireAfter - respect to expireAfter ttl Oct 30, 2024
@jmdeal
Member

jmdeal commented Oct 30, 2024

Have you considered using terminationGracePeriod?
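For example, on the NodePool (assuming the v1 schema; the 2h value is only illustrative):

  spec:
    template:
      spec:
        # upper bound on how long Karpenter waits for the node to drain
        # before terminating it, do-not-disrupt pods included
        terminationGracePeriod: 2h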

@ArieLevs
Author

Have you considered using terminationGracePeriod?

Yes, we use a 2h terminationGracePeriod value. This does not work, probably because a delete call is never even made against the node (it's just "marked for deletion" by Karpenter).

@ArieLevs
Author

@jmdeal after re-reading the documentation on TerminationGracePeriod,

it states:

For instance, a NodeClaim with terminationGracePeriod set to 1h and an expireAfter set to 23h will begin draining after it’s lived for 23h. Let’s say a do-not-disrupt pod has TerminationGracePeriodSeconds set to 300 seconds. If the node hasn’t been fully drained after 55m, Karpenter will delete the pod to allow it’s full terminationGracePeriodSeconds to cleanup. If no pods are blocking draining, Karpenter will cleanup the node as soon as the node is fully drained, rather than waiting for the NodeClaim’s terminationGracePeriod to finish.

So in my case:
terminationGracePeriod set to 1h - true
expireAfter set to 23h - true
a do-not-disrupt pod has TerminationGracePeriodSeconds set to 300 seconds - true (in my case it is 0 seconds)
but,
"Karpenter will delete the pod to allow it's full terminationGracePeriodSeconds to cleanup" - false
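In other words, the timeline that the quoted example describes should be:

  t = 23h       node has lived for expireAfter, draining begins
  t = 23h 55m   Karpenter deletes the still-blocking pod so its 300s terminationGracePeriodSeconds can elapse
  t = 24h       the 1h terminationGracePeriod is up and the node is terminated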

Should this issue be changed to a bug instead of a feature request?

@jmdeal
Member

jmdeal commented Nov 1, 2024

Yes, if the node has been draining for longer than your terminationGracePeriod, this would be a bug, not a feature. TGP should enforce a maximum grace time, which should meet your use case. Are you able to share the Karpenter logs / events that were emitted?

/kind bug
/triage needs-information

@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 1, 2024
@jmdeal
Member

jmdeal commented Nov 1, 2024

/remove-kind feature

@k8s-ci-robot k8s-ci-robot removed the kind/feature Categorizes issue or PR as related to a new feature. label Nov 1, 2024
@ArieLevs
Author

ArieLevs commented Nov 1, 2024

Sure 👍, I will add logs from historical data by early next week (but I will probably have fresh info from Sunday/Monday).

thanks

@ArieLevs
Author

ArieLevs commented Nov 7, 2024

An example from today, for a node that is part of a NodePool with expireAfter: 720h (30 days).
The node has been alive for 77d.

Events:
  Type     Reason             Age                      From       Message
  ----     ------             ----                     ----       -------
  Warning  FailedDraining     5m54s (x1561 over 2d4h)  karpenter  Failed to drain node, 8 pods are waiting to be evicted
  Normal   DisruptionBlocked  58s (x1510 over 2d4h)    karpenter  Cannot disrupt Node: state node is marked for deletion

Endless logs of:

{"body":"Failed to drain node, 8 pods are waiting to be evicted","severity":"Warning","attributes":{"k8s.event.action":"","k8s.event.count":1546,"k8s.event.name":"ip-10-235-51-74.ec2.internal.1804f311a19b7e76","k8s.event.reason":"FailedDraining","k8s.event.start_time":"2024-11-07 00:17:41 +0000 UTC","k8s.event.uid":"73e98ab4-b698-4f16-90f1-db050a48d744","k8s.namespace.name":""},"resources":{"k8s.node.name":"","k8s.object.api_version":"v1","k8s.object.fieldpath":"","k8s.object.kind":"Node","k8s.object.name":"ip-10-235-51-74.ec2.internal","k8s.object.resource_version":"965976639","k8s.object.uid":"5abd9407-e06d-4e80-b0db-c48444e4f414"}}

{"body":"Cannot disrupt Node: state node is marked for deletion","severity":"Normal","attributes":{"k8s.event.action":"","k8s.event.count":1501,"k8s.event.name":"ip-10-235-51-74.ec2.internal.1804f3123cd3ae94","k8s.event.reason":"DisruptionBlocked","k8s.event.start_time":"2024-11-06 02:22:17 +0000 UTC","k8s.event.uid":"4d6ddbff-1a4b-4fa7-8eb3-dd8ba0c37753","k8s.namespace.name":""},"resources":{"k8s.node.name":"","k8s.object.api_version":"v1","k8s.object.fieldpath":"","k8s.object.kind":"Node","k8s.object.name":"ip-10-235-51-74.ec2.internal","k8s.object.resource_version":"963917187","k8s.object.uid":"5abd9407-e06d-4e80-b0db-c48444e4f414"}}

This node contains 8 pods: 7 of them are DaemonSet pods, plus a single Deployment pod that carries the karpenter.sh/do-not-disrupt: true annotation. That pod has the following events:

Events:
  Type    Reason     Age                   From       Message
  ----    ------     ----                  ----       -------
  Normal  Nominated  29m (x12 over 3h32m)  karpenter  Pod should schedule on: nodeclaim/default-jv5sb, node/NODE-A
  Normal  Nominated  3m2s (x806 over 27h)  karpenter  Pod should schedule on: nodeclaim/default-c8h6l, node/NODE-B

Note that the nodes the above pod was nominated for (i.e. NODE-A and NODE-B) have the following events (both have been alive for just under 4 days):

Events:
  Type    Reason             Age                    From       Message
  ----    ------             ----                   ----       -------
  Normal  DisruptionBlocked  75s (x1511 over 2d4h)  karpenter  Cannot disrupt Node: state node is nominated for a pending pod
