Introduce Node Lifecycle WG #8396

Merged
k8s-ci-robot merged 1 commit into kubernetes:master from atiratree:wg-node-lifecycle
Jun 24, 2025

@atiratree
Member

No description provided.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes, area/community-management, and area/slack-management labels Mar 24, 2025
@k8s-ci-robot k8s-ci-robot requested review from ahg-g and ardaguclu March 24, 2025 12:17
@k8s-ci-robot k8s-ci-robot added the committee/steering, sig/apps, sig/architecture, size/L, sig/autoscaling, sig/cli, sig/cloud-provider, sig/cluster-lifecycle, sig/contributor-experience, do-not-merge/invalid-owners-file, sig/node, and sig/scheduling labels Mar 24, 2025
@github-project-automation github-project-automation Bot moved this to Needs Triage in SIG Scheduling Mar 24, 2025
@atiratree atiratree changed the title Introduce Node Lifecycle WG WIP: Introduce Node Lifecycle WG Mar 24, 2025
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2025
@atiratree
Member Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2025
@rthallisey
Contributor

Looks like I'm not a member of kubernetes org anymore. I was a few years back, but didn't keep up with contributions recently. You can remove me as a lead and I can reapply after some contributions to this WG.

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-owners-file Indicates that a PR should not merge because it has an invalid OWNERS file in it. label Mar 24, 2025
@atiratree
Member Author

We have had impactful conversations with Ryan about this group and its goals. He has experience with cluster maintenance and I look forward to his participation in the WG.

@marquiz
Contributor

marquiz commented Mar 25, 2025

/cc

@k8s-ci-robot k8s-ci-robot requested a review from marquiz March 25, 2025 17:09
Comment thread wg-node-lifecycle/charter.md
@atiratree atiratree force-pushed the wg-node-lifecycle branch from cf8dbfb to 3aed2af Compare June 5, 2025 18:11
@mrunalp
Contributor

mrunalp commented Jun 5, 2025

+1

@SergeyKanzhelev
Member

+1 from SIG Node. SIG Node sponsors this WG with the understanding that the group will concentrate on both shaping and driving existing KEPs through their stages and collecting and evaluating requirements and deciding on priorities for introducing new KEPs.

Member

@SergeyKanzhelev SergeyKanzhelev left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 6, 2025
@jsafrane
Member

jsafrane commented Jun 6, 2025

lgtm from sig-storage

Contributor

@soltysh soltysh left a comment

With my sig-apps hat on, I'll add +1 from sig-apps.

See my various comments. I'm also still missing an ACK from SIG Network and SIG Scheduling.

* Humble Chirammal (**[@humblec](https://github.com/humblec)**), VMware
* Lucy Sweet (**[@intUnderflow](https://github.com/intUnderflow)**), Uber
* Krzysztof Wilczyński (**[@kwilczynski](https://github.com/kwilczynski)**), Independent
* Ryan Hallisey (**[@rthallisey](https://github.com/rthallisey)**), NVIDIA
Contributor

Do we really need that many leads for this WG? Isn't it sufficient to have one per represented company, ensuring they cover all the related SIGs?

Member Author

We have representation of either different SIGs or different companies, so I am not seeing any duplicate role at this moment. Perhaps we could trim this list later according to these folks' activity and continued interest?

Contributor

Being a lead carries a certain responsibility. It goes beyond "is interested in the topic". But I'm okay with letting you figure out among yourselves who is really showing up consistently to keep the WG moving and then perhaps do some pruning.

Member

same concern but overall lgtm

/approve

For Steering.

Member Author

followup in #8748

Comment thread sigs.yaml Outdated
mailing_list: https://groups.google.com/a/kubernetes.io/g/wg-node-lifecycle
liaison:
github: TBD
name: TBD
Contributor

@kubernetes/steering-committee we'll need to figure this out prior to merge.

Member

They can provisionally put me.
I'm happy to and based on load + election timing that makes sense as you mentioned elsewhere.
If anyone objects we can update it.

Member Author

Thanks for covering the liaison role @BenTheElder! Added.

Comment thread wg-node-lifecycle/README.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread wg-node-lifecycle/charter.md Outdated
@macsko
Member

macsko commented Jun 9, 2025

+1 from SIG Scheduling

Comment thread wg-node-lifecycle/charter.md
Comment thread wg-node-lifecycle/charter.md
Comment thread wg-node-lifecycle/charter.md Outdated
Comment thread sig-network/README.md

The following [working groups][working-group-definition] are sponsored by sig-network:
* [WG Device Management](/wg-device-management)
* [WG Node Lifecycle](/wg-node-lifecycle)
Member

+1 from SIG Network

@atiratree atiratree force-pushed the wg-node-lifecycle branch from 3aed2af to 39e3bde Compare June 10, 2025 10:13
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 10, 2025
Comment thread sigs.yaml
Comment thread sigs.yaml Outdated
@BenTheElder
Member

I think we finally have all SIG +1s now?
cc @kubernetes/steering-committee

This group will have a lot to do :-)
+1, thanks for pulling this together.

Contributor

@soltysh soltysh left a comment

/lgtm
/approve

/hold
to get sufficient majority from steering

Comment thread wg-node-lifecycle/charter.md Outdated
## Timelines and Disbanding

The working group will disband once the features and core APIs defined in the following
KEPs/Features have reached a stable state (GA) and ongoing maintenance ownership is established
Contributor

There are no "following KEPs/Features"... so your work is already done? 😛

You probably had this in a different order initially and lost them during some reshuffling. I think this refers to the features under "Prioritization" now?

Member Author

Right, updated!

Contributor

/lgtm

@pohly
Contributor

pohly commented Jun 24, 2025

/approve

For Steering.

https://github.com/kubernetes/community/pull/8396/files#r2161652316 should be resolved before merging. Also, this needs a rebase...

Co-authored-by: Ryan Hallisey <rhallisey@nvidia.com>
@atiratree
Member Author

Updated and rebased. Thanks everyone!

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: atiratree, pacoxu, pohly, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [pacoxu,pohly,soltysh]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@soltysh
Contributor

soltysh commented Jun 24, 2025

With 4 steering members +1-ing (Paco, Patrick, Ben and myself) this is good to go as is based on the rules.

/hold cancel

@evrardjp evrardjp left a comment

A few extra comments for the sake of history/documentation. I just hope someone will read them and keep them in mind when we produce solutions.

- Consider improving the pod lifecycle of DaemonSets and static pods during a node maintenance.
- Explore the cloud provider use cases and how they can hook into the node lifecycle. So that the
users can use the same APIs or configurations across the board.
- Migrate users of the eviction based kubectl-like drain (kubectl, cluster autoscaler, karpenter,

I am a bit sad that kured was removed here, while it was in the initial comments on a previous issue.

I would like to adapt kured to this framework at least.

Member Author

@evrardjp no worries! This list is not supposed to be exhaustive. We will properly explore the migration topic when the time comes.

FYI, kured is still present in https://github.com/kubernetes/community/blob/master/wg-node-lifecycle/charter.md#relevant-projects

- Explore a unified way of draining the nodes and managing node maintenance by introducing new APIs
and extending the current ones. This includes exploring extension to or interactions with the Node
object.
- Analyze the node lifecycle, the Node API, and possible interactions. We want to explore augmenting

I think we could have been more specific here. For example, analyze the possibility of setting new conditions on nodes.

Member Author

The WG charter does not replace a proper KEP; it only indicates the direction in which we would like to proceed. So we will even explore options that are not explicitly listed here.

We expect to provide reference implementations of the new APIs including but not limited to
controllers (kube-controller-manager), API validation, integration with existing core components and
extension points for the ecosystem. This should be accompanied by E2E / Conformance tests.

And cli...

Member Author

CLI might be too broad. Nevertheless, we mention kubectl above and will analyze it further in our KEPs.

Labels

approved, area/community-management, area/slack-management, cncf-cla: yes, committee/steering, lgtm, sig/apps, sig/architecture, sig/autoscaling, sig/cli, sig/cloud-provider, sig/cluster-lifecycle, sig/contributor-experience, sig/network, sig/node, sig/scheduling, sig/storage, size/L
