Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

containerd: introduce a new field to enable NRI #15994

Merged
merged 1 commit into from
Oct 21, 2023

Conversation

fmuyassarov
Copy link
Member

@fmuyassarov fmuyassarov commented Oct 4, 2023

Node Resource Interface (NRI) is a common framework for plugging domain or vendor-specific custom logic into container runtime like containerd. This commit introduces a new field containerd.nri, providing cluster admins the flexibility to opt in (defaulted to disabled) for this feature in containerd. By default, NRI is disabled here in accordance with the containerd's default config file. Since NRI support was added to containerd starting from 1.7.0 release, I've added also check to ensure that at least it is 1.7.0 release of containerd that is in use.
Example:

spec:
  containerd:
    nri: 
      enabled: true
  1. Ref to the containerd NRI configuration: https://github.com/containerd/containerd/blob/main/docs/NRI.md
  2. Initial NRI supported version of containerd v1.7.0

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 4, 2023
@fmuyassarov
Copy link
Member Author

/cc @hakman

@k8s-ci-robot k8s-ci-robot requested a review from hakman October 4, 2023 22:05
@johngmyers
Copy link
Member

Validation of the version should be in pkg/apis/kops/validation/validation.go, specifically validateContainerdConfig()

@johngmyers
Copy link
Member

Is there never going to be anything else we will need to configure around NRI?

How would one get the domain or vendor-specific logic plugged in?

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 9, 2023
@fmuyassarov
Copy link
Member Author

Is there never going to be anything else we will need to configure around NRI?

That's a good point. I discussed with @klihub, who has implemented NRI feature, about what parameters would be good to expose in kOps config. We agreed that at this stage of the NRI, it is good enough to have enabled, socketPath, pluginRegistrationTimeout and pluginRequestTimeout options. In case, if user wants to configure additional parameters then configOverride can be used perhaps. We could see how users adapt to using NRI via kOps and if there is a need that arises to add other parameters of NRI in kOps we can come back and add those too in the future.

How would one get the domain or vendor-specific logic plugged in?

Those logics are referred to as NRI plugins. Currently, there is a project called containers/nri-plugins which serves as a collection of community-maintained NRI plugins. These plugins operate as Kubernetes applications, utilizing resources like DaemonSets and ConfigMaps. Essentially, through kOps, administrators have the capability to enable NRI and subsequently install a plugin on their own.

/cc @kad

@k8s-ci-robot k8s-ci-robot requested a review from kad October 9, 2023 09:46
@kad
Copy link
Member

kad commented Oct 9, 2023

Is there never going to be anything else we will need to configure around NRI?

That's a good point. I discussed with @klihub, who has implemented NRI feature, about what parameters would be good to expose in kOps config. We agreed that at this stage of the NRI, it is good enough to have enabled, socketPath, pluginRegistrationTimeout and pluginRequestTimeout options.

to second what @fmuyassarov already wrote: we don't want at current stage to add to deployments tools, like kops, too much of the parameters of NRI. The key part is to give cluster owners to easily enable feature and configiure the most important bits: enable flag, socket path and timeouts. Anything else if needed can be easily added later into structure if field feedback would really request it.

@klihub
Copy link

klihub commented Oct 9, 2023

Is there never going to be anything else we will need to configure around NRI?

That's a good point. I discussed with @klihub, who has implemented NRI feature, about what parameters would be good to expose in kOps config. We agreed that at this stage of the NRI, it is good enough to have enabled, socketPath, pluginRegistrationTimeout and pluginRequestTimeout options. In case, if user wants to configure additional parameters then configOverride can be used perhaps. We could see how users adapt to using NRI via kOps and if there is a need that arises to add other parameters of NRI in kOps we can come back and add those too in the future.

I also think that at this point it would be enough for us to have the ability to enable NRI and configure/override the timeout parameters.

@fmuyassarov
Copy link
Member Author

/test pull-kops-e2e-k8s-aws-calico

pkg/apis/kops/containerdconfig.go Outdated Show resolved Hide resolved
pkg/apis/kops/containerdconfig.go Outdated Show resolved Hide resolved
pkg/apis/kops/containerdconfig.go Outdated Show resolved Hide resolved
pkg/apis/kops/containerdconfig.go Show resolved Hide resolved
pkg/apis/kops/containerdconfig.go Outdated Show resolved Hide resolved
nodeup/pkg/model/containerd.go Outdated Show resolved Hide resolved
@fmuyassarov
Copy link
Member Author

@johngmyers, Unfortunately, I'm unable to pinpoint the exact reason behind the failings tests. Those CI checks that have failed seems due to running test target of the Makefile. I extended validation unit tests and they pass fine locally. However, on the CI make test seem to fail which might be somewhat unrelated to my changes? Would you be able to spare a moment to have quick look at any of the failing tests? Perhaps it is something obvious for you but not easy to find for me.

@hakman
Copy link
Member

hakman commented Oct 11, 2023

@fmuyassarov I think you need to run make apimachinery crds.

@fmuyassarov
Copy link
Member Author

fmuyassarov commented Oct 11, 2023

@hakman I did that :)

$ make apimachinery crds
go run k8s.io/code-generator/cmd/[email protected] --skip-unsafe=true --v=0 --input-dirs ./pkg/apis/kops/v1alpha2 \
	 --output-base=./ --output-file-base=zz_generated.conversion \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
grep 'requires manual conversion' /home/fmuyassarov/go/src/kops/pkg/apis/kops/v1alpha2/zz_generated.conversion.go ; [ $? -eq 1 ]
go run k8s.io/code-generator/cmd/[email protected] --skip-unsafe=true --v=0 --input-dirs ./pkg/apis/kops/v1alpha3 \
	 --output-base=./ --output-file-base=zz_generated.conversion \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
grep 'requires manual conversion' /home/fmuyassarov/go/src/kops/pkg/apis/kops/v1alpha3/zz_generated.conversion.go ; [ $? -eq 1 ]
go run k8s.io/code-generator/cmd/[email protected] --v=0 --input-dirs ./pkg/apis/kops \
	 --output-base=./ --output-file-base=zz_generated.deepcopy \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 --input-dirs ./pkg/apis/kops/v1alpha2 \
	 --output-base=./ --output-file-base=zz_generated.deepcopy \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 --input-dirs ./pkg/apis/kops/v1alpha3 \
	 --output-base=./ --output-file-base=zz_generated.deepcopy \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 --input-dirs ./pkg/apis/kops/v1alpha2 \
	 --output-base=./ --output-file-base=zz_generated.defaults \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 --input-dirs ./pkg/apis/kops/v1alpha3 \
	 --output-base=./ --output-file-base=zz_generated.defaults \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 \
	 --input-base=k8s.io/kops/pkg/apis --input-dirs=. --input="kops/,kops/v1alpha2,kops/v1alpha3" \
	 --output-package=k8s.io/kops/pkg/client/clientset_generated/ --output-base=/tmp/tmp.GG2Xy8XBxZ \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
go run k8s.io/code-generator/cmd/[email protected] --v=0 --clientset-name="clientset" \
	 --input-base=k8s.io/kops/pkg/apis --input-dirs=. --input="kops/,kops/v1alpha2,kops/v1alpha3" \
	 --output-package=k8s.io/kops/pkg/client/clientset_generated/ --output-base=/tmp/tmp.GG2Xy8XBxZ \
	 --go-header-file "hack/boilerplate/boilerplate.generatego.txt"
cp -r /tmp/tmp.GG2Xy8XBxZ/k8s.io/kops/pkg .
rm -rf /tmp/tmp.GG2Xy8XBxZ
hack/update-goimports.sh
cd "/home/fmuyassarov/go/src/kops/hack" && go build -o "/home/fmuyassarov/go/src/kops/_output/bin/controller-gen" sigs.k8s.io/controller-tools/cmd/controller-gen
"/home/fmuyassarov/go/src/kops/_output/bin/controller-gen" crd paths=k8s.io/kops/pkg/apis/kops/v1alpha2 output:dir=k8s/crds/ crd:crdVersions=v1
fmuyassarov@lenova:~/go/src/kops$ git status
On branch add-nri-support
Untracked files:
  (use "git add <file>..." to include in what will be committed)
	kops/

nothing added to commit but untracked files present (use "git add" to track)

@fmuyassarov
Copy link
Member Author

@fmuyassarov I think you need to run make apimachinery crds.

Hmm, it is strange that I don't see deep copy changes here on the PR. At the same time, make apimachinery crds is not generating deep copy changes.

@hakman
Copy link
Member

hakman commented Oct 11, 2023

Is the kops dir in GOPATH? If so, can you move it somewhere else?

@fmuyassarov
Copy link
Member Author

Is the kops dir in GOPATH? If so, can you move it somewhere else?

Yes. It is. I will give it a try.

@fmuyassarov
Copy link
Member Author

@hakman moving out the project from GOPATH did work. Thank you for help.

@fmuyassarov fmuyassarov force-pushed the add-nri-support branch 2 times, most recently from 8e5b9c2 to dcaacf8 Compare October 11, 2023 09:54
@fmuyassarov
Copy link
Member Author

@johngmyers PTAL. Thanks.

@fmuyassarov
Copy link
Member Author

/cc @hakman

@fmuyassarov
Copy link
Member Author

ping @hakman @johngmyers Hi. Would you folks have some time to take a look at the PR?

nodeup/pkg/model/containerd.go Outdated Show resolved Hide resolved
pkg/apis/kops/containerdconfig.go Outdated Show resolved Hide resolved
nodeup/pkg/model/containerd.go Outdated Show resolved Hide resolved
nodeup/pkg/model/containerd.go Outdated Show resolved Hide resolved
@hakman
Copy link
Member

hakman commented Oct 20, 2023

/hold for confirmation that manual tests are passing

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2023
@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Oct 20, 2023

@fmuyassarov: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-kops-kubernetes-e2e-ubuntu-gce-build a410027 link true /test pull-kops-kubernetes-e2e-ubuntu-gce-build

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Node Resource Interface (NRI) is a common framework for plugging
domain or vendor-specific custom logic into container runtime like
containerd. This commit introduces a new congiguration field
`containerd.nri`, providing cluster admins the flexibility to opt
in for this feature in containerd and tune some of its parameters.
By default, NRI is disabled here in accordance with the containerd's
default config file.

Signed-off-by: Feruzjon Muyassarov <[email protected]>
@fmuyassarov
Copy link
Member Author

Tests issues are now resolved.
/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 20, 2023
Copy link
Member

@hakman hakman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @fmuyassarov! :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 21, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 21, 2023
@k8s-ci-robot k8s-ci-robot merged commit f7bd516 into kubernetes:master Oct 21, 2023
7 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v1.29 milestone Oct 21, 2023
@fmuyassarov fmuyassarov deleted the add-nri-support branch October 21, 2023 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/api area/documentation area/nodeup cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants