Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch containerd to enable NRI before running NRI plugin #97

Closed
wants to merge 1 commit into from

Conversation

fmuyassarov
Copy link
Collaborator

This commit adds an init container to enable NRI in containerd configuration file. Users installing plugin via Helm have an option to opt in for using that init container to enable the NRI before running plugin. By default, this option is disabled.

@fmuyassarov
Copy link
Collaborator Author

/cc @kad

@fmuyassarov
Copy link
Collaborator Author

/cc @marquiz

@fmuyassarov
Copy link
Collaborator Author

@kad image is now pinned to ubuntu:22.04.

Comment on lines 29 to 30
nsenter -t 1 -m bash
systemctl restart containerd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really work as intended ?

I wonder about that nsenter usage. AFAICT, here it starts bash in the (mount) namespace of init/pid 1. Then once that bash exits, the original patch_nri.sh script continues in its original namespaces and runs systemctl restart containerd. So what does that nsenter accomplish ?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this really work as intended ?

Yes

I wonder about that nsenter usage. AFAICT, here it starts bash in the (mount) namespace of init/pid 1. Then once that bash exits, the original patch_nri.sh script continues in its original namespaces and runs systemctl restart containerd. So what does that nsenter accomplish ?

We don't exit the bash of the pid1 until we run systemctl restart containerd. We enter the pid1 in nsenter -t 1 -m bash and we don't exit but instead we just kill the init container as a whole. So, nsenter is entering the mount namespace of the init process (PID 1) and launching a bash within that namespace. For having the access to the host pid, I added hostPID: true.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is that systemctl restart containerd supposed to be run in the namespace(s) nsenter entered ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that nsenter-started bash exits immediately (when running without a controlling tty) and systemctl restart containerd is run in the same namespace as the main script.

Anyway, I wonder what nsenter -t 1 -m would anyway change from the namespaces point of view. Namespaces are inherited across forks. If a process is not actively changing its namespaces, it should be running in the same namespace as its parent process, so transitively as PID 1/its init process.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the nsenter bash looks like nop to me, too

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that relying on nsenter -t 1 -m wasn't sufficient to execute the systemctl... command from within a script, even though it functions well when manually executed in the current terminal session. I have now used nsenter -t 1 -m systemctl restart containerd instead, thanks to the help of @klihub .

Here is the short POC:

POC

@klihub klihub requested a review from marquiz August 8, 2023 15:52
@@ -0,0 +1,36 @@
{{- if .Values.nri.enable }}
apiVersion: v1
kind: ConfigMap
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need this configmap business, the script is not that super complex. I think it would be simpler and safer (no way to affect what get's executed in privileged mode in host pid namespace by just mangling a configmap).

I'd suggest to write the script in the init container spec, smth like

    command:
      - "/bin/bassh"
      - "-c"
    args:
      - |
        echo Patching containerd config
        sed '/\[plugins.\"io.containerd.nri.v1.nri/,/^$/s/disable = false/disable = true/' -i /etc/containerd/config.toml
        systemctl restart containerd

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm personally not fan of adding multi line commands into pod spec especially when the command is long like with sed here. It makes it hard to read the DaemonSet file.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and if you see the script, it is not just 3 lines but we also have some conditionals there.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see why the configmap is any more readable. What about the security concerns?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you should be able to put the script in the pod spec formatted exactly like in the configmap

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see why the configmap is any more readable. What about the security concerns?

what kind of security concerns do you mean?

Copy link
Collaborator Author

@fmuyassarov fmuyassarov Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean you should be able to put the script in the pod spec formatted exactly like in the configmap

Yes I know. I just didn't like having multi line, long command in the spec of the pod.

I tend to isolate things into their own entity(when possible), instead of having one big YAML.
But anyways, I don't mean to say it is incorrect, it is just my personal preference and I'm totally fine to do it in the spec.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what kind of security concerns do you mean?

The visibility and assurance about what acutally gets run in the privileged host-pid-ns container. This whole business of mounting "some" configmap into the container and running a shell script from there

Comment on lines 29 to 30
nsenter -t 1 -m bash
systemctl restart containerd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the nsenter bash looks like nop to me, too

This commit adds an init container to enable NRI in containerd
configuration file. Users installing plugin via Helm have an
option to opt in for using that init container to enable the
NRI before running plugin. By default, this option is disabled.

Signed-off-by: Feruzjon Muyassarov <[email protected]>
@@ -0,0 +1,36 @@
{{- if .Values.nri.enable }}
apiVersion: v1
kind: ConfigMap
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fail to see why the configmap is any more readable. What about the security concerns?

echo "NRI has now been enabled..."
else
echo "NRI plugin entry doesn't exist in the continerd config file."
echo "Doing nothing..."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we add the nri plugin config in this case? The containerd config might be a stub, only containing settings that differ from the defaults.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure. If [plugins."io.containerd.nri.v1.nri"] is not there, we can of course add the following defaults.
WDYT @klihub ?

    # Enable NRI support in containerd.
    disable = false
    # Allow connections from externally launched NRI plugins.
    disable_connections = false
    # plugin_config_path is the directory to search for plugin-specific configuration.
    plugin_config_path = "/etc/nri/conf.d"
    # plugin_path is the directory to search for plugins to launch on startup.
    plugin_path = "/opt/nri/plugins"
    # plugin_registration_timeout is the timeout for a plugin to register after connection.
    plugin_registration_timeout = "5s"
    # plugin_requst_timeout is the timeout for a plugin to handle an event/request.
    plugin_request_timeout = "2s"
    # socket_path is the path of the NRI socket to create for plugins to connect to.
    socket_path = "/var/run/nri/nri.sock"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OR just put disable = false as that's the only one needed, and not pretend that we necessarily know all the nri plugin config options

Of course this only makes sense if the containerd version that is running supports nri...

data:
patch_nri.sh: |
#!/bin/bash

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should bail out on any error

Suggested change
set -e

cp /etc/containerd/config.toml "$config_file"

# Check if the disable field is set to false under [plugins."io.containerd.nri.v1.nri"]
status=$(awk -F ' = ' '/^\[plugins\."io\.containerd\.nri\.v1\.nri"\]/{flag=1} flag && /disable/{print $2; exit}' "$config_file")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler to just grep if [plugins."io.containerd.nri.v1.nri"] exists, do sed unconditioinally and use e.g. diff to see if anything was changed? WDYT?

Plus, in either of these cases there's still the corner case that the[plugins."io.containerd.nri.v1.nri"] section is present but the the disable setting hasn't been specified explicitly. Do we need to cover this(?) 🧐

Copy link
Collaborator Author

@fmuyassarov fmuyassarov Aug 9, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it be simpler to just grep if [plugins."io.containerd.nri.v1.nri"] exists, do sed unconditioinally and use e.g. diff to see if anything was changed? WDYT?

It can be simpler, but it also means extra/unnecessary restart of the containerd.

EDIT: I missed your point, about checking the diff. So, diff would be used I guess to check if restart is needed or not right? Would not it be better instead of having diff just to check if we actually need to touch the file or not? Actually for me these two options (1. what I have, 2. what you suggest) don't really make a much difference. One is doing diff before touching the file the other is after touching the file.

Plus, in either of these cases there's still the corner case that the[plugins."io.containerd.nri.v1.nri"] section is present but the the disable setting hasn't been specified explicitly. Do we need to cover this(?) monocle_face

Is that a valid config? I mean, does containerd accept a config file without disable field set to either true/false?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, diff would be used I guess to check if restart is needed or not right? Would not it be better instead of having diff just to check if we actually need to touch the file or not? Actually for me these two options (1. what I have, 2. what you suggest) don't really make a much difference. One is doing diff before touching the file the other is after touching the file.

Sure, if the diff is empty don't update the file. I was thinking about (in pseudo'ish bash cod)

sed ... /etc/containerd/config.toml > config.tmp
if ! diff -q /etc/containerd/config.toml config.tmp; then 
  mv config.tmp /etc/containerd/config.toml
  systemctl restart containerd
fi

Plus, in either of these cases there's still the corner case that the[plugins."io.containerd.nri.v1.nri"] section is present but the the disable setting hasn't been specified explicitly. Do we need to cover this(?) monocle_face

Is that a valid config? I mean, does containerd accept a config file without disable field set to either true/false?

Yeah, it is. The config may be an empty file or just one line if at the extrema.

@fmuyassarov
Copy link
Collaborator Author

closing this in favor of #107.

@fmuyassarov fmuyassarov deleted the init-container branch October 23, 2023 14:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants