
Automate the removal of older directories from /var/lib/rancher/rke2/data/* #3902

Open
pdegave opened this issue Feb 14, 2023 · 8 comments


pdegave commented Feb 14, 2023

Environmental Info:
RKE2 Version: All versions of rke2

Node(s) CPU architecture, OS, and Version:
NA

Cluster Configuration:
NA

Describe the bug:

A dedicated version-specific directory is created under /var/lib/rancher/rke2/data/ on every rke2 Kubernetes upgrade, and these directories accumulate and consume disk space. To avoid running out of disk space, we currently have to delete the older directories manually (keeping the current version's directory).

It would be good to have automation in place so that the older directories are removed automatically, without manual intervention.

Steps To Reproduce:

  • Provision an RKE2 cluster, upgrade the Kubernetes version from Rancher, and list the directories under /var/lib/rancher/rke2/data.

e.g.:

/var/lib/rancher/rke2/data # du -ch --max-depth=1 .
270M ./v1.21.5-rke2r2-92b1ca9704d2
280M ./v1.21.7-rke2r2-68bf06fa2f17
280M ./v1.21.10-rke2r2-11db6b444b5d
306M ./v1.22.9-rke2r2-88ecb1441384
306M ./v1.22.11-rke2r1-ec82446b905f
309M ./v1.23.8-rke2r1-5856a144981f
309M ./v1.23.10-rke2r1-1d0bf7a90c4e
294M ./v1.24.7-rke2r1-bd84af53feb9
293M ./v1.24.9-rke2r2-154c18a3ccf5

Expected behavior:
Only the current version's directory (or at most the current and two previous versions' directories) should exist. Older directories should be removed automatically.

Actual behavior:
All version directories exist.

Additional context / logs:
N/A
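Until this is automated, the manual cleanup described above can be scripted. A minimal sketch, keeping only the N most recently modified version directories; the function name and parameters are illustrative, not an rke2 interface, and it does not account for directories still in use by running pods (see the discussion on this issue):

```shell
# Sketch: remove all but the N most recently modified version
# directories under an rke2 data dir. prune_rke2_data and its
# parameters are hypothetical names for illustration.
prune_rke2_data() {
    data_dir="$1"    # e.g. /var/lib/rancher/rke2/data
    keep="${2:-2}"   # number of newest version dirs to retain
    # List entries newest-first, skip the first $keep, remove the rest.
    ls -1t "$data_dir" | tail -n +"$((keep + 1))" | while read -r d; do
        rm -rf "$data_dir/$d"
    done
}
```

This retains directories by modification time rather than by parsing version strings, which is simpler but assumes newer versions were extracted more recently.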


brandond commented Apr 3, 2023

Still valid, but probably fairly low priority as I don't believe these directories occupy much space.

@derickdiaz

I'll take this on! I want to make sure my assumptions are correct.

The files in /var/lib/rancher/rke2/data are just YAML files, which look like they are used to deploy the default manifests. The folder is built when the rke2-server service starts. Is it safe to assume that directories for older versions of rke2 are not needed, since they would be rebuilt on a downgrade/upgrade when the service starts? If so, could we just clear the directory at the end of the install.sh script?


brandond commented Nov 22, 2023

> The files in /var/lib/rancher/rke2/data are just YAML files, which look like they are used to deploy the default manifests.

No. The data dir does contain the charts to be deployed, but it also contains binaries that are needed to bootstrap the kubelet and container runtime, along with a few CLI tools:

root@rke2-server-1:/# ls -la /var/lib/rancher/rke2/data/v1.28.3-dev.2ea41d30-933ac7f7276f/bin/
total 325600
drwxr-xr-x 2 root root      4096 Nov 22 20:25 .
drwxr-xr-x 4 root root      4096 Nov 22 20:25 ..
-rwxr-xr-x 1 root root  60686856 Nov 22 20:25 containerd
-rwxr-xr-x 1 root root   8992552 Nov 22 20:25 containerd-shim
-rwxr-xr-x 1 root root  10653736 Nov 22 20:25 containerd-shim-runc-v1
-rwxr-xr-x 1 root root  14090784 Nov 22 20:25 containerd-shim-runc-v2
-rwxr-xr-x 1 root root  38483024 Nov 22 20:25 crictl
-rwxr-xr-x 1 root root  21035200 Nov 22 20:25 ctr
-rwxr-xr-x 1 root root  54670640 Nov 22 20:25 kubectl
-rwxr-xr-x 1 root root 112940560 Nov 22 20:25 kubelet
-rwxr-xr-x 1 root root  11811456 Nov 22 20:25 runc

> The folder is built when the rke2-server service starts.

It is extracted from the rancher/rke2-runtime image whenever rke2 starts, if the directories are missing.

> Is it safe to assume that directories for older versions of rke2 are not needed, since they would be rebuilt on a downgrade/upgrade when the service starts? If so, could we just clear the directory at the end of the install.sh script?

My biggest concern would be with the containerd shims that are still using the old data dir. Remember that pods continue running even when rke2's containerd is stopped, so any pods that were created by the previous version will still use the shim from the previous version:

root@rke2-server-1:/# ps auxfww
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
...
root        3296  0.0  0.3 726092 15080 pts/0    Sl   20:26   0:00 /var/lib/rancher/rke2/data/v1.28.3-dev.2ea41d30-933ac7f7276f/bin/containerd-shim-runc-v2 -namespace k8s.io -id 3c84ef9bd38e44f5114e8091525fbcd172b96934db650cb557ce72c49659aa22 -address /run/k3s/containerd/containerd.sock -debug
65535       3315  0.0  0.0    972   512 ?        Ss   20:26   0:00  \_ /pause
root        4765  0.0  1.2 769580 51328 ?        Ssl  20:26   0:00  \_ /coredns -conf /etc/coredns/Corefile

root@rke2-server-1:/# xargs -n1 -0 -a /proc/3296/environ echo
PATH=/var/lib/rancher/rke2/agent/containerd/bin:/var/lib/rancher/rke2/data/v1.28.3-dev.2ea41d30-933ac7f7276f/bin:/var/lib/rancher/rke2/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=rke2-server-1
TERM=xterm
KUBECONFIG=/etc/rancher/rke2/rke2.yaml
CONTAINER_RUNTIME_ENDPOINT=unix:///run/k3s/containerd/containerd.sock
HOME=/root
RES_OPTIONS=
_K3S_LOG_REEXEC_=true
NO_PROXY=.svc,.cluster.local,10.42.0.0/16,10.43.0.0/16
NODE_NAME=rke2-server-1
LD_LIBRARY_PATH=/var/lib/rancher/rke2/agent/containerd/lib:
MAX_SHIM_VERSION=2
TTRPC_ADDRESS=/run/k3s/containerd/containerd.sock.ttrpc
GRPC_ADDRESS=/run/k3s/containerd/containerd.sock
NAMESPACE=k8s.io
GOMAXPROCS=4

The container will continue to exist with that path as long as the pod is running. If you clean up the data dir, any commands that use runc will fail:

root@rke2-server-1:/# kubectl exec -it -n kube-system rke2-coredns-rke2-coredns-6b795db654-x5hmv -- /bin/sh
error: Internal error occurred: error executing command in container: failed to exec in container: failed to start exec "7a459afe398165168b5e1a3b1073fcdc7e5b8990629f44e607149af5009420c4": OCI runtime exec failed: exec: "runc": executable file not found in $PATH: <nil>: unknown

In practice, this means that data dirs cannot be cleaned up until any pods that were started by that version of RKE2 have been recreated.
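The "still in use" check described above can be derived from the shim command lines in the process table, since each shim's path embeds its version directory. A sketch, assuming `ps -eo args=` output is piped in; the function name is illustrative and the parsing is an assumption, not rke2's actual logic:

```shell
# Sketch: extract the rke2 data-dir versions referenced by containerd
# shim processes, given process command lines on stdin. The path
# pattern matches the shim paths shown in the ps output above.
versions_in_use() {
    grep -o '/var/lib/rancher/rke2/data/[^/]*/bin/containerd-shim' \
        | awk -F/ '{print $(NF-2)}' \
        | sort -u
}

# Typical usage on a live node:
#   ps -eo args= | versions_in_use
```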


derickdiaz commented Nov 23, 2023

What if, when the rke2-server service starts, it identified the versions still referenced by running containers (as you did with the ps auxfww command) and removed only the directories for versions that are no longer referenced? When the machine is restarted, the containers are recreated against the currently running version of rke2, so the folder would be cleaned up over time.

For example, adding an ExecStartPost= to the service file that contains the logic to perform this action only when the rke2 service starts successfully. Also, I'm only looking at this from the server node's perspective; I assume the rke2-agent service would also need to be updated.


brandond commented Nov 28, 2023

The core issue here, pods running with old binaries, affects the agent (kubelet+containerd) portion of the code; however, servers also run an agent, so it effectively affects both node types.


derickdiaz commented Nov 29, 2023

Right, so the change I'm proposing would have to be made to both the rke2-server and rke2-agent service files. For example, if I wanted to identify which of the version folders are in use, I could do this:
[screenshot of the proposed command and its output]

With this list, I could compare the output against what exists in the data directory and perform the removal.
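That comparison step could look something like the following sketch; the function and file names are illustrative, and a real implementation would also need to always keep the currently running version:

```shell
# Sketch: delete version directories that are not listed in an
# "in use" file (one version string per line). Names are illustrative.
remove_unused_versions() {
    data_dir="$1"
    in_use_file="$2"
    all=$(mktemp); inuse=$(mktemp)
    ls -1 "$data_dir" | sort > "$all"
    sort -u "$in_use_file" > "$inuse"
    # comm -23 prints versions on disk that are absent from the in-use list
    comm -23 "$all" "$inuse" | while read -r v; do
        rm -rf "$data_dir/$v"
    done
    rm -f "$all" "$inuse"
}
```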


brandond commented Nov 29, 2023

I would probably want to handle this in golang as part of the agent startup code, not in the systemd unit - just to avoid embedding too much more cruft in there.

@derickdiaz

@brandond I started a draft pull request showing how I believe this could be implemented in the Go code. Let me know if I'm heading in the right direction. #5139
