Behavioral change in rke util get-state-file #3668

Open
paddy-hack opened this issue Aug 30, 2024 · 1 comment

Comments

@paddy-hack

RKE version:

  • v1.5.10 (and v1.4.19, I presume)

Docker version: (docker version,docker info preferred)

  • cluster nodes: 19.03.15
  • host where rke is run: 27.1.2

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

  • cluster nodes: RancherOS v1.5.8
  • host where rke is run: Devuan GNU/Linux 6 (excalibur/ceres)

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO)

  • QEMU

cluster.yml file:

nodes:
    - address: 1.2.3.4
      user: rancher
      role:
        - controlplane
        - etcd
        - worker

Steps to Reproduce:

rke up --ssh-agent-auth
mv cluster.rkestate{,.bak}
mv kube_config_cluster.yml{,.bak}
rke util get-state-file --ssh-agent-auth

Results:

INFO[0000] Retrieving state file from cluster           
INFO[0000] Unable to connect to server using kubeconfig, trying to get state from Control Plane node(s), error: [state] Failed to create Kubernetes Client: stat ./kube_config_cluster.yml: no such file or directory 
INFO[0000] [dialer] Setup tunnel for host [1.2.3.4] 
INFO[0000] Image [rancher/hyperkube:v1.28.10-rancher1] exists on host [1.2.3.4] 
INFO[0000] Starting container [extract-statefile-configmap] on host [1.2.3.4], try #1 
INFO[0001] Successfully started [extract-statefile-configmap] container on host [1.2.3.4] 
INFO[0001] Waiting for [extract-statefile-configmap] container to exit on host [1.2.3.4] 
INFO[0001] Waiting for [extract-statefile-configmap] container to exit on host [1.2.3.4] 
INFO[0001] Container [extract-statefile-configmap] is still running on host [1.2.3.4]: stderr: [], stdout: [] 
INFO[0002] Removing container [extract-statefile-configmap] on host [1.2.3.4], try #1 
INFO[0002] [remove/extract-statefile-configmap] Successfully removed container on host [1.2.3.4] 
INFO[0002] Could not get ConfigMap with cluster state from host [1.2.3.4] 
FATA[0002] [state] Unable to get ConfigMap with cluster state from any Control Plane host 

With v1.5.9 the rke util get-state-file command succeeds.
The test results (per scenario 2) for the issue that adds the rke util commands indicate that the intent is for the command to succeed.

Some further testing shows that with v1.5.10 the command succeeds if kube_config_cluster.yml is present.

This behavioral change broke my CI/CD setup 😭

The change is introduced by the "fix" for CVE-2023-32191. The release notes mention it, but that did not ring a bell for me and I spent the morning figuring out what had happened 😫

I am thinking of provisioning my CI/CD jobs with a copy of kube_config_cluster.yml to make them work again.
Obviously, that file cannot be added to the git repository I use to maintain my clusters.
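
For anyone considering the same workaround, here is a rough sketch of how a CI job might restore kube_config_cluster.yml from a secret instead of from the repository. It is only an illustration under assumptions: the helper name restore_kubeconfig and the variable KUBE_CONFIG_CLUSTER_B64 (a CI secret holding the base64-encoded kubeconfig) are hypothetical, and how the secret gets into the job depends on the CI system.

```shell
#!/bin/sh
# Hypothetical helper: write kube_config_cluster.yml from a CI secret.
# KUBE_CONFIG_CLUSTER_B64 is an assumed variable name, not something rke defines.
restore_kubeconfig() {
    target="${1:-kube_config_cluster.yml}"
    if [ -z "${KUBE_CONFIG_CLUSTER_B64:-}" ]; then
        echo "KUBE_CONFIG_CLUSTER_B64 is not set" >&2
        return 1
    fi
    # Create the file with owner-only permissions, since it holds credentials.
    ( umask 077 && printf '%s' "$KUBE_CONFIG_CLUSTER_B64" | base64 -d > "$target" )
}

# Intended use in the CI job (not executed here):
#   restore_kubeconfig && rke util get-state-file --ssh-agent-auth
```

The secret itself could be produced once on a trusted machine with `base64 < kube_config_cluster.yml` and stored in the CI system's secret store.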

Submitting this in the hope it helps someone running into the same 🙇

@paddy-hack
Author

With v1.5.9 the rke util get-state-file command succeeds.

Clarification: This holds for a cluster deployed with v1.5.9. Using v1.5.9 against a cluster deployed with v1.5.10, rke util get-state-file fails. This is because the full-cluster-state ConfigMap that the command looks for is no longer present.
