[Backport release-1.26] Update K3s for 2023-07 releases #4444

Closed
rancherbot opened this issue Jul 7, 2023 · 2 comments
@rancherbot
Collaborator

This is a backport issue for #4419, automatically created via rancherbot by @brandond

Original issue description:

RKE2 tracking issue for:

Both of these issues also affect RKE2 and will require an update of the k3s version to resolve.

@fmoral2
Contributor

fmoral2 commented Jul 18, 2023

## Validation of related issue: k3s-io/k3s#7797

Validated on Version:

    rke2 version v1.26.6+dev.b31fbe27 (b31fbe27e5cf0e168b518d662b03c6567e04e51a)

Environment Details

Infrastructure
Cloud EC2 instance

Node(s) CPU architecture, OS, and Version:
Ubuntu 22.04

Cluster Configuration:
1 node

Config yaml agent:

    token: {from server}
    server: {server ip}
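As a rough illustration of the agent config above, the following sketch renders the two keys into a file. The helper name `write_agent_config`, the output path argument, and the `https://<ip>:9345` form of the server URL are assumptions for illustration; the comment itself only shows the `{from server}` and `{server ip}` placeholders.

```shell
#!/bin/sh
# Render a minimal RKE2 agent config with the two keys shown in this comment.
# Assumptions (not from the original comment): the helper name, the output
# path argument, and the https://<ip>:9345 form of the server URL.
write_agent_config() {
    token="$1"      # token printed by `rke2 token create --ttl 24h` on the server
    server_ip="$2"  # address of the server node
    out="$3"        # file to write, e.g. /etc/rancher/rke2/config.yaml
    printf 'token: %s\nserver: https://%s:9345\n' "$token" "$server_ip" > "$out"
}
```

A call such as `write_agent_config "$TOKEN" "$SERVER_IP" ./config.yaml` would produce a file that can be reviewed before it is copied into place on the agent.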

Steps to validate the fix

  1. Generate a token with a 24h TTL on the server
  2. Join the agent node with the token
  3. Delete the agent node
  4. Start the agent node with the same token again
  5. Check the agent node status

Issue Reproduction:

$ rke2 token create --ttl 24h
{token}

~$ k get nodes

Ready    control-plane,etcd,master   3m35s   v1.26.6+rke2r1
Ready    <none>                      60s     v1.26.6+rke2r1

$ k delete node ip-
node "ip-" deleted

~$ k get nodes
NAME STATUS ROLES AGE VERSION
ip- Ready control-plane,etcd,master 4m31s v1.26.6+rke2r1

On agent node:
~$ sudo rm -rf /etc/rancher/node/
~$ sudo systemctl enable rke2-agent --now
~$ sudo systemctl restart rke2-agent

On agent:

~$ journalctl -xeu rke2-agent.service
Jul 17 16:42:10 ip-rke2[13677]: time="2023-07-17T16:42:10Z" level=info msg="Waiting to retrieve agent configuration; server is not ready: https://127.0.0.1:6444/v1-rke2/serving-kubelet.crt: 401 Unauthorized"

On server:
~$ journalctl -xeu rke2-server.service

    Jul 17 16:37:12 ip-rke2[22237]: time="2023-07-17T16:37:12Z" level=info msg="certificate CN=ip-172-31-25-162 signed by CN=rke2-server-ca@1689611624: notBefore=2023-07-17 16:33:44 +0000 UTC notAfter=2024-07-16 16:37:12 +0000 UTC"
    Jul 17 16:37:12 ip-rke2[22237]: time="2023-07-17T16:37:12Z" level=info msg="certificate CN=system:node:ip-172-31-25-162,O=system:nodes signed by CN=rke2-client-ca@1689611624: notBefore=2023-07-17 16:33:44 +0000 UTC notAfter=2024-07-16 16:3>
    Jul 17 16:37:31 ip-rke2[22237]: time="2023-07-17T16:37:31Z" level=info msg="Handling backend connection request [ip-"
    Jul 17 16:41:35 ip-1 rke2[22237]: time="2023-07-17T16:41:35Z" level=info msg="error in remotedialer server [400]: websocket: close 1006 (abnormal closure): unexpected EOF"
    Jul 17 16:41:35 ip-rke2[22237]: time="2023-07-17T16:41:35Z" level=error msg="Sending HTTP 401 response to :43658: unable to verify node identity: nodes \"ip-172-31-25-162\" not found"
    Jul 17 16:41:40 ip- rke2[22237]: time="2023-07-17T16:41:40Z" level=error msg="Sending HTTP 401 response to :43704: unable to verify node identity: nodes \"ip-172-31-25-162\" not found"
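To spot the failure signature in the server log without reading it line by line, a small filter along these lines could be used. It is only a sketch: the helper name `count_identity_rejections` is hypothetical, and it matches solely on the "unable to verify node identity" message text seen in the log lines above.

```shell
#!/bin/sh
# Count log lines on stdin that contain the node-identity rejection message.
# Assumption: the helper name is hypothetical; the matched text is copied from
# the server log lines in this comment.
count_identity_rejections() {
    # grep -c prints the match count; it exits non-zero on zero matches,
    # so `|| true` keeps the helper usable under `set -e`.
    grep -c 'unable to verify node identity' || true
}
```

For example, `journalctl -u rke2-server.service --no-pager | count_identity_rejections` should print a non-zero count while the issue reproduces, and `0` once the agent can rejoin.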

Validation Results:

~$ rke2 -v
    rke2 version v1.26.6+dev.b31fbe27 (b31fbe27e5cf0e168b518d662b03c6567e04e51a)


~$ rke2 token create --ttl 24h
 {token}


- Start agent with token provided from server

$ k get nodes
    NAME               STATUS   ROLES                       AGE     VERSION
    ip-    Ready    control-plane,etcd,master   9m16s   v1.26.6+rke2r1
    ip-  Ready    <none>                      45s     v1.26.6+rke2r1


 $ k delete node ip-
        node "ip" deleted


 ~$ k get nodes
        NAME              STATUS   ROLES                       AGE     VERSION
        ip-  Ready    control-plane,etcd,master   9m51s   v1.26.6+rke2r1


On agent:
        $ sudo rm -rf /etc/rancher/node/

Start agent with token provided from server again

~$ k get nodes
        NAME               STATUS   ROLES                       AGE   VERSION
        ip-   Ready    control-plane,etcd,master   14m   v1.26.6+rke2r1
        ip-  Ready    <none>                      44s   v1.26.6+rke2r1
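The final check can also be scripted. The sketch below reads `kubectl get nodes` output and succeeds only when a given node is listed as Ready; the helper name `node_is_ready` is hypothetical, and it assumes the default NAME STATUS ROLES AGE VERSION column layout shown above.

```shell
#!/bin/sh
# Succeed only if the named node appears with STATUS exactly "Ready" in
# `kubectl get nodes` output supplied on stdin.
# Assumptions: the helper name is hypothetical; columns follow the default
# NAME STATUS ROLES AGE VERSION layout. A status such as
# "Ready,SchedulingDisabled" is deliberately treated as not ready.
node_is_ready() {
    awk -v name="$1" '$1 == name && $2 == "Ready" { found = 1 } END { exit !found }'
}
```

Usage would look like `kubectl get nodes | node_is_ready <agent node name> && echo "agent rejoined"`.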








@fmoral2
Contributor

fmoral2 commented Jul 18, 2023

Validation of related issue: k3s-io/k3s#7774

For now, this one should be validated by passing the --server flag to the rotate-ca command, because of an issue in rke2 for which I will open a ticket.

Validated on Version:

    rke2 version v1.26.6+dev.b99382e9 (b99382e9c9391d01c44cab3f86a940187326552d)

Environment Details

Infrastructure
Cloud EC2 instance

Node(s) CPU architecture, OS, and Version:
Ubuntu 22.04

Cluster Configuration:
1 node cluster

Config.yaml:

write-kubeconfig-mode: 644
data-dir: /data/rke2

Steps to validate the fix

  1. Install rke2 with INSTALL_RKE2_AGENT_IMAGES_DIR pointing to the agent images directory under the new data-dir
  2. Generate new certs by running the script https://github.com/k3s-io/k3s/blob/master/contrib/util/generate-custom-ca-certs.sh with the product, data-dir, and path variables set for the new location:
    PRODUCT="${PRODUCT:-rke2}"
    DATA_DIR="${DATA_DIR:-/data/${PRODUCT}}"
    TEMP_DIR="${DATA_DIR}/server/rotate-ca"
  3. Rotate the certs with the rke2 certificate rotate-ca command
  4. Restart rke2 service
  5. Check the certs in the data-dir
  6. Check node and pods status
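Because the rotation step depends on getting three values consistent (the new CA path, the data-dir, and an explicit server URL), a dry-run helper that only prints the command can be handy for review. This is a sketch under assumptions: the helper name `rotate_ca_cmd` and its defaults are illustrative, while the flags are the ones used in this comment's rotate-ca invocation.

```shell
#!/bin/sh
# Print (do not run) the rotate-ca invocation for a custom data dir.
# Assumptions: the helper name and its default arguments are illustrative.
# The --path, --data-dir, and --server flags are the ones used in this comment.
rotate_ca_cmd() {
    data_dir="${1:-/data/rke2}"
    server_url="${2:-https://127.0.0.1:9345}"
    printf 'rke2 certificate rotate-ca --path=%s/server/rotate-ca --data-dir=%s --server %s\n' \
        "$data_dir" "$data_dir" "$server_url"
}
```

Running `rotate_ca_cmd` with no arguments prints the command for `/data/rke2`; the printed line can then be inspected and executed by hand.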

Validation results

Install:

    sudo curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_COMMIT=b99382e9c9391d01c44cab3f86a940187326552d INSTALL_RKE2_AGENT_IMAGES_DIR=/data/rke2/agent/images INSTALL_RKE2_TYPE=server sh -s
    wget https://github.com/k3s-io/k3s/raw/master/contrib/util/rotate-default-ca-certs.sh
    rke2 certificate rotate-ca --path=/data/rke2/server/rotate-ca --data-dir=/data/rke2 --server https://127.0.0.1:9345
    root@ip-172-31-7-8:~# sudo systemctl restart rke2
    root@ip-172-31-7-8:~# ls -l /data/rke2/server/tls/
    total 140
    -rw-r--r-- 1 root root 1177 Jul 18 19:21 client-admin.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-admin.key
    -rw-r--r-- 1 root root 1186 Jul 18 19:21 client-auth-proxy.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-auth-proxy.key
    -rw-r--r-- 1 root root  570 Jul 18 19:21 client-ca.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-ca.key
    -rw-r--r-- 1 root root  570 Jul 18 19:22 client-ca.nochain.crt
    -rw-r--r-- 1 root root 1169 Jul 18 19:21 client-controller.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-controller.key
    -rw-r--r-- 1 root root 1181 Jul 18 19:21 client-kube-apiserver.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-kube-apiserver.key
    -rw-r--r-- 1 root root 1149 Jul 18 19:21 client-kube-proxy.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-kube-proxy.key
    -rw------- 1 root root  227 Jul 18 19:21 client-kubelet.key
    -rw-r--r-- 1 root root 1165 Jul 18 19:21 client-rke2-cloud-controller.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-rke2-cloud-controller.key
    -rw-r--r-- 1 root root 1157 Jul 18 19:21 client-rke2-controller.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-rke2-controller.key
    -rw-r--r-- 1 root root 1153 Jul 18 19:21 client-scheduler.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-scheduler.key
    -rw-r--r-- 1 root root 1189 Jul 18 19:21 client-supervisor.crt
    -rw------- 1 root root  227 Jul 18 19:21 client-supervisor.key
    -rw-r--r-- 1 root root 3855 Jul 18 19:23 dynamic-cert.json
    drwxr-xr-x 2 root root 4096 Jul 18 19:21 etcd
    -rw-r--r-- 1 root root  595 Jul 18 19:21 request-header-ca.crt
    -rw------- 1 root root  227 Jul 18 19:21 request-header-ca.key
    -rw-r--r-- 1 root root  570 Jul 18 19:21 server-ca.crt
    -rw------- 1 root root  227 Jul 18 19:21 server-ca.key
    -rw-r--r-- 1 root root  570 Jul 18 19:22 server-ca.nochain.crt
    -rw------- 1 root root 1675 Jul 18 19:22 service.current.key
    -rw------- 1 root root 1675 Jul 18 19:21 service.key
    -rw-r--r-- 1 root root 1380 Jul 18 19:21 serving-kube-apiserver.crt
    -rw------- 1 root root  227 Jul 18 19:21 serving-kube-apiserver.key
    -rw------- 1 root root  227 Jul 18 19:21 serving-kubelet.key
    drwx------ 2 root root 4096 Jul 18 19:21 temporary-certs

$ rke2 certificate rotate-ca --help
NAME:
rke2 certificate rotate-ca - Write updated rke2 CA certificates to the datastore

USAGE:
rke2 certificate rotate-ca [command options] [arguments...]

OPTIONS:
--server value, -s value    (cluster) Server to connect to (default: "https://127.0.0.1:6443") [$RKE2_URL]
--data-dir value, -d value  (data) Folder to hold state (default: "/var/lib/rancher/rke2")
--path value                Path to directory containing new CA certificates
--force                     Force certificate replacement, even if consistency checks fail
 
$ kubectl get nodes,pods -A
NAME                   STATUS   ROLES                       AGE   VERSION
node/ip-172-31-18-51   Ready    control-plane,etcd,master   13m   v1.26.6+rke2r1

NAMESPACE     NAME                                                        READY   STATUS      RESTARTS   AGE
kube-system   pod/cloud-controller-manager-ip-172-31-18-51                1/1     Running     0          13m
kube-system   pod/etcd-ip-172-31-18-51                                    1/1     Running     0          13m
kube-system   pod/helm-install-rke2-canal-vq9rh                           0/1     Completed   0          13m
kube-system   pod/helm-install-rke2-coredns-f8qb5                         0/1     Completed   0          13m
kube-system   pod/helm-install-rke2-ingress-nginx-f69n9                   0/1     Completed   0          13m
kube-system   pod/helm-install-rke2-metrics-server-4jsw6                  0/1     Completed   0          13m
kube-system   pod/helm-install-rke2-snapshot-controller-crd-44qvb         0/1     Completed   0          13m
kube-system   pod/helm-install-rke2-snapshot-controller-vn59x             0/1     Completed   1          13m
kube-system   pod/helm-install-rke2-snapshot-validation-webhook-pdp46     0/1     Completed   0          13m
kube-system   pod/kube-apiserver-ip-172-31-18-51                          1/1     Running     0          13m
kube-system   pod/kube-controller-manager-ip-172-31-18-51                 1/1     Running     0          13m
kube-system   pod/kube-proxy-ip-172-31-18-51                              1/1     Running     0          13m
kube-system   pod/kube-scheduler-ip-172-31-18-51                          1/1     Running     0          13m
kube-system   pod/rke2-canal-vzhm4                                        2/2     Running     0          13m
kube-system   pod/rke2-coredns-rke2-coredns-7c98b7488c-6g7qw              1/1     Running     0          13m
kube-system   pod/rke2-coredns-rke2-coredns-autoscaler-65b5bfc754-7fgnk   1/1     Running     0          13m
kube-system   pod/rke2-ingress-nginx-controller-k9fx5                     1/1     Running     0          12m
kube-system   pod/rke2-metrics-server-5bf59cdccb-txbw9                    1/1     Running     0          12m
kube-system   pod/rke2-snapshot-controller-6f7bbb497d-fh9wf               1/1     Running     0          12m
kube-system   pod/rke2-snapshot-validation-webhook-5c499b5cdd-wr62k       1/1     Running     0          12m
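The pod check can be automated in the same spirit. The sketch below lists any pod whose STATUS is neither Running nor Completed; the helper name `unhealthy_pods` is hypothetical, and it expects pods-only output from `kubectl get pods -A` (NAMESPACE NAME READY STATUS RESTARTS AGE), not the combined `nodes,pods` listing.

```shell
#!/bin/sh
# Read `kubectl get pods -A` output on stdin and print namespace/name for
# every pod whose STATUS is neither Running nor Completed.
# Exit status is 0 only when no such pod is found.
# Assumptions: the helper name is hypothetical; input is pods-only output with
# the default NAMESPACE NAME READY STATUS RESTARTS AGE columns.
unhealthy_pods() {
    awk '$1 != "NAMESPACE" && NF >= 4 && $4 != "Running" && $4 != "Completed" {
             print $1 "/" $2
             bad = 1
         }
         END { exit bad + 0 }'
}
```

For instance, `kubectl get pods -A | unhealthy_pods` prints nothing and exits 0 for a listing like the one above, where every pod is Running or Completed.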

fmoral2 closed this as completed Jul 18, 2023