ssh certificates have stopped working in v1.3.11 #2941

Open
stefanfritsch opened this issue May 23, 2022 · 12 comments

stefanfritsch commented May 23, 2022

I use SSH certificates to access nodes. This worked fine for years, up to at least v1.3.7, but it is broken with v1.3.11 (I haven't tried the versions in between):

Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain

The same node works if I add the key to authorized_keys

Steps to Reproduce:

  1. Set up ssh certificate login on a host.
  2. Create a cluster.yaml
  3. Try rke:
    1. With ssh: Works
    2. With v1.3.7: Works
    3. With v1.3.11: Doesn't work
    4. With v1.3.11 and the public key in authorized_keys: Works

Output

ssh

Login works

root@control-0 ~ # ssh stefan.fritsch@shin-11
Last login: Mon May 23 13:17:22 2022 from 159.69.91.228
stefan.fritsch@shin-11:~$ 

v1.3.7

Everything's fine

root@control-0 /decrypted/kubernetes # ./rke_linux-amd64-v1.3.7 etcd snapshot-save --name "etcd-manual-$(date +'%Y-%m-%d')" --config cluster.yml
INFO[0000] Running RKE version: v1.3.7 
INFO[0000] Starting saving snapshot on etcd hosts
INFO[0000] [dialer] Setup tunnel for host [shin-12.example.com]
INFO[0000] [dialer] Setup tunnel for host [shin-10.example.com]
INFO[0000] [dialer] Setup tunnel for host [shin-11.example.com]
INFO[0000] [state] Deploying state file to [/etc/kubernetes/etcd-manual-2022-05-23.rkestate] on host [shin-11.example.com]
INFO[0000] [state] Deploying state file to [/etc/kubernetes/etcd-manual-2022-05-23.rkestate] on host [shin-12.example.com]
INFO[0000] [state] Deploying state file to [/etc/kubernetes/etcd-manual-2022-05-23.rkestate] on host [shin-10.example.com]
INFO[0000] Image [rancher/rke-tools:v0.1.78] exists on host [shin-11.example.com]
INFO[0000] Image [rancher/rke-tools:v0.1.78] exists on host [shin-12.example.com]
INFO[0000] Image [rancher/rke-tools:v0.1.78] exists on host [shin-10.example.com]
INFO[0001] Starting container [cluster-state-deployer] on host [shin-10.example.com], try #1
INFO[0001] Starting container [cluster-state-deployer] on host [shin-12.example.com], try #1
INFO[0001] Starting container [cluster-state-deployer] on host [shin-11.example.com], try #1

v1.3.11

Nothing works

root@control-0 /decrypted/kubernetes # ./rke_linux-amd64-v1.3.11 etcd snapshot-save --name "etcd-manual-$(date +'%Y-%m-%d')" --config cluster.yml
INFO[0000] Running RKE version: v1.3.11
INFO[0000] Starting saving snapshot on etcd hosts
INFO[0000] [dialer] Setup tunnel for host [shin-11.example.com]
INFO[0000] [dialer] Setup tunnel for host [shin-10.example.com]
INFO[0000] [dialer] Setup tunnel for host [shin-12.example.com]
WARN[0000] Failed to set up SSH tunneling for host [shin-11.example.com]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [shin-11.example.com:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Failed to set up SSH tunneling for host [shin-12.example.com]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [shin-12.example.com:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Failed to set up SSH tunneling for host [shin-10.example.com]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [shin-10.example.com:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
WARN[0000] Removing host [shin-11.example.com] from node lists
WARN[0000] Removing host [shin-12.example.com] from node lists
WARN[0000] Removing host [shin-10.example.com] from node lists

v1.3.11 with the pubkey on one of the hosts

Note how the node with the key in authorized_keys now works

root@control-0 /decrypted/kubernetes # ./rke_linux-amd64-v1.3.11 etcd snapshot-save --name "etcd-manual-$(date +'%Y-%m-%d')" --config cluster.yml
INFO[0000] Running RKE version: v1.3.11                 
INFO[0000] Starting saving snapshot on etcd hosts       
INFO[0000] [dialer] Setup tunnel for host [shin-10.example.com] 
INFO[0000] [dialer] Setup tunnel for host [shin-12.example.com] 
INFO[0000] [dialer] Setup tunnel for host [shin-11.example.com] 
WARN[0000] Failed to set up SSH tunneling for host [shin-10.example.com]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [shin-10.example.com:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
WARN[0000] Failed to set up SSH tunneling for host [shin-12.example.com]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Unable to access node with address [shin-12.example.com:22] using SSH. Please check if you are able to SSH to the node using the specified SSH Private Key and if you have configured the correct SSH username. Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 
WARN[0000] Removing host [shin-10.example.com] from node lists 
WARN[0000] Removing host [shin-12.example.com] from node lists 

sshd

[...]
May 23 13:44:38 shin-10 sshd[3862349]: Accepted certificate ID "stefan.fritsch at 2022-05-23 11:04:34 user key valid for 10h" (serial 0) signed by RSA CA SHA256:<snip> via /etc/ssh/ssh_trusted_ca.pub
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_answer_keyallowed: publickey authentication: RSA-CERT key is allowed
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_send entering: type 23
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_sshkey_verify entering [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_send entering: type 24 [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_sshkey_verify: waiting for MONITOR_ANS_KEYVERIFY [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_receive_expect entering: type 25 [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_receive entering [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_receive entering
May 23 13:44:38 shin-10 sshd[3862349]: debug3: monitor_read: checking request 24
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_answer_keyverify: publickey 0x<snip> signature unverified: incorrect signature
May 23 13:44:38 shin-10 sshd[3862349]: debug1: auth_activate_options: setting new authentication options
May 23 13:44:38 shin-10 sshd[3862349]: debug3: mm_request_send entering: type 25
May 23 13:44:38 shin-10 sshd[3862349]: Failed publickey for stefan.fritsch from <ip> port 59546 ssh2: RSA-CERT SHA256:<snip> ID stefan.fritsch at 2022-05-23 11:04:34 user key valid for 10h (serial 0) CA RSA SHA256:<snip>
May 23 13:44:38 shin-10 sshd[3862349]: debug2: userauth_pubkey: authenticated 0 pkalg [email protected] [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: user_specific_delay: user specific delay 0.000ms [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: ensure_minimum_time_since: elapsed 0.951ms, delaying 5.775ms (requested 6.726ms) [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: userauth_finish: failure partial=0 next methods="publickey" [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: debug3: send packet: type 51 [preauth]
May 23 13:44:38 shin-10 sshd[3862349]: Connection closed by authenticating user stefan.fritsch <ip> port 59546 [preauth]
[...]

System info

RKE version: v1.3.11

Operating system and kernel: (cat /etc/os-release, uname -r preferred)

root@shin-11 /var/log # cat /etc/os-release
NAME="Ubuntu"
VERSION="20.04.4 LTS (Focal Fossa)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 20.04.4 LTS"
VERSION_ID="20.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=focal
UBUNTU_CODENAME=focal
root@shin-11 /var/log # uname -r
5.4.0-100-generic

Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO): bare-metal

cluster.yml file:

nodes:
    - address: shin-10.example.com
      internal_address: 192.168.2.20
      user: stefan.fritsch
      role: [controlplane,worker,etcd]
    - address: shin-11.example.com
      internal_address: 192.168.2.21
      user: stefan.fritsch
      role: [controlplane,worker,etcd]
    - address: shin-12.example.com
      internal_address: 192.168.2.22
      user: stefan.fritsch
      role: [controlplane,worker,etcd]

# Enable use of SSH agent to use SSH private keys with passphrase
# This requires the environment variable SSH_AUTH_SOCK configured pointing
# to your SSH agent which has the private key added
ssh_agent_auth: true
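
For context, `ssh_agent_auth: true` depends on a reachable SSH agent, which is conventionally located through the `SSH_AUTH_SOCK` environment variable. A minimal Go sketch of a pre-flight check might look like the following. The function name `agentSocketPath` and its injected `getenv` parameter are hypothetical, and this is not RKE's own code.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// agentSocketPath returns the SSH agent socket path taken from SSH_AUTH_SOCK.
// getenv is injected (a hypothetical design choice) so the lookup can be
// exercised without touching the real environment.
func agentSocketPath(getenv func(string) string) (string, error) {
	sock := getenv("SSH_AUTH_SOCK")
	if sock == "" {
		return "", errors.New("SSH_AUTH_SOCK is not set; no SSH agent to ask for keys")
	}
	return sock, nil
}

func main() {
	sock, err := agentSocketPath(os.Getenv)
	if err != nil {
		fmt.Println("agent check failed:", err)
		return
	}
	// Dial the Unix socket only to confirm that something is listening there.
	conn, err := net.DialTimeout("unix", sock, 2*time.Second)
	if err != nil {
		fmt.Println("agent socket not reachable:", err)
		return
	}
	defer conn.Close()
	fmt.Println("agent socket reachable at", sock)
}
```

A reachable socket only shows that an agent is running. It does not show that the agent holds the right key or certificate, which `ssh-add -l` can list.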

SURE-4777

@stefanfritsch changed the title from "ssh certificates have stopped working between v1.3.7 and v1.3.11" to "ssh certificates have stopped working in v1.3.11" on May 23, 2022
@stefanlasiewski

We ran into this as well, but only with Ubuntu 20.04 nodes not Ubuntu 18.04 nodes.

I'm using RKE v1.3.10.

@stefanfritsch
Author

@stefanlasiewski - It's interesting that the server side makes a difference. Since ssh from the command line works fine, it's clearly a client (rke) issue, but there must be some snafu with the accepted algorithms. I know that openssh for windows (the client, not the server) needs PubkeyAcceptedAlgorithms [email protected] in the ~/.ssh/config even if the certificates are rsa-sha2-256.

Could it be related to golang/go/issues/37278?

@Birddude1230

We are experiencing a similar issue. I can confirm that the root cause is a change in the crypto/ssh library: certificate-based login (with ssh-rsa certs) works fine for versions of crypto/ssh before commit 3147a52a75dd, but is broken after. As best I can tell, the issue is with client_auth.go, in the function pickSignatureAlgorithm. Previously, the library would fail to find a common certificate algo, and would attempt whatever your certificate was as a last-ditch effort. Now, it identifies certificate algos the server should support based on supported key exchange algos, which then include rsa-sha2-512, rsa-sha2-256, and ssh-rsa. This sounds like it shouldn't be an issue, since that includes the certificate I want to use, but it does decide on an rsa-sha2 algo when the cert is ssh-rsa. Why this breaks is not clear, since presumably an ssh-rsa cert can still sign using rsa-sha2.

So what makes this an RKE issue and not an ssh issue? I suspect, but do not know for certain, that this is a usage issue, mostly because that's the default assumption to make. However, x/crypto is (somewhat unbelievably) still in version 0, so it is deliberately advertising that it is not yet stable. I simply don't have the time to establish confidently where the issue truly lies, especially considering the apparent lack of documentation of x/crypto/ssh.
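
To make the before/after behaviour described in this comment concrete, here is a simplified Go model of the selection step. It is a sketch under assumptions, not the code of `pickSignatureAlgorithm` in x/crypto/ssh: the names `pickBefore`, `pickAfter`, and `rsaPreference` are hypothetical, and the server's algorithm list is passed in directly instead of being negotiated.

```go
package main

import "fmt"

// rsaPreference lists, in preference order, the signature algorithms an RSA
// key can produce: the two SHA-2 variants and the legacy SHA-1 "ssh-rsa".
var rsaPreference = []string{"rsa-sha2-512", "rsa-sha2-256", "ssh-rsa"}

// pickBefore models the older behaviour described above: fall back to the
// algorithm named by the key or certificate itself.
func pickBefore(keyAlgo string) string {
	return keyAlgo
}

// pickAfter models the newer behaviour: for RSA keys, choose the first
// preferred algorithm that the server is believed to support.
func pickAfter(keyAlgo string, serverAlgos []string) string {
	if keyAlgo != "ssh-rsa" {
		return keyAlgo // non-RSA keys have a single signature algorithm
	}
	supported := make(map[string]bool, len(serverAlgos))
	for _, a := range serverAlgos {
		supported[a] = true
	}
	for _, candidate := range rsaPreference {
		if supported[candidate] {
			return candidate
		}
	}
	return keyAlgo // nothing in common: last-ditch attempt with the key's own algorithm
}

func main() {
	server := []string{"rsa-sha2-512", "rsa-sha2-256", "ssh-rsa"}
	fmt.Println("before:", pickBefore("ssh-rsa"))
	fmt.Println("after: ", pickAfter("ssh-rsa", server))
}
```

In this model the same RSA key is signed with `ssh-rsa` before the change and with `rsa-sha2-512` after it. That shift would matter if the signer behind the key (for example an SSH agent) cannot produce SHA-2 signatures or labels them differently from what the server expects.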

@stefanlasiewski

stefanlasiewski commented Aug 2, 2022

I had luck switching from an RSA key to an ed25519 key (after talking to Rancher support). The upstream Go issue suggests that Go support for RSA keys is broken: golang/go#49952

Also, I notice this issue is discussing certs while my problem is with keys. However, I suspect the underlying cause is the same, and any non-RSA key should work.
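
To check which keys fall into the RSA category mentioned here, the type can be read from the first field of a `.pub` file line. A minimal Go sketch follows. The helper names `keyType` and `isRSA` are hypothetical, the input is assumed to be a plain public key line without `authorized_keys` options in front, and the base64 body in the example is a placeholder.

```go
package main

import (
	"fmt"
	"strings"
)

// keyType returns the algorithm field of an OpenSSH public key line
// ("<type> <base64> [comment]"), or "" if the line has no fields.
func keyType(pubKeyLine string) string {
	fields := strings.Fields(pubKeyLine)
	if len(fields) == 0 {
		return ""
	}
	return fields[0]
}

// isRSA reports whether the line is a plain RSA key or an RSA certificate.
func isRSA(pubKeyLine string) bool {
	t := keyType(pubKeyLine)
	return t == "ssh-rsa" || t == "ssh-rsa-cert-v01@openssh.com"
}

func main() {
	// "AAAA..." stands in for the real base64 key material.
	for _, line := range []string{
		"ssh-rsa AAAA... user@host",
		"ssh-ed25519 AAAA... user@host",
	} {
		fmt.Printf("%s rsa=%v\n", keyType(line), isRSA(line))
	}
}
```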

@github-actions
Contributor

github-actions bot commented Oct 1, 2022

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

@stefanlasiewski

@stefanfritsch @Birddude1230 With rke v1.3.14, SSH now works for me. Is it working for you also?

@stefanfritsch
Author

@stefanlasiewski Can't confirm for v1.3.15. With an ed25519 private key (ca-key is always rsa) I get:

WARN[0000] Failed to set up SSH tunneling for host [shin-11]: Can't retrieve Docker Info: error during connect: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.24/info": Failed to dial ssh using address [shin-11:22]: ssh: handshake failed: agent: unsupported algorithm "ssh-ed25519" 

with rsa:

Error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain 

In both cases the login with ssh shin-11 works just fine.

@stefanlasiewski

@stefanfritsch You know what, I was wrong. It's not working for me either.

@github-actions
Contributor

github-actions bot commented Jan 7, 2023

This repository uses an automated workflow to automatically label issues which have not had any activity (commit/comment/label) for 60 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the workflow can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the workflow will automatically close the issue in 14 days. Thank you for your contributions.

@stefanlasiewski

stefanlasiewski commented Jan 9, 2023

This issue is still happening. Posting a message to keep this issue open.

@snasovich
Collaborator

@stefanlasiewski @stefanfritsch, could you check whether adding the following settings to /etc/ssh/sshd_config resolves the issue for you:

AllowStreamLocalForwarding yes
DisableForwarding no

Context: #2907 (comment)

@stefanlasiewski

AllowStreamLocalForwarding yes
DisableForwarding no

This had no effect for me. Note that on Ubuntu 20.04, AllowStreamLocalForwarding yes is already the default according to the manpage. I believe that DisableForwarding no is also the default, but the manpage isn't clear.
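
When it is unclear whether a directive is set explicitly or left at its default, the config text can be scanned for the first occurrence, since sshd uses the first value it reads for each keyword. Below is a minimal Go sketch. The function name `effectiveValue` is hypothetical, and as a stated simplification it stops at the first `Match` block and ignores `Include` files and the `keyword=value` form.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// effectiveValue returns the first value given for keyword in the global
// section of an sshd_config text. Keyword names are case-insensitive.
// The bool is false when the keyword is absent, in which case sshd falls
// back to its built-in default.
func effectiveValue(config, keyword string) (string, bool) {
	scanner := bufio.NewScanner(strings.NewReader(config))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		fields := strings.Fields(line)
		if strings.EqualFold(fields[0], "Match") {
			break // conditional blocks are out of scope for this sketch
		}
		if len(fields) >= 2 && strings.EqualFold(fields[0], keyword) {
			return strings.Join(fields[1:], " "), true
		}
	}
	return "", false
}

func main() {
	// Example config text; a real check would read /etc/ssh/sshd_config.
	cfg := "# example\nAllowStreamLocalForwarding yes\nDisableForwarding no\n"
	for _, k := range []string{"AllowStreamLocalForwarding", "DisableForwarding", "PubkeyAcceptedAlgorithms"} {
		v, ok := effectiveValue(cfg, k)
		fmt.Printf("%s: value=%q explicit=%v\n", k, v, ok)
	}
}
```

On a live host, `sshd -T` prints the effective configuration including defaults, which is likely the more reliable check.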
