Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoC: Generate SPIFFE identities in csi-driver #251

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

SgtCoDFish
Copy link
Member

@SgtCoDFish SgtCoDFish commented May 7, 2024

This is intended as a quick demonstration and as a point of discussion.

⚠️ This isn't official, isn't on the roadmap and there's no guarantee that anything here will ever be used in any format.

This is intended as a starting point following discussions between maintainers and in the cert-manager daily standup.

That said: I'm seriously wondering if this sort of approach could replace the entire csi-driver-spiffe project and remove that maintainence burden, with a net-positive in security for everyone.

Differences from csi-driver-spiffe

If tokenRequests were unilaterally enabled for csi-driver we could do the token song-and-dance that csi-driver-spiffe does.

But I don't think that approach is particularly helpful. The csi.storage.k8s.io/serviceAccount.name and csi.storage.k8s.io/pod.namespace used in this PR (which are provided by the k8s API) are sufficient without requiring a token, and they're just as trustworthy because they come from the k8s API. There's no cryptographic validation - but even if we did cryptographic validation, that's just us checking the k8s API. An evil k8s API could presumably defeat cryptographic validation.

There's also a security benefit in that pods don't need permission to create CertificateRequests, which means a compromised pod down the line won't be able to request certs. Approval (approver-policy) helps with rogue pods but doesn't fully solve the issue - better to avoid the risk entirely.

What's Missing From This PR

  1. Tests, obviously
  2. A configurable CA for SPIFFE. csi-driver requires pods to specify their issuer in volume context - maybe we could allow configuration of a global SPIFFE CA so that pods requesting SPIFFE certs don't need to specify an issuer. That would also allow a mapping of issuing CAs to trust domains.
  3. The csi-driver-spiffe carotation logic. I wouldn't reimplement that - I'd instead use trust-manager.
  4. The above mentioned tokenRequest flow. As mentioned above, I wouldn't want to use it personally.

Example

Example of this working:

apiVersion: v1
kind: Secret
metadata:
  name: annotations-87xtt
  namespace: cert-manager
type: Opaque
data:
  tls.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQ0RENDQXNpZ0F3SUJBZ0lKQUp6VFJPSW5tRGtRTUEwR0NTcUdTSWIzRFFFQkN3VUFNRk14Q3pBSkJnTlYKQkFZVEFsVkxNUXN3Q1FZRFZRUUlFd0pPUVRFVk1CTUdBMVVFQ2hNTVkyVnlkQzF0WVc1aFoyVnlNU0F3SGdZRApWUVFERXhkalpYSjBMVzFoYm1GblpYSWdkR1Z6ZEdsdVp5QkRRVEFlRncweE56QTVNVEF4T0RNek5ETmFGdzB5Ck56QTVNRGd4T0RNek5ETmFNRk14Q3pBSkJnTlZCQVlUQWxWTE1Rc3dDUVlEVlFRSUV3Sk9RVEVWTUJNR0ExVUUKQ2hNTVkyVnlkQzF0WVc1aFoyVnlNU0F3SGdZRFZRUURFeGRqWlhKMExXMWhibUZuWlhJZ2RHVnpkR2x1WnlCRApRVENDQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFNK1EyQU80aEFSYXYwcXdqazdJCjRtRWg1UjIwMUhTOHM3SHBhTE9YQk52dmg3cUo5eUp6NmpMcVlnNkV2UDBLL2JLNTZDcDJvZTJpZ2Q3R094cFYKM1lQT2MzQ0cwQ0NxSE1wckVjdnhqMnhCS1gwMFJ0Y240b1ZMaERQaEFiMEJWL1I3TkZMZVd4emgrZ2d2UEkxWAptMXFMYVdZcVlaRUo1YkJzWVhEM3RQZFM0R0dJTlJ6OFp2aWg0NmYwWjJ3VmtDR29UcHNiWDhITzc0c2EyRGF5ClVqekFzV0dsTzViWkdpTVNIakRFbmY5eWVrMlRjakV5Vm9vaG9PTGFRZy9uZzIxVDVSV3plWktUbDFjem53dUcKVnI5dFpmSEZxeFE1cWVhSWQrMUlDdHhOdmtFamJUblpsNld5OUN0aG4wZHh3T2VTNVRxTUo3U0ZOWHkxZ3A0agpmL01DQXdFQUFhT0J0akNCc3pBZEJnTlZIUTRFRmdRVUJ0cmp2V2Zia0xBMGlYNnNLVlJoS1VvODY0a3dnWU1HCkExVWRJd1I4TUhxQUZBYmE0NzFuMjVDd05JbCtyQ2xVWVNsS1BPdUpvVmVrVlRCVE1Rc3dDUVlEVlFRR0V3SlYKU3pFTE1Ba0dBMVVFQ0JNQ1RrRXhGVEFUQmdOVkJBb1RER05sY25RdGJXRnVZV2RsY2pFZ01CNEdBMVVFQXhNWApZMlZ5ZEMxdFlXNWhaMlZ5SUhSbGMzUnBibWNnUTBHQ0NRQ2MwMFRpSjVnNUVEQU1CZ05WSFJNRUJUQURBUUgvCk1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQkFRQ1IralhodXA1dENLd2hBZjh4Z3ZwNTg5QmN6UU9qbW90dVpHRUwKRGNpbnQyeTI2M0NoRWRzb0xoeUpmdkZDQVpmVFNtK1VUOTVIbCtaS1Z1b1ZFY0FTN3VkYUZVRnBDL2dJWVZPaQpINC91dkpwczRTcFZDQjcrVC9vcmNUaloyZXdUMjNtUUFRZytCK2l3WDlWQ29mK2ZhZGtZT2cxWEQ5L2VhajZFCjlNY1hJRDNpdUNYZzAyUm1FT3dWTXJUZ2dIUHdIck9HQWlsU2FaYzU4Y0paSG1NWWxUNXJHckpjV1MvQXlYbkgKVk9vZEtDMDA0eWpoN3c5YVNiQ0NiQUwwdERFbmhtNEpyYjhjeHQ3cERXYmRFVlVldWs5TFpSUXRsdVlCbm1KVQprUTdBTGZVZlVoL1JVcENWNHVJNnNFSTNORFgyWXFRYk90c0JEL2hOYUwxRjg1RkEKLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQ==
  tls.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBejVEWUE3aUVCRnEvU3JDT1RzamlZU0hsSGJUVWRMeXpzZWxvczVjRTIrK0h1b24zCkluUHFNdXBpRG9TOC9Rcjlzcm5vS25haDdhS0Izc1k3R2xYZGc4NXpjSWJRSUtvY3ltc1J5L0dQYkVFcGZUUkcKMXlmaWhVdUVNK0VCdlFGWDlIczBVdDViSE9INkNDODhqVmViV290cFppcGhrUW5sc0d4aGNQZTA5MUxnWVlnMQpIUHhtK0tIanAvUm5iQldRSWFoT214dGZ3Yzd2aXhyWU5ySlNQTUN4WWFVN2x0a2FJeEllTU1TZC8zSjZUWk55Ck1USldpaUdnNHRwQ0QrZURiVlBsRmJONWtwT1hWek9mQzRaV3YyMWw4Y1dyRkRtcDVvaDM3VWdLM0UyK1FTTnQKT2RtWHBiTDBLMkdmUjNIQTU1TGxPb3dudElVMWZMV0NuaU4vOHdJREFRQUJBb0lCQVFDWXZHdklLU0cwRnBiRwp2aTZwbUxiRVpPMjBzMWpXNGZpVXhUMlBVV1I0OXNSNHBvY2RhaEIvRU92QTVUb3dOY05EbmZ0U0srT3grcS80Ckh3Umt0NlIrRmcvcVVMbWNIN0Y1M2RuRnFlWXc4YTQyL0ozWU92Zzd2N3J6ZGZJU2c0ZVdWb2JGSit3QnorTnQKM0Z5QllXTG0rTWxCTFpTSDVyR0c1ZW01OS96Sk5IV0loSCtvUVBmQ3hBa1lFdmQ4dFhPVFV6amhxdkVmamFKeQpGWmdoblQ5eHRvNE13RGROQ1BidHpkTmpUTWhpdjBBSGtjWkdHdFJKZmtlaFhYMnFoWE9RMlV6ek85WHJNWm52CjVLZ1lmK2JYS0pzeVMzU1BsNlRUbDd2ZzJnS0JjaVJ2c2RGaE15NUk1R3lJQURyRURKbk5ObVhRUnRpYUZMZmQKay9hcWZQVDVBb0dCQVBxdU1vdVpVYlZTL1FoK3FibHM3RzR6QXV6bmZDaXFkY3RjS21VR1BSUDRzVFRqV2RVcApmakkrVVR0MWU4aG5jbXI0Ulk3T2E5a1VWL2tEd3pTNXNwVVpaK3UwUGN6UzNYS3hPd05PbGVvSDAwZGZjOXZ0CmN4Y3RIZFBkRFRuZFJpOFo0azNtOTMxaklYN2pCL1B5eDhxZU5ZQjNwajBrM1Roa3R3TWJBVkxuQW9HQkFOUDQKYmVJNXpwYnZ0QWRFeEpjdXh4Mm1SREdGMGxJZEtDMGJ2UWFlcU0zTHdxbm1jMEZ6MWRiUDdLWERhK1NkSldQZApyZXMrTkhQWm9FUGVFSnVEVFNuZ1hPTE5FQ1plNEphOWZybjFUZVk4NTh2TUpCd0lreWM4enUrc2dYeGpRVU0rClRXVWxUVWh0WHl5YmtSbnhBRW55NE9UMlRUZ21YSVRKYUtPbVYxVVZBb0dBSGFYU2xvNFlpdEI0MnJOWVVYVGYKZFowVTRIMzBRajcrMVlGZUJqcTVxSTRHTDFJZ1FzUzRoeXExb3NtZlRURm01OTNiSkN1bnQ3SGZRYlUvTmhJcwpXOVA0WlhrWXdndkNZeGt3K0pBbnpOa0dGTy9tSFFHMVZlMWhGTGlWSXQzWHVpUmVqb1lkaVRmYk0wMlltREtECmpLUXZnYlVrOVNCU0JhUnJ2TE5KOGNzQ2dZQVluclpFbkdvK1pjRUhSeGwrWmRTQ3dSa1NsM1NDVFJpcGhKdEQKOVpHdHRZajZxdVdnS0pBaHp5eXhaQzFYOUZpdmJNUVNtcnNFNmJZUHErOUo0TXBKbnVHckJoNW1Gb2NIZXlNSQovbEQ1K1FFRFRzYXk2dHdNcHFkeWR4cmpFN1EwMXp1dUQ5TVdJbjMzZEdvNkZSL3ZkdUpnTmF0cVppcEEwaFB4ClRoUytzUUtCZ1FEaDArY1ZvMW1mWWlDa3AzSVFQQjhRWWlKL2cyL1VCazZwSDhaWkRaK0E1dGQ2TnZlaVdPMXkKd1RFVVdrWDJxeXo5U0x4V0RHT2hkS3F4TnJMQ1VTWVNPVi81L0pRRXRCbTZLNTBBckZ0clk0MEpQL1QvNUt2TQp0U0syYXlGWDF3UTNQdUVtZXdBb2d5LzIwdFdvODBjcjU1NkFYQTYyVXRsMlB6TEszMERiOHc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQ==

---

apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: ca-clusterissuer-28wcs
spec:
  ca:
    secretName: annotations-87xtt

---

apiVersion: v1
kind: Pod
metadata:
  name: my-csi-app
  namespace: sandbox
  labels:
    app: my-csi-app
spec:
  containers:
    - name: my-frontend
      image: busybox
      volumeMounts:
      - mountPath: "/tls"
        name: tls
      command: [ "sleep", "1000000" ]
  volumes:
    - name: tls
      csi:
        driver: csi.cert-manager.io
        readOnly: true
        volumeAttributes:
              csi.cert-manager.io/issuer-name: ca-clusterissuer-28wcs
              csi.cert-manager.io/issuer-kind: ClusterIssuer
              csi.cert-manager.io/dns-names: ${POD_NAME}.${POD_NAMESPACE}.svc.cluster.local
              csi.cert-manager.io/inject-spiffe: yes please

Example resulting cert:

Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number: 58:ee:e9:64:38:4a:3b:8a:c5:d2:fa:92:06:e0:65:b1
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=UK, ST=NA, O=cert-manager, CN=cert-manager testing CA
        Validity
            Not Before: May  7 17:30:01 2024 GMT
            Not After : Aug  5 17:30:01 2024 GMT
        Subject: 
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
            ...
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Authority Key Identifier: 
                06:DA:E3:BD:67:DB:90:B0:34:89:7E:AC:29:54:61:29:4A:3C:EB:89
            X509v3 Subject Alternative Name: critical
                DNS:my-csi-app.sandbox.svc.cluster.local, URI:spiffe://example.com/ns/sandbox/sa/default
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value: ...

This is inteded as a quick demonstration of a potential replacement for
csi-driver-spiffe. This isn't official, isn't on the roadmap and there's
no guarantee that anything here will ever be used in any format.

This is intended as a point of discussion following recent discussions
between maintainers and in the cert-manager daily standup

Signed-off-by: Ashley Davis <[email protected]>
@cert-manager-prow
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@cert-manager-prow cert-manager-prow bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. labels May 7, 2024
@cert-manager-prow
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from sgtcodfish. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cert-manager-prow cert-manager-prow bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 7, 2024
@munnerz
Copy link
Member

munnerz commented May 8, 2024

But I don't think that approach is particularly helpful. The csi.storage.k8s.io/serviceAccount.name and csi.storage.k8s.io/pod.namespace used in this PR (which are provided by the k8s API) are sufficient without requiring a token, and they're just as trustworthy because they come from the k8s API. There's no cryptographic validation - but even if we did cryptographic validation, that's just us checking the k8s API. An evil k8s API could presumably defeat cryptographic validation.

Typically the reason you'd pass along an audience scoped token is to present it to an issuer to attest that:

  1. the kube-apiserver (rather than kubelet, which is far less trusted) attests that the given bound Pod exists (including with the specific UID, ie two pods with the same name/namespace are not equal).

  2. the entity making the request had permission to request a token for the given ServiceAccount.

  3. (by extension) the entity making the request has permission to run on the Node encoded into the service account token (with https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/4193-bound-service-account-token-improvements)

The threat model with scoped access tokens isn't to protect against an evil kubernetes API (as you say, the API already has access to signing keys). It's to protect against a single kubelet, or csi-driver instance, being breached and therefore permitted to request a certificate for any Pod, even those not assigned to it.

@munnerz
Copy link
Member

munnerz commented May 8, 2024

Where I've said "pass along to an issuer", I should note that also includes just utilising it to make the CREATE request due to our mutating webhook defaulting UserInfo fields to those contained in that token.

This requires issuers only need to trust the apiserver to be correct, not kubelet's (within the boundaries of configured maximum token lifetime).

@SgtCoDFish
Copy link
Member Author

Do you know of some sort of kubernetes threat model out there which asserts that kubelets are "far less trusted"? Would be interested to read more in depth on that.

Where I've said "pass along to an issuer", I should note that also includes just utilising it to make the CREATE request due to our mutating webhook defaulting UserInfo fields to those contained in that token.

This is really the crux of it for me. I'm kinda fine with doing the token song and dance [1] in the csi-driver if that sparks joy for whatever reason - it feels like it's probably either mostly theatre or else it's suboptimal API design of csi-drivers generally. At the end of the day it's likely to be defence in depth and that's no bad thing if the cost is low.

I'm really not convinced though that giving every pod permission to create CertificateRequests is a desirable thing though. I think the risk from that is scarily high and I don't think the tools that are available are sufficient to mitigate that risk.

I'm much more comfortable with a threat model saying "you have to trust csi-driver to do the right thing and not be evil or pwned" (and the easier UX that comes with that) than I am with a threat model saying "you must enable approval and get all of your policies exactly right".

[1]: Effectively this but with actually validating the token given that I'd propose to not use the pod's token to request CRs

@cert-manager-prow
Copy link
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@cert-manager-prow cert-manager-prow bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants