-
Notifications
You must be signed in to change notification settings - Fork 71
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Provide deterministic bundle #310
Comments
Cross link to: #303 (comment) This means if a Bundle A looks like:
and bundle B looks like this
it should always produce the same content. Also if we move one CA from ca1 -> ca2, the result should be the same. |
@SgtCoDFish thoughts on potentially get this done? It sounds to me we can just do alphabetic order during producing final bundle which should be good enough. |
I don't currently have the bandwidth to implement this, but I'd be happy to review a PR which does it! My 2c would be to hash the DER-encoded certs and then order them alphanumerically based on the hex-encoded hash |
not sure if anyone would be interested to pick it up but this will be a critical feature. Reason is our component is mounting the trust bundle configmap and the other automation is reconciling the bundle. But if the bundle data keep reordering it will be very expensive and unnecessary. |
But the order is consistent now, and that's good/required. Are you planning to shuffle the sources around @Jiawei0227? I don't say this shouldn't be fixed, but I don't consider it critical. 😸 |
Hi, we have a
We also observe that the order of certificates changes with every reconciliation, leading to many unnecessary updates of the generated configmaps. While the output for |
@erikgb This caused three near misses for us. In two cases it caused etcd to run out of space within a short timespan (A). In two cases (one case had both) it caused etcd to use so much memory that out masters went OOM (B). We use trust-manager to inject three configmaps (full CA certs) into each namespace. This happened in fairly small clusters (<30 namespaces). You can reproduce it by simply restarting trust-manager a few times. It will recreate those configmaps, etcd will grow by 1-2 GB and memory on the master will rise by about 3-6 GB. Personally, I would consider this a critical issue. Currently, we have to massively overprovision our masters. On Azure this prevents you from using the smallest Kubernetes tier at all as it will kill the API. On AWS we had to roughly double our costs. |
I agree this is a serious issue, but probably relatively easy to fix. Any watchers that would like to try a PR to fix this? |
We investigated this a bit more. It turns out that our issue is caused by non-deterministic ordering in label selectors. This is how our bundle looks like:
If we replace our selector with a static list this becomes less of an issue. With the selector the content seems to change every time.
I will have a look if I can reproduce this in a test. |
What caused that issue? |
I see two different issues here:
I suggest that we don't conflate the feature request with the bug. I propose that the current issue (#310) keeps track of @Jiawei0227's feature request. @jan-kantert @sebEg can you create a separate issue for the bug you found? |
I fixed both cases in in #380. Both issues have the same underlying cause. The second case just triggers the issue far more frequently since kubernetes will not order configmaps when loaded via a lebel selector. We could also sort those configmaps. That would fix the second issue (unless you rename configmaps). However, just ordering the certs will fix both issues at once. I can also add a test for the second case but it will be fixed as well. |
The generated trust bundle from trust-manager should provide deterministic bundle in the sense that if the source of the CA order changes, or there are multiple CA sources from multiple secret and they are shuffled in the spec. The generated final trust bundle should always be the same with the same ordering. Mayby by alphabetic order or expiration time order or something.
The text was updated successfully, but these errors were encountered: