ETCD-573: add recert cmd #1227

Open
wants to merge 1 commit into
base: master

@tjungblu tjungblu commented Mar 14, 2024

You can run it with:

cluster-etcd-operator recert -o asset-out --hips master-1=192.168.2.1,master-2=192.168.2.2,master-3=192.168.2.3
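
For readers unfamiliar with the `--hips` value format shown above, here is a minimal sketch of how a comma-separated list of `name=IP` pairs could be parsed. The function name `parseHostIPs` and its validation rules are assumptions for illustration only; the PR's actual flag handling may differ.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// parseHostIPs parses a comma-separated list of name=IP pairs, such as
// "master-1=192.168.2.1,master-2=192.168.2.2", into a map from node name
// to IP. Hypothetical helper for illustration; not taken from the PR.
func parseHostIPs(s string) (map[string]net.IP, error) {
	out := make(map[string]net.IP)
	for _, pair := range strings.Split(s, ",") {
		name, addr, ok := strings.Cut(strings.TrimSpace(pair), "=")
		if !ok || name == "" {
			return nil, fmt.Errorf("invalid entry %q, expected name=ip", pair)
		}
		ip := net.ParseIP(addr)
		if ip == nil {
			return nil, fmt.Errorf("invalid IP %q for host %q", addr, name)
		}
		if _, dup := out[name]; dup {
			return nil, fmt.Errorf("duplicate host %q", name)
		}
		out[name] = ip
	}
	return out, nil
}

func main() {
	hips, err := parseHostIPs("master-1=192.168.2.1,master-2=192.168.2.2,master-3=192.168.2.3")
	if err != nil {
		panic(err)
	}
	for name, ip := range hips {
		fmt.Println(name, ip)
	}
}
```

Rejecting duplicates and unparseable IPs up front is a design choice in this sketch, so that a typo fails before any certificates are written.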

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 14, 2024
@openshift-ci openshift-ci bot requested review from dusk125 and Elbehery March 14, 2024 15:22
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 14, 2024

openshift-ci bot commented Mar 14, 2024

@tjungblu: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-etcd-scaling | a8c8457 | link | true | /test e2e-aws-ovn-etcd-scaling |
| ci/prow/e2e-operator-fips | a8c8457 | link | true | /test e2e-operator-fips |
| ci/prow/e2e-gcp-qe-no-capabilities | a8c8457 | link | false | /test e2e-gcp-qe-no-capabilities |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

dusk125 commented Mar 15, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 15, 2024

openshift-ci bot commented Mar 15, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dusk125, tjungblu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hasbro17

/hold

Haven't forgotten about this but still reviewing.

Not blocking this PR, but I wanted to think ahead about how we actually want to run this cmd automatically once we detect that etcd is down with expired certs. That may affect how we generate them here.

First is the detection of expired certs. I'm thinking this would be a health check or polling probe that can either query etcd locally and look for an "x509: certificate has expired or is not yet valid" error, or just inspect the on-disk cert to check its expiry date.
If this is a sidecar in the operator, then we may not have sufficient hostPath permissions to do either of those, right? And it can't be in the etcd pod, as we need to run this from a single place.
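
To make the on-disk option concrete, here is a minimal sketch of checking a PEM certificate's validity window with the Go standard library. It covers only the "inspect the cert" approach, not querying etcd, and it does not address the hostPath permission question. The names `certExpired` and `demoCertPEM` are hypothetical, and the demo generates throwaway certs in memory instead of reading a real path; an actual probe would load the bytes from the cert file first.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// now is a package-level clock so callers and tests share one time source.
var now = time.Now

// certExpired reports whether the first certificate in pemBytes is outside
// its validity window (expired or not yet valid) at the current time.
func certExpired(pemBytes []byte) (bool, error) {
	block, _ := pem.Decode(pemBytes)
	if block == nil || block.Type != "CERTIFICATE" {
		return false, errors.New("no PEM certificate block found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return false, fmt.Errorf("parse certificate: %w", err)
	}
	t := now()
	return t.Before(cert.NotBefore) || t.After(cert.NotAfter), nil
}

// demoCertPEM builds a throwaway self-signed cert whose validity window is
// given as hour offsets from the current time. Hypothetical demo helper.
func demoCertPEM(notBeforeHours, notAfterHours int) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    now().Add(time.Duration(notBeforeHours) * time.Hour),
		NotAfter:     now().Add(time.Duration(notAfterHours) * time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	// One cert that expired an hour ago, one that is valid for another hour.
	for _, window := range [][2]int{{-2, -1}, {-1, 1}} {
		data, err := demoCertPEM(window[0], window[1])
		if err != nil {
			panic(err)
		}
		expired, err := certExpired(data)
		if err != nil {
			panic(err)
		}
		fmt.Printf("window %+dh..%+dh expired=%v\n", window[0], window[1], expired)
	}
}
```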

Second is the distribution step. Since we're generating everything in one place, I'm guessing we have to scp the output around to all the other nodes. That wouldn't be needed for SNO (single-node OpenShift), though.

Lastly, since we're only modifying the on-disk cert files, that doesn't change the secrets and bundle configmaps in etcd that the installer uses for a new revision.
So we need to figure out how to update the cert secrets and configmaps in etcd. Otherwise the next revision rollout would reuse the expired signer certs in etcd, as opposed to the new ones generated on disk.

Maybe if we relaxed the constraint and assumed that the signers aren't expired when the cluster is offline, then we could regenerate only the peer/server and client certs on disk, distribute them, bring the cluster up, and then rotate the node cert secrets and configmaps.
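
As a rough illustration of that relaxed path, the sketch below signs a fresh leaf cert for one node with an existing signer and refuses to do so if the signer is outside its validity window. It uses only the Go standard library. Everything here is an assumption for illustration: the helper names (`newDemoSigner`, `issueLeaf`, `verifyLeaf`), the key type, the validity periods, and the SANs are not taken from this PR or from how the operator actually builds its certs.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newDemoSigner creates a throwaway self-signed CA that stands in for an
// etcd signer which has not expired. Hypothetical helper for this sketch.
func newDemoSigner() (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber:          big.NewInt(1),
		Subject:               pkix.Name{CommonName: "demo-etcd-signer"},
		NotBefore:             time.Now().Add(-time.Hour),
		NotAfter:              time.Now().Add(24 * time.Hour),
		IsCA:                  true,
		BasicConstraintsValid: true,
		KeyUsage:              x509.KeyUsageCertSign | x509.KeyUsageDigitalSignature,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// issueLeaf signs a new leaf certificate for one node with the existing
// signer. It refuses to sign if the signer is outside its validity window.
func issueLeaf(signer *x509.Certificate, signerKey *ecdsa.PrivateKey, host, ip string) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	now := time.Now()
	if now.Before(signer.NotBefore) || now.After(signer.NotAfter) {
		return nil, nil, errors.New("signer is not currently valid; leaf-only regeneration does not apply")
	}
	addr := net.ParseIP(ip)
	if addr == nil {
		return nil, nil, fmt.Errorf("invalid IP %q for host %q", ip, host)
	}
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(now.UnixNano()),
		Subject:      pkix.Name{CommonName: host},
		DNSNames:     []string{host},
		IPAddresses:  []net.IP{addr},
		NotBefore:    now.Add(-time.Minute),
		NotAfter:     now.Add(12 * time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth, x509.ExtKeyUsageClientAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// verifyLeaf checks that leaf chains up to signer.
func verifyLeaf(signer, leaf *x509.Certificate) error {
	roots := x509.NewCertPool()
	roots.AddCert(signer)
	_, err := leaf.Verify(x509.VerifyOptions{Roots: roots})
	return err
}

func main() {
	signer, signerKey, err := newDemoSigner()
	if err != nil {
		panic(err)
	}
	leaf, _, err := issueLeaf(signer, signerKey, "master-1", "192.168.2.1")
	if err != nil {
		panic(err)
	}
	fmt.Println("issued leaf for", leaf.Subject.CommonName, "chain ok:", verifyLeaf(signer, leaf) == nil)
}
```

Distribution of the resulting files and the later rotation of secrets and configmaps are out of scope for this sketch.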

Anyway, not a blocker for this PR but we can discuss and flesh that out a bit as well.

@tjungblu tjungblu changed the title add recert cmd ETCD-573: add recert cmd Apr 8, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 8, 2024

openshift-ci-robot commented Apr 8, 2024

@tjungblu: This pull request references ETCD-573 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.


@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2024
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 8, 2024
@openshift-merge-robot

PR needs rebase.


dusk125 commented Jul 8, 2024

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2024