How to troubleshoot a failure during run bundle-upgrade #6831

Open
etsauer opened this issue Sep 6, 2024 · 1 comment
etsauer commented Sep 6, 2024

Type of question

Best practices
How to implement a specific feature

Question

I'm trying to test the upgrade path for my Helm-based operator using the SDK's run bundle / run bundle-upgrade commands. The upgrade fails because of a missing install plan, and I'm not sure how to get more information about what I'm doing wrong.

What did you do?

# Install current release of the operator
operator-sdk run bundle quay.io/pelorus/pelorus-operator-bundle:v0.0.9 --namespace test-pelorus-operator
# Once installed successfully, attempt the upgrade
operator-sdk run bundle-upgrade quay.io/pelorus/rc-pelorus-operator-bundle:vpr1157-34d9eef --namespace test-pelorus-operator --verbose
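To confirm the initial v0.0.9 install was healthy before upgrading, something like the following can be used (a minimal sketch; the namespace is the one from the commands above, and the CSV name is whatever the v0.0.9 bundle installs):

# Verify the v0.0.9 CSV reached the Succeeded phase before attempting the upgrade
oc get csv -n test-pelorus-operator
oc get subscription,installplan -n test-pelorus-operator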

What did you expect to see?

I hoped to see the upgrade succeed.

What did you see instead? Under which circumstances?

The upgrade failed after the old registry pod was deleted:

INFO[0018] Generated a valid Upgraded File-Based Catalog 
INFO[0020] Created registry pod: quay-io-pelorus-rc-pelorus-operator-bundle-vpr1157-34d9eef 
INFO[0020] Updated catalog source pelorus-operator-catalog with address and annotations 
INFO[0021] Deleted previous registry pod with name "quay-io-pelorus-pelorus-operator-bundle-v0-0-9" 
FATA[0120] Failed to run bundle upgrade: install plan is not available for the subscription pelorus-operator-v0-0-9-sub: context deadline exceeded 
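The failure fires at roughly 120 seconds, which I believe lines up with the default 2m --timeout on run bundle-upgrade; assuming the flag behaves the same in v1.33.0, rerunning with a longer timeout should at least rule out slow catalog or bundle resolution:

# Hypothetical retry with an extended timeout (duration chosen arbitrarily)
operator-sdk run bundle-upgrade quay.io/pelorus/rc-pelorus-operator-bundle:vpr1157-34d9eef --namespace test-pelorus-operator --timeout 5m --verbose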

I also see the following subscriptions and installplans:

$ oc get subscription -n test-pelorus-operator
NAME                                                            PACKAGE            SOURCE                     CHANNEL
grafana-operator-v4-community-operators-openshift-marketplace   grafana-operator   community-operators        v4
pelorus-operator-v0-0-9-sub                                     pelorus-operator   pelorus-operator-catalog   operator-sdk-run-bundle
prometheus-beta-community-operators-openshift-marketplace       prometheus         community-operators        beta

$ oc get installplan -n test-pelorus-operator
NAME            CSV                       APPROVAL   APPROVED
install-t2fjd   grafana-operator.v4.8.0   Manual     true

NOTE: the Prometheus and Grafana operators are dependencies of this operator, which is why they appear in this namespace.
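Since no InstallPlan is created for the pelorus-operator subscription, the subscription's status conditions and the namespace events seem like the next place to look; a sketch of the commands I would expect to surface a resolution error:

# Look for ResolutionFailed or similar conditions on the subscription
oc describe subscription pelorus-operator-v0-0-9-sub -n test-pelorus-operator
oc get subscription pelorus-operator-v0-0-9-sub -n test-pelorus-operator -o yaml
# Namespace events sometimes show why OLM could not generate an InstallPlan
oc get events -n test-pelorus-operator --sort-by=.lastTimestamp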

Environment

Operator type:

/language helm

Kubernetes cluster type:

OpenShift 4.15

$ operator-sdk version

operator-sdk version: "v1.33.0", commit: "542966812906456a8d67cf7284fc6410b104e118", kubernetes version: "1.27.0", go version: "go1.21.5", GOOS: "linux", GOARCH: "amd64"

$ kubectl version

Additional context

This happens whether or not the operand resource has been created, so it seems to be a fairly basic issue, perhaps with how the bundle is configured.
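I can also gather output from the catalog side if that helps; assuming OLM runs in the usual OpenShift namespace, these are the resources and logs I would check for dependency-resolution errors:

# Catalog source backing the upgraded bundle and its registry pod
oc get catalogsource pelorus-operator-catalog -n test-pelorus-operator -o yaml
oc get pods -n test-pelorus-operator
# catalog-operator logs (namespace assumed for OpenShift 4.x)
oc logs -n openshift-operator-lifecycle-manager deploy/catalog-operator --tail=100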

openshift-ci bot added the language/helm label (Issue is related to a Helm operator project) on Sep 6, 2024
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

openshift-ci bot added the lifecycle/stale label (Denotes an issue or PR has remained open with no activity and has become stale) on Dec 6, 2024