Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

Constantly creating new releases for charts even when no changes #457

Closed
brpaz opened this issue Jun 9, 2020 · 52 comments
Closed

Constantly creating new releases for charts even when no changes #457

brpaz opened this issue Jun 9, 2020 · 52 comments
Labels
blocked needs validation In need of validation before further action bug Something isn't working

Comments

@brpaz
Copy link

brpaz commented Jun 9, 2020

Describe the bug

Hello.

I am starting experimenting with Flux and the Helm Operator on a new Cluster and everything went fine until I deployed cert-manager Helm chart.

Each time the sync runs, the helm operator tries to do an update and create a new release even without any change in the chart.

This is causing some instability in the Network of my cluster. (maybe to do excessive load in the API server resulted from the constant updates.

What is strange is that if I run: kubectl -n cert-manager get helmreleases.helm.fluxcd.io the latest update date is the initial deploy. Still a new secret with the helm release information is being created every time and the pods restarted.

To Reproduce

Just a basic helm release manifest:

---
apiVersion: helm.fluxcd.io/v1
kind: HelmRelease
metadata:
  name: cert-manager
  namespace: cert-manager
spec:
  releaseName: cert-manager
  chart:
    repository: https://charts.jetstack.io
    name: cert-manager
    version: 0.15.1
  values:
    installCRDs: true
    global:
      leaderElection:
        namespace: cert-manager
    ingressShim:
      defaultIssuerName: letsencrypt-prod
      defaultIssuerKind: ClusterIssuer
    prometheus:
      enabled: false

Expected behavior

Cert-manager should only be deployed when the is any change.

Logs

Here is the log output:

´s=2020-06-09T11:27:43.485024459Z caller=helm.go:69 component=helm version=v3 info="checking 31 resources for changes" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.511728151Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"cert-manager\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.52932945Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ServiceAccount \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.593238042Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"certificaterequests.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.707520153Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"certificates.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.787600969Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"challenges.acme.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.840798706Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"clusterissuers.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.914200336Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"issuers.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:43.976102766Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for CustomResourceDefinition \"orders.acme.cert-manager.io\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.006756511Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-issuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.170684941Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-clusterissuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.186528407Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-certificates\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.206401365Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-orders\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.226948001Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-challenges\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.242952126Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-controller-ingress-shim\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.270264045Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-view\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.288023181Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRole \"cert-manager-edit\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.323437971Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-issuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.334162638Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-clusterissuers\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.349806345Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-certificates\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.364670113Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-orders\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.42316236Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-challenges\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.442380377Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ClusterRoleBinding \"cert-manager-controller-ingress-shim\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.457693993Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Role \"cert-manager:leaderelection\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.471976952Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Role \"cert-manager-webhook:dynamic-serving\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.509698246Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.534244597Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Deployment \"cert-manager\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:44.550004375Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Deployment \"cert-manager-webhook\"" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:45.300454292Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for cert-manager" targetNamespace=cert-manager release=cert-manager
ts=2020-06-09T11:27:46.000638151Z caller=release.go:309 component=release release=cert-manager targetNamespace=cert-manager resource=cert-manager:helmrelease/cert-manager helmVersion=v3 info="upgrade succeeded" revision=0.15.1 phase=upgrade

Sometimes I also found these errors:

warning="failed to annotate release resources: serviceaccount/cert-manager annotated" phase=annotate

Not exactly sure what this is but it seems to take some time to run.

Additional context

  • Helm Operator version: 1.1.0 (installed with the Helm chart)
  • Kubernetes version: 1.16.8 (Digital Ocean)
@brpaz brpaz added blocked needs validation In need of validation before further action bug Something isn't working labels Jun 9, 2020
@mtneug
Copy link

mtneug commented Jun 14, 2020

It seems version 1.1.0 constantly creates new releases for all HelmRelease definitions. Downgrading to 1.0.2 (i.e. the Helm version of helm-operator) resolved the issue for me.

@lucioveloso
Copy link

I'm facing the similar issue with strimizi kafka operator. With 1.0.2 works well, but with 1.1.0 it keeps sync the chart even though there is no change.

@surfingsimbo
Copy link

I'm seeing the same issue with version 1.1.0 and cert-manager 0.15.0

@stevehipwell
Copy link

stevehipwell commented Jun 18, 2020

We're seeing the same issue with v1.1.0 and cert-manager v0.15.0.

NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
cert-manager cert-manager 244 2020-06-19 08:20:30.927272479 +0000 UTC deployed cert-manager-v0.15.0 v0.15.0

@twendt
Copy link

twendt commented Jun 29, 2020

We had the same issue, but in our case this was caused by too low memory limits for the helm operator. This has caused it to be restarted by kubernetes.
After a new start of the helm operator it has to download all the charts again. This caused the chart.changed value set to true for all charts which then causes the Upgrade action to be called instead of the dry-run action.

I would prefer if the dry-run action would be called here.

@relu
Copy link
Member

relu commented Jul 29, 2020

I'm also seeing this problem with some custom charts. Each reconcile iteration seems to create a new release upgrade even though nothing changes.

EDIT: Figured it out after doing some debugging. It looks like if we use relative chart versions e.g. ~> 2.0 in the release it will break the comparison here since the value of hr.Status.LastAttemptedRevision will not be the resolved version of the chart but rather the raw ~> 2.0 string and it will obviously fail, causing the operator to assume there's always a new version of the chart.

My issue doesn't seem to be related to the cert-manager issue described here, I think it would deserve its own ticket.

EDIT2: Relevant issue #490 (resolved in v1.2.0, my comment can be ignored).

@onedr0p
Copy link

onedr0p commented Aug 13, 2020

I am also having this issue with a lot of charts, info can be found in slack. Can we have a maintainer chime in or at least rename this issue? It can happen on any random chart from what I've seen.

I am not seeing the helm-operator pod being restarted like @twendt has.

https://cloud-native.slack.com/archives/CLAJ40HV3/p1597320334119100

@brpaz brpaz changed the title Constantly creating new releases for cert-manager even when no changes Constantly creating new releases for charts even when no changes Aug 16, 2020
@brpaz
Copy link
Author

brpaz commented Aug 16, 2020

@onedr0p I have renamed the issue to be more generic.

Still, we need an official response. This issue is open for 2 months without any feedback and I think it´s quite critical.
The constant releases killed my cluster.

I want to fully dive into GitOps, but this issue open for so long without any feedback doesnt give much confidence.

@stefanprodan, can some maintainer look at this, please?

@davidholsgrove
Copy link

I too observed this today, up to revision 2291(!) of a Helm-operator (v1.2.0) controlled HelmRelease

@eklee
Copy link

eklee commented Aug 17, 2020

Same here. I have just completed the 'get-started' tutorial and the demo apps are upgraded every 5 minutes:

ts=2020-08-17T02:37:31.935043189Z caller=helm.go:69 component=helm version=v3 info="checking 6 resources for changes" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.94590984Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Secret \"redis\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.955831337Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"redis\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.97589111Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for ConfigMap \"redis-health\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.985966338Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"redis-headless\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:31.994606993Z caller=helm.go:69 component=helm version=v3 info="Looks like there are no changes for Service \"redis-master\"" targetNamespace=demo release=redis
ts=2020-08-17T02:37:32.056348166Z caller=helm.go:69 component=helm version=v3 info="updating status for upgraded release for redis" targetNamespace=demo release=redis
ts=2020-08-17T02:37:32.213089037Z caller=release.go:364 component=release release=redis targetNamespace=demo resource=demo:helmrelease/redis helmVersion=v3 info="upgrade succeeded" revision=10.3.1 phase=upgrade

Helm list:

NAME    NAMESPACE       REVISION        UPDATED                                 STATUS          CHART           APP VERSION
mongodb demo            22              2020-08-17 02:40:31.151206363 +0000 UTC deployed        mongodb-7.6.3   4.0.14
redis   demo            22              2020-08-17 02:40:31.296983141 +0000 UTC deployed        redis-10.3.1    5.0.7

@hiddeco
Copy link
Member

hiddeco commented Aug 17, 2020

Sorry for the late response all, last months have been hectic in terms of workload and GitOps Toolkit developments, and I was enjoying time off the last two weeks.

I tried to reproduce the issue with version 1.2.0 of the Helm Operator with both the Redis HelmRelease example @eklee reported and the cert-manager HelmRelease in the issue, and was unable to observe any spurious upgrades.

@davidholsgrove may it be possible that the revision drift up to 2291 was due to the misbehaving 1.1.0 version, and thus fixed by #490? If not, can you all please share the Status object of the misbehaving HelmRelease? This would give me better insights in why it may happen.

@davidholsgrove
Copy link

davidholsgrove commented Aug 17, 2020

@hiddeco I've been using helm-operator v1.2.0 since 10th August, and only noticed the run away upgrades this week. Its occurring in 3 separate k8s clusters (all using fluxcd 1.4.0 / helm-operator 1.2.0, with separate backing git repos of HelmReleases).

I'm using fixed chart versions, so not the same as #469

Ive killed the fluxcd and helm-operator pods in each cluster to stop the helm history being trashed.

Cluster 1

prometheus-operator and helm-operator continually upgrading;

$ helm ls -A
NAME                            NAMESPACE               REVISION        UPDATED                                         STATUS          CHART                                                   APP VERSION
flux                            fluxcd                  29              2020-08-09 22:20:49.084079856 +0000 UTC         deployed        flux-1.4.0                                              1.20.0
helm-operator                   fluxcd                  1032            2020-08-12 04:25:04.433300225 +0000 UTC         deployed        helm-operator-1.2.0                                     1.2.0
prometheus-operator             monitoring              497             2020-08-13 07:54:39.642859 +1000 AEST           deployed        prometheus-operator-9.3.1                               0.38.1

Upgrade occurring every 3 minutes (the last one manually, after helm-operator had been stopped for a day):

$ helm -n monitoring history prometheus-operator
REVISION        UPDATED                         STATUS          CHART                           APP VERSION     DESCRIPTION
488             Wed Aug 12 04:01:19 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
489             Wed Aug 12 04:04:09 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
490             Wed Aug 12 04:07:10 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
491             Wed Aug 12 04:10:11 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
492             Wed Aug 12 04:13:18 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
493             Wed Aug 12 04:16:15 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
494             Wed Aug 12 04:19:14 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
495             Wed Aug 12 04:22:09 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
496             Wed Aug 12 04:25:18 2020        superseded      prometheus-operator-9.3.1       0.38.1          Upgrade complete
497             Thu Aug 13 07:54:39 2020        deployed        prometheus-operator-9.3.1       0.38.1          Upgrade complete
$ k -n monitoring describe hr prometheus-operator
Name:         prometheus-operator
Namespace:    monitoring
Labels:       fluxcd.io/sync-gc-mark=sha256.OfeOgFbdT4R06Z7OMl2uT9Wjy5tc0SdI9v8JqxaoqeQ
Annotations:  fluxcd.io/sync-checksum: 69091ad8e7fe97e3926e9d2256a4a42f1d87d459
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"helm.fluxcd.io/v1","kind":"HelmRelease","metadata":{"annotations":{"fluxcd.io/sync-checksum":"69091ad8e7fe97e3926e9d2256a4a...
API Version:  helm.fluxcd.io/v1
Kind:         HelmRelease
Metadata:
  Creation Timestamp:  2020-07-22T07:04:18Z
  Generation:          3
  Resource Version:    28654611
Spec:
  Chart:
    Name:        prometheus-operator
    Repository:  https://kubernetes-charts.storage.googleapis.com
    Version:     9.3.1
  Helm Version:  v3
  Release Name:  prometheus-operator

[--snip--]

Status:
  Conditions:
    Last Transition Time:   2020-07-22T07:04:49Z
    Last Update Time:       2020-08-12T04:25:10Z
    Message:                Chart fetch was successful for Helm release 'prometheus-operator' in 'monitoring'.
    Reason:                 ChartFetched
    Status:                 True
    Type:                   ChartFetched
    Last Transition Time:   2020-08-09T22:21:57Z
    Last Update Time:       2020-08-12T04:25:50Z
    Message:                Release was successful for Helm release 'prometheus-operator' in 'monitoring'.
    Reason:                 Succeeded
    Status:                 True
    Type:                   Released
  Last Attempted Revision:  9.1.1
  Observed Generation:      3
  Phase:                    Succeeded
  Release Name:             prometheus-operator
  Release Status:           deployed
  Revision:                 9.3.1
Events:                     <none>

Cluster 2

gitlab continually upgrading;

$ helm ls -A
NAME                            NAMESPACE               REVISION        UPDATED                                 STATUS          CHART                                   APP VERSION
flux                            fluxcd                  29              2020-07-31 11:11:32.259345348 +0000 UTC deployed        flux-1.4.0                              1.20.0
gitlab                          gitlab                  2291            2020-08-17 01:24:42.888677627 +0000 UTC deployed        gitlab-4.2.4                            13.2.4
helm-operator                   fluxcd                  29              2020-08-10 11:23:56.7837415 +1000 AEST  deployed        helm-operator-1.2.0                     1.2.0
$ helm -n gitlab history gitlab
REVISION        UPDATED                         STATUS          CHART           APP VERSION     DESCRIPTION
2283            Mon Aug 17 01:00:58 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2284            Mon Aug 17 01:03:42 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2285            Mon Aug 17 01:06:42 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2286            Mon Aug 17 01:09:43 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2287            Mon Aug 17 01:12:33 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2288            Mon Aug 17 01:15:38 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2289            Mon Aug 17 01:18:36 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2290            Mon Aug 17 01:21:41 2020        superseded      gitlab-4.2.4    13.2.4          Upgrade complete
2291            Mon Aug 17 01:24:42 2020        deployed        gitlab-4.2.4    13.2.4          Upgrade complete
2292            Mon Aug 17 01:27:37 2020        pending-upgrade gitlab-4.2.4    13.2.4          Preparing upgrade
$ k -n gitlab describe hr gitlab
Name:         gitlab
Namespace:    gitlab
Labels:       fluxcd.io/sync-gc-mark=sha256.tUcmp_UAo3ET0QtAeI2CG0-lgb2ZYTRDRYWgENMW2xI
Annotations:  fluxcd.io/sync-checksum: 627e765ba04176a6940be115cb3d686eb4b965f3
              kubectl.kubernetes.io/last-applied-configuration:
                {"apiVersion":"helm.fluxcd.io/v1","kind":"HelmRelease","metadata":{"annotations":{"fluxcd.io/sync-checksum":"627e765ba04176a6940be115cb3d6...
API Version:  helm.fluxcd.io/v1
Kind:         HelmRelease
Metadata:
  Creation Timestamp:  2020-07-07T07:57:21Z
  Generation:          15
  Resource Version:    36550979
Spec:
  Chart:
    Name:        gitlab
    Repository:  https://charts.gitlab.io
    Version:     4.2.4
  Helm Version:  v3
  Release Name:  gitlab

[--snip--]

Status:
  Conditions:
    Last Transition Time:   2020-07-31T11:13:15Z
    Last Update Time:       2020-08-17T01:25:22Z
    Message:                Release was successful for Helm release 'gitlab' in 'gitlab'.
    Reason:                 Succeeded
    Status:                 True
    Type:                   Released
    Last Transition Time:   2020-07-10T00:41:46Z
    Last Update Time:       2020-08-17T01:27:23Z
    Message:                Chart fetch was successful for Helm release 'gitlab' in 'gitlab'.
    Reason:                 ChartFetched
    Status:                 True
    Type:                   ChartFetched
  Last Attempted Revision:  4.1.4
  Observed Generation:      15
  Phase:                    ChartFetched
  Release Name:             gitlab
  Release Status:           pending-upgrade
  Revision:                 4.2.4
Events:                     <none>

@gmaiztegi
Copy link

Hi there. I reported #469 in 1.1.0 but we are now running the same issue for all releases in 1.2.0, just like most of people here reported. Rolling back to 1.0.1 :(

@hiddeco
Copy link
Member

hiddeco commented Aug 18, 2020

The problem seems to be that the LastAttemptedRevision is not set to the right version (but stuck on an older version), which is later used to determine if the release needs to be upgraded. This gives me sufficient information to work on a fix, but KubeCon is in the way today.

I will try to have a prerelease ready for you by tomorrow.

@hiddeco
Copy link
Member

hiddeco commented Aug 18, 2020

Still unsuccessful in replicating the issue where the LastAttemptedRevision is not updated, even when I try to replicate it by jumping from 1.0.1 -> 1.1.0 -> 1.2.0 (while performing version upgrades for the HelmRelease in the meantime).

Given you all seem to have installed the helm-operator using Helm itself, can you please provide me with the output of kubectl get crd helmreleases.helm.fluxcd.io -o yaml, as I have a suspicion it may be due to Helm not performing upgrades for CRDs while a field has been added to the status field (since >=1.1.0).

(Another option would be to kubectl apply -f https://raw.githubusercontent.com/fluxcd/helm-operator/v1.2.0/deploy/crds.yaml, and see if the problem goes away).

@davidholsgrove
Copy link

Thanks @hiddeco - looks like it was the version of the CRD wasn't updated and caused the run away helm-operator upgrades.

The HelmOperator chart option createCRD=true - is that the "right" way to go when we have HelmOperator managing the HelmRelease for HelmOperator?

Previously I had a (stale) version of the CRD in the git repo my FluxCD enforces. Would be good if HelmOperator had an initcontainer or other check and refused to start if its CRD was the wrong version maybe?

@hiddeco
Copy link
Member

hiddeco commented Aug 19, 2020

The HelmOperator chart option createCRD=true - is that the "right" way to go when we have HelmOperator managing the HelmRelease for HelmOperator?

The right way is to not install the CRDs using the Helm chart, but apply it manually / synchronize it using Flux (as written out in the install instructions).

Previously I had a (stale) version of the CRD in the git repo my FluxCD enforces. Would be good if HelmOperator had an initcontainer or other check and refused to start if its CRD was the wrong version maybe?

If possible, that would likely be an improvement, but I do not think it will be implemented at this time (or in the near future) as we are working on a next-gen helm-controller that will eventually replace the Helm operator.

@stevehipwell
Copy link

@hiddeco we've been having this issue on v1.1.0 with the correct CRD managed by Flux. Are you saying that this issue affects v1.1.0 and v1.2.0 if the CRD isn't updated?

@onedr0p
Copy link

onedr0p commented Aug 19, 2020

After applying the CRD for v1.2.0, I am happy to report helm operator v1.2.0 is no longer doing unwarranted releases. I think this issue could be closed.

Thanks @hiddeco !

@stefanprodan
Copy link
Member

stefanprodan commented Aug 19, 2020

Just a friendly reminder that Helm is not suitable for managing the lifecycle of Kubernetes CRD controllers. CRDs have to be extracted from charts and applied on the cluster with Flux as plain YAMLs, otherwise the controller version will diverge from its API and that can break production in various ways.

@talmarco
Copy link

@stefanprodan can you elaborate on how CRD should be handled?
I use cert-manager (official helm chart) and the prerequisite is to install their CRD first, so basically the CRD is out of the chart and not part of the HelmRelease. Thanks

@onedr0p
Copy link

onedr0p commented Aug 19, 2020

@talmarco it is described here, however these notes should be for upgrading too, not only install

https://github.com/fluxcd/helm-operator/tree/master/chart/helm-operator#installation

@talmarco
Copy link

@onedr0p I already have helm-operator CRD installed. My question was how to handle other CRDs (like cert-manager) as I understand from @stefanprodan's answer this was the root cause for constantly upgrading the charts.

@onedr0p
Copy link

onedr0p commented Aug 19, 2020

When upgrading cert manager you also need to manually apply the crds or commit them to your repo for flux to apply. Same goes for here.

@hiddeco
Copy link
Member

hiddeco commented Aug 19, 2020

@stevehipwell 1.1.0 has a bug which was fixed in 1.2.0, which requires an update of the CRD.

@stevehipwell
Copy link

Thanks @hiddeco, that was what I thought and what we've seen when testing the v1.2.0 release.

Just a friendly reminder that Helm is not suitable for managing the lifecycle of Kubernetes CRD controllers. CRDs have to be extracted from charts and applied on the cluster with Flux as plain YAMLs, otherwise the controller version will diverge from its API and that can break production in various ways.

@stefanprodan related to the above statement could you confirm that skipCRDs is true by default for the HelmRelease custom resources (the docs don't give the default values)? If not I'd be interested to know why? We've manually turned off all in chart CRD values (looking at you cert-manager) and set skipCRDs to true so that Flux can manage our CRDs; but that was based on our own understanding and not any actual documentation when we made this decision (about the time of the v1.1.0 release of the helm-operator).

@stefanprodan
Copy link
Member

@stevehipwell skipCRDs has no effect on upgrades as Helm ignores the crds dir, it only works at install time but only if the CRDs are not already applied on the cluster. If you keep the CRDs in Git, Flux will apply them before the HelmReleases, so skipCRDs is not relevant.

@stefanprodan
Copy link
Member

Well we assume people follow our install/upgrade instructions and always update both CRDs and controller as they are part of the same thing.

@stevehipwell
Copy link

@stefanprodan I agree that there isn't anything wrong and that the instructions are correct but specifically calling out CRD changes in release notes would be further designing for success.

@hiddeco
Copy link
Member

hiddeco commented Aug 19, 2020

I used to include them in the release notes (see for example the 1.0.0 release notes), but forgot to add them for the following CRD change.

@davidholsgrove
Copy link

weren't the v1.2.0 CRD changes breaking?

That's a good question, I would say that we didn't changed the behaviour of a user-facing CRD field that's not backward compatible. We added things to the status and it doesn't require you to delete the HRs or modify them by hand, it just requires you to update the CRD at the same time with controller.

Given bumping Helm-Operator from v1.1.0 to v1.2.0 without applying the new CRD resulted in a change in behaviour of the controller, I'd suggest this wasn't a backwards compatible upgrade of the controller?

I used to include them in the release notes (see for example the 1.0.0 release notes), but forgot to add them for the following CRD change.

Well we assume people follow our install/upgrade instructions and always update both CRDs and controller as they are part of the same thing.

Thanks - I had the CRD from the v1.0.0 managed by inclusion as yaml in my Flux monitored git repo.
The missing piece here for me was no indication the v1.2.0 Helm-Operator required an update of that CRD in the release notes, and the https://github.com/fluxcd/helm-operator/tree/master/chart/helm-operator#installation could be clearer on the steps for upgrading rather than assumed knowledge?

Thanks again for the assist, looking forward to v2 in the GitOps Toolkit!

@brpaz
Copy link
Author

brpaz commented Aug 23, 2020

@stefanprodan Hmm, I don't remember doing any updates to the helm operator since I installed it in my cluster. I have version 1.1.0 and here is the output of the CRD definition:

CRD Definition
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apiextensions.k8s.io/v1beta1","kind":"CustomResourceDefinition","metadata":{"labels":{"app.kubernetes.io/managed-by":"pulumi"},"name":"helmreleases.helm.fluxcd.io"},"spec":{"additionalPrinterColumns":[{"JSONPath":".status.releaseName","description":"ReleaseName is the name of the Helm release managed by the HelmRelease, as given by Helm.","name":"Release","type":"string"},{"JSONPath":".status.phase","description":"Phase is the current release phase being performed for the HelmRelease.","name":"Phase","type":"string"},{"JSONPath":".status.releaseStatus","description":"ReleaseStatus is the status of the Helm release managed by the HelmRelease, as given by Helm.","name":"Status","type":"string"},{"JSONPath":".status.conditions[?(@.type==\"Released\")].message","name":"Message","type":"string"},{"JSONPath":".metadata.creationTimestamp","description":"CreationTimestamp is a timestamp representing the server time when this object was created. It is not guaranteed to be set in happens-before order across separate operations. Clients may not set this value. It is represented in RFC3339 form and is in UTC.","name":"Age","type":"date"}],"group":"helm.fluxcd.io","names":{"kind":"HelmRelease","listKind":"HelmReleaseList","plural":"helmreleases","shortNames":["hr","hrs"],"singular":"helmrelease"},"scope":"Namespaced","subresources":{"status":{}},"validation":{"openAPIV3Schema":{"description":"HelmRelease is a type to represent a Helm release.","properties":{"apiVersion":{"description":"APIVersion defines the versioned schema of this representation of an object. Servers should convert recognized schemas to the latest internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources","type":"string"},"kind":{"description":"Kind is a string value representing the REST resource this object represents. Servers may infer this from the endpoint the client submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds","type":"string"},"metadata":{"type":"object"},"spec":{"properties":{"chart":{"properties":{"chartPullSecret":{"description":"ChartPullSecret holds the reference to the authentication secret for accessing the Helm repository using HTTPS basic auth. NOT IMPLEMENTED!","properties":{"name":{"type":"string"}},"required":["name"],"type":"object"},"git":{"description":"Git URL is the URL of the Git repository, e.g. `[email protected]:org/repo`, `http://github.com/org/repo`, or `ssh://[email protected]:2222/org/repo.git`.","type":"string"},"name":{"description":"Name is the name of the Helm chart _without_ an alias, e.g. redis (for `helm upgrade [flags] stable/redis`).","type":"string"},"path":{"description":"Path is the path to the chart relative to the repository root.","type":"string"},"ref":{"description":"Ref is the Git branch (or other reference) to use. Defaults to 'master', or the configured default Git ref.","type":"string"},"repository":{"description":"RepoURL is the URL of the Helm repository, e.g. `https://kubernetes-charts.storage.googleapis.com` or `https://charts.example.com`.","type":"string"},"secretRef":{"description":"SecretRef holds the authentication secret for accessing the Git repository (over HTTPS). The credentials will be added to an HTTPS GitURL before the mirror is started.","properties":{"name":{"type":"string"},"namespace":{"type":"string"}},"required":["name"],"type":"object"},"skipDepUpdate":{"description":"SkipDepUpdate will tell the operator to skip running 'helm dep update' before installing or upgrading the chart, the chart dependencies _must_ be present for this to succeed.","type":"boolean"},"version":{"description":"Version is the targeted Helm chart version, e.g. 7.0.1.","type":"string"}},"type":"object"},"forceUpgrade":{"description":"Force will mark this Helm release to `--force` upgrades. This forces the resource updates through delete/recreate if needed.","type":"boolean"},"helmVersion":{"description":"HelmVersion is the version of Helm to target. If not supplied, the lowest _enabled Helm version_ will be targeted. Valid HelmVersion values are: \"v2\", \"v3\"","enum":["v2","v3"],"type":"string"},"maxHistory":{"description":"MaxHistory is the maximum amount of revisions to keep for the Helm release. If not supplied, it defaults to 10.","type":"integer"},"releaseName":{"description":"ReleaseName is the name of the The Helm release. If not supplied, it will be generated by affixing the namespace to the resource name.","type":"string"},"resetValues":{"description":"ResetValues will mark this Helm release to reset the values to the defaults of the targeted chart before performing an upgrade. Not explicitly setting this to `false` equals to `true` due to the declarative nature of the operator.","type":"boolean"},"rollback":{"description":"The rollback settings for this Helm release.","properties":{"disableHooks":{"description":"DisableHooks will mark this Helm release to prevent hooks from running during the rollback.","type":"boolean"},"enable":{"description":"Enable will mark this Helm release for rollbacks.","type":"boolean"},"force":{"description":"Force will mark this Helm release to `--force` rollbacks. This forces the resource updates through delete/recreate if needed.","type":"boolean"},"maxRetries":{"description":"MaxRetries is the maximum amount of upgrade retries the operator should make before bailing.","format":"int64","type":"integer"},"recreate":{"description":"Recreate will mark this Helm release to `--recreate-pods` for if applicable. This performs pod restarts.","type":"boolean"},"retry":{"description":"Retry will mark this Helm release for upgrade retries after a rollback.","type":"boolean"},"timeout":{"description":"Timeout is the time to wait for any individual Kubernetes operation (like Jobs for hooks) during rollback.","format":"int64","type":"integer"},"wait":{"description":"Wait will mark this Helm release to wait until all Pods, PVCs, Services, and minimum number of Pods of a Deployment, StatefulSet, or ReplicaSet are in a ready state before marking the release as successful.","type":"boolean"}},"type":"object"},"skipCRDs":{"description":"SkipCRDs will mark this Helm release to skip the creation of CRDs during a Helm 3 installation.","type":"boolean"},"targetNamespace":{"description":"TargetNamespace overrides the targeted namespace for the Helm release. The default namespace equals to the namespace of the HelmRelease resource.","type":"string"},"timeout":{"description":"Timeout is the time to wait for any individual Kubernetes operation (like Jobs for hooks) during installation and upgrade operations.","format":"int64","type":"integer"},"valueFileSecrets":{"description":"ValueFileSecrets holds the local name references to secrets. DEPRECATED, use ValuesFrom.secretKeyRef instead.","items":{"properties":{"name":{"type":"string"}},"required":["name"],"type":"object"},"type":"array"},"values":{"description":"Values holds the values for this Helm release.","type":"object"},"valuesFrom":{"items":{"properties":{"chartFileRef":{"description":"The reference to a local chart file with release values.","properties":{"optional":{"description":"Optional will mark this ChartFileSelector as optional. The result of this are that operations are permitted without the source, due to it e.g. being temporarily unavailable.","type":"boolean"},"path":{"description":"Path is the file path to the source relative to the chart root.","type":"string"}},"required":["path"],"type":"object"},"configMapKeyRef":{"description":"The reference to a config map with release values.","properties":{"key":{"type":"string"},"name":{"type":"string"},"namespace":{"type":"string"},"optional":{"type":"boolean"}},"required":["name"],"type":"object"},"externalSourceRef":{"description":"The reference to an external source with release values.","properties":{"optional":{"description":"Optional will mark this ExternalSourceSelector as optional. The result of this are that operations are permitted without the source, due to it e.g. being temporarily unavailable.","type":"boolean"},"url":{"description":"URL is the URL of the external source.","type":"string"}},"required":["url"],"type":"object"},"secretKeyRef":{"description":"The reference to a secret with release values.","properties":{"key":{"type":"string"},"name":{"type":"string"},"namespace":{"type":"string"},"optional":{"type":"boolean"}},"required":["name"],"type":"object"}},"type":"object"},"type":"array"},"wait":{"description":"Wait will mark this Helm release to wait until all Pods, PVCs, Services, and minimum number of Pods of a Deployment, StatefulSet, or ReplicaSet are in a ready state before marking the release as successful.","type":"boolean"}},"required":["chart"],"type":"object"},"status":{"description":"HelmReleaseStatus contains status information about an HelmRelease.","properties":{"conditions":{"description":"Conditions contains observations of the resource's state, e.g., has the chart which it refers to been fetched.","items":{"properties":{"lastTransitionTime":{"description":"LastTransitionTime is the timestamp corresponding to the last status change of this condition.","format":"date-time","type":"string"},"lastUpdateTime":{"description":"LastUpdateTime is the timestamp corresponding to the last status update of this condition.","format":"date-time","type":"string"},"message":{"description":"Message is a human readable description of the details of the last transition, complementing reason.","type":"string"},"reason":{"description":"Reason is a brief machine readable explanation for the condition's last transition.","type":"string"},"status":{"description":"Status of the condition, one of ('True', 'False', 'Unknown').","enum":["True","False","Unknown"],"type":"string"},"type":{"description":"Type of the condition, one of ('ChartFetched', 'Released', 'RolledBack').","enum":["ChartFetched","Released","RolledBack"],"type":"string"}},"required":["status","type"],"type":"object"},"type":"array"},"lastAttemptedRevision":{"description":"LastAttemptedRevision is the revision of the latest chart sync, and may be of a failed release.","type":"string"},"observedGeneration":{"description":"ObservedGeneration is the most recent generation observed by the operator.","format":"int64","type":"integer"},"phase":{"description":"Phase the release is in, one of ('ChartFetched', 'ChartFetchFailed', 'Installing', 'Upgrading', 'Succeeded', 'RollingBack', 'RolledBack', 'RollbackFailed')","enum":["ChartFetched","ChartFetchFailed","Installing","Upgrading","Succeeded","Failed","RollingBack","RolledBack","RollbackFailed"],"type":"string"},"releaseName":{"description":"ReleaseName is the name as either supplied or generated.","type":"string"},"releaseStatus":{"description":"ReleaseStatus is the status as given by Helm for the release managed by this resource.","type":"string"},"revision":{"description":"Revision holds the Git hash or version of the chart currently deployed.","type":"string"},"rollbackCount":{"description":"RollbackCount records the amount of rollback attempts made, it is incremented after a rollback failure and reset after a successful upgrade or revision change.","format":"int64","type":"integer"}},"type":"object"}},"required":["metadata","spec"],"type":"object"}},"version":"v1","versions":[{"name":"v1","served":true,"storage":true}]}}
  creationTimestamp: "2020-06-09T10:42:16Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: pulumi
  name: helmreleases.helm.fluxcd.io
  resourceVersion: "2246"
  selfLink: /apis/apiextensions.k8s.io/v1/customresourcedefinitions/helmreleases.helm.fluxcd.io
  uid: 7e35e98a-32d5-49cc-b3dc-014aa0c2ec83
spec:
  conversion:
    strategy: None
  group: helm.fluxcd.io
  names:
    kind: HelmRelease
    listKind: HelmReleaseList
    plural: helmreleases
    shortNames:
    - hr
    - hrs
    singular: helmrelease
  preserveUnknownFields: true
  scope: Namespaced
  versions:
  - additionalPrinterColumns:
    - description: ReleaseName is the name of the Helm release managed by the HelmRelease,
        as given by Helm.
      jsonPath: .status.releaseName
      name: Release
      type: string
    - description: Phase is the current release phase being performed for the HelmRelease.
      jsonPath: .status.phase
      name: Phase
      type: string
    - description: ReleaseStatus is the status of the Helm release managed by the
        HelmRelease, as given by Helm.
      jsonPath: .status.releaseStatus
      name: Status
      type: string
    - jsonPath: .status.conditions[?(@.type=="Released")].message
      name: Message
      type: string
    - description: CreationTimestamp is a timestamp representing the server time when
        this object was created. It is not guaranteed to be set in happens-before
        order across separate operations. Clients may not set this value. It is represented
        in RFC3339 form and is in UTC.
      jsonPath: .metadata.creationTimestamp
      name: Age
      type: date
    name: v1
    schema:
      openAPIV3Schema:
        description: HelmRelease is a type to represent a Helm release.
        properties:
          apiVersion:
            description: 'APIVersion defines the versioned schema of this representation
              of an object. Servers should convert recognized schemas to the latest
              internal value, and may reject unrecognized values. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources'
            type: string
          kind:
            description: 'Kind is a string value representing the REST resource this
              object represents. Servers may infer this from the endpoint the client
              submits requests to. Cannot be updated. In CamelCase. More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds'
            type: string
          metadata:
            type: object
          spec:
            properties:
              chart:
                properties:
                  chartPullSecret:
                    description: ChartPullSecret holds the reference to the authentication
                      secret for accessing the Helm repository using HTTPS basic auth.
                      NOT IMPLEMENTED!
                    properties:
                      name:
                        type: string
                    required:
                    - name
                    type: object
                  git:
                    description: Git URL is the URL of the Git repository, e.g. `[email protected]:org/repo`,
                      `http://github.com/org/repo`, or `ssh://[email protected]:2222/org/repo.git`.
                    type: string
                  name:
                    description: Name is the name of the Helm chart _without_ an alias,
                      e.g. redis (for `helm upgrade [flags] stable/redis`).
                    type: string
                  path:
                    description: Path is the path to the chart relative to the repository
                      root.
                    type: string
                  ref:
                    description: Ref is the Git branch (or other reference) to use.
                      Defaults to 'master', or the configured default Git ref.
                    type: string
                  repository:
                    description: RepoURL is the URL of the Helm repository, e.g. `https://kubernetes-charts.storage.googleapis.com`
                      or `https://charts.example.com`.
                    type: string
                  secretRef:
                    description: SecretRef holds the authentication secret for accessing
                      the Git repository (over HTTPS). The credentials will be added
                      to an HTTPS GitURL before the mirror is started.
                    properties:
                      name:
                        type: string
                      namespace:
                        type: string
                    required:
                    - name
                    type: object
                  skipDepUpdate:
                    description: SkipDepUpdate will tell the operator to skip running
                      'helm dep update' before installing or upgrading the chart,
                      the chart dependencies _must_ be present for this to succeed.
                    type: boolean
                  version:
                    description: Version is the targeted Helm chart version, e.g.
                      7.0.1.
                    type: string
                type: object
              forceUpgrade:
                description: Force will mark this Helm release to `--force` upgrades.
                  This forces the resource updates through delete/recreate if needed.
                type: boolean
              helmVersion:
                description: 'HelmVersion is the version of Helm to target. If not
                  supplied, the lowest _enabled Helm version_ will be targeted. Valid
                  HelmVersion values are: "v2", "v3"'
                enum:
                - v2
                - v3
                type: string
              maxHistory:
                description: MaxHistory is the maximum amount of revisions to keep
                  for the Helm release. If not supplied, it defaults to 10.
                type: integer
              releaseName:
                description: ReleaseName is the name of the The Helm release. If not
                  supplied, it will be generated by affixing the namespace to the
                  resource name.
                type: string
              resetValues:
                description: ResetValues will mark this Helm release to reset the
                  values to the defaults of the targeted chart before performing an
                  upgrade. Not explicitly setting this to `false` equals to `true`
                  due to the declarative nature of the operator.
                type: boolean
              rollback:
                description: The rollback settings for this Helm release.
                properties:
                  disableHooks:
                    description: DisableHooks will mark this Helm release to prevent
                      hooks from running during the rollback.
                    type: boolean
                  enable:
                    description: Enable will mark this Helm release for rollbacks.
                    type: boolean
                  force:
                    description: Force will mark this Helm release to `--force` rollbacks.
                      This forces the resource updates through delete/recreate if
                      needed.
                    type: boolean
                  maxRetries:
                    description: MaxRetries is the maximum amount of upgrade retries
                      the operator should make before bailing.
                    format: int64
                    type: integer
                  recreate:
                    description: Recreate will mark this Helm release to `--recreate-pods`
                      for if applicable. This performs pod restarts.
                    type: boolean
                  retry:
                    description: Retry will mark this Helm release for upgrade retries
                      after a rollback.
                    type: boolean
                  timeout:
                    description: Timeout is the time to wait for any individual Kubernetes
                      operation (like Jobs for hooks) during rollback.
                    format: int64
                    type: integer
                  wait:
                    description: Wait will mark this Helm release to wait until all
                      Pods, PVCs, Services, and minimum number of Pods of a Deployment,
                      StatefulSet, or ReplicaSet are in a ready state before marking
                      the release as successful.
                    type: boolean
                type: object
              skipCRDs:
                description: SkipCRDs will mark this Helm release to skip the creation
                  of CRDs during a Helm 3 installation.
                type: boolean
              targetNamespace:
                description: TargetNamespace overrides the targeted namespace for
                  the Helm release. The default namespace equals to the namespace
                  of the HelmRelease resource.
                type: string
              timeout:
                description: Timeout is the time to wait for any individual Kubernetes
                  operation (like Jobs for hooks) during installation and upgrade
                  operations.
                format: int64
                type: integer
              valueFileSecrets:
                description: ValueFileSecrets holds the local name references to secrets.
                  DEPRECATED, use ValuesFrom.secretKeyRef instead.
                items:
                  properties:
                    name:
                      type: string
                  required:
                  - name
                  type: object
                type: array
              values:
                description: Values holds the values for this Helm release.
                type: object
              valuesFrom:
                items:
                  properties:
                    chartFileRef:
                      description: The reference to a local chart file with release
                        values.
                      properties:
                        optional:
                          description: Optional will mark this ChartFileSelector as
                            optional. The result of this are that operations are permitted
                            without the source, due to it e.g. being temporarily unavailable.
                          type: boolean
                        path:
                          description: Path is the file path to the source relative
                            to the chart root.
                          type: string
                      required:
                      - path
                      type: object
                    configMapKeyRef:
                      description: The reference to a config map with release values.
                      properties:
                        key:
                          type: string
                        name:
                          type: string
                        namespace:
                          type: string
                        optional:
                          type: boolean
                      required:
                      - name
                      type: object
                    externalSourceRef:
                      description: The reference to an external source with release
                        values.
                      properties:
                        optional:
                          description: Optional will mark this ExternalSourceSelector
                            as optional. The result of this are that operations are
                            permitted without the source, due to it e.g. being temporarily
                            unavailable.
                          type: boolean
                        url:
                          description: URL is the URL of the external source.
                          type: string
                      required:
                      - url
                      type: object
                    secretKeyRef:
                      description: The reference to a secret with release values.
                      properties:
                        key:
                          type: string
                        name:
                          type: string
                        namespace:
                          type: string
                        optional:
                          type: boolean
                      required:
                      - name
                      type: object
                  type: object
                type: array
              wait:
                description: Wait will mark this Helm release to wait until all Pods,
                  PVCs, Services, and minimum number of Pods of a Deployment, StatefulSet,
                  or ReplicaSet are in a ready state before marking the release as
                  successful.
                type: boolean
            required:
            - chart
            type: object
          status:
            description: HelmReleaseStatus contains status information about an HelmRelease.
            properties:
              conditions:
                description: Conditions contains observations of the resource's state,
                  e.g., has the chart which it refers to been fetched.
                items:
                  properties:
                    lastTransitionTime:
                      description: LastTransitionTime is the timestamp corresponding
                        to the last status change of this condition.
                      format: date-time
                      type: string
                    lastUpdateTime:
                      description: LastUpdateTime is the timestamp corresponding to
                        the last status update of this condition.
                      format: date-time
                      type: string
                    message:
                      description: Message is a human readable description of the
                        details of the last transition, complementing reason.
                      type: string
                    reason:
                      description: Reason is a brief machine readable explanation
                        for the condition's last transition.
                      type: string
                    status:
                      description: Status of the condition, one of ('True', 'False',
                        'Unknown').
                      enum:
                      - "True"
                      - "False"
                      - Unknown
                      type: string
                    type:
                      description: Type of the condition, one of ('ChartFetched',
                        'Released', 'RolledBack').
                      enum:
                      - ChartFetched
                      - Released
                      - RolledBack
                      type: string
                  required:
                  - status
                  - type
                  type: object
                type: array
              lastAttemptedRevision:
                description: LastAttemptedRevision is the revision of the latest chart
                  sync, and may be of a failed release.
                type: string
              observedGeneration:
                description: ObservedGeneration is the most recent generation observed
                  by the operator.
                format: int64
                type: integer
              phase:
                description: Phase the release is in, one of ('ChartFetched', 'ChartFetchFailed',
                  'Installing', 'Upgrading', 'Succeeded', 'RollingBack', 'RolledBack',
                  'RollbackFailed')
                enum:
                - ChartFetched
                - ChartFetchFailed
                - Installing
                - Upgrading
                - Succeeded
                - Failed
                - RollingBack
                - RolledBack
                - RollbackFailed
                type: string
              releaseName:
                description: ReleaseName is the name as either supplied or generated.
                type: string
              releaseStatus:
                description: ReleaseStatus is the status as given by Helm for the
                  release managed by this resource.
                type: string
              revision:
                description: Revision holds the Git hash or version of the chart currently
                  deployed.
                type: string
              rollbackCount:
                description: RollbackCount records the amount of rollback attempts
                  made, it is incremented after a rollback failure and reset after
                  a successful upgrade or revision change.
                format: int64
                type: integer
            type: object
        required:
        - metadata
        - spec
        type: object
    served: true
    storage: true
    subresources:
      status: {}
status:
  acceptedNames:
    kind: HelmRelease
    listKind: HelmReleaseList
    plural: helmreleases
    shortNames:
    - hr
    - hrs
    singular: helmrelease
  conditions:
  - lastTransitionTime: "2020-06-09T10:42:16Z"
    message: no conflicts found
    reason: NoConflicts
    status: "True"
    type: NamesAccepted
  - lastTransitionTime: "2020-06-09T10:42:16Z"
    message: the initial names have been accepted
    reason: InitialNamesAccepted
    status: "True"
    type: Established
  storedVersions:
  - v1

I could try doing an update 1.2.0 and update the CRD manually like said.

@mnaser
Copy link
Contributor

mnaser commented Sep 16, 2020

I think I've tracked down this bug.

In my case, I had a release being recreated non-stop. I decided to enable the --update-chart-deps=true in the helm-operator to get a view on what's going on exactly. I got a very long diff blob which was hard to parse but I copied it into a file and ran the following to normalize it:

sed -i 's/\\n/\
/g' faildiff
sed -i 's/\\"/"/g' faildiff
sed -i 's/\\t/\t/' faildiff

All of this fun lead me to a clean readable diff where I've identified the following in my case:

  	\t\t"metadata_agent": map[string]interface{}{
  	\t\t\t"DEFAULT": map[string]interface{}{
  	\t\t\t\t... // 2 identical entries
  	\t\t\t\t"nova_metadata_host": string("metadata.openstack.svc.cluster.local"),
  	\t\t\t\t"nova_metadata_ip":   string("metadata.openstack.svc.cluster.local"),
- 	\t\t\t\t"nova_metadata_port": float64(80),
+ 	\t\t\t\t"nova_metadata_port": int(80),
  	\t\t\t},

It looks like during the comparison, there is a mismatch with Helm thinking one part of this is float64 and the other part is int. This confirms my confusion because upon running helm diff against the release, I got no output.

I haven't dug into why this is happening exactly or where, but I'll be doing that. Help is appreciated. :)

@mnaser
Copy link
Contributor

mnaser commented Sep 16, 2020

I think my theory right now is that we're currently building the dry-run using the original values which are parsed from YAML. This probably parses things in the correct type. However, the data we pull out to compare from Helm (stored inside a secret in this case) is going through the JSON unmarshal which as per the documentation:

To unmarshal JSON into an interface value, Unmarshal stores one of these in the interface value:
...
float64, for JSON numbers

I've dug around and it looks like we can workaround this, there seems to be something documented inside google/go-cmp#222 which helps document a similar issue with a resolution using go-cmp. I played around a little bit with the code and it indeed seems to do the trick: https://play.golang.org/p/RwCfG4l-xOK

@scott-grimes
Copy link

having same issue here using version 1.2

helm history blerg
REVISION	UPDATED                 	STATUS         	CHART     	APP VERSION	DESCRIPTION      
235     	Mon Mar 29 17:23:51 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
236     	Mon Mar 29 17:26:43 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
237     	Mon Mar 29 17:28:39 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
238     	Mon Mar 29 17:30:13 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
239     	Mon Mar 29 17:33:10 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
240     	Mon Mar 29 17:34:43 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
241     	Mon Mar 29 17:37:20 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
242     	Mon Mar 29 17:40:21 2021	superseded     	blerg-0.3.0	0.0.0      	Upgrade complete 
243     	Mon Mar 29 17:43:36 2021	deployed       	blerg-0.3.0	0.0.0      	Upgrade complete 
244     	Mon Mar 29 17:46:28 2021	pending-upgrade	blerg-0.3.0	0.0.0      	Preparing upgrade
Status:
  Conditions:
    Last Transition Time:   2021-03-29T17:28:39Z
    Last Update Time:       2021-03-29T17:28:39Z
    Message:                Chart fetch was successful for Helm release 'blerg' in 'qa-staging'.
    Reason:                 ChartFetched
    Status:                 True
    Type:                   ChartFetched
    Last Transition Time:   2021-03-29T17:49:47Z
    Last Update Time:       2021-03-29T17:49:47Z
    Message:                Installation or upgrade succeeded for Helm release 'blerg' in 'qa-staging'.
    Reason:                 Deployed
    Status:                 True
    Type:                   Deployed
    Last Transition Time:   2021-03-29T14:21:07Z
    Last Update Time:       2021-03-29T17:49:47Z
    Message:                Release was successful for Helm release 'blerg' in 'qa-staging'.
    Reason:                 Succeeded
    Status:                 True
    Type:                   Released
  Last Attempted Revision:  a98f6980e16f805e19e18f9fb2cceaee405a213b
  Observed Generation:      5
  Phase:                    Succeeded
  Release Name:             blerg
  Release Status:           deployed
  Revision:                 a98f6980e16f805e19e18f9fb2cceaee405a213b
Events:
  Type    Reason         Age                 From           Message
  ----    ------         ----                ----           -------
  Normal  ReleaseSynced  19m (x109 over 2d)  helm-operator  managed release 'blerg' in namespace 'qa-staging' synchronized
  Normal  ReleaseSynced  24s (x6 over 15m)   helm-operator  managed release 'blerg' in namespace 'qa-staging' synchronized

This is particularly problematic for us because have migration hooks that are constantly being run. The chart in question has a job with

Annotations:    helm.sh/hook: pre-install,pre-upgrade
               helm.sh/hook-delete-policy: before-hook-creation

(note that before-hook-creation is the default). every sync causes this job to be run even though neither the chart nor the values have changed.

@hiddeco
Copy link
Member

hiddeco commented Mar 30, 2021

@scott-grimes can you please try out fluxcd/helm-operator-prerelease:master-4dc90d77 with the CRD from master and see if this resolves your issue?

@scott-grimes
Copy link

@hiddeco

same. heres logs from the new helm-operator

{"caller":"release.go:79","component":"release","helmVersion":"v3","info":"starting sync run","release":"blerg","resource":"qa-staging:helmrelease/blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:50.638793386Z"}
{"action":"dry-run-compare","caller":"release.go:289","component":"release","helmVersion":"v3","info":"running dry-run upgrade to compare with release version '2322'","release":"blerg","resource":"qa-staging:helmrelease/blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:51.397948587Z"}
{"caller":"helm.go:69","component":"helm","info":"preparing upgrade for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:51.436562837Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"resetting values to the chart's original version","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:51.459460847Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"performing update for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:52.364851293Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"dry run for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:52.876975106Z","version":"v3"}
{"caller":"release.go:303","component":"release","helmVersion":"v3","info":"difference detected during release comparison","phase":"dry-run-compare","release":"blerg","resource":"qa-staging:helmrelease/blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:53.24976622Z"}
{"action":"upgrade","caller":"release.go:353","component":"release","helmVersion":"v3","info":"running upgrade","release":"blerg","resource":"qa-staging:helmrelease/blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:53.24990009Z"}
{"caller":"helm.go:69","component":"helm","info":"preparing upgrade for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:53.328797156Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"resetting values to the chart's original version","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:53.346093213Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"performing update for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:54.780986838Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"creating upgraded release for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:55.295124685Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"checking 5 resources for changes","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:55.8775033Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"Looks like there are no changes for Service \"blerg\"","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:55.94115779Z","version":"v3"}
{"caller":"helm.go:69","component":"helm","info":"updating status for upgraded release for blerg","release":"blerg","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:56.44141447Z","version":"v3"}
{"caller":"release.go:364","component":"release","helmVersion":"v3","info":"upgrade succeeded","phase":"upgrade","release":"blerg","resource":"qa-staging:helmrelease/blerg","revision":"b091bc1f922ff30d56c16ba2424882209ca67d95","targetNamespace":"qa-staging","ts":"2021-03-30T14:24:56.634371369Z"}

@scott-grimes
Copy link

some further info here. enabling diff on helm-operator shows the following:

  			"Release": map[string]interface{}{
  				... // 2 identical entries
  				"Name":      string("blerg"),
  				"Namespace": string("mynamespace"),
- 				"Revision":  Inverse(T, float64(27)),
+ 				"Revision":  Inverse(T, float64(28)),
  				"Service":   string("Helm"),
  			},
- 			"Template": map[string]interface{}{
- 				"BasePath": string("blerg/templates"),
- 				"Name":     string("blerg/templates/deployment-karafka.yaml"),
- 			},
+ 			"Template": chartutil.Values{
+ 				"BasePath": string("blerg/templates"),
+ 				"Name":     string("blerg/templates/deployment-karafka.yaml"),
+ 			},

@scott-grimes
Copy link

the "values" in both blocks (chartutil.Values and map[string]interface{}) are identical., whatever comparison function is called is just not casting them correctly

@scott-grimes
Copy link

also

"Capabilities": map[string]interface{}{
"APIVersions": []interface{}{
string("cert-manager.io/v1beta1"),
...
},
"KubeVersion": map[string]interface{}{
"Major":   string("1"),
"Minor":   string("19+"),
"Version": string("v1.19.6-eks-49a6c0"),
},

is changed to

"Capabilities": &chartutil.Capabilities{
  KubeVersion: chartutil.KubeVersion{Version: "v1.19.6-eks-49a6c0", Major: "1", Minor: "19+"},
  APIVersions: chartutil.VersionSet{
    chartutil.VersionSet{
      "cert-manager.io/v1beta1", ...

@scott-grimes
Copy link

issue may be coming from

func configToGenericValues(c chartutil.Values) map[string]interface{} {

ultimately surfacing at

https://github.com/fluxcd/helm-operator/blob/master/pkg/helm/utils.go#L19

does the comparison using when determining if an update is needed use a dry-run to generate the hypothetical new release, then compare the (hypothetical) new release to the existing release? several places which should be map[string]interface{} are instead &chartutil. (Values, Capabilities, etc)

@hiddeco

@hiddeco
Copy link
Member

hiddeco commented Apr 9, 2021

A dry-run is indeed used to determine if an upgrade needs to be performed, this approach has however proven to be brittle since the beginning, and we are using a different approach in the helm-controller to simply not have to deal with uses like the above (or more complex issues like determining what should be taken into account as a divergence that triggers an upgrade).

@stevehipwell
Copy link

/stale

@hiddeco
Copy link
Member

hiddeco commented Oct 22, 2021

This is going to continue to be stale, as the Helm Operator is no longer actively maintained. We advise people to upgrade to the helm-controller.

@stevehipwell
Copy link

Sorry @hiddeco I was closing stale issues I'd opened and accidentally ended up commenting on this one that I didn't even open.

@kingdonb
Copy link
Member

kingdonb commented Dec 9, 2021

Closing. Users are recommended to upgrade to Flux v2 and Helm Controller ASAP.

https://fluxcd.io/docs/migration/helm-operator-migration/

(Edit: I've reopened this issue to avoid new duplicate reports, since there are still recent fresh reports of this issue. But our position remains the same, we cannot devote resources to fixing bugs in Helm Operator, which has been in a code freeze phase of maintenance mode for longer than one year.)

@kingdonb
Copy link
Member

kingdonb commented Nov 1, 2022

We will be archiving the Helm Operator repo very soon, as described by:

Please upgrade to Helm Controller and Flux v2, where issues like this can get attention and get solved by the very active team of maintainers. Closing now, as there will be no more work on Helm Operator in this repo.

For migration support, please consult the migration guide: https://fluxcd.io/flux/migration/flux-v1-migration/ or contact us in the Flux channel on CNCF Slack where we still offer migration assistance, and workshops that are sponsored by the Flux project members and their supportive team at Weaveworks.

@kingdonb kingdonb closed this as not planned Won't fix, can't repro, duplicate, stale Nov 1, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
blocked needs validation In need of validation before further action bug Something isn't working
Projects
None yet
Development

No branches or pull requests