This repository has been archived by the owner on Oct 19, 2024. It is now read-only.

Failed to notify recipient #347

Open
hroyg opened this issue Oct 14, 2021 · 13 comments
Assignees
Labels
bug Something isn't working
Milestone

Comments

@hroyg

hroyg commented Oct 14, 2021

Summary

Ever since we upgraded Argo CD to v2.1.3, we get errors from argocd-notifications, which in turn cause API calls and Slack messages to fail and not be sent. In the new version, GitHub authentication is configured in a secret instead of in the ConfigMap with a reference to a secret (the GitHub repo URL is now defined in a secret, not in the CM as before), and the format of the authentication configuration has changed.

This does not happen all the time; it seems to happen randomly with some applications.
What I don't understand, probably because I don't know exactly how argocd-notifications works, is why this started happening after the Argo CD upgrade. Isn't the function that fails (<call .repo.GetCommitMetadata .app.status.operationState.syncResult.revision>) executed by argocd-notifications?

Diagnostics

eks

argocd: 2.1.3
argocd notifications: v1.1.1


time="2021-10-14T11:26:40Z" level=error msg="Failed to notify recipient {jenkins } defined in app argocd/monitoring: template: jenkins-api-calljenkins:1:27: executing \"jenkins-api-calljenkins\" at <call .repo.GetCommitMetadata .app.status.operationState.syncResult.revision>: error calling call: rpc error: code = Internal desc = Failed to fetch 8179a397e623f56c7f36b4a5781ad233af2bbe5b: `git fetch origin --tags --force` failed exit status 128: fatal: could not read Username for 'https://github.com': No such device or address" app=argocd/monitoring


time="2021-10-14T11:26:40Z" level=error msg="Failed to notify recipient {slack cloud-cd-stage} defined in app argocd/monitoring: template: custom-synced-and-healty:5:26: executing \"custom-synced-and-healty\" at <call .repo.GetCommitMetadata .app.status.operationState.syncResult.revision>: error calling call: rpc error: code = Internal desc = Failed to fetch 8179a397e623f56c7f36b4a5781ad233af2bbe5b: `git fetch origin --tags --force` failed exit status 128: fatal: could not read Username for 'https://github.com': No such device or address" app=argocd/monitoring

Any input to help resolve the issue would be much appreciated.

Thanks

Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@hroyg hroyg added the bug Something isn't working label Oct 14, 2021
@ryota-sakamoto
Member

@hroyg
I think we need to break down the issue.

  1. If you downgrade Argo CD to the old version, is the problem fixed?
  2. Is this an Argo CD version problem, an argocd-notifications problem, or something else?

@hroyg
Author

hroyg commented Oct 25, 2021

@ryota-sakamoto

  1. Downgrading Argo CD back to v2.0.3 resolved the issue (when downgrading, I also moved the GitHub authentication back into the ConfigMap).
    We changed the authentication mechanism back to the CM as follows:

repositories: |
  - passwordSecret:
      key: password
      name: repo-248020157
    type: git
    url: https://github.com/Firm/k8s-cloud-resources.git
    usernameSecret:
      key: username
      name: repo-248020157

** Staying on Argo CD v2.1.3 (not downgrading) and just changing the config to the above (the old way, as we used it before the upgrade to v2.1.3) also resolved the issue. So I guess the problem relates to the new way Argo CD passes the GitHub PAT password, or maybe to how argocd-notifications uses it (I'm not too familiar with how argocd-notifications connects to GitHub through the Argo CD server/repo-server).

  2. The problem started after upgrading Argo CD, but the error appears in the argocd-notifications controller.

** We haven't encountered any functionality issues in Argo CD itself with the new way of authenticating to GitHub (the new way is to pass the repo name and authentication, e.g. a PAT, as secrets).
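
For comparison with the ConfigMap entry above, here is a minimal sketch of what the secret-based repository definition typically looks like in Argo CD 2.1+. It reuses the secret name and repo URL from the ConfigMap example. The `argocd` namespace and the credential values are assumptions or placeholders, not the actual configuration from this setup.

```yaml
# Sketch only: namespace and credential values are placeholders/assumptions.
apiVersion: v1
kind: Secret
metadata:
  name: repo-248020157
  namespace: argocd
  labels:
    # This label marks the secret as a repository definition for Argo CD.
    argocd.argoproj.io/secret-type: repository
stringData:
  type: git
  url: https://github.com/Firm/k8s-cloud-resources.git
  username: "<github-username>"
  password: "<github-pat>"
```

In this layout the URL and the credentials live in the same secret, instead of the ConfigMap pointing at a secret through `usernameSecret` and `passwordSecret`.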

@Thumbiceq

Same issue, not working with ArgoCD v2.1.1, downgrading to v2.0.5 helped.

(call .repo.GetAppDetails).Helm.GetParameterValueByName is randomly failing with Failed to fetch 023d82c5c49bdc9aa05ac32801d2800e900ff7c0: 'git fetch origin --tags --force'

Also, when two notification services are defined and the same function is used in both service templates ((call .repo.GetAppDetails).Helm.GetParameterValueByName), it fails only for the first one. The second is sent normally without any errors.

Slack notification service template:

...
          {
            "title": "Upstream Repository",
            "value": "{{ (call .repo.GetAppDetails).Helm.GetParameterValueByName "app-vue.labels.upstreamRepository" }}",
            "short": true
          },
...

Webhook notification service template:

...
path: /api/v4/projects/{{(call .repo.GetAppDetails).Helm.GetParameterValueByName "app-vue.labels.upstreamRepository"}}/statuses/{{(call .repo.GetAppDetails).Helm.GetParameterValueByName "app-vue.labels.commitSHA"}}?state=success
...

@ryota-sakamoto ryota-sakamoto self-assigned this Nov 8, 2021
@ryota-sakamoto ryota-sakamoto added this to the v1.2 milestone Nov 8, 2021
@ryota-sakamoto
Member

I reproduced this issue and am investigating it.

@mbolek

mbolek commented Nov 19, 2021

I think this is the same issue: #356
I'm also affected by this, but not always. For whatever reason, on some occasions I get the Failed to fetch error, and on others the notifications work and get data from the commit.

@ryota-sakamoto ryota-sakamoto modified the milestones: v1.2, v1.3 Nov 24, 2021
@agaudreault
Member

agaudreault commented Nov 30, 2021

The reason this broke without updating argocd-notifications is that it calls the argocd-repo-server service to get the information, and it is actually that service performing the call and failing.

metadata, err := svc.repoServerClient.GetRevisionMetadata(ctx, &apiclient.RepoServerRevisionMetadataRequest{

There is also a cache mechanism in argocd-repo-server, which might explain why the notification sometimes goes through.

@agaudreault
Member

agaudreault commented Nov 30, 2021

For reference, we have the same problem, and our Argo CD instance is configured for our private repositories with a credential template and a GitHub App according to https://argo-cd.readthedocs.io/en/stable/user-guide/private-repositories/#github-app-credential. (Using 2.1.3)

argocd repocreds list
URL PATTERN              USERNAME  SSH_CREDS  TLS_CREDS
https://github.com/org   -         false      false

@tsunamishaun

I was hoping the new release v1.2.1/ #370 that fixed my similar issue #356 would have also fixed this (which I am now seeing clearly after upgrading).

My specific errors are around de-referencing the Application object attributes in the trigger (when and oncePer clause). I have no way to gracefully handle these as I do in the templates (setting default values in case they don't exist).

time="2021-12-15T22:01:28Z" level=error msg="failed to execute oncePer condition: cannot fetch images from <nil> (1:20)\n | app.status.summary.images\n | ...................^"
time="2021-12-15T22:01:28Z" level=error msg="failed to execute when condition: cannot fetch phase from <nil> (1:27)\n | app.status.operationState.phase in ['Error', 'Failed']\n | ..........................^"
time="2021-12-15T22:01:28Z" level=error msg="failed to execute oncePer condition: cannot fetch syncResult from <nil> (1:27)\n | app.status.operationState.syncResult.revision\n | ..........................^"

Any help appreciated.
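
One possible way to avoid the `cannot fetch ... from <nil>` error in a `when` clause is to check the parent object for nil before dereferencing it, relying on short-circuit evaluation of `and`. A minimal sketch, assuming a trigger named `on-sync-failed` and a template named `app-sync-failed` (both hypothetical) in `argocd-notifications-cm`. Whether an equivalent guard can be applied to `oncePer` is not confirmed here.

```yaml
# Sketch only: trigger and template names are hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: argocd-notifications-cm
  namespace: argocd
data:
  trigger.on-sync-failed: |
    # Check operationState for nil before reading phase from it.
    - when: app.status.operationState != nil and app.status.operationState.phase in ['Error', 'Failed']
      send: [app-sync-failed]
```

With the guard in place, the condition evaluates to false for applications that have no `operationState` yet, rather than raising an evaluation error.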

@agaudreault
Member

agaudreault commented Jan 27, 2022

After upgrading to argocd-notifications v1.2.1 with Argo CD v2.2.3, I am able to use .repo.GetCommitMetadata.

@ichasco-heytrade

Hi, I am using this in my Slack templates:

...
          {
            "title": "Author",
            "value": "{{(call .repo.GetCommitMetadata .app.status.sync.revision).Author}}",
            "short": true
          }
...

It works, but it gives me the following error:

argocd-notifications-controller-769fb8f4fd-ptgql argocd-notifications-controller time="2022-03-08T09:30:24Z" level=error msg="Failed to notify recipient {slack releases-dev} defined in resource argocd/service1: template: app-deployed:23:17: executing \"app-deployed\" at <call .repo.GetCommitMetadata .app.status.sync.revision>: error calling call: rpc error: code = Unavailable desc = connection error: desc = \"transport: authentication handshake failed: tls: first record does not look like a TLS handshake\"" resource=argocd/service1

I think it is related to argocd-repo-server. But I don't want to disable TLS there, because I would then have to disable it in the server and application-controller as well. Is there any other way to show the author of the commit?

I am using these versions:

argocd: 2.2.5
argo-notifications: 1.2.1

Thanks

@muhammad-asn

Is there any update on this? It seems the issue is still open and nowhere near a solution.

@mubarak-j

If this issue hasn't been resolved in the latest argo-cd v2.4.0, then I think it should be resubmitted upstream, because the notifications code was merged into the argo-cd repo. I doubt issues here are monitored or being worked on anymore.

@sinkr

sinkr commented Aug 26, 2022

I'm unsure how to follow @mubarak-j's advice above, so I'll add here that this is still occurring in v2.4.11+3d9e9f2.

10 participants