Sandbox notifications - not sending to slack #4045

Closed
2 tasks done
dyu-bot opened this issue Sep 18, 2023 · 7 comments
Labels
bug (Something isn't working), untriaged (This issue has not yet been looked at by the Maintainers)

Comments

@dyu-bot
Contributor

dyu-bot commented Sep 18, 2023

Describe the bug

I set up a SendGrid account with a verified sender email address and use its API key in the config. I am testing the sandbox notifications configuration with this config:

  workflow_notifications:
    enabled: true
    config:
      notifications:
        type: sandbox
        region: foo  # I had to set this, otherwise helm was complaining
        emailer:
          emailServerConfig:
            serviceName: sendgrid
            apiKeyEnvVar: $SENDGRID_API_KEY
          subject: "Notice: Execution \"{{ workflow.name }}\" has {{ phase }} in \"{{ domain }}\"."
          sender:  "[email protected]"
          body: >
             Execution \"{{ workflow.name }} [{{ name }}]\" has {{ phase }} in \"{{ domain }}\". View details at
             <a href=\https://xxx.com/console/projects/{{ project }}/domains/{{ domain }}/executions/{{ name }}>
             https://xxx.com/console/projects/{{ project }}/domains/{{ domain }}/executions/{{ name }}</a>. {{ error }}

I'm using the flyte-core helm chart v1.6.1, but overriding the flyteadmin image to v1.1.119 to be able to use the sandbox processor and publisher:

  flyteadmin:
    image:
      repository: "ghcr.io/flyteorg/flyteadmin"
      tag: "v1.1.119"

Flyte is deployed on GKE 1.26

I'm running this sample workflow in the docs to notify upon success:

from flytekit import LaunchPlan, Slack, WorkflowExecutionPhase, task, workflow


@task
def double_int_and_print(a: int) -> str:
    return str(a * 2)


@workflow
def int_doubler_wf(a: int) -> str:
    doubled = double_int_and_print(a=a)
    return doubled

wacky_int_doubler_lp = LaunchPlan.get_or_create(
    name="wacky_int_doubler",
    workflow=int_doubler_wf,
    default_inputs={"a": 4},
    notifications=[
        Slack(
            phases=[
                WorkflowExecutionPhase.SUCCEEDED,
                WorkflowExecutionPhase.ABORTED,
                WorkflowExecutionPhase.TIMED_OUT,
                WorkflowExecutionPhase.FAILED,
            ],
            recipients_email=["xxx.slack.com"],
        ),
    ],
)

In the logs, I do see the publisher publishing the message, but there is an error running the processor:

{"json":{"src":"sandbox_processor.go:21"},"level":"warning","msg":"Starting SandBox notifications processor","ts":"2023-09-18T09:02:23Z"}
{"json":{"src":"sandbox_processor.go:46"},"level":"debug","msg":"no message to process","ts":"2023-09-18T09:02:23Z"}
{"json":{"src":"sandbox_processor.go:23"},"level":"error","msg":"error with running processor err: [\u003cnil\u003e] ","ts":"2023-09-18T09:02:23Z"}
{"json":{"exec_id":"xxx","src":"sandbox_publisher.go:15"},"level":"debug","msg":"Publishing the following message [recipients_email:\"xxx.com\" sender_email:\"xxx\" subject_line:\"Notice: Execution \\\"xxx\\\" has succeeded in \\\"development\\\".\" body:\"Execution \\\\\\\"xxx.smoke.int_doubler_wf [xxx]\\\\\\\" has succeeded in \\\\\\\"development\\\\\\\". View details at \u003ca href=\\\\https://xxx.com/console/projects/xxx/domains/development/executions/xxx\u003e https://xxx.com/console/projects/xxx/domains/development/executions/xxx\u003c/a\u003e. \\n\" ]","ts":"2023-09-18T09:02:26Z"}

Also, the SendGrid account shows no attempted email sends, so the message does not seem to be reaching SendGrid (probably because the processor is not working).
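The `\u003cnil\u003e` in the error line is only the JSON-escaped form of `<nil>`, which suggests the processor logged at error level with a nil error value rather than with a concrete failure. As a minimal sketch (the `summarize` helper is hypothetical and not part of Flyte), the log lines quoted above can be decoded like this to make them easier to read:

```python
import json

# Two log lines copied from the report above.
# Raw strings keep the \uXXXX escapes exactly as they appear in the log.
LOG_LINES = [
    r'{"json":{"src":"sandbox_processor.go:46"},"level":"debug","msg":"no message to process","ts":"2023-09-18T09:02:23Z"}',
    r'{"json":{"src":"sandbox_processor.go:23"},"level":"error","msg":"error with running processor err: [\u003cnil\u003e] ","ts":"2023-09-18T09:02:23Z"}',
]


def summarize(line: str) -> str:
    """Decode one JSON log line into a compact, human-readable string."""
    record = json.loads(line)  # json.loads turns \u003c and \u003e back into < and >
    src = record["json"]["src"]
    return f'{record["ts"]} [{record["level"]}] {src}: {record["msg"].strip()}'


for line in LOG_LINES:
    print(summarize(line))
```

Decoding does not explain why the processor stopped; it only makes clear that the logged error value was nil.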

Expected behavior

The notification message should reach Slack.

Additional context to reproduce

No response

Screenshots

No response

Are you sure this issue hasn't been raised already?

  • Yes

Have you read the Code of Conduct?

  • Yes
dyu-bot added the bug and untriaged labels Sep 18, 2023
@Future-Outlier
Member

Could you please verify if the email functionality is working correctly?

I haven't tested the scenario where it's "not sending to Slack."
However, I'm willing to look into it more closely if needed.

@dyu-bot
Contributor Author

dyu-bot commented Sep 18, 2023

hey @Future-Outlier thanks for the quick reply. I called the sendgrid api with that api key and was able to get a message delivered to slack.

I guess the biggest clue might be the log message "error with running processor err: [<nil>]" (shown as \u003cnil\u003e in the raw JSON).
Any ideas what could be causing the processor to fail to run?

Let me know if there's any other info I can provide. Appreciate it.

@Future-Outlier
Member

Hello, @dyu-bot

To help troubleshoot, please follow these steps:

  1. Access the flyte-sandbox pod using kubectl. The pod name to look for would be similar to flyte-sandbox-7d699df5fc-nbxc8.

    kubectl exec -it flyte-sandbox-7d699df5fc-nbxc8 -- /bin/bash
  2. Once you're inside the pod, check the value of the SENDGRID_API_KEY environment variable using the following command:

    printenv SENDGRID_API_KEY

Please let me know the results or if you encounter any issues during the process.
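If pasting the output of `printenv` into a public issue is a concern, the same check can be done without revealing the key. This is a minimal sketch, assuming a Python interpreter is available inside the pod; the `env_var_status` helper is hypothetical and only reports whether the variable is set and how long its value is:

```python
import os
from typing import Mapping, Optional


def env_var_status(name: str, environ: Optional[Mapping[str, str]] = None) -> str:
    """Report whether an environment variable is set, without printing its value."""
    env = os.environ if environ is None else environ
    value = env.get(name)
    if value is None:
        return f"{name} is not set"
    if value == "":
        return f"{name} is set but empty"
    return f"{name} is set ({len(value)} characters)"


if __name__ == "__main__":
    print(env_var_status("SENDGRID_API_KEY"))
```

The optional `environ` argument exists only so the function can be exercised with a plain dictionary instead of the real process environment.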


@Future-Outlier
Member

Can you help me try this code?

from flytekit import Email, LaunchPlan, WorkflowExecutionPhase, task, workflow


@task
def double_int_and_print(a: int) -> str:
    return str(a * 2)


@workflow
def int_doubler_wf(a: int) -> str:
    doubled = double_int_and_print(a=a)
    return doubled

int_doubler_wf_lp = LaunchPlan.get_or_create(
    name="int_doubler_wf_lp_succeed_just_email",
    workflow=int_doubler_wf,
    default_inputs={"a": 4},
    notifications=[
        Email(
            phases=[WorkflowExecutionPhase.SUCCEEDED,],
            recipients_email=["[email protected]"],
        ),
    ],
)

Register it with this command:

pyflyte register launch-plan.py

Then activate the launch plan named "int_doubler_wf_lp_succeed_just_email" in the Flyte console.
Thanks a lot!

@dyu-bot
Contributor Author

dyu-bot commented Sep 19, 2023

Hey @Future-Outlier, sorry for the confusion. I should have put this key fact earlier in the issue description, but I'm actually running on GKE, not sandbox. I'm realizing now that the sandbox notifications type might not be compatible with that? (I'm not running any flyte-sandbox pod, but a full Flyte deployment with flyte-core.)

So I'm just going to try the GCP notifications implementation next. Initially I thought GCP wasn't supported, since the docs say only AWS is, so I tried sandbox on GCP. But the docs are out of date and GCP is actually supported, so I will try that next. Thanks anyway! This can probably be closed out.

@Future-Outlier
Member

No problem. If you have any questions, I'm happy to help!

@Future-Outlier
Member

@dyu-bot If possible, please close the issue. Thanks a lot!

dyu-bot closed this as completed Sep 19, 2023