Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Airflow provider integration and documentation #4090

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

tuantran0910
Copy link

@tuantran0910 tuantran0910 commented Mar 18, 2025

Closes #4089

πŸ“‘ Description

This PR is to add a feature for integration between Apache Airflow and Keep.

βœ… Checks

  • My pull request adheres to the code style of this project
  • My code requires changes to the documentation
  • I have updated the documentation as required
  • All the tests have passed

Copy link

vercel bot commented Mar 18, 2025

@tuantran0910 is attempting to deploy a commit to the KeepHQ Team on Vercel.

A member of the Team first needs to authorize it.

@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Mar 18, 2025
@CLAassistant
Copy link

CLAassistant commented Mar 18, 2025

CLA assistant check
All committers have signed the CLA.

@dosubot dosubot bot added Documentation Improvements or additions to documentation Feature A new feature Provider Providers related issues labels Mar 18, 2025
@tuantran0910 tuantran0910 marked this pull request as draft March 18, 2025 07:05
@talboren
Copy link
Member

❀️ will review soon

@shahargl
Copy link
Member

πŸ‘‘

@tuantran0910
Copy link
Author

Hi everyone,

I'm facing the following issues:

  1. I don't see the Airflow provider listed on the Provider page in the UI.

  2. I cannot send requests to the /alerts/event/airflow endpoint using the following command:

    curl --location 'http://0.0.0.0:8080/alerts/event/airflow' \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      --header 'X-API-KEY: 0a73d419-89dc-4d0f-9b7d-376577a76823' \
      --data '{
      "name": "Airflow Task Failure: keep_task",
      "status": "firing",
      "service": "pipeline",
      "message": "Task keep-task failed in DAG keep_dag at 2025-03-18",
      "description": "Bash command failed. The command returned a non-zero exit code",
      "severity": "critical"
    }'

    Instead, I have to use the generic /alerts/event endpoint:

    curl --location 'http://0.0.0.0:8080/alerts/event' \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      --header 'X-API-KEY: 0a73d419-89dc-4d0f-9b7d-376577a76823' \
      --data '{
      "name": "Airflow Task Failure: keep_task",
      "status": "firing",
      "service": "pipeline",
      "source": ["airflow"],
      "message": "Task keep-task failed in DAG keep_dag at 2025-03-18",
      "description": "Bash command failed. The command returned a non-zero exit code",
      "severity": "critical"
    }'

Is this expected behavior, or am I missing something?

Thanks in advance!

@tuantran0910 tuantran0910 marked this pull request as ready for review March 18, 2025 07:26
@talboren
Copy link
Member

Hi everyone,

I'm facing the following issues:

  1. I don't see the Airflow provider listed on the Provider page in the UI.

  2. I cannot send requests to the /alerts/event/airflow endpoint using the following command:

    curl --location 'http://0.0.0.0:8080/alerts/event/airflow' \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      --header 'X-API-KEY: 0a73d419-89dc-4d0f-9b7d-376577a76823' \
      --data '{
      "name": "Airflow Task Failure: keep_task",
      "status": "firing",
      "service": "pipeline",
      "message": "Task keep-task failed in DAG keep_dag at 2025-03-18",
      "description": "Bash command failed. The command returned a non-zero exit code",
      "severity": "critical"
    }'

    Instead, I have to use the generic /alerts/event endpoint:

    curl --location 'http://0.0.0.0:8080/alerts/event' \
      --header 'Content-Type: application/json' \
      --header 'Accept: application/json' \
      --header 'X-API-KEY: 0a73d419-89dc-4d0f-9b7d-376577a76823' \
      --data '{
      "name": "Airflow Task Failure: keep_task",
      "status": "firing",
      "service": "pipeline",
      "source": ["airflow"],
      "message": "Task keep-task failed in DAG keep_dag at 2025-03-18",
      "description": "Bash command failed. The command returned a non-zero exit code",
      "severity": "critical"
    }'

Is this expected behavior, or am I missing something?

Thanks in advance!

Hey @tuantran0910, could be that there is some problem with the provider and it's not loading for some reason.
Are you seeing any "Cannot import provider Airflow" log in the backend logs?

@tuantran0910
Copy link
Author

tuantran0910 commented Mar 18, 2025

Hi @talboren, I have figured out the reason that cannot making requests to the /alerts/event/airflow endpoint is that the environment must be set and cannot be None. So I will change the code so that the default environment will be undefined if it is not set like Keep.

The only concern right now is that there is no Airflow provider shown in the Providers page:

image

There is not error log Cannot import provider Airflow in the backend logs.

@talboren
Copy link
Member

Hi @talboren, I have figured out the reason that cannot making requests to the /alerts/event/airflow endpoint is that the environment must be set and cannot be None. So I will change the code so that the default environment will be undefined if it is not set like Keep.

The only concern right now is that there is no Airflow provider shown in the Providers page:

image There is not error log `Cannot import provider Airflow` in the backend logs.

That's it, you found the reason :)

Regarding not being able to see the provider in the list, it's because the provider doesn't really have any reason to be "installed" and we do not present it, because it has no configuration.

I can actually see that the backend sends it to the frontend:

CleanShot 2025-03-18 at 10 46 30

But we do not present it because the provider has no configuration

CleanShot 2025-03-18 at 10 47 51

^ this is an example from Pagerduty

@tuantran0910
Copy link
Author

tuantran0910 commented Mar 18, 2025

Hi @talboren, I have figured out the reason that cannot making requests to the /alerts/event/airflow endpoint is that the environment must be set and cannot be None. So I will change the code so that the default environment will be undefined if it is not set like Keep.
The only concern right now is that there is no Airflow provider shown in the Providers page:
image
There is not error log Cannot import provider Airflow in the backend logs.

That's it, you found the reason :)

Regarding not being able to see the provider in the list, it's because the provider doesn't really have any reason to be "installed" and we do not present it, because it has no configuration.

I can actually see that the backend sends it to the frontend:

CleanShot 2025-03-18 at 10 46 30

But we do not present it because the provider has no configuration

CleanShot 2025-03-18 at 10 47 51

^ this is an example from Pagerduty

Thanks @talboren, I understand it. So because Airflow does not need to configure or install anything from the Keep side, so yeah, there is no reason to display it in the Providers page.

@shahargl
Copy link
Member

Hi @talboren, I have figured out the reason that cannot making requests to the /alerts/event/airflow endpoint is that the environment must be set and cannot be None. So I will change the code so that the default environment will be undefined if it is not set like Keep.
The only concern right now is that there is no Airflow provider shown in the Providers page:
image
There is not error log Cannot import provider Airflow in the backend logs.

That's it, you found the reason :)
Regarding not being able to see the provider in the list, it's because the provider doesn't really have any reason to be "installed" and we do not present it, because it has no configuration.
I can actually see that the backend sends it to the frontend:
CleanShot 2025-03-18 at 10 46 30
But we do not present it because the provider has no configuration
CleanShot 2025-03-18 at 10 47 51
^ this is an example from Pagerduty

Thanks @talboren, I understand it. So because Airflow does not need to configure or install anything from the Keep side, so yeah, there is no reason to display it in the Providers page.

Actually I think its cool to have it so people will know we have integration with them.

There are couple of other providers that do not need any configuration but have provider tiles.

For example, you can see checkmk provider - it only have webhook_markdown to explain how to install the webhook:

CleanShot 2025-03-18 at 11 20 45@2x

@tuantran0910
Copy link
Author

I agree with @shahargl, we can put the instruction of how to setup the callback for Airflow that connect with Keep in here :D Maybe by adding three class variable webhook_description, webhook_template and webhook_markdown like the CheckmkProvider provider ?

@talboren
Copy link
Member

I agree with @shahargl, we can put the instruction of how to setup the callback for Airflow that connect with Keep in here :D Maybe by adding three class variable webhook_description, webhook_template and webhook_markdown like the CheckmkProvider provider ?

That will be more than awesome

@tuantran0910
Copy link
Author

So one more thing to concern here is that I have setup the fingerprint field fingerprint for the AirflowProvider:

class AirflowProvider(BaseProvider):
    """Enrich alerts with data sent from Airflow."""

    PROVIDER_DISPLAY_NAME = "Airflow"
    PROVIDER_CATEGORY = ["Orchestration"]
    FINGERPRINT_FIELDS = ["fingerprint"]

So for example, if I make the request with the same fingerprint but different in other fields (for example the description contains the timestamp). Will those alerts actually get deduplicated ?

image

But I see that I got this instead:

image

@tuantran0910
Copy link
Author

tuantran0910 commented Mar 18, 2025

I agree with @shahargl, we can put the instruction of how to setup the callback for Airflow that connect with Keep in here :D Maybe by adding three class variable webhook_description, webhook_template and webhook_markdown like the CheckmkProvider provider ?

That will be more than awesome

Yeah, this can help the users to reduce the gap on exploring provider. I have configure this for the Airflow Provider :D

image

Checkout this commit 7165b7c.

@shahargl
Copy link
Member

@tuantran0910 regarding the fingerprint - yea if the description contains timestamp than deduplication won't deduplicate it.

but it can be configured later on "deduplication" rules, so let's start simple and continue from there?

@tuantran0910
Copy link
Author

@tuantran0910 regarding the fingerprint - yea if the description contains timestamp than deduplication won't deduplicate it.

but it can be configured later on "deduplication" rules, so let's start simple and continue from there?

Totally agree @shahargl !

@talboren
Copy link
Member

@tuantran0910 can you add a screenshot from provider screen to show how it looks like when you open Airflow? πŸ™πŸΌ

@tuantran0910
Copy link
Author

tuantran0910 commented Mar 18, 2025

Hi @talboren, here is the screenshot from the provider screen:

image

Actually, I cannot click the block under the Linked Providers section, I have do search the Airflow on the search bar:

image

Click on it under the Available Providers, then I got the information of the Airflow Provider:

image

From my opinion, I think it would be ideal to let the user click on the block under Linked Providers :D Or maybe this is the default behavior in Keep.

@shahargl
Copy link
Member

yea, currently "linkedin providers" aren't clickable (because they do not have any information/state). how would u change it?

and apart from it - its ready for PR?

@tuantran0910
Copy link
Author

yea, currently "linkedin providers" aren't clickable (because they do not have any information/state). how would u change it?

and apart from it - its ready for PR?

Yes, it's ready for PR.

Copy link
Member

@shahargl shahargl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Mar 19, 2025
Copy link

vercel bot commented Mar 19, 2025

The latest updates on your projects. Learn more about Vercel for Git β†—οΈŽ

Name Status Preview Comments Updated (UTC)
keep βœ… Ready (Inspect) Visit Preview πŸ’¬ Add feedback Mar 19, 2025 0:15am

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Documentation Improvements or additions to documentation Feature A new feature hacktoberfest-accepted lgtm This PR has been approved by a maintainer Provider Providers related issues size:L This PR changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[πŸ”Œ Provider]: Airflow Provider
4 participants