chore(eap): write script to send scrubbed data into a gcs bucket #6698

Draft: wants to merge 1 commit into base: master

Conversation

davidtsuk
Contributor

No description provided.

phacops requested a review from mdtro on December 20, 2024 20:31
file_name = f"scrubbed_spans_data_{current_time}.csv.gz"

query = f"""
INSERT INTO FUNCTION gcs('https://storage.googleapis.com/{self.gcp_bucket_name}/{file_name}',
Contributor

Dump that into a directory named after the date of the dump.

You can probably build that path once, in a single variable, and reuse it later in the log message.
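A minimal sketch of that suggestion, assuming the dump date is taken in UTC. The helper name `build_dump_url`, the `%Y-%m-%d` directory format, and the timestamp format are illustrative assumptions and are not taken from the PR; only the `storage.googleapis.com` URL prefix and the `scrubbed_spans_data_` file prefix come from the diff.

```python
from datetime import datetime, timezone


def build_dump_url(bucket_name: str, now: datetime) -> str:
    """Return the GCS URL for a dump, grouped under a per-date directory."""
    # Assumed layout: <bucket>/<YYYY-MM-DD>/scrubbed_spans_data_<timestamp>.csv.gz
    date_dir = now.strftime("%Y-%m-%d")
    current_time = now.strftime("%Y%m%d_%H%M%S")
    file_name = f"scrubbed_spans_data_{current_time}.csv.gz"
    return f"https://storage.googleapis.com/{bucket_name}/{date_dir}/{file_name}"


# "example-bucket" is a hypothetical bucket name.
dump_url = build_dump_url("example-bucket", datetime.now(timezone.utc))
```

Keeping the full URL in one variable (`dump_url` here) means the same value can be interpolated into the query and passed to the logger afterwards.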


def execute(self, logger: JobLogger) -> None:
    cluster = get_cluster(StorageSetKey.EVENTS_ANALYTICS_PLATFORM)
    connection = cluster.get_query_connection(ClickhouseClientSettings.QUERY)
Contributor

Let's do one query that does an INSERT INTO <gcs_bucket> (SELECT ... FROM eap_spans_2_local) instead.

And make sure it's running asynchronously so this script can stop with success.
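For reference, a hedged sketch of what a single `INSERT INTO FUNCTION gcs(...) SELECT ...` statement could look like when assembled in Python. The function name `build_export_query`, the column names, and the HMAC credential parameters are hypothetical; the argument order (URL, HMAC key, HMAC secret, format) follows my reading of the ClickHouse `gcs` table function documentation and should be checked against the server version in use.

```python
from typing import List


def build_export_query(
    gcs_url: str, hmac_key: str, hmac_secret: str, columns: List[str]
) -> str:
    """Build one INSERT ... SELECT statement that writes query results to GCS."""
    # All inputs are assumed to be trusted configuration values, not user input,
    # since they are interpolated directly into the SQL text.
    column_list = ", ".join(columns)
    return (
        f"INSERT INTO FUNCTION gcs('{gcs_url}', '{hmac_key}', '{hmac_secret}', 'CSV') "
        f"SELECT {column_list} FROM eap_spans_2_local"
    )


# Hypothetical values, for illustration only.
query = build_export_query(
    "https://storage.googleapis.com/example-bucket/2024-12-20/dump.csv.gz",
    "example-key",
    "example-secret",
    ["trace_id", "span_id"],
)
```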

Contributor Author

davidtsuk commented on Dec 20, 2024

> Let's do one query that does an INSERT INTO <gcs_bucket> (SELECT ... FROM eap_spans_2_local) instead.

How is this different from what I already have?

> And make sure it's running asynchronously so this script can stop with success.

I don't think snuba supports running async queries, but I think we can add support for it with https://github.com/mymarilyn/aioch
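A rough sketch of what non-blocking submission could look like from the client side, using only `asyncio` from the standard library. It assumes a client object exposing an awaitable `execute(query)` method, which is how I understand the aioch `Client` to behave; the `AsyncQueryClient` protocol and `submit_export` helper are hypothetical names, and nothing here is existing Snuba API. Note that this only keeps the script from blocking while it waits; whether the query keeps running after the script exits is a separate question this sketch does not address.

```python
import asyncio
from typing import Any, List, Protocol


class AsyncQueryClient(Protocol):
    """Hypothetical interface: any client with an awaitable execute()."""

    async def execute(self, query: str) -> Any: ...


async def submit_export(client: AsyncQueryClient, query: str, log: List[str]) -> Any:
    """Schedule the export query, log the submission, then wait for the result."""
    # create_task schedules the coroutine without waiting for it to finish.
    task = asyncio.create_task(client.execute(query))
    log.append("export submitted")
    result = await task
    log.append("export finished")
    return result
```

A caller would run this with `asyncio.run(submit_export(client, query, log))`, where `client` is whatever async ClickHouse client ends up being adopted.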

2 participants