Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for s3 references to batch data sync requests #10

Merged
merged 1 commit into from
Jun 27, 2024

Conversation

rpmcginty
Copy link
Collaborator

@rpmcginty rpmcginty commented Jun 26, 2024

What's in this Change?

  • adding support for storing batch data sync requests to s3 and fetching pointers to such objects in a request.

Why is this needed? there is a rather small max payload allowed in step functions (256 KB). This is problematic for our lambda and batch lambda functions that are plugged into the system. This change allows us to write part of the larger parts of the lambda responses and requests to s3 and load it.

The way it works is that if you specify intermediate_s3_path in the prepare batch data sync request, the function will write the batch data sync requests to s3 objects under that path specified.

Testing

  • running e2e tests in merscope analysis pipeline

@rpmcginty rpmcginty force-pushed the feature/update-batch-data-sync branch from ea65c7c to 49bcafb Compare June 26, 2024 19:46
@rpmcginty rpmcginty requested a review from aamster June 26, 2024 19:52
for _ in request.requests:
if isinstance(request.requests, S3URI):
self.logger.info(f"Request is stored at {request.requests}... fetching content.")
_ = download_to_json(request.requests)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a huge fan of using _ and __ instead of variable name, but I can still understand what the code is doing. _ is generally used when we don't use that variable at all, not when you can't think of a name for it

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah you are right, I can change it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will clean up in a follow up change

@rpmcginty rpmcginty merged commit cb32318 into main Jun 27, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants