add support for s3 references to batch data sync requests #10

rpmcginty · 2024-06-26T19:36:22Z

What's in this Change?

Adds support for storing batch data sync requests in S3 and for passing a pointer (an S3 URI) to the stored object in a request, in place of the inline requests.

Why is this needed? Step Functions allows only a small maximum payload (256 KB). This is a problem for the lambda and batch lambda functions that are plugged into the system. This change lets us write the larger parts of the lambda requests and responses to S3 and load them from there.

How it works: if you specify intermediate_s3_path in the prepare batch data sync request, the function writes the batch data sync requests to S3 objects under that path.
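The offload-and-reference pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: store_requests, load_requests, and the in-memory FAKE_S3 dict are hypothetical stand-ins so the example runs without AWS access, and the object naming scheme is an assumption.

```python
import json
import uuid
from typing import Dict, List, Optional, Union

# Hypothetical stand-in for an S3 bucket: maps an object URI to its JSON body.
FAKE_S3: Dict[str, str] = {}


def store_requests(
    requests: List[dict], intermediate_s3_path: Optional[str] = None
) -> Union[List[dict], str]:
    """Return the requests inline, or write them out and return a pointer."""
    if intermediate_s3_path is None:
        # No intermediate path given: keep the payload inline.
        return requests
    # Assumed naming scheme: one uniquely named JSON object under the path.
    uri = f"{intermediate_s3_path.rstrip('/')}/{uuid.uuid4().hex}.json"
    FAKE_S3[uri] = json.dumps(requests)
    return uri


def load_requests(requests_or_uri: Union[List[dict], str]) -> List[dict]:
    """Resolve a pointer back into the list of requests it refers to."""
    if isinstance(requests_or_uri, str):
        return json.loads(FAKE_S3[requests_or_uri])
    return requests_or_uri


batch = [{"source_path": "s3://bucket/a", "destination_path": "/data/a"}]
pointer = store_requests(batch, "s3://scratch/batch")
restored = load_requests(pointer)
```

Only the short pointer string travels through the state machine; the consumer resolves it before iterating, which is the role the isinstance check on request.requests plays in the diff below.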

Testing

running e2e tests in merscope analysis pipeline

aamster · 2024-06-27T17:28:44Z

src/aibs_informatics_aws_lambda/handlers/data_sync/operations.py

-        for _ in request.requests:
+        if isinstance(request.requests, S3URI):
+            self.logger.info(f"Request is stored at {request.requests}... fetching content.")
+            _ = download_to_json(request.requests)


Not a huge fan of using _ and __ instead of a variable name, but I can still understand what the code is doing. _ is generally used when the variable isn't used at all, not when you can't think of a name for it.

yeah you are right, I can change it.

I will clean up in a follow up change
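For reference, the naming convention raised in this thread can be illustrated with a small generic example. The data and names here are made up for illustration and are not taken from the PR.

```python
pairs = [("a", 1), ("b", 2)]

# `_` marks a value that is deliberately ignored.
names = [name for name, _ in pairs]

# A value that is actually used gets a descriptive name instead.
total = sum(count for _, count in pairs)
```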

add support for s3 references to batch data sync requests

49bcafb

rpmcginty force-pushed the feature/update-batch-data-sync branch from ea65c7c to 49bcafb Compare June 26, 2024 19:46

rpmcginty requested a review from aamster June 26, 2024 19:52

aamster approved these changes Jun 27, 2024

rpmcginty merged commit cb32318 into main Jun 27, 2024
4 checks passed
