Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streaming fileobj support to CRTTransferManager #277

Open
wants to merge 7 commits into
base: develop
Choose a base branch
from

Conversation

graebm
Copy link
Contributor

@graebm graebm commented Oct 18, 2023

Improvements to CRTTransferManager:

  • Add streaming fileobj support
    • Previously, it only worked with file paths.
  • Add support for ChecksumAlgorithm
  • Use ALLOWED lists to limit what can be passed in extra_args
    • The default TransferManager uses these lists to prevent unexpected weirdness when splitting uploads and downloads into parts, but the CRTTransferManager hadn't hooked up these filters

This relies on awscrt 0.19.4+
We'll need to update botocore's dependencies, before this can go in.

Comment on lines +163 to +165
ALLOWED_DOWNLOAD_ARGS = TransferManager.ALLOWED_DOWNLOAD_ARGS
ALLOWED_UPLOAD_ARGS = TransferManager.ALLOWED_UPLOAD_ARGS
ALLOWED_DELETE_ARGS = TransferManager.ALLOWED_DELETE_ARGS
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I make a common base class for TransferManager and CRTTransferManager, where I can put stuff like these lists, and def _validate_all_known_args() and def _validate_if_bucket_supported()?

Or move these to be standalone lists/functions that both classes can use?

Or just copy/paste the functions, and reference the lists, like I'm doing here?

call_args.extra_args["Body"] = call_args.fileobj

# We want the CRT S3Client to calculate checksums, not botocore.
# Default to CRC32.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is different than the default TransferManager. With this change CRT is defaulting to CRC32 instead of Content-MD5. The CRT team got grumpy about how Content-MD5 currently works in our code and pushed me to do it this way, but we can still make Content-MD5 work if we need to.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant