S3TransferManager directoryDownload downloads incomplete files without failing #5631
Comments
@debora-ito Can you help, or point us to someone who can? We've been using this in production and would like to keep it that way, but we need to find out why this is happening. |
@bugrabenturk Found this line in the logs:
Do you think the object might have been modified during the download? I'm looking into why the client didn't surface an error. |
@debora-ito No, the object was not modified during the download. It's strange to see that kind of message :/ |
Some questions:
|
1 - I'm not able to reproduce this; it's very much random, I'm afraid.
2 - Normally the directory holds more than 10 files, but we filter by prefix and download only the matching ones, which in this case is a .mov file. So assume the directory has 10 files, of which we download only one in the failed cases. The downloaded files range from about 30 to 90 GB, mostly 60+ GB. The issue happens mostly with the big ones, but I don't think size is the deciding factor: I've seen some downloads finish after only 2% of the file was downloaded, so I can't say it happens only with large files, even though it seemed that way.
3 - The log files are really big, more than 10 GB, so I'm afraid I had to switch from trace to debug after some failed downloads. I've kept a few files; from the list below you can see how often we've had incomplete downloads. The strange part is asset 20731570, which seems to have failed three times on the same day. Other download operations and head object operations also seem to be logged in these files, which might be confusing; I will provide cleaner logs later.
renditions-download-logging-20512540-1729102789185.log |
Thank you for the files, I'll take a look at them.
When you say the downloads are finished, what exactly indicates they are finished? |
@debora-ito After a few incomplete downloads, I also added the transfer listener to understand more.
After that, things got more interesting: the listener printed "Transfer complete!" before the progress reached 100%, and in some cases even before the first tick. That is how I know. |
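The symptom described above (completion reported before all bytes arrived) can be caught with a small guard. This is only a sketch using the JDK: `CompletionGuard` and its methods are hypothetical names, not SDK classes, and it assumes the transferred and total byte counts are fed in from the transfer listener's progress callbacks, which are not shown here.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical guard, not part of the AWS SDK. In real code, onBytesTransferred
// would be called from the listener's progress callback and
// isPrematureCompletion from its completion callback.
final class CompletionGuard {
    private final long totalBytes;
    private final AtomicLong transferredBytes = new AtomicLong();

    CompletionGuard(long totalBytes) {
        this.totalBytes = totalBytes;
    }

    // Records the cumulative number of bytes seen so far.
    void onBytesTransferred(long cumulativeBytes) {
        transferredBytes.set(cumulativeBytes);
    }

    // True if completion is being reported although fewer bytes than expected arrived.
    boolean isPrematureCompletion() {
        return transferredBytes.get() < totalBytes;
    }

    public static void main(String[] args) {
        CompletionGuard guard = new CompletionGuard(1_000L);
        guard.onBytesTransferred(20L); // only 2% arrived
        if (guard.isPrematureCompletion()) {
            System.out.println("completion reported after 20 of 1000 bytes");
        }
    }
}
```

A guard like this lets the caller treat an early "Transfer complete!" as a failure instead of trusting the callback.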
@bugrabenturk can you show what this looks like? We have a known issue with the transfer listener where there's a long delay between 100% and printing "Transfer complete", but this is the opposite, I think. |
@debora-ito I found this among the old ones; I hope it's useful. |
Quick update: I'm still investigating this. The current hypothesis is that the CRT client is hitting a specific error that is not surfaced and never reaches the transfer listener. I'm working on a repro case to figure out what that original error is. @bugrabenturk I'm wondering whether this issue also happens with the standard S3 async client. Have you tested that, out of curiosity? |
@debora-ito No, I have not tried the S3 async client yet; I might next week. |
So we identified the issue. I reproduced it by deleting the object after the downloadDirectory has started. The transfer stops with no error, and no failure is registered in We are working on a fix, to add it to the In the meantime you can use the default S3AsyncClient as a workaround:
    S3AsyncClient s3AsyncClient =
        S3AsyncClient.builder()
            .multipartEnabled(true)
            .build(); |
Describe the bug
Sometimes S3TransferManager does not download the complete file. It does not mark the transfer as failed; instead it marks it as complete, and it does not throw any exception.
Regression Issue
Expected Behavior
It should download the complete file. If it cannot, it should report the download as failed or throw the corresponding exception.
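While the client does not report such failures itself, a caller can compare the size of the downloaded file with the size S3 reports for the object. A minimal sketch using only the JDK: `DownloadSizeCheck` and `isComplete` are hypothetical names, and `expectedBytes` is assumed to come from the object's reported content length (for example, a HeadObject call, not shown here).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of the AWS SDK.
final class DownloadSizeCheck {

    // True only if the path is a regular file with exactly the expected number of bytes.
    // A missing file counts as incomplete.
    static boolean isComplete(Path file, long expectedBytes) {
        return Files.isRegularFile(file) && file.toFile().length() == expectedBytes;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("rendition", ".mov");
        Files.write(tmp, new byte[132]);          // simulate a truncated download
        System.out.println(isComplete(tmp, 998)); // sizes differ, so incomplete
        System.out.println(isComplete(tmp, 132)); // sizes match
        Files.delete(tmp);
    }
}
```

A size comparison is cheap but cannot detect corrupted content of the right length; a checksum comparison would be needed for that.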
Current Behavior
It does not download the complete file and does not report the download as failed; instead it marks the download as completed without throwing any exception.
At first I thought this was caused by expiration of the assume-role session token, which is set to 3600 seconds (1 hour): the token might expire after the download starts, failing the transfer while it is still marked as completed.
Based on that idea, I started creating the session right before every transfer, but that did not help and the problem still occurred.
However, the logs did not show much. This is the last part that was downloaded; the remaining parts were not downloaded at all:
[DEBUG] [2024-09-26T15:39:52Z] [00007fbe09e24b38] [S3MetaRequest] - id=0x7fbe07ff2300: 132 out of 998 parts have completed.
I'm attaching the log file. Since it was very big, I'm including only the portion where the download stopped at part 132.
svhf.log
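To find where a download stalled in a very large log, the progress lines can be parsed programmatically. A rough sketch, assuming the CRT debug lines keep the "N out of M parts have completed" wording shown in the line quoted above; `PartProgress` is a hypothetical name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for CRT S3MetaRequest progress lines.
final class PartProgress {
    // Assumes the log wording "<completed> out of <total> parts have completed".
    private static final Pattern PARTS =
            Pattern.compile("(\\d+) out of (\\d+) parts have completed");

    final long completed;
    final long total;

    private PartProgress(long completed, long total) {
        this.completed = completed;
        this.total = total;
    }

    // Returns the progress found in the line, or empty if the line is not a progress line.
    static Optional<PartProgress> parse(String logLine) {
        Matcher m = PARTS.matcher(logLine);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new PartProgress(Long.parseLong(m.group(1)), Long.parseLong(m.group(2))));
    }

    boolean finished() {
        return completed >= total;
    }

    public static void main(String[] args) {
        String line = "[DEBUG] [2024-09-26T15:39:52Z] [00007fbe09e24b38] [S3MetaRequest] - "
                + "id=0x7fbe07ff2300: 132 out of 998 parts have completed.";
        PartProgress p = parse(line).orElseThrow(IllegalStateException::new);
        // prints "132/998 parts, finished=false"
        System.out.printf("%d/%d parts, finished=%b%n", p.completed, p.total, p.finished());
    }
}
```

Applying `parse` to each line and keeping the last match for a given request id shows the point at which that request stopped making progress.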
Reproduction Steps
We use STSClient to obtain temporary AWS session credentials by assuming the corresponding role, and then run the download operation.
Possible Solution
No response
Additional Information/Context
No response
AWS Java SDK version used
2.28.6
AWS CRT version used
0.31.1
JDK version used
21
Operating System and version
Alpine Linux v3.20