aws s3 sync writes the incorrect modified date to a file thus never in "sync" #8395

Closed
hossimo opened this issue Dec 7, 2023 · 6 comments
Labels
bug This issue is a bug. p2 This is a standard priority issue s3

Comments

hossimo commented Dec 7, 2023

Describe the bug

After synchronizing a prefix in a bucket to a local external drive, I found that some files are always written with a modified date 1 second later than the date in the bucket, so they are re-synchronized on every run.

For example, if I run the command (bucket and path names redacted by me):
aws s3 sync s3://<BUCKET>/YUL/<NAME>/PRESHOW/ . --debug --dryrun > dry-run.txt 2>&1

I get the following in the logs:

2023-12-07 08:23:35,786 - MainThread - botocore.parsers - DEBUG - Response body:
...
<Key>YUL/<SHOW>/PRESHOW/106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov</Key><LastModified>2023-12-06T20:13:05.000Z</LastModified>
...
2023-12-07 08:23:36,703 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: ivg-site-files/YUL/<SHOW>/PRESHOW/106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov -> E:\\106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov, file does not exist at destination

There are 10 files in this prefix, and all of them are listed as needing to be synced because they do not exist at the destination. In this example, let's look at a single file. The listing shows the file has a LastModified date of 2023-12-06T20:13:05.000Z.

Next, without touching the files, I run the same command again, this time without --dryrun:
aws s3 sync s3://<BUCKET>/YUL/<NAME>/PRESHOW/ . --debug > 1st-run.txt 2>&1

2023-12-07 08:23:50,413 - MainThread - botocore.parsers - DEBUG - Response body:
...
<Key>YUL/<SHOW>/PRESHOW/106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov</Key><LastModified>2023-12-06T20:13:05.000Z</LastModified>
...
2023-12-07 08:23:50,460 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: ivg-site-files/YUL/<SHOW>/PRESHOW/106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov -> E:\\106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov, file does not exist at destination

Again it syncs as expected. However, once the file has been synchronized, the date shown in the file properties in Windows Explorer is different.

[Screenshot: file properties in Windows Explorer]
The Modified date shows the same time plus 1 second, displayed in my local time zone rather than UTC.

Now let's run the same command again, but write the output to a different text file:

aws s3 sync s3://<BUCKET>/YUL/<NAME>/PRESHOW/ . --debug > 2nd-run.txt 2>&1

2023-12-07 08:39:13,566 - MainThread - awscli.customizations.s3.syncstrategy.base - DEBUG - syncing: ivg-site-files/YUL/<SHOW>/PRESHOW/106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov -> E:\\106.YUL-<SHOW>-PRESHOW-G1-W3-S106-V1.mov, size: 7814447252 -> 7814447252, modified time: 2023-12-06 15:13:05-05:00 -> 2023-12-06 15:13:06-05:00

This time the sync command shows both times in my time zone; notice that the local file is still 1 second ahead. Of the 10 files in the prefix, 4 had the same issue and were transferred again, each of them 1 second off.
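To make that comparison concrete, here is a minimal Python sketch that parses both timestamps and subtracts them. The helper names are hypothetical and not part of the AWS CLI; the values are copied from the logs above.

```python
from datetime import datetime, timezone


def parse_s3_timestamp(text: str) -> datetime:
    """Parse a LastModified value as it appears in the S3 listing (UTC, 'Z' suffix)."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


def parse_log_timestamp(text: str) -> datetime:
    """Parse a 'modified time' value from the sync debug log (includes a UTC offset)."""
    return datetime.fromisoformat(text)


# Hypothetical variable names; the strings come from the log excerpts in this issue.
s3_time = parse_s3_timestamp("2023-12-06T20:13:05.000Z")
source_time = parse_log_timestamp("2023-12-06 15:13:05-05:00")
local_time = parse_log_timestamp("2023-12-06 15:13:06-05:00")

# Aware datetimes compare by instant, so the time zone shown does not matter.
print(s3_time == source_time)                    # same instant, different zone
print((local_time - s3_time).total_seconds())    # difference in seconds
```

The time zone difference is only a display matter; the 1-second difference is what makes the second sync consider the file changed.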

Expected Behavior

The Modified date on the local drive should be the same as the modified date on AWS.

Current Behavior

Sometimes the local modified date is 1 second later than the date stored in S3.

Reproduction Steps

I'm not sure how the time and date are stored, but it almost feels like a rounding issue, as if files stored at 2023-12-06T20:13:05.500Z were rounded up to 2023-12-06T20:13:06.000Z.

  • Sync files to an external drive formatted as exFAT (I have not tried other file systems).
  • Once finished, sync the same files again and check whether some are copied again because of differences in modified date.
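A quicker way to check a drive, without involving S3 at all, might be to set an odd-second modified time on a scratch file and read it back. This is only a sketch under the assumption that the drive mishandles timestamps set through `os.utime`; the function name `mtime_drift` is hypothetical.

```python
import os
import tempfile
from datetime import datetime, timezone


def mtime_drift(directory: str, requested_mtime: int) -> float:
    """Create a scratch file in `directory`, set its mtime, and return
    (stored mtime - requested mtime) in seconds. The scratch file is removed."""
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        os.utime(path, (requested_mtime, requested_mtime))
        return os.stat(path).st_mtime - requested_mtime
    finally:
        os.remove(path)


# The odd-second timestamp from this issue, as seconds since the Unix epoch.
ODD_SECOND = int(datetime(2023, 12, 6, 20, 13, 5, tzinfo=timezone.utc).timestamp())

# Example call against the current directory; point it at the external drive instead.
drift = mtime_drift(".", ODD_SECOND)
print(f"drift: {drift:+.3f} s")
```

A drift of zero suggests the file system keeps the timestamp as written; a non-zero drift would match the behavior described here.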

Possible Solution

I know I could try running with --size-only, but that rather misses the point of using sync to begin with. I do want to synchronize, but I'd rather it not download files that have not changed, and it should probably store the same modified date on all files.

Additional Information/Context

After more testing, it seems this only happens on an external drive formatted as exFAT. If I do the same process on an internal (NTFS) drive it works as expected. I have not tried on other formats.

CLI version used

2.13.25 and 2.14.6

Environment details (OS name and version, etc.)

Windows 11

@hossimo hossimo added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Dec 7, 2023
@klausetgeton

Hey, I would 👍🏻 this one. The LastModified value is lacking millisecond precision.

@maiquelcraash

I'm facing similar issues with the LastModified value; it does not seem to take milliseconds into account for the modified date.

@RyanFitzSimmonsAK RyanFitzSimmonsAK self-assigned this Feb 29, 2024
@RyanFitzSimmonsAK RyanFitzSimmonsAK added s3 investigating This issue is being investigated and/or work is in progress to resolve the issue. p2 This is a standard priority issue and removed needs-triage This issue or PR still needs to be triaged. labels Feb 29, 2024
@RyanFitzSimmonsAK
Contributor

Hi @hossimo, thanks for reaching out and for your patience. @klausetgeton and @maiquelcraash, can you verify the theory that this only happens with external exFAT drives, or were you able to reproduce it on other types of drives?

@RyanFitzSimmonsAK
Contributor

Hey, it looks like this bug is a result of the two-second write time resolution on exFAT and FAT32 file systems.

For example, on NT FAT, create time has a resolution of 10 milliseconds, write time has a resolution of 2 seconds, and access time has a resolution of 1 day (really, the access date).

This explains why only some of your files were experiencing this behavior; all of the odd integer timestamps were being turned into even integers.
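As an illustration of that effect, here is a small Python sketch that models a two-second write-time resolution. It assumes whole-second inputs and that odd seconds are rounded up, which matches the 05 to 06 change seen in this thread; the function name is hypothetical and this is not how the file system or the CLI is actually implemented.

```python
from datetime import datetime, timedelta, timezone


def simulate_two_second_write_time(ts: datetime) -> datetime:
    """Model a 2-second resolution: odd seconds move up to the next even second
    (assumed rounding direction), even seconds are stored unchanged."""
    return ts + timedelta(seconds=ts.second % 2)


odd = datetime(2023, 12, 6, 20, 13, 5, tzinfo=timezone.utc)
even = datetime(2023, 12, 6, 20, 13, 6, tzinfo=timezone.utc)

for original in (odd, even):
    stored = simulate_two_second_write_time(original)
    status = "re-synced" if stored != original else "in sync"
    print(original.isoformat(), "->", stored.isoformat(), status)
```

Under this model roughly half of all files would be affected, which is in line with 4 of the 10 files being transferred again.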

Unfortunately, there isn't much we can do about this. For workarounds, you already mentioned --size-only as a way to bypass this. Please let me know if you have any other questions.

@RyanFitzSimmonsAK RyanFitzSimmonsAK added response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. and removed investigating This issue is being investigated and/or work is in progress to resolve the issue. labels Mar 1, 2024
@hossimo
Author

hossimo commented Mar 1, 2024

Got it, and fair enough. I honestly didn't know the FAT file systems had a 2-second write resolution. I guess the only thing that could be done is to write a tag that holds the actual file time and reference that instead, but that's understandably out of scope and comes with its own issues.
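For anyone curious what that tag idea could look like outside the CLI, here is a rough Python sketch that records the original LastModified in a sidecar JSON file and compares against it rather than the file system's modified time. Everything here is hypothetical (the manifest name, its layout, and the function names); the AWS CLI does not do this.

```python
import json
from pathlib import Path

MANIFEST_NAME = ".sync-manifest.json"  # hypothetical sidecar file name


def load_manifest(root: Path) -> dict:
    """Return the recorded entries for `root`, or an empty dict if none exist."""
    path = root / MANIFEST_NAME
    return json.loads(path.read_text()) if path.exists() else {}


def record_download(root: Path, key: str, size: int, last_modified: str) -> None:
    """Store the size and LastModified reported by S3 at download time."""
    manifest = load_manifest(root)
    manifest[key] = {"size": size, "last_modified": last_modified}
    (root / MANIFEST_NAME).write_text(json.dumps(manifest, indent=2))


def is_unchanged(root: Path, key: str, size: int, last_modified: str) -> bool:
    """True if the remote size and LastModified match what was recorded."""
    return load_manifest(root).get(key) == {"size": size, "last_modified": last_modified}
```

A caller would invoke `record_download` after each transfer and `is_unchanged` before the next one. The sketch ignores the issues mentioned above, such as files edited locally after the manifest was written.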

I guess the answer is not to sync to an external drive unless its file system has a finer write resolution.

Thanks for the investigation.

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Mar 1, 2024

github-actions bot commented Mar 1, 2024

This issue is now closed. Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.


4 participants