Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crates_io_tarball: Disallow paths differing only by case #8788

Merged
merged 1 commit into from
Jun 4, 2024

Conversation

Turbo87
Copy link
Member

@Turbo87 Turbo87 commented Jun 4, 2024

Resolves #8410

This implementation uses .to_ascii_lowercase() on the Path to compare the paths. In other words: this may or may not be sufficient depending on the filesystem and the usage of non-ASCII characters in the paths. It seems to cover the majority of the existing cases though, so it's at least a step in the right direction.

In the future we could even consider rejecting non-ASCII paths, in which case this implementation would probably be sufficient.

/cc @kornelski

@Turbo87 Turbo87 added C-bug 🐞 Category: unintended, undesired behavior A-backend ⚙️ labels Jun 4, 2024
@Turbo87 Turbo87 requested a review from a team June 4, 2024 11:50
@Turbo87 Turbo87 force-pushed the duplicate-paths branch from c8361d8 to 19dee4e Compare June 4, 2024 12:01
Copy link

codecov bot commented Jun 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 88.65%. Comparing base (3dc1848) to head (19dee4e).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8788      +/-   ##
==========================================
- Coverage   88.66%   88.65%   -0.02%     
==========================================
  Files         276      276              
  Lines       27605    27613       +8     
==========================================
+ Hits        24476    24480       +4     
- Misses       3129     3133       +4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@LawnGnome LawnGnome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

I think we'll eventually have to grapple with the terrible questions of "so, what, if any, encoding do we expect paths in crate files to be in", and "is uniqueness actually defined by the NF[CD] canonicalisations of path names after case folding, if the answer to the previous question involves Unicode in some way", but let's not let perfect be the enemy of good here.

@Turbo87 Turbo87 merged commit e476d90 into rust-lang:main Jun 4, 2024
9 checks passed
@Turbo87 Turbo87 deleted the duplicate-paths branch June 4, 2024 17:50
@zimond
Copy link

zimond commented Jun 6, 2024

after this, I cannot publish my crate with the following error:

the remote server responded with an error (status 400 Bad Request): uploaded tarball contains more than one file with the same path xxx/README.md

I have to delete the readme file to upload a new version, which is weird. The bundled artifact .crate file contains only one readme file and I don't know where this error comes from.

@Turbo87
Copy link
Member Author

Turbo87 commented Jun 6, 2024

for what crate is this? can you cargo package and send us the file to take a look?

@zimond
Copy link

zimond commented Jun 6, 2024

@eth3lbert
Copy link
Contributor

The doc.rs source page indicates that there are two README.md files.
https://docs.rs/crate/fontkit/0.5.2/source/

@zimond
Copy link

zimond commented Jun 6, 2024

Now I'm really confused.

@zimond
Copy link

zimond commented Jun 6, 2024

Could it be that cargo is generating a README.md for me? link

@zimond
Copy link

zimond commented Jun 6, 2024

Ok so with readme = "Readme.md" in Cargo.toml, I successfully uploaded a new version 0.6.0-beta.6 to crates.io.

@Turbo87
Copy link
Member Author

Turbo87 commented Jun 6, 2024

hmm I'm not aware of cargo having such behavior. @rust-lang/cargo ?

@epage
Copy link

epage commented Jun 6, 2024

This is likely rust-lang/cargo#13722 which comes from our logic to ensure a README or a LICENSE from another location in the repo is pulled into the .crate file

https://github.com/rust-lang/cargo/blob/10b7d384352f6d9a39f609de44476f815fc792a2/src/cargo/ops/cargo_package.rs#L330-L338

@epage
Copy link

epage commented Jun 6, 2024

btw this seems like this would have been a good change to coordinate with the Cargo team on as this is interrelated to Cargo's behavior and choices Cargo makes. If nothing else, we should probably be consistent on our case sensitivity logic.

@ehuss
Copy link
Contributor

ehuss commented Jun 6, 2024

Filed rust-lang/cargo#14020 about the readme specifically.

@ehuss
Copy link
Contributor

ehuss commented Jun 6, 2024

@Turbo87 I would suggest reverting this until things are fixed, as I suspect this will impact a large number of users.

I would also suggest doing a scan of existing crates to see how common it is. This may trigger for all sorts of reasons which we don't know about.

@LawnGnome
Copy link
Contributor

I've just deployed crates.io with this reverted.

Let's figure out what the right behaviour should be here and then get cargo and crates.io on the same page.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-backend ⚙️ C-bug 🐞 Category: unintended, undesired behavior
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Crates with paths differing only by case are allowed
7 participants