[red-knot] Eagerly normalize `VendoredPathBuf`s #11989

AlexWaygood · 2024-06-23T15:06:35Z

Summary

Currently VendoredPaths are only normalized when you actually try to use them to look up a path in the vendored zip archive. This has a couple of disadvantages:

- The path is only "dynamically" checked when you actually try to use it. If the path is invalid, we should probably validate it and normalize it at the point where it's constructed, so that it's impossible to create an invalid VendoredPath(Buf) to begin with.
- Every time you query whether a path exists (or query some other metadata about the path) in the zip archive, the normalization has to allocate a new String. This allocation is hidden from the user of the VendoredFileSystem APIs, and it feels somewhat wasteful if you're making several queries with the same path.

This PR changes the design of the VendoredFileSystem, VendoredPath and VendoredPathBuf so that normalization is done eagerly at the point when VendoredPath and VendoredPathBuf are created, rather than lazily when you try to use them to query paths in the zip archive. The main disadvantage of this is that it becomes a lot more annoying to construct a VendoredPath from an &str. Previously you could do VendoredPath::new("foo.pyi"); now you must do &VendoredPathBuf::try_from("foo.pyi").unwrap().
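To make the trade-off concrete, here is a minimal sketch of what eager normalization through `TryFrom<&str>` could look like. It is not the code from this PR: `NormalizedPathBuf` and `EscapesRoot` are hypothetical names, and the normalization rules (drop empty and `.` components, resolve `..`, reject paths that would climb above the archive root) are assumptions made for illustration.

```rust
use std::convert::TryFrom;

/// Hypothetical error: a `..` component would climb above the archive root.
#[derive(Debug, PartialEq, Eq)]
pub struct EscapesRoot;

/// Illustrative stand-in for `VendoredPathBuf`.
/// Invariant: the inner string is always normalized.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct NormalizedPathBuf(String);

impl NormalizedPathBuf {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

impl TryFrom<&str> for NormalizedPathBuf {
    type Error = EscapesRoot;

    /// Validates and normalizes once, at construction time.
    fn try_from(raw: &str) -> Result<Self, Self::Error> {
        let mut parts: Vec<&str> = Vec::new();
        for component in raw.split('/') {
            match component {
                // Empty components (`a//b`, trailing `/`) and `.` are dropped.
                "" | "." => {}
                // `..` removes the previous component, or fails at the root.
                ".." => {
                    if parts.pop().is_none() {
                        return Err(EscapesRoot);
                    }
                }
                other => parts.push(other),
            }
        }
        Ok(Self(parts.join("/")))
    }
}
```

With a type like this, `NormalizedPathBuf::try_from("stdlib/./os/../abc.pyi")` yields `stdlib/abc.pyi`, while `"../other"` is rejected at construction time rather than at lookup time. The cost is the one described above: every call site that starts from a `&str` has to handle a `Result`.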

I'm not wedded to this PR as it does feel like it makes the API somewhat more awkward to use. But I said I'd look into this as a followup for #11863 (see e.g. #11863 (comment)). So this is the followup!

Test Plan

`cargo test -p ruff_db`

github-actions · 2024-06-23T15:26:37Z

`ruff-ecosystem` results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

MichaReiser · 2024-06-23T15:53:43Z

I think my preferred solution here would be to improve the normalization logic to avoid allocating if the path is already normalized. But I'm not opposed to doing the normalization eagerly.

If we do the normalization eagerly, then I don't think the Path variant still makes sense; we should just use &PathBuf.

What's unclear to me is how the normalization works when joining paths, because we would then have to normalize both paths before we can join them (which includes an allocation).

crates/ruff_db/src/vendored/path.rs

MichaReiser · 2024-06-23T16:47:07Z

Reading through this more, I'm leaning towards keeping it as is because it makes the API more cumbersome to use without very convincing benefits.

> The path is only "dynamically" checked when you actually try to use it. If the path is invalid, we should probably validate it and normalize it at the point where it's constructed, s

It's unclear to me if we actually do want to do this. I don't think there's anything wrong with VendoredPath::new("root/sub").join("../other") where ../other would no longer be a valid VendoredPath (because it starts with a ../)

> Every time you query whether a path exists (or query some other metadata about the path) in the zip archive, the normalization has to allocate a new String.

I agree that this is not ideal, but I think we can instead change the normalization to return a Cow and only allocate if the path isn't already normalized.
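As a rough illustration of the `Cow` idea, the sketch below borrows the input when it is already normalized and allocates only otherwise. The free function `normalize` and its rules are assumptions for this example, not the actual `ruff_db` implementation.

```rust
use std::borrow::Cow;

/// Hypothetical normalization that allocates only when it has to.
/// Simplification: a `..` at the root is silently ignored here; a real
/// implementation would have to choose between returning an error and panicking.
pub fn normalize(path: &str) -> Cow<'_, str> {
    // Fast path: no empty, `.` or `..` components means nothing to rewrite.
    let already_normalized = path
        .split('/')
        .all(|part| !matches!(part, "" | "." | ".."));
    if already_normalized {
        return Cow::Borrowed(path);
    }

    // Slow path: rebuild the path from its meaningful components.
    let mut parts: Vec<&str> = Vec::new();
    for part in path.split('/') {
        match part {
            "" | "." => {}
            ".." => {
                parts.pop();
            }
            other => parts.push(other),
        }
    }
    Cow::Owned(parts.join("/"))
}
```

Under these assumptions, a query with `stdlib/os.pyi` borrows the input unchanged, and only inputs such as `stdlib/./os.pyi` pay for a new `String`.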

I also expect that this won't be a very hot path because module resolution is cached. So we only resolve every module once.

crates/ruff_db/src/vendored/path.rs

AlexWaygood · 2024-06-23T18:45:34Z

> Reading through this more, I'm leaning towards keeping it as is because it makes the API more cumbersome to use without very convincing benefits.

Yeah, I am also unsure. But I think part of the issue is that the current implementation on main isn't very principled: it panics if it encounters an unnormalized path, when it should probably return an error instead. #11991 is an alternative to this PR that tries to do the normalization in a more principled way, and it also makes the APIs more cumbersome to use.

> It's unclear to me if we actually do want to do this. I don't think there's anything wrong with VendoredPath::new("root/sub").join("../other") where ../other would no longer be a valid VendoredPath (because it starts with a ../)

Hmm, that's an interesting point. We could possibly have something like a join_str() method instead, that allows you to join a fragment to this path even if the fragment itself is not a valid path. But I agree that that's not ideal...
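A minimal sketch of how such a `join_str()` could behave, assuming the joined result (not the fragment) is what gets validated. `SketchPath`, `EscapesRoot` and `root()` are hypothetical names used only for this illustration; none of them come from the PR.

```rust
/// Hypothetical error: the joined path would climb above the archive root.
#[derive(Debug, PartialEq, Eq)]
pub struct EscapesRoot;

/// Illustrative stand-in for an eagerly normalized path.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SketchPath(String);

impl SketchPath {
    /// The (empty) archive root.
    pub fn root() -> Self {
        Self(String::new())
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }

    /// Joins a raw fragment that need not be a valid path on its own
    /// (for example `../other`) and normalizes the combined result.
    pub fn join_str(&self, fragment: &str) -> Result<Self, EscapesRoot> {
        // Start from the components of the already-normalized base.
        let mut parts: Vec<&str> = self.0.split('/').filter(|p| !p.is_empty()).collect();
        for part in fragment.split('/') {
            match part {
                "" | "." => {}
                ".." => {
                    if parts.pop().is_none() {
                        return Err(EscapesRoot);
                    }
                }
                other => parts.push(other),
            }
        }
        Ok(Self(parts.join("/")))
    }
}
```

In this sketch, joining `../other` onto `root/sub` gives `root/other`, while a fragment with more `..` components than the base has segments is reported as an error. The join still allocates a new string, which is the allocation concern raised earlier in the review.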

codspeed-hq · 2024-06-23T18:54:37Z

CodSpeed Performance Report

Merging #11989 will improve performances by 4.92%

Comparing `alex/eagerly-normalize-vendored-paths` (4bd1fe8) with `main` (068b75c)

Summary

⚡ 1 improvements
✅ 29 untouched benchmarks

Benchmarks breakdown

| | Benchmark | `main` | `alex/eagerly-normalize-vendored-paths` | Change |
|---|---|---|---|---|
| ⚡ | `linter/default-rules[pydantic/types.py]` | 1.9 ms | 1.8 ms | +4.92% |

AlexWaygood · 2024-06-28T12:46:03Z

Leaving this for the time being

[red-knot] Eagerly normalize VendoredPathBufs

eeed154

AlexWaygood added the red-knot Multi-file analysis & type inference label Jun 23, 2024

AlexWaygood requested a review from MichaReiser June 23, 2024 15:06

MichaReiser reviewed Jun 23, 2024

Address review

a2fecdf

MichaReiser reviewed Jun 23, 2024

AlexWaygood mentioned this pull request Jun 23, 2024

Improve normalization of VendoredPaths #11991

Closed

fix bug

0775bc7

AlexWaygood force-pushed the alex/eagerly-normalize-vendored-paths branch from 38a84d2 to 0775bc7 Compare June 23, 2024 18:40

fix docs

1d6bb53

AlexWaygood added 2 commits June 23, 2024 21:55

Panic if we encounter prefixes

9d608f4

improve

4bd1fe8

AlexWaygood force-pushed the alex/eagerly-normalize-vendored-paths branch from 8d377c3 to 4bd1fe8 Compare June 23, 2024 21:04

AlexWaygood closed this Jun 28, 2024

AlexWaygood deleted the alex/eagerly-normalize-vendored-paths branch June 28, 2024 12:46
