-
Notifications
You must be signed in to change notification settings - Fork 722
support pnp resolver #1876
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
support pnp resolver #1876
Conversation
logger.Log(fmt.Sprintf("ATA:: Installed typings %v", packageNames)) | ||
var installedTypingFiles []string | ||
resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "") | ||
resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
todo: add pnp support later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you plan to do it in this PR, we this should have a comment in the code.
} | ||
|
||
// need fixtures to be yarn install and make global cache | ||
func TestGlobalCache(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These test cases were taken from pnp-rs, but since they are relatively complex to run and maintain.
I think it would be fine to remove them if they’re not considered necessary.
Please let me know your thoughts!
@microsoft-github-policy-service agree |
internal/compiler/host.go
Outdated
pnpResolutionConfig := TryGetPnpResolutionConfig(currentDirectory) | ||
|
||
if pnpResolutionConfig != nil { | ||
fs = pnpvfs.From(fs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On second thought, this way I wouldn’t be able to take advantage of the cached VFS, so I’m thinking of moving its location instead.
You said some of this comes from pnp-rs. Is it a pure port? What is the license of that code? Do you own it? (You signed the CLA, but that may not be enough if it's not yours.) |
@jakebailey As far as I understand, this means I need to include the original author’s license and copyright notice in the comments. Would this cause any issues? If everything is fine, I’ll go ahead and add the appropriate BSD 2-Clause License notice. |
We could pull it, but would need to explicitly declare that dependence in some extra metadata to generate NOTICE.txt (something we have avoided entirely by ensuring all code was written by us from spec or tests, but it's not really different than other deps). But you definitely need to declare that there are files derived from code under a different license. |
@jakebailey |
No, the files containing the code need to have headers with said license. For example, these tests:
|
Thank you. I’ve added the license notice as suggested. @jakebailey |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A "few" comments; my overall worry is that the code style is not so much similar to our code, but maybe that's okay.
(My reviewing it is not really saying yes or no to the PR, we haven't discussed it)
internal/module/pnp/lib.go
Outdated
return "", false | ||
} | ||
|
||
var rePNP = regexp.MustCompile(`(?s)(const[\ \r\n]+RAW_RUNTIME_STATE[\ \r\n]*=[\ \r\n]*|hydrateRuntimeState\(JSON\.parse\()'`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This scares me a bit; because of the regex, but also because this reads JS files? Does the spec really require such a thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! esbuild uses the JS AST for parsing. Since I was primarily focused on porting pnp-rs, I initially brought the code over as-is, but as you mentioned in another comment, there are definitely some risky areas. I believe the TypeScript Go codebase also includes a JS parser, so I’ll try using that for parsing instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the official spec, in a Node.js environment, pnp.cjs patches Node’s module resolution, so it’s executed first to override the default behavior. In Go, however, this part needs to be implemented manually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ultimately, parsing pnp.cjs is necessary to read the metadata, so this part seems unavoidable.
If we parse it using an AST like esbuild does, it might be a bit more reliable — would that be okay?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That makes more sense to me. Though in general I really thought that PnP had been specified to just emit a plain JSON file...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a9dbc68
I’ve changed the implementation from using regex to parsing with an AST.
The original pnp-rs implementation (the official one by the Yarn team) also uses regex, so I believe the regex-based approach is stable.
However, if switching to the AST-based approach is considered better, I’m happy to proceed with it — the code works fine as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my project, where pnp.cjs is around 200,000 lines, the regex implementation performed slightly better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://github.com/yarnpkg/pnp-rs was happy with a regex, then I don't mind behind happy with a regex. I don't know if this is even in the hot path. My main concern was just whether or not it was "stable".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. pnp-rs uses regex while esbuild uses AST, and both are stable implementations.
As you mentioned, this isn’t a hot path and there shouldn’t be any significant difference in performance, so I think it’s fine.
I’ll keep the AST-based implementation as it is in the commit.
internal/module/pnp/manifest.go
Outdated
} | ||
|
||
func (r *RegexDef) compile() (*regexp.Regexp, error) { | ||
return regexp.Compile(r.Source) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail because regex
is re2 syntax, not ECMAScript. You'd have to use regexp2
in ECMAScript mode.
That being said, is this really ever used? Does the PnP spec really require regex parsing? (How do esbuild and so on do this without pulling on totally different regex implementations?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback!
The original ignorePatternData
looks like this:
"ignorePatternData": "(^(?:\\\\.yarn\\\\/sdks(?:\\\\/(?!\\\\.{1,2}(?:\\\\/|$))(?:(?:(?!(?:^|\\\\/)\\\\.{1,2}(?:\\\\/|$)).)*?)|$))$)",
Esbuild also compiles and uses the regex in the same way. Since this is part of the Yarn PnP spec, it seems unavoidable to use a regular expression here.
For now, I’ve optimized this commit to avoid the inefficiency of compiling the regex on every use. f0ee55b
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If esbuild gets away with it, then that's fine I suppose.
Out of curiosity, is this string always the same?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the specification, I understand that this value is dynamic.
yarn berry
@jakebailey Thank you for the thoughtful review! I know feedback like this takes a lot of time, and I truly appreciate it. Reading your comments made me realize how unfamiliar I still am with the TypeScript Go codebase—I’m sorry I didn’t reference the existing code more. Consistent code style is especially important in open source, so I’m happy to adopt all of your suggestions. I’ll start by addressing your comments. |
@jakebailey |
"os" | ||
"path/filepath" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The compiler should not depend on these; any accesses it makes must go through a provided FS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(#1911 will enforce this in the future)
internal/module/pnp/manifest.go
Outdated
IsAlias bool | ||
} | ||
|
||
func (p *PackageDependency) UnmarshalJSON(data []byte) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not strictly required, but all of these UnmarshalJSON functions would perform better and probably be simpler to write using the v2 versions of this, as in UnmarshalJSONFrom(dec *jsontext.Decoder) error
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's not exactly what I meant per se; you want to read from the input in a more streaming manner. I would look at other instances of the method, since by calling Value you are just doing what the old code did and double decoding.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
e697dce
I did some work on it, but the code has gotten quite complicated, so I'm not sure if it's heading in the right direction. There might be some minor memory or performance gains, but I haven’t benchmarked it yet.
|
||
func ResolveConfig(moduleName string, containingFile string, host ResolutionHost) *ResolvedModule { | ||
resolver := NewResolver(host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "") | ||
resolver := NewResolver(host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesting one; nil
here means that one can't load tsconfigs through PnP, which is probably fine, but I am curious how the old PnP patch did this (if at all).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually wanted to get feedback on this part, but I forgot to mention it in the description — thanks for pointing it out!
I think adding PnP resolution would be the right approach, but for now, I set it to nil because I wasn’t sure how best to pass the pnpResolutionConfig from the host to NewResolver given the current structure.
(I’ll also mention this here since the concern is similar to the zipvfs layer you pointed out.)
I considered two possible approaches:
-
Inject both pnpResolutionConfig and zipvfs externally.
-
Handle them inside NewCompilerHost (which is how it’s currently implemented in this PR).
The first approach would make the PnP-related code spread across quite a few parts of the system, which I’d prefer to avoid — I’d like to keep PnP logic as isolated as possible.
The second approach avoids scattering PnP logic across tsgo, but it introduces some implicit behavior in NewCompilerHost and prevents taking advantage of cachedvfs.
From a maintainer’s perspective, do you have any thoughts on which direction would be preferable?
// trim trailing slash makes a bug in packageJsonInfoCache.Set | ||
// like @emotion/react/ -> @emotion/react after packageJsonInfoCache.Set | ||
// check why it's happening in packageJsonInfoCache and need to fix it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, because the cache I think assumes normalized paths with trailing trimmed. Definitely need to ensure paths are minimal like that. I think.
logger.Log(fmt.Sprintf("ATA:: Installed typings %v", packageNames)) | ||
var installedTypingFiles []string | ||
resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "") | ||
resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you plan to do it in this PR, we this should have a comment in the code.
if pnpResolutionConfig != nil { | ||
fs = zipvfs.From(fs) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do wonder if this is the right layer for this. Surely the language server also needs to know this information? For example, we need to read random files to do source mapping in the LS layer, which would have to dig into zips.
But, we don't want cached info to be there indefinitely either.
I also don't know where this should go in relation to the caching layer...
Once the direction for the NewCompilerHost design is decided, I think I can add everything. However, I consider the ata file related to LSP behavior, and I haven’t really tested LSP properly yet, nor do I fully understand it, so I’m a bit concerned about that part. I’m fine either working on it immediately or continuing in the next PR. I can proceed according to your guidance. |
This PR adds support for PnP resolution.
#1875 in the previous discussion, I have internalized pnp-go into the repository.
Please note that the PR has become quite large due to the inclusion of the PnP source and test code. I apologize for the length and appreciate your understanding.
what is pnp
Yarn Plug’n’Play (PnP) is a dependency resolution system that removes the need for a traditional node_modules folder.
describe at #460
how to support
add pnp resolution config in host
add pnp branch in resolver.go