@gun-yu gun-yu commented Oct 11, 2025

This PR adds support for PnP resolution.

Following the previous discussion in #1875, I have internalized pnp-go into the repository.

Please note that the PR has become quite large due to the inclusion of the PnP source and test code. I apologize for the length and appreciate your understanding.

What is PnP

Yarn Plug’n’Play (PnP) is a dependency resolution system that removes the need for a traditional node_modules folder.
It is described in #460.

How it is supported

Add a PnP resolution config to the host

type host struct {
	orchestrator *Orchestrator
	host         compiler.CompilerHost
	// Caches that last only for build cycle and then cleared out
	extendedConfigCache tsc.ExtendedConfigCache
	sourceFiles         parseCache[ast.SourceFileParseOptions, *ast.SourceFile]
	configTimes         collections.SyncMap[tspath.Path, time.Duration]

	// caches that stay as long as they are needed
-	resolvedReferences parseCache[tspath.Path, *tsoptions.ParsedCommandLine]
-	mTimes             *collections.SyncMap[tspath.Path, time.Time]
+	resolvedReferences  parseCache[tspath.Path, *tsoptions.ParsedCommandLine]
+	mTimes              *collections.SyncMap[tspath.Path, time.Time]
+	pnpResolutionConfig *pnp.ResolutionConfig
}

Add a PnP branch in resolver.go

// resolveTypeReferenceDirective
// skip typeRoots because PnP knows exactly where each @types is located.
	if r.resolver.pnpResolutionConfig != nil {
		if resolvedFromNearestNodeModulesDirectory := r.loadModuleFromNearestNodeModulesDirectory(true /*typesScopeOnly*/); !resolvedFromNearestNodeModulesDirectory.shouldContinueSearching() {
			return r.createResolvedTypeReferenceDirective(resolvedFromNearestNodeModulesDirectory, true /*primary*/)
		}
	} else {
		if len(typeRoots) > 0 {
			if r.tracer != nil {
				r.tracer.write(diagnostics.Resolving_with_primary_search_path_0.Format(strings.Join(typeRoots, ", ")))
			}

// loadModuleFromNearestNodeModulesDirectory
if r.resolver.pnpResolutionConfig != nil {
	if result := r.loadModuleFromPNP(priorityExtensions, typesScopeOnly); !result.shouldContinueSearching() {
		return result
	}
} else {
	if result := r.loadModuleFromNearestNodeModulesDirectoryWorker(priorityExtensions, mode, typesScopeOnly); !result.shouldContinueSearching() {
		return result
	}
}

logger.Log(fmt.Sprintf("ATA:: Installed typings %v", packageNames))
var installedTypingFiles []string
- resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "")
+ resolver := module.NewResolver(ti.host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "", nil)
Author

TODO: add PnP support later.

Member

Unless you plan to do it in this PR, this should have a comment in the code.

@gun-yu gun-yu changed the title Feat/add pnp resolver support pnp resolver Oct 11, 2025
}

// need fixtures to be yarn install and make global cache
func TestGlobalCache(t *testing.T) {
Author

These test cases were taken from pnp-rs, but they are relatively complex to run and maintain, so I think it would be fine to remove them if they’re not considered necessary.
Please let me know your thoughts!

@gun-yu
Author

gun-yu commented Oct 12, 2025

@microsoft-github-policy-service agree

pnpResolutionConfig := TryGetPnpResolutionConfig(currentDirectory)

if pnpResolutionConfig != nil {
fs = pnpvfs.From(fs)
Author

On second thought, this way I wouldn’t be able to take advantage of the cached VFS, so I’m thinking of moving its location instead.

@jakebailey
Member

You said some of this comes from pnp-rs. Is it a pure port? What is the license of that code? Do you own it? (You signed the CLA, but that may not be enough if it's not yours.)

@gun-yu
Author

gun-yu commented Oct 13, 2025

@jakebailey
The original source code, pnp-rs, is licensed under the BSD 2-Clause License. As the ported code is derived from that source, it should also fall under the BSD 2-Clause License — I missed that part, thank you for pointing it out.

As far as I understand, this means I need to include the original author’s license and copyright notice in the comments.

Would this cause any issues? If everything is fine, I’ll go ahead and add the appropriate BSD 2-Clause License notice.

@jakebailey
Member

We could pull it, but would need to explicitly declare that dependence in some extra metadata to generate NOTICE.txt (something we have avoided entirely by ensuring all code was written by us from spec or tests, but it's not really different than other deps).

But you definitely need to declare that there are files derived from code under a different license.

@gun-yu
Author

gun-yu commented Oct 13, 2025

@jakebailey
Sorry for the late reply. I just want to confirm my understanding—would it be sufficient to directly add a notice for the BSD 2-Clause License of pnp-rs in the NOTICE.txt file?

@jakebailey
Member

No, the files containing the code need to have headers with said license. For example, these tests:

// Copyright 2018 Ulf Adams

@gun-yu
Author

gun-yu commented Oct 14, 2025

Thank you. I’ve added the license notice as suggested. @jakebailey
b368c1c

Member

@jakebailey jakebailey left a comment

A "few" comments; my overall worry is that the code style is not very similar to the rest of our code, but maybe that's okay.

(My reviewing it is not really saying yes or no to the PR, we haven't discussed it)

return "", false
}

var rePNP = regexp.MustCompile(`(?s)(const[\ \r\n]+RAW_RUNTIME_STATE[\ \r\n]*=[\ \r\n]*|hydrateRuntimeState\(JSON\.parse\()'`)
Member

This scares me a bit; because of the regex, but also because this reads JS files? Does the spec really require such a thing?

Author

Good point! esbuild uses the JS AST for parsing. Since I was primarily focused on porting pnp-rs, I initially brought the code over as-is, but as you mentioned in another comment, there are definitely some risky areas. I believe the TypeScript Go codebase also includes a JS parser, so I’ll try using that for parsing instead.

Author

According to the official spec, in a Node.js environment, pnp.cjs patches Node’s module resolution, so it’s executed first to override the default behavior. In Go, however, this part needs to be implemented manually.

Author

Ultimately, parsing pnp.cjs is necessary to read the metadata, so this part seems unavoidable.
If we parse it using an AST like esbuild does, it might be a bit more reliable — would that be okay?
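
As a rough illustration of the regex-based approach under discussion, the sketch below locates the `RAW_RUNTIME_STATE` assignment and decodes the single-quoted JavaScript string literal that follows it. It is not the PR's code: the pattern is a simplified stand-in for `rePNP`, the function name `extractRawState` is hypothetical, and the escape handling assumes the literal only uses `\'`, `\\`, and backslash-newline continuations.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// rawStateStart is a simplified, hypothetical pattern. The pattern quoted in
// the PR additionally accepts the hydrateRuntimeState(JSON.parse('... form.
var rawStateStart = regexp.MustCompile(`const\s+RAW_RUNTIME_STATE\s*=\s*'`)

// extractRawState returns the JSON text stored in the single-quoted JS string
// literal that follows the RAW_RUNTIME_STATE assignment.
func extractRawState(src string) (string, error) {
	loc := rawStateStart.FindStringIndex(src)
	if loc == nil {
		return "", errors.New("RAW_RUNTIME_STATE not found")
	}
	var sb strings.Builder
	for i := loc[1]; i < len(src); i++ {
		switch c := src[i]; c {
		case '\'':
			// Unescaped quote: end of the literal.
			return sb.String(), nil
		case '\\':
			i++
			if i >= len(src) {
				return "", errors.New("dangling escape")
			}
			if src[i] == '\n' {
				continue // backslash-newline is a line continuation
			}
			// Assumption: any other escaped byte stands for itself.
			sb.WriteByte(src[i])
		default:
			sb.WriteByte(c)
		}
	}
	return "", errors.New("unterminated string literal")
}

func main() {
	// A tiny stand-in for a generated manifest; real files are far larger.
	src := "const RAW_RUNTIME_STATE =\n" +
		`'{\
  "ignorePatternData": "^\\\\.yarn"\
}';`

	raw, err := extractRawState(src)
	if err != nil {
		panic(err)
	}
	var state struct {
		IgnorePatternData string `json:"ignorePatternData"`
	}
	if err := json.Unmarshal([]byte(raw), &state); err != nil {
		panic(err)
	}
	fmt.Println(state.IgnorePatternData)
}
```

The scan is byte-oriented on purpose: the quote and backslash are ASCII, so multi-byte UTF-8 sequences pass through untouched. An AST-based reader would not need these escape assumptions, which is the trade-off being weighed in this thread.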

Member

That makes more sense to me. Though in general I really thought that PnP had been specified to just emit a plain JSON file...

Author

a9dbc68
I’ve changed the implementation from using regex to parsing with an AST.
The original pnp-rs implementation (the official one by the Yarn team) also uses regex, so I believe the regex-based approach is stable.
However, if switching to the AST-based approach is considered better, I’m happy to proceed with it — the code works fine as is.

Author

In my project, where pnp.cjs is around 200,000 lines, the regex implementation performed slightly better.

Member

If https://github.com/yarnpkg/pnp-rs was happy with a regex, then I don't mind being happy with a regex. I don't know if this is even in the hot path. My main concern was just whether or not it was "stable".

Author

I understand. pnp-rs uses regex while esbuild uses AST, and both are stable implementations.
As you mentioned, this isn’t a hot path and there shouldn’t be any significant difference in performance, so I think it’s fine.
I’ll keep the AST-based implementation as it is in the commit.

}

func (r *RegexDef) compile() (*regexp.Regexp, error) {
return regexp.Compile(r.Source)
Member

This will fail because Go's regexp package uses RE2 syntax, not ECMAScript. You'd have to use regexp2 in ECMAScript mode.

That being said, is this really ever used? Does the PnP spec really require regex parsing? (How do esbuild and so on do this without pulling on totally different regex implementations?)
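
To make the RE2 concern concrete, here is a small sketch showing that Go's standard `regexp` package rejects a negative lookahead, a construct that is valid in ECMAScript. The helper name `compilesAsRE2` is made up for this example.

```go
package main

import (
	"fmt"
	"regexp"
)

// compilesAsRE2 reports whether Go's standard regexp package (RE2 syntax)
// accepts the pattern.
func compilesAsRE2(pattern string) bool {
	_, err := regexp.Compile(pattern)
	return err == nil
}

func main() {
	// Negative lookahead is valid ECMAScript but is not part of RE2.
	fmt.Println(compilesAsRE2(`^(?!\.{1,2}$).*$`))

	// A pattern without lookaround compiles under both dialects.
	fmt.Println(compilesAsRE2(`^\.yarn/sdks`))
}
```

Because `regexp.Compile` returns an error rather than panicking, a caller can detect an unsupported pattern at load time and decide how to fall back.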

Author

@gun-yu gun-yu Oct 16, 2025

Thanks for the feedback!
The original ignorePatternData looks like this:

"ignorePatternData": "(^(?:\\\\.yarn\\\\/sdks(?:\\\\/(?!\\\\.{1,2}(?:\\\\/|$))(?:(?:(?!(?:^|\\\\/)\\\\.{1,2}(?:\\\\/|$)).)*?)|$))$)",

Esbuild also compiles and uses the regex in the same way. Since this is part of the Yarn PnP spec, it seems unavoidable to use a regular expression here.

For now, I’ve optimized this commit to avoid the inefficiency of compiling the regex on every use. f0ee55b
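
One common way to avoid recompiling on every use is to compile lazily and cache the result. The sketch below is an assumption about the general shape, not the code in f0ee55b; `lazyRegex` and its fields are hypothetical, and it uses the standard `regexp` package purely to stay self-contained.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// lazyRegex compiles its source at most once, on first use.
type lazyRegex struct {
	source   string
	once     sync.Once
	re       *regexp.Regexp
	err      error
	compiles int // counts compilations; kept only for the demonstration
}

// get returns the cached compiled pattern, compiling it on the first call.
func (l *lazyRegex) get() (*regexp.Regexp, error) {
	l.once.Do(func() {
		l.compiles++
		l.re, l.err = regexp.Compile(l.source)
	})
	return l.re, l.err
}

func main() {
	lr := &lazyRegex{source: `^\.yarn/`}
	for _, p := range []string{".yarn/sdks", "src/index.ts"} {
		re, err := lr.get()
		if err != nil {
			panic(err)
		}
		fmt.Println(p, re.MatchString(p))
	}
	fmt.Println("compilations:", lr.compiles)
}
```

`sync.Once` also makes the first compilation safe when several goroutines resolve modules concurrently.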

Member

If esbuild gets away with it, then that's fine I suppose.

Out of curiosity, is this string always the same?

Author

@gun-yu gun-yu Oct 21, 2025

According to the specification, I understand that this value is dynamic.
yarn berry

@gun-yu
Author

gun-yu commented Oct 16, 2025

@jakebailey Thank you for the thoughtful review! I know feedback like this takes a lot of time, and I truly appreciate it. Reading your comments made me realize how unfamiliar I still am with the TypeScript Go codebase—I’m sorry I didn’t reference the existing code more. Consistent code style is especially important in open source, so I’m happy to adopt all of your suggestions. I’ll start by addressing your comments.

@gun-yu
Author

gun-yu commented Oct 18, 2025

@jakebailey
I believe I’ve incorporated all the review feedback.
Please let me know if there’s anything I might have overlooked.

@gun-yu gun-yu requested a review from jakebailey October 18, 2025 18:36
Comment on lines +5 to +6
"os"
"path/filepath"
Member

The compiler should not depend on these; any accesses it makes must go through a provided FS.

Member

(#1911 will enforce this in the future)
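
The point about routing all access through a provided FS can be sketched as follows. The `FS` interface here is a hypothetical, minimal stand-in and not the repository's actual VFS type; `findPnpManifest` and `mapFS` are likewise invented for the example, and paths are assumed to be absolute with forward slashes.

```go
package main

import (
	"fmt"
	"strings"
)

// FS is a hypothetical, minimal stand-in for the file system abstraction
// handed to the compiler.
type FS interface {
	FileExists(path string) bool
}

// mapFS is an in-memory FS, which makes the lookup trivially testable.
type mapFS map[string]bool

func (m mapFS) FileExists(path string) bool { return m[path] }

// findPnpManifest walks up from dir looking for ".pnp.cjs", touching the
// file system only through fs rather than through os or path/filepath.
func findPnpManifest(fs FS, dir string) (string, bool) {
	dir = strings.TrimRight(dir, "/")
	for {
		candidate := dir + "/.pnp.cjs"
		if fs.FileExists(candidate) {
			return candidate, true
		}
		i := strings.LastIndexByte(dir, '/')
		if i < 0 {
			return "", false // already checked the root
		}
		dir = dir[:i]
	}
}

func main() {
	fs := mapFS{"/repo/.pnp.cjs": true}
	fmt.Println(findPnpManifest(fs, "/repo/packages/app"))
	fmt.Println(findPnpManifest(fs, "/elsewhere"))
}
```

Because the function depends only on the interface, the same lookup works against a cached, virtual, or zip-backed file system.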

IsAlias bool
}

func (p *PackageDependency) UnmarshalJSON(data []byte) error {
Member

Not strictly required, but all of these UnmarshalJSON functions would perform better and probably be simpler to write using the v2 versions of this, as in UnmarshalJSONFrom(dec *jsontext.Decoder) error.

Author

Member

That's not exactly what I meant per se; you want to read from the input in a more streaming manner. I would look at other instances of the method, since by calling Value you are just doing what the old code did and double decoding.
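
The difference between buffering a value and then decoding it again, versus reading tokens directly, can be illustrated with the standard library. This sketch deliberately uses `encoding/json`'s `Decoder.Token` rather than the `jsontext.Decoder` API mentioned above, so it shows the streaming idea only. The `Name` and `Reference` fields and the two accepted shapes (a plain string, or a two-element array for an alias) are assumptions about `PackageDependency`, not taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// PackageDependency is a guess at the shape; only IsAlias appears in the PR excerpt.
type PackageDependency struct {
	Name      string // assumed: set only for aliases
	Reference string // assumed
	IsAlias   bool
}

// decodeDependency reads one dependency straight from the token stream,
// without first capturing the raw bytes and unmarshalling them a second time.
func decodeDependency(dec *json.Decoder) (PackageDependency, error) {
	var zero PackageDependency
	tok, err := dec.Token()
	if err != nil {
		return zero, err
	}
	switch v := tok.(type) {
	case nil: // JSON null
		return zero, nil
	case string:
		return PackageDependency{Reference: v}, nil
	case json.Delim:
		if v != '[' {
			return zero, fmt.Errorf("unexpected delimiter %q", v)
		}
		var parts [2]string
		for i := range parts {
			t, err := dec.Token()
			if err != nil {
				return zero, err
			}
			s, ok := t.(string)
			if !ok {
				return zero, fmt.Errorf("expected string, got %v", t)
			}
			parts[i] = s
		}
		// Consume the closing bracket so the decoder is left in a clean state.
		end, err := dec.Token()
		if err != nil {
			return zero, err
		}
		if d, ok := end.(json.Delim); !ok || d != ']' {
			return zero, fmt.Errorf("expected ']', got %v", end)
		}
		return PackageDependency{Name: parts[0], Reference: parts[1], IsAlias: true}, nil
	default:
		return zero, fmt.Errorf("unexpected token %v", tok)
	}
}

// parseDependency is a small convenience wrapper for the demonstration.
func parseDependency(s string) (PackageDependency, error) {
	return decodeDependency(json.NewDecoder(strings.NewReader(s)))
}

func main() {
	for _, in := range []string{`"npm:1.0.0"`, `["lodash", "npm:4.17.21"]`} {
		dep, err := parseDependency(in)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%+v\n", dep)
	}
}
```

The token-based version is longer than a call to `json.Unmarshal`, which matches the complexity concern raised in the reply below; whether the saving is worth it would need a benchmark.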

Author

e697dce
I did some work on it, but the code has gotten quite complicated, so I'm not sure if it's heading in the right direction. There might be some minor memory or performance gains, but I haven’t benchmarked it yet.


func ResolveConfig(moduleName string, containingFile string, host ResolutionHost) *ResolvedModule {
- resolver := NewResolver(host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "")
+ resolver := NewResolver(host, &core.CompilerOptions{ModuleResolution: core.ModuleResolutionKindNodeNext}, "", "", nil)
Member

This is an interesting one; nil here means that one can't load tsconfigs through PnP, which is probably fine, but I am curious how the old PnP patch did this (if at all).

Author

I actually wanted to get feedback on this part, but I forgot to mention it in the description — thanks for pointing it out!

I think adding PnP resolution would be the right approach, but for now, I set it to nil because I wasn’t sure how best to pass the pnpResolutionConfig from the host to NewResolver given the current structure.

(I’ll also mention this here since the concern is similar to the zipvfs layer you pointed out.)

I considered two possible approaches:

  1. Inject both pnpResolutionConfig and zipvfs externally.

  2. Handle them inside NewCompilerHost (which is how it’s currently implemented in this PR).

The first approach would make the PnP-related code spread across quite a few parts of the system, which I’d prefer to avoid — I’d like to keep PnP logic as isolated as possible.

The second approach avoids scattering PnP logic across tsgo, but it introduces some implicit behavior in NewCompilerHost and prevents taking advantage of cachedvfs.

From a maintainer’s perspective, do you have any thoughts on which direction would be preferable?

Comment on lines +16 to +18
// trim trailing slash makes a bug in packageJsonInfoCache.Set
// like @emotion/react/ -> @emotion/react after packageJsonInfoCache.Set
// check why it's happening in packageJsonInfoCache and need to fix it
Member

Yes, I think that's because the cache assumes normalized paths with the trailing slash trimmed. We definitely need to ensure paths are minimal like that, I think.
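
A minimal sketch of why a trailing slash matters for a path-keyed cache: the two spellings are different map keys unless they are normalized first. `normalizeDir` is a hypothetical helper for illustration, not the repository's path utility, and the plain map stands in for packageJsonInfoCache.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeDir trims trailing slashes while keeping a bare root "/" intact.
func normalizeDir(p string) string {
	trimmed := strings.TrimRight(p, "/")
	if trimmed == "" && strings.HasPrefix(p, "/") {
		return "/"
	}
	return trimmed
}

func main() {
	cache := map[string]string{}

	// Stored under the normalized key.
	cache[normalizeDir("/pkgs/@emotion/react/")] = "package.json info"

	// A raw lookup with the trailing slash misses the entry.
	_, rawHit := cache["/pkgs/@emotion/react/"]

	// Normalizing at the call site finds it.
	_, normHit := cache[normalizeDir("/pkgs/@emotion/react/")]

	fmt.Println(rawHit, normHit)
}
```

Normalizing once at the boundary where PnP hands paths to the resolver keeps the rest of the code free of per-call trimming.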

Comment on lines +59 to +61
if pnpResolutionConfig != nil {
fs = zipvfs.From(fs)
}
Member

I do wonder if this is the right layer for this. Surely the language server also needs to know this information? For example, we need to read random files to do source mapping in the LS layer, which would have to dig into zips.

But, we don't want cached info to be there indefinitely either.

I also don't know where this should go in relation to the caching layer...
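
For context on what a zip-aware layer has to do, wherever it ends up sitting, the sketch below splits a path that points inside an archive and reads the inner file with the standard `archive/zip` package. It is an illustration under assumptions, not the PR's zipvfs: `splitZipPath`, `readFromZip`, and `buildZip` are hypothetical names, and the archive is built in memory so the example is self-contained.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// splitZipPath splits "/cache/pkg.zip/node_modules/pkg/index.js" into the
// archive path and the path inside the archive.
func splitZipPath(p string) (archive, inner string, ok bool) {
	const marker = ".zip/"
	i := strings.Index(p, marker)
	if i < 0 {
		return "", "", false
	}
	return p[:i+len(".zip")], p[i+len(marker):], true
}

// readFromZip reads the file named inner from the archive bytes.
func readFromZip(archive []byte, inner string) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(archive), int64(len(archive)))
	if err != nil {
		return nil, err
	}
	f, err := zr.Open(inner)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return io.ReadAll(f)
}

// buildZip creates a one-file archive in memory for the demonstration.
func buildZip(name, content string) ([]byte, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write([]byte(content)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildZip("node_modules/pkg/index.js", "export {}")
	if err != nil {
		panic(err)
	}
	archive, inner, ok := splitZipPath("/cache/pkg.zip/node_modules/pkg/index.js")
	if !ok {
		panic("not a zip path")
	}
	content, err := readFromZip(data, inner)
	if err != nil {
		panic(err)
	}
	fmt.Println(archive, inner, string(content))
}
```

Opening and indexing an archive is the expensive step, which is why the placement relative to the caching layer, and how long opened archives stay cached, matters for both the compiler and the language server.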

@gun-yu
Author

gun-yu commented Oct 21, 2025

Unless you plan to do it in this PR, this should have a comment in the code.

Once the direction for the NewCompilerHost design is decided, I think I can add everything.

However, I consider the ATA file to be related to LSP behavior. I haven’t really tested the LSP properly yet, nor do I fully understand it, so I’m a bit concerned about that part.

I’m fine either working on it immediately or continuing in the next PR. I can proceed according to your guidance.
