Replies: 19 comments
-
@nickpape-msft FYI
-
The current format of shrinkwrap.yaml can't be merged automatically by Git. I think what you need is a custom git merge driver for it.
-
I didn't understand this comment. Aside from a minor bit of avoidable networking nondeterminism, the node_modules folder is basically a function of whatever's in your package.json files.
-
BTW on this page I saw a note:
This seems like it would complicate the setup for people who want to enlist in our monorepo.
-
As it is right now, one version spec can be resolved differently for different packages in node_modules:

```yaml
/foo/1.0.0:
  dependencies:
    qar: 1.0.0
...
/bar/1.0.0:
  dependencies:
    qar: 1.1.0
```

A postinstall script can configure the merge driver.
-
This problem is turning into a significant source of pain for us. Our shrinkwrap.yaml file now changes at least 3 times per day. That doesn't sound like a lot, but if the 3 people making those changes are trying to merge their PRs around the same time, it can play out like this:

Lately this cycle seems to be recurring several times per week.
Sure, but as long as that decision is made deterministically, it doesn't need to be stored. In other words, as long as Person 1 and Person 2 are making unrelated changes, in principle they aren't actually "conflicting". (Whereas if Person 1 tried to upgrade a library, but Person 2 tried to downgrade it, then that's a legitimate merge conflict, and no one would blame the tooling.)

The "qar" merge conflict arises only because shrinkwrap.yaml is storing the detailed output of the PNPM algorithm, instead of storing a minimal set of inputs. If I understand right, the reason is that the PNPM version selection algorithm is potentially nondeterministic about certain choices that it makes. (Even if this nondeterminism provides a 10% speed gain for installs, we'd be willing to forgo it if that's what's needed to make Person 2's life happier.)

CC @lahuey
-
I published a "beta" version of the merge driver: @pnpm/merge-driver. To set it up, first install the driver globally: Then run this command in the repo that gets the conflicts on merge:

P.S. The installation steps will get easier later, but I wanted to publish it sooner since this seems to be a big problem for you. The auto-merger will only fail if the same fields were modified in package.json, so it should solve all the issues.
-
Awesome, thanks! We'll give it a try and provide some feedback.
-
I read up on Git merge drivers this weekend. In order to use this, we would need to introduce setup scripts that enable the merge driver for each person's enlistment. We would also need to configure the lab machines to install and enable the merge driver before they perform the merge (e.g. for a PR build). This work would have some benefits (e.g. it would also give us a way to use Git hooks, which might be useful), but it's a nontrivial cost.

Currently VSTS does the merge at the start of the build definition, so we would need to set up some command that installs and configures the merge driver beforehand. Since it's independent of Git, this setup would need to be compatible with any branch at any point in the Git history. I'm also wondering about the implementation of the PNPM driver.

Anyway, I'll see what's involved to give this a try. If the algorithm doesn't work reliably, we will find out very quickly. Our current shrinkwrap.yaml gets a lot of activity throughout the day. We now have a steady influx of new projects coming into our monorepo every week, which has caused us to start thinking a lot more deeply about scalability.
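For reference, wiring up any custom Git merge driver involves two pieces: a driver definition in Git config and a .gitattributes rule routing the file to it. A generic sketch (the driver name and command below are placeholders, not the actual @pnpm/merge-driver invocation):

```shell
# Sketch of enabling a custom Git merge driver. "demo-driver" and
# "my-merge-tool" are illustrative names, not a real tool.
set -e
tmp=$(mktemp -d)
cd "$tmp"
git init -q

# 1. Define the driver in Git config. %O, %A, %B, %P are Git's
#    placeholders for the ancestor, current, other, and pathname.
git config merge.demo-driver.name "example lockfile merge driver"
git config merge.demo-driver.driver "my-merge-tool %O %A %B %P"

# 2. Route the file to the driver via .gitattributes.
echo "shrinkwrap.yaml merge=demo-driver" >> .gitattributes

git config merge.demo-driver.driver  # prints: my-merge-tool %O %A %B %P
```

Because the `git config` half lives outside the repository's history, every clone (including CI agents) has to run it before merging, which is exactly the enlistment-setup cost described above.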
-
BTW a third possible approach occurred to me: suppose that we treated shrinkwrap.yaml as a cache file that gets updated periodically by a bot.

The obvious problem with this approach is nondeterminism. However, suppose that PNPM had a hook that could say "when querying the NPM registry, ignore any package versions that were published after timestamp X". Rush would determine X by looking at the Git commit times (e.g. the maximum commit timestamp across the input package.json files). Since NPM registries don't allow a version to be republished, and people almost never unpublish a version, the behavior would be pretty deterministic.

A downside of this approach is that we would lose the integrity checks. An upside is that our Git history wouldn't be flooded with constant changes to a huge YAML file.
-
We can't figure out an easy way to enable a merge driver in our lab. Generally people seem to have a negative reaction to this design, because they don't like the idea of tampering with the behavior of a foundational and already complex system such as Git.

@zkochan what would be involved for the other approach? Today, is there a way to tell "pnpm install" to ignore any version from the registry that was published after timestamp X? If not, would it be a nontrivial work item? I believe that would give Rush the building block it needs to move shrinkwrap changes to be bot-managed operations on the master branch, rather than something people have to deal with on their PR branches.
-
@rarkins I noticed that @renovate-bot can solve the conflicts in lockfiles, and I don't think that a custom git merge driver is used. Could you please share how renovate resolves conflicts in the lockfiles?

EDIT: I guess I understand: the bot notices that there is a conflict, removes the shrinkwrap, creates a new one, and resubmits it.
-
I think that would be possible to do with npm-hosted dependencies.
-
There are several conflicting requirements that this idea has to satisfy:

Thinking about this more, a naive timestamp-based approach would not fully solve requirement 1. Counterexample:

This is counterintuitive because the final merge upgrades us to A 1.3.2, which was not tested in either branch. It could cause a build break in master. The idea could be refined by remembering multiple timestamps, but it gets a little complicated... But since the PR builds always use a hot merge with the latest master, perhaps these breaks would be very rare in practice. If it's our best option, it might be a tolerable way to avoid merge conflicts. The implementation seems fairly straightforward (since Rush is already managing a temp/shrinkwrap.yaml).
-
@zkochan yeah, you guessed it - no magic on Renovate's side
-
That would definitely cause merge conflicts for everyone's PR branches.
-
#1395 has been raised about merge conflicts specifically.
-
I made some updates to the formatting of the lockfile to reduce merge conflicts: #3195. Though we still cannot guarantee that lockfile merges don't cause issues, I created #3202 to address that.

I don't know if there is a silver bullet here. There are pros and cons to each approach. I don't think I will make a breaking change to the lockfile format in pnpm v6, but we can discuss alternative formats as proposed in this issue.
-
Lately we've had an increasing amount of churn in our shrinkwrap.yaml file, which is causing annoying merge conflicts for people. For example, if a person's PR build takes 45 minutes to finish, during that time an automated bot may commit package.json updates that cause a merge conflict when the build finally completes. Our repo’s .gitattributes prevents automerging of shrinkwrap.yaml (i.e. classifies it as a "binary"), because its complicated structure (a serialized directed-acyclic-graph) could be corrupted by a merge.
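Disabling automerge for a file in .gitattributes is done by assigning the built-in binary merge strategy; the rule for this scenario would look something like this (the exact line in our repo may differ):

```
shrinkwrap.yaml merge=binary
```

With `merge=binary`, Git keeps the current branch's version of the file and leaves the path in a conflicted state instead of attempting a textual merge.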
We realized that today’s shrinkwrap is solving several unrelated problems:
1. Determinism: Ensure different people on different days will install the same versions, for a given branch
2. Sanity: Ensure that versions are mapped to reasonable results (e.g. avoid side-by-side duplication, counterintuitive downgrades, etc)
3. Integrity: Use a hash to detect cases where an NPM version was republished, replacing its contents with something inconsistent
4. Caching: Store certain additional metadata to avoid blocking “pnpm install” behind an extra REST call
5. Reporting: Provide a complete report of all versions being installed, including the full topology of the directed-acyclic-graph
Hypothetically, if we only care about eliminating merge conflicts, we actually only need #1 and #2.
For an extreme thought experiment, consider this package.json:
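A package.json of the kind described might look like this (the specific versions and the fs-extra dependency are illustrative assumptions; only jju and @types/node use SemVer ranges):

```json
{
  "name": "thought-experiment",
  "dependencies": {
    "jju": "~1.3.0",
    "@types/node": "^8.0.0",
    "fs-extra": "5.0.0"
  }
}
```

Here fs-extra is pinned exactly; in this sketch it is what pulls in the indirect jsonfile dependency.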
We need shrinkwrap to deterministically choose a version for jju and @types/node, but all other versions (including the indirect dependency on jsonfile) are fully determined by the package.json file itself. In theory, PNPM could calculate a unique consistent shrinkwrap.yaml for the above package.json, using only this additional information:
The key idea is that this file format is only consulted when SemVer ranges are encountered.
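One way such a minimal decisions file could look (the format and versions are illustrative, not an actual pnpm format):

```yaml
# Hypothetical "range decisions" file: one entry per SemVer range that
# required a choice. Exact versions in package.json need no entry here,
# since they are fully determined by package.json itself.
rangeResolutions:
  "jju@~1.3.0": 1.3.2
  "@types/node@^8.0.0": 8.10.54
```

Because each entry is an independent key/value line, two branches that decide different ranges touch different lines and auto-merge cleanly.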
What about integrity checks (#3)? To support that, we would need to expand the file format to include every referenced dependency again, something like this:
This is going to churn more, but it's still completely safe for Git auto-merging, because additions/removals aren't reshuffling a serialized directed-acyclic-graph structure. One possible optimization would be to make these extra records optional; a human can avoid including them in their PR, if they know that the bot will come along an hour later and add them.
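Expanded for integrity checking, the file could additionally carry one flat record per referenced package, e.g. (format illustrative, hashes elided):

```yaml
# Flat, order-independent records: additions and removals don't reshuffle
# a serialized graph structure, so Git's line-based auto-merge stays safe.
integrity:
  "jju@1.3.2": "sha512-<hash>"
  "jsonfile@4.0.0": "sha512-<hash>"
  "@types/node@8.10.54": "sha512-<hash>"
```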
What about caching (#4)? The "dependencies" block could be reintroduced, if that avoids some REST calls.

What about reporting (#5)? This seems counterproductive to me. If we want a nice report of our tree structure, it would be better to generate a proper HTML report and attach it e.g. as a VSTS build artifact. We don't like to churn our Git history with machine-generated reports, and YAML isn't even a particularly readable format.
Thoughts?