-
Notifications
You must be signed in to change notification settings - Fork 30.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tools: add a backport queue cron action #56143
base: main
Are you sure you want to change the base?
Conversation
Review requested:
|
/cc @nodejs/backporters @nodejs/releasers |
matrix: | ||
branch: | ||
- { release-line: v23.x, base-branch: main } | ||
- { release-line: v22.x, base-branch: v23.x } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should be carefule with this because v23.x
may contain newly released commits that should not be landed during two weeks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a problem, we'll get notified before the commit is ready whether it lands cleanly on v22.x, it doesn't mean we'll have to backport it right away
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to my comment above, I think having the workflow be manually ran makes more sense to start with, in that case these values should be parameterized instead.
make lint-js NODE="$NODE" && \ | ||
make test-doc -j4 NODE="$NODE" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The more I think about this problem space, the more I'm convinced that the right way to go about it is to complete the backport queue ignoring tests and automate a git bisect
workflow that detects when things broke only if we actually have a broken build in the staging branch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running doc tests is quite cheap, I don't see a reason not to
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure thing, that could easily be incorporated into nodejs/node-core-utils#875 too 👍
I'm just making the case that running the full test suite is the most valuable thing and that is very expensive and because of that I've been thinking more and more that bisecting later might actually be the way to go, even for automation.
echo "Branch doesn't exist yet" | ||
fi | ||
env: | ||
BACKPORT_BRANCH: ${{ matrix.branch.release-line }}-staging-auto-backport |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have to keep in mind that the purpose of the v<MAJOR>.x-staging
branches is precisely to serve as a working area for the releaser, in the sense that the releaser is free to break builds, force-push, and modify it however they want when preparing a release.
With that in mind, adding a second <MAJOR>-staging-auto-backport
seems like an unnecessary duplication of effort that I'm not sure is adding value here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed that point in the PR description, we cannot use the commits backported by the automation, so it makes more sense to use a different branch IMO. It could also be on a separate repo if we don't want to clutter nodejs/node with that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, let me copy/paste each point so that we can discuss each one in more detail:
It's only able to work with commits that cleanly apply to the staging branch, which is trivial for a human to do.
It's not trivial and very time consuming, particularly because these "trivial to cherry-pick" commits are mixed up with many conflicting commits that may or not be simple for the releaser to handle it themselves. This automation should be precisely about separating this two types of commits, whatever lands cleanly can go in the staging branch and whatever needs backport can be handled later - later, as in the next 10mins of work for that releaser or throughout the week, it gives more control to the releaser on how they want to handle it.
No matter how small of a conflict, as soon as a commit doesn't backport it is skipped, and having out of order backports makes branches diverge more and more, making future backport harder.
Agreed, that's why the report generated by nodejs/node-core-utils#875 is in the correct order so that the releaser can do a best-effort work to try and land the easy-to-manage conflicted commits in the right order. Any non-trivial conflict will still require the backport from (likely) the original PR author as usual.
Releasers sign their backport commits. Because commits that land on staging branch receive much fewer attention than the one on main, we don't want anyone to be tempted to introduce subtle backdoor when backporting, having the signature puts the releaser reputation at stake. I don't think we could have the same system with a bot while making sure no one can impersonate the bot.
I think you're talking about the manual backport process when you mention "we don't want anyone to be tempted to introduce subtle backdoor when backporting" - that, has its own process, which consists in opening a new PR that still needs to be reviewed manually (and approved at least by the releaser) and which we usually run tests again, in order to land in the staging branch.
I'm failing to see how the bot impersonation plays out in that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm failing to see how the bot impersonation plays out in that case.
If the bot is pushing commits on the staging branch, it would be very easy for a malicious actor to impersonate the bot and push a commit that contains e.g. a backdoor. It's not a problem on main
which has a lot of eyes on it, but on staging branches, it would be very hard to spot, and possibly tricky to track down. Currently, the way we have things where only humans are landing stuff on the staging branches puts a good incentive not to be malicious actor, and to be careful not to leak our private keys, as the blame would go to whoever signed the malicious commit.
that, has its own process
I think it would be silly to assume a malicious actor would follow process 🙃
on: | ||
schedule: | ||
# Run twice a week at 00:15 AM UTC on Sunday and Thursday. | ||
- cron: 15 0 * * 0,4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would start with a manually run workflow to give it a try. Also if it keep it a manual step we can easily get around the two weeks problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to run the workflow manually, then it has no value, I'd rather use the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see a big value in not having to be at my workstation in order to kickstart a workflow, having the worfklow available in GitHub actions means I can just kickstart it anytime from my phone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having it scheduled means you don't even need to pull up your phone though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure 👍 it makes total sense now that I understand the goal is not to land these commits on staging!
I'd still start with manual runs just to try it out but that's open to you
matrix: | ||
branch: | ||
- { release-line: v23.x, base-branch: main } | ||
- { release-line: v22.x, base-branch: v23.x } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Related to my comment above, I think having the workflow be manually ran makes more sense to start with, in that case these values should be parameterized instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice if we could wait to land this after the Release WG meeting later this week so that we can collect feedback & raise awareness about these upcoming automations for all releasers.
This should help the Release team to keep up with backports, by getting notified sooner when there's a backport conflict.
I've tried compiling and running tests as part of the workflow but it's not reliable and would force us to limit the number of commits to avoid timeouts.
I'm using a different branch, I don't think we want the automation to push to the staging branches because:
main
, we don't want anyone to be tempted to introduce subtle backdoor when backporting, having the signature puts the releaser reputation at stake. I don't think we could have the same system with a bot while making sure no one can impersonate the bot.