Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix RuleRunner to work without BUILD_ROOT file in pytest sandbox #21112

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

cognifloyd
Copy link
Member

@cognifloyd cognifloyd commented Jun 27, 2024

After #20887 (pants 2.22.0.dev2), RuleRunner-based tests do not work in external repos:
https://pantsbuild.slack.com/archives/C0D7TNJHL/p1719344475216619

This test run shows the ValueErrors that show up in RuleRunner-based tests:
https://github.com/StackStorm/st2/actions/runs/9667956659/job/26670955784

The tests were passing in the pants repo because rule_runner.py had an explicit dependency on //BUILD_ROOT:files. So, the BUILD_ROOT file was added to the pytest sandbox for every RuleRunner-based test. Removing the rule_runner.py dependency on //BUILD_ROOT:files causes the RuleRunner-based tests to fail in the pants repo as well.

This issue was exposed because the rust-based NativeOptionParser does not provide an API for overriding the BuildRoot, unlike the python layer which uses a singleton BuildRoot object and allows RuleRunner to replace the build_root, bypassing the sentinel file-lookup logic. The NativeOptionParser requires either the PANTS_BUILDROOT_OVERRIDE env var or a sentinel file (one of BUILD_ROOT, BUILDROOT, or pants.toml) in the current working directory (or a parent of the current working directory) to calculate the build_root.

This PR:

  • Removes dependencies on //BUILD_ROOT:files for some tests that do not actually require it.
  • Removes the explicit dependency on //BUILD_ROOT:files from testutil/rule_runner.py.
  • Makes RuleRunner change the current working directory to the temporary build_root directory while bootstrapping options (ie for the codepaths that initialize the rust-based NativeOptionParser).
  • Makes RuleRunner add the sentinel file in the temporary build_root directory that RuleRunner creates so NativeOptionParser can find it in the current working directory.
  • Adds a RuleRunner.pushd() api method to allow tests to change the current directory to RuleRunner.build_root.
  • Adds dependencies on //BUILD_ROOT:files for tests that relied on the transitive dep via testutil/rule_runner.py. These tests either did not use RuleRunner (eg they only used mock_console), or relied on working in the sandbox with the actual pants code under test (meaning the pytest sandbox IS the build_root for that test).

Repos that use pants.testutil might not have a BUILD_ROOT
file, and most likely would not make any RuleRunner-based tests
depend on that file (adding it to the pytest sandbox). Instead,
the RuleRunner needs to manage adding a sentinel file.

This was not an issue before NativeOptionParser was wired up
in the python layer as part of 2.22. Now, however, NativeOptionParser
needs to be able to detect the build root using either an env var
(PANTS_BUILDROOT_OVERRIDE) or a sentinel file in the cwd.
This makes RuleRunner add that sentinel file.
@cognifloyd cognifloyd added this to the 2.22.x milestone Jun 27, 2024
@cognifloyd cognifloyd added needs-cherrypick category:bugfix Bug fixes for released features external-plugins Issues related to plugins outside this repository labels Jun 27, 2024
@cognifloyd cognifloyd self-assigned this Jun 27, 2024
@@ -24,7 +24,6 @@ python_sources(
"pants_integration_test.py": {
"dependencies": ["//BUILD_ROOT:files", "src/python/pants/__main__.py"]
Copy link
Member Author

@cognifloyd cognifloyd Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also tried removing the //BUILD_ROOT:files dependency from pants_integration_test.py. But that didn't work, and it turned out to be a more invasive change.

Fixing this will require removing or changing tmpdir creation functions in this file that call get_buildroot() repeatedly. We need to change that assumption so the test author can pass in the build root instead of having these functions calculate it from the current working directory, which happens to be the pytest sandbox.

I suspect this issue existed before #20887 (ie it is not a new bug in 2.22.x), but pants_integration_test-based tests are far less common than RuleRunner based tests. Since this is (probably) not a new bug, we can deal with changing the assumptions in this code at some later date.


# Change cwd and add sentinel file (BUILDROOT) so NativeOptionParser can find build_root.
with pushd(self.build_root):
Path("BUILDROOT").touch()
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted to use BUILDROOT instead of BUILD_ROOT so that anyone who needs to inspect a sandbox has a better idea where the file comes from (or can more easily grep for where it comes from).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't fully understand this comment. I'm fine with either one, but did you mean, using BUILDROOT in lieu of BUILD_ROOT so that it lines up with the other usages?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a BUILD_ROOT file at the root of the pants repo. Some tests depend on that (//BUILD_ROOT:files), so it gets copied to the root of the pytest sandbox.

Now, RuleRunner also adds a sentinel file to its own sandboxes. Using the variant with _ gives a subtle indicator of why the file is in the sandbox / what created the file; It gives a subtle indicator that the file is not a copy of the file that //:BUILD_ROOT:files pulls in.

Copy link
Member

@kaos kaos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm :)

Running SpecParser() at import time caused issues with tests that
no longer have the BUILD_ROOT file in the sandbox (now that rule_runner
does not depend on it).
This is where the test relied on the transitive dependency on //BUILD_ROOT
via rule_runner. Since that dep had to be removed to avoid external plugin
issues, we re-add it to tests that cannot easily be adjusted to work without it.
@cognifloyd
Copy link
Member Author

I'm slowly working through all of the test failures to fix them by either using rule_runner.pushd() or adding the explicit dep on //BUILD_ROOT:files just to those tests. I run a few tests locally, but let CI run all of them to see what I missed.

name="tests",
sources=["*_test.py", "!*_integration_test.py"],
overrides={
"publish_test.py": {"timeout": 70},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hit a timeout locally where it took 60.5 seconds, but killed it at 60.3 seconds.

Comment on lines +450 to +453
@contextmanager
def pushd(self):
with pushd(self.build_root):
yield
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new convenience API method for use in tests (and in the rule_runner itself).

@@ -24,7 +24,6 @@ python_sources(
"pants_integration_test.py": {
"dependencies": ["//BUILD_ROOT:files", "src/python/pants/__main__.py"]
},
"rule_runner.py": {"dependencies": ["//BUILD_ROOT:files"]},
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the fix that makes the errors in external repos visible in the pants repo.

@cognifloyd
Copy link
Member Author

Tests are passing now.

@benjyw I would like your feedback on this before I merge it and cherry-pick it to 2.22.x.

@cognifloyd cognifloyd requested a review from kaos June 28, 2024 20:47
docs/notes/2.23.x.md Outdated Show resolved Hide resolved
@sureshjoshi
Copy link
Member

High level, I don't have a good understanding of the originating problem but the changes themselves seem mostly mechanical. I see green checkmarks, so I'm fine with that.

The pushd API doesn't give me the fuzzies, but I don't really know why. I get that it works around the underlying problem, but I guess it makes me a bit nervous that we might write some tests that'll fail, and then we'll have to know/deduce to add that context manager. I guess my lack of understanding of the root cause means that I can't differentiate between a bug fix vs a bandaid. So, take my thoughts with the appropriate quantity of salt.

I'm kinda curious for @benjyw's thoughts on this, as I think he wrote the new native options parser - so I just want to ensure we're not working around something that should be handled at the Rust layer?

@cognifloyd
Copy link
Member Author

cognifloyd commented Jun 30, 2024

High level, I don't have a good understanding of the originating problem but the changes themselves seem mostly mechanical. I see green checkmarks, so I'm fine with that.

👍 The linked slack conversation has a bit more context. Working on this proved confusing for me too.

The pushd API doesn't give me the fuzzies, but I don't really know why. I get that it works around the underlying problem, but I guess it makes me a bit nervous that we might write some tests that'll fail, and then we'll have to know/deduce to add that context manager. I guess my lack of understanding of the root cause means that I can't differentiate between a bug fix vs a bandaid. So, take my thoughts with the appropriate quantity of salt.

I don't much care for it either. I tried (unsuccessfully) to encapsulate all of the wonky cwd management logic within the RuleRunner. I had to do it manually in some tests however, especially tests that create an options bootstrapper. At first I did with pushd(rule_runner.build_root): which felt rather verbose/repetitive, so I ended up encapsulating that in the rule_runner.pushd() method. It's still ugly, and a bit hand-wavy. But it's more explicit hand-waving than all of the tests that inadvertently depended on the BUILD_ROOT file magically being at the root of the pytest sandbox once they imported anything from pants.testutils.rule_runner.

I'm kinda curious for @benjyw's thoughts on this, as I think he wrote the new native options parser - so I just want to ensure we're not working around something that should be handled at the Rust layer?

I wonder about doing something in the rust layer too. It seems like we need a way for tests to tell the rust layer what the build_root should be, without requiring the sentinel file. The python layer has that feature, so adding it to the rust layer sounds reasonable.

But I'm not confident enough with rust to pull that off. I started in that direction, confused myself in the process, and then asked @benjyw which solution he preferred (rust api to override build_root detection or sentinel file in the sandbox) -- he said to add the sentinel file to the sandbox.

Adding the sentinel file ended up being somewhat convoluted as well. So, I'm not sold that this is better than something in the rust layer. But, it at least gets tests to fail like they do in external repos, and then patches up the fallout in the pants repo. So, it's probably good enough for a 2.22.x backport, which I think needs to happen before 2.22.0 final is released.

@kaos
Copy link
Member

kaos commented Jun 30, 2024

I agree with @sureshjoshi about this being mostly of the bandaid kind. The better approach I think would be to not rely on CWD as much as we do now (historically, it was assumed that CWD was always the same as the build root) -- this being the case have added some hoops to jump through for tests that exercises pants from a test sandbox, as then the CWD is not the build root. Those hoops are now starting to fall apart with the move of options into rust land, so this fix mends that by maintaining the assumption about the relationship of CWD with build root, rather than getting rid of the assumption (which I think would be the preferred solution long term, but is likely significantly more work to get done, so I'm 👍🏽 with this as a stop gap in the mean time).

My $.02 :)

Copy link
Member

@sureshjoshi sureshjoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved based on the latest context/comments from @cognifloyd and @kaos

Might be something to review later, but the fact that in-repo plugin tests are broken is pretty important too

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:bugfix Bug fixes for released features external-plugins Issues related to plugins outside this repository needs-cherrypick
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants