Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pip_parse does not respect requirement markers #1105

Closed
arrdem opened this issue Mar 1, 2023 · 11 comments · Fixed by #2059 or #2135
Closed

pip_parse does not respect requirement markers #1105

arrdem opened this issue Mar 1, 2023 · 11 comments · Fixed by #2059 or #2135
Assignees

Comments

@arrdem
Copy link
Contributor

arrdem commented Mar 1, 2023

🐞 bug report

Affected Rule

The issue is caused by the rule: pip_parse (all versions)

Is this a regression?

It appears to be. Previously, rules_python did evaluate platform markers see nikhaldi@323e84e

In 0.16, python/pip_install/extract_wheels/parse_requirements_to_bzl.py will gladly parse a requirements file which uses markers, but does nothing with them. Nor does the new Starlark parser introduced in ba69aec.

It appears this functionality was simply never implemented in the new parser, and was deleted in 37e7e68 when the legacy pip_import implementation was removed.

Description

rules_python makes no attempt to evaluate requirement marker expressions, which leads to false build failures due to attempting to install packages marked as not viable because pip wheel succeeds with a no-op and an error message when presented with a requirement which will be ignored due to markers.

🔬 Minimal Reproduction

When handed a requirements.txt file which uses platform markers to select a different version of an artifact depending on markers everything appears okay. Consider

matplotlib==3.4.3 ; sys_platform != "darwin" or platform_machine != "arm64"
matplotlib==3.4.3.post1+abnormal ; sys_platform == "darwin" and platform_machine == "arm64"

The correct behavior for this requirements file should be to choose between these two versions. Note that the boolean selector condition is disjoint. However the actual behavior is to unconditionally attempt to install the first requirement clause listed in the file. This breaks both bazel fetch //... and any rules referencing the requirement("matplotlib").

The behavior should be to evaluate all the marker expressions and select matching versions/args.

It may be valid for no versions of a package to be selected. Consider tensorflow-macos==2.7.0 ; sys_platform == "darwin". Arguably this should be achieved with platform-specific lockfiles and requirements_darwin, but it shouldn't be an error.

If multiple requirements are selected, that should likely be an error.

🔥 Exception or Error

ERROR: $REPO/BUILD:1:11: $TARGET depends on @pypi_py38_darwin_arm64_matplotlib//:pkg in repository @pypi_py38_darwin_arm64_matplotlib which failed to fetch. no such package '@pypi_py38_darwin_arm64_matplotlib//': whl_library pypi_py38_darwin_arm64_matplotlib failed:
Ignoring matplotlib: markers 'sys_platform != "darwin" or platform_machine != "arm64"' don't match your environment
 (Traceback (most recent call last):
  File "/Users/arrdem/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/Users/arrdem/.pyenv/versions/3.8.16/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/private/var/tmp/_bazel_arrdem/64fd9f51655edad1a031ed416b537465/external/rules_python/python/pip_install/extract_wheels/extract_single_wheel.py", line 105, in <module>
    main()
  File "/private/var/tmp/_bazel_arrdem/64fd9f51655edad1a031ed416b537465/external/rules_python/python/pip_install/extract_wheels/extract_single_wheel.py", line 92, in main
    whl = next(iter(glob.glob("*.whl")))
StopIteration
)
@arrdem
Copy link
Contributor Author

arrdem commented Mar 3, 2023

While implementing support for marker evaluation back on 0.18.1 or before looks quite doable using the underlying pip machinery as before, I don't see a good way to add support for it into the new starlark parser as doing so would require teaching starlark how to both parse and evaluate the marker grammar or putting a Python interpreter back in the parsing flow which would defeat the apparent purpose of the starlark parser.

@aignas
Copy link
Collaborator

aignas commented Mar 13, 2023

@arrdem, does your usecase work better if you have the os specific requirements files as implemented in #1123 and you keep the platform specific includes in the requirements.in as opposed to requirements.txt file?

@mvgijssel
Copy link

mvgijssel commented Mar 14, 2023

Just ran into this issue as well trying to install different versions of torch for different architectures (aarch64 vs x86_64)

A requirements lock file that looks like

torch @ https://download.pytorch.org/whl/cpu/torch-1.13.1%2Bcpu-cp39-cp39-linux_x86_64.whl ; platform_machine == "x86_64" and python_full_version == "3.9.10" \
    --hash=sha256:71636a5c21927236f4974d2355fb3f66a0b707c28219b0135ff65ed0f0e61287
torch @ https://download.pytorch.org/whl/torch-1.13.1-cp39-cp39-manylinux2014_aarch64.whl ; platform_machine == "aarch64" and python_full_version == "3.9.10" \
    --hash=sha256:2c3581a3fd81eb1f0f22997cddffea569fea53bafa372b2c0471db373b26aafc

works locally on a M1 Macbook pro with an arm architecture, but fails in the GitLab CI which is using x86_64 with the error

00:01:12 ERROR: /builds/oVEuX7FM/0/predict/BUILD.bazel:201:11: //predict:lib depends on @predict_torch//:pkg in repository @predict_torch which failed to fetch. no such package '@predict_torch//': whl_library predict_torch failed: Ignoring torch: markers 'platform_machine == "aarch64" and python_full_version == "3.9.10"' don't match your environment

Using the latest rules_python

http_archive(
    name = "rules_python",
    sha256 = "ffc7b877c95413c82bfd5482c017edcf759a6250d8b24e82f41f3c8b8d9e287e",
    strip_prefix = "rules_python-0.19.0",
    url = "https://github.com/bazelbuild/rules_python/releases/download/0.19.0/rules_python-0.19.0.tar.gz",
)

@arrdem
Copy link
Contributor Author

arrdem commented Mar 14, 2023

does your usecase work better if you have the os specific requirements files as implemented in #1123 and you keep the platform specific includes in the requirements.in as opposed to requirements.txt file?

@aignas No, we have architecture specific backport builds which need to be selected in support of Apple ARM hardware and Amazon Graviton. In working through options for how to fit this into the existing rules_python infrastructure the #1123 per-OS requirements selectors helps, but it's not immediately obvious to me that teaching the pip_parse macro to do os+arch based selection would be a good pattern. It seems better to use markers for that, rather than adding full select() style support for arbitrary (kernel, arch) tuples which is the real need here.

@arrdem
Copy link
Contributor Author

arrdem commented Mar 15, 2023

Slept on this a bit - I could see an argument that we should deploy consistent builds of Python artifacts across architectures so we don't have version differences. That'd get us close to being able to just leverage the current requirements_linux= etc. features.

The kicker is that we do have architecture-specific requirements within a given OS. @mvgijssel's example is legitimate, tensorflow for example also has architecture-dependent requirement within a given host platform/kernel. So I don't think we can get away from something markers-equivalent.

@arrdem
Copy link
Contributor Author

arrdem commented May 23, 2023

Hacked around this by writing a python_config repository rule which can use the repository_ctx with access to the host arch and OS and starlark conditionals to generate a config.bzl template file with appropriate contents, simplifying all the multiple-python/pip setup malarky down to

load("//tools/rules:python_config.bzl", "python_config")

python_config(
    name = "pycfg",
    version = python_version,
)

load("@pycfg//:config.bzl", "environment", "extra_pip_args", "requirement_clusters", "requirements_lock")

pip_parse(
    name = "pypi",
    environment = environment,
    extra_pip_args = extra_pip_args,
    requirement_clusters = requirement_clusters,
    requirements_lock = requirements_lock,
)

load("@pypi//:requirements.bzl", pypi_install_deps = "install_deps")

pypi_install_deps()

@OliverFM
Copy link
Contributor

Any progress on this issue @arrdem – I couldn't see if you had merged this in, since it looks like, if it is merged, the name has changed.

If not could you share an example of this kind of rule?
In particular it would be great to change these outputs dependent on whether or not CUDA is present.

Copy link

This issue has been automatically marked as stale because it has not had any activity for 180 days. It will be closed if no further activity occurs in 30 days.
Collaborators can add an assignee to keep this open indefinitely. Thanks for your contributions to rules_python!

@github-actions github-actions bot added the Can Close? Will close in 30 days if there is no new activity label Mar 24, 2024
Copy link

This issue was automatically closed because it went 30 days without a reply since it was labeled "Can Close?"

@keith
Copy link
Member

keith commented Jun 17, 2024

I hit this case too, looks like I could duplicate the requirements files in order to pass different files to requirements_by_platform, but that's a lot of files to keep up to date

@aignas aignas reopened this Jul 12, 2024
@aignas
Copy link
Collaborator

aignas commented Jul 12, 2024

Note, that there are more comments/discussion on this in #1868.

@github-actions github-actions bot removed the Can Close? Will close in 30 days if there is no new activity label Jul 12, 2024
aignas added a commit to aignas/rules_python that referenced this issue Jul 16, 2024
aignas added a commit to aignas/rules_python that referenced this issue Jul 16, 2024
…around

This is extra preparation needed for bazelbuild#2059.

Summary:
- Create `pypi_repo_utils` for more ergonomic handling of Python in repo context.
- Split the resolution of requirements files to platforms into a separate function
  to make the testing easier. This also allows more validation that was realized
  that there is a need for in the WIP feature PR.
- Make the code more robust about the assumption of the target platform label.

Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Jul 16, 2024
…around

This is extra preparation needed for bazelbuild#2059.

Summary:
- Create `pypi_repo_utils` for more ergonomic handling of Python in repo context.
- Split the resolution of requirements files to platforms into a separate function
  to make the testing easier. This also allows more validation that was realized
  that there is a need for in the WIP feature PR.
- Make the code more robust about the assumption of the target platform label.

Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.
github-merge-queue bot pushed a commit that referenced this issue Jul 17, 2024
…around (#2069)

This is extra preparation needed for #2059.

Summary:
- Create `pypi_repo_utils` for more ergonomic handling of Python in repo
context.
- Split the resolution of requirements files to platforms into a
separate function
to make the testing easier. This also allows more validation that was
realized
  that there is a need for in the WIP feature PR.
- Make the code more robust about the assumption of the target platform
label.

Work towards #260, #1105, #1868.
github-merge-queue bot pushed a commit that referenced this issue Jul 17, 2024
…2068)

This is just a small PR to reduce the scope of #2059.

This just moves some code from one python file to a separate one.

Work towards #260, #1105, #1868.
aignas added a commit to aignas/rules_python that referenced this issue Jul 18, 2024
This also changes the local_runtime_repo to explicitly check for
supported platforms instead of relying on a `None` value returned by the
helper method. This makes the behaviour exactly the same to the
behaviour before this PR and we can potentially drop the need for the
validation in the future if our local_runtime detection is more robust.

This also makes the platform detectino in `pypi_repo_utils` not depend
on `uname` and only use the `repository_ctx`. Apparently the
`module_ctx.watch` throws an error if one attempts to watch files on the
system (this is left for `repository_rule` it seems and one can only do
`module_ctx.watch` on files within the current workspace. This was
surprising, but could have been worked around by just unifying code.

This splits out things from bazelbuild#2059 and makes the code more succinct.

Work towards bazelbuild#260, bazelbuild#1105, bazelbuild#1868.
github-merge-queue bot pushed a commit that referenced this issue Jul 19, 2024
This also changes the local_runtime_repo to explicitly check for
supported platforms instead of relying on a `None` value returned by the
helper method. This makes the behaviour exactly the same to the
behaviour before this PR and we can potentially drop the need for the
validation in the future if our local_runtime detection is more robust.

This also makes the platform detectino in `pypi_repo_utils` not depend
on `uname` and only use the `repository_ctx`. Apparently the
`module_ctx.watch` throws an error if one attempts to watch files on the
system (this is left for `repository_rule` it seems and one can only do
`module_ctx.watch` on files within the current workspace. This was
surprising, but could have been worked around by just unifying code.

This splits out things from #2059 and makes the code more succinct.

Work towards #260, #1105, #1868.
@aignas aignas self-assigned this Jul 23, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 15, 2024
Before this change the `all_requirements` and related constants will
have
packages that need to be installed only on specific platforms and will
mean
that users relying on those constants (e.g. `gazelle`) will need to do
extra
work to exclude platform-specific packages. The package managers that
that
support outputting such files now include `uv` and `pdm`. This might be
also
useful in cases where we attempt to handle non-requirements lock files.

Note, that the best way to handle this would be to move all of the
requirements
parsing code to Python, but that may cause regressions as it is a much
bigger
change. This is only changing the code so that we are doing extra
processing
for the requirement lines that have env markers. The lines that have no
markers
will not see any change in the code execution paths and the python
interpreter
will not be downloaded.

We also use the `*_ctx.watch` API where available to correctly
re-evaluate the
markers if the `packaging` Python sources for this change.

Extra changes that are included in this PR:
- Extend the `repo_utils` to have a method for `arch` getting from the
`ctx`.
- Change the `local_runtime_repo` to perform the validation not relying
on the
  implementation detail of the `get_platforms_os_name`.
- Add `$(UV)` make variable for the `uv:current_toolchain` so that we
can
  generate the requirements for `sphinx` using `uv`.
- Swap the requirement generation using `genrule` and `uv` for `sphinx`
and co
so that we can test the `requirement` marker code. Note, the
`requirement`
  markers are not working well with the `requirement_cycles`.

Fixes #1105.
Fixes #1868.
Work towards #260, #1975.
Related #1663.

---------

Co-authored-by: Richard Levasseur <[email protected]>
aignas added a commit to aignas/rules_python that referenced this issue Aug 20, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 21, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
aignas added a commit to aignas/rules_python that referenced this issue Aug 22, 2024
…thon

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefor we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Work towards bazelbuild#1105, bazelbuild#1868.
github-merge-queue bot pushed a commit that referenced this issue Aug 22, 2024
…thon (#2135)

Before this PR the lockfile would become platform dependent when the
`requirements` file would have env markers. This was not caught because
we do not have MODULE.bazel.lock checked into the `rules_python`
repository because the CI is running against many versions and the lock
file is different, therefore we would not be able to run with
`bazel build --lockfile_mode=error`.

With this change we use the label to `BUILD.bazel` which is living next
to the `python` symlink and since the `BUILD.bazel` is the same on all
platforms, the lockfile will remain the same.

Summary
* refactor(uv): create a reusable macro for using uv for locking reqs.
* test(bzlmod): enable testing the MODULE.bazel.lock breakage across
platforms.
* test(bzlmod): use a universal requirements file for 3.9.
This breaks the CI, because the python interpreter file hash is added to
the lock file.
* fix(bzlmod): keep the lockfile platform independent when resolving
python


Fixes #1105 and #1868 for real this time.
Implements an additional helper for #1975.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants