[Feature Request] Dependency-specific index urls #1560

Open
johnpyp opened this issue Jun 9, 2024 · 4 comments

Comments


johnpyp commented Jun 9, 2024

Some packages like PyTorch recommend installing their packages through custom index URLs, e.g. from that page:

# To Install:
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm6.0

Though we could use URL prioritization (with uv to make it consistent) for this, it would be better in this case to support dependency-scoped index URLs. That avoids leaking the extra index URL check to every other dependency, which would otherwise introduce an unexpected supply chain scope for all dependencies.
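To make the scoping concern concrete, here is a minimal sketch of the difference between a global extra index and dependency-scoped indexes. It is only a toy model, not how pip, uv, or Hatch actually resolve packages; the function names and the `requests` dependency are hypothetical and chosen purely for illustration.

```python
# Toy model (assumption): list which indexes may be consulted per dependency.
PYPI = "https://pypi.org/simple"
PYTORCH = "https://download.pytorch.org/whl/rocm6.0"


def global_extra_index(deps, extra_indexes):
    """Every dependency may be looked up on the default index and each extra index."""
    return {dep: [PYPI, *extra_indexes] for dep in deps}


def scoped_index(deps, scoped):
    """Only dependencies named in `scoped` use a custom index; the rest stay on PyPI."""
    return {dep: [scoped.get(dep, PYPI)] for dep in deps}


deps = ["torch", "torchvision", "torchaudio", "requests"]
leaky = global_extra_index(deps, [PYTORCH])
tight = scoped_index(
    deps, {name: PYTORCH for name in ("torch", "torchvision", "torchaudio")}
)

for dep in deps:
    print(f"{dep}: global={len(leaky[dep])} index(es), scoped={len(tight[dep])} index(es)")
```

In this model the unrelated `requests` dependency is exposed to the PyTorch index only in the global case, which is the leak described above.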

@Skypekey

Did you mean this?
https://hatch.pypa.io/dev/config/dependency/#direct-references
I think this is useful


johnpyp commented Jun 13, 2024

I don't think so, as that's specifying an exact artifact to fetch from rather than the registry to resolve the given dependency from.

@Skypekey

Oh, I understand now. You need to specify a PyPI source for certain modules. Forgive my misunderstanding.


polarathene commented Jun 23, 2024

Just sharing my findings here if helpful.


PDM

PDM has a kinda nice way to approach this (falters if you want a project to support multiple PyTorch sources though):

[project]
name = "example"

dependencies = [
    "torch", # Implicitly resolves to `2.3.1+cu121` via configured PyTorch source below
    "torchvision",
    "torchaudio",
]
requires-python = ">=3.10"

[tool.pdm.resolution]
respect-source-order = true

[tool.pdm]
distribution = false

[[tool.pdm.source]]
name = "pytorch"
url = "https://download.pytorch.org/whl/cu121"
include_packages = ["torch", "torchvision", "torchaudio", "nvidia-*"]

The nvidia-* at the end there ensures that the torch deps resolve their implicit nvidia-* packages from the torch index. Otherwise they'd come from PyPI, where presently some of them were resolving to CUDA 12.5 instead of the compatible CUDA 12.1 that these packages were intended to use.

That may be relevant context for you to keep in mind with your request to scope deps, as you may otherwise encounter that same caveat.
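The effect of the nvidia-* pattern can be sketched roughly as below. This assumes glob-style name matching and uses Python's `fnmatch` as a stand-in; it is not PDM's actual implementation, and `preferred_source` is a hypothetical helper.

```python
from fnmatch import fnmatch

# Patterns copied from the [[tool.pdm.source]] example above.
INCLUDE_PACKAGES = ["torch", "torchvision", "torchaudio", "nvidia-*"]


def preferred_source(package, include_packages=INCLUDE_PACKAGES):
    """Return "pytorch" if any pattern matches the package name, else "pypi"."""
    name = package.lower()
    if any(fnmatch(name, pattern) for pattern in include_packages):
        return "pytorch"
    return "pypi"


for pkg in ["torch", "nvidia-cudnn-cu12", "numpy"]:
    print(pkg, "->", preferred_source(pkg))

# Without the wildcard, the transitive nvidia-* deps fall through to PyPI.
without_wildcard = ["torch", "torchvision", "torchaudio"]
print(preferred_source("nvidia-cudnn-cu12", without_wildcard))
```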


They also have optional dependency groups:

dependencies = [
    "torchvision",
    "torchaudio",
]

[project.optional-dependencies]
torch_cpu = ["torch==2.3.1+cpu"]
torch_cuda = ["torch==2.3.1+cu121"]

[[tool.pdm.source]]
name = "pytorch-cuda-12.1"
url = "https://download.pytorch.org/whl/cu121"
include_packages = ["torch_cuda", "torchvision", "torchaudio", "nvidia-*"]

[[tool.pdm.source]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
include_packages = ["torch_cpu", "torchvision", "torchaudio"]

You'd then run a command that selects the optional dep as a group, like pdm install --group torch_cpu. However, this won't work as expected due to the overlapping CUDA package source. You'd need to migrate the packages from dependencies to the optional-dependencies table, with each group providing the explicit local identifier. That enforces a version pin as with torch (you cannot use >=).

If you don't specify the torchvision + torchaudio packages in each source's include_packages, they'd fall back to the default PyPI index for resolution.

  • EDIT: You could alternatively specify them in each group (target_cpu / target_cuda). This is advised for this type of package variance. However, it doesn't avoid the problem of multiple sources with overlapping include_packages: unless you explicitly use local identifiers for these deps, they will still match/resolve to the packages at the undesired indexes.
  • Likewise, there is a known upstream PyTorch issue with +cpu not being compatible/assigned for the ARM64 / aarch64 platform. The pytorch-cpu source index is valid, but the local identifier +cpu must be omitted in that case.
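The overlap problem can be illustrated with a small check that lists which sources claim a given package name. As before, this is a simplified model that assumes glob-style matching via `fnmatch`; it says nothing about how PDM picks between the candidates, and the helper names are hypothetical.

```python
from fnmatch import fnmatch

# include_packages lists copied from the two-source example above.
SOURCES = {
    "pytorch-cuda-12.1": ["torch_cuda", "torchvision", "torchaudio", "nvidia-*"],
    "pytorch-cpu": ["torch_cpu", "torchvision", "torchaudio"],
}


def claiming_sources(package, sources=SOURCES):
    """Return the names of all sources whose patterns match the package."""
    return [
        name
        for name, patterns in sources.items()
        if any(fnmatch(package, pattern) for pattern in patterns)
    ]


def ambiguous(packages, sources=SOURCES):
    """Map each package claimed by more than one source to those sources."""
    result = {}
    for pkg in packages:
        claimed = claiming_sources(pkg, sources)
        if len(claimed) > 1:
            result[pkg] = claimed
    return result


overlaps = ambiguous(["torchvision", "torchaudio", "nvidia-cudnn-cu12"])
for pkg, names in overlaps.items():
    print(pkg, "is claimed by", names)
```

Here torchvision and torchaudio are claimed by both sources, so an unpinned requirement gives no way to say which index was intended; pinning with a local identifier such as +cpu or +cu121 is what disambiguates them.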

Each would need to maintain separate lock files with PDM too. You could work around the pyproject.toml issues mentioned by using separate pyproject.toml files; for PDM at least, there doesn't seem to be interest in improving the flexibility. There is a third-party plugin that provides an alternative way to configure torch deps (it generates separate lock files).

The include_packages setting will bias the package to that source AFAIK, but other packages will still attempt to resolve through all indexes, including these (unless explicitly excluded). Priority seems to be predictable with respect-source-order = true: it follows the order in which the sources are declared, with PyPI as the default, unless you have include_packages declared.


With Rye / Hatch

These are my observations so far at least 😅

  • PDM is looking into adopting uv (which should notably help with an issue I've observed with its cache performance).
  • Hatch delegates to pip / uv, thus no lock file support until that tooling adopts such (there's something similar AFAIK, but it lacks the capabilities that PDM and Rye offer).
