jameslamb
Member

Contributes to rapidsai/build-planning#208 (breaking some changes off of #7128 to help with review and debugging there)

  • switches to using the dask-cuda[cu12] extra for wheels (added in dask-cuda#1536, "Build and test with CUDA 13.0.0")
  • bumps pins on some dependencies to match the rest of RAPIDS
    • cuda-python: >=12.9.2 (CUDA 12)
    • cupy: >=13.6.0
    • numba: >=0.60.0
  • adds explicit runtime dependency on numba-cuda
    • cuml uses this unconditionally but does not declare runtime dependency on it today
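
One way to check whether an existing environment already satisfies the lower bounds listed above is sketched here. It is only an illustration: `MIN_PINS`, `meets_minimum`, and `check_environment` are hypothetical names that are not part of cuml, and the comparison assumes plain dotted numeric versions (a real check would more likely use `packaging.version`).

```python
from importlib.metadata import PackageNotFoundError, version

# Lower bounds copied from the list above (cuda-python value is the CUDA 12 one).
# NOTE: distribution names are an assumption; wheel installs of CuPy are
# published under names such as "cupy-cuda12x", so "cupy" may show up as
# "not installed" outside of conda environments.
MIN_PINS = {
    "cuda-python": "12.9.2",
    "cupy": "13.6.0",
    "numba": "0.60.0",
}


def _release_tuple(v: str) -> tuple[int, ...]:
    """Parse the leading dotted-integer part of a version string."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Return True if `installed` is at least `minimum` (numeric parts only)."""
    a, b = _release_tuple(installed), _release_tuple(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that "13.6" compares equal to "13.6.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment() -> dict[str, str]:
    """Report, per package, the installed version and whether it meets the pin."""
    report = {}
    for name, minimum in MIN_PINS.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "not installed"
            continue
        ok = meets_minimum(installed, minimum)
        report[name] = f"{installed} ({'ok' if ok else 'below ' + minimum})"
    return report


if __name__ == "__main__":
    for name, status in check_environment().items():
        print(f"{name}: {status}")
```

Pre-release and local version suffixes are ignored by this helper, so treat its output as a rough sanity check rather than a substitute for the resolver.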

Contributes to https://github.com/rapidsai/build-infra/issues/293

  • replaces dependency on pynvml package with nvidia-ml-py package (see that issue for details)

Notes for Reviewers

These dependency pin changes should be low-risk

All of these pins and requirements are already coming through cuml's dependencies, e.g. cudf carries most of them via rapidsai/cudf#19806

So they shouldn't change much about the test environments in CI.

@jameslamb jameslamb added the improvement (Improvement / enhancement to an existing function) and non-breaking (Non-breaking change) labels Sep 2, 2025

copy-pr-bot bot commented Sep 2, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@jameslamb
Member Author

Most CI here is passing, just one job is failing:

FAILED test_naive_bayes.py::test_gaussian_parameters[1e-05-balanced] - ValueError: Number of priors must match number of classes.
FAILED test_naive_bayes.py::test_gaussian_parameters[1e-05-unbalanced] - ValueError: Number of priors must match number of classes.
...
FAILED test_naive_bayes.py::test_categorical_partial_fit[True-int32-float32] - assert 0.2518 <= (0.104 + 0.0001)
FAILED test_naive_bayes.py::test_categorical_partial_fit[True-int32-float64] - assert 0.2518 <= (0.104 + 0.0001)
...
FAILED test_naive_bayes.py::test_categorical_parameters[False-False-0.1-balanced] - ValueError: Number of classes must match number of priors
FAILED test_naive_bayes.py::test_categorical_parameters[False-False-0.1-unbalanced] - ValueError: Number of classes must match number of priors
...
= 40 failed, 14342 passed, 6174 skipped, 1210 xfailed, 39 xpassed, 1101 warnings in 1676.92s (0:27:56) =

(conda-python-tests-singlegpu build link)

Those test failures are only happening on 1 of the 3 test configurations for that job:

  • 12.2.2, 3.10, amd64, rockylinux8, l4, earliest-driver, oldest-deps
  • 12.9.1, 3.13, amd64, ubuntu24.04, h100, latest-driver, latest-deps
  • 12.0.1, 3.12, arm64, ubuntu22.04, a100, latest-driver, latest-deps

It looks like the same tests are failing for that configuration on an unrelated PR, so I suspect this isn't related to this PR's changes.

@jameslamb jameslamb marked this pull request as ready for review September 2, 2025 18:08
@jameslamb jameslamb requested review from a team as code owners September 2, 2025 18:08
@jameslamb jameslamb changed the title from "WIP: use 'nvidia-ml-py' instead of 'pynvml', declare 'numba-cuda' dependency pins" to "use 'nvidia-ml-py' instead of 'pynvml', declare 'numba-cuda' dependency pins" Sep 2, 2025
@jameslamb
Member Author

Tracking issue for the test failures: #7152

@jameslamb
Member Author

Thanks for restarting CI here @csadorf !

If that passes, I'll merge this and get the CUDA 13 PR (#7128) ready for review.

@jameslamb
Member Author

🎉 that worked! Thanks @csadorf and @betatim !

@jameslamb
Member Author

/merge

@rapids-bot rapids-bot bot merged commit c213eb7 into rapidsai:branch-25.10 Sep 3, 2025
76 checks passed
@jameslamb jameslamb deleted the drop-pynvml branch September 3, 2025 19:06
rapids-bot bot pushed a commit that referenced this pull request Sep 4, 2025
Contributes to rapidsai/build-planning#208

* uses CUDA 13.0.0 to build and test
* adds CUDA 13 devcontainers
* moves some dependency pins (most others were done in #7164)
  - `cuda-python`: `>=13.0.1` (CUDA 13)

Contributes to rapidsai/build-planning#68

* updates to CUDA 13 dependencies in fallback entries in `dependencies.yaml` matrices (i.e., the ones that get written to `pyproject.toml` in source control)

## Notes for Reviewers

This switches GitHub Actions workflows to the `cuda13.0` branch from here: rapidsai/shared-workflows#413

A future round of PRs will revert that back to `branch-25.10`, once all of RAPIDS supports CUDA 13.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)
  - Simon Adorf (https://github.com/csadorf)
  - Jim Crist-Harif (https://github.com/jcrist)

URL: #7128
Labels: conda (conda issue), Cython / Python (Cython or Python issue), improvement (Improvement / enhancement to an existing function), non-breaking (Non-breaking change)