feat: speedup task runtime with cuML #190
Conversation
For what it's worth, this did not work for me. It was erroring out on the
Thanks for reporting! In the README.md I did specify to install the nvidia-cuda-toolkit, but not in the PR description, so I apologize for that.
Oops, I see it now. I somehow missed it 😓
Running the labeling task using cached embeddings for one model (UCE 4l) on one tissue and with
without the acceleration (so without these changes):
so no real speedup
I haven't tried the cuML accelerator yet, so I don't know the specifics. But in general, there needs to be a large amount of data to offset the time spent moving data on and off the GPU. My guess is that UCE-4l embeddings aren't large enough; UCE-33l embeddings might show acceleration with the GPU. There is some dependence on the GPU and on the specific ML algorithm too. @valenzuelaomar Maybe the documentation should be updated to indicate that GPU acceleration can vary based on the amount of data, the algorithm, and the type of GPU?
@mlgill you're spot on about there needing to be a large amount of data to see the GPU acceleration benefits. I think updating the documentation to reference that is a good idea.
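To make the break-even argument above concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple cost model: the GPU path pays a fixed overhead (setup plus moving data on and off the device) and then a lower per-row cost than the CPU. The function name and all numbers are hypothetical and are not measurements from this PR.

```python
from typing import Optional


def break_even_rows(
    gpu_fixed_overhead_s: float,
    cpu_s_per_row: float,
    gpu_s_per_row: float,
) -> Optional[float]:
    """Return the row count above which the GPU path is faster.

    Model (an assumption, not a measurement):
        cpu_time(n) = n * cpu_s_per_row
        gpu_time(n) = gpu_fixed_overhead_s + n * gpu_s_per_row
    Returns None when the GPU is never faster under this model.
    """
    per_row_saving = cpu_s_per_row - gpu_s_per_row
    if per_row_saving <= 0:
        return None
    return gpu_fixed_overhead_s / per_row_saving


# Hypothetical numbers: 2 s of fixed overhead, GPU 10x cheaper per row.
rows = break_even_rows(
    gpu_fixed_overhead_s=2.0, cpu_s_per_row=1e-4, gpu_s_per_row=1e-5
)
print(f"GPU pays off above roughly {rows:,.0f} rows")
```

Under these made-up numbers, a dataset well below the threshold would show no speedup at all, which is consistent with the observation reported for the smaller embeddings.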
Resolves #238
This pull request introduces GPU acceleration support for tasks using cuML, along with related updates to the codebase and documentation. The changes include adding installation instructions, updating dependencies, and enabling GPU acceleration if available.
GPU Acceleration Support:
- README.md: instructions explaining how to install and enable GPU acceleration using cuML for improved performance. Includes installation steps for both source and PyPI installations.
- gpu in pyproject.toml, which includes cuml-cu12==25.4.*.
- _enable_gpu_acceleration() in src/czbenchmarks/__init__.py to initialize GPU acceleration with cuML if available, logging the status of GPU support.

Benchmarks (regular sklearn vs gpu-accelerated sklearn)
image is from: https://developer.nvidia.com/blog/nvidia-cuml-brings-zero-code-change-acceleration-to-scikit-learn/#benchmarks
How this works
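The actual body of _enable_gpu_acceleration() is not shown on this page, so the following is only a minimal sketch of how an opt-in initializer of this kind could look. It assumes cuML's documented programmatic entry point for the scikit-learn accelerator, cuml.accel.install(); the function name enable_gpu_acceleration_sketch and the log messages are hypothetical.

```python
import importlib.util
import logging

logger = logging.getLogger(__name__)


def enable_gpu_acceleration_sketch() -> bool:
    """Try to enable cuML's scikit-learn accelerator.

    Returns True if the accelerator was installed, False if cuML is
    missing or could not be initialized (CPU execution continues).
    """
    # Check for the package without importing it, so a CPU-only
    # installation pays no import cost.
    if importlib.util.find_spec("cuml") is None:
        logger.info("cuML not installed; running scikit-learn on CPU.")
        return False
    try:
        import cuml.accel  # needs a working CUDA setup at import time

        # Patches scikit-learn so supported estimators run on the GPU.
        # Should run before scikit-learn estimators are imported.
        cuml.accel.install()
    except Exception as exc:  # e.g. missing driver or CUDA toolkit
        logger.warning("cuML found but could not be enabled: %s", exc)
        return False
    logger.info("GPU acceleration enabled via cuML.")
    return True


gpu_enabled = enable_gpu_acceleration_sketch()
```

Calling this once at package import time keeps the rest of the code unchanged: when the function returns False, the same scikit-learn calls simply run on the CPU.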
Tests
Ran with regular installation
Ran with regular installation + GPU acceleration
Known Limitations
cuML automatically accelerates compatible components on NVIDIA GPUs and falls back to CPU execution for unsupported operations.
https://docs.rapids.ai/api/cuml/stable/zero-code-change-limitations/