Add GPU testing workflows and add gpu tox configuration #635
base: main
First round of comments.
…s into a single job
…dundant info from GPU tests
Update the concurrency group name for the GPU test workflow by adding a suffix. The calling workflow is in the same concurrency group and would block this one, which leads to a deadlock and makes the pipeline skip the call.
Co-authored-by: Martin Fitzner <[email protected]>
Pull Request Overview
This PR adds GPU testing capabilities to the project by introducing a dedicated Tox environment for GPU tests and workflow configurations. The changes include manual workflow triggers for GPU testing, system information outputs for benchmarking, and modernization of the benchmark installation process.
- Adds a new `gputest` Tox environment for GPU-specific testing
- Introduces manual GPU testing workflows that require core tests to pass first
- Modernizes benchmark installation by switching from pip to uv package manager
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
tox.ini | Adds gputest environment configuration with GPU availability check |
.github/workflows/regular.yml | Adds GPU testing job triggered by manual workflow dispatch |
.github/workflows/gpu_tests.yml | New dedicated workflow for GPU tests with AWS Lambda runner provisioning |
.github/workflows/ci.yml | Adds GPU testing job to CI workflow with manual trigger |
.github/workflows/benchmark.yml | Adds system information output and migrates to uv package manager |
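The GPU availability check mentioned for `tox.ini` is not shown in this conversation. As a rough sketch of what such a check could look like, assuming the GPU tests rely on PyTorch with CUDA (the helper name `gpu_available` is hypothetical and not taken from the PR):

```python
import importlib.util


def gpu_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    # Assumption: the GPU tests use PyTorch. If it is not installed,
    # there is no GPU backend to test against.
    if importlib.util.find_spec("torch") is None:
        return False

    import torch

    return bool(torch.cuda.is_available())
```

A Tox command could call a helper like this and exit with a non-zero code when it returns `False`, so the environment fails early instead of silently running the tests on CPU.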
extras = test
passenv =
    CI
    BAYBE_NUMPY_USE_SINGLE_PRECISION
What's the intention of these flags?
If it is to enforce single precision, I think they should rather be part of setenv.
Maybe I misunderstood that. I thought passenv configures which environment variables are passed through when they are set on the machine where Tox is executed.
No, that's right. So this was the intention?
I.e., one could run GPU tests in single and double precision? Or do GPU tests need to be run in single precision anyway?
I thought that might become important because, if I remember correctly, @AVHopp mentioned that there was an issue regarding precision when he started investigating GPUs. Maybe he can comment on that before I confuse something here. But it is probably better to add those flags when GPU support is tackled, so we can also remove them. Thanks for pointing it out :)
No, please don't change it yet. I think this line is reasonable if GPU tests can indeed be run with both double and single precision. If that is not generally possible, we should probably change it.
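To make the passenv versus setenv distinction from this thread concrete, here is a deliberately simplified model in Python. It is not Tox's implementation (Tox also forwards some default variables and supports glob patterns); the function `build_test_env` and the value `"True"` are assumptions made only for illustration.

```python
def build_test_env(host_env, passenv, setenv):
    """Simplified model of how a test environment's variables are assembled."""
    # passenv: forward a variable only if it is set on the host machine.
    env = {name: host_env[name] for name in passenv if name in host_env}
    # setenv: always define the variable, regardless of the host
    # (in this model, setenv overrides forwarded values).
    env.update(setenv)
    return env


name = "BAYBE_NUMPY_USE_SINGLE_PRECISION"

# Host sets the flag and passenv lists it: the tests see the host's value.
forwarded = build_test_env({name: "True", "UNRELATED": "1"}, [name, "CI"], {})

# Host does not set the flag: passenv alone leaves it undefined.
not_set = build_test_env({}, [name, "CI"], {})

# setenv enforces the flag even when the host does not set it.
forced = build_test_env({}, [], {name: "True"})

print(forwarded, not_set, forced)
```

With passenv, the person or workflow launching Tox chooses the precision, so the same environment can be run in both modes. With setenv, the environment itself pins one mode.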
This PR adds a dedicated Tox environment for GPU-related tests and one manual workflow, as well as additional steps in the regular and CI workflows for installing and starting those tests. At the moment, they are only executed if triggered manually and if the core tests have passed. Support for other triggers is WIP. Additionally, I've changed the benchmark installation to uv and added terminal output of standard resource commands for CPU, RAM, and so on.