Add GPU testing workflows and add gpu tox configuration #635

fabianliebig · 2025-09-08T04:51:42Z

This PR adds a dedicated Tox environment for GPU-related tests, one manual workflow, and additional steps in the regular and CI workflows for installing and starting those tests. At the moment, the GPU tests only run when triggered manually and after the core tests have passed. Support for other triggers is WIP. Additionally, I've switched the benchmark installation to uv and added terminal output from standard resource commands (CPU, RAM, and so on).

AVHopp

First round of comments.

.github/workflows/benchmark.yml

tox.ini

.github/workflows/regular.yml

.github/workflows/gpu_tests.yml

.github/workflows/benchmark.yml

.github/workflows/ci.yml

.github/workflows/regular.yml

.github/workflows/gpu_tests.yml

.github/workflows/regular.yml

.github/workflows/gpu_tests.yml

Copilot

Pull Request Overview

This PR adds GPU testing capabilities to the project by introducing a dedicated Tox environment for GPU tests and workflow configurations. The changes include manual workflow triggers for GPU testing, system information outputs for benchmarking, and modernization of the benchmark installation process.

- Adds a new `gputest` Tox environment for GPU-specific testing
- Introduces manual GPU testing workflows that require core tests to pass first
- Modernizes benchmark installation by switching from pip to the uv package manager

Reviewed Changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tox.ini | Adds `gputest` environment configuration with GPU availability check |
| .github/workflows/regular.yml | Adds GPU testing job triggered by manual workflow dispatch |
| .github/workflows/gpu_tests.yml | New dedicated workflow for GPU tests with AWS Lambda runner provisioning |
| .github/workflows/ci.yml | Adds GPU testing job to CI workflow with manual trigger |
| .github/workflows/benchmark.yml | Adds system information output and migrates to uv package manager |

.github/workflows/ci.yml

Scienfitz · 2025-09-25T11:35:06Z

tox.ini

+extras = test
+passenv =
+    CI
+    BAYBE_NUMPY_USE_SINGLE_PRECISION


What's the intention of these flags?
If it is to enforce single precision, I think they should rather be part of setenv.

fabianliebig · 2025-09-25

Maybe I misunderstood that. I thought passenv configures which environment variables are passed through when they are set on the machine where Tox is executed.

Scienfitz · 2025-09-25

No, that's right. So this was the intention?
I.e., one could run GPU tests in single and double precision? Or do GPU tests need to be run in single precision anyway?

fabianliebig · 2025-09-25

I thought that might become important because, if I remember correctly, @AVHopp mentioned that there was an issue regarding precision when he started investigating GPUs. Maybe he can comment on that before I confuse something here. But it is probably better to add those flags once GPU support is tackled, so we can also remove them. Thanks for pointing it out :)

Scienfitz · 2025-09-25

No, please don't change it yet. I think this line is reasonable if GPUs can indeed be run with both double and single precision; if that is not generally possible, then we should probably change it.
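To make the passenv/setenv distinction in this thread concrete, here is a minimal sketch. It is not the PR's actual tox.ini: the `commands` lines, the second environment name `gputest-single`, and the value `True` are assumptions for illustration. Only `extras`, the two `passenv` entries, and the `gputest` environment name come from the PR.

```ini
# Hypothetical sketch, not the PR's actual configuration.
[testenv:gputest]
extras = test
# passenv only forwards variables that are already set in the shell
# that invokes tox; unset variables stay unset inside the environment.
passenv =
    CI
    BAYBE_NUMPY_USE_SINGLE_PRECISION
# Assumed commands: fail fast when no CUDA device is visible, then run the tests.
commands =
    python -c "import torch; assert torch.cuda.is_available()"
    pytest

# Hypothetical variant: setenv pins the value for every run,
# regardless of what the invoking shell has set.
[testenv:gputest-single]
extras = test
setenv =
    BAYBE_NUMPY_USE_SINGLE_PRECISION = True
commands =
    python -c "import torch; assert torch.cuda.is_available()"
    pytest
```

With the first variant, precision is chosen by the caller, for example `BAYBE_NUMPY_USE_SINGLE_PRECISION=True tox -e gputest`, so the same environment can cover both single and double precision. The second variant would only make sense if GPU tests must always run in single precision.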

.github/workflows/ci.yml

Add GPU testing workflows and add gpu tox configuration

5674e58

fabianliebig self-assigned this Sep 8, 2025

fabianliebig marked this pull request as ready for review September 8, 2025 05:47

fabianliebig requested review from Scienfitz, AdrianSosic and AVHopp as code owners September 8, 2025 05:47

AVHopp reviewed Sep 8, 2025

.github/workflows/benchmark.yml (resolved)

.github/workflows/benchmark.yml (resolved)

.github/workflows/benchmark.yml (resolved)

tox.ini (outdated, resolved)

.github/workflows/regular.yml (resolved)

fabianliebig added 5 commits September 8, 2025 09:13

Refactor GPU testing workflow: consolidate GPU machine setup and tests into a single job

34e2c38

Update GPU test command to assert CUDA availability

932b0dd

Enhance system information output in benchmark workflow and remove redundant info from GPU tests

0256cc1

Enhance GPU testing workflows: add matrix strategy GPU test

beaad09

Add default to GPU test workflow

eeabcaf

Scienfitz reviewed Sep 9, 2025

fabianliebig added 4 commits September 9, 2025 11:32

Rename GPU test job to align with the others

74c437e

Remove coverage threshold environment variables

bbce4e1

Update gputest job dependencies on typecheck

a23e2ef

Update concurrency group name for GPU

b489b65

Update the concurrency group name for the GPU Test Workflow by adding an extension. The calling workflow is in the same concurrency group and would otherwise block this one, which leads to a deadlock and causes the pipeline to skip this call.

fabianliebig closed this Sep 10, 2025

fabianliebig deleted the fix/change-benchmark-env-creation-and-add-gpu-tox branch September 10, 2025 09:24

fabianliebig restored the fix/change-benchmark-env-creation-and-add-gpu-tox branch September 10, 2025 09:24

fabianliebig reopened this Sep 10, 2025

Scienfitz requested changes Sep 11, 2025

fabianliebig and others added 2 commits September 11, 2025 12:52

Include semantic into concurrency to avoid skipped matrix parts

075fb9b

Fix: Add semantic input to runner job name for clarity

2ac9861

Co-authored-by: Martin Fitzner <[email protected]>

Copilot AI review requested due to automatic review settings September 11, 2025 10:56

Copilot AI reviewed Sep 11, 2025

.github/workflows/ci.yml (outdated, resolved)

fabianliebig added 3 commits September 11, 2025 13:10

Refactor: Remove workflow_dispatch inputs for semantic and tox from GPU Tests

224919e

Refactor: Update GPU test jobs to use new webhook interface

c15224f

Remove seperate manual gpu workflow

6e656c7

Scienfitz reviewed Sep 25, 2025
