Changes from 10 commits
28 changes: 23 additions & 5 deletions .github/workflows/benchmark.yml
@@ -111,14 +111,32 @@ jobs:
       BAYBE_BENCHMARKING_PERSISTENCE_PATH: ${{ secrets.TEST_RESULT_S3_BUCKET }}
       BAYBE_PARALLEL_SIMULATION_RUNS: false
     steps:
+      - name: System Information
+        run: |
+          echo -e "\033[1;34m===== SYSTEM INFORMATION =====\033[0m"
+          uname -a
+          echo -e "\n\n\n\033[1;34m===== CPU INFORMATION =====\033[0m"
+          lscpu
+          echo -e "\n\n\n\033[1;34m===== BLOCK DEVICES =====\033[0m"
+          lsblk
+          echo -e "\n\n\n\033[1;34m===== MEMORY INFORMATION =====\033[0m"
+          free -h
+          echo -e "\n\n\n\033[1;34m===== DISK USAGE =====\033[0m"
+          df -h
+          if [ -x "$(command -v nvidia-smi)" ]; then
+            echo -e "\n\n\n\033[1;34m===== GPU INFORMATION =====\033[0m"
+            nvidia-smi
+          fi
       - uses: actions/checkout@v4
         with:
           fetch-depth: 0
-      - uses: actions/setup-python@v5
-        id: setup-python
+      - name: "Set up Python"
+        uses: actions/setup-python@v5
         with:
           python-version: "3.10"
+      - name: Install uv
+        uses: astral-sh/setup-uv@v6
+      - name: Install the project
+        run: uv sync --locked --extra benchmarking
       - name: Benchmark
-        run: |
-          pip install '.[benchmarking]'
-          python -W ignore -m benchmarks --benchmark-list "${{ matrix.benchmark_list }}"
+        run: uv run -m benchmarks --benchmark-list "${{ matrix.benchmark_list }}"
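The new System Information step guards its `nvidia-smi` call with `command -v`, so the step still succeeds on CPU-only runners where the tool is missing. A minimal standalone sketch of that guard pattern (the `probe` helper and the probed tool names are illustrative, not part of the workflow):

```shell
#!/bin/sh
# Run a diagnostic tool only if it exists on PATH, mirroring the
# `if [ -x "$(command -v nvidia-smi)" ]` guard in the workflow step.
probe() {
    if [ -x "$(command -v "$1")" ]; then
        echo "$1: present"
    else
        echo "$1: absent"
    fi
}

probe sh                  # present on any POSIX system
probe no-such-tool-xyz    # absent, but the script still exits 0
```

Because the guard only skips the command rather than failing, the step's exit status stays clean either way.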
15 changes: 15 additions & 0 deletions .github/workflows/ci.yml
@@ -175,3 +175,18 @@ jobs:
         tr -d '%' |
         sed '$d' |
         awk '{if ( $1<${{ env.COVERAGE_INDIVIDUAL_THRESH }} ) exit 1 }'
+
+  gputest:
+    needs: typecheck
+    if: ${{ github.event_name == 'workflow_dispatch' }}
+    uses: ./.github/workflows/gpu_tests.yml
+    secrets: inherit
+    permissions:
+      contents: read
+      id-token: write
+    strategy:
+      matrix:
+        py-version: [ {semantic: '3.10', tox: 'py310'} ]
+    with:
+      semantic: ${{ matrix.py-version.semantic }}
+      tox: ${{ matrix.py-version.tox }}
82 changes: 82 additions & 0 deletions .github/workflows/gpu_tests.yml
@@ -0,0 +1,82 @@
+name: GPU Tests
+on:
+  workflow_dispatch:
+    inputs:
+      semantic:
+        required: true
+        type: string
+        default: "3.10"
+      tox:
+        required: true
+        type: string
+        default: "py310"
+  workflow_call:
+    secrets:
+      APP_PRIVATE_KEY:
+        required: true
+      AWS_ROLE_TO_ASSUME:
+        required: true
+    inputs:
+      semantic:
+        required: true
+        type: string
+        default: "3.10"
+      tox:
+        required: true
+        type: string
+        default: "py310"
+
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.ref }}-GPUTest
+  cancel-in-progress: true
+
+jobs:
+  add-gpu-machine:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      id-token: write
+    steps:
+      - name: Generate a token
+        id: generate-token
+        uses: actions/create-github-app-token@v1
+        with:
+          app-id: ${{ vars.APP_ID }}
+          private-key: ${{ secrets.APP_PRIVATE_KEY }}
+      - name: Configure AWS credentials
+        uses: aws-actions/configure-aws-credentials@v4
+        with:
+          role-to-assume: ${{ secrets.AWS_ROLE_TO_ASSUME }}
+          role-session-name: Github_Add_Runner
+          aws-region: eu-central-1
+      - name: Login to Amazon ECR
+        id: login-ecr
+        uses: aws-actions/amazon-ecr-login@v2
+      - name: Execute Lambda function
+        run: |
+          aws lambda invoke --function-name jit_runner_register_and_create_runner_container --cli-binary-format raw-in-base64-out --payload '{"github_api_secret": "${{ steps.generate-token.outputs.token }}", "count_container": 1, "container_compute": "XS-GPU", "repository": "${{ github.repository }}" }' response.json
+          if ! grep -q '"statusCode": 200' response.json; then
+            echo "Lambda function failed. statusCode is not 200."
+            exit 1
+          fi
+  gputest:
+    needs: [add-gpu-machine]
+    name: GPU Tests ${{ inputs.semantic }}
+    runs-on: gpu
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        id: setup-python
+        with:
+          python-version: ${{ inputs.semantic }}
+      - uses: actions/cache@v4
+        with:
+          path: .tox/gputest-${{ inputs.tox }}
+          key: gputest-${{ inputs.tox }}-${{ hashFiles('pyproject.toml') }}-${{ hashFiles('tox.ini') }}
+      - name: Run GPU tests
+        run: |
+          pip install tox-uv
+          tox -e gputest-${{ inputs.tox }}
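The add-gpu-machine job fails the build when the payload that `aws lambda invoke` writes to response.json does not report status 200. That check can be exercised locally with canned responses in place of a real Lambda call (the file names and the second message below are illustrative):

```shell
#!/bin/sh
# Same grep-based status check as in the workflow, factored into a function.
check_response() {
    if ! grep -q '"statusCode": 200' "$1"; then
        echo "Lambda function failed. statusCode is not 200."
        return 1
    fi
    echo "runner registered"
}

printf '%s\n' '{"statusCode": 200, "body": "ok"}'   > response_ok.json
printf '%s\n' '{"statusCode": 500, "body": "boom"}' > response_bad.json

check_response response_ok.json
check_response response_bad.json || echo "failure detected as expected"
```

Note that `grep -q` matches the substring anywhere in the file, so a response body that merely contains that literal text would also pass; extracting the top-level field with a JSON tool such as jq would be stricter.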
18 changes: 18 additions & 0 deletions .github/workflows/regular.yml
@@ -161,3 +161,18 @@ jobs:
         tr -d '%' |
         sed '$d' |
         awk '{if ( $1<${{ env.COVERAGE_INDIVIDUAL_THRESH }} ) exit 1 }'
+
+  gputest:
+    needs: typecheck
+    if: ${{ github.event_name == 'workflow_dispatch' }}
+    uses: ./.github/workflows/gpu_tests.yml
+    secrets: inherit
+    permissions:
+      contents: read
+      id-token: write
+    strategy:
+      matrix:
+        py-version: [ {semantic: '3.10', tox: 'py310'},
+                      {semantic: '3.11', tox: 'py311'},
+                      {semantic: '3.12', tox: 'py312'},
+                      {semantic: '3.13', tox: 'py313'} ]
+    with:
+      semantic: ${{ matrix.py-version.semantic }}
+      tox: ${{ matrix.py-version.tox }}
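The matrix fans each Python version out into one call of the reusable GPU workflow, whose gputest job then runs `tox -e gputest-<env>`. A rough illustration of the resulting job-name/tox-environment pairing (this loop is for illustration only, not part of any workflow):

```shell
#!/bin/sh
# Expand the regular.yml matrix entries into the job names and tox
# environments they produce.
for pair in "3.10:py310" "3.11:py311" "3.12:py312" "3.13:py313"; do
    semantic="${pair%%:*}"
    tox_env="${pair##*:}"
    echo "GPU Tests ${semantic} -> tox -e gputest-${tox_env}"
done
```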
16 changes: 15 additions & 1 deletion tox.ini
@@ -1,6 +1,6 @@
 [tox]
 min_version = 4.9
-env_list = {fulltest,coretest,lint,mypy,audit}-py{310,311,312,313}
+env_list = {fulltest,gputest,coretest,lint,mypy,audit}-py{310,311,312,313}
 isolated_build = True

 [testenv:fulltest,fulltest-py{310,311,312,313}]
@@ -18,6 +18,20 @@ commands =
     python --version
     pytest -p no:warnings --cov=baybe --durations=5 {posargs}

+[testenv:gputest,gputest-py{310,311,312,313}]
+description = Runs GPU tests
+extras = test
+passenv =
+    CI
+    BAYBE_NUMPY_USE_SINGLE_PRECISION
Collaborator:
What's the intention of these flags? If the goal is to enforce single precision, I think they should rather be part of setenv.

Collaborator (Author):
Maybe I misunderstood that. I thought passenv configures which environment variables are passed through when they are set on the machine where tox is executed.

Collaborator:
No, that's right. So was this the intention? I.e., that one could run the GPU tests in either single or double precision? Or do GPU tests need to be run in single precision anyway?

Collaborator (Author):
I thought that might become important because, if I remember correctly, @AVHopp mentioned there was an issue regarding precision when he started investigating GPUs. Maybe he can comment on that before I confuse something here. But it is probably better to add those flags when GPU support is tackled, so we can also remove them again. Thanks for pointing it out :)

Collaborator:
No, please don't change it yet. This line is reasonable if the GPU tests can indeed be run in both double and single precision; if that's not generally possible, then we should probably change it.
+    BAYBE_TORCH_USE_SINGLE_PRECISION
+    BAYBE_PARALLEL_SIMULATION_RUNS
+setenv =
+    BAYBE_TEST_ENV = GPUTEST
+commands =
+    python --version
+    python -c "import torch; assert torch.cuda.is_available()"

 [testenv:coretest,coretest-py{310,311,312,313}]
 description = Run PyTest with core functionality
 extras = test
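As the review thread notes, passenv only forwards the precision flags from the host environment into the tox run; it does not set them. A hypothetical sketch of how a downstream consumer could interpret such a truthy flag (the `flag_enabled` helper and its accepted spellings are assumptions for illustration, not BayBE's actual parsing logic):

```shell
#!/bin/sh
# Interpret an environment flag like BAYBE_NUMPY_USE_SINGLE_PRECISION as a
# boolean. The set of truthy spellings here is an assumption.
flag_enabled() {
    case "$(printf '%s' "${1:-}" | tr '[:upper:]' '[:lower:]')" in
        1|true|yes|on) return 0 ;;
        *) return 1 ;;
    esac
}

BAYBE_NUMPY_USE_SINGLE_PRECISION=True
if flag_enabled "$BAYBE_NUMPY_USE_SINGLE_PRECISION"; then
    echo "running in single precision"
else
    echo "running in double precision"
fi
```

With this split, leaving the flags unset on the host keeps the default behavior, while exporting them before invoking tox switches the precision for the GPU test run.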