@halungge
Contributor

@halungge halungge commented Mar 7, 2025

[CI] Adds a minimal pipeline to run the mpi_tests on CSCS-CI.
The pipeline uses its own base image with a minimal MPI installation and installs icon4py with uv in a test container.

Restrictions of the pipeline setup:

  • tests run only with CPU backends
  • the container-based OpenMPI installation is used; there is no access to the cluster-provided MPI infrastructure (same approach as PMAP)

Additional changes

  • Some issues in the parallel tests are fixed. Other tests are marked xfail.

Follow-up tasks are defined in https://hackmd.io/O4Fymu1dTxqTZSC8rdiVVw.

Magdalena Luz added 30 commits February 11, 2025 16:46
- remove ipeidx_dsl field from MetricStateNonHydro
- read renamed savepoints
- split VelocityAdvectionInit and VelocityAdvectionExit savepoints
remove unnecessary parameters in test_time_step_flags
remove x- from DiffusionExitSavepoint field names
temporarily ignore saturation adjustment tests - need Weisman-Klemp data
@msimberg
Contributor

cscs-ci run distributed

@msimberg msimberg force-pushed the parallel_tests_on_ci branch from b7c536b to cee91df January 21, 2026 12:54
@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

#1003 fixes the global reduction tests. I'd like to have that PR merged before this one. I've temporarily added the changes to this PR for testing, but the PRs should be merged separately.

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

msimberg commented Jan 22, 2026

This now includes #989 to fix the DaCe connectivity issue. #1003 will be merged separately, but is already on this branch to fix the global reduction tests.

Open question: should the distributed tests run as part of default or separately? If separately, should they be required? My preference would be to run them with default, but possibly make the jobs dependent on the non-distributed tests passing. On the other hand, there are not many distributed tests and not all backends are tested at the moment, so running them immediately does not significantly increase the number of concurrent test jobs. Preferences?

Note: this aims to get the CI jobs running on main as soon as possible. Because of this, many tests are marked xfail. Fixing the tests is part of https://hackmd.io/O4Fymu1dTxqTZSC8rdiVVw and will be done in follow-up PRs.
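
For readers unfamiliar with the expected-failure approach, here is a minimal sketch of the idea: a known-broken test is recorded as an expected failure so the pipeline stays green while the fix is tracked separately. It uses the standard library's `unittest.expectedFailure` as a stand-in for pytest's xfail marker, and `global_sum`, `DistributedSmokeTests`, and `run_suite` are hypothetical names for illustration only, not code from this PR.

```python
import unittest


def global_sum(local_values):
    """Hypothetical stand-in for a distributed reduction; it only sums locally."""
    return sum(local_values)


class DistributedSmokeTests(unittest.TestCase):
    def test_global_sum(self):
        # A regular test that is expected to pass.
        self.assertEqual(global_sum([1, 2, 3]), 6)

    @unittest.expectedFailure
    def test_known_broken_case(self):
        # Deliberately wrong expectation, standing in for a known failure that
        # is tracked as follow-up work. It is reported as an expected failure
        # and does not make the run unsuccessful.
        self.assertEqual(global_sum([1, 2, 3]), 7)


def run_suite():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(DistributedSmokeTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    result = run_suite()
    print("successful:", result.wasSuccessful())
    print("expected failures:", len(result.expectedFailures))
```

If an expected-failure test starts passing, `unittest` reports it as an unexpected success and the run is no longer successful, which is a useful signal that the marker can be removed in a follow-up PR.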

@msimberg msimberg marked this pull request as ready for review January 22, 2026 16:02
@github-actions

Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark-bencher

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

To run test levels ignored by the default test suite (mostly simple datatests for static field computations) you can use:

  • cscs-ci run extra

For more detailed information please look at CI in the EXCLAIM universe.

@msimberg
Contributor

cscs-ci run distributed

@msimberg
Contributor

cscs-ci run default
