Contributor

@mike-hobson mike-hobson commented Jan 29, 2026

PR Summary

Sci/Tech Reviewer: None required
Code Reviewer: @stevemullerworth

I noted three deficiencies with the partition unit tests:

  1. There are no tests of uneven partitions. The single-panel partitioner unit tests use an 8x8 full-domain mesh. Partitioned over 3 ranks, this should give partitions of size 3x8, 3x8 and 2x8 (i.e. uneven). This PR adds such tests to the biperiodic, planar and x- plus y-trench mesh partitioner tests.
  2. There is no test of the parallel cubedsphere partitioner. It was omitted because it is slow to run from the command line. While that is true (it is slower than all the other infrastructure unit tests combined), this is the partitioner used for almost all global model runs, so I have added it (and we'll take the hit on runtime).
  3. The current test suite runs the unit tests on one core, even though the infrastructure unit tests are parallel. The infrastructure unit tests take two and a half minutes to complete; giving them the correct number of cores to run on drops this to 8 seconds. There is no fine-grained control of resources for the unit tests, so after discussions with SSD we decided it was fine for all "technical tests" (integration and unit tests) to run on multiple cores. The Met Office appears to have its own overrides for this setting, but I have also changed the system default (as I can see no reason for anyone to want to run a parallel test on a single core).
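To make the uneven case in item 1 concrete, here is a minimal Python sketch of how an 8x8 mesh might be cut into strips over 3 ranks. It is an illustration only, not the LFRic partitioner: the function names are hypothetical, and the rule that the remainder goes to the lowest-numbered ranks is an assumption chosen because it reproduces the 3x8, 3x8, 2x8 split quoted above.

```python
def strip_widths(num_cells: int, num_ranks: int) -> list[int]:
    """Split num_cells columns into num_ranks contiguous strips.

    Assumption (not taken from the LFRic source): any remainder is
    handed out one column at a time to the lowest-numbered ranks.
    """
    if num_ranks < 1:
        raise ValueError("num_ranks must be at least 1")
    base, extra = divmod(num_cells, num_ranks)
    return [base + 1 if rank < extra else base for rank in range(num_ranks)]


def partition_shapes(nx: int, ny: int, num_ranks: int) -> list[tuple[int, int]]:
    """Shapes (width, height) of the strips when only the x direction is split."""
    return [(width, ny) for width in strip_widths(nx, num_ranks)]


if __name__ == "__main__":
    # 8x8 mesh over 3 ranks: 8 does not divide by 3, so the strips are uneven.
    print(partition_shapes(8, 8, 3))  # [(3, 8), (3, 8), (2, 8)]
```

An even case such as 8 columns over 2 or 4 ranks never exercises the `extra` branch, which is why a 3-rank test covers behaviour the existing tests do not.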

Code Quality Checklist

  • I have performed a self-review of my own code
  • My code follows the project's style guidelines
  • Comments have been included that aid understanding and enhance the readability of the code
  • My changes generate no new warnings
  • All automated checks in the CI pipeline have completed successfully

Testing

  • I have tested this change locally, using the LFRic Core rose-stem suite
  • If required (e.g. API changes) I have also run the LFRic Apps test suite using this branch
  • If any tests fail (rose-stem or CI) the reason is understood and acceptable (e.g. kgo changes)
  • I have added tests to cover new functionality as appropriate (e.g. system tests, unit tests, etc.)
  • Any new tests have been assigned an appropriate amount of compute resource and have been allocated to an appropriate testing group (i.e. the developer tests are for jobs which use a small amount of compute resource and complete in a matter of minutes)

Test Suite Results - lfric_core - partition_test/run1

Suite Information

Item            Value
Suite Name      partition_test/run1
Suite User      mike.hobson
Workflow Start  2026-01-29T06:08:00
Groups Run      all

Dependency      Reference                                          Main Like
lfric_core      mike-hobson/lfric_core@add_uneven_partition_test   False
SimSys_Scripts  MetOffice/[email protected]                         True

Task Information

✅ succeeded tasks - 372

Security Considerations

  • I have reviewed my changes for potential security issues
  • Sensitive data is properly handled (if applicable)
  • Authentication and authorisation are properly implemented (if applicable)

Performance Impact

  • Performance of the code has been considered and, if applicable, suitable performance measurements have been conducted

AI Assistance and Attribution

  • Some of the content of this change has been produced with the assistance of Generative AI tool name (e.g., Met Office Github Copilot Enterprise, Github Copilot Personal, ChatGPT GPT-4, etc) and I have followed the Simulation Systems AI policy (including attribution labels)

Documentation

  • Where appropriate I have updated documentation related to this change and confirmed that it builds correctly

PSyclone Approval

  • If you have edited any PSyclone-related code (e.g. PSyKAl-lite, Kernel interface, optimisation scripts, LFRic data structure code) then please contact the TCD Team

Code Review

  • All dependencies have been resolved
  • Related Issues have been properly linked and addressed
  • CLA compliance has been confirmed
  • Code quality standards have been met
  • Tests are adequate and have passed
  • Documentation is complete and accurate
  • Security considerations have been addressed
  • Performance impact is acceptable

Collaborator

@james-bruten-mo james-bruten-mo left a comment

rose-stem all good

Collaborator

@stevemullerworth stevemullerworth left a comment

Testing the branch's infrastructure unit tests on the command line makes sense, and the code looks OK. On merging with main, I get a failure with run_lfric_xios_integration_tests_azspice_gnu_64bit: the job.out looks fine, but the job.err reports an "unexpected error".

Collaborator

@MatthewHambley MatthewHambley left a comment

Presumed build system code owner's review. This looks fine; bumping the number of processes for mpiexec is not a problem. I've made a drive-by comment on something else for the actual reviewers to consider.

num_cells_ghost = partition%get_num_cells_ghost()
@assertEqual( 0, num_cells_ghost )

case (3)

I appreciate that you're following the existing pattern, but if each of these cases is so different, they may in fact be three different tests, each with a different number of MPI processes, rather than one test run three times.
