Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Test with Julienne #169

Open
wants to merge 31 commits into
base: main
Choose a base branch
from
Open

Test with Julienne #169

wants to merge 31 commits into from

Conversation

rouson
Copy link
Collaborator

@rouson rouson commented Dec 28, 2024

This PR replaces PR #133 and accomplishes the following:

  1. Switch from testing with Veggies to testing with Julienne.
  2. Make progress on issue Unit tests for prif_stop and prif_error_stop make fragile non-portable assumptions #137 by moving the termination tests to a separate fpm project in the new test-termination/ subdirectory as suggested in comment r1761996357.
  3. Define a new prif_test_t type that extends Julienne's test_t type and override test_t's report type-bound procedure, replacing parallel Fortran features with PRIF calls.

The primary motivation for the switch to Julienne is complexity reduction: the Julienne software stack contains 3850 lines of code versus 30513 in Veggies as determined by executing following command on the current main branch of both repositories:

wc -l `find src example test build/dependencies -iname '*.f90'

As a consequence, the test suite runs much faster even after accounting for some possible reduction in test coverage because this test suite was originally developed in 3-4 months ago so any tests added to the Veggies-based test suite since then are not represented here. Determining whether and to what extent the test coverage is actually diminished is on the To Do list below.

To Do

  • Edit the CI scripts to run the termination tests in the new test-termination subdirectory.
  • Review and edit the test/ and test-termination/ subdirectories to match the test coverage of the previous Veggies-based tests.

@rouson rouson mentioned this pull request Dec 28, 2024
16 tasks
@rouson rouson force-pushed the test-with-julienne branch from 189c949 to 886233a Compare December 28, 2024 19:46
Test coverage:
* prif_init
* prif_allocate, prif_deallocate, prif_allocate_coarray, prif_deallocate_coarray
* prif_co_broadcast, prif_co_max, prif_co_min, prif_co_sum, prif_co_reduce
* prif_image_index, prif_num_images, prif_this_image_no_coarray
* prif_put, prif_get
* prif_sync_all
* prif_team_type, prif_form_team, prif_change_team, prif_end_team
This commit replaces veggies and iso_varying_string in the fpm
manifest with julienne and assert.
This commit fixes a problem that caused seg faults when running
subsets of tests. Previously, if the subset didn't include the
prif_init test, then prif_init was never called.  This commit calls
prif_init in test/main.F90 before running the tests.  Consequently,
the test for normal execution of prif_init now happens via an
assertion rather than in a unit test.
As suggested in the following comment, this commit defines a separate
project for testing program termination:
#133 (comment)
This commit removes code that was temporarily inserted when
diagnosing a flang bug and uncomments the correct code.
@rouson rouson force-pushed the test-with-julienne branch from 886233a to 435d148 Compare December 29, 2024 04:51
This commit defines a new prif_test_t type that extends Julienne's
test_t type in ordero to override test_t's "report" type-bound
procedure in order to replace parallel Fortran featuers with PRIF
calls.
@rouson rouson marked this pull request as draft December 29, 2024 07:49
@rouson rouson marked this pull request as ready for review December 29, 2024 07:53
This eliminates a do loop in prif_test_t's "report" tybe-bound
procedure.
Copy link
Member

@everythingfunctional everythingfunctional left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the changes the tests yet, but I have a few higher level comments so far.

  1. You need to add an fpm.toml file in the test-termination folder. At present the instructions in the test-termination/README.md do not work. The fpm.toml I created is listed below.
  2. It is unclear from the output whether the tests pass or fail. I would recommend putting together some sort of script that checks the output of each individual test rather than have users try to inspect the outputs and determine pass/fail manually. This should make getting the CI to work easier as well.
  3. I would have made the switch to Julienne a separate PR, as the stop tests don't really depend on the test framework and vice-versa.
  4. I think on balance the switch to Julienne might be the right move, but I think it's worth pointing out some of the important differences:
    1. You get less detailed output from running the test suite. i.e. it doesn't report how many or what values are being compared. This could make debugging failing tests (or erroneously passing tests) more difficult.
    2. The framework is simpler, and it is one you control.
  5. I think you could adapt veggies in a similar way as you did with julienne, so if you're curious to try it, I could point you at the few things you'd want/need to overload.

example fpm.toml file

name = "test-termination"
version = "0.4.1"
license = "BSD"
author = ["Damian Rouson", "Brad Richardson", "Katherine Rasmussen", "Dan Bonachea"]
maintainer = "[email protected]"
copyright = "2021-2024 UC Regents"

[dependencies]
caffeine = {path = ".."}

[build]
link = ["gasnet-smp-seq", "gcc", "m"]

rouson added 2 commits January 2, 2025 12:38
The new text at the bottom of the test-termination/README.md
specifies the environment variable definitions used in running
the termination tests.
@rouson
Copy link
Collaborator Author

rouson commented Jan 2, 2025

@everythingfunctional thanks for the quick review!

  1. You need to add an fpm.toml file in the test-termination folder. At present the instructions in the test-termination/README.md do not work. The fpm.toml I created is listed below.

Good catch. I just pushed my fpm.toml, which I mistakenly didn't add before. This required git add -f because fpm.toml is in the .gitignore due to the top-level manifest being generated by install.sh.

  1. It is unclear from the output whether the tests pass or fail. I would recommend putting together some sort of script that checks the output of each individual test rather than have users try to inspect the outputs and determine pass/fail manually. This should make getting the CI to work easier as well.

That's the plan. I'm hoping @ktras can work on this.

  1. I would have made the switch to Julienne a separate PR, as the stop tests don't really depend on the test framework and vice-versa.

I agree. If we decide against the switch to Julienne, I'll create a separate PR to replace the old program termination tests with the new ones.

  1. I think on balance the switch to Julienne might be the right move, but I think it's worth pointing out some of the important differences:

    1. You get less detailed output from running the test suite. i.e. it doesn't report how many or what values are being compared. This could make debugging failing tests (or erroneously passing tests) more difficult.

This morning's release of Julienne 1.6.0 adds this information in diagnostic output. This issue has been bugging me for a while, but it took some focused time over the break to think of how to add such diagnostic output in backwards-compatible way. Of course, there aren't many projects using Julienne yet (two-ish) but it would be nice to not have to rewrite multiple projects' test suites all at once. Fortunately I finally figured out a backwards-compatible path and that's in Julienne 1.6.0.

  1. The framework is simpler, and it is one you control.
  1. I think you could adapt veggies in a similar way as you did with julienne, so if you're curious to try it, I could point you at the few things you'd want/need to overload.

Very cool. It would probably be useful info for learning purposes either way so pointers are welcome.

Copy link
Member

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some initial observations based on a quick skim of the high-level changes.

I have not yet dived into review of the test code.

veggies = {git = "https://gitlab.com/everythingfunctional/veggies", tag = "v1.1.3"}
iso_varying_string = {git = "https://gitlab.com/everythingfunctional/iso_varying_string.git", tag = "v3.0.4"}
julienne = {git = "https://github.com/berkeleylab/julienne.git", tag = "1.5.3"}
assert = {git = "https://github.com/berkeleylab/assert.git", tag = "2.0.0"}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are adding a project dependence on berkeleylab/assert (which I DO support), then we should also excise the current caffeine_assert_s module which was added as a temporary workaround for a lack of a real assertion library.

However this change is also orthogonal to the Julienne transition and feels like it belongs in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As agreed in our meeting, I'll be working on this upgrade as a separate PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not yet convinced that the proposed change fully resolves all six of the problems outlined in issue #137 that this PR claims to solve. However it might be a step in the right direction?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bonachea if we agree that the new approach is a step in the right direction, then it might make sense to merge the termination-tests as is before more thoroughly addressing #137. How does that sound?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we shouldn't hold up this PR waiting for a complete solution to #137. I've changed the PR description to stop claiming a complete fix to that issue, so that GitHub hopefully won't automatically close it.

I'm not yet prepared to approve a merge of this PR as-is, let's plan to discuss on Wednesday.

rouson added 3 commits January 2, 2025 14:30
This commit appends `_c_bool` to the `quiet` argument of
`prif_co_reduce` and removes the `pure` attribue from the
encompassing procedure.
@rouson
Copy link
Collaborator Author

rouson commented Jan 6, 2025

As of commit fcf2fa8, this PR now adds the capability for diagnostic output in the event of a test failure for all tests except those in test/prif_allocate_test_m.F90 . Adding diagnostic output to the latter test will likely require a relatively minor update to Julienne. That means 48 of 52 tests can produce diagnostic output. I think that's a sufficiently large fraction that the PR could be merged before adding the diagnostic output in the four missing cases.

Copy link
Member

@bonachea bonachea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some additional observations.

Right now my biggest concern with this PR is that it updates an (old?) snapshot of of the test suite, which is continuing to evolve in several other in-flight PRs. I don't want this PR to become a blocker for merging PRs that add Caffeine features (with corresponding test code that naturally will semantically conflict with this PR).

Given a choice between reaching PRIF 0.5 compliance (a client-facing feature goal) and replacing an internal test harness, I'd currently lean towards prioritizing the former. Does the proposed switch from Veggies to Julienne provide any direct benefit to library clients, or is this purely a maintainability upgrade that mostly benefits Caffeine developers?

Let's plan to discuss this further in our meeting this week.


impure elemental subroutine co_all(boolean)
logical(c_bool), intent(inout) :: boolean
call prif_co_reduce(boolean, c_funloc(both))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is using the PRIF 0.4 version of prif_co_reduce, and will break when we upgrade the implementation to PRIF 0.5, which significantly complicates the code needed to invoke prif_co_reduce (see PR #158).

Moreover there's no semantic reason to use a custom reduction function here. Please convert the boolean to an integer and use prif_co_sum or prif_co_max instead, to avoid the mess mentioned above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file massively conflict with the complete rewrite in-progress in PR #158 to accommodate the PRIF 0.5 changes to collectives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file will conflict with the changes in-progress in PR #158 to accommodate the PRIF 0.5 changes to collectives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file will conflict with the changes in-progress in PR #158 to accommodate the PRIF 0.5 changes to collectives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file will conflict with the changes in-progress in PR #158 to accommodate the PRIF 0.5 changes to collectives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in this file massively conflict with the changes in-progress in PR #171, where this test was greatly expanded to complete coverage for the PRIF 0.5 implementation of teams.

@bonachea
Copy link
Member

The simple file renames for test/* proposed as part of this PR have now been merged in standalone PR #174, as previously discussed and agreed.

This PR will eventually need a rebase to resolve that conflict and also semantic conflicts with PR #171 and #175 that have also been merged. However as discussed work on this PR should probably be deferred until after PR #158 is completed and the test suite reaches a point of stability with PRIF 0.5, to avoid another redundant round of conflict resolution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants