Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable GPU exection of summarize_timestep via OpenACC #1294

Open
wants to merge 5 commits into
base: develop
Choose a base branch
from

Conversation

gdicker1
Copy link
Collaborator

This PR modifies the code and adds OpenACC directives to allow the loops in the summarize_timestep routine can execute on GPUs.

Timing information for the temporary OpenACC data transfers in this routine is captured in the log file by a new timer: 'summarize_timestep [ACC_data_xfer]'.

This PR includes a re-write of the loops associated with config_print_detailed_minmax_vel to ensure consistent locations are reported for the minimum and maximum values for the wind speeds.

…mestep

These 'integer, pointer' variables are used as loop bounds and can cause
issues with OpenACC. Since OpenACC will directly copy the pointers and
may not preserve the association (may not also copy the value pointed
to), odd behavior can result.
This prevents having to add OpenACC routine information to
mpas_log_write and routine(s) it calls.
Only for loops executed when 'config_print_detailed_minmax_vel = .true.'

When running with OpenACC, it is difficult to get values to "come with"
during a reduction. This means it's difficult to get both the minimum
value of an array and the position (the array indices) it occurs at in
one pass.  Using 2 loops like this ensures the correct positions are
found.

Using a "linear index" to record the position of the minimum/maximum
values helps in cases where the min/max occurs at multiple points in an
array. The detailed prints will always use the first occurence of the
min/max in the array.
Adds timing information for the OpenACC data transfers in the
'summarize_timestep [ACC_data_xfer]' timer. This timer will be removed
as data regions in the MPAS-A dycore are fused.
Adds timing information for the OpenACC data transfers in the temporary
'summarize_timestep [ACC_data_xfer]' timer.
@mgduda mgduda requested review from mgduda and abishekg7 March 21, 2025 02:09
@mgduda mgduda added Atmosphere OpenACC Work related to OpenACC acceleration of code labels Mar 21, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Atmosphere OpenACC Work related to OpenACC acceleration of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants