Enable GPU execution of atm_bdy_adjust_scalars_work via OpenACC #1266
Open: gdicker1 wants to merge 4 commits into MPAS-Dev:develop from gdicker1:atmosphere/acc_atm_bdy_adjust_scalars
+37
−3
Small whitespace changes. Also change an implicit loop to an explicit loop so that it parallelizes better. Implicit loops can be ported with 'acc kernels', but we prefer the more prescriptive 'acc parallel ...' constructs.
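The difference between the two forms can be sketched as follows. This is a minimal illustration, not the code from the PR: the array names, sizes, and the update formula are hypothetical, and the loop scheduling clauses are only one plausible choice. Without OpenACC enabled, the directives are plain comments and the program runs serially.

```fortran
program explicit_loop_sketch
   implicit none
   integer, parameter :: nVertLevels = 4, nCells = 8   ! hypothetical sizes
   real :: a_implicit(nVertLevels,nCells), a_explicit(nVertLevels,nCells)
   real :: tend(nVertLevels,nCells)                    ! hypothetical input field
   integer :: iCell, k

   ! Integer-valued inputs keep the arithmetic below exact
   do iCell = 1, nCells
      do k = 1, nVertLevels
         tend(k,iCell) = real(k + iCell)
      end do
   end do
   a_implicit = 1.0
   a_explicit = 1.0

   ! Implicit (array-syntax) loop: 'acc kernels' leaves it to the
   ! compiler to find and schedule the parallelism
   !$acc kernels copy(a_implicit) copyin(tend)
   a_implicit(:,:) = a_implicit(:,:) + 0.5 * tend(:,:)
   !$acc end kernels

   ! Explicit loop nest: 'acc parallel' with 'acc loop' states exactly
   ! which loops are parallel and how they map to the device
   !$acc parallel copy(a_explicit) copyin(tend)
   !$acc loop gang worker
   do iCell = 1, nCells
      !$acc loop vector
      do k = 1, nVertLevels
         a_explicit(k,iCell) = a_explicit(k,iCell) + 0.5 * tend(k,iCell)
      end do
   end do
   !$acc end parallel

   if (any(a_implicit /= a_explicit)) error stop 'implicit and explicit results differ'
   print *, 'implicit and explicit loops agree'
end program explicit_loop_sketch
```

The explicit form is longer, but it is needed before 'acc loop' directives can be attached to individual loops.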
jim-p-w approved these changes Jan 23, 2025
abishekg7 reviewed Feb 1, 2025
abishekg7 reviewed Feb 1, 2025
Other than my comments, this PR seems to be bit-identical with the previous version of develop.
mgduda approved these changes Apr 24, 2025
Looks good to me!
@gdicker1 I'll merge this PR after you've had a chance to clean up the commit history.
This commit adds an initial port of this routine using OpenACC. More changes are needed for performance and data management.
Ensure that the fields which do not change while the model is running are present on the device from model startup to model shutdown.
Ensure that the other, non-invariant fields are available for this routine. Variables that this routine fully overwrites are only created on the device, while the others are copied in. Any variables overwritten by this routine are copied out at the end. Timing for these transfers is reported in the output log file by the new timer 'atm_bdy_adjust_scalars [ACC_data_xfer]'. Also add default(present) to the parallel directives so that data movement is explicit and the compiler cannot insert implicit data transfers.
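The create / copy-in / copy-out pattern described in this commit message might look roughly like the sketch below. The field names and the computation are hypothetical stand-ins, and the timer calls that the PR wraps around the transfers are omitted, since their exact form is not shown here.

```fortran
program data_region_sketch
   implicit none
   integer, parameter :: n = 16        ! hypothetical size
   real :: field_in(n)                 ! only read by the routine
   real :: field_out(n)                ! fully overwritten by the routine
   integer :: i

   do i = 1, n
      field_in(i) = real(i)
   end do
   field_out = -1.0

   ! Inputs are copied to the device; the output is only allocated
   ! there, because its host values are never read
   !$acc enter data copyin(field_in) create(field_out)

   ! default(present) asserts that all data used here is already on
   ! the device, so no implicit transfers are generated
   !$acc parallel default(present)
   !$acc loop gang vector
   do i = 1, n
      field_out(i) = 2.0 * field_in(i)
   end do
   !$acc end parallel

   ! The overwritten field is copied back; the input is just released
   !$acc exit data copyout(field_out) delete(field_in)

   if (any(field_out /= 2.0 * field_in)) error stop 'unexpected result'
   print *, 'field_out(n) =', field_out(n)
end program data_region_sketch
```

Using create instead of copyin for a field that is fully overwritten avoids one host-to-device transfer per call.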
9d09979 to dd38bd0
This PR makes small code modifications and adds OpenACC directives so the atm_bdy_adjust_scalars_work routine can execute on GPU(s).
Timing information for the OpenACC data transfers in this routine is captured in the log file by a new timer: atm_bdy_adjust_scalars [ACC_data_xfer].
Invariant fields used in this routine are also copied to the device within mpas_atm_dynamics_init and are deleted in mpas_atm_dynamics_finalize.
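As a rough sketch of how invariant fields can stay resident on the device for the whole run, consider the following. The module, routine, and field names are hypothetical and only mirror the init/finalize pairing described above; they are not the MPAS routines themselves.

```fortran
module dynamics_sketch
   implicit none
   integer, parameter :: nCells = 8             ! hypothetical size
   real, allocatable :: invariant_field(:)      ! hypothetical time-invariant field
contains
   subroutine dynamics_init()
      integer :: i
      allocate(invariant_field(nCells))
      do i = 1, nCells
         invariant_field(i) = real(i)
      end do
      ! Copied to the device once, at startup
      !$acc enter data copyin(invariant_field)
   end subroutine dynamics_init

   subroutine adjust_work(scalar)
      real, intent(inout) :: scalar(:)
      integer :: i
      ! invariant_field is expected to be present already; only the
      ! time-varying field is moved around this call
      !$acc parallel default(present) copy(scalar)
      !$acc loop gang vector
      do i = 1, size(scalar)
         scalar(i) = scalar(i) + invariant_field(i)
      end do
      !$acc end parallel
   end subroutine adjust_work

   subroutine dynamics_finalize()
      ! Released from the device once, at shutdown
      !$acc exit data delete(invariant_field)
      deallocate(invariant_field)
   end subroutine dynamics_finalize
end module dynamics_sketch

program invariant_sketch
   use dynamics_sketch
   implicit none
   real :: scalar(nCells)
   integer :: step, i

   scalar = 0.0
   call dynamics_init()
   do step = 1, 3                                ! three hypothetical time steps
      call adjust_work(scalar)
   end do
   do i = 1, nCells
      if (scalar(i) /= 3.0 * real(i)) error stop 'unexpected result'
   end do
   call dynamics_finalize()
   print *, 'scalar(nCells) =', scalar(nCells)
end program invariant_sketch
```

With this arrangement the invariant field is transferred once per run rather than once per call to the work routine.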