Enable GPU execution of atm_bdy_adjust_scalars_work via OpenACC #1266

Open: gdicker1 wants to merge 4 commits into develop from atmosphere/acc_atm_bdy_adjust_scalars

gdicker1
Collaborator

This PR makes small code modifications and adds OpenACC directives so the atm_bdy_adjust_scalars_work routine can execute on GPU(s).

Timing information for the OpenACC data transfers in this routine is captured in the log file by a new timer: atm_bdy_adjust_scalars [ACC_data_xfer].

Invariant fields used in this routine are also copied to the device within mpas_atm_dynamics_init and are deleted in mpas_atm_dynamics_finalize.

Small whitespace changes. Also change an implicit loop to an explicit loop
so it parallelizes better. Implicit loops can be ported with 'acc kernels',
but we prefer the more prescriptive 'acc parallel ...' constructs.
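
For illustration, the implicit-to-explicit loop change could look like the sketch below. The array names, sizes, and arithmetic are hypothetical and are not taken from the MPAS source; the sketch only contrasts array syntax under `acc kernels` with an explicit loop under `acc parallel loop`. Without an OpenACC compiler the directives are treated as comments and the program runs on the host.

```fortran
program explicit_loop_sketch
   implicit none
   integer, parameter :: nCells = 8              ! hypothetical size
   real :: field_in(nCells)                      ! hypothetical input array
   real :: from_kernels(nCells), from_parallel(nCells)
   integer :: iCell

   field_in = 1.0

   ! Implicit loop (array syntax): the compiler decides how to parallelize
   !$acc kernels copyin(field_in) copyout(from_kernels)
   from_kernels(:) = 2.0 * field_in(:)
   !$acc end kernels

   ! Explicit loop: the programmer states that the iterations are parallel
   !$acc parallel loop copyin(field_in) copyout(from_parallel)
   do iCell = 1, nCells
      from_parallel(iCell) = 2.0 * field_in(iCell)
   end do

   ! Both forms should give the same answer
   if (any(from_kernels /= from_parallel)) error stop 'results differ'
   print *, 'ok'
end program explicit_loop_sketch
```

With `acc kernels` the compiler has to analyze the array expression and choose the parallelization itself, whereas `acc parallel loop` leaves that decision with the programmer.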
@mgduda added the Atmosphere and OpenACC (Work related to OpenACC acceleration of code) labels on Jan 17, 2025
@abishekg7
Collaborator

Other than my comments, this PR seems to be bit-identical to the previous version of develop.

Contributor

@mgduda left a comment

Looks good to me!

@gdicker1 I'll merge this PR after you've had a chance to clean up the commit history.

This commit adds an initial port of this routine using OpenACC. More
changes are needed for performance and data management.

Ensure that the fields which don't change while the model is running are
present on the device from model startup to model shutdown.
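
A minimal sketch of this pattern, assuming unstructured `enter data` / `exit data` directives are what keeps the invariant fields resident. The module, subroutine, and field names are hypothetical stand-ins for mpas_atm_dynamics_init and mpas_atm_dynamics_finalize, not the actual MPAS code.

```fortran
module invariant_fields_sketch
   implicit none
   integer, parameter :: nCells = 8            ! hypothetical mesh size
   real, allocatable :: cell_weights(:)        ! hypothetical invariant field
contains

   ! Stand-in for mpas_atm_dynamics_init: copy invariant data to the device once
   subroutine dynamics_init_sketch()
      allocate(cell_weights(nCells))
      cell_weights = 0.5
      !$acc enter data copyin(cell_weights)
   end subroutine dynamics_init_sketch

   ! Stand-in for mpas_atm_dynamics_finalize: release the device copy
   subroutine dynamics_finalize_sketch()
      !$acc exit data delete(cell_weights)
      deallocate(cell_weights)
   end subroutine dynamics_finalize_sketch

end module invariant_fields_sketch

program invariant_fields_demo
   use invariant_fields_sketch
   implicit none
   real :: field(nCells)
   integer :: iCell

   call dynamics_init_sketch()
   field = 2.0

   ! cell_weights is already on the device; only 'field' is transferred here
   !$acc parallel loop copy(field) present(cell_weights)
   do iCell = 1, nCells
      field(iCell) = field(iCell) * cell_weights(iCell)
   end do

   call dynamics_finalize_sketch()

   if (any(abs(field - 1.0) > 1.0e-6)) error stop 'unexpected result'
   print *, 'ok'
end program invariant_fields_demo
```

Because the invariant field stays on the device between the two calls, compute regions in between can refer to it with `present` and avoid a transfer on every call.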
Ensure that the other, non-invariant fields are available for this
routine. Variables that are overwritten by this routine are only
created on the device, while the others are copied in. Any variables
overwritten by this routine are copied out at the end.
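
The create / copy-in / copy-out split described above can be sketched as follows. This assumes `enter data` and `exit data` directives around the compute region; the variable names are hypothetical and the arithmetic is a placeholder, not the boundary adjustment that the real routine performs.

```fortran
program data_xfer_sketch
   implicit none
   integer, parameter :: n = 8          ! hypothetical size
   real :: scalars_driving(n)           ! hypothetical input: only read
   real :: scalars_new(n)               ! hypothetical output: fully overwritten
   integer :: i

   scalars_driving = 3.0

   ! Transfer in: copy inputs, and allocate (but do not copy) outputs
   !$acc enter data copyin(scalars_driving) create(scalars_new)

   !$acc parallel loop present(scalars_driving, scalars_new)
   do i = 1, n
      scalars_new(i) = 0.5 * scalars_driving(i)
   end do

   ! Transfer out: bring results back and drop the device copy of the input
   !$acc exit data copyout(scalars_new) delete(scalars_driving)

   if (any(abs(scalars_new - 1.5) > 1.0e-6)) error stop 'unexpected result'
   print *, 'ok'
end program data_xfer_sketch
```

Using `create` rather than `copyin` for the output avoids a host-to-device transfer of values that would be overwritten anyway.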

Timing for these transfers is reported in the output log file in the
new timer: 'atm_bdy_adjust_scalars [ACC_data_xfer]'.

Also add default(present) to parallel directives to ensure data movement
is correct and to prevent any implicit data movement by the compiler.
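
As a hedged illustration of `default(present)`, the sketch below puts the clause on a parallel loop inside a work routine and leaves all data movement to the caller. The routine name, arguments, and arithmetic are hypothetical, not the real atm_bdy_adjust_scalars_work interface.

```fortran
module default_present_sketch
   implicit none
contains

   ! Hypothetical work routine: assumes the caller already placed the arrays on the device
   subroutine adjust_work_sketch(n, src, dst)
      integer, intent(in) :: n
      real, intent(in)  :: src(n)
      real, intent(out) :: dst(n)
      integer :: i

      ! default(present): arrays without an explicit data clause must already
      ! be on the device, so no implicit copies are generated for them
      !$acc parallel loop default(present)
      do i = 1, n
         dst(i) = src(i) + 1.0
      end do
   end subroutine adjust_work_sketch

end module default_present_sketch

program default_present_demo
   use default_present_sketch
   implicit none
   integer, parameter :: n = 8          ! hypothetical size
   real :: src(n), dst(n)

   src = 1.0

   ! The caller owns all data movement
   !$acc enter data copyin(src) create(dst)
   call adjust_work_sketch(n, src, dst)
   !$acc exit data copyout(dst) delete(src)

   if (any(abs(dst - 2.0) > 1.0e-6)) error stop 'unexpected result'
   print *, 'ok'
end program default_present_demo
```

If an array were missing from the caller's data directives, an OpenACC build would be expected to fail at runtime instead of silently adding a transfer, which makes data-management mistakes easier to spot.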
@gdicker1 force-pushed the atmosphere/acc_atm_bdy_adjust_scalars branch from 9d09979 to dd38bd0 on April 26, 2025 01:00
@gdicker1
Collaborator Author

@mgduda this should be ready now!

Force-push from 9d09979 to dd38bd0 to rebase fixup commits.
