-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add clone functionality to a field #90
base: main
Are you sure you want to change the base?
Conversation
Some clarifications as Philippe told something was not very clear. We take into account the case where the field to be clone is optional not because we won't give the second argument to the subroutine. But because we might have a chain of subroutines being called with an optional argument being not present from the top and propagated down to the clone subroutine. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @dareg for this contribution. I just have a few minor comments that I think should be addressed before we merge this.
Just out of curiosity, so that I can understand the use case a bit better, is there a benefit to putting the FIELD on device beyond using it as a struct for DEVPTR
?
@@ -262,6 +262,23 @@ field's data must be present on the host. It will not work if the data are on | |||
the device or if the field has not been allocated yet (when using the DELAY | |||
option). | |||
|
|||
## Cloning fields with FIELD\_CLONE\_ON_ | |||
|
|||
The subroutines FIELD_CLONE_ON_HOST amd FIELD_CLONE_ON_DEVICE let a field be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: 'amd' should be 'and'
ZR => GET_${TARGET}$_DATA_RDONLY (YR) | ||
ZL => GET_${TARGET}$_DATA_RDWR (YL) | ||
|
||
$:offload_macros.kernels(present=['ZL','ZR']) if TARGET=="DEVICE" else None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately the kernels construct doesn't map to openmp, so we have two choices here.
If performance is not that important, then we can simply do:
$:offload_macros.serial(present=['ZL','ZR'])
ZL(${ft.shape}$) = ZR(${ft.shape}$)
$:offload_macros.end_serial()
If performance is important, then you can generate a nested loop like I am doing here:
field_api/src/core/field_RANKSUFF_module.fypp
Line 577 in b02e712
#:for i in range(ft.rank, 0, -1) |
@@ -100,7 +100,8 @@ list(APPEND prec_srcs | |||
${CMAKE_CURRENT_BINARY_DIR}/src/core/field_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_array_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_array_util_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_util_module.F90) | |||
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_util_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/src/util/field_clone_module.F90) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure this is a precision dependent source? I don't think it is, and we only need to compile it once and can remove it from this list.
@@ -38,4 +38,5 @@ field_api_add_object_library( | |||
set_source_files_properties( ${CMAKE_CURRENT_BINARY_DIR}/field_array_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/field_array_util_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/field_util_module.F90 | |||
${CMAKE_CURRENT_BINARY_DIR}/field_clone_module.F90 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, we only need to disable compilation for precision dependent sources.
This PR adds a way to clone a field into a second field, which will always be a field owner.
We are already using this functionality in IAL, but only on CPU.
We are now proposing to upstream an improved version (for CPU and GPU) of this functionality into field_api.
There would be two new subroutines in the API: FIELD_CLONE_ON_HOST and FIELD_CLONE_ON_DEVICE.
Documentation is updated and several tests are added.