fcoll/vulcan accelerator support #12678
Conversation
Force-pushed from 74a3029 to 8b24867
Force-pushed from 0beeef3 to f8bc3fd
If the user's input buffers are GPU device memory, also use GPU device memory for the aggregation step. This allows the data transfer to occur between GPU buffers and hence take advantage of the much higher bandwidth of GPU-GPU interconnects (e.g. XGMI, NVLINK, etc.). The downside of this approach is that we cannot call directly into the fbtl ipwritev routine, but have to go through the common_ompio_file_iwrite_pregen routine, which performs the necessary segmenting and staging through host memory. Signed-off-by: Edgar Gabriel <[email protected]>
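A minimal sketch of the dispatch decision described in this commit, assuming a hypothetical helper buffer_is_device_memory() and simplified prototypes for the fbtl and pregen entry points (the real Open MPI signatures differ):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helpers and simplified prototypes -- illustrative only. */
bool buffer_is_device_memory(const void *buf);   /* e.g. an accelerator-framework check        */
int  fbtl_ipwritev(void *fh, const void *buf, size_t len);                   /* direct async write */
int  common_ompio_file_iwrite_pregen(void *fh, const void *buf, size_t len); /* staged via host    */

/* Write the aggregated data of one collective cycle. */
static int write_aggregation_buffer(void *fh, const void *aggr_buf, size_t len)
{
    if (buffer_is_device_memory(aggr_buf)) {
        /* GPU aggregation buffer: segment the data and stage it through host memory. */
        return common_ompio_file_iwrite_pregen(fh, aggr_buf, len);
    }
    /* Host aggregation buffer: hand the data straight to the fbtl. */
    return fbtl_ipwritev(fh, aggr_buf, len);
}
```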
Add support for using accelerator buffers in the aggregation step of the read_all operation. This is in common/ompio instead of the fcoll component, since all fcoll components (except individual) currently use the default implementation, which was moved to common/ompio a while back to avoid code duplication. Signed-off-by: Edgar Gabriel <[email protected]>
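For context, a sketch of how an fcoll component can delegate read_all to the shared implementation in common/ompio; the entry-point name mca_common_ompio_file_read_all and the flattened argument list are assumptions made for illustration:

```c
#include <stddef.h>

/* Assumed shared entry point living in common/ompio (signature simplified). */
int mca_common_ompio_file_read_all(void *fh, void *buf, size_t count,
                                   void *datatype, void *status);

/* An fcoll component (other than 'individual') simply forwards to it,
 * so the read_all algorithm exists in exactly one place. */
int fcoll_component_file_read_all(void *fh, void *buf, size_t count,
                                  void *datatype, void *status)
{
    return mca_common_ompio_file_read_all(fh, buf, count, datatype, status);
}
```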
Performance measurements indicate that in most cases using a CPU host buffer for data aggregation leads to better performance than using a GPU buffer, so the feature is turned off by default. Signed-off-by: Edgar Gabriel <[email protected]>
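A self-contained sketch of such a feature toggle that defaults to off; in Open MPI this would be registered through the MCA parameter system, but the variable and environment-variable names below are purely illustrative and not the parameter added by this PR:

```c
#include <stdlib.h>

/* Feature toggle: 0 = aggregate into a host (CPU) buffer, 1 = aggregate into a
 * GPU device buffer.  Defaults to 0 because host-buffer aggregation measured
 * faster in most configurations. */
static int use_device_aggregation_buffer = 0;

/* Illustrative stand-in for MCA parameter registration: the real code would go
 * through the mca_base_var registration API; here we honour a hypothetical
 * environment variable so the sketch stays self-contained. */
static void register_params(void)
{
    const char *val = getenv("OMPIO_USE_DEVICE_AGGREGATION"); /* hypothetical name */
    if (NULL != val) {
        use_device_aggregation_buffer = atoi(val);
    }
}
```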
Force-pushed from f8bc3fd to d30471c
Looks fine to me. I guess my only question is why the mca_common_ompio_file_iread_pregen / mca_common_ompio_file_iwrite_pregen routines are necessary, if the code was previously calling the preadv function.
Ah, this is for the ipreadv function. So, why not use it for CPU memory buffers also?
@qkoziol thank you for your review! Let me try to answer your question, and also use this as an opportunity to document some of the changes.

The pipeline protocol is used for individual I/O in cases where we need an additional staging buffer for the operation, e.g. for GPU buffers or when we have to perform data conversion for a different data representation. Regular file_read/write operations don't need the additional staging step. When aggregating data into GPU buffers in collective I/O, we therefore cannot simply call the fbtl/ipreadv or fbtl/ipwritev function (as we do for host buffers), but want to invoke the pipeline protocol. However, in contrast to the individual I/O operations, some of the steps are not necessary: we can use the pre-calculated offsets from the collective I/O operation (and hence don't need to repeat the file-view decoding), and we don't need to update the file pointer position (that is also done in the collective I/O operation). Hence the two iread_pregen/iwrite_pregen functions.

Lastly, each fcoll component has its own write_all operation, but they all use the same algorithm for read_all, which has therefore been moved from the components into the common/ompio directory. This might have to change in the near future, but our focus in the past was always on the write_all operations, and we neglected read_all a bit.
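To make the difference concrete, here is a heavily simplified sketch of what an iwrite_pregen-style path does: stage the GPU data through a host bounce buffer in segments and write each segment at the offset the collective code already computed, with no file-view decoding and no file-pointer update. The helper names (copy_device_to_host, fbtl_pwritev_at) and the fixed segment size are assumptions; the real code is non-blocking and uses the fbtl's asynchronous interface.

```c
#include <stddef.h>

/* Hypothetical helpers -- illustrative only. */
void copy_device_to_host(void *host, const void *dev, size_t len);             /* GPU -> host staging copy  */
int  fbtl_pwritev_at(void *fh, long long offset, const void *buf, size_t len); /* write at explicit offset  */

#define STAGE_SEG_SIZE (4 * 1024 * 1024)   /* size of the host staging buffer (assumed) */

/*
 * Write 'len' bytes from a GPU aggregation buffer at a pre-computed file
 * offset.  Unlike the individual-I/O pipeline, there is no file-view
 * decoding and no file-pointer update here: the collective read_all /
 * write_all code has already done both.
 */
static int iwrite_pregen_sketch(void *fh, const void *device_buf, size_t len, long long offset)
{
    static char host_stage[STAGE_SEG_SIZE];
    size_t done = 0;

    while (done < len) {
        size_t seg = len - done;
        if (seg > STAGE_SEG_SIZE) {
            seg = STAGE_SEG_SIZE;
        }
        /* Stage one segment from device memory into the host bounce buffer. */
        copy_device_to_host(host_stage, (const char *)device_buf + done, seg);

        /* Hand the host segment to the fbtl at the pre-computed offset. */
        int ret = fbtl_pwritev_at(fh, offset + (long long)done, host_stage, seg);
        if (0 != ret) {
            return ret;
        }
        done += seg;
    }
    return 0;
}
```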