mdspan to DLPack
#7027
Conversation
I was also wondering whether we don't want to introduce a special namespace (something like ...
🥳 CI Workflow Results 🟩 Finished in 1d 01h: Pass: 100%/84 | Total: 1d 16h | Max: 2h 14m | Hits: 92%/198924. See results here.
inline constexpr bool __has_vector_type_v = !::cuda::std::is_same_v<__vector_type_t<_Tp, _Size>, void>;

template <class _Tp>
inline constexpr bool __is_vector_type_v = false;
Can we move these to a separate header <cuda/__type_traits/is_vector_type.h>?
I will address the issue in another PR.
template <>
inline constexpr bool __is_extended_fp_vector_type_v<::__nv_bfloat162> = true;
template <>
inline constexpr bool __is_extended_fp_vector_type_v<::__half2> = true;
Maybe we should also add these overloads to __vector_type_t, so these traits stay consistent. Maybe we can move this to a separate PR?
Yes, better in another PR. We would need to update the include paths of other files.
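For reference, a rough sketch of what keeping the traits consistent could look like, assuming __vector_type_t is backed by a __vector_type class template with a nested type alias (the names and structure below are assumptions for illustration, not the actual CCCL internals):

// Hypothetical sketch: teach the vector-type mapping about the extended
// floating-point pair types so it mirrors __is_extended_fp_vector_type_v.
// ::__half / ::__nv_bfloat162 come from <cuda_fp16.h> / <cuda_bf16.h>.
template <>
struct __vector_type<::__half, 2>
{
  using type = ::__half2;
};

template <>
struct __vector_type<::__nv_bfloat16, 2>
{
  using type = ::__nv_bfloat162;
};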
_CCCL_HOST_API __dlpack_tensor(__dlpack_tensor&& __other) noexcept
    : __shape{::cuda::std::move(__other.__shape)}
    , __strides{::cuda::std::move(__other.__strides)}
    , __tensor{__other.__tensor}
{
  __other.__tensor = ::DLTensor{};
  __update_tensor();
}
Do we need move constructor + assignment operator? What are they good for? They just invalidate the old object for no reason. I would keep just the copy variants :)
The reason is to give more freedom/flexibility, but yes, it could be error-prone. The copy operations are not heavy, so we can keep only those.
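For context, a minimal sketch of what the copy path has to do, assuming an __update_tensor helper that re-points the stored DLTensor at this object's own arrays (illustrative, not the PR's exact code):

// Illustrative copy constructor: copy the owned shape/strides arrays and the
// DLTensor, then rebind the DLTensor's shape/strides pointers to the copies so
// the new object never aliases __other's storage.
_CCCL_HOST_API __dlpack_tensor(const __dlpack_tensor& __other) noexcept
    : __shape{__other.__shape}
    , __strides{__other.__strides}
    , __tensor{__other.__tensor}
{
  __update_tensor(); // assumed to set __tensor.shape / __tensor.strides to __shape.data() / __strides.data()
}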
  ::DLDeviceType __device_type,
  int __device_id)
{
  static_assert(::cuda::std::is_pointer_v<typename _Accessor::data_handle_type>, "data_handle_type must be a pointer");
Maybe in the future we can relax this to ContiguousIterator if anyone needs it
template <::cuda::std::size_t _Rank>
struct __dlpack_tensor
{
  ::cuda::std::array<::cuda::std::int64_t, _Rank> __shape{};
cuda::std::array is needed for Rank == 0.
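A short illustration of the point: a zero-length built-in array is ill-formed, while a zero-size cuda::std::array is a valid empty type, so the same member declaration compiles for every rank.

#include <cuda/std/array>
#include <cuda/std/cstdint>

// ::cuda::std::int64_t shape_rank0[0]{};                  // ill-formed: zero-sized built-in arrays are not allowed
::cuda::std::array<::cuda::std::int64_t, 0> shape_rank0{}; // OK: a valid empty aggregate, so Rank == 0 works
::cuda::std::array<::cuda::std::int64_t, 2> shape_rank2{}; // OK: the same declaration form works for Rank > 0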
  return __tensor1;
}

::DLTensor get() const&& = delete;
Without an explicit = delete, the compiler doesn't raise an error when get() is called on a temporary.
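The returned DLTensor points into the wrapper's own shape/strides arrays, so calling get() on a temporary would hand back dangling pointers. A minimal sketch of the ref-qualified pair (the lvalue overload and member names here are illustrative):

// Illustrative: the lvalue overload returns the view, while the deleted
// rvalue overload turns "get() on a temporary" into a compile-time error
// instead of silently binding to const& and returning dangling pointers.
::DLTensor get() const& noexcept
{
  return __tensor; // shape/strides point into this object's owned arrays
}
::DLTensor get() const&& = delete;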
Description
The PR implements conversion utilities that take a (host/device/managed) mdspan and produce a DLTensor view of the same underlying memory. The main issue is how to handle the memory associated with the shape and strides arrays of DLPack. The implementation provides a small wrapper that owns the shape and strides arrays and stores a DLTensor whose shape/strides pointers refer into those arrays.
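To make the ownership problem concrete, here is a hand-rolled sketch of the conversion the PR automates, filling a DLTensor from a rank-2 device mdspan by hand (assuming dlpack/dlpack.h is available; the PR's actual entry points are not shown here):

// Illustrative only: the shape/strides arrays below are stack locals, so the
// DLTensor dangles as soon as they go out of scope. Owning those arrays next
// to the DLTensor is exactly what the PR's wrapper is for.
#include <cuda/std/mdspan>
#include <cstdint>
#include <dlpack/dlpack.h>

void example(float* device_ptr)
{
  ::cuda::std::mdspan<float, ::cuda::std::dextents<std::int64_t, 2>> md{device_ptr, 4, 8};

  std::int64_t shape[2]   = {md.extent(0), md.extent(1)};
  std::int64_t strides[2] = {md.stride(0), md.stride(1)};

  ::DLTensor tensor{};
  tensor.data        = md.data_handle();
  tensor.device      = {kDLCUDA, /*device_id=*/0};
  tensor.ndim        = 2;
  tensor.dtype       = {kDLFloat, /*bits=*/32, /*lanes=*/1};
  tensor.shape       = shape;   // must outlive `tensor`
  tensor.strides     = strides; // must outlive `tensor`
  tensor.byte_offset = 0;
}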
Todo: