
Conversation

@fbusato
Contributor

@fbusato fbusato commented Dec 19, 2025

Description

This PR implements conversion utilities that take a (host/device/managed) mdspan and produce a DLTensor view of the same underlying memory.

The main design question is how to handle the memory backing DLPack's shape and strides arrays.
The implementation provides a small wrapper that owns the shape and strides arrays and stores a DLTensor whose shape/strides pointers refer into those arrays. A minimal sketch of the idea is given below.
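
The following is a hypothetical sketch of the ownership scheme, not the PR's actual code: the names (dlpack_view, to_dlpack, make_dtype) are illustrative, the dtype mapping assumes a plain arithmetic element type, and a strided mdspan layout is assumed.

```cpp
// Sketch only: a wrapper owns the shape/strides storage and keeps the
// DLTensor's shape/strides pointers aimed at that storage, even across copies.
#include <dlpack/dlpack.h>

#include <cuda/std/array>
#include <cuda/std/mdspan>
#include <cuda/std/type_traits>

#include <cstddef>
#include <cstdint>

template <class T>
DLDataType make_dtype() // assumes T is a plain arithmetic type
{
  DLDataType dt{};
  dt.code  = cuda::std::is_floating_point_v<T> ? kDLFloat
           : cuda::std::is_signed_v<T>         ? kDLInt
                                               : kDLUInt;
  dt.bits  = static_cast<std::uint8_t>(8 * sizeof(T));
  dt.lanes = 1;
  return dt;
}

template <std::size_t Rank>
struct dlpack_view // hypothetical name
{
  cuda::std::array<std::int64_t, Rank> shape{};
  cuda::std::array<std::int64_t, Rank> strides{};
  DLTensor tensor{};

  dlpack_view() = default;

  // Copying re-points the DLTensor at this object's own arrays; otherwise the
  // copy would still reference the source object's storage.
  dlpack_view(const dlpack_view& other)
      : shape{other.shape}
      , strides{other.strides}
      , tensor{other.tensor}
  {
    tensor.shape   = shape.data();
    tensor.strides = strides.data();
  }

  DLTensor get() const& { return tensor; }
  DLTensor get() const&& = delete; // a temporary's arrays would dangle
};

template <class T, class Extents, class Layout, class Accessor>
dlpack_view<Extents::rank()> to_dlpack(cuda::std::mdspan<T, Extents, Layout, Accessor> md,
                                       DLDeviceType device_type, int device_id)
{
  dlpack_view<Extents::rank()> view{};
  for (std::size_t r = 0; r < Extents::rank(); ++r)
  {
    view.shape[r]   = static_cast<std::int64_t>(md.extent(r));
    view.strides[r] = static_cast<std::int64_t>(md.stride(r));
  }
  view.tensor.data               = const_cast<cuda::std::remove_const_t<T>*>(md.data_handle());
  view.tensor.device.device_type = device_type;
  view.tensor.device.device_id   = device_id;
  view.tensor.ndim               = static_cast<std::int32_t>(Extents::rank());
  view.tensor.dtype              = make_dtype<cuda::std::remove_const_t<T>>();
  view.tensor.shape              = view.shape.data();
  view.tensor.strides            = view.strides.data();
  view.tensor.byte_offset        = 0;
  return view;
}
```

Host, device, managed, and pinned allocations would map to the corresponding DLDeviceType (kDLCPU, kDLCUDA, kDLCUDAManaged, kDLCUDAHost) plus the device ordinal.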

Todo:

  • documentation

@fbusato fbusato self-assigned this Dec 19, 2025
@fbusato fbusato requested a review from a team as a code owner December 19, 2025 00:55
@fbusato fbusato requested a review from wmaxey December 19, 2025 00:55
@fbusato fbusato added the 3.2.0 label Dec 19, 2025
@fbusato fbusato added this to CCCL Dec 19, 2025
@github-project-automation github-project-automation bot moved this to Todo in CCCL Dec 19, 2025
@cccl-authenticator-app cccl-authenticator-app bot moved this from Todo to In Review in CCCL Dec 19, 2025
@github-project-automation github-project-automation bot moved this from In Review to In Progress in CCCL Dec 19, 2025
@fbusato fbusato requested a review from a team as a code owner December 19, 2025 18:21
@fbusato fbusato requested a review from gonidelis December 19, 2025 18:21
@davebayer
Contributor

I was also wondering whether we want to introduce a dedicated namespace (something like cuda::interop) for interoperability with APIs other than the CTK.

@fbusato fbusato moved this from In Progress to In Review in CCCL Dec 20, 2025
@alliepiper alliepiper removed the 3.2.0 label Jan 15, 2026
@github-actions
Contributor

🥳 CI Workflow Results

🟩 Finished in 1d 01h: Pass: 100%/84 | Total: 1d 16h | Max: 2h 14m | Hits: 92%/198924

See results here.

inline constexpr bool __has_vector_type_v = !::cuda::std::is_same_v<__vector_type_t<_Tp, _Size>, void>;

template <class _Tp>
inline constexpr bool __is_vector_type_v = false;
Contributor

Can we move these to a separate header <cuda/__type_traits/is_vector_type.h>?

Contributor Author

I will address the issue in another PR.

Comment on lines +499 to +502
inline constexpr bool __is_extended_fp_vector_type_v<::__nv_bfloat162> = true;
template <>
inline constexpr bool __is_extended_fp_vector_type_v<::__half2> = true;
template <>
Contributor

Maybe we should also add these overloads to __vector_type_t, so these traits stay consistent. Maybe we can move this to a separate PR?

Contributor Author

Yes, better in another PR. We would need to update the include paths in other files.
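
A minimal illustration of the suggested consistency, using simplified stand-in names rather than the library's actual __vector_type machinery (assumes the CUDA fp16/bf16 headers are available):

```cpp
// Illustrative stand-ins (vector_type, vector_type_t, has_vector_type_v) for
// the internal traits; not the PR's definitions.
#include <cuda_bf16.h>
#include <cuda_fp16.h>

#include <type_traits>

template <class T, int Size>
struct vector_type { using type = void; }; // no vector type by default

// Mapping the extended floating-point element types to their vector2 forms
// keeps has_vector_type consistent with the new extended-FP vector trait.
template <> struct vector_type<__half, 2>        { using type = __half2; };
template <> struct vector_type<__nv_bfloat16, 2> { using type = __nv_bfloat162; };

template <class T, int Size>
using vector_type_t = typename vector_type<T, Size>::type;

template <class T, int Size>
inline constexpr bool has_vector_type_v = !std::is_same_v<vector_type_t<T, Size>, void>;

static_assert(has_vector_type_v<__half, 2>);
static_assert(!has_vector_type_v<__half, 3>);
```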

@fbusato fbusato requested a review from davebayer January 20, 2026 19:42
Comment on lines 186 to 193
_CCCL_HOST_API __dlpack_tensor(__dlpack_tensor&& __other) noexcept
: __shape{::cuda::std::move(__other.__shape)}
, __strides{::cuda::std::move(__other.__strides)}
, __tensor{__other.__tensor}
{
__other.__tensor = ::DLTensor{};
__update_tensor();
}
Contributor

Do we need move constructor + assignment operator? What are they good for? They just invalidate the old object for no reason. I would keep just the copy variants :)

Contributor Author

The reason was to give more freedom/flexibility, but yes, it could be error-prone. The copy operations are not heavy, so we can keep only those.

::DLDeviceType __device_type,
int __device_id)
{
static_assert(::cuda::std::is_pointer_v<typename _Accessor::data_handle_type>, "data_handle_type must be a pointer");
Contributor

Maybe in the future we can relax this to ContiguousIterator if anyone needs it

template <::cuda::std::size_t _Rank>
struct __dlpack_tensor
{
::cuda::std::array<::cuda::std::int64_t, _Rank> __shape{};
Contributor Author

cuda::std::array is needed for Rank == 0
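
For context, a tiny stand-alone illustration of this point (shape_storage is a made-up name): a zero-length C array member is ill-formed in standard C++, while array<T, 0> is valid, so the same wrapper also covers rank-0 mdspans.

```cpp
#include <cuda/std/array>

#include <cstddef>

template <std::size_t Rank>
struct shape_storage // hypothetical name
{
  cuda::std::array<long long, Rank> shape{}; // valid even when Rank == 0
  // long long shape[Rank]{};                // ill-formed when Rank == 0
};

shape_storage<0> s0; // rank-0 mdspan case: compiles
shape_storage<3> s3;
```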

return __tensor1;
}

::DLTensor get() const&& = delete;
Contributor Author

Without the explicit = delete, the compiler doesn't raise an error when get() is called on a temporary, even though the returned DLTensor's shape/strides pointers would dangle. A small illustration follows.
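
Hedged, self-contained illustration of the lifetime issue; owner_t is a stand-in for the shape/strides-owning wrapper (the real code returns a DLTensor rather than a raw pointer, but the problem is the same):

```cpp
#include <cstdint>

struct owner_t // hypothetical stand-in for the wrapper
{
  std::int64_t shape[1]{4};
  const std::int64_t* get() const&  { return shape; } // OK: caller keeps the owner alive
  const std::int64_t* get() const&& = delete;         // a temporary's storage would dangle
};

owner_t make_owner() { return {}; }

int main()
{
  owner_t o;
  const std::int64_t* p = o.get();                 // fine: `o` outlives `p`
  // const std::int64_t* bad = make_owner().get(); // rejected: deleted rvalue overload
  return *p == 4 ? 0 : 1;
}
```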
