Changes from all commits
27 commits
750ca5a
first version
fbusato Dec 18, 2025
f040c10
add unit test
fbusato Dec 19, 2025
464ccc2
documentation
fbusato Dec 19, 2025
6f32ae9
Update libcudacxx/include/cuda/__mdspan/mdspan_to_dlpack.h
fbusato Dec 19, 2025
3457d3a
Merge branch 'mdspan-to-dlpack' of github.com:fbusato/cccl into mdspa…
fbusato Dec 19, 2025
ee05eda
add many types
fbusato Dec 19, 2025
4d2e0da
remove operator->
fbusato Dec 19, 2025
f290320
formatting
fbusato Dec 19, 2025
7a22848
fix MSVC warning
fbusato Dec 20, 2025
f78db30
improve documentation
fbusato Dec 20, 2025
1467ab2
fix MSVC warning
fbusato Dec 20, 2025
136ab59
refactor vector type traits by removing conditional compilation for v…
fbusato Jan 5, 2026
501f48c
reenable vector types for CTK 13
fbusato Jan 5, 2026
bd6094c
Merge branch 'main' into mdspan-to-dlpack
fbusato Jan 5, 2026
604257d
fix msvc warning
fbusato Jan 5, 2026
eb2635a
fix index_operator.pass
fbusato Jan 6, 2026
8e813f1
Merge branch 'main' into mdspan-to-dlpack
fbusato Jan 6, 2026
b6a52cd
use internal type
fbusato Jan 7, 2026
0f8d8b7
Merge branch 'main' into mdspan-to-dlpack
fbusato Jan 7, 2026
87b6777
address comments from #7047
fbusato Jan 7, 2026
d6b5dcb
use CCCL_DISABLE_DLPACK
fbusato Jan 7, 2026
ced0dcd
Merge branch 'main' into mdspan-to-dlpack
fbusato Jan 14, 2026
ddb3a13
use _CCCL_HAS_DLPACK
fbusato Jan 14, 2026
c27bcbd
add negative test
fbusato Jan 14, 2026
b2ef919
handle deprecated warnings
fbusato Jan 14, 2026
c3e1a58
Merge branch 'main' into mdspan-to-dlpack
fbusato Jan 15, 2026
27c7329
add lifetimebound
fbusato Jan 15, 2026
6 changes: 6 additions & 0 deletions docs/libcudacxx/extended_api/mdspan.rst
@@ -10,6 +10,7 @@ Mdspan
mdspan/host_device_accessor
mdspan/restrict_accessor
mdspan/shared_memory_accessor
mdspan/mdspan_to_dlpack

.. list-table::
:widths: 25 45 30 30
@@ -34,3 +35,8 @@ Mdspan
- ``mdspan`` and accessor for CUDA shared memory
- CCCL 3.2.0
- CUDA 13.2

* - :ref:`mdspan to dlpack <libcudacxx-extended-api-mdspan-mdspan-to-dlpack>`
- Convert an ``mdspan`` to a ``DLTensor``
- CCCL 3.2.0
- CUDA 13.2
140 changes: 140 additions & 0 deletions docs/libcudacxx/extended_api/mdspan/mdspan_to_dlpack.rst
@@ -0,0 +1,140 @@
.. _libcudacxx-extended-api-mdspan-mdspan-to-dlpack:

``mdspan`` to DLPack
====================

This functionality provides conversions from ``cuda::host_mdspan``, ``cuda::device_mdspan``, and ``cuda::managed_mdspan`` to a `DLPack <https://dmlc.github.io/dlpack/latest/>`__ ``DLTensor`` view.

Defined in the ``<cuda/mdspan>`` header.

Conversion functions
--------------------

.. code:: cuda

namespace cuda {

template <typename T, typename Extents, typename Layout, typename Accessor>
[[nodiscard]] __dlpack_tensor<Extents::rank()>
to_dlpack_tensor(const cuda::host_mdspan<T, Extents, Layout, Accessor>& mdspan);

template <typename T, typename Extents, typename Layout, typename Accessor>
[[nodiscard]] __dlpack_tensor<Extents::rank()>
to_dlpack_tensor(const cuda::device_mdspan<T, Extents, Layout, Accessor>& mdspan);

template <typename T, typename Extents, typename Layout, typename Accessor>
[[nodiscard]] __dlpack_tensor<Extents::rank()>
to_dlpack_tensor(const cuda::managed_mdspan<T, Extents, Layout, Accessor>& mdspan);

} // namespace cuda

Types
-----

``__dlpack_tensor`` is an internal class that stores a ``DLTensor`` and owns the backing storage for its ``shape`` and ``strides`` pointers. The class does not use any heap allocation.

.. code:: cuda

namespace cuda {

template <size_t Rank>
class __dlpack_tensor {
public:
__dlpack_tensor() noexcept;
__dlpack_tensor(const __dlpack_tensor&) noexcept;
__dlpack_tensor(__dlpack_tensor&&) noexcept;
__dlpack_tensor& operator=(const __dlpack_tensor&) noexcept;
__dlpack_tensor& operator=(__dlpack_tensor&&) noexcept;
~__dlpack_tensor() noexcept = default;

    DLTensor& get() & noexcept [[lifetimebound]];
    const DLTensor& get() const & noexcept [[lifetimebound]];

    DLTensor& get() && noexcept = delete;
    const DLTensor& get() const && noexcept = delete;
};

} // namespace cuda


.. note:: **Lifetime**

   The ``DLTensor`` returned by ``get()`` must not outlive the ``cuda::__dlpack_tensor`` wrapper. If the wrapper is destroyed or moved from, the ``DLTensor::shape`` and ``DLTensor::strides`` pointers dangle, because they refer to storage owned by the wrapper.

.. note:: **Const-correctness**

   ``DLTensor::data`` points to ``mdspan.data_handle()`` (or is ``nullptr`` if ``mdspan.size() == 0``). If ``T`` is ``const``-qualified, the constness is cast away with ``const_cast`` because ``DLTensor::data`` is an unqualified ``void*``; the resulting view must be treated as read-only.

Semantics
---------

The conversion produces a non-owning DLPack view of the ``mdspan`` data and metadata:

- ``DLTensor::ndim`` is ``mdspan.rank()``.
- For rank > 0, ``DLTensor::shape[i]`` is ``mdspan.extent(i)``.
- For rank > 0, ``DLTensor::strides[i]`` is ``mdspan.stride(i)``.
- ``DLTensor::byte_offset`` is always ``0``.
- ``DLTensor::device`` is:

- ``{kDLCPU, 0}`` for ``cuda::host_mdspan``
- ``{kDLCUDA, /*device_id*/}`` for ``cuda::device_mdspan``
- ``{kDLCUDAManaged, 0}`` for ``cuda::managed_mdspan``

Element types are mapped to ``DLDataType`` according to the DLPack conventions, including:

- ``bool``.
- Signed and unsigned integers.
- IEEE-754 floating-point and extended-precision floating-point types, including ``__half``, ``__nv_bfloat16``, ``__float128``, and FP8, FP6, FP4 types when available.
- Complex: ``cuda::std::complex<__half>``, ``cuda::std::complex<float>``, and ``cuda::std::complex<double>``.
- `CUDA built-in vector types <https://docs.nvidia.com/cuda/cuda-programming-guide/05-appendices/cpp-language-extensions.html#built-in-types>`__, such as ``int2``, ``float4``, etc.
- Vector types for extended floating-point, such as ``__half2``, ``__nv_fp8x4_e4m3``, etc.

Constraints
-----------

- The accessor ``data_handle_type`` must be a pointer type.

Runtime errors
--------------

- If any ``extent(i)`` or ``stride(i)`` cannot be represented as an ``int64_t``, the conversion throws ``std::invalid_argument``.

Availability notes
------------------

- This API is available only when the DLPack header ``<dlpack/dlpack.h>`` is found in the include path.
- It can be disabled by defining ``CCCL_DISABLE_DLPACK`` before including any library header; in that case, ``<dlpack/dlpack.h>`` is not included.
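
As a usage sketch of the opt-out described above (a config fragment, assuming only what the bullets state), a translation unit that must not depend on DLPack defines the macro before its first CCCL include:

```cpp
// Must appear before any CCCL header is included in this translation unit.
#define CCCL_DISABLE_DLPACK
#include <cuda/mdspan> // the mdspan-to-DLPack API is disabled here
```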

References
----------

- `DLPack C API <https://dmlc.github.io/dlpack/latest/c_api.html>`__ documentation.

Example
-------

.. code:: cuda

#include <dlpack/dlpack.h>
#include <cuda/mdspan>
#include <cuda/std/cassert>
#include <cuda/std/cstdint>

int main() {
using extents_t = cuda::std::extents<size_t, 2, 3>;

int data[6] = {0, 1, 2, 3, 4, 5};
cuda::host_mdspan<int, extents_t> md{data, extents_t{}};

auto dl = cuda::to_dlpack_tensor(md);
  const auto& dltensor = dl.get();
  // Note: `get()` is lvalue-ref-qualified and returns a reference into `dl`.
  // `auto dltensor = dl.get();` would copy the DLTensor instead; the copy's
  // `shape`/`strides` pointers still refer to storage owned by `dl` and would
  // dangle if the copy outlived `dl`.

// `dl` owns the shape/stride storage; `dltensor.data` is a non-owning pointer to `data`.
assert(dltensor.device.device_type == kDLCPU);
assert(dltensor.ndim == 2);
assert(dltensor.shape[0] == 2 && dltensor.shape[1] == 3);
assert(dltensor.strides[0] == 3 && dltensor.strides[1] == 1);
assert(dltensor.data == data);
}