`DLPack` to `mdspan` #7047

fbusato · 2025-12-23T01:43:27Z

Description

The PR implements conversion utilities that take a DLTensor view and produce a (host/device/managed) mdspan of the same underlying memory.

The opposite conversion is implemented in mdspan to DLPack #7027. #7027 is also a prerequisite of this PR.
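As a rough illustration of the input side of such a conversion, the sketch below derives element strides from a DLTensor-like view. `TensorView` is a hypothetical, simplified stand-in for `DLTensor` (the real struct in `dlpack.h` also carries device, dtype, and byte-offset fields), and `element_strides` is not the API added by this PR. It assumes the DLPack convention that strides are counted in elements and that a null `strides` pointer, where the DLPack version allows it, means a compact row-major tensor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for DLTensor (assumption: only the
// fields needed for this sketch are modeled).
struct TensorView {
  void* data;
  int ndim;
  const std::int64_t* shape;
  const std::int64_t* strides;  // in elements; may be null (compact row-major)
};

// Returns the element strides of the view, synthesizing row-major
// strides when the view does not carry explicit ones.
inline std::vector<std::int64_t> element_strides(const TensorView& t) {
  assert(t.ndim >= 0);
  std::vector<std::int64_t> s(static_cast<std::size_t>(t.ndim));
  if (t.strides != nullptr) {
    for (int r = 0; r < t.ndim; ++r) {
      s[static_cast<std::size_t>(r)] = t.strides[r];
    }
    return s;
  }
  // Row-major: the last dimension is contiguous.
  std::int64_t running = 1;
  for (int r = t.ndim - 1; r >= 0; --r) {
    s[static_cast<std::size_t>(r)] = running;
    running *= t.shape[r];
  }
  return s;
}
```

For a shape of (3, 5, 11) with no explicit strides, this yields the strides of a `layout_right` mapping over the same extents.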

Todo:

documentation

Co-authored-by: David Bayer <[email protected]>

@davebayer

* Don't set current device in CUDA 13 and handle extended lambda * Add extended lambda test * Compiler workarounds * Waive extended lambda test on NVRTC * Apply suggestion from @davebayer --------- Co-authored-by: David Bayer <[email protected]>

oleksandr-pavlyk · 2026-01-06T14:23:24Z

I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

@mhoemmen Are there reasons this can not be done?

mhoemmen · 2026-01-06T17:49:23Z

@oleksandr-pavlyk wrote:

> I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

std::layout_stride::mapping CANNOT represent nonpositive strides.

The design intent of std::layout_stride::mapping is to represent any layout mapping resulting from one or more applications of submdspan to a layout_left or layout_right mapping. It's NOT a general strided layout mapping. It does NOT support broadcasting or negative strides.

This was always part of the design. It's something mdspan's layouts inherited from Kokkos::View. It's also part of the reason why the layout mapping requirements define a strided layout mapping (see is_strided()) separately from layout_stride. Relaxing this would break submdspan.

Therefore, I agree that if you want a layout mapping that can represent everything that DLPack can represent, then you'll need a custom layout mapping.
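To make the restriction concrete, here is a minimal sketch of the kind of run-time check a conversion targeting `layout_stride` would need. `strides_fit_layout_stride` is a hypothetical helper, not something from this PR or from CCCL. It tests only the positivity precondition discussed above; that is a necessary condition, not a sufficient one, since `layout_stride::mapping` must also be unique (no two indices may map to the same offset).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: returns true only if every stride is strictly
// positive, as std::layout_stride::mapping requires. Strides are in
// elements, as in DLPack. Passing this check does not by itself
// guarantee that the layout is non-overlapping.
inline bool strides_fit_layout_stride(
    const std::vector<std::int64_t>& strides) {
  for (std::int64_t s : strides) {
    if (s <= 0) {
      return false;  // broadcast (0) or reversed (negative) dimension
    }
  }
  return true;
}
```

A broadcast tensor with strides (0, 1, 5) or a reversed view with stride -1 fails this check, so a converter would have to reject it at run time or fall back to a custom layout mapping.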

You'll have this issue and worse with NumPy's ndarray format, because that uses byte strides instead of element strides. Supporting that natively would require a custom accessor as well. PR kokkos/mdspan#249 shows an example.
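The byte-stride problem can be sketched as follows. `byte_to_element_strides` is a hypothetical helper written for illustration only. It converts byte strides to element strides when every byte stride is a multiple of the element size, and reports failure otherwise; the failure case is where a custom accessor that works in bytes would be needed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper: converts NumPy-style byte strides to element
// strides. Returns std::nullopt if some byte stride is not a multiple
// of the element size, because element strides cannot express it.
inline std::optional<std::vector<std::int64_t>> byte_to_element_strides(
    const std::vector<std::int64_t>& byte_strides, std::int64_t elem_size) {
  assert(elem_size > 0);
  std::vector<std::int64_t> out;
  out.reserve(byte_strides.size());
  for (std::int64_t b : byte_strides) {
    if (b % elem_size != 0) {
      return std::nullopt;
    }
    out.push_back(b / elem_size);  // sign is preserved for negative strides
  }
  return out;
}
```

Note that a successful conversion can still produce zero or negative element strides, so the `layout_stride` restriction discussed above applies afterwards as well.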

leofang · 2026-01-07T05:04:18Z

I agree with @oleksandr-pavlyk:

> I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

@mhoemmen If strides must be positive, the whole DLPack <-> mdspan conversion is useless to Python land (which this and the other PR are targeting, IIUC). It's even dangerous to use, because errors due to nonpositive strides would only happen at run time, not at compile time. Not that I need these two PRs merged ASAP, but knowing this, I wouldn't even bother considering it for CuPy or cuda-core.

What is the solution, then? For example:

> Therefore, I agree that if you want a layout mapping that can represent everything that DLPack can represent, then you'll need a custom layout mapping.

What does a custom layout mapping entail, exactly? Maybe there is something obvious that I am missing here, due to my incomplete understanding of mdspan 😛 Also:

> Relaxing this would break submdspan.

Is submdspan a must-have for DLPack / Python users? I get that it is a thing that C++ users need (due to the standardization), but I can't see how we'd use it in Python. Assuming we are allowed to not worry about submdspan, is there any other reason that requires positive strides?

docs/libcudacxx/extended_api/mdspan/dlpack_to_mdspan.rst

docs/libcudacxx/extended_api/mdspan/mdspan_to_dlpack.rst

libcudacxx/include/cuda/__mdspan/dlpack_to_mdspan.h

libcudacxx/include/cuda/__mdspan/mdspan_to_dlpack.h

mhoemmen · 2026-01-08T23:14:33Z

@leofang wrote:

> What does a custom layout mapping entail, exactly?

A "custom layout mapping" is a user-defined type that meets the layout mapping requirements.

This file in the reference mdspan implementation's tests has examples.

It should be possible for us to write a custom layout mapping that supports arbitrary DLPack layouts. It would need to store an offset as well as strides, so that negative strides would still result in a nonnegative mapping result.
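The role of the offset can be sketched with plain integers. Both functions below are hypothetical helpers, not part of the PR. `min_nonnegative_offset` computes the smallest offset for which `offset + sum(i[r] * strides[r])` stays nonnegative over all valid indices, and `map_index` evaluates that expression for one multi-index. How a data pointer should be adjusted relative to this offset is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: smallest offset that keeps the mapping result
// nonnegative for every valid index.
inline std::int64_t min_nonnegative_offset(
    const std::vector<std::int64_t>& extents,
    const std::vector<std::int64_t>& strides) {
  assert(extents.size() == strides.size());
  std::int64_t offset = 0;
  for (std::size_t r = 0; r < extents.size(); ++r) {
    if (extents[r] == 0) {
      return 0;  // empty index space: there is no index to map
    }
    if (strides[r] < 0) {
      // The most negative contribution comes from the largest index.
      offset += (extents[r] - 1) * (-strides[r]);
    }
  }
  return offset;
}

// Evaluates offset + dot(indices, strides) for one multi-index.
inline std::int64_t map_index(std::int64_t offset,
                              const std::vector<std::int64_t>& strides,
                              const std::vector<std::int64_t>& indices) {
  assert(strides.size() == indices.size());
  std::int64_t result = offset;
  for (std::size_t r = 0; r < strides.size(); ++r) {
    result += indices[r] * strides[r];
  }
  return result;
}
```

For a reversed 1-D view with extent 4 and stride -1, the offset is 3, so index 0 maps to 3 and index 3 maps to 0.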

> Is submdspan a must-have for DLPack / Python users? I get that it is a thing that C++ users need (due to the standardization), but I can't see how we'd use it in Python. Assuming we are allowed to not worry about submdspan, is there any other reason that requires positive strides?

* layout_stride needs nonzero strides if all the extents are nonzero, because otherwise the mapping would not be unique. The C++ Standard specifies that layout_stride::mapping is always unique. This would have nothing to do with submdspan.
* layout_stride needs nonnegative strides because otherwise evaluating the mapping could give a negative result. That would violate the layout mapping requirements. This would have nothing to do with submdspan.
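Both points can be checked with a two-dimensional dot product. `strided_map` below is a hypothetical stand-in for a strided mapping without an offset; it is not `layout_stride::mapping` itself, only the arithmetic such a mapping performs.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a rank-2 strided mapping with no offset:
// the result is the dot product of the index pair and the strides.
inline std::int64_t strided_map(const std::array<std::int64_t, 2>& strides,
                                std::int64_t i, std::int64_t j) {
  return i * strides[0] + j * strides[1];
}
```

With strides (0, 1), the indices (0, 2) and (1, 2) produce the same result, so the mapping is not unique. With strides (1, -1), the index (0, 1) produces a negative result, which a layout mapping is not allowed to return.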

The following paper explains these issues: https://isocpp.org/files/papers/P3959R0.html .

mhoemmen · 2026-01-09T17:46:31Z

Here is a rough draft of a custom layout mapping that would support DLPack's layout (including zero and negative strides): https://godbolt.org/z/WEYazcsxT . I've commented out submdspan_mapping so the type doesn't yet support submdspan, but it's still a legal layout mapping. Getting it to support submdspan wouldn't be too hard.

namespace std {
  // work around issue in single header reference implementation
  using ::std::experimental::dims; 

  namespace impl {
    // [mdspan.layout.stride.expo] 2 defines OFFSET(m).
    // It's only ever applied to strided layout mappings.
    template<class Mapping>
      requires(Mapping::is_always_strided())
    constexpr auto offset(const Mapping& m) {
      // rank() is a static member function, so no `typename` here.
      if constexpr (Mapping::extents_type::rank() == 0) {
        return m();
      }
      else {
        using index_type = typename Mapping::index_type;
        constexpr auto rank = Mapping::extents_type::rank();
        bool any_zero = false;
        for (::std::size_t r = 0; r < rank; ++r) {
          any_zero = any_zero || (m.extents().extent(r) == 0);
        }
        if (any_zero) {
          return index_type(0);
        }
        else {
          constexpr auto zeros =
            []< ::std::size_t... Rs>(std::index_sequence<Rs...>) {
              return std::tuple{((void) Rs, index_type(0))...};
            } (std::make_index_sequence<rank>());
          return std::apply(m, zeros);
        }
      }
    }
  }

  class layout_stride_relaxed {
  public:
    template<class Extents>
    class mapping;
  };

  template<class Extents>
  class layout_stride_relaxed::mapping {
  public:
    using extents_type = Extents;
    using index_type = extents_type::index_type;
    using size_type = extents_type::size_type;
    using rank_type = extents_type::rank_type;
    using layout_type = layout_stride_relaxed;

  private:
    static constexpr rank_type rank_ = extents_type::rank();

  public:
    constexpr mapping() noexcept {
      const layout_right::mapping<extents_type> map{};
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = map.stride(d);
      }
    }

    constexpr mapping(const mapping&) noexcept = default;

    template<class OtherIndexType>
    requires(
      ::std::is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
      ::std::is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
    )
    constexpr mapping(
        const extents_type& e,
        ::std::span<OtherIndexType, rank_> s,
        ::std::size_t offset = 0) noexcept
      : extents_(e), offset_(offset)
    {
      for (::std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = s[d];
      }
    }

    template<class OtherIndexType>
    requires(
      is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
      is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
    )
    constexpr mapping(
        const extents_type& e,
        const ::std::array<OtherIndexType, rank_>& s,
        ::std::size_t offset = 0) noexcept
      : extents_(e), offset_(offset)
    {
      for (::std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = s[d];
      }
    }

    // m IS a layout_stride_relaxed::mapping
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      ::std::is_constructible_v<
        extents_type,
        typename OtherMapping::extents_type
      > &&
      ::std::is_same_v<
        layout_type,
        typename OtherMapping::layout_type
      >
    )
    constexpr explicit(
      ! (
        ::std::is_convertible_v<
          typename OtherMapping::extents_type, extents_type
        >
      )
    )
    mapping(const OtherMapping& m) noexcept
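      // Use the public observer: m may be a different specialization,
      // whose private members are not accessible here.
      : extents_(m.extents()), offset_(m.offset())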
    {
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = m.stride(d);
      }
    }

    // m is NOT a layout_stride_relaxed::mapping
    template<class StridedLayoutMapping>
    requires(
      detail::layout_mapping_alike<StridedLayoutMapping> &&
      ::std::is_constructible_v<
        extents_type,
        typename StridedLayoutMapping::extents_type> &&
      StridedLayoutMapping::is_always_unique() &&
      StridedLayoutMapping::is_always_strided()
    )
    constexpr explicit(
      ! (
        ::std::is_convertible_v<
          typename StridedLayoutMapping::extents_type, extents_type
        > && (
          detail::is_mapping_of<layout_left, StridedLayoutMapping> ||
          detail::is_mapping_of<layout_right, StridedLayoutMapping> ||
          experimental::detail::is_layout_left_padded_mapping<
            StridedLayoutMapping>::value ||
          experimental::detail::is_layout_right_padded_mapping<
            StridedLayoutMapping>::value ||
          detail::is_mapping_of<layout_stride, StridedLayoutMapping>
        )
      )
    )
    mapping(const StridedLayoutMapping& m) noexcept
      : extents_(m.extents())
    {
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = m.stride(d);
      }
    }
 
    constexpr mapping& operator=(const mapping&) noexcept = default;

    // [mdspan.layout.stride.obs], observers
    constexpr const extents_type& extents() const noexcept {
      return extents_;
    }
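    constexpr ::std::array<index_type, rank_> strides() const noexcept {
      // strides_ is stored as intptr_t, so convert element by element.
      ::std::array<index_type, rank_> result{};
      for (::std::size_t d = 0; d < rank_; ++d) {
        result[d] = static_cast<index_type>(strides_[d]);
      }
      return result;
    }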
    constexpr ::std::intptr_t offset() const noexcept {
      return offset_;
    }

    constexpr index_type required_span_size() const noexcept {
      // Per the layout mapping requirements, the result is 0 if any
      // extent is zero, else 1 plus the maximum mapping result.
      // The dot product of indices and strides is linear, so over
      // all valid indices its maximum is attained at an extremum:
      // the min index (0) if the stride is negative, or the max
      // index (extent(r) - 1) if the stride is nonnegative.
      index_type dot = 0;
      for (std::size_t r = 0; r < rank_; ++r) {
        const index_type ext = extents_.extent(r);
        if (ext == 0) {
          return index_type(0);
        }
        if (strides_[r] > 0) {
          dot += (ext - index_type(1)) * strides_[r];
        }
      }
      return offset() + dot + index_type(1);
    }

    template<class... Indices>
    requires(
      sizeof...(Indices) == rank_ &&
      (::std::is_convertible_v<Indices, index_type> && ...) &&
      (::std::is_nothrow_constructible_v<index_type, Indices> && ...)
    )
    constexpr index_type operator()(Indices... inds) const noexcept {
      return offset() +
        [&, this]<::std::size_t... Rs>(::std::index_sequence<Rs...>) {
          return ((inds...[Rs] * strides_[Rs]) + ... + index_type(0));
        } (::std::make_index_sequence<rank_>());
    }

    static constexpr bool is_always_unique() noexcept { return false; }
    static constexpr bool is_always_exhaustive() noexcept { return false; }
    // It's technically NOT always strided, because of the offset
    // (to accommodate negative strides)
    static constexpr bool is_always_strided() noexcept { return false; }

    constexpr bool is_unique() const noexcept {
      // The Standard doesn't require that this be exact.
      // Possibility of negative strides with an offset
      // makes that harder to figure out.
      return false;  
    }
    constexpr bool is_exhaustive() const noexcept {
      // The Standard doesn't require that this be exact.
      // Possibility of negative strides with an offset
      // makes that harder to figure out.
      return false;  
    }
    constexpr bool is_strided() const noexcept {
      return offset_ == 0;
    }

    constexpr index_type stride(rank_type i) const noexcept {
      return strides_[i];
    }

    // y is also a layout_stride_relaxed::mapping
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      rank_ == OtherMapping::extents_type::rank() &&
      ::std::is_same_v<layout_type, typename OtherMapping::layout_type>
    )
    friend constexpr bool
    operator==(const mapping& x, const OtherMapping& y) noexcept {
      return x.extents() == y.extents() &&
      x.offset() == y.offset() &&
      [&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
        return ((x.stride(Rs) == y.stride(Rs)) && ...);
      } (::std::make_index_sequence<rank_>());
    }

    // y is NOT a layout_stride_relaxed::mapping but is strided.
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      rank_ == OtherMapping::extents_type::rank() &&
      OtherMapping::is_always_strided()
    )
    friend constexpr bool
    operator==(const mapping& x, const OtherMapping& y) noexcept {
      return x.extents() == y.extents() &&
      impl::offset(y) == x.offset_ &&
      [&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
        return ((x.stride(Rs) == y.stride(Rs)) && ...);
      } (::std::make_index_sequence<rank_>());
    }

  private:
    extents_type extents_{};
    std::intptr_t offset_ = 0;
    array<std::intptr_t, rank_> strides_{};

#if 0
    // [mdspan.sub.map], submdspan mapping specialization
    template<class... SliceSpecifiers>
      constexpr auto submdspan-mapping-impl(SliceSpecifiers...) const
        -> /* see-below */;

    template<class... SliceSpecifiers>
      friend constexpr auto submdspan_mapping(
        const mapping& src, SliceSpecifiers... slices) {
          return src.submdspan-mapping-impl(slices...);
      }
#endif // 0
  };
} // namespace std

int main() {
  std::dims<3> exts(3, 5, 11);  
  std::array<std::intptr_t, 3> strides{0, 1, 5}; // broadcasting
  std::layout_stride_relaxed::mapping<std::dims<3>> map(exts, strides);

  assert(map(0, 1, 1) == map(1, 1, 1));

  return 0;
}

github-actions · 2026-01-14T20:08:54Z

😬 CI Workflow Results

🟥 Finished in 1h 36m: Pass: 88%/84 | Total: 1d 08h | Max: 1h 35m | Hits: 79%/155595

See results here.

fbusato and others added 30 commits December 18, 2025 12:16

first version

750ca5a

add unit test

f040c10

documentation

464ccc2

Update libcudacxx/include/cuda/__mdspan/mdspan_to_dlpack.h

6f32ae9

Co-authored-by: David Bayer <[email protected]>

Merge branch 'mdspan-to-dlpack' of github.com:fbusato/cccl into mdspa…

3457d3a

…n-to-dlpack

add many types

ee05eda

remove operator->

4d2e0da

formatting

f290320

fix MSVC warning

7a22848

improve documentation

f78db30

fix MSVC warning

1467ab2

first version

d844f65

complete the implementation

3843556

add unit test

977909f

cuda.coop: Use cuda.core.experimental.Linker instead of internal numb…

b0e1fbc

…a-cuda Linker to link LTO (NVIDIA#7011) Co-authored-by: Ashwin Srinath <[email protected]>

Make c2h vector comparisons constexpr (NVIDIA#7009)

50da3d4

improves comments on decoupled lookback example (NVIDIA#7015)

f8a4d06

Extract reduce_op_sync into a free function (NVIDIA#7004)

e9f0a13

This allows us to use it independently

Remove experimental namespace from cuda.core import (NVIDIA#7022)

362d316

reexpress completion signature transform alias to make clangd happy (N…

28d22c9

…VIDIA#7026)

Qualify call to __launch_impl in launch.h to avoid ambiguity errors (…

1e28e8c

…NVIDIA#7024) Co-authored-by: pciolkosz <[email protected]>

Rework hierarchy levels (NVIDIA#6957)

f21a158

* Rework hierarchy levels * add missing launches to native cluster level queries * remove dependency on runtime storage --------- Co-authored-by: pciolkosz <[email protected]>

Use vectorized tuning for triad benchmark for dtypes of size 2 (NVIDI…

1ef85d4

…A#7019)

[libcu++] Fix synchronous resource adapter property passing (NVIDIA#6976

00a1b95

) * Fix synchronous resource adapter property passing * Hide pinned pool on older CUDA versions * Workaround MSVC bug * Missing maybe_unused

[libcu++] Remove _view from the shared memory getter name (NVIDIA#6997)

adc23f5

* Remove _view from the shared memory getter * Forgot about cudax

[thrust] Ignore CUDA free errors in thrust memory resource (NVIDIA#7002)

33aa542

* Ignore CUDA free errors in thrust memory resource * Add a comment

the <stdexcept> header must be included when using _CCCL_THROW, …

6402bc6

…regardless of exception support (NVIDIA#7028) Co-authored-by: David Bayer <[email protected]>

Error out when nvrtcc cannot parse cuda_thread_count (NVIDIA#7035)

5546b87

Allow all public headers to be included with host compilers only (NVI…

58aba1d

…DIA#7012)

fix compiler warnings

e96ebea

fbusato and others added 8 commits January 5, 2026 12:02

refactor vector type traits by removing conditional compilation for v…

136ab59

…ector types below version 13.0

reenable vector types for CTK 13

501f48c

Merge branch 'main' into mdspan-to-dlpack

bd6094c

fix msvc warning

604257d

Merge branch 'mdspan-to-dlpack' into dlpack-to-mdspan

f7c5eb4

documentation and copyright

14cf251

fix index_operator.pass

eb2635a

fix formatting

ea7e4e4

fbusato added 2 commits January 6, 2026 11:17

Merge branch 'main' into mdspan-to-dlpack

8e813f1

Merge branch 'main' into dlpack-to-mdspan

c20e897

davebayer reviewed Jan 7, 2026

fbusato and others added 4 commits January 7, 2026 09:33

use internal type

b6a52cd

Merge branch 'main' into mdspan-to-dlpack

0f8d8b7

Merge branch 'mdspan-to-dlpack' into dlpack-to-mdspan

1c7f5d4

address comments

9bbf73b

fbusato added a commit to fbusato/cccl that referenced this pull request Jan 7, 2026

address comments from NVIDIA#7047

87b6777

fbusato and others added 2 commits January 14, 2026 10:23

Merge branch 'main' into dlpack-to-mdspan

83b6eb2

use _CCCL_HAS_DLPACK

ba38968

alliepiper removed the 3.2.0 label Jan 15, 2026

15 participants
