@fbusato
Contributor

@fbusato fbusato commented Dec 23, 2025

Description

This PR implements conversion utilities that take a DLTensor view and produce a host, device, or managed mdspan over the same underlying memory.

The opposite conversion is implemented in "mdspan to DLPack" (#7027), which is also a prerequisite of this PR.

Todo:

  • documentation

fbusato and others added 30 commits December 18, 2025 12:16
This allows us to use it independently
* Rework hierarchy levels

* add missing launches to native cluster level queries

* remove dependency on runtime storage

---------

Co-authored-by: pciolkosz <[email protected]>
)

* Fix synchronous resource adapter property passing

* Hide pinned pool on older CUDA versions

* Workaround MSVC bug

* Missing maybe_unused
* Remove _view from the shared memory getter

* Forgot about cudax
* Ignore CUDA free errors in thrust memory resource

* Add a comment
* Don't set current device in CUDA 13 and handle extended lambda

* Add extended lambda test

* Compiler workarounds

* Waive extended lambda test on NVRTC

* Apply suggestion from @davebayer

---------

Co-authored-by: David Bayer <[email protected]>
…regardless of exception support (NVIDIA#7028)

Co-authored-by: David Bayer <[email protected]>

@oleksandr-pavlyk
Contributor

I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

@mhoemmen Are there reasons this can not be done?

@mhoemmen
Contributor

mhoemmen commented Jan 6, 2026

@oleksandr-pavlyk wrote:

I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

std::layout_stride::mapping CANNOT represent nonpositive strides.

The design intent of std::layout_stride::mapping is to represent any layout mapping resulting from one or more applications of submdspan to a layout_left or layout_right mapping. It's NOT a general strided layout mapping. It does NOT support broadcasting or negative strides.

This was always part of the design. It's something mdspan's layouts inherited from Kokkos::View. It's also part of the reason why the layout mapping requirements define a strided layout mapping (see is_strided()) separately from layout_stride. Relaxing this would break submdspan.

Therefore, I agree that if you want a layout mapping that can represent everything that DLPack can represent, then you'll need a custom layout mapping.

You'll have this issue and worse with NumPy's ndarray format, because that uses byte strides instead of element strides. Supporting that natively would require a custom accessor as well. PR kokkos/mdspan#249 shows an example.
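To make the byte-stride point concrete, here is a minimal sketch of converting NumPy-style byte strides into element strides. The helper name `byte_to_element_strides` is hypothetical (it is not part of CCCL, DLPack, or NumPy); it only illustrates why the conversion can fail when a byte stride is not a whole multiple of the element size, which is the case that would need a custom accessor.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper for illustration only.
// Converts byte strides into element strides. Returns std::nullopt when a
// byte stride is not a whole multiple of element_size, because element
// strides cannot describe such a layout.
inline std::optional<std::vector<std::int64_t>>
byte_to_element_strides(const std::vector<std::int64_t>& byte_strides,
                        std::int64_t element_size) {
  if (element_size <= 0) {
    return std::nullopt;
  }
  std::vector<std::int64_t> element_strides;
  element_strides.reserve(byte_strides.size());
  for (std::int64_t bs : byte_strides) {
    if (bs % element_size != 0) {
      return std::nullopt;  // not expressible in whole elements
    }
    element_strides.push_back(bs / element_size);  // sign is preserved
  }
  return element_strides;
}
```

For example, byte strides `{40, 8}` with an 8-byte element become element strides `{5, 1}`, while byte strides `{12, 6}` with a 4-byte element cannot be converted.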

@leofang
Member

leofang commented Jan 7, 2026

I agree with @oleksandr-pavlyk:

I feel very strongly that if std::layout_stride can not represent strides that can be encountered in DLPack, i.e., non-positive strides, then CCCL must implement a layout object that can.

@mhoemmen If strides must be positive, the whole DLPack <-> mdspan conversion is useless to Python land (which this and the other PR are targeting, IIUC). It's even dangerous to use, because errors due to nonpositive strides would only surface at run time, not at compile time. Not that I need these two PRs merged ASAP, but knowing this, I wouldn't even consider it for CuPy or cuda-core.

What is the solution, then? For example

Therefore, I agree that if you want a layout mapping that can represent everything that DLPack can represent, then you'll need a custom layout mapping.

What does a custom layout mapping entail, exactly? Maybe there is something obvious that I am missing here, due to my incomplete understanding of mdspan 😛 Also,

Relaxing this would break submdspan.

Is submdspan a must-have for DLPack / Python users? I get that it is a thing that C++ users need (due to the standardization), but I can't see how we'd use it in Python. Assuming we are allowed to not worry about submdspan, is there any other reason that requires positive strides?
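The run-time concern raised above follows from DLPack strides being plain run-time integers, so a conversion to std::layout_stride can only reject them when it executes. The sketch below shows the kind of check that would be involved. The names `StrideCheck` and `check_strides_for_layout_stride` are hypothetical and are not taken from this PR or from CCCL. The sketch tests only the sign of each stride, not the other layout_stride preconditions.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for illustration only.
enum class StrideCheck { ok, zero_stride, negative_stride };

// Hypothetical run-time check: reports the first stride that
// std::layout_stride could not represent. Strides are element strides,
// as in DLPack.
inline StrideCheck
check_strides_for_layout_stride(const std::vector<std::int64_t>& strides) {
  for (std::int64_t s : strides) {
    if (s == 0) {
      return StrideCheck::zero_stride;      // e.g. a broadcast dimension
    }
    if (s < 0) {
      return StrideCheck::negative_stride;  // e.g. a reversed view
    }
  }
  return StrideCheck::ok;
}
```

A row-major 3x5 tensor with strides `{5, 1}` passes, while a broadcast view with strides `{0, 1}` or a reversed view with strides `{-1}` is rejected only once the check runs.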

fbusato added a commit to fbusato/cccl that referenced this pull request Jan 7, 2026

@mhoemmen
Contributor

mhoemmen commented Jan 8, 2026

@leofang wrote:

What does a custom layout mapping entail, exactly?

A "custom layout mapping" is a user-defined type that meets the layout mapping requirements.

This file in the reference mdspan implementation's tests has examples.

It should be possible for us to write a custom layout mapping that supports arbitrary DLPack layouts. It would need to store an offset as well as strides, so that negative strides would still result in a nonnegative mapping result.
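As a rough illustration of why the offset is needed, the sketch below computes the smallest offset that keeps every mapping result nonnegative. The helper names `min_offset` and `map_index` are hypothetical and are not part of CCCL or any mdspan implementation. The sketch works on plain vectors rather than on a real layout mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration only.
// Returns the smallest offset such that
//   offset + sum_r(index[r] * strides[r]) >= 0
// for every valid index. Each negative stride contributes
// (extent - 1) * |stride|, which is how far its last element lies below 0.
inline std::int64_t min_offset(const std::vector<std::int64_t>& extents,
                               const std::vector<std::int64_t>& strides) {
  assert(extents.size() == strides.size());
  std::int64_t offset = 0;
  for (std::size_t r = 0; r < extents.size(); ++r) {
    if (strides[r] < 0 && extents[r] > 0) {
      offset += (extents[r] - 1) * -strides[r];
    }
  }
  return offset;
}

// Evaluates offset + dot(index, strides).
inline std::int64_t map_index(std::int64_t offset,
                              const std::vector<std::int64_t>& strides,
                              const std::vector<std::int64_t>& index) {
  assert(strides.size() == index.size());
  std::int64_t result = offset;
  for (std::size_t r = 0; r < strides.size(); ++r) {
    result += index[r] * strides[r];
  }
  return result;
}
```

For a reversed 1-D view with extent 4 and stride -1, the offset is 3, so index 0 maps to 3 and index 3 maps to 0.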

Is submdspan a must-have for DLPack / Python users? I get that it is a thing that C++ users need (due to the standardization), but I can't see how we'd use it in Python. Assuming we are allowed to not worry about submdspan, is there any other reason that requires positive strides?

  • layout_stride needs nonzero strides if all the extents are nonzero, because otherwise the mapping would not be unique. The C++ Standard specifies that layout_stride::mapping is always unique. This has nothing to do with submdspan.

  • layout_stride needs nonnegative strides because otherwise evaluating the mapping could give a negative result. That would violate the layout mapping requirements. This also has nothing to do with submdspan.

The following paper explains these issues: https://isocpp.org/files/papers/P3959R0.html .
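Both points can be checked by brute force on a small 2-D example. The sketch below enumerates every index of a rank-2 index space, evaluates the plain dot product of indices and strides (no offset), and reports whether the results are unique and what the smallest result is. `MappingCheck` and `check_2d` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>

// Hypothetical result type for illustration only.
struct MappingCheck {
  bool unique;               // true if no two indices map to the same value
  std::int64_t min_result;   // smallest value of i * s0 + j * s1
};

// Enumerates all (i, j) with 0 <= i < e0 and 0 <= j < e1 and evaluates
// the offset-free strided mapping i * s0 + j * s1.
inline MappingCheck check_2d(std::int64_t e0, std::int64_t e1,
                             std::int64_t s0, std::int64_t s1) {
  std::set<std::int64_t> seen;
  MappingCheck result{true, 0};
  for (std::int64_t i = 0; i < e0; ++i) {
    for (std::int64_t j = 0; j < e1; ++j) {
      const std::int64_t value = i * s0 + j * s1;
      if (!seen.insert(value).second) {
        result.unique = false;  // a second index hit the same element
      }
      result.min_result = std::min(result.min_result, value);
    }
  }
  return result;
}
```

With extents 3x5, strides `(5, 1)` give a unique, nonnegative mapping. Strides `(0, 1)` lose uniqueness, and strides `(5, -1)` stay unique but produce results as low as -4.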

@mhoemmen
Contributor

mhoemmen commented Jan 9, 2026

Here is a rough draft of a custom layout mapping that would support DLPack's layout (including zero and negative strides): https://godbolt.org/z/WEYazcsxT . I've commented out submdspan_mapping so the type doesn't yet support submdspan, but it's still a legal layout mapping. Getting it to support submdspan wouldn't be too hard.

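// Standard headers used below. The reference mdspan implementation's
// single header must also be included; its path depends on your setup.
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <span>
#include <tuple>
#include <type_traits>
#include <utility>

namespace std {
  // work around issue in single header reference implementation
  using ::std::experimental::dims;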

  namespace impl {
    // [mdspan.layout.stride.expo] 2 defines OFFSET(m).
    // It's only ever applied to strided layout mappings.
    template<class Mapping>
      requires(Mapping::is_always_strided())
    constexpr auto offset(const Mapping& m) {
      // rank() is a static member function, not a type, so no `typename`.
      if constexpr (Mapping::extents_type::rank() == 0) {
        return m();
      }
      else {
        using index_type = typename Mapping::index_type;
        constexpr auto rank = Mapping::extents_type::rank();
        bool any_zero = false;
        for (::std::size_t r = 0; r < rank; ++r) {
          any_zero = any_zero || (m.extents().extent(r) == 0);
        }
        if (any_zero) {
          return index_type(0);
        }
        else {
          constexpr auto zeros =
            []< ::std::size_t... Rs>(std::index_sequence<Rs...>) {
              return std::tuple{((void) Rs, index_type(0))...};
            } (std::make_index_sequence<rank>());
          return std::apply(m, zeros);
        }
      }
    }
  }

  class layout_stride_relaxed {
  public:
    template<class Extents>
    class mapping;
  };

  template<class Extents>
  class layout_stride_relaxed::mapping {
  public:
    using extents_type = Extents;
    using index_type = extents_type::index_type;
    using size_type = extents_type::size_type;
    using rank_type = extents_type::rank_type;
    using layout_type = layout_stride_relaxed;

  private:
    static constexpr rank_type rank_ = extents_type::rank();

  public:
    constexpr mapping() noexcept {
      const layout_right::mapping<extents_type> map{};
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = map.stride(d);
      }
    }

    constexpr mapping(const mapping&) noexcept = default;

    template<class OtherIndexType>
    requires(
      ::std::is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
      ::std::is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
    )
    constexpr mapping(
        const extents_type& e,
        ::std::span<OtherIndexType, rank_> s,
        ::std::size_t offset = 0) noexcept
      : extents_(e), offset_(offset)
    {
      for (::std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = s[d];
      }
    }

    template<class OtherIndexType>
    requires(
      is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
      is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
    )
    constexpr mapping(
        const extents_type& e,
        const ::std::array<OtherIndexType, rank_>& s,
        ::std::size_t offset = 0) noexcept
      : extents_(e), offset_(offset)
    {
      for (::std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = s[d];
      }
    }

    // m IS a layout_stride_relaxed::mapping
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      ::std::is_constructible_v<
        extents_type,
        typename OtherMapping::extents_type
      > &&
      ::std::is_same_v<
        layout_type,
        typename OtherMapping::layout_type
      >
    )
    constexpr explicit(
      ! (
        ::std::is_convertible_v<
          typename OtherMapping::extents_type, extents_type
        >
      )
    )
    mapping(const OtherMapping& m) noexcept
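      // Use the public observer: OtherMapping may be a different
      // specialization, whose private members are not accessible here.
      : extents_(m.extents()), offset_(m.offset())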
    {
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = m.stride(d);
      }
    }

    // m is NOT a layout_stride_relaxed::mapping
    template<class StridedLayoutMapping>
    requires(
      detail::layout_mapping_alike<StridedLayoutMapping> &&
      ::std::is_constructible_v<
        extents_type,
        typename StridedLayoutMapping::extents_type> &&
      StridedLayoutMapping::is_always_unique() &&
      StridedLayoutMapping::is_always_strided()
    )
    constexpr explicit(
      ! (
        ::std::is_convertible_v<
          typename StridedLayoutMapping::extents_type, extents_type
        > && (
          detail::is_mapping_of<layout_left, StridedLayoutMapping> ||
          detail::is_mapping_of<layout_right, StridedLayoutMapping> ||
          experimental::detail::is_layout_left_padded_mapping<
            StridedLayoutMapping>::value ||
          experimental::detail::is_layout_right_padded_mapping<
            StridedLayoutMapping>::value ||
          detail::is_mapping_of<layout_stride, StridedLayoutMapping>
        )
      )
    )
    mapping(const StridedLayoutMapping& m) noexcept
      : extents_(m.extents())
    {
      for (std::size_t d = 0; d < rank_; ++d) {
        strides_[d] = m.stride(d);
      }
    }
 
    constexpr mapping& operator=(const mapping&) noexcept = default;

    // [mdspan.layout.stride.obs], observers
    constexpr const extents_type& extents() const noexcept {
      return extents_;
    }
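    constexpr ::std::array<index_type, rank_> strides() const noexcept {
      // strides_ stores intptr_t, and std::array has no converting
      // constructor, so convert element by element.
      ::std::array<index_type, rank_> result{};
      for (::std::size_t d = 0; d < rank_; ++d) {
        result[d] = static_cast<index_type>(strides_[d]);
      }
      return result;
    }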
    constexpr ::std::intptr_t offset() const noexcept {
      return offset_;
    }

    constexpr index_type required_span_size() const noexcept {
      // An empty index space needs no storage.
      for (std::size_t r = 0; r < rank_; ++r) {
        if (extents_.extent(r) == 0) {
          return index_type(0);
        }
      }
      // The dot product of indices and strides is linear.
      // Thus, over all valid indices, the max value of the
      // dot product is achieved at the extrema: either the
      // min index (0) if the stride is negative, or the max
      // index (extent(r) - 1) if the stride is nonnegative.
      // Only positive strides therefore contribute to the max.
      index_type dot = 0;
      for (std::size_t r = 0; r < rank_; ++r) {
        if (strides_[r] > 0) {
          dot += (extents_.extent(r) - index_type(1)) * strides_[r];
        }
      }
      // The span size is one more than the largest mapping result.
      return offset() + dot + index_type(1);
    }

    template<class... Indices>
    requires(
      sizeof...(Indices) == rank_ &&
      (::std::is_convertible_v<Indices, index_type> && ...) &&
      (::std::is_nothrow_constructible_v<index_type, Indices> && ...)
    )
    constexpr index_type operator()(Indices... inds) const noexcept {
      return offset() +
        [&, this]<::std::size_t... Rs>(::std::index_sequence<Rs...>) {
          return ((inds...[Rs] * strides_[Rs]) + ... + index_type(0));
        } (::std::make_index_sequence<rank_>());
    }

    static constexpr bool is_always_unique() noexcept { return false; }
    static constexpr bool is_always_exhaustive() noexcept { return false; }
    // It's technically NOT always strided, because of the offset
    // (to accommodate negative strides)
    static constexpr bool is_always_strided() noexcept { return false; }

    constexpr bool is_unique() const noexcept {
      // The Standard doesn't require that this be exact.
      // Possibility of negative strides with an offset
      // makes that harder to figure out.
      return false;  
    }
    constexpr bool is_exhaustive() const noexcept {
      // The Standard doesn't require that this be exact.
      // Possibility of negative strides with an offset
      // makes that harder to figure out.
      return false;  
    }
    constexpr bool is_strided() const noexcept {
      return offset_ == 0;
    }

    constexpr index_type stride(rank_type i) const noexcept {
      return strides_[i];
    }

    // y is also a layout_stride_relaxed::mapping
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      rank_ == OtherMapping::extents_type::rank() &&
      ::std::is_same_v<layout_type, typename OtherMapping::layout_type>
    )
    friend constexpr bool
    operator==(const mapping& x, const OtherMapping& y) noexcept {
      return x.extents() == y.extents() &&
      // y may be a different specialization, so use the public observer.
      x.offset() == y.offset() &&
      [&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
        return ((x.stride(Rs) == y.stride(Rs)) && ...);
      } (::std::make_index_sequence<rank_>());
    }

    // y is NOT a layout_stride_relaxed::mapping but is strided.
    template<class OtherMapping>
    requires(
      detail::layout_mapping_alike<OtherMapping> &&
      rank_ == OtherMapping::extents_type::rank() &&
      OtherMapping::is_always_strided()
    )
    friend constexpr bool
    operator==(const mapping& x, const OtherMapping& y) noexcept {
      return x.extents() == y.extents() &&
      impl::offset(y) == x.offset_ &&
      [&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
        return ((x.stride(Rs) == y.stride(Rs)) && ...);
      } (::std::make_index_sequence<rank_>());
    }

  private:
    extents_type extents_{};
    std::intptr_t offset_ = 0;
    array<std::intptr_t, rank_> strides_{};

#if 0
    // [mdspan.sub.map], submdspan mapping specialization
    template<class... SliceSpecifiers>
      constexpr auto submdspan-mapping-impl(SliceSpecifiers...) const
        -> /* see-below */;

    template<class... SliceSpecifiers>
      friend constexpr auto submdspan_mapping(
        const mapping& src, SliceSpecifiers... slices) {
          return src.submdspan-mapping-impl(slices...);
      }
#endif // 0
  };
} // namespace std

int main() {
  std::dims<3> exts(3, 5, 11);  
  std::array<std::intptr_t, 3> strides{0, 1, 5}; // broadcasting
  std::layout_stride_relaxed::mapping<std::dims<3>> map(exts, strides);

  assert(map(0, 1, 1) == map(1, 1, 1));

  return 0;
}

@github-actions
Contributor

😬 CI Workflow Results

🟥 Finished in 1h 36m: Pass: 88%/84 | Total: 1d 08h | Max: 1h 35m | Hits: 79%/155595

See results here.

@alliepiper alliepiper removed the 3.2.0 label Jan 15, 2026