DLPack to mdspan
#7047
Conversation
Commits
- Co-authored-by: David Bayer <[email protected]>
- …a-cuda Linker to link LTO (NVIDIA#7011). Co-authored-by: Ashwin Srinath <[email protected]>
- This allows us to use it independently
- …NVIDIA#7024). Co-authored-by: pciolkosz <[email protected]>
- Rework hierarchy levels; add missing launches to native cluster level queries; remove dependency on runtime storage. Co-authored-by: pciolkosz <[email protected]>
- Remove _view from the shared memory getter; forgot about cudax
- Ignore CUDA free errors in thrust memory resource; add a comment
- Don't set current device in CUDA 13 and handle extended lambda; add extended lambda test; compiler workarounds; waive extended lambda test on NVRTC; apply suggestion from @davebayer. Co-authored-by: David Bayer <[email protected]>
- …regardless of exception support (NVIDIA#7028). Co-authored-by: David Bayer <[email protected]>
- …ector types below version 13.0
@oleksandr-pavlyk:
I feel very strongly that if … @mhoemmen Are there reasons this cannot be done?
@mhoemmen:
@oleksandr-pavlyk wrote:
The design intent of … This was always part of the design. It's something … Therefore, I agree that if you want a layout mapping that can represent everything that DLPack can represent, then you'll need a custom layout mapping. You'll have this issue and worse with NumPy's …
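To illustrate the NumPy point with an editorial sketch (not part of the original comment): a reversed view such as NumPy's a[::-1] is described by a data pointer at the end of the buffer plus a negative stride, a combination std::layout_stride cannot express:

#include <cstdint>
#include <cstdio>

int main() {
  // Five elements viewed in reverse, as NumPy's a[::-1] describes it:
  // the view's data pointer is the LAST element, and the stride is -1.
  std::int64_t data[5] = {10, 20, 30, 40, 50};
  std::int64_t* base = data + 4;
  std::int64_t stride = -1;
  for (std::int64_t i = 0; i < 5; ++i) {
    std::printf("%lld ", static_cast<long long>(base[i * stride]));  // 50 40 30 20 10
  }
  std::printf("\n");
  return 0;
}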
@leofang:
I agree with @oleksandr-pavlyk:
@mhoemmen If strides must be positive, the whole DLPack <-> mdspan conversion is useless to Python land (which this PR and the other one are targeting, IIUC). It's even dangerous to use, because errors due to nonpositive strides would only surface at run time, not at compile time. Not that I need these two PRs merged ASAP, but knowing this I wouldn't even bother considering them for CuPy or cuda-core. What is the solution, then? For example …
What does a custom layout mapping entail, exactly? Maybe there is something obvious that I am missing here, due to my lack of full understanding of mdspan 😛 Also, …
Is …
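To make the run-time hazard concrete (an editorial sketch, not code from the thread): std::layout_stride's constructors have a precondition that every stride be positive, so a NumPy-style negative stride compiles cleanly and only misbehaves at run time:

#include <mdspan>  // C++23
#include <array>
#include <cstdint>

int main() {
  std::dextents<std::int64_t, 1> exts(5);
  std::array<std::int64_t, 1> strides{-1};  // what NumPy's a[::-1] would need
  // Compiles without complaint, but violates the precondition that all
  // strides be positive: undefined behavior, with no compile-time diagnostic.
  std::layout_stride::mapping map(exts, strides);
  (void) map;
  return 0;
}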
@mhoemmen:
@leofang wrote:
A "custom layout mapping" is a user-defined type that meets the layout mapping requirements. This file in the reference mdspan implementation's tests has examples. It should be possible for us to write a custom layout mapping that supports arbitrary DLPack layouts. It would need to store an offset as well as strides, so that negative strides would still result in a nonnegative mapping result.
The following paper explains these issues: https://isocpp.org/files/papers/P3959R0.html
Here is a rough draft of a custom layout mapping that would support DLPack's layout (including zero and negative strides): https://godbolt.org/z/WEYazcsxT . I've commented out the submdspan_mapping customization for now (the #if 0 block near the end).

// Assumes the Kokkos reference mdspan implementation (https://github.com/kokkos/mdspan);
// the header name below may differ depending on how it is vendored.
#include <mdspan/mdspan.hpp>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <span>
#include <tuple>
#include <type_traits>
#include <utility>
namespace std {
// work around issue in single header reference implementation
using ::std::experimental::dims;
namespace impl {
// [mdspan.layout.stride.expo] 2 defines OFFSET(m).
// It's only ever applied to strided layout mappings.
template<class Mapping>
requires(Mapping::is_always_strided())
constexpr auto offset(const Mapping& m) {
if constexpr (Mapping::extents_type::rank() == 0) {
return m();
}
else {
using index_type = typename Mapping::index_type;
constexpr auto rank = Mapping::extents_type::rank();
bool any_zero = false;
for (::std::size_t r = 0; r < rank; ++r) {
any_zero = any_zero || (m.extents().extent(r) == 0);
}
if (any_zero) {
return index_type(0);
}
else {
constexpr auto zeros =
[]< ::std::size_t... Rs>(std::index_sequence<Rs...>) {
return std::tuple{((void) Rs, index_type(0))...};
} (std::make_index_sequence<rank>());
return std::apply(m, zeros);
}
}
}
}
class layout_stride_relaxed {
public:
template<class Extents>
class mapping;
};
template<class Extents>
class layout_stride_relaxed::mapping {
public:
using extents_type = Extents;
using index_type = extents_type::index_type;
using size_type = extents_type::size_type;
using rank_type = extents_type::rank_type;
using layout_type = layout_stride_relaxed;
private:
static constexpr rank_type rank_ = extents_type::rank();
public:
constexpr mapping() noexcept {
const layout_right::mapping<extents_type> map{};
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = map.stride(d);
}
}
constexpr mapping(const mapping&) noexcept = default;
template<class OtherIndexType>
requires(
::std::is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
::std::is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
)
constexpr mapping(
const extents_type& e,
::std::span<OtherIndexType, rank_> s,
::std::size_t offset = 0) noexcept
: extents_(e), offset_(offset)
{
for (::std::size_t d = 0; d < rank_; ++d) {
strides_[d] = s[d];
}
}
template<class OtherIndexType>
requires(
is_convertible_v<const OtherIndexType&, ::std::intptr_t> &&
is_nothrow_constructible_v<::std::intptr_t, const OtherIndexType&>
)
constexpr mapping(
const extents_type& e,
const ::std::array<OtherIndexType, rank_>& s,
::std::size_t offset = 0) noexcept
: extents_(e), offset_(offset)
{
for (::std::size_t d = 0; d < rank_; ++d) {
strides_[d] = s[d];
}
}
// m IS a layout_stride_relaxed::mapping
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
::std::is_constructible_v<
extents_type,
typename OtherMapping::extents_type
> &&
::std::is_same_v<
layout_type,
typename OtherMapping::layout_type
>
)
constexpr explicit(
! (
::std::is_convertible_v<
typename OtherMapping::extents_type, extents_type
>
)
)
mapping(const OtherMapping& m) noexcept
: extents_(m.extents()), offset_(m.offset()) // offset() is public; offset_ is inaccessible across instantiations
{
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = m.stride(d);
}
}
// m is NOT a layout_stride_relaxed::mapping
template<class StridedLayoutMapping>
requires(
detail::layout_mapping_alike<StridedLayoutMapping> &&
::std::is_constructible_v<
extents_type,
typename StridedLayoutMapping::extents_type> &&
StridedLayoutMapping::is_always_unique() &&
StridedLayoutMapping::is_always_strided()
)
constexpr explicit(
! (
::std::is_convertible_v<
typename StridedLayoutMapping::extents_type, extents_type
> && (
detail::is_mapping_of<layout_left, StridedLayoutMapping> ||
detail::is_mapping_of<layout_right, StridedLayoutMapping> ||
experimental::detail::is_layout_left_padded_mapping<
StridedLayoutMapping>::value ||
experimental::detail::is_layout_right_padded_mapping<
StridedLayoutMapping>::value ||
detail::is_mapping_of<layout_stride, StridedLayoutMapping>
)
)
)
mapping(const StridedLayoutMapping& m) noexcept
// Capture OFFSET(m) so strided mappings with a nonzero offset convert correctly.
: extents_(m.extents()), offset_(impl::offset(m))
{
for (std::size_t d = 0; d < rank_; ++d) {
strides_[d] = m.stride(d);
}
}
constexpr mapping& operator=(const mapping&) noexcept = default;
// [mdspan.layout.stride.obs], observers
constexpr const extents_type& extents() const noexcept {
return extents_;
}
constexpr ::std::array<::std::intptr_t, rank_> strides() const noexcept {
// Return the strides in their signed storage type;
// index_type may be unsigned and cannot hold negative strides.
return strides_;
}
constexpr ::std::intptr_t offset() const noexcept {
return offset_;
}
constexpr index_type required_span_size() const noexcept {
// The layout mapping requirements say an empty index space
// requires a span size of zero.
for (std::size_t r = 0; r < rank_; ++r) {
if (extents_.extent(r) == 0) {
return index_type(0);
}
}
// The dot product of indices and strides is linear.
// Thus, over all valid indices, the max value of the
// dot product is achieved at the extrema: either the
// min index (0) if the stride is negative, or the max
// index (extent(r) - 1) if the stride is nonnegative.
std::array<index_type, rank_> max_indices{};
for (std::size_t r = 0; r < rank_; ++r) {
max_indices[r] = strides_[r] < 0
? index_type(0)
: extents_.extent(r) - index_type(1);
}
index_type dot = 0;
for (std::size_t r = 0; r < rank_; ++r) {
dot += max_indices[r] * strides_[r];
}
// One past the maximum mapping result, per the requirements.
return offset() + dot + index_type(1);
}
template<class... Indices>
requires(
sizeof...(Indices) == rank_ &&
(::std::is_convertible_v<Indices, index_type> && ...) &&
(::std::is_nothrow_constructible_v<index_type, Indices> && ...)
)
constexpr index_type operator()(Indices... inds) const noexcept {
return offset() +
[&, this]<::std::size_t... Rs>(::std::index_sequence<Rs...>) {
return ((inds...[Rs] * strides_[Rs]) + ... + index_type(0));
} (::std::make_index_sequence<rank_>());
}
static constexpr bool is_always_unique() noexcept { return false; }
static constexpr bool is_always_exhaustive() noexcept { return false; }
// It's technically NOT always strided, because of the offset
// (to accommodate negative strides)
static constexpr bool is_always_strided() noexcept { return false; }
constexpr bool is_unique() const noexcept {
// The Standard doesn't require that this be exact.
// Possibility of negative strides with an offset
// makes that harder to figure out.
return false;
}
constexpr bool is_exhaustive() const noexcept {
// The Standard doesn't require that this be exact.
// Possibility of negative strides with an offset
// makes that harder to figure out.
return false;
}
constexpr bool is_strided() const noexcept {
return offset_ == 0;
}
constexpr ::std::intptr_t stride(rank_type i) const noexcept {
// Signed return type: strides may be zero or negative.
return strides_[i];
}
// y is also a layout_stride_relaxed::mapping
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
rank_ == OtherMapping::extents_type::rank() &&
::std::is_same_v<layout_type, typename OtherMapping::layout_type>
)
friend constexpr bool
operator==(const mapping& x, const OtherMapping& y) noexcept {
return x.extents() == y.extents() &&
x.offset() == y.offset() && // public accessors; y may be a different instantiation
[&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
return ((x.stride(Rs) == y.stride(Rs)) && ...);
} (::std::make_index_sequence<rank_>());
}
// y is NOT a layout_stride_relaxed::mapping but is strided.
template<class OtherMapping>
requires(
detail::layout_mapping_alike<OtherMapping> &&
rank_ == OtherMapping::extents_type::rank() &&
OtherMapping::is_always_strided()
)
friend constexpr bool
operator==(const mapping& x, const OtherMapping& y) noexcept {
return x.extents() == y.extents() &&
impl::offset(y) == x.offset_ &&
[&]<::std::size_t...Rs> (::std::index_sequence<Rs...>) {
return ((x.stride(Rs) == y.stride(Rs)) && ...);
} (::std::make_index_sequence<rank_>());
}
private:
extents_type extents_{};
std::intptr_t offset_ = 0;
::std::array<::std::intptr_t, rank_> strides_{};
#if 0
// [mdspan.sub.map], submdspan mapping specialization
template<class... SliceSpecifiers>
constexpr auto submdspan-mapping-impl(SliceSpecifiers...) const
-> /* see-below */;
template<class... SliceSpecifiers>
friend constexpr auto submdspan_mapping(
const mapping& src, SliceSpecifiers... slices) {
return src.submdspan-mapping-impl(slices...);
}
#endif // 0
};
} // namespace std
int main() {
std::dims<3> exts(3, 5, 11);
std::array<std::intptr_t, 3> strides{0, 1, 5}; // broadcasting
std::layout_stride_relaxed::mapping<std::dims<3>> map(exts, strides);
assert(map(0, 1, 1) == map(1, 1, 1));
return 0;
}
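To sanity-check the negative-stride path (an editorial addition relying only on the draft above, not part of the original post): a reversed dimension, like NumPy's a[:, ::-1], pairs a negative stride with a compensating offset so every mapping result stays nonnegative. These lines could drop into the main above:

std::dims<2> exts2(3, 5);
std::array<std::intptr_t, 2> strides2{5, -1}; // row stride 5, reversed columns
// Offset 4 points at the last element of each row.
std::layout_stride_relaxed::mapping<std::dims<2>> rev(exts2, strides2, 4);
assert(rev(0, 0) == 4);  // logical (0,0) -> physical index 4
assert(rev(0, 4) == 0);  // logical (0,4) -> physical index 0
assert(rev(2, 4) == 10); // logical (2,4) -> physical index 10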
😬 CI Workflow Results
🟥 Finished in 1h 36m: Pass: 88%/84 | Total: 1d 08h | Max: 1h 35m | Hits: 79%/155595
See results here.
Description
This PR implements conversion utilities that take a DLTensor view and produce a (host/device/managed) mdspan over the same underlying memory.
The opposite conversion is implemented in mdspan to DLPack #7027, which is also a prerequisite of this PR.
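As a rough orientation for reviewers, here is a deliberately simplified sketch of this direction of the conversion; the helper name dltensor_to_mdspan_2d, the fixed rank, and the use of std::layout_stride are illustrative assumptions, not this PR's actual API:

#include <mdspan>            // C++23, or the Kokkos reference implementation
#include <array>
#include <cstdint>
#include <dlpack/dlpack.h>   // DLTensor

// Hypothetical helper, NOT the PR's API: wrap a rank-2 DLTensor in an mdspan
// over the same memory. Assumes the element type is T, t.strides is non-null,
// and all strides are positive (unchecked); handling DLPack's zero/negative
// strides is what the custom layout mapping discussed above is for.
template <class T>
auto dltensor_to_mdspan_2d(const DLTensor& t) {
  std::dextents<std::int64_t, 2> exts(t.shape[0], t.shape[1]);
  std::array<std::int64_t, 2> strides{t.strides[0], t.strides[1]};
  // DLPack's byte_offset is in bytes; its strides are in elements.
  T* data = reinterpret_cast<T*>(static_cast<char*>(t.data) + t.byte_offset);
  return std::mdspan(data, std::layout_stride::mapping(exts, strides));
}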
Todo: