[BUG]: Latest CCCL breaks CuPy #2109

leofang · 2024-07-30T16:18:20Z

Is this a duplicate?

I confirmed there appear to be no duplicate issues for this bug and that I agree to the Code of Conduct

Type of Bug

Compile-time Error

Component

Thrust

Describe the bug

CCCL diff (between yesterday and today): cupy/cccl@2246a87...d4f928e

Yesterday it built fine. Today we see this (from https://ci.preferred.jp/cupy.linux.cuda-build/166014/):

[00:03:14.784005 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5136)	  cupy/cuda/cupy_thrust.cu(156): error: inherited member is not allowed	
[00:03:14.784282 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5137)	    bool thrust::less<complex<float>>::operator() (	
[00:03:14.784834 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5138)	                                       ^	
[00:03:14.785276 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5139)		
[00:03:14.785932 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5140)	  cupy/cuda/cupy_thrust.cu(156): error: declaration is incompatible with "bool cuda::std::__4::less<_Tp>::operator()(const _Tp &, const _Tp &) const [with _Tp=thrust::THRUST_200600_700_NS::complex<float>]" (declared at line 350 of /tmp/tmp.8dn9ZcMlIG/cupy/_core/include/cupy/_cccl/libcudacxx/cuda/std/__functional/operations.h)	
[00:03:14.786311 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5143)	    bool thrust::less<complex<float>>::operator() (	
[00:03:14.786743 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5144)	                                       ^	
[00:03:14.787154 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5145)		
[00:03:14.787625 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5146)	  cupy/cuda/cupy_thrust.cu(154): error: "constexpr" is not valid here	
[00:03:14.788046 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5147)	    constexpr	
[00:03:14.788484 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5148)	    ^	
[00:03:14.788905 STDOUT 1352]](https://ci.preferred.jp/cupy.linux.cuda-build/166014/#L5149)

How to Reproduce

Build cupy/cupy#8412 from source

Expected behavior

No compile-time errors.

Reproduction link

No response

Operating System

No response

nvidia-smi output

No response

NVCC version

No response

The text was updated successfully, but these errors were encountered:

leofang · 2024-07-30T16:31:03Z

Build cupy/cupy#8412 from source

To unblock myself I've force-pushed back to the snapshot from yesterday, but starting from this commit cupy/cupy@08e6a3c it should be reproducible. By tracing the code change following the compiler error, I believe the culprit is #1872, in particular #1872 (review) (cc: @bernhardmgruber for vis).

bernhardmgruber · 2024-07-30T17:36:03Z

@leofang thank you for reporting this and I am sorry for the breakage!

The specializations of Thrust's API that cupy performs looks quite extraordinary to me (I actually had to educate myself on the syntax) and I would like to suggest a better way.

In your current implementation, you specialize a member of thrust::less directly:

bool thrust::less<float>::operator() (
    const float& lhs, const float& rhs) const {
    return _real_less<float>(lhs, rhs);
}

which relies on thrust::less having an actual operator(). We changed this in #1872, since we now pull in the operator from ::cuda::std::less. If you specialize the entire functor instead, your code should work before and after the changes of #1872:

template <>
struct thrust::less<float> {
    bool operator() (float lhs, float rhs) const {
        return _real_less<float>(lhs, rhs);
    }
};

This is arguably safer and will also be forward compatible when we eventually alias thust::less to cuda::std::less entirely (we cannot do this yet, because we need to give some notice with the deprecation warnings).

As a general remark: I think redefining the meaning of thrust::less<float> is quite dangerous, since some algorithms depend on the exact meaning of this function object to provide more efficient implementations. E.g., thrust::sort detects whether the comparator is thrust::less<T> to branch into radix sort, and then takes the mathematical properties of "less than" as given, independently of how thrust::less<T> is defined.

leofang · 2024-07-30T18:11:03Z

Hi Bernhard, thanks for your reply.

which relies on thrust::less having an actual operator(). We changed this in #1872
...
(we cannot do this yet, because we need to give some notice with the deprecation warnings).

To me this change of behavior requires a deprecation cycle, though I do have a hard time to see where compile-time warnings can be injected for this particular case.

As a general remark: I think redefining the meaning of thrust::less<float> is quite dangerous,
...
E.g., thrust::sort detects whether the comparator is thrust::less<T> to branch into radix sort, and then takes the mathematical properties of "less than" as given, independently of how thrust::less<T> is defined.

Is this expectation documented anywhere? Would it be a concern if the comparator fulfills this "less than" property?

To give some context, CuPy had to define a custom comparator in order to follow NumPy behavior, where NaNs are sorted to the end of an array (see code comments here).

An extra note: we cannot keep breaking critical users like CuPy just casually (#1494). I've been following @jrhemstad's advertisement and encouraging the CuPy team to stay at CCCL head, but my push would be questionable if they keep hitting maintenance issues.

jrhemstad · 2024-07-30T21:19:10Z

To me this change of behavior requires a deprecation cycle

It does not. The CuPy code is incorrect in how it provides a custom comparison operator. The proper way to provide a custom comparison operator is to pass one as an argument, not reach into internal bits of the library and modify them at will.

This is covered under our compatibility guidelines: https://github.com/NVIDIA/cccl?tab=readme-ov-file#compatibility-guidelines

Do not add any declarations to the thrust::, cub::, nv::, or cuda:: namespaces unless an exception is noted for a specific symbol, e.g., specializing a type trait.

I'll grant that this could be improved to say "Do not add any declarations or template specializations...", but the point remains the same.

All that said, we don't like breaking people (even when it wasn't our fault), so we will be helping to contribute a fix to CuPy.

I look forward to when you have the time to help us set up #1494!

bernhardmgruber · 2024-07-31T14:36:13Z

PR for CuPi: cupy/cupy#8446

leofang · 2024-07-31T17:30:00Z

I had a somewhat lengthy discussion with Jake offline. Below is a summary of what I asked regarding specializing thrust::less vs thrust::less::operator noted above (and the offline thread), for posterity:

At one point we were even encouraged to specialize a CUB internal template (in the context of 64-bit support), so you must forgive CuPy developers to have the impression that everything template is customizable at will.

Perhaps let me ask this way: Why is specializing thrust::less ok (as suggested above and done in cupy/cupy#8446) but specializing thrust::less::operator is not? As per the (being updated) rule number 1, CCCL discourages users from specializing all things under thrust:: anyway. But as a user, all we see in the public API of thrust::less is that there’s one and only one member function we need to define, so it’s unclear where the line is between the two approaches from user perspective.

jrhemstad · 2024-07-31T20:59:32Z

everything template is customizable at will

The standard practice in C++ is that you should not specialize a template that does not belong to you unless you've been explicitly granted permission to do so. Everyone is at a different place with their experience and expertise in C++, so it's not unreasonable that someone may not be familiar with this yet.

Being given permission to specialize one template in one library does not grant permission to specialize another template in a different library 😅 .

Why is specializing thrust::less ok (as suggested above and done in cupy/cupy#8446) but specializing thrust::less::operator is not?

I explained in cupy/cupy#8446 (comment) that I don't think either specialization is correct. I think @bernhardmgruber was trying to be courteous and do the minimal possible set of changes to unblock cupy, but ultimately, the specialization is still ill-formed.

The good news is that it's very easy to fix by just using a custom comparator type instead of using thrust::less directly.

leofang added the bug Something isn't working right. label Jul 30, 2024

bernhardmgruber self-assigned this Jul 31, 2024

bernhardmgruber mentioned this issue Jul 31, 2024

Clarify compatibility wrt. template specializations #2138

Merged

bernhardmgruber added the thrust For all items related to Thrust. label Jul 31, 2024

leofang mentioned this issue Jul 31, 2024

Disable jitify for cub & Bump CCCL cupy/cupy#8412

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[BUG]: Latest CCCL breaks CuPy #2109

[BUG]: Latest CCCL breaks CuPy #2109

leofang commented Jul 30, 2024

leofang commented Jul 30, 2024 •

edited

Loading

bernhardmgruber commented Jul 30, 2024

leofang commented Jul 30, 2024

jrhemstad commented Jul 30, 2024

bernhardmgruber commented Jul 31, 2024

leofang commented Jul 31, 2024

jrhemstad commented Jul 31, 2024

[BUG]: Latest CCCL breaks CuPy #2109

[BUG]: Latest CCCL breaks CuPy #2109

Comments

leofang commented Jul 30, 2024

Is this a duplicate?

Type of Bug

Component

Describe the bug

How to Reproduce

Expected behavior

Reproduction link

Operating System

nvidia-smi output

NVCC version

leofang commented Jul 30, 2024 • edited Loading

bernhardmgruber commented Jul 30, 2024

leofang commented Jul 30, 2024

jrhemstad commented Jul 30, 2024

bernhardmgruber commented Jul 31, 2024

leofang commented Jul 31, 2024

jrhemstad commented Jul 31, 2024

leofang commented Jul 30, 2024 •

edited

Loading