I am not convinced about only supporting this on sm_90+ and on shmem barriers. For a future memcpy_async_tx, we'll just do a fallback and never issue tx-based instructions for those cases, so if arrive_tx is just an arrive there (essentially discarding the tx count), that should be fine, no?
I'd like our users to be able to just write the same code everywhere, have it use hardware features where available, and do a fallback everywhere else where possible (so only trap for cases where we can't tell what the correct thing to do is, like in the barrier in cluster shmem case).
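To make the proposed policy concrete, here is a minimal host-side sketch of the dispatch described above. It is a model of the decision only, not libcu++ code: `BarrierSpace`, `ArriveAction`, and `choose_arrive_tx` are hypothetical names invented for illustration, and the `sm_arch >= 90` check stands in for whatever architecture test the real implementation would use.

```cpp
// Hypothetical model of the fallback policy; none of these names exist in libcu++.
enum class BarrierSpace { Global, BlockShared, ClusterShared };
enum class ArriveAction { HardwareArriveTx, PlainArrive, Trap };

// Decide what an arrive_tx call should do for a given target and barrier location.
inline ArriveAction choose_arrive_tx(int sm_arch, BarrierSpace space) {
    // Barrier in another block's (cluster) shared memory: the correct
    // behavior is unclear, so this is the one case that traps.
    if (space == BarrierSpace::ClusterShared) {
        return ArriveAction::Trap;
    }
    // Assumption: tx-based instructions are usable only on sm_90+ and only
    // for barriers in the block's own shared memory.
    if (sm_arch >= 90 && space == BarrierSpace::BlockShared) {
        return ArriveAction::HardwareArriveTx;
    }
    // Everywhere else: plain arrive, discarding the tx count. This is safe
    // as long as the copy side also falls back and never issues tx-based
    // instructions for these cases.
    return ArriveAction::PlainArrive;
}
```

Under this model the same user code compiles and runs on every target; only the cluster shared memory case traps.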
Is this a duplicate?
Area
libcu++
Is your feature request related to a problem? Please describe.
Reported by @griwes and @miscco in this thread. Quoting @griwes (the comment reproduced at the top of this issue):
cc @ahendriksen
Describe the solution you'd like
See above.
Describe alternatives you've considered
No response
Additional context
No response