[BUG]: reduce lacks vectorized loads for transform iterators #435

Closed
1 task done
gonzalobg opened this issue Sep 12, 2023 · 0 comments · Fixed by #1091
Assignees
Labels
bug Something isn't working right.

Comments

@gonzalobg
Collaborator

Is this a duplicate?

Type of Bug

Performance

Component

CUB

Describe the bug

reduce currently lacks vectorized loads when its input is a transform iterator. Supporting them is required to improve the performance of reduce by 10x for reductions smaller than 1 GB.

There is an internal implementation available.

How to Reproduce

.

Expected behavior

.

Reproduction link

No response

Operating System

.

nvidia-smi output

.

NVCC version

.
