Suboptimal boolean reduction vectorization #128665
Comments
@llvm/issue-subscribers-backend-arm Author: Domagoj Šarić (psiha)
Both Clang and GCC struggle with this on both ARM and x86, though GCC does a better job overall (its output is still suboptimal, i.e. worse than the handwritten version). Making the accumulation variable (occurrences) 32-bit at least helps them use horizontal adds/addv, and again this helps GCC more than Clang.
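For readers who cannot open the Godbolt link, here is a minimal sketch of the kind of loop this is about. It is not the code from the issue: the function names, the `std::uint8_t` element type, and the 8-bit width of the narrow accumulator are all assumptions made for illustration. Only the variable name `occurrences` and the 32-bit variant come from the description above.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical narrow-accumulator variant: each comparison yields a bool
// that is added to an 8-bit counter (which wraps modulo 256).
std::uint8_t count_narrow(const std::uint8_t* data, std::size_t n, std::uint8_t value) {
    std::uint8_t occurrences = 0;
    for (std::size_t i = 0; i < n; ++i)
        occurrences += (data[i] == value);
    return occurrences;
}

// Hypothetical 32-bit variant: same loop, wider accumulator. The report
// says this shape lets the compilers use horizontal adds (addv on AArch64).
std::uint32_t count_wide(const std::uint8_t* data, std::size_t n, std::uint8_t value) {
    std::uint32_t occurrences = 0;
    for (std::size_t i = 0; i < n; ++i)
        occurrences += (data[i] == value);
    return occurrences;
}
```

The two functions agree as long as the true count fits in the narrow type, so comparing their generated code at `-O3` on both targets is one way to see the difference described here.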
https://godbolt.org/z/E13T3MKv7
minbench.zip