I think it is platform-dependent, with the current x86_64 code in particular performing worse than, say, arm64. But if I knew exactly why and how to fix it, we wouldn't be having this discussion.
Hi @martin-frbg
I noticed that the gemv function in OpenBLAS performs as well, or even better, with a single thread than with multiple threads. Are specific factors such as memory access patterns, workload distribution, or threading overhead responsible for this behavior?
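For the memory-access part of the question, a rough back-of-the-envelope check may help. The sketch below is not OpenBLAS code; it only counts the floating-point operations and bytes moved by a double-precision `y = alpha*A*x + beta*y`, assuming every operand travels to or from memory exactly once. The function names and the 20 GB/s bandwidth figure are hypothetical, chosen only for illustration.

```python
def gemv_arithmetic_intensity(m, n, bytes_per_element=8):
    """Flops per byte for an m x n GEMV, assuming each operand is moved once."""
    flops = 2 * m * n  # one multiply and one add per matrix element
    # Read A (m*n), read x (n), read and write y (2*m).
    bytes_moved = bytes_per_element * (m * n + n + 2 * m)
    return flops / bytes_moved


def bandwidth_bound_seconds(m, n, bandwidth_gb_s, bytes_per_element=8):
    """Lower bound on runtime if memory traffic alone limits the call."""
    bytes_moved = bytes_per_element * (m * n + n + 2 * m)
    return bytes_moved / (bandwidth_gb_s * 1e9)


if __name__ == "__main__":
    m = n = 4096
    print("flops per byte:", gemv_arithmetic_intensity(m, n))
    # 20 GB/s is an assumed figure, not a measurement of any real machine.
    print("bandwidth-bound seconds:", bandwidth_bound_seconds(m, n, 20.0))
```

Under these assumptions a large double-precision GEMV performs just under 0.25 flops per byte, so memory traffic could plausibly be the limit rather than compute. If so, extra threads would only help to the extent that they increase usable memory bandwidth. This does not explain the platform difference mentioned in the reply.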