Hi, for my project I'm working on a rust-decimal implementation with highly optimized operations. It uses std::simd, byte shifts, magic numbers, and code that is branch-prediction friendly and branchless where reasonable, with many fast paths for common real-world cases. It also includes features such as a Decimal GEMM implementation with optional AVX2 and AVX-512 instructions, plus vector and matrix operations. I may make a GPU version later, but I haven't decided yet.
My question is: would you be interested in merging code that is 2-4 times faster in general and up to 50x faster in some cases?
If you are, I will try to maintain compatibility with your implementation. If you are not interested, I will just build it for myself.