[PERF] Specialize pow(x,2) as x*x. llama-7B (#434)
Right now, `pow` with a constant exponent argument is implemented naively: the constant is converted to a constant tensor and an elementwise `pow` of two tensors is run. This is simple but not always efficient. llama2 (the RMSNorm part) computes `x*x` as `tensor.pow(2)`. This change converts `pow(x,2)` to `x*x`. The improvement on llama2-7B is around **0.237%**.
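The idea can be illustrated with a minimal sketch. This is not the code from this commit: plain Python lists stand in for tensors, and the names `pow_generic`, `pow_specialized`, and `rms_norm` are hypothetical, chosen only for illustration.

```python
from typing import List


def pow_generic(x: List[float], exponent: float) -> List[float]:
    """Generic path: expand the constant into a tensor, then run elementwise pow."""
    exp_tensor = [exponent] * len(x)  # constant materialized as a full tensor
    return [a ** b for a, b in zip(x, exp_tensor)]


def pow_specialized(x: List[float], exponent: float) -> List[float]:
    """Specialized path: rewrite pow(x, 2) as x * x, otherwise fall back."""
    if exponent == 2:
        # One multiply per element, no constant tensor and no pow call.
        return [a * a for a in x]
    return pow_generic(x, exponent)


def rms_norm(x: List[float], weight: List[float], eps: float = 1e-6) -> List[float]:
    """RMSNorm-style normalization; the squaring step is where pow(x, 2) appears."""
    squares = pow_specialized(x, 2)
    mean_sq = sum(squares) / len(squares)
    scale = (mean_sq + eps) ** -0.5
    return [a * scale * w for a, w in zip(x, weight)]
```

Both paths produce the same values for an exponent of 2; the specialized one only avoids building the constant tensor and calling the general `pow` routine.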