int8 performance #44
please see |
Test commands:
run.sh
export LD_LIBRARY_PATH=$PWD
./sherpa-ncnn \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/tokens.txt \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/encoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/encoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/decoder_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/decoder_jit_trace-pnnx.ncnn.bin \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/joiner_jit_trace-pnnx.ncnn.param \
./sherpa-ncnn-conv-emformer-transducer-2022-12-06/joiner_jit_trace-pnnx.ncnn.bin \
$@
run-8bit.sh
You can download the models from |
Here are the results on Xiaomi 11 Ultra. The following table lists the processing time for decoding a wave file that is 5.1 seconds long.
Tests screenshot | GPU (fp16) | GPU (int8) | CPU (fp16) | CPU (int8) |
Here are the benchmark results on Xiaomi 9. Time (in seconds) for decoding a 5.1-second wave file:
I just tested the performance of int8 quantization on an Android phone. The phone has 8 CPU cores, and we use 8 threads for testing.
The following table lists the processing time for decoding a wave, which is 5.1 seconds long.
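To compare such timings across devices, it can help to convert them to a real-time factor (RTF): the processing time divided by the audio duration, so that values below 1 mean faster than real time. The following is a minimal sketch only. The helper name `rtf` is hypothetical, and the 2.55-second processing time is a made-up illustration, not a measurement from this issue. Only the 5.1-second duration comes from the text above.

```shell
#!/usr/bin/env bash

# rtf: hypothetical helper, not part of sherpa-ncnn.
# $1: processing time in seconds
# $2: audio duration in seconds
# Prints processing_time / audio_duration with three decimals.
rtf() {
  awk -v t="$1" -v d="$2" 'BEGIN { printf "%.3f\n", t / d }'
}

# Illustrative value only: 2.55 s is an assumed processing time,
# 5.1 s is the length of the test wave mentioned in this issue.
rtf 2.55 5.1
```

With the assumed numbers this prints 0.500, meaning decoding would take half as long as the audio itself.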