What about int8 weights? #608
Comments
@yuekaizhang Could you have a look? Thanks!
@AntonThai2022 Would you mind pasting the error logs here?
I apologize for the topic I created: I simply mixed up the folder with the intermediate weights and the folder with the engine weights. I took the correct folder, renamed it, and everything worked. Strangely, the regular version takes 8 GB while the 8-bit one takes 7 GB, yet inference was almost twice as fast.
Hello!
I built the int8 weights:
INFERENCE_PRECISION=float16
WEIGHT_ONLY_PRECISION=int8
MAX_BEAM_WIDTH=4
MAX_BATCH_SIZE=8
checkpoint_dir=whisper_large_v3_weights_${WEIGHT_ONLY_PRECISION}
output_dir=whisper_large_v3_${WEIGHT_ONLY_PRECISION}
Convert the large-v3 model weights into TensorRT-LLM format.
python3 convert_checkpoint.py \
    --use_weight_only \
    --weight_only_precision $WEIGHT_ONLY_PRECISION \
    --output_dir $checkpoint_dir
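The resolution in the comments above hinges on the difference between the converted checkpoint (intermediate weights) and the built engine: convert_checkpoint.py only produces a TensorRT-LLM checkpoint, which still has to be compiled into an engine before Triton can serve it. A minimal sketch of that build step, assuming the trtllm-build CLI from TensorRT-LLM (exact flags vary between versions, and the Whisper example builds separate encoder and decoder engines):

```shell
# Compile the converted int8 checkpoint into a servable TensorRT engine.
# Sketch only: flag names depend on your TensorRT-LLM version.
trtllm-build \
    --checkpoint_dir $checkpoint_dir \
    --output_dir $output_dir \
    --max_beam_width $MAX_BEAM_WIDTH \
    --max_batch_size $MAX_BATCH_SIZE
```

It is the contents of $output_dir (the engine directory), not $checkpoint_dir, that belong under model_repo_whisper_trtllm/whisper/1.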
So I got whisper_large_v3_weights_int8 and put it into sherpa/triton/whisper/model_repo_whisper_trtllm/whisper/1, but it does not work.
I also tried renaming it to whisper_large_v3, but that did not help.
Is it possible to run int8 Whisper with your repo and Docker image?
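One quick way to tell the two directories apart is to list their contents. This is a hypothetical sanity check, assuming typical TensorRT-LLM output file names (rank0.safetensors for a converted checkpoint, rank0.engine for a built engine):

```shell
# A converted checkpoint holds safetensors weights plus a config:
ls whisper_large_v3_weights_int8    # e.g. config.json  rank0.safetensors
# A built engine directory holds compiled .engine files instead:
ls whisper_large_v3_int8            # e.g. config.json  rank0.engine
```

Only the second kind of directory is deployable under the Triton model repository.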