unable to launch Triton server on finetuned whisper model #568
Comments
@yuekaizhang Could you have a look?
@StephennFernandes It seems you build engines and run engines in different envs. Would you mind building and running in the same docker container, e.g. soar97/triton-whisper:24.01.complete?
@yuekaizhang I got it working, thanks a ton for your assistance.
@StephennFernandes Since Whisper can only process audio shorter than 30s, you need to implement a VAD segmenter like in this project: https://github.com/shashikg/WhisperS2T/tree/main. Welcome to contribute :D
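For anyone landing here: the idea of the segmenter above can be sketched with a minimal energy-based splitter (this is NOT the WhisperS2T implementation — the frame size and silence threshold below are arbitrary placeholder assumptions; a real VAD such as Silero would be more robust):

```python
import numpy as np

def segment_audio(samples, sr=16000, max_len_s=30.0, frame_s=0.02, threshold=0.01):
    """Split audio into chunks no longer than max_len_s seconds, preferring
    to cut at low-energy (quiet) frames. threshold is a placeholder value;
    returns a list of (start, end) sample indices covering the whole signal."""
    max_len = int(max_len_s * sr)
    frame = int(frame_s * sr)
    segments = []
    start = 0
    while start < len(samples):
        end = min(start + max_len, len(samples))
        if end < len(samples):
            # search backwards from the hard limit for a quiet frame to cut at
            for cut in range(end - frame, start + frame, -frame):
                if np.sqrt(np.mean(samples[cut:cut + frame] ** 2)) < threshold:
                    end = cut
                    break
        segments.append((start, end))
        start = end
    return segments
```

Each chunk can then be transcribed independently and the texts concatenated; if no quiet frame is found in a window, it falls back to a hard cut at 30s.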
@yuekaizhang Thanks for the heads-up, already on it.
@yuekaizhang Hey, it doesn't look like this error has actually broken anything in the deployment; as far as I can see, my Triton deployment works fine.
Hi there,
I have been fine-tuning Whisper models using Hugging Face. To convert a model to TensorRT-LLM format, I first use an HF script that converts it from its HF format to the original OpenAI format.
I then follow your instructions and convert the OpenAI model to TensorRT-LLM format, which succeeds.
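(For context, conversion scripts of this kind mostly just rename state-dict keys between the two checkpoint layouts. The sketch below is an incomplete illustration of that pattern, not the actual HF script — the rename patterns are assumptions and should be verified against your transformers version:)

```python
import re

# Illustrative, INCOMPLETE rename patterns from Hugging Face Whisper
# state-dict keys to OpenAI-style keys; the exact mapping depends on the
# transformers version and is an assumption here.
RENAME_PATTERNS = [
    (r"^model\.", ""),
    (r"encoder\.layers\.(\d+)\.", r"encoder.blocks.\1."),
    (r"decoder\.layers\.(\d+)\.", r"decoder.blocks.\1."),
    (r"self_attn\.q_proj", "attn.query"),
    (r"self_attn\.k_proj", "attn.key"),
    (r"self_attn\.v_proj", "attn.value"),
    (r"self_attn\.out_proj", "attn.out"),
    (r"self_attn_layer_norm", "attn_ln"),
    (r"fc1", "mlp.0"),
    (r"fc2", "mlp.2"),
    (r"final_layer_norm", "mlp_ln"),
]

def rename_key(hf_key: str) -> str:
    """Apply each substitution in order to map one HF key to OpenAI style."""
    for pattern, repl in RENAME_PATTERNS:
        hf_key = re.sub(pattern, repl, hf_key)
    return hf_key
```

A full converter would apply such a mapping over the whole `state_dict` and also write the OpenAI-format `dims` metadata; cross-attention and embedding keys are omitted above.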
However, when I follow the next steps and launch the Triton inference server using the
launch_server.sh
script, I get the following error.
The stack trace of the entire log after running the bash script follows: