Excessive input values in streaming Zipformer encoder after conversion to ONNX #679
@yuekaizhang could you have a look at this issue?
I think it would be easy to use a script that updates config.pbtxt when exporting the model to ONNX, so that the inputs for the model states are included; see the sketch below.
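A minimal sketch of that idea (not an existing sherpa script), assuming the exported encoder is saved as encoder.onnx and using the onnx Python package: it enumerates every graph input (x, x_lens, and the cached-state tensors) and prints Triton config.pbtxt "input" blocks that could be pasted into the encoder's config.pbtxt. The type mapping only covers the types this model is assumed to use.

```python
import onnx

# ONNX elem_type -> Triton data_type (only the types assumed to appear here).
TYPE_MAP = {
    onnx.TensorProto.FLOAT: "TYPE_FP32",
    onnx.TensorProto.INT64: "TYPE_INT64",
    onnx.TensorProto.INT32: "TYPE_INT32",
}


def config_inputs(onnx_path: str) -> str:
    model = onnx.load(onnx_path)
    blocks = []
    for inp in model.graph.input:
        ttype = inp.type.tensor_type
        # Skip the batch dimension (dim 0); Triton adds it via max_batch_size.
        dims = []
        for d in ttype.shape.dim[1:]:
            dims.append(str(d.dim_value) if d.dim_value > 0 else "-1")
        blocks.append(
            "input [\n"
            "  {\n"
            f'    name: "{inp.name}"\n'
            f"    data_type: {TYPE_MAP.get(ttype.elem_type, 'TYPE_FP32')}\n"
            f"    dims: [ {', '.join(dims) or '1'} ]\n"
            "  }\n"
            "]"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    print(config_inputs("encoder.onnx"))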
@renadnasser1
Hi @yuekaizhang and @csukuangfj, the model I've built is a streaming Zipformer model, and the goal is to deploy it with Triton. The following are the methods I've used to export it to the .onnx format:
The outcomes of these trials are:
You may try build_librispeech_pruned_transducer_stateless7_streaming.sh first, since it is a similar model compared with the streaming Zipformer. As shown in #681, it could work.
Sorry, I have no free slot to support this recently; I would be very grateful if someone could contribute build_librispeech_zipformer_streaming.sh.
Hello @csukuangfj,
First, thank you for all your hard work on icefall and sherpa—they've been incredible resources!
We encountered an issue after converting a trained checkpoint for a streaming Zipformer-based ASR model to ONNX format using the conversion script export-onnx-streaming.py. The conversion successfully generated 3 ONNX files (encoder, decoder, and joiner); however, the generated encoder has 99 inputs, including x and x_lens.
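For reference, the encoder's full input list can be enumerated with onnxruntime; this is a minimal sketch, assuming the exported file is named encoder.onnx:

```python
import onnxruntime as ort

# Load the exported encoder and print every input it expects
# (x, x_lens, plus all cached-state tensors).
sess = ort.InferenceSession("encoder.onnx", providers=["CPUExecutionProvider"])
inputs = sess.get_inputs()
print(f"number of inputs: {len(inputs)}")
for i in inputs:
    print(i.name, i.shape, i.type)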
During deployment to Triton, we faced the following challenge:
We needed to write the config.pbtxt file. To streamline this, we referred to the scripts available in sherpa/triton/scripts for building configs. Unfortunately, there doesn't appear to be a script specifically for a streaming Zipformer-based model.
To proceed, we used the sherpa/triton/model_repo_streaming_zipformer directory as a reference for all components (feature_extractor, encoder, decoder, joiner, scorer). However, when running Triton, the encoder's model configuration declares only 2 inputs, while the ONNX model requires 99 inputs (a quick check is sketched below).
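To make the mismatch concrete, here is a rough sketch that lists the inputs the ONNX encoder requires but that are not declared anywhere in config.pbtxt; the file paths are placeholders for our model repository layout, not actual sherpa paths:

```python
import re
import onnxruntime as ort

# Placeholder paths; adjust to the actual model repository layout.
ONNX_PATH = "model_repo_streaming_zipformer/encoder/1/encoder.onnx"
CONFIG_PATH = "model_repo_streaming_zipformer/encoder/config.pbtxt"

# Names the exported encoder actually requires.
onnx_inputs = {
    i.name
    for i in ort.InferenceSession(
        ONNX_PATH, providers=["CPUExecutionProvider"]
    ).get_inputs()
}

# Very rough: collect every quoted name field in config.pbtxt. This also
# picks up output and model names, which is acceptable here because we
# only report names that the config is missing entirely.
with open(CONFIG_PATH) as f:
    config_names = set(re.findall(r'name:\s*"([^"]+)"', f.read()))

missing = sorted(onnx_inputs - config_names)
print(f"encoder requires {len(onnx_inputs)} inputs; "
      f"{len(missing)} are not declared in config.pbtxt:")
for name in missing:
    print(" ", name)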
Could you clarify the following:
Your insights would be immensely helpful, and I'd be happy to provide additional details if needed.
Thanks in advance for your support!