A command-line client for the Triton ASR service.
To get started with this project, clone the repository and install the required packages using pip:
git clone https://github.com/yuekaizhang/Triton-ASR-Client.git
cd Triton-ASR-Client
pip install -r requirements.txt
client.py [-h] [--server-addr SERVER_ADDR] [--server-port SERVER_PORT]
[--manifest-dir MANIFEST_DIR] [--audio-path AUDIO_PATH]
[--model-name {whisper,transducer,attention_rescoring,streaming_wenet,infer_pipeline}]
[--num-tasks NUM_TASKS] [--log-interval LOG_INTERVAL]
[--compute-cer] [--streaming] [--simulate-streaming]
[--chunk_size CHUNK_SIZE] [--context CONTEXT]
[--encoder_right_context ENCODER_RIGHT_CONTEXT]
[--subsampling SUBSAMPLING] [--stats_file STATS_FILE]
-h, --help
: Show this help message and exit.

--server-addr SERVER_ADDR
: Address of the server (default: localhost)

--server-port SERVER_PORT
: gRPC port of the Triton server (default: 8001)

--manifest-dir MANIFEST_DIR
: Path to the manifest dir, which includes the wav.scp and trans.txt files (default: ./datasets/aishell1_test)

--audio-path AUDIO_PATH
: Path to a single audio file. Cannot be specified together with --manifest-dir (default: None)

--model-name {whisper,transducer,attention_rescoring,streaming_wenet,infer_pipeline}
: Triton model_repo module name to request: whisper with TensorRT-LLM, transducer for k2, attention_rescoring for wenet offline, streaming_wenet for wenet streaming, infer_pipeline for paraformer large offline (default: transducer)

--num-tasks NUM_TASKS
: Number of concurrent tasks for sending requests (default: 50)

--log-interval LOG_INTERVAL
: Controls how frequently the log is printed (default: 5)

--compute-cer
: True to compute CER, e.g., for Chinese; False to compute WER, e.g., for English (default: False)

--streaming
: True for streaming ASR (default: False)

--simulate-streaming
: True to strictly simulate streaming ASR; threads sleep to mimic the real speaking pace (default: False)

--chunk_size CHUNK_SIZE
: Chunk size for streaming ASR (default: 16)

--context CONTEXT
: Subsampling context for wenet (default: -1)

--encoder_right_context ENCODER_RIGHT_CONTEXT
: Encoder right context for k2 streaming (default: 2)

--subsampling SUBSAMPLING
: Subsampling rate (default: 4)

--stats_file STATS_FILE
: Output file for the stats analysis, in human-readable format (default: ./stats_summary.txt)
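For example, assuming a Triton server is already running on localhost:8001 and serving the default transducer model, an offline benchmark over the bundled AIShell-1 manifest could look like the following sketch (the client is invoked here with python3, and the single-file path is illustrative):

```bash
# Offline benchmark over a manifest directory (wav.scp + trans.txt), scoring with CER
python3 client.py \
    --server-addr localhost \
    --server-port 8001 \
    --model-name transducer \
    --manifest-dir ./datasets/aishell1_test \
    --num-tasks 50 \
    --compute-cer \
    --stats_file ./stats_summary.txt

# Decode a single audio file instead of a manifest (path is illustrative)
python3 client.py --server-addr localhost --server-port 8001 \
    --model-name attention_rescoring --audio-path ./test.wav
```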
Model Repo | Description | Source | HuggingFace Link |
---|---|---|---|
Whisper | Offline ASR TensorRT-LLM | OpenAI | |
Conformer Onnx | Offline ASR Onnx FP16 | Wenet | yuekai/model_repo_conformer_aishell_wenet |
Conformer TensorRT | Streaming ASR TensorRT FP16 | Wenet | |
Conformer FasterTransformer | Offline ASR FasterTransformer FP16 | Wenet | |
Conformer CUDA-TLG decoder | Offline ASR with CUDA Decoders | Wenet | speechai/model_repo_conformer_aishell_wenet_tlg |
Offline Conformer Onnx | Offline ASR Onnx FP16 | k2 | wd929/k2_conformer_offline_onnx_model_repo |
Offline Conformer TensorRT | Offline ASR TensorRT FP16 | k2 | wd929/k2_conformer_offline_trt_model_repo |
Streaming Conformer Onnx | Streaming ASR Onnx FP16 | k2 | |
Zipformer Onnx | Offline ASR Onnx FP16 with Blank Skip | k2 | |
Paraformer Onnx | Offline ASR FP32 | FunASR | |
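The streaming entries above pair with the client's streaming flags. As a minimal sketch, assuming a wenet streaming model repo is loaded on the server under the streaming_wenet name, a simulated-streaming run might look like this (the chunk size shown is simply the client default, not a required value):

```bash
# Simulated-streaming benchmark against a wenet streaming model repo
python3 client.py \
    --server-addr localhost \
    --server-port 8001 \
    --model-name streaming_wenet \
    --manifest-dir ./datasets/aishell1_test \
    --streaming \
    --simulate-streaming \
    --chunk_size 16 \
    --compute-cer
```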