This project shows how to build and deploy a fast, standalone ASR system based on Kaldi and TensorFlow.
An ASR demo can be found here.
- gcc=7.5.0
- g++=7.5.0
- bazel=2.0.0
- Kaldi dependencies: git clone Kaldi to your local repository and install the Kaldi dependencies. `kaldi/tools/extras/check_dependencies.sh` will help.
- Check that MKL is installed successfully and that the MKL path is `/opt/intel/mkl`. If not, change the MKL path in WORKSPACE.
- Change the Kaldi path in WORKSPACE.
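The checks above can be partly automated. The following is a minimal sketch, not part of this project: `check_prerequisites` is a hypothetical helper that only verifies that the build tools are on `PATH` and that the MKL directory exists. It does not verify tool versions and does not replace `check_dependencies.sh`.

```python
import os
import shutil

# Default MKL location named in the prerequisites above.
DEFAULT_MKL_DIR = "/opt/intel/mkl"


def check_prerequisites(mkl_dir=DEFAULT_MKL_DIR, tools=("gcc", "g++", "bazel")):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if shutil.which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    if not os.path.isdir(mkl_dir):
        problems.append(
            f"MKL directory not found: {mkl_dir} (update the MKL path in WORKSPACE)"
        )
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("WARNING:", problem)
```

If MKL is installed somewhere else, pass that location as `mkl_dir` and use the same value in WORKSPACE.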
You can use a dynamic library for ASR system deployment.
- Build: `bazel build asr:libasr.so.0.0.0`. The dynamic library is written to `bazel-bin/asr/libasr.so.0.0.0`.
- Include `asr/asr.h` and link to `libasr.so.0.0.0` in your code.
- For an example, see server.cpp.
You can find a server example in server.
The ASR model is trained using Kaldi Speech Recognition Toolkit.
Below is the structure of the model directory needed by `OnlineAsr` in `asr/asr.h`.
├── final.mdl
├── final.pb
├── HCLG.fst
├── ivector_extractor
│ ├── final.dubm
│ ├── final.ie
│ ├── final.mat
│ ├── global_cmvn.stats
│ ├── mfcc.conf
│ ├── online_cmvn.conf
│ └── splice.conf
└── words.txt
All files can be found in the Kaldi directory after training and decoding, except `final.pb`, the TensorFlow-format model file, which can be converted from the Kaldi model via kaldi-onnx-tf.
The TensorFlow format is more widely used, can use multiple threads for faster inference, and is convenient for GPU inference.
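Before handing a model directory to `OnlineAsr`, it can help to confirm that every file in the layout above is present. The sketch below is an illustration only: `missing_model_files` is a hypothetical helper, not part of this project, and it checks only that the files exist, not that their contents are valid.

```python
import os

# Relative paths copied from the model directory layout above.
REQUIRED_FILES = [
    "final.mdl",
    "final.pb",
    "HCLG.fst",
    "words.txt",
    "ivector_extractor/final.dubm",
    "ivector_extractor/final.ie",
    "ivector_extractor/final.mat",
    "ivector_extractor/global_cmvn.stats",
    "ivector_extractor/mfcc.conf",
    "ivector_extractor/online_cmvn.conf",
    "ivector_extractor/splice.conf",
]


def missing_model_files(model_dir):
    """Return the required files (relative paths) that are absent from model_dir."""
    return [
        rel
        for rel in REQUIRED_FILES
        if not os.path.isfile(os.path.join(model_dir, rel))
    ]


if __name__ == "__main__":
    import sys

    missing = missing_model_files(sys.argv[1] if len(sys.argv) > 1 else ".")
    for rel in missing:
        print("missing:", rel)
    sys.exit(1 if missing else 0)
```

A missing `final.pb` usually means the conversion step has not been run yet, since that file is the only one not produced by Kaldi training and decoding.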
A pre-trained model is provided: model.
It is a TDNN-F model trained with Kaldi's multi_cn recipe on ~1200 hours of open-source Mandarin data. Character error rates (CER) on several test sets are listed below.
test_set | CER (%) |
---|---|
aishell | 5.65 |
aidatatang | 4.43 |
magicdata | 3.57 |
thchs | 12.85 |
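For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between the recognized text and the reference, divided by the number of reference characters. The sketch below implements that standard definition. It is not the scoring script used to produce the numbers above, and stripping spaces before scoring is an assumption.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion (reference char missing in hyp)
                    curr[j - 1] + 1,     # insertion (extra char in hyp)
                    prev[j - 1] + cost,  # substitution or match
                )
            )
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate in percent; spaces are ignored (an assumption)."""
    ref_chars = list(ref.replace(" ", ""))
    hyp_chars = list(hyp.replace(" ", ""))
    if not ref_chars:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)


if __name__ == "__main__":
    # One substituted character out of six reference characters.
    print(cer("今天天气很好", "今天天汽很好"))
```

Because the denominator is the reference length, a hypothesis with many insertions can yield a CER above 100.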