Uses almost no Swahili resources. Audio FSTs are trained on Voxforge's corpus of spoken English.
- The shell commands flac, gawk, swig, and wget.
  On Ubuntu, you might need to: sudo apt install flac gawk swig wget
- The Kaldi toolkit for automatic speech recognition.
  To install it: git clone https://www.github.com/kaldi-asr/kaldi
- The SRI Language Modeling Toolkit.
  To add this to Kaldi, download the file srilm.tgz into kaldi/tools, and then (from kaldi/tools) run: ./install_srilm.sh
- The Sequitur grapheme-to-phoneme converter.
  To add this to Kaldi: cd kaldi/tools && extras/install_sequitur.sh
  (You might first need to: sudo pip install numpy, for Python 2.7.)
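Before installing anything, it can help to see which of the four shell commands are actually missing. The following is a minimal sketch, not part of this repository; the function name missing_tools is hypothetical and chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of pseudo-swahili): print every named
# command that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# The four shell commands listed as prerequisites above.
missing=$(missing_tools flac gawk swig wget)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing commands:" $missing
    echo "On Ubuntu, try: sudo apt install" $missing
else
    echo "All required shell commands are installed."
fi
```

This only checks that the commands are on PATH. It does not check Kaldi, SRILM, or Sequitur, which are installed into the Kaldi tree rather than system-wide.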
Add these pseudo-Swahili scripts to Kaldi.
cd kaldi/egs
git clone https://www.github.com/uiuc-sst/pseudo-swahili
cd pseudo-swahili/s5
ln -s ../../wsj/s5/steps steps
ln -s ../../wsj/s5/utils utils
Get the Voxforge corpus of spoken English (this takes 45 minutes, and uses 25 GB of disk space).
./getdata.sh
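Because the corpus needs about 25 GB, a free-space check before running getdata.sh can save a failed 45-minute download. The sketch below assumes the data lands on the same filesystem as the current directory, which this README does not state; the function name free_gb is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the free space, in whole GB, on the
# filesystem that holds directory $1.
free_gb() {
    # df -P forces the POSIX one-line-per-filesystem layout;
    # -k reports sizes in 1024-byte blocks (column 4 is "Available").
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

required_gb=25   # disk usage quoted above for the Voxforge corpus
available_gb=$(free_gb .)

if [ "$available_gb" -lt "$required_gb" ]; then
    echo "Only ${available_gb} GB free here; the corpus needs about ${required_gb} GB." >&2
else
    echo "${available_gb} GB free; enough for the Voxforge corpus."
fi
```

Run it from the directory where you intend to run getdata.sh, and adjust the path passed to free_gb if the script stores its data elsewhere.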
Build the low-resource language model, vocabulary, etc. for Swahili (the path below is relative to kaldi/egs).
cd pseudo-swahili/pseudo && ./a.sh
Build and test the speech recognizer (the path below is relative to kaldi/egs).
cd pseudo-swahili/s5 && ./run.sh