A2W Segmental Model (SLT'2021)

This repo contains the code for A2W segmental model Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings

Requirements

Kaldi
g++ 5.5.0
CUDA 10.0
Pytorch 1.4.0
kaldi-io

Usage:

Use Kaldi for data preparation: setting up data dir, extracting fbank feature, computing and applying CMVN.
Extract word list and map words into ids. This can be done via src/preproc/prep_seg_word.py.
Edit paths for feature and text data in the configuration file (in conf/). An example of data and configuration file can be found in data/ and src/conf/.
Training and evaluation

cd src
./run.sh

! Note before running ./run.sh, make sure following paths are correctly set including KALDI_ROOT in path.sh, CC,CXX,PATH,LD_LIBRARY_PATH in run.sh

ToDo:

Code release for segmental model
Code release for AWE/AGWE pre-training

Reference

Please cite the paper below if you use this code in your research:

@inproceedings{shi2021segmental,
  author = {Bowen Shi and Shane Settle and Karen Livescu},
  title = {Whole-Word Segmental Speech Recognition with Acoustic Word Embeddings},
  booktitle = {SLT},
  year = {2021}
}

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
data		data
src		src
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A2W Segmental Model (SLT'2021)

Requirements

Usage:

ToDo:

Reference

About

Releases

Packages

Languages

chevalierNoir/A2W-Segmental

Folders and files

Latest commit

History

Repository files navigation

A2W Segmental Model (SLT'2021)

Requirements

Usage:

ToDo:

Reference

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages