# Introduction

Please refer to https://k2-fsa.github.io/icefall/recipes/Non-streaming-ASR/librispeech/index.html for how to run models in this recipe.

./RESULTS.md contains the latest results.

## Transducers

There are various folders in this directory whose names contain `transducer`. The following table lists the differences among them.

| Folder | Encoder | Decoder | Comment |
|--------|---------|---------|---------|
| `transducer` | Conformer | LSTM | |
| `transducer_stateless` | Conformer | Embedding + Conv1d | Using optimized_transducer for computing RNN-T loss |
| `transducer_stateless2` | Conformer | Embedding + Conv1d | Using torchaudio for computing RNN-T loss |
| `transducer_lstm` | LSTM | LSTM | |
| `transducer_stateless_multi_datasets` | Conformer | Embedding + Conv1d | Using data from GigaSpeech as extra training data |
| `pruned_transducer_stateless` | Conformer | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless2` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss |
| `pruned_transducer_stateless3` | Conformer (modified) | Embedding + Conv1d | Using k2 pruned RNN-T loss + using GigaSpeech as extra training data |
| `pruned_transducer_stateless4` | Conformer (modified) | Embedding + Conv1d | Same as pruned_transducer_stateless2 + save averaged models periodically during training + delay penalty |
| `pruned_transducer_stateless5` | Conformer (modified) | Embedding + Conv1d | Same as pruned_transducer_stateless4 + more layers + random combiner |
| `pruned_transducer_stateless6` | Conformer (modified) | Embedding + Conv1d | Same as pruned_transducer_stateless4 + distillation with HuBERT |
| `pruned_transducer_stateless7` | Zipformer | Embedding + Conv1d | First experiment with Zipformer from Dan |
| `pruned_transducer_stateless7_ctc` | Zipformer | Embedding + Conv1d | Same as pruned_transducer_stateless7, but with an extra CTC head |
| `pruned_transducer_stateless7_ctc_bs` | Zipformer | Embedding + Conv1d | pruned_transducer_stateless7_ctc + blank skip |
| `pruned_transducer_stateless7_streaming` | Streaming Zipformer | Embedding + Conv1d | Streaming version of pruned_transducer_stateless7 |
| `pruned_transducer_stateless7_streaming_multi` | Streaming Zipformer | Embedding + Conv1d | Same as pruned_transducer_stateless7_streaming, trained on LibriSpeech + GigaSpeech |
| `pruned_transducer_stateless8` | Zipformer | Embedding + Conv1d | Same as pruned_transducer_stateless7, but using extra data from GigaSpeech |
| `pruned_stateless_emformer_rnnt2` | Emformer (from torchaudio) | Embedding + Conv1d | Using Emformer from torchaudio for streaming ASR |
| `conv_emformer_transducer_stateless` | ConvEmformer | Embedding + Conv1d | Using ConvEmformer for streaming ASR + mechanisms in reworked model |
| `conv_emformer_transducer_stateless2` | ConvEmformer | Embedding + Conv1d | Using ConvEmformer with simplified memory for streaming ASR + mechanisms in reworked model |
| `lstm_transducer_stateless` | LSTM | Embedding + Conv1d | Using LSTM with mechanisms in reworked model |
| `lstm_transducer_stateless2` | LSTM | Embedding + Conv1d | Using LSTM with mechanisms in reworked model + GigaSpeech (multi-dataset setup) |
| `lstm_transducer_stateless3` | LSTM | Embedding + Conv1d | Using LSTM with mechanisms in reworked model + gradient filter + delay penalty |
| `zipformer` | Upgraded Zipformer | Embedding + Conv1d | The latest recipe |
| `zipformer_adapter` | Upgraded Zipformer | Embedding + Conv1d | Supports domain adaptation of Zipformer using parameter-efficient adapters |
| `zipformer_lora` | Upgraded Zipformer | Embedding + Conv1d | Fine-tunes Zipformer with LoRA |

The decoder in `transducer_stateless` is modified from the paper *RNN-Transducer with Stateless Prediction Network*. We place an additional Conv1d layer right after the input embedding layer.
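To make the "Embedding + Conv1d" decoder concrete, here is a minimal PyTorch sketch of such a stateless prediction network. The class name, dimensions, context size, and convolution settings are illustrative placeholders and do not mirror the actual icefall implementation.

```python
import torch
import torch.nn as nn


class StatelessDecoder(nn.Module):
    """Stateless prediction network: an embedding layer followed by a 1-D
    convolution over a small window of previous tokens (illustrative sketch,
    not the icefall code)."""

    def __init__(self, vocab_size: int, embed_dim: int = 512, context_size: int = 2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # The Conv1d looks only at a fixed window of previous tokens,
        # so the decoder carries no recurrent state.
        self.conv = nn.Conv1d(
            embed_dim,
            embed_dim,
            kernel_size=context_size,
            padding=context_size - 1,
            groups=embed_dim,  # depthwise here; a plain Conv1d would also work
        )

    def forward(self, y: torch.Tensor) -> torch.Tensor:
        # y: (batch, num_tokens) of previously emitted token IDs
        emb = self.embedding(y).permute(0, 2, 1)  # (batch, embed_dim, num_tokens)
        out = torch.relu(self.conv(emb))
        # Keep only the first num_tokens frames so each output position depends
        # on the current and earlier tokens (causal behaviour).
        return out[:, :, : y.size(1)].permute(0, 2, 1)


# Usage: embed the token history and produce decoder features for the joiner.
decoder = StatelessDecoder(vocab_size=500)
tokens = torch.randint(0, 500, (8, 10))
print(decoder(tokens).shape)  # torch.Size([8, 10, 512])
```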

## CTC

| Folder | Encoder | Comment |
|--------|---------|---------|
| `conformer-ctc` | Conformer | Use auxiliary attention head |
| `conformer-ctc2` | Reworked Conformer | Use auxiliary attention head |
| `conformer-ctc3` | Reworked Conformer | Streaming version + delay penalty |
| `zipformer-ctc` | Zipformer | Use auxiliary attention head |
| `zipformer` | Upgraded Zipformer | Use auxiliary transducer head / attention-decoder head (the latest recipe) |
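The "auxiliary attention head" mentioned in the table generally means the encoder is trained with a CTC loss plus an attention decoder whose cross-entropy loss is interpolated with it; the attention head can also be reused for rescoring at decode time. The sketch below shows a generic hybrid CTC/attention setup with hypothetical names, sizes, and loss weight; it is not the icefall code.

```python
import torch.nn as nn
import torch.nn.functional as F


class CTCWithAuxAttention(nn.Module):
    """Generic sketch of CTC training with an auxiliary attention decoder.

    `encoder` is any module mapping features to (batch, frames, encoder_dim);
    all names, sizes, and the loss weight are illustrative.
    """

    def __init__(self, encoder: nn.Module, encoder_dim: int, vocab_size: int):
        super().__init__()
        self.encoder = encoder
        self.ctc_head = nn.Linear(encoder_dim, vocab_size)
        layer = nn.TransformerDecoderLayer(
            d_model=encoder_dim, nhead=4, batch_first=True
        )
        self.attn_decoder = nn.TransformerDecoder(layer, num_layers=2)
        self.token_embed = nn.Embedding(vocab_size, encoder_dim)
        self.attn_head = nn.Linear(encoder_dim, vocab_size)

    def forward(self, feats, enc_lens, tokens, token_lens, ctc_weight=0.7):
        enc = self.encoder(feats)  # (batch, frames, encoder_dim)
        # CTC branch on frame-level log-probabilities.
        log_probs = self.ctc_head(enc).log_softmax(dim=-1)
        ctc_loss = F.ctc_loss(
            log_probs.permute(1, 0, 2), tokens, enc_lens, token_lens, blank=0
        )
        # Auxiliary attention branch (teacher forcing; SOS/EOS shifting and
        # the causal mask are omitted for brevity).
        dec = self.attn_decoder(self.token_embed(tokens), enc)
        attn_loss = F.cross_entropy(self.attn_head(dec).transpose(1, 2), tokens)
        # Interpolate the two losses; the attention head can also be used
        # for rescoring hypotheses at decode time.
        return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attn_loss
```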

## MMI

| Folder | Encoder | Comment |
|--------|---------|---------|
| `conformer-mmi` | Conformer | |
| `zipformer-mmi` | Zipformer | CTC warmup + use HP as the decoding graph |

## CR-CTC

| Folder | Encoder | Comment |
|--------|---------|---------|
| `zipformer` | Upgraded Zipformer | CR-CTC can also serve as an auxiliary loss to improve transducer or CTC/AED models (the latest recipe) |
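As a rough illustration of the CR-CTC idea (consistency-regularized CTC), the sketch below combines the CTC losses of two differently augmented views of each utterance with a symmetric KL term between their frame-level posteriors. The function name, the `cr_weight` value, and the assumption that `model` returns frame-level log-probabilities are placeholders, not the icefall implementation.

```python
import torch.nn.functional as F


def cr_ctc_loss(model, feats_view1, feats_view2, enc_lens, tokens, token_lens,
                cr_weight: float = 0.2):
    """Sketch of a consistency-regularized CTC loss.

    `model(feats)` is assumed to return frame-level log-probabilities of shape
    (batch, frames, vocab); the two views are the same utterances processed
    with different augmentations (e.g. two independent SpecAugment draws).
    All names and the value of `cr_weight` are illustrative.
    """
    lp1 = model(feats_view1)  # (batch, frames, vocab) log-probs, view 1
    lp2 = model(feats_view2)  # (batch, frames, vocab) log-probs, view 2

    # Standard CTC loss on each view (blank id assumed to be 0).
    ctc = (
        F.ctc_loss(lp1.permute(1, 0, 2), tokens, enc_lens, token_lens, blank=0)
        + F.ctc_loss(lp2.permute(1, 0, 2), tokens, enc_lens, token_lens, blank=0)
    ) / 2.0

    # Consistency term: symmetric KL between the two frame-level posteriors,
    # each branch acting as the (detached) teacher of the other.
    kl12 = F.kl_div(lp1, lp2.detach(), log_target=True, reduction="batchmean")
    kl21 = F.kl_div(lp2, lp1.detach(), log_target=True, reduction="batchmean")
    consistency = (kl12 + kl21) / 2.0

    return ctc + cr_weight * consistency
```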