Guys (mostly Piotr but also anyone who's listening),
First, on the timeline: we need something working well by September 1st or so, when we give the tutorial.
So we can't be too ambitious: think a cleaned-up and reorganized version of Snowfall, working by early-to-mid August. Sorry I have delayed this for so long. Liyong is working on replicating ESPnet results with k2 mechanisms; he is making good progress, and we may want to incorporate parts of that.
I want to avoid big centralized APIs at the moment.
I also want to avoid the phenomenon in SpeechBrain and ESPnet where there is a kind of "configuration layer":
you pass in configs, and some other code parses them into actual Python code. I would rather keep it all
plain Python. Suppose we have a directory (this doesn't have to be the real name):
egs/librispeech/ASR/
then I am thinking we can have subdirectories of that where the scripts for different versions of experiments live.
We might have some data-prep scripts:
egs/librispeech/ASR/{prepare.sh,local/blahblah,...}
and these would write to some subdirectory, e.g. egs/librispeech/ASR/data/...
Then for different experiments we'd have the scripts in subdirectories, like:
egs/librispeech/ASR/tdnn_lstm_ctc/{model.py,train.py,decode.py,README.md}
and we might have
egs/librispeech/ASR/conformer_mmi/{model.py,train.py,decode.py,README.md}
that would refer to the alignment model in e.g. ../tdnn_lstm_ctc/8.pt, and to the data in ../data/blah...
The basic idea here is that if you want to change the experiment locally, you would copy-and-modify the scripts in conformer_mmi/ to e.g. conformer_mmi_1a/, and add them to your git repo if desired. We would avoid overloading the scripts in these experiment directories with command-line options; any back-compatibility would live at the level of the icefall Python libraries themselves. We could perhaps introduce versions of the data directories as well, e.g. data/, data2/ and so on (I'm not sure whether it would make sense to have multiple versions of the data-prep scripts or to use options).
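To make the "no configuration layer" idea concrete, here is a minimal sketch of what the top of an experiment's train.py could look like: settings pinned as plain Python rather than parsed from YAML or command-line flags. The class name, hyperparameters, and exact fields are hypothetical; only the relative paths (../data, ../tdnn_lstm_ctc/8.pt) come from the layout above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical settings block at the top of conformer_mmi/train.py."""
    data_dir: Path = Path("../data")                 # shared data-prep output
    ali_model: Path = Path("../tdnn_lstm_ctc/8.pt")  # alignment model from the other experiment
    num_epochs: int = 50                             # illustrative value only
    lr: float = 1e-3                                 # illustrative value only


cfg = TrainConfig()
# To change the experiment, copy conformer_mmi/ to conformer_mmi_1a/
# and edit these values directly -- no flag parsing, no config files.
print(cfg.ali_model)
```

The point of the frozen dataclass is just that each experiment directory carries its own settings in code; diffing two experiments is then an ordinary `git diff` or `diff -r` between the directories.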
To avoid overloading the model code and utility code with excessive back-compatibility, I suggest we version the model code and maybe even parts of the other libraries: e.g. snowfall/models1/. We can then add options etc., but when that becomes oppressive we can just copy-and-modify to models2/ and strip out most of the options. This tends to reduce "cyclomatic complexity" by keeping any given version of the code simple. At this point, let's think of this to some extent as a demo tool for k2 and lhotse; we don't have to think of it as some vast toolkit with zillions of features.
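A toy sketch of the models1/ → models2/ idea, stubbed inline so it is self-contained (in the real repo these would be separate packages, and the function names and options here are entirely made up): the v1 builder has accumulated back-compat options, and the v2 builder is a copy with the dead options stripped out.

```python
# snowfall/models1 (hypothetical): options accumulated for back-compatibility.
def build_tdnn_v1(num_layers: int = 6,
                  legacy_norm: bool = False,
                  use_projection: bool = True) -> str:
    norm = "batchnorm" if legacy_norm else "layernorm"
    proj = "+proj" if use_projection else ""
    return f"tdnn({num_layers} layers, {norm}{proj})"


# snowfall/models2 (hypothetical): copy-and-modify of models1 with the
# no-longer-used options removed, keeping this version of the code simple.
def build_tdnn_v2(num_layers: int = 6) -> str:
    return f"tdnn({num_layers} layers, layernorm+proj)"
```

Old experiment directories keep importing from models1 unchanged, while new experiments use models2; neither version needs branches for the other's behavior, which is where the cyclomatic-complexity saving comes from.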