Topics covered in the course:
- Digital Signal Processing
- Automatic Speech Recognition (ASR)
- Key-word spotting (KWS)
- Text-to-Speech (TTS)
- Voice Conversion
- Unsupervised learning in Audio
- Music Generation with NNs
| # | Date | Description | Slides | Video |
|---|---|---|---|---|
| 1 | September 12 | Lecture 1: Introduction and Digital Signal Processing | slides | video |
| 2 | September 19 | Seminar 1: Introduction, Spectrograms and Griffin-Lim | notebook | video |
| 3 | September 30 | Lecture 2: Automatic Speech Recognition 1: WER, CTC, LAS, Beam Search | slides | video |
| 4 | October 3 | Seminar 2: Levenshtein distance, WER, CER | notebook | video |
| 5 | October 10 | Lecture 3: Automatic Speech Recognition 2: RNN-T, Conformer, Whisper, Language models in ASR, BPE | slides | video |
| 6 | October 17 | Lecture 4: Key-word spotting (KWS) | slides | video |
| 7 | October 24 | Seminar 3: CTC, Beam Search | notebook | video |
| 8 | October 31 | Lecture 5: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides | video |
| 9 | November 7 | Seminar 4: Key-word spotting | notebook | video |
| 10 | November 14 | Seminar 5: Text-to-speech: Tacotron2 | notebook | video |
| 11 | November 21 | Lecture 6: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides | video |
| 12 | November 28 | Lecture 7: Self-supervised learning in Audio | slides | video |
- 4 homework assignments, 2 points each = 8 points
- final test = 2 points
- maximum points: 8 + 2 = 10 points
Author and lecturer: Pavel Severilov
- telegram: @severilov
- e-mail: [email protected]
Seminars: Viacheslav Shokorov
- telegram: @vshokorov
- e-mail: [email protected]
Helped build course materials and held seminars: Daniel Knyazev
- telegram: @Oorgien
- e-mail: [email protected]