severilov/DL-Audio-Course

Deep Learning for Audio Course, Fall 2024

Description

Topics covered in the course:

  • Digital Signal Processing
  • Automatic Speech Recognition (ASR)
  • Key-word spotting (KWS)
  • Text-to-Speech (TTS)
  • Voice Conversion
  • Unsupervised learning in Audio
  • Music Generation with NNs

Course materials

Materials

| # | Date | Description | Slides | Video |
|---|------|-------------|--------|-------|
| 1 | September, 12 | Lecture 1: Introduction and Digital Signal Processing | slides | video |
| 2 | September, 19 | Seminar 1: Introduction, Spectrograms and Griffin-Lim | notebook | video |
| 3 | September, 30 | Lecture 2: Automatic Speech Recognition 1: WER, CTC, LAS, Beam Search | slides | video |
| 4 | October, 3 | Seminar 2: Levenshtein distance, WER, CER | notebook | video |
| 5 | October, 10 | Lecture 3: Automatic Speech Recognition 2: RNN-T, Conformer, Whisper, Language models in ASR, BPE | slides | video |
| 6 | October, 17 | Lecture 4: Key-word spotting (KWS) | slides | video |
| 7 | October, 24 | Seminar 3: CTC, Beam Search | notebook | video |
| 8 | October, 31 | Lecture 5: Text-to-speech: Tacotron, FastSpeech, Guided Attention | slides | video |
| 9 | November, 7 | Seminar 4: Key-word spotting | notebook | video |
| 10 | November, 14 | Seminar 5: Text-to-speech: Tacotron2 | notebook | video |
| 11 | November, 21 | Lecture 6: Text-to-speech: Neural Vocoders (WaveNet, PWGAN, DiffWave) | slides | video |
| 12 | November, 28 | Lecture 7: Self-supervised learning in Audio | slides | video |

Homeworks

| Homework | Date | Deadline | Description | Link |
|----------|------|----------|-------------|------|
| 1 | September, 26 | October, 10 | 1. Audio classification; 2. Audio preprocessing | Open In Github |
| 2 | September, 26 | October, 24 | ASR-1: CTC | Open In Github |
| 3 | October, 24 | November, 7 | ASR-2: RNN-T | Open In Github |
| 4 | October, 24 | November, 22 | Text-to-speech: FastPitch | Open In Github |

Game rules

  • 4 homeworks, 2 points each = 8 points
  • final test = 2 points
  • maximum points: 8 + 2 = 10 points
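
As a quick illustration of the arithmetic above, here is a minimal sketch of how a total could be computed. The function and constant names are hypothetical and not part of the course materials. The sketch also assumes that partial points are allowed and that there are no late penalties or bonuses, none of which the rules state.

```python
# Caps taken from the rules above; all names here are hypothetical.
NUM_HOMEWORKS = 4
MAX_HOMEWORK_POINTS = 2.0
MAX_TEST_POINTS = 2.0


def final_grade(homework_points, test_points):
    """Return the total course points: homework points plus final-test points.

    Assumes partial points are allowed and no late penalties apply.
    """
    if len(homework_points) != NUM_HOMEWORKS:
        raise ValueError(f"expected {NUM_HOMEWORKS} homework scores")
    for points in homework_points:
        if not 0 <= points <= MAX_HOMEWORK_POINTS:
            raise ValueError(f"homework score out of range: {points}")
    if not 0 <= test_points <= MAX_TEST_POINTS:
        raise ValueError(f"test score out of range: {test_points}")
    return sum(homework_points) + test_points


# Full marks everywhere gives the maximum of 8 + 2 = 10 points.
print(final_grade([2.0, 2.0, 2.0, 2.0], 2.0))
```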

Contributors & course staff

Author + Lectures: Pavel Severilov

Seminars: Viacheslav Shokorov

Helped build course materials and held seminars: Daniel Knyazev
