DeepMIR

Teaching material for the course (CommE5070) "Deep Learning for Music Analysis and Generation" I taught at National Taiwan University (2023 Fall, 2024 Fall).

Lecturer: Yi-Hsuan Yang (https://affige.github.io/; [email protected]; [email protected])

“Music Information Research” (MIR) is an interdisciplinary research field concerned with the analysis, retrieval, processing, and generation of musical content or information. Researchers involved in MIR may have a background in signal processing, machine learning, information retrieval, human-computer interaction, musicology, psychoacoustics, psychology, or some combination of these.

In this course, we are mainly interested in the application of machine learning, in particular deep learning, to music-related problems. Specifically, the course is divided into two parts: analysis and generation.

The first part is about the analysis of musical audio signals, covering topics such as feature extraction and representation learning for musical audio, music audio classification, melody extraction, automatic music transcription, and musical source separation.

The second part is about the generation of musical material, including symbolic-domain MIDI or tablatures, and audio-domain music signals such as singing voices and instrumental music. This involves deep generative models such as generative adversarial networks (GANs), variational autoencoders (VAEs), Transformers, and diffusion models.

Syllabus (of year 2024)

Lecture 1. Introduction to the course (slides)
Lecture 2. Fundamentals of musical audio (slides)
Lecture 3. Music classification and transcription (slides)
Lecture 4. Source separation (slides)
Lecture 5. GAN & Vocoders (slides)
Lecture 6. Fundamentals of symbolic music (slides)
Lecture 7. Symbolic MIDI generation (slides)
Lecture 8. Synthesis and timbre transfer (slides)
Lecture 9. Differentiable DSP models and automatic mixing (slides1, slides2, slides3)
Lecture 10. Singing voice generation (slides)
Lecture 11. Text-to-music generation (slides)
Lecture 12. Miscellaneous Topics (emotion/structure/alignment/rhythm) (slides)

Syllabus (of year 2023)

Lecture 1. Introduction to the course (slides1, slides2)
Lecture 2. Fundamentals & Music representation (slides)
Lecture 3. Analysis I (timbre): Automatic music classification and representation learning (slides)
Lecture 4. Generation I: Source separation (slides)
Lecture 5. Generation II: GAN & Vocoders (slides)
Lecture 6. Generation III: Synthesis of notes and loops (slides)
Lecture 7. Analysis II (pitch): Music transcription, Melody extraction, and Chord Recognition (slides1, slides2)
Lecture 8. Generation IV: Symbolic MIDI generation (slides)
Lecture 9. Generation V: Symbolic MIDI generation: Advanced topic on music structure (slides)
Lecture 10. Generation VI: Singing voice generation (slides)
Lecture 11. Generation VII: Text-to-music generation (slides)
Lecture 12. Generation VIII: Differentiable DSP models and automatic mixing (slides)
Lecture 13. Analysis III (rhythm) (slides)

License

The slides are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (https://creativecommons.org/licenses/by-nc-sa/4.0/). By downloading the slides, you agree to this license.
