In this repository, audio files from the Free Spoken Digit Dataset (FSDD) are run through a preprocessing pipeline (see Audio Signal Processing for ML) and used to train a Variational Autoencoder (VAE). The trained model generates new audio, which is written to the /Audio directory.
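As a rough illustration of what such a preprocessing step can involve, here is a minimal NumPy-only sketch: fix the clip length, compute a log-magnitude spectrogram, and min-max normalise it so it can be fed to an autoencoder and mapped back afterwards. It is not the code from this repo. The function names, the frame size, hop length, and clip length are all assumptions chosen for the example, and a synthetic sine wave stands in for a real FSDD recording.

```python
import numpy as np


def pad_or_trim(signal, num_samples):
    """Right-pad with zeros (or cut) so every clip has the same length."""
    signal = np.asarray(signal, dtype=np.float64)
    if len(signal) >= num_samples:
        return signal[:num_samples]
    return np.pad(signal, (0, num_samples - len(signal)))


def log_spectrogram(signal, frame_size=512, hop_length=256):
    """Magnitude STFT in decibels, shape (frame_size // 2 + 1, num_frames)."""
    window = np.hanning(frame_size)
    num_frames = 1 + (len(signal) - frame_size) // hop_length
    frames = np.stack([
        signal[i * hop_length:i * hop_length + frame_size] * window
        for i in range(num_frames)
    ])
    magnitude = np.abs(np.fft.rfft(frames, axis=1)).T
    # Floor the magnitude to avoid log(0) on silent (zero-padded) frames.
    return 20.0 * np.log10(np.maximum(magnitude, 1e-10))


def min_max_normalise(array):
    """Scale to [0, 1]; also return min and max so the scaling can be undone."""
    lo, hi = float(array.min()), float(array.max())
    if hi == lo:
        return np.zeros_like(array), lo, hi
    return (array - lo) / (hi - lo), lo, hi


def denormalise(norm_array, lo, hi):
    """Invert min_max_normalise, e.g. on a spectrogram produced by the decoder."""
    return norm_array * (hi - lo) + lo


# Synthetic stand-in for one recording (assumed 8 kHz mono, 0.5 s long).
sample_rate = 8000
t = np.arange(int(0.5 * sample_rate)) / sample_rate
clip = 0.5 * np.sin(2 * np.pi * 440.0 * t)

fixed = pad_or_trim(clip, num_samples=6000)
spec = log_spectrogram(fixed)
features, spec_min, spec_max = min_max_normalise(spec)
restored = denormalise(features, spec_min, spec_max)
print(spec.shape)
```

The min and max values are kept next to each normalised spectrogram because generated spectrograms have to be denormalised with them before they can be converted back to a waveform.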
Some notes:
- This repo is a demo only, so the quality of the generated audio isn't the best.
- This repo was originally written with no intention of publishing it, so the code may be disorganized in places; it will be restructured later.
References:
- "Generating Sound with Neural Networks" playlist on YouTube by Valerio Velardo.
- Generative Deep Learning, 2nd Edition by David Foster, Chapter 3: Variational Autoencoders.