In this work we study one aspect of this problem by reconstructing speech from the intermediate embeddings computed by a CNN. Specifically, we consider a pre-trained network that acts as a feature extractor for speech audio. We investigate whether these features can be inverted to reconstruct the input signals in a black-box scenario, and we quantify reconstruction quality as the word error rate (WER) of an off-the-shelf ASR model on the reconstructed speech.
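As an illustration of the evaluation metric mentioned above, the following is a minimal sketch of a word-error-rate computation: the word-level edit distance between a reference transcript and an ASR hypothesis, divided by the number of reference words. The function name `word_error_rate` and the simple lowercase, whitespace-based tokenization are assumptions made for this example; they are not taken from this repository, whose actual evaluation code may normalize text differently.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration only; tokenization is a plain
    lowercase whitespace split (an assumption, not the repository's method).
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"): 2 errors / 6 words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

In a setup like the one described, the reference would be the transcript of the original utterance and the hypothesis would be the ASR output on the reconstructed audio. A lower value indicates that more of the linguistic content survived the inversion.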
Speech reconstruction from pre-trained CNN embeddings
polimi-ispl/speech_reconstruction_embeddings