Filtering the training data #151

ANonEntity · 2022-09-27T13:43:13Z

Whisper tends to transcribe or translate silence as "Thank you for watching!", "Please subscribe to my channel!" and so on, presumably because the training data contains YouTube captions. It seems that removing these lines from the training data and then retraining or fine-tuning Whisper is the only way to solve this.

Would it be possible to detect caption lines like these, which have no speech behind them, automatically? Maybe by running all the training data through a voice activity detector (VAD)?
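To make the idea concrete, here is a rough sketch of what such a filter could look like. Everything in it is an assumption for illustration: the `Segment` record, the function names, and the thresholds are hypothetical, and the "VAD" is only a crude frame-energy check standing in for a real voice activity detector. It detects energy, not speech, so music or noise would pass it. The point is just the shape of the filter: score the audio under each caption line and drop lines whose audio is mostly silent.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Segment:
    """One caption line aligned to audio (hypothetical training-data record)."""
    start: float  # seconds
    end: float    # seconds
    text: str


def voiced_fraction(samples: Sequence[float], sample_rate: int,
                    frame_ms: int = 30, rms_threshold: float = 0.01) -> float:
    """Return the fraction of frames whose RMS energy exceeds rms_threshold.

    This is an energy check used as a stand-in for a real VAD (assumption).
    """
    frame_len = max(1, sample_rate * frame_ms // 1000)
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    if not frames:
        return 0.0
    voiced = 0
    for frame in frames:
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms > rms_threshold:
            voiced += 1
    return voiced / len(frames)


def filter_segments(samples: Sequence[float], sample_rate: int,
                    segments: List[Segment], min_voiced: float = 0.2) -> List[Segment]:
    """Keep only caption lines whose audio span has enough energetic frames."""
    kept = []
    for seg in segments:
        lo = int(seg.start * sample_rate)
        hi = int(seg.end * sample_rate)
        if voiced_fraction(samples[lo:hi], sample_rate) >= min_voiced:
            kept.append(seg)
    return kept


# Synthetic demo: one second of a 440 Hz tone followed by one second of silence.
SAMPLE_RATE = 16000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
silence = [0.0] * SAMPLE_RATE
audio = tone + silence

captions = [
    Segment(0.0, 1.0, "hello there"),
    Segment(1.0, 2.0, "Thank you for watching!"),
]
kept = filter_segments(audio, SAMPLE_RATE, captions)
print([seg.text for seg in kept])  # prints ['hello there']
```

In a real pipeline the energy check would be swapped for an actual VAD model, and the thresholds would need tuning on real data, since captions are often only loosely aligned to the audio.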

Replies: 0 comments
