Description
I just installed the latest release, 0.3.0. I can't generate a voice profile: every sound sample I upload produces a "clipping" error when I try to create the profile. Trying again with a new sample or a new voice profile then gives me a "failed to fetch" error when the voice sample is transcribed.
Regarding CUDA support: Voicebox_0.3.0_x64-setup.exe still does not enable CUDA support on my RTX 5070.
From the release notes, I gather that cu126 is used, but as far as I know, Blackwell cards need at least cu128 (I use cu130 on other projects).
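To make the version concern concrete, here is a minimal sketch of the check I have in mind. It assumes the backend is built on PyTorch (the "cu126" naming suggests this, but I have not confirmed it), and the helper names `parse_cuda_version` and `build_supports_blackwell` are hypothetical, not part of Voicebox. The 12.8 minimum reflects my understanding of what Blackwell needs.

```python
def parse_cuda_version(version: str) -> tuple[int, int]:
    """Turn a version string such as "12.6" into the tuple (12, 6)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def build_supports_blackwell(build_cuda: str, minimum=(12, 8)) -> bool:
    """Return True if the CUDA toolkit version of a build meets the assumed minimum.

    The (12, 8) default is an assumption based on cu128 being the first
    wheel variant I know of that works on Blackwell cards.
    """
    return parse_cuda_version(build_cuda) >= minimum


if __name__ == "__main__":
    try:
        import torch  # assumption: the bundled backend uses PyTorch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        # torch.version.cuda is None on CPU-only builds.
        print("CUDA build:", torch.version.cuda)
        print("CUDA available:", torch.cuda.is_available())
        if torch.version.cuda:
            print("New enough for Blackwell:",
                  build_supports_blackwell(torch.version.cuda))
```

With a cu126 build, `build_supports_blackwell("12.6")` returns `False`, which would match what I am seeing.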
Is it possible to choose the model used for transcription of voice samples? That would be a great feature.
Is there any information on the expected file format for voice samples? Adding a preprocessing step with ffmpeg would be great, so that the program always gets the audio format it works best with.
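A rough sketch of the kind of preprocessing I mean is below. The target format (mono, 24 kHz, 16-bit PCM WAV) is purely my guess, since I don't know what the program actually expects, and the function names `build_ffmpeg_command` and `preprocess_sample` are hypothetical. It also assumes `ffmpeg` is available on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=24000, channels=1):
    """Build an ffmpeg command that converts src to a PCM WAV file.

    The default sample rate and channel count are assumptions, not
    values documented by the project.
    """
    return [
        "ffmpeg",
        "-y",                     # overwrite the output file if it exists
        "-i", str(src),           # input file in any format ffmpeg can read
        "-vn",                    # drop any video stream
        "-ac", str(channels),     # number of audio channels
        "-ar", str(sample_rate),  # output sample rate in Hz
        "-c:a", "pcm_s16le",      # 16-bit little-endian PCM
        str(dst),
    ]


def preprocess_sample(src, dst, **kwargs):
    """Run the conversion and return the output path; raises if ffmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst, **kwargs), check=True)
    return Path(dst)
```

Calling `preprocess_sample("sample.mp3", "sample.wav")` before upload would then always hand the program the same format, whatever the user recorded.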