Only translating last 30s or so of the audio file. #172
Comments
What kind of audio is it? Also, could you provide the command you are using to call the CLI? This may be a result of the log prob check treating entire windows as silent, which can happen if the audio is particularly noisy. Can you try adjusting the log prob threshold and see if the results are better?
So I was trying to transcribe an episode from the Cautionary Tales podcast. The sound is clear for the majority of the episode. I was using the CLI with this command: You can get the audio file from here:
@ZachNagengast After doing some more tests, I believe the bug is in the conversion process, which runs when the source file is not 1 channel and 16 kHz. In this line here, while we read new data into the input buffer in chunks, we always write to the same position (0) of the outputBuffer, so in the end the outputBuffer only holds data from the last chunk read from the input file.
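To illustrate the suspected failure mode described in the comment above, here is a minimal, language-agnostic sketch in Python. It is not WhisperKit's code: the function names, the list-based buffer, and the chunk sizes are all hypothetical and chosen only to show how writing every chunk at offset 0 leaves just the final chunk in the output, and how advancing a write position avoids that.

```python
def convert_chunks_buggy(chunks, capacity):
    """Hypothetical reproduction: every chunk is written at offset 0."""
    output = [0.0] * capacity
    for chunk in chunks:
        # Each write lands at the start, overwriting the previous chunk.
        output[0:len(chunk)] = chunk
    return output


def convert_chunks_fixed(chunks, capacity):
    """Hypothetical fix: track a write position and advance it per chunk."""
    output = [0.0] * capacity
    write_pos = 0
    for chunk in chunks:
        output[write_pos:write_pos + len(chunk)] = chunk
        write_pos += len(chunk)
    # Return only the samples that were actually written.
    return output[:write_pos]


if __name__ == "__main__":
    # Three stand-in "audio chunks" of two samples each.
    chunks = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(convert_chunks_buggy(chunks, 6))  # only the last chunk survives
    print(convert_chunks_fixed(chunks, 6))  # all chunks, in order
```

With the buggy version only the last chunk's samples remain at the start of the buffer, which would match the symptom of a transcript covering only the tail end of a long file.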
Hi @SergioEstevao, I'm having trouble reproducing this. Can you share the hardware and OS you're using where this error occurs? This is the file I get running your same command with the file
I've confirmed there's something weird with the output buffer. Will have something for this shortly.
When using the whisper-kit CLI or apps with a large file (50 minutes of audio), it looks like the final report (.srt file) only shows the last 30s of content transcribed.
Is this expected, or am I missing a command-line argument?