Speech-to-text error #150

Open
hjj-lmx opened this issue Oct 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments


hjj-lmx commented Oct 23, 2024

The same video works fine on Windows but fails on Ubuntu with the following error:

ERROR:root:An error occurred: choose a window size 400 that is [2, 160]
Traceback (most recent call last):
  File "/UD-AI-TextToSpeech/text_to_speech/server_gpu.py", line 303, in audio2text
    text = sense_voice_model(params)
  File "/UD-AI-TextToSpeech/text_to_speech/audio_to_text/sense_voice_model.py", line 20, in call
    res = self.sense_voice_model.generate(
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 263, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 417, in inference_with_vad
    results = self.inference(
  File "/usr/local/lib/python3.10/dist-packages/funasr/auto/auto_model.py", line 302, in inference
    res = model.inference(**batch, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/funasr/models/sense_voice/model.py", line 832, in inference
    speech, speech_lengths = extract_fbank(
  File "/usr/local/lib/python3.10/dist-packages/funasr/utils/load_utils.py", line 173, in extract_fbank
    data, data_len = frontend(data, data_len, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/funasr/frontends/wav_frontend.py", line 134, in forward
    mat = kaldi.fbank(
  File "/usr/local/lib/python3.10/dist-packages/torchaudio/compliance/kaldi.py", line 591, in fbank
    waveform, window_shift, window_size, padded_window_size = _get_waveform_and_window_properties(
  File "/usr/local/lib/python3.10/dist-packages/torchaudio/compliance/kaldi.py", line 142, in _get_waveform_and_window_properties
    assert 2 <= window_size <= len(waveform), "choose a window size {} that is [2, {}]".format(
AssertionError: choose a window size 400 that is [2, 160]
0%| | 0/24 [00:00<?, ?it/s]
0%| | 0/1 [00:06<?, ?it/s]
INFO:werkzeug:172.31.16.5 - - [23/Oct/2024 02:15:55] "POST /audio2text HTTP/1.1" 500 -

hjj-lmx added the bug label Oct 23, 2024

hjj-lmx commented Oct 26, 2024

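Does anyone know what is causing this? Both environments use the same requirements.txt; the only difference is that on Ubuntu I run it from an image I built myself. Not every video URL triggers the error, only some videos do.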
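
For anyone reading the traceback: the assertion in torchaudio's `kaldi.fbank` compares the analysis window (400 samples) with the length of the waveform it was handed (160 samples). Assuming 16 kHz audio and a 25 ms frame, which are torchaudio's defaults and would explain the 400, that waveform is only 10 ms long. Since the call goes through `inference_with_vad`, one plausible reading is that a very short segment reached the frontend, which would also fit the observation that only some videos fail. The sketch below only illustrates the arithmetic and one possible guard (zero-padding or skipping short segments); the helper names are hypothetical and are not part of FunASR or torchaudio.

```python
# Hypothetical helpers, not FunASR or torchaudio APIs.
# Assumption: 16 kHz audio and a 25 ms frame (torchaudio's kaldi.fbank defaults).

def window_size_in_samples(sample_rate_hz: int = 16000, frame_length_ms: int = 25) -> int:
    """Number of samples covered by one analysis window."""
    return sample_rate_hz * frame_length_ms // 1000


def is_long_enough(num_samples: int, sample_rate_hz: int = 16000, frame_length_ms: int = 25) -> bool:
    """Mirror of the failing check: the window must fit inside the waveform."""
    window = window_size_in_samples(sample_rate_hz, frame_length_ms)
    return 2 <= window <= num_samples


def pad_to_window(samples: list, sample_rate_hz: int = 16000, frame_length_ms: int = 25) -> list:
    """Zero-pad a short segment so that one full window fits; longer input is copied unchanged."""
    window = window_size_in_samples(sample_rate_hz, frame_length_ms)
    missing = window - len(samples)
    if missing <= 0:
        return list(samples)
    return list(samples) + [0.0] * missing


if __name__ == "__main__":
    segment = [0.1] * 160  # same length as the waveform in the error message
    print(window_size_in_samples())     # window size under the stated assumptions
    print(is_long_enough(len(segment)))  # whether the segment would pass the assertion
    print(len(pad_to_window(segment)))   # length after padding
```

Whether padding or skipping such segments is appropriate depends on where the short segment comes from, so logging the length of each segment before it reaches the model would be a reasonable first step.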
