pandas.arrays_to_mgr raises "All arrays must be of the same length" #270
Comments
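Which LLM are you using? With the recommended models I almost never see the translation come back with fewer lines. This check exists to keep the final subtitles stable. |
The problem occurs with Qwen2-72B-Instruct. |
😂 For now the default recommendation is still Claude; Qwen still has a long way to go to catch up. |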
I get this error with OpenAI's gpt-4o as well. Oddly, for the same YouTube video the 360p version works, but 720p (an option I added myself) and 1080p both fail.
2024-11-25 10:47:17.855 Uncaught app exception
Traceback (most recent call last):
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/exec_code.py", line 88, in exec_func_with_error_handling
result = func()
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 590, in code_to_exec
exec(code, module.__dict__)
File "/Users/dinochan/bs/ai/VideoLingo/st.py", line 116, in <module>
main()
File "/Users/dinochan/bs/ai/VideoLingo/st.py", line 112, in main
text_processing_section()
File "/Users/dinochan/bs/ai/VideoLingo/st.py", line 30, in text_processing_section
process_text()
File "/Users/dinochan/bs/ai/VideoLingo/st.py", line 54, in process_text
step5_splitforsub.split_for_sub_main()
File "/Users/dinochan/bs/ai/VideoLingo/core/step5_splitforsub.py", line 104, in split_for_sub_main
pd.DataFrame({'Source': src_lines, 'Translation': tr_lines}).to_excel("output/log/translation_results_for_subtitles.xlsx", index=False)
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/pandas/core/frame.py", line 778, in __init__
mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 503, in dict_to_mgr
return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 114, in arrays_to_mgr
index = _extract_index(arrays)
File "/opt/anaconda3/envs/videolingo/lib/python3.10/site-packages/pandas/core/internals/construction.py", line 677, in _extract_index
raise ValueError("All arrays must be of the same length") |
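Haha, this has nothing to do with resolution. It is probably a probabilistic failure: gpt-4o may have returned an incomplete response or dropped a sentence. |
That was my guess too; I only found it odd because it reproduced consistently across repeated test runs. 😂 |
gpt_log records every response, and a rerun reads the history back from it, so rerunning without deleting the log will still raise the same error. |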
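To illustrate why a rerun keeps failing until the log is deleted, here is a minimal sketch of a response cache that replays stored answers. It is not VideoLingo's actual code: the function name `cached_call`, the file name `gpt_log.json`, and the JSON layout are all assumptions made for the example.

```python
import json
import os
import tempfile


def cached_call(log_path, prompt, ask_model):
    """Return the logged response for `prompt` if present, else call the model and log it.

    Hypothetical helper: the real gpt_log format is not shown in this thread.
    """
    log = {}
    if os.path.exists(log_path):
        with open(log_path, encoding="utf-8") as f:
            log = json.load(f)
    if prompt in log:
        # Replayed as-is, even if the stored response was incomplete.
        return log[prompt]
    response = ask_model(prompt)
    log[prompt] = response
    with open(log_path, "w", encoding="utf-8") as f:
        json.dump(log, f, ensure_ascii=False)
    return response


# Assumed log location, created in a temporary directory for the demo.
log_path = os.path.join(tempfile.mkdtemp(), "gpt_log.json")

# First run: the model drops a line and the bad response is logged.
first = cached_call(log_path, "translate 3 lines", lambda p: "only 2 lines")
# Second run: the model would now answer correctly, but the log is replayed.
second = cached_call(log_path, "translate 3 lines", lambda p: "all 3 lines")
# Deleting the log forces a fresh request.
os.remove(log_path)
third = cached_call(log_path, "translate 3 lines", lambda p: "all 3 lines")
```

In this sketch the second call returns the same incomplete response as the first, which would match a failure that reproduces on every rerun; only the third call, made after the log is removed, reaches the model again.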
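For some videos, translation fails in step5 at pd.DataFrame({'Source': src, 'Translation': remerged}).to_excel(OUTPUT_REMERGED_FILE, index=False), where arrays_to_mgr reports that all arrays must be of the same length.
As far as I can tell, the file "output/log/translation_results_remerged.xlsx" is only used for audio-only translation, so for now, commenting out the step5 and step6 code lets the video translation task finish correctly.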
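The error comes from passing two lists of different lengths as DataFrame columns. A minimal sketch of a pre-check that could run before the pd.DataFrame call is shown below; `check_line_counts` is a hypothetical helper, not part of VideoLingo, and it only makes the mismatch easier to diagnose rather than repairing the translation.

```python
def check_line_counts(src_lines, tr_lines):
    """Raise a descriptive error if the two lists cannot form aligned columns.

    Hypothetical helper: returns the aligned (source, translation) pairs on success.
    """
    if len(src_lines) != len(tr_lines):
        raise ValueError(
            f"Line count mismatch: {len(src_lines)} source lines vs "
            f"{len(tr_lines)} translated lines"
        )
    return list(zip(src_lines, tr_lines))


# Equal lengths: the pairs can safely become the Source/Translation columns.
pairs = check_line_counts(["Hello", "World"], ["你好", "世界"])

# Unequal lengths: fail early with the counts instead of the generic pandas message.
try:
    check_line_counts(["a", "b", "c"], ["x", "y"])
except ValueError as exc:
    message = str(exc)
```

Reporting both counts points directly at a dropped line in the LLM response, which the generic pandas message does not.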