Why can’t training start? #142

Open
piwawa opened this issue May 7, 2024 · 4 comments
Comments

@piwawa

piwawa commented May 7, 2024

[three screenshots attached]

Epoch 0 has been running for 2 days. I have filled in the paths correctly in train.txt and text.txt; the dataset has 330k videos and has been preprocessed. If I select a subset of the data from train.txt, training starts immediately. Why can't I train on the large dataset?

@bjfrbjx

bjfrbjx commented May 17, 2024

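Set a breakpoint on the `continue` in the Dataset. The data loading code in the original project always skips over exceptions, so when samples keep failing it turns into a futile infinite loop.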
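The failure mode described here can be sketched as follows. This is a minimal illustration, not the project's actual loader: the class `SkippingDataset`, its `_load` helper, and the `max_retries` parameter are all hypothetical names, and a `None` entry stands in for a file that fails to decode. The idea is to bound the retries and keep the last exception, so a dataset in which samples keep failing raises a visible error rather than spinning forever.

```python
import random


class SkippingDataset:
    """Hypothetical stand-in for a dataset whose loader skips bad samples."""

    def __init__(self, samples, max_retries=100):
        self.samples = samples
        self.max_retries = max_retries  # hypothetical knob, not from the project

    def _load(self, idx):
        # Stand-in for real decoding: a None entry represents a broken file.
        item = self.samples[idx]
        if item is None:
            raise ValueError(f"sample {idx} could not be loaded")
        return item

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        # A bare `while True: try ... except: continue` never exits when
        # every attempt fails. Bounding the retries and remembering the
        # last exception makes the underlying problem visible.
        last_error = None
        for _ in range(self.max_retries):
            try:
                return self._load(idx)
            except Exception as exc:
                last_error = exc
                idx = random.randrange(len(self.samples))
        raise RuntimeError(
            f"gave up after {self.max_retries} failed loads; "
            f"last error: {last_error!r}"
        )
```

With this shape, a dataset whose every sample fails raises a `RuntimeError` carrying the last underlying exception, which usually points at the real cause (a wrong path, a missing frame, a short audio clip).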

@piwawa
Author

piwawa commented Jun 12, 2024

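> Set a breakpoint on the `continue` in the Dataset. The data loading code in the original project always skips over exceptions, so when samples keep failing it turns into a futile infinite loop.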

Syncnet can train now, but the loss stays at 0.69.
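For context, 0.69 is very close to ln 2 ≈ 0.693, which is the binary cross-entropy of a classifier that outputs 0.5 for every sample. Assuming the Syncnet loss here is a binary cross-entropy (an assumption; the thread does not show the loss code), a plateau at this value would suggest the model is not yet telling in-sync pairs from out-of-sync ones. A quick check of the arithmetic, with `binary_cross_entropy` as a hypothetical helper:

```python
import math


def binary_cross_entropy(p, y):
    """Binary cross-entropy for one prediction p in (0, 1) and label y in {0, 1}."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


# A model that always predicts 0.5 gets the same loss for either label.
chance_loss = binary_cross_entropy(0.5, 1)
print(round(chance_loss, 4))  # 0.6931
```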

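`python hq_wav2lip_sam_train.py`: now training wav2lip has a problem as well. With exactly the same dataset, it stays stuck at this point. The machine has a 4090 GPU and over 200 GB of RAM, and it has been stuck for several days with no response.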

image

@yuanmaitian

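> Set a breakpoint on the `continue` in the Dataset. The data loading code in the original project always skips over exceptions, so when samples keep failing it turns into a futile infinite loop.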

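By "set a breakpoint", do you mean stepping through in a debugger, or replacing the `continue` with a `break`? I have the same problem: I used your stream-wav2lip to train syncnet, and after fixing the index error it stays stuck at epoch 0.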

@qiuzi

qiuzi commented Aug 12, 2024

@piwawa how did you solve the epoch 0 problem?
