Hello, I looked at the trained models you provided and noticed that each method has two trained models: one appears to have been trained on a 2080 Ti and the other on a V100. However, the former converges well after only 200 epochs, while the latter needs 2000 epochs to reach the same level of convergence. Why is that?
Hello, I'm also using this project. I believe the reason is that the project uses a learning-rate scheduler. For example, the OneCycle scheduler adjusts the learning rate according to the relative progress through the total number of epochs: the overall trend is small, then large, then small again. Say the learning rate is set to 1e-3: for a 200-epoch run, the learning rate approaches its maximum of 1e-3 around epoch 100, whereas for a 2000-epoch run the learning rate stays very small throughout the first 200 epochs and only reaches the vicinity of 1e-3 around epoch 1000. The 2000-epoch setting is presumably meant for more thorough training, not as a comparison of training efficiency. Feel free to reach out anytime!
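To make the scheduler effect concrete, here is a minimal plain-Python sketch of the OneCycle shape (cosine warmup then cosine annealing, mirroring the defaults of PyTorch's `OneCycleLR`: `pct_start=0.3`, `div_factor=25`, `final_div_factor=1e4`; these parameter values are illustrative assumptions, not confirmed settings of this project). It shows that the same absolute epoch sits at very different points of the cycle depending on the total epoch budget:

```python
import math

def one_cycle_lr(step, total_steps, max_lr=1e-3,
                 pct_start=0.3, div_factor=25.0, final_div_factor=1e4):
    """OneCycle schedule: ramp up to max_lr, then anneal down.

    The learning rate depends only on step/total_steps (relative
    progress), which is why a 200-epoch and a 2000-epoch run behave
    so differently at the same absolute epoch.
    """
    initial_lr = max_lr / div_factor          # starting LR of the warmup
    min_lr = initial_lr / final_div_factor    # final LR after annealing
    warmup_steps = pct_start * total_steps
    if step <= warmup_steps:
        # cosine ramp from initial_lr up to max_lr
        pct = step / warmup_steps
        return max_lr + (initial_lr - max_lr) * (1 + math.cos(math.pi * pct)) / 2
    # cosine anneal from max_lr down to min_lr
    pct = (step - warmup_steps) / (total_steps - warmup_steps)
    return min_lr + (max_lr - min_lr) * (1 + math.cos(math.pi * pct)) / 2

# Same absolute epoch, very different relative progress:
lr_short = one_cycle_lr(100, 200)    # past warmup, near the peak region
lr_long = one_cycle_lr(100, 2000)    # still early in the slow warmup
print(lr_short, lr_long)
```

Running this shows the 200-epoch schedule is already near its peak learning rate at epoch 100, while the 2000-epoch schedule is still warming up with a much smaller rate, which matches the convergence gap described above.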