You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
In your paper, you use the learning rate above where β = 5000, which meaning the learning rate is less than 10−8. However, in your code, there is
Thus, the learning rate in your code is about 2e-5 with the parameter "scale_lr". Is there anything wrong?
The text was updated successfully, but these errors were encountered:
In your paper, you use the learning rate above where β = 5000, which meaning the learning rate is less than 10−8. However, in your code, there is
Thus, the learning rate in your code is about 2e-5 with the parameter "scale_lr". Is there anything wrong?
The text was updated successfully, but these errors were encountered: