
Use dynamic learning rate decay for convergence#39

Open
begeekmyfriend wants to merge 2 commits into seungwonpark:master from begeekmyfriend:master

Conversation

@begeekmyfriend

The evaluation audio sounds better than with a fixed learning rate.


Signed-off-by: begeekmyfriend <begeekmyfriend@gmail.com>
@seungwonpark
Owner

Hi, your code looks great, and thanks for kindly sending a PR!
Could you please share the audio samples you got (with the number of epochs) for comparison?

@begeekmyfriend
Author

melgan_eval_mandarin.zip
I have synthesized voices from 4 anchors (1 male and 3 female). The checkpoint is only at epoch 375 and still training; I think the schedule helps convergence.

Signed-off-by: begeekmyfriend <begeekmyfriend@gmail.com>


def cosine_decay(init_val, final_val, step, decay_steps):
    alpha = final_val / init_val

Shouldn't this be init_val / final_val?

Author


The learning rate decays, so alpha must be less than 1. You might write a quick demo to test it.

init_val = 1e-4
final_val = 1e-5
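A minimal self-contained sketch of such a demo, using the values quoted above (the body follows the standard cosine-decay formula from tf.train.cosine_decay; the full PR implementation may differ):

```python
import math

def cosine_decay(init_val, final_val, step, decay_steps):
    # alpha is the final LR expressed as a *fraction* of the initial LR,
    # matching tf.train.cosine_decay's "alpha" argument (hence final/init)
    alpha = final_val / init_val
    step = min(step, decay_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / decay_steps))
    decayed = (1.0 - alpha) * cosine + alpha
    return init_val * decayed

init_val = 1e-4
final_val = 1e-5
for step in (0, 500, 1000):
    print(step, cosine_decay(init_val, final_val, step, 1000))
# starts at 1e-4, passes through 5.5e-5 at the midpoint, ends at 1e-5
```

The midpoint value is above the arithmetic mean because the cosine curve decays slowly at first, then faster toward the end.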


@casper-hansen casper-hansen Jan 26, 2020


According to the following source, alpha is the "Minimum learning rate value as a fraction of learning_rate."
https://docs.w3cub.com/tensorflow~python/tf/train/cosine_decay/

Given the values, it looks correct. The naming is just misleading: the smallest value belongs in the numerator and the largest in the denominator.

@bob80333

Is this different from pytorch's built-in CosineAnnealingLR?
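(For what it's worth, the two parameterizations appear to describe the same curve: PyTorch's CosineAnnealingLR closed form is eta_min + (eta_max - eta_min) * (1 + cos(pi * t / T_max)) / 2, which algebraically equals the alpha form above since init_val * alpha = final_val. A small sketch checking this numerically, with both formulas reimplemented in plain Python as an assumption rather than taken from either codebase:)

```python
import math

def cosine_decay(init_val, final_val, step, decay_steps):
    # TF-style: floor expressed as a fraction alpha of the initial value
    alpha = final_val / init_val
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(step, decay_steps) / decay_steps))
    return init_val * ((1.0 - alpha) * cosine + alpha)

def cosine_annealing_lr(eta_max, eta_min, t, t_max):
    # PyTorch-style closed form (CosineAnnealingLR without warm restarts)
    return eta_min + 0.5 * (eta_max - eta_min) * (1.0 + math.cos(math.pi * t / t_max))

# the two formulas agree at every step
for t in (0, 250, 500, 750, 1000):
    assert math.isclose(cosine_decay(1e-4, 1e-5, t, 1000),
                        cosine_annealing_lr(1e-4, 1e-5, t, 1000))
```

The practical differences are in the interface, not the math: CosineAnnealingLR steps per epoch by default and supports warm restarts via CosineAnnealingWarmRestarts.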

@Liujingxiu23

@begeekmyfriend
You tried different learning rates and found that the cosine LR works best? Why is it suitable for MelGAN?
I am confused about when we should use a constant LR, when a decaying LR (for example, exponential decay in Tacotron), and when a cosine LR.

@begeekmyfriend
Author

It is just a preference. Pick this or another schedule as you like.

@Liujingxiu23

Liujingxiu23 commented Jan 28, 2021

@begeekmyfriend Thank you for your quick reply. I used your branch of tacotron and found it is one of the best among many code branches. I will try the cosine LR as well as apex in tfgan.

