RL results are worse than MLE results on ROUGE-1 and ROUGE-2 #26
Comments
@pengzhi123 can you please let me know the system specification you used? I am trying to run this on a Windows machine with 32 GB RAM, and I don't have CUDA enabled on my system.
You should use Ubuntu, not Windows.
Hi, after MLE training we picked the best model (0050000.tar) as the starting point for RL training. Although the ROUGE-L score improved, the ROUGE-1 and ROUGE-2 scores became much worse.
The evaluation results are shown below (we use ROUGE-1 for evaluation).
mle (official testset):
Training mle: yes, Training rl: no, mle weight: 1.00, rl weight: 0.00
intra_encoder: True intra_decoder: True
0005000.tar rouge_1: 0.3174
0010000.tar rouge_1: 0.3249
0015000.tar rouge_1: 0.3289
0020000.tar rouge_1: 0.3325
0025000.tar rouge_1: 0.3331
0030000.tar rouge_1: 0.3357
0035000.tar rouge_1: 0.3379
0040000.tar rouge_1: 0.3355
0045000.tar rouge_1: 0.3382
0050000.tar rouge_1: 0.3426
0055000.tar rouge_1: 0.3384
0060000.tar rouge_1: 0.3339
0065000.tar rouge_1: 0.3410
0070000.tar rouge_1: 0.3408
0075000.tar rouge_1: 0.3425
0080000.tar rouge_1: 0.3384
0085000.tar rouge_1: 0.3362
0090000.tar rouge_1: 0.3424
0095000.tar rouge_1: 0.3377
0100000.tar rouge_1: 0.3361
0105000.tar rouge_1: 0.3357
0110000.tar rouge_1: 0.3389
0115000.tar rouge_1: 0.3374
0120000.tar rouge_1: 0.3341
0125000.tar rouge_1: 0.3357
0130000.tar rouge_1: 0.3377
0135000.tar rouge_1: 0.3317
0140000.tar rouge_1: 0.3321
0145000.tar rouge_1: 0.3349
0150000.tar rouge_1: 0.3363
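For reference, the best checkpoint can be picked out of a log like the one above with a few lines of Python. This is only a sketch: the regex and the helper name `best_checkpoint` are assumptions for illustration and are not part of the repository, and the sample contains just four lines copied from the log above.

```python
import re

# Matches lines such as "0050000.tar rouge_1: 0.3426" (format taken from the log above).
LINE_RE = re.compile(r"^(\d+\.tar)\s+rouge_1:\s+([0-9.]+)$")


def best_checkpoint(log_text):
    """Return (checkpoint_name, score) for the highest rouge_1 line in log_text."""
    scores = {}
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            scores[match.group(1)] = float(match.group(2))
    if not scores:
        raise ValueError("no checkpoint lines found")
    best = max(scores, key=scores.get)
    return best, scores[best]


# Four lines copied from the MLE log above.
sample = """\
0045000.tar rouge_1: 0.3382
0050000.tar rouge_1: 0.3426
0055000.tar rouge_1: 0.3384
0075000.tar rouge_1: 0.3425
"""

print(best_checkpoint(sample))
```

On the full MLE log this selects 0050000.tar, the checkpoint loaded for the RL run.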
rl (official testset):
in_rl=yes --mle_weight=0.0 --load_model=0050000.tar --new_lr=0.0001
Training mle: no, Training rl: yes, mle weight: 0.00, rl weight: 1.00
intra_encoder: True intra_decoder: True
Loaded model at data/saved_models/0050000.tar
0050000.tar rouge_1: 0.3426
0055000.tar rouge_1: 0.2522
0060000.tar rouge_1: 0.2520
0065000.tar rouge_1: 0.2549
0070000.tar rouge_1: 0.2550
0075000.tar rouge_1: 0.2547
0080000.tar rouge_1: 0.2584
0085000.tar rouge_1: 0.2576
0090000.tar rouge_1: 0.2543
0095000.tar rouge_1: 0.2567
0100000.tar rouge_1: 0.2562
0105000.tar rouge_1: 0.2556
0110000.tar rouge_1: 0.2547
0115000.tar rouge_1: 0.2575
0120000.tar rouge_1: 0.2543
0125000.tar rouge_1: 0.2581
0130000.tar rouge_1: 0.2534
0135000.tar rouge_1: 0.2533
0140000.tar rouge_1: 0.2526
0145000.tar rouge_1: 0.2511
0150000.tar rouge_1: 0.2547
mle result:
0075000.tar scores: {'rouge-1': {'f': 0.3424728366572667, 'p': 0.39166721241721236, 'r': 0.31968494072078807}, 'rouge-2': {'f': 0.1732520206640223, 'p': 0.19845553983053968, 'r': 0.1623725413112666}, 'rouge-l': {'f': 0.32962985739519235, 'p': 0.3758193750693756, 'r': 0.3075168451832533}}
rl result:
0080000.tar scores: {'rouge-1': {'f': 0.2574669041724543, 'p': 0.21302155489848726, 'r': 0.34803503077209935}, 'rouge-2': {'f': 0.11896310475645827, 'p': 0.09758671687502977, 'r': 0.16587082443700088}, 'rouge-l': {'f': 0.35379459020991105, 'p': 0.39799812070645335, 'r': 0.33855028225319733}}
0125000.tar scores: {'rouge-1': {'f': 0.25674349158898563, 'p': 0.21440196978373974, 'r': 0.34277860517537473}, 'rouge-2': {'f': 0.11907341598225046, 'p': 0.09900864566338015, 'r': 0.16306397570581008}, 'rouge-l': {'f': 0.35462601354567735, 'p': 0.40579230645897313, 'r': 0.33368591052575747}}
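To make the difference easier to see, here is a small sketch that tabulates RL minus MLE for each metric. The numbers are the 0075000.tar (MLE) and 0080000.tar (RL) scores above, rounded to four decimals; `score_deltas` is a hypothetical helper name, not something from the repository.

```python
# Scores copied from the results above, rounded to four decimals.
mle = {
    "rouge-1": {"f": 0.3425, "p": 0.3917, "r": 0.3197},
    "rouge-2": {"f": 0.1733, "p": 0.1985, "r": 0.1624},
    "rouge-l": {"f": 0.3296, "p": 0.3758, "r": 0.3075},
}
rl = {
    "rouge-1": {"f": 0.2575, "p": 0.2130, "r": 0.3480},
    "rouge-2": {"f": 0.1190, "p": 0.0976, "r": 0.1659},
    "rouge-l": {"f": 0.3538, "p": 0.3980, "r": 0.3386},
}


def score_deltas(before, after):
    """Return after - before for every metric and every f/p/r field."""
    return {
        metric: {key: after[metric][key] - before[metric][key] for key in before[metric]}
        for metric in before
    }


deltas = score_deltas(mle, rl)
for metric, fields in deltas.items():
    row = "  ".join(f"{key}: {value:+.4f}" for key, value in fields.items())
    print(f"{metric}: {row}")
```

In these numbers the ROUGE-1 and ROUGE-2 drop comes from precision (ROUGE-1 precision falls from about 0.39 to 0.21) while recall rises slightly. One possible reading, which this data alone does not confirm, is that the RL model produces longer summaries.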
Thanks for your help!