Errors in finetuning #6

Open
pqviet opened this issue Jul 14, 2022 · 7 comments


pqviet commented Jul 14, 2022

After completing pre-training, I fine-tuned on refcoco-unc and got the following error:

File "SeqTR/seqtr/utils/checkpoint.py", line 57, in load_pretrained_checkpoint
state, ema_state = ckpt['state_dict'], ckpt['ema_state_dict']
KeyError: 'ema_state_dict'

Even after fixing this, I hit further KeyErrors in load_pretrained_checkpoint() (e.g. for lan_enc.embedding.weight and model.head).
Can you please check it?
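
For reference, the failing line can be made tolerant of checkpoints saved without EMA weights; a minimal sketch (assuming only the dict layout shown in the error above, not the repository's actual fix):

# seqtr/utils/checkpoint.py, around line 57: use .get() so checkpoints
# saved without an EMA copy load instead of raising KeyError.
state = ckpt["state_dict"]
ema_state = ckpt.get("ema_state_dict")  # None when no EMA weights were saved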

seanzhuh (Owner) commented

Hi, please upload the full traceback. Does it show that lan_enc.embedding.weight does not match in size? Pre-training uses a larger word vocabulary, while fine-tuning only needs a subset of it; since we freeze the embedding weights for both pre-training and fine-tuning, it's OK, don't worry.
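
To confirm that the mismatch is only in vocabulary size, one can compare shapes before loading; a sketch (state and model named as in checkpoint.py; this check is not in the repository):

# Compare the pre-trained embedding with the fine-tuning model's copy.
pretrained = state["lan_enc.embedding.weight"]
current = model.state_dict()["lan_enc.embedding.weight"]
print(pretrained.shape, current.shape)  # pre-training vocabulary is larger
# The embedding is frozen in both stages, so dropping it before loading is safe:
state.pop("lan_enc.embedding.weight", None)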

pqviet (Author) commented Jul 19, 2022

After fixing the 'ema_state_dict' KeyError, I got the same kind of error for lan_enc.embedding.weight:
KeyError: 'lan_enc.embedding.weight'
I think some keys expected by the fine-tuning code are not present in the pre-trained checkpoint.

seanzhuh (Owner) commented

Did you use DDP during fine-tuning? If so, the keys in the pre-trained state_dict need to be prefixed with "module.", since we do that in lines 58-59. By default we fine-tune on a single GPU card.
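
For the DDP case, re-keying the pre-trained state_dict would look roughly like this (a sketch of the prefixing described above, not the exact code at lines 58-59):

# Under DDP the model is wrapped, so its parameter names gain a "module."
# prefix; the plain pre-trained keys must be rewritten to match.
state = {"module." + k: v for k, v in state.items()}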

pqviet (Author) commented Jul 20, 2022

No, I didn't use DDP in fine-tuning:
python tools/train.py configs/seqtr/detection/seqtr_det_refcoco-unc.py --finetune-from work_dir/seqtr_det_mixed/det_best.pth --cfg-options scheduler_config.max_epoch=5 scheduler_config.decay_steps=[4] scheduler_config.warmup_epochs=0

CCYChongyanChen commented Nov 3, 2022

Dear Author:
I met the same error. The traceback is attached:

Traceback (most recent call last):
File "tools/train.py", line 183, in
main()
File "tools/train.py", line 179, in main
main_worker(cfg)
File "tools/train.py", line 105, in main_worker
load_pretrained_checkpoint(model, model_ema, cfg.finetune_from, amp=cfg.use_fp16)
File "/home/chch3470/SeqTR/seqtr/utils/checkpoint.py", line 57, in load_pretrained_checkpoint
state, ema_state = ckpt['state_dict'], ckpt['ema_state_dict']
KeyError: 'ema_state_dict'

I am fine-tuning the segmentation model from the "pre-trained + fine-tuned SeqTR segmentation" checkpoint on a customized dataset.

(1) I can run inference/test on this pretrained model.
(2) I can also fine-tune the detection model.

I'm not sure if something is missing from the segmentation fine-tuning... Could you kindly guide me? Thank you so much!

The script I run is:
python tools/train.py configs/seqtr/segmentation/seqtr_segm_vizwiz.py --finetune-from "/home/chch3470/SeqTR/work_dir/segm_best.pth" --cfg-options scheduler_config.max_epoch=10 scheduler_config.decay_steps=[4] scheduler_config.warmup_epochs=0
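
One quick way to see why the load fails is to inspect what segm_best.pth actually contains; a sketch using plain PyTorch (the path is the one from the command above):

import torch

# Load on CPU and list the top-level keys; a checkpoint saved without EMA
# weights will show e.g. dict_keys(['state_dict']) but no 'ema_state_dict'.
ckpt = torch.load("/home/chch3470/SeqTR/work_dir/segm_best.pth", map_location="cpu")
print(ckpt.keys())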

seanzhuh (Owner) commented Nov 3, 2022 via email


CCYChongyanChen commented Nov 3, 2022

Thank you for the quick reply!
I commented out the lines about EMA, and it now shows an error about lan_enc.embedding.weight:

Traceback (most recent call last):
File "tools/train.py", line 183, in
main()
File "tools/train.py", line 179, in main
main_worker(cfg)
File "tools/train.py", line 105, in main_worker
load_pretrained_checkpoint(model, model_ema, cfg.finetune_from, amp=cfg.use_fp16)
File "/home/chch3470/SeqTR/seqtr/utils/checkpoint.py", line 61, in load_pretrained_checkpoint
state.pop("lan_enc.embedding.weight")
KeyError: 'lan_enc.embedding.weight'

The seq_embedding_dim key is also missing.
I commented out many lines and it seems to be working, though I'm not sure whether I did it correctly.
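
Rather than commenting lines out, a less invasive workaround is to filter the state_dict and load non-strictly; a sketch (model and cfg named as in tools/train.py; this is not the repository's own fix):

import torch

ckpt = torch.load(cfg.finetune_from, map_location="cpu")
state = ckpt.get("state_dict", ckpt)
# Keep only keys the fine-tuning model knows, with matching shapes
# (drops the over-sized pre-training vocabulary embedding, for example).
model_state = model.state_dict()
state = {k: v for k, v in state.items()
         if k in model_state and v.shape == model_state[k].shape}
missing, unexpected = model.load_state_dict(state, strict=False)
print("missing:", missing, "unexpected:", unexpected)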
