
Update eval_percep.py and train.py #24

Open

atharvadeore999 wants to merge 2 commits into ssundaram21:main from atharvadeore999:main

Conversation

@atharvadeore999 commented Dec 2, 2024

  1. cfg is already a dictionary, so there is no need to pass it to vars() on line 55 of eval_percep.py.
  2. On line 189 of train.py, self.perceptual_model first needs to be passed to get_peft_model() along with lora_config; a rough sketch of both changes is shown below.
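
A rough sketch of the two changes, pieced together from the tracebacks later in this thread rather than copied verbatim from the repository:

```python
# eval_percep.py, inside load_dreamsim_model(): cfg is already a plain dict,
# so vars(cfg) raises TypeError; use the dict directly.
model_cfg = cfg  # was: model_cfg = vars(cfg)

# train.py, inside load_lora_weights(): wrap the base model with get_peft_model()
# first, so that .base_model.model exists before the adapter weights are loaded.
self.perceptual_model = get_peft_model(self.perceptual_model, lora_config)
self.perceptual_model = PeftModel.from_pretrained(
    self.perceptual_model.base_model.model, checkpoint_root
).to(self.device)
```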

@atharvadeore999 changed the title from "Update eval_percep.py" to "Update eval_percep.py and train.py" on Dec 13, 2024
@imneonizer commented Dec 13, 2024

@atharvadeore999 Did you get the training or evaluation working? Can you please share your env details or the pip freeze output?

@atharvadeore999 (Author) commented Dec 16, 2024

Hey @imneonizer, sorry for the late reply. I have attached the list of dependencies below. I tried the evaluation and training code and made the changes mentioned above.
requirements11.txt

@imneonizer

Thanks for sharing this. It was super helpful.

@atharvadeore999 (Author)

You're very welcome

@stephanie-fu (Collaborator)

Hi, thanks for the PR! Could you provide some more details about the errors you were getting with the original training script that prompted the model-loading change?

The current model-loading code seems to work for me for both save_mode='all' and 'adapter_only' (so I can reproduce the validation accuracy when running eval_percep.py), with the packages in requirements.txt installed.
We were able to simplify some of the model-loading code (which originally had to explicitly call LoraConfig and get_peft_model) with the new PEFT setup, so we shouldn't need this extra code anymore.
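
For readers hitting the same issue, a minimal sketch of adapter loading without the explicit LoraConfig / get_peft_model step (the function name and arguments here are placeholders, not the dreamsim API): PeftModel.from_pretrained reads adapter_config.json from the checkpoint directory itself and wraps the plain base model in one call.

```python
from peft import PeftModel

def load_adapter(base_model, checkpoint_root, device="cuda"):
    # from_pretrained reads adapter_config.json in checkpoint_root, wraps the
    # unmodified base model, and loads the saved LoRA weights in one step, so
    # the LoRA configuration does not have to be rebuilt by hand.
    return PeftModel.from_pretrained(base_model, checkpoint_root).to(device)
```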

@atharvadeore999 (Author)

I am using dreamsim v0.2.1 and peft==0.13.2. Here is my environment file:
requirements11.txt

ERROR 1:
Seed set to 1234
Traceback (most recent call last):
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 144, in
run(args, device)
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 130, in run
eval_model, preprocess = load_dreamsim_model(args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 56, in load_dreamsim_model
model_cfg = vars(cfg)
^^^^^^^^^
TypeError: vars() argument must have dict attribute

SOLUTION 1:
model_cfg = cfg
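
For context, a minimal standalone illustration (not dreamsim code) of why vars() rejects a plain dict: vars(x) simply returns x.__dict__, and a dict instance has no __dict__ attribute, so when cfg is already a dict it can be used directly.

```python
class Cfg:
    def __init__(self):
        self.batch_size = 32

print(vars(Cfg()))  # {'batch_size': 32} -- reads the instance __dict__

try:
    vars({"batch_size": 32})  # a plain dict has no __dict__ attribute
except TypeError as err:
    print(err)  # vars() argument must have __dict__ attribute
```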

ERROR 2:
Seed set to 1234
Using cache found in ./models/facebookresearch_dino_main
/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/anaconda3/envs/ds/lib/python3.12/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Total params: 92623105 | Trainable params: 6822401 | % Trainable: 7.365765809729656
Traceback (most recent call last):
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 143, in
run(args, device)
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 129, in run
eval_model, preprocess = load_dreamsim_model(args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/./evaluation/eval_percep.py", line 58, in load_dreamsim_model
model.load_lora_weights(args.eval_checkpoint)
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/new/dreamsim/evaluation/training/train.py", line 196, in load_lora_weights
self.perceptual_model = PeftModel.from_pretrained(self.perceptual_model.base_model.model, checkpoint_root).to(self.device)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/media/emsg/2d46715b-293d-4478-acd4-5f000d443896/anaconda3/envs/ds/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1709, in getattr
raise AttributeError(f"'{type(self).name}' object has no attribute '{name}'")
AttributeError: 'PerceptualModel' object has no attribute 'base_model'

SOLUTION 2:
def load_lora_weights(self, checkpoint_root, epoch_load=None):
    if self.save_mode in {'adapter_only', 'all'}:
        if epoch_load is not None:
            checkpoint_root = os.path.join(checkpoint_root, f'epoch_{epoch_load}')

        with open(os.path.join(checkpoint_root, 'adapter_config.json'), 'r') as f:
            adapter_config = json.load(f)
        lora_keys = ['r', 'lora_alpha', 'lora_dropout', 'bias', 'target_modules']
        lora_config = LoraConfig(**{k: adapter_config[k] for k in lora_keys})
        self.perceptual_model = get_peft_model(self.perceptual_model, lora_config)

        logging.info(f'Loading adapter weights from {checkpoint_root}')
        self.perceptual_model = PeftModel.from_pretrained(self.perceptual_model.base_model.model, checkpoint_root).to(self.device)
    else:
        logging.info(f'Loading entire model from {checkpoint_root}')
        sd = torch.load(os.path.join(checkpoint_root, f'epoch={epoch_load:02d}.ckpt'))['state_dict']
        self.load_state_dict(sd, strict=True)
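
A note on how this patch hangs together (my reading of it, not an official description): the get_peft_model() call exists only so that self.perceptual_model gains the .base_model.model attribute that the following PeftModel.from_pretrained(...) line expects; the trained adapter weights themselves are still read from checkpoint_root. The method is called the same way as before, for example from eval_percep.py:

```python
# as in the traceback above; args.eval_checkpoint points at the adapter directory
model.load_lora_weights(args.eval_checkpoint)
```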

@atharvadeore999 (Author)

Hey @stephanie-fu, can you please tell me how to solve the above errors without adding any extra code?
