Hey @NilBiescas! I think this is a good idea, since it might be surprising to find that the base model cannot be trained after passing it through prepare_model_for_kbit_training.
In addition, it might be noteworthy that this is not the only place where the requires_grad flag is touched. Currently, there is a case where a forward hook is registered on the input embeddings which always forces requires_grad to True on their outputs.
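For context, that hook follows a common pattern. A minimal self-contained sketch of roughly what it does (a paraphrase, not the exact library code; make_inputs_require_grad and the tiny embedding layer are just illustrative stand-ins):

```python
import torch
from torch import nn

def make_inputs_require_grad(module, input, output):
    # Force the module's output to require gradients, e.g. so gradient
    # checkpointing can still backpropagate when the weights are frozen.
    output.requires_grad_(True)

# Illustrative stand-in for a model's input embedding layer:
embeddings = nn.Embedding(10, 4)
embeddings.weight.requires_grad_(False)  # frozen, as after k-bit preparation
embeddings.register_forward_hook(make_inputs_require_grad)

out = embeddings(torch.tensor([1, 2, 3]))
print(out.requires_grad)  # True, despite the frozen weights
```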
Feature request
Add a note to the docstring of prepare_model_for_kbit_training informing users that it sets requires_grad to False on all of the base model's parameters.
Motivation
As this function is used right before training, it would be good to know that it actually freezes the whole base model.
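To illustrate the surprise, a minimal sketch of the behavior (the checkpoint name and the 8-bit config are just example assumptions; running it needs bitsandbytes and a GPU):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Example checkpoint; any bitsandbytes-quantized causal LM would do.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Every base model parameter is now frozen, which is exactly the
# behavior the docstring should mention.
assert all(not p.requires_grad for p in model.parameters())
```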
Your contribution
I could add a line to the docstring noting that the function freezes the base model.