
Additional Information to prepare_model_for_kbit_training #2299

Open
NilBiescas opened this issue Dec 27, 2024 · 1 comment

@NilBiescas

Feature request

Add a note in the docstring of prepare_model_for_kbit_training to inform users that it sets requires_grad to False on all base model parameters.
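The freezing behavior the docstring should mention can be illustrated with a minimal sketch. This is a simplified stand-in for what prepare_model_for_kbit_training does to base model parameters, not PEFT's actual implementation; the class and function names here are hypothetical.

```python
# Hypothetical minimal sketch of the parameter-freezing step described
# in this issue. Real prepare_model_for_kbit_training also handles
# quantization-specific setup, which is omitted here.

class Param:
    def __init__(self):
        self.requires_grad = True  # PyTorch parameters default to trainable


class DummyModel:
    def __init__(self):
        self._params = [Param() for _ in range(3)]

    def parameters(self):
        return iter(self._params)


def prepare_model_for_kbit_training_sketch(model):
    # Freeze every base model parameter: this is the behavior
    # the requested docstring note should call out.
    for param in model.parameters():
        param.requires_grad = False
    return model


model = prepare_model_for_kbit_training_sketch(DummyModel())
print(all(not p.requires_grad for p in model.parameters()))  # → True
```

After this call, only adapter parameters added later (e.g. LoRA layers) would be trainable, which is why the frozen base model can surprise users.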

Motivation

As this function is used before training, it would be useful to know that it actually freezes the entire base model.

Your contribution

I could add a line commenting that the function freezes the base model.

@githubnemo
Collaborator

Hey @NilBiescas! I think this is a good idea since it might be surprising to find that the base model cannot be trained after using it with prepare_model_for_kbit_training.

In addition, it might be noteworthy that this is not the only place where the requires_grad flag is touched. Currently there is a case where a forward hook is registered on the input embeddings, which always forces requires_grad to True on their outputs.
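The forward-hook pattern mentioned above can be sketched like this. This is an illustrative mock of the mechanism, not PEFT's actual code; the class names and the hook function are assumptions, though the hook signature mirrors PyTorch's `register_forward_hook` convention.

```python
# Hedged sketch of a forward hook that forces requires_grad back to
# True on the input embeddings' output, as described in this comment.
# All names here are illustrative, not taken from PEFT.

class Tensor:
    def __init__(self):
        self.requires_grad = False

    def requires_grad_(self, flag=True):
        self.requires_grad = flag
        return self


class Embedding:
    def __init__(self):
        self._hooks = []

    def register_forward_hook(self, hook):
        # Mirrors torch.nn.Module.register_forward_hook:
        # hook(module, inputs, output) runs after forward().
        self._hooks.append(hook)

    def forward(self, x):
        out = Tensor()
        for hook in self._hooks:
            hook(self, (x,), out)
        return out


def make_inputs_require_grad(module, inputs, output):
    # Re-enable gradients on the embedding output, even though the
    # embedding weights themselves were frozen earlier.
    output.requires_grad_(True)


emb = Embedding()
emb.register_forward_hook(make_inputs_require_grad)
out = emb.forward(None)
print(out.requires_grad)  # → True
```

This is why documenting only the freezing step would be incomplete: the hook re-enables gradient flow through the embedding outputs, so the two effects on requires_grad are worth describing together.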
