Hey @NilBiescas! I think this is a good idea, since it might be surprising to find that the base model cannot be trained after passing it through prepare_model_for_kbit_training.
In addition, it might be noteworthy that this is not the only place where the requires_grad flag is touched. Currently, there is a case where a forward hook is registered on the input embeddings which always forces requires_grad to True on their outputs.
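For context, that hook follows a common pattern. A minimal self-contained sketch of roughly what it does (a paraphrase, not the exact library code; make_inputs_require_grad and the tiny embedding layer are just illustrative stand-ins):

```python
import torch
from torch import nn

def make_inputs_require_grad(module, input, output):
    # Force the module's output to require gradients, e.g. so gradient
    # checkpointing can still backpropagate when the weights are frozen.
    output.requires_grad_(True)

# Illustrative stand-in for a model's input embedding layer:
embeddings = nn.Embedding(10, 4)
embeddings.weight.requires_grad_(False)  # frozen, as after k-bit preparation
embeddings.register_forward_hook(make_inputs_require_grad)

out = embeddings(torch.tensor([1, 2, 3]))
print(out.requires_grad)  # True, despite the frozen weights
```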
Feature request
Add a note to the docstring of prepare_model_for_kbit_training informing users that it sets requires_grad to False on all of the base model's parameters.
Motivation
As this function is used right before training, it would be good to know that it actually freezes the whole base model.
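To illustrate the surprise, a minimal sketch of the behavior (the checkpoint name and the 8-bit config are just example assumptions; running it needs bitsandbytes and a GPU):

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import prepare_model_for_kbit_training

# Example checkpoint; any bitsandbytes-quantized causal LM would do.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Every base model parameter is now frozen, which is exactly the
# behavior the docstring should mention.
assert all(not p.requires_grad for p in model.parameters())
```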
Your contribution
I could add a line to the docstring noting that the function freezes the base model.