Laplace Subnetwork with timm library model #128
Comments
If it's a BackPACK issue, then maybe switching the backend will help. Can you try the following?
from laplace import Laplace
from laplace.curvature import AsdlGGN
la = Laplace(model, ..., backend=AsdlGGN)
Was just about to suggest the same thing. However, if you want to use
Thanks for the recommendation. I believe I installed as you suggested @runame, however, I get
Ah right, can you try to remove the argument
Edit: I also just fixed this on the
Thank you for the help. For some models I do get the following PyTorch warning, and I am not sure what it implies: From this trace:
Hi,
I would like to apply the Laplace subnetwork approach to a timm library model (a standard resnet18). I think the problem I am encountering is not unique to timm models per se, but related to in-place operations. I have made a small reproducible example in this Google Colab. The error
Output 0 of BackwardHookFunctionBackward is a view and is being modified inplace. This view was created inside a custom Function (or because an input was returned as-is) and the autograd logic to handle view+inplace would override the custom backward associated with the custom Function, leading to incorrect gradients. This behavior is forbidden. You can fix this by cloning the output of the custom Function.
only occurs when using the subnetwork approach, and not with the default Laplace parameters. I have also tried changing the timm resnet activation functions so that they are not in-place, but maybe the error is also related to skip connections? Even though the error does not occur inside the Laplace library immediately, I was wondering if you had any suggestions or pointers to make this approach work.
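For anyone reproducing this, here is a minimal sketch of one way to switch off the `inplace` flag on activation modules before building the Laplace object. The helper name `disable_inplace` is hypothetical and is not part of timm or laplace; it assumes only that the model exposes `.modules()` like a `torch.nn.Module`, and that in-place layers such as `nn.ReLU` store the flag as a plain `inplace` attribute.

```python
from typing import Any


def disable_inplace(model: Any) -> int:
    """Set `inplace=False` on every submodule that has a truthy `inplace` flag.

    `model` is assumed to behave like a torch.nn.Module, i.e. to provide a
    `.modules()` iterator over itself and all of its submodules.
    Returns the number of submodules whose flag was changed.
    """
    changed = 0
    for module in model.modules():
        # Layers such as nn.ReLU or nn.Dropout keep the flag as an attribute;
        # modules without it (conv, linear, ...) are skipped by the default.
        if getattr(module, "inplace", False):
            module.inplace = False
            changed += 1
    return changed
```

This only reaches modules that carry an `inplace` attribute. In-place arithmetic written directly inside a `forward` method (for example a residual addition using `+=`) is not affected, which may be why changing the activations alone did not remove the error.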