Neural Tangent Kernel Adaptive Loss #501
I have been going through the code in `discretize_inner_functions`, and there seems to be some additional code for each adaptive loss method, specifically of the following form:
For the implementation of the NTK loss, is it only the struct that needs to be defined, or also some of how the loss is to be propagated? Or am I in fact mistaken, and that code is purely for logging, with the calculations done entirely in the struct through lines of the following form? I am going to use the adaptive losses that are already defined as a starting point, but the only approach that seems possible to me is to define another chain/function that computes the NTK loss and is called upon by the struct. Let me know whether I am approaching this from the right perspective.
The authors used Jacobians for predicting the weights (Fig. attached), but as you can see from the algorithm, we need only the trace of `K`.
Yeah, we only need the diagonal, and the formula for the diagonal is much easier since it's just the square of the derivative. There's no need to compute the whole matrix.
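A minimal sketch of that diagonal computation, assuming a scalar-output network written as `model(x, θ)` with a flat parameter vector `θ` (both names are hypothetical stand-ins, not the package API), using Zygote for the parameter gradient:

```julia
using Zygote

# Diagonal of the NTK at points xs: (K)_ii = ⟨∂u(x_i;θ)/∂θ, ∂u(x_i;θ)/∂θ⟩,
# i.e. the squared norm of the parameter gradient at each point.
# No off-diagonal entries (and hence no N×N matrix) are ever formed.
function ntk_diagonal(model, θ, xs)
    map(xs) do x
        g = Zygote.gradient(p -> model(x, p), θ)[1]  # ∂u(x;θ)/∂θ
        sum(abs2, g)                                 # square of the derivative, summed over θ
    end
end

# The trace of a kernel block is then just the sum of its diagonal:
trace_K(model, θ, xs) = sum(ntk_diagonal(model, θ, xs))
```

For example, with `model = (x, θ) -> θ[1] * sin(x) + θ[2]`, `θ = [1.0, 2.0]`, and `xs = 0.0:0.1:1.0`, `trace_K` returns the trace without ever materializing the full kernel.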
Hi, I want to work on this, but I'm checking whether it is still open or whether anyone is already working on it.
No one is working on it. |
Implement the Neural Tangent Kernel adaptive loss method proposed in the paper "When and Why PINNs Fail to Train: A Neural Tangent Kernel Perspective" by Sifan Wang, Xinling Yu, and Paris Perdikaris. There is a GitHub repo that should guide the implementation.
The algorithm is Algorithm 1 in the paper. It should be implemented as a concrete subtype of `AbstractAdaptiveLoss` so that it fits within our pre-existing code generation infrastructure in the `discretize_inner_functions` function. The definition of the `K` kernels is in Lemma 3.1.
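For reference, here is how I read the kernel entries of Lemma 3.1 and the resulting trace-based weights of Algorithm 1; this is a paraphrase, so double-check the exact notation against the paper:

```latex
% Kernel entries: inner products of parameter gradients of the network output u
% (at boundary/data points x_u) and of the PDE residual r (at collocation points x_r)
(K_{uu})_{ij} = \left\langle \frac{\partial u(x_u^i;\theta)}{\partial\theta},
                             \frac{\partial u(x_u^j;\theta)}{\partial\theta} \right\rangle, \qquad
(K_{rr})_{ij} = \left\langle \frac{\partial r(x_r^i;\theta)}{\partial\theta},
                             \frac{\partial r(x_r^j;\theta)}{\partial\theta} \right\rangle

% Algorithm 1 then reweights the loss terms by trace ratios:
\lambda_u = \frac{\operatorname{Tr}(K_{uu}) + \operatorname{Tr}(K_{rr})}{\operatorname{Tr}(K_{uu})}, \qquad
\lambda_r = \frac{\operatorname{Tr}(K_{uu}) + \operatorname{Tr}(K_{rr})}{\operatorname{Tr}(K_{rr})}
```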
This paper is slightly harder than some of the other adaptive loss methods to implement in our system, but not that much harder. The definition of
K
requires a selection of points from each domain, and so that could be generated via a grid or stochastic or quasi-random. The implementation provided on their github seems to have used a Grid strategy, but I don't see why that must always be the case for this quantity (it seems arbitrary). Thus, most of the difficulty in implementation is just figuring out the best way to have this own type maintain its own samples that are possibly different from the main PDE domain samplers for the toplevel PINN, and then calculating the kernel quantities using those points and the internal generated PDE functions. There is a ton of interesting theory in this paper but the implementing the algorithm mainly relies on understanding how to compute theK
kernels.The text was updated successfully, but these errors were encountered:
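As a starting point, a minimal sketch of what such a subtype might look like; only `AbstractAdaptiveLoss` is taken from the issue text, and every other name and field below is a hypothetical illustration of "maintain its own samples", not the actual package API:

```julia
# `AbstractAdaptiveLoss` already exists in the package; stubbed here only so
# the sketch is self-contained.
abstract type AbstractAdaptiveLoss end

struct NeuralTangentKernelLoss{T <: Real, S} <: AbstractAdaptiveLoss
    reweight_every::Int          # recompute the trace-based weights every N iterations
    points::Int                  # number of kernel sample points per domain
    kernel_strategy::S           # this type's own sampler (grid / stochastic / quasi-random),
                                 # independent of the top-level PINN's training strategy
    pde_loss_weights::Vector{T}  # current λ_r-style weights, updated in place
    bc_loss_weights::Vector{T}   # current λ_u-style weights, updated in place
end
```

The reweighting step would then combine this type's own sample points with the internally generated PDE functions to form the diagonal and trace quantities sketched earlier in the thread.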