Distributing the update across multiple layer #2

Open
YoadTew opened this issue Dec 7, 2022 · 1 comment

Comments


YoadTew commented Dec 7, 2022

Hey,
Thanks for sharing your work!
I have a question about how you chose to spread the residual across the remaining layers at each update step (Eq. 20).
You chose the updated values as:
M' = M + residual / (L - l + 1)
claiming it spreads the residual equally across the updated layers, but actually, if there are 4 updated layers:
the first layer will provide 1/4 of the residual,
the second layer will provide 1/12 (=1/3 - 1/4) of the residual,
the third layer will provide 1/6 (=1/2-1/3) of the residual,
and the fourth layer will provide 1/2 (=1-1/2) of the residual.
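
The arithmetic above can be checked with a minimal sketch. It assumes the reading used in this comment: the residual is held fixed at its initial value, and the update at step `l` brings the cumulative shift to `residual / (L - l + 1)`. The function name is hypothetical and is not taken from the MEMIT code.

```python
from fractions import Fraction


def shares_fixed_residual(num_layers):
    """Per-layer share of the residual when the residual is never recomputed.

    Assumption (the reading in this comment, not the paper's statement):
    after the update at step l, the cumulative shift equals
    residual / (L - l + 1), with the residual normalized to 1.
    """
    shares = []
    previous_cumulative = Fraction(0)
    for step in range(1, num_layers + 1):
        remaining = num_layers - step + 1  # plays the role of L - l + 1
        cumulative = Fraction(1, remaining)
        shares.append(cumulative - previous_cumulative)
        previous_cumulative = cumulative
    return shares


shares = shares_fixed_residual(4)
print(shares)  # shares of 1/4, 1/12, 1/6 and 1/2, in that order
```

The four shares sum to 1, so the whole residual is covered, but most of it lands on the last layer.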

Shouldn't the correct update be:
M' = M + residual * (l - first_edited_layer + 1) / (L - first_edited_layer + 1)?
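
As a rough check of this proposal, here is a small sketch under the same assumption of a fixed, normalized residual. The function name and the layer range 5 to 8 are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def shares_proposed_schedule(first_edited_layer, last_layer):
    """Per-layer share when the cumulative shift at layer l is
    residual * (l - first_edited_layer + 1) / (L - first_edited_layer + 1).

    The residual is normalized to 1 and assumed fixed (not recomputed).
    """
    num_layers = last_layer - first_edited_layer + 1
    shares = []
    previous_cumulative = Fraction(0)
    for layer in range(first_edited_layer, last_layer + 1):
        cumulative = Fraction(layer - first_edited_layer + 1, num_layers)
        shares.append(cumulative - previous_cumulative)
        previous_cumulative = cumulative
    return shares


proposed = shares_proposed_schedule(5, 8)  # hypothetical layer range
print(proposed)  # every layer contributes 1/4
```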

Thanks

Owner

kmeng01 commented Jan 13, 2023

Hi @YoadTew, great question! I think it comes down to a notational clarification.

In Equation 20, we write $m^l_i = W_{out} k_i^l + r_i^l$ where
$$r_i^l = \frac{z_i - h_i^L}{L-l+1}.$$

Critically, $r_i^l$ is re-evaluated at each layer $l$, since the value of $h_i^L$ is affected by every layer update. We perform these updates iteratively because MEMIT updates use error minimization, and thus the post-update residuals may not match their targets exactly. Also note that the outputs of later modules $m_i^{l'}$, $l' > l$, shift when $m_i^l$ is updated, which introduces additional error.

Modulo these errors, the scheme should give us even spreading. Let me know if you have any further questions!
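
To illustrate the even spreading, here is a minimal idealized sketch. It assumes that the gap $z_i - h_i^L$ is normalized to 1, that each layer's update shifts $h_i^L$ by a fixed fraction `efficiency` of its target $r_i^l$, and that the gap is recomputed before every step. The `efficiency` parameter and the function name are hypothetical stand-ins for the errors mentioned above, not part of the MEMIT code.

```python
from fractions import Fraction


def shares_reevaluated_residual(num_layers, efficiency=Fraction(1)):
    """Per-layer shift of h^L when the residual is recomputed at every step.

    gap models z_i - h_i^L (normalized to 1 at the start).
    efficiency = 1 means each update hits its target exactly; values
    below 1 crudely model imperfect error minimization (an assumption).
    """
    gap = Fraction(1)
    shares = []
    for step in range(1, num_layers + 1):
        remaining = num_layers - step + 1  # L - l + 1
        target = gap / remaining           # r^l, using the current gap
        achieved = efficiency * target
        shares.append(achieved)
        gap -= achieved                    # h^L moves, so the gap shrinks
    return shares


even = shares_reevaluated_residual(4)
print(even)  # with exact updates, every layer contributes 1/4
```

Passing an `efficiency` below 1 makes the shares drift away from equal, which is one way to picture the "modulo these errors" caveat.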
