Dear authors,
I really appreciate your work, but I have a question that I hope you can help with.
In ROME E.5, you wrote: "We perform the intervention at layer 18. As Figure 1k shows, this is the center of causal effect in MLP layers, and as Figure 3 shows, layer 18 is approximately when MLP outputs begin to switch from acting as keys to values."
However, in MEMIT, you wrote: "at layers where the gap is largest, the role of the MLP computation is important. We select the layers where the gap is largest as the range R to use for the intervention done by MEMIT"
Layer 18 clearly doesn't have the largest gap, so why did you choose it as the key layer?
Is there something I'm missing?
Thanks!