
How to apply it to new models #165

Closed
JX-Zhang98 opened this issue Jan 29, 2024 · 6 comments
JX-Zhang98 commented Jan 29, 2024

Hello, I have succeeded in editing models from the llama family, including llama-7b and ziya-llama-13B.
But editing chatglm3 with the config from ROME/chatglm2-6b.yaml failed. Even when I increase the number of editing steps to 200, I cannot effectively reduce the loss or improve the editing accuracy.
It shows:

loss 19.301 = 19.301 + 0.0 + 0.0 avg prob of [ 纽约] 1.9118447269761418e-08
loss 18.104 = 17.835 + 0.071 + 0.198 avg prob of [ 纽约] 7.327661677436481e-08
loss 15.357 = 15.124 + 0.01 + 0.222 avg prob of [ 纽约] 2.2358667592925485e-06
loss 12.684 = 12.451 + 0.011 + 0.222 avg prob of [ 纽约] 2.6959438400808722e-05
loss 11.079 = 10.846 + 0.011 + 0.222 avg prob of [ 纽约] 6.573031714651734e-05
loss 9.582 = 9.349 + 0.011 + 0.222 avg prob of [ 纽约] 0.0002662458864506334
loss 8.234 = 8.001 + 0.011 + 0.222 avg prob of [ 纽约] 0.0014487882144749165
loss 7.08 = 6.847 + 0.011 + 0.222 avg prob of [ 纽约] 0.005161626730114222
loss 6.194 = 5.962 + 0.01 + 0.222 avg prob of [ 纽约] 0.010020474903285503
loss 5.513 = 5.281 + 0.01 + 0.222 avg prob of [ 纽约] 0.016288358718156815
loss 5.183 = 4.951 + 0.01 + 0.222 avg prob of [ 纽约] 0.018959175795316696
...
loss 1.978 = 1.747 + 0.009 + 0.222 avg prob of [ 纽约] 0.18422633409500122
loss 1.996 = 1.765 + 0.009 + 0.222 avg prob of [ 纽约] 0.18353217840194702
loss 1.991 = 1.76 + 0.009 + 0.222 avg prob of [ 纽约] 0.1818918138742447
loss 2.032 = 1.801 + 0.009 + 0.222 avg prob of [ 纽约] 0.1784791201353073
loss 2.24 = 2.009 + 0.008 + 0.222 avg prob of [ 纽约] 0.1463432013988495
loss 8.215 = 7.983 + 0.009 + 0.222 avg prob of [ 纽约] 0.008609873242676258
loss 9.747 = 9.515 + 0.01 + 0.222 avg prob of [ 纽约] 0.00037625530967488885
loss 7.657 = 7.425 + 0.01 + 0.222 avg prob of [ 纽约] 0.0019925576634705067
loss 6.161 = 5.929 + 0.009 + 0.222 avg prob of [ 纽约] 0.010164782404899597

So I wonder how to apply this framework to another model, and how to set the layers, v_loss_layers, v_loss_layer, and other options in the config?
Thank you.

@zxlzr zxlzr added the question Further information is requested label Jan 29, 2024
@littlefive5 (Collaborator)

ROME is a locate-then-edit method.
You can first run causal analysis on the new model to determine which layers may store the knowledge. Alternatively, you can simply treat layers as a hyper-parameter and try different values.
Then the layer_stats for the new model need to be computed; the code does this automatically, so it should not be an issue.
v_loss_layer is usually the last layer of the model.
You can also try changing v_lr and v_weight_decay.
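For reference, a minimal config sketch in the style of the existing ROME/chatglm2-6b.yaml. Only the option names mentioned in this thread (layers, v_loss_layer, v_lr, v_weight_decay) are taken from the discussion; the model name and all concrete values below are placeholders to be tuned per model, not verified settings for chatglm3:

```yaml
# Sketch only: field values are placeholders, not tested chatglm3 settings.
alg_name: ROME
model_name: chatglm3-6b   # assumption: replace with the target checkpoint
layers: [5]               # candidate edit layer; sweep several values
v_loss_layer: 27          # usually the index of the model's last layer
v_lr: 0.5                 # adjust if the loss plateaus or diverges
v_weight_decay: 0.5       # adjust together with v_lr
```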


zxlzr commented Jan 30, 2024

hi, do you have any further questions?

@zxlzr zxlzr closed this as completed Jan 31, 2024
@JX-Zhang98 (Author)

It sounds like challenging work.
I will give it a try and come back for more advice if I run into any difficulties.
Thank you very much.

@JX-Zhang98 (Author)

Is there any practical information that can be used as a reference? I ran into this problem kmeng01/rome#43, but it seems no developer has addressed it.
Thank you.

@zxlzr zxlzr reopened this Feb 23, 2024
@JX-Zhang98 (Author)

ROME is a locate-then-edit method. You can first run causal analysis on the new model to determine which layers may store the knowledge. Alternatively, you can simply treat layers as a hyper-parameter and try different values. Then the layer_stats for the new model need to be computed; the code does this automatically, so it should not be an issue. v_loss_layer is usually the last layer of the model. You can also try changing v_lr and v_weight_decay.

I also tried treating layers as a hyper-parameter when editing chatglm3, but none of the results were correct.
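The layer sweep suggested above can be sketched as a simple loop. Note that `apply_edit` below is a hypothetical stand-in for whatever entry point the editing framework actually exposes (it is not a real EasyEdit or ROME function); the idea is just to run the edit once per candidate layer and keep the one with the highest final average target probability, the same quantity printed in the loss log above:

```python
def sweep_layers(apply_edit, candidate_layers):
    """Run the edit once per candidate layer and return the layer
    whose run achieved the highest average target probability.

    apply_edit: callable taking a layer index and returning the final
    average probability of the target token (higher is better).
    """
    results = {}
    for layer in candidate_layers:
        results[layer] = apply_edit(layer)
    # Pick the layer with the best final target probability.
    best = max(results, key=results.get)
    return best, results

# Toy stand-in for a real edit run: pretend mid-network layers edit best.
demo_probs = {3: 0.02, 5: 0.41, 7: 0.18}
best, scores = sweep_layers(lambda layer: demo_probs[layer], [3, 5, 7])
print(best)  # 5
```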

@littlefive5 (Collaborator)

Sorry for the slow response; I'm not sure whether you still face this problem. However, we are busy at the moment and cannot support chatglm3 right now. We will support it in the future and let you know.
