I tried to replay the causal tracing on gpt-xl and llama-7b, but it failed in several ways.

It failed on `predict_token`:
```python
def predict_token(mt, prompts, return_p=False):
    inp = make_inputs(mt.tokenizer, prompts)
    preds, p = predict_from_input(mt.model, inp)
    result = [mt.tokenizer.decode(c) for c in preds]
    if return_p:
        result = (result, p)
    return result


def predict_from_input(model, inp):
    # failed on this lookup
    out = model(**inp)["logits"]
    probs = torch.softmax(out[:, -1], dim=1)
    p, preds = torch.max(probs, dim=1)
    return preds, p
```
When I use the gpt-xl and llama-7b models, the keys of `model(**inp)` are `['last_hidden_state', 'past_key_values']`, so the `"logits"` lookup raises a `KeyError`.
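For reference, this symptom usually means the checkpoint was loaded without a language-modeling head: in `transformers`, a bare `AutoModel` returns `last_hidden_state`, while `AutoModelForCausalLM` returns `logits`. A minimal, torch-free sketch of the failure mode (the stand-in functions below are illustrative, not real model code):

```python
# Stand-ins mimicking the two Hugging Face output layouts (illustrative only).
def bare_model(**inp):
    # What an AutoModel-style model returns: hidden states, no LM head.
    return {"last_hidden_state": [[0.0]], "past_key_values": None}

def lm_model(**inp):
    # What an AutoModelForCausalLM-style model returns: logits.
    return {"logits": [[0.0]]}

def get_logits(model, inp):
    # Same lookup as in predict_from_input(); fails when the head is missing.
    return model(**inp)["logits"]

print(get_logits(lm_model, {}))    # [[0.0]]
try:
    get_logits(bare_model, {})
except KeyError as err:
    print("KeyError:", err)        # KeyError: 'logits'
```

If the real model was loaded with `AutoModel.from_pretrained(...)`, switching to `AutoModelForCausalLM.from_pretrained(...)` should restore the `logits` key.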
If I use chatglm3-6b, the prediction step works, but it fails later in `collect_embedding_std`. `layername()` returns `"transformer.wte"` here:
```python
def layername(model, num, kind=None):
    if hasattr(model, "transformer"):
        if kind == "embed":
            return "transformer.wte"
        return f'transformer.h.{num}{"" if kind is None else "." + kind}'
    if hasattr(model, "gpt_neox"):
        if kind == "embed":
            return "gpt_neox.embed_in"
        if kind == "attn":
            kind = "attention"
        return f'gpt_neox.layers.{num}{"" if kind is None else "." + kind}'
    assert False, "unknown transformer structure"
```
But this model has no module named `'transformer.wte'`, so `get_module()` raised a `LookupError` here:
```python
def get_module(model, name):
    """
    Finds the named module within the given model.
    """
    for n, m in model.named_modules():
        if n == name:
            return m
    raise LookupError(name)
```
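When a hard-coded name like `"transformer.wte"` is missing, listing the model's module names shows what `layername()` should return instead. A torch-free sketch of that check, using a stand-in that exposes `named_modules()` like `torch.nn.Module` (the module names below are illustrative and not verified against chatglm3-6b):

```python
def get_module(model, name):
    for n, m in model.named_modules():
        if n == name:
            return m
    raise LookupError(name)

class Stub:
    """Minimal stand-in exposing named_modules() like torch.nn.Module."""
    def __init__(self, names):
        self.names = names
    def named_modules(self):
        # Real modules yield (name, module); yielding the name twice
        # keeps this sketch dependency-free.
        return ((n, n) for n in self.names)

model = Stub(["", "transformer", "transformer.embedding",
              "transformer.embedding.word_embeddings"])

# The hard-coded GPT-2 name is absent -> LookupError, matching the report.
try:
    get_module(model, "transformer.wte")
except LookupError as err:
    print("missing:", err)

# Listing module names reveals a candidate for layername() to return.
embeds = [n for n, _ in model.named_modules() if "embed" in n]
print(embeds)
```

On a real model, `[n for n, _ in model.named_modules()]` is the quickest way to see which embedding name `layername()` should map to.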
In addition, gpt-xl and llama-7b fail on the assert in `layername()`.
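One possible way to avoid the assert is to add a branch for LLaMA-style checkpoints. The attribute names below assume the Hugging Face `LlamaForCausalLM` layout (`model.embed_tokens`, `model.layers.N.self_attn`) and should be double-checked against the loaded model; this is a sketch, not a verified fix:

```python
def layername(model, num, kind=None):
    if hasattr(model, "transformer"):       # GPT-2-style (unchanged)
        if kind == "embed":
            return "transformer.wte"
        return f'transformer.h.{num}{"" if kind is None else "." + kind}'
    if hasattr(model, "model"):             # LLaMA-style branch (assumed layout)
        if kind == "embed":
            return "model.embed_tokens"
        if kind == "attn":
            kind = "self_attn"              # LLaMA names its attention self_attn
        return f'model.layers.{num}{"" if kind is None else "." + kind}'
    assert False, "unknown transformer structure"

# Dummy object standing in for a loaded LlamaForCausalLM.
class _Dummy:
    pass

llama = _Dummy()
llama.model = _Dummy()
print(layername(llama, 5, "attn"))   # model.layers.5.self_attn
print(layername(llama, 0, "embed"))  # model.embed_tokens
```

chatglm3-6b would need its own branch in the same style, since it has a `transformer` attribute but not the GPT-2 module names.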
I would like to know what is causing these failures, and whether there is a way to edit these models successfully.

Thank you