I've found what I believe to be a bug in the implementation of the fine-tuning baseline, which would yield incorrect results when the target is longer than one token.
Looking at the code, the fine-tuning baseline appears to obtain the logits on which to backpropagate by calling `model(**inputs)`, where `inputs` contains the prompt with the subject but excludes the target. It then takes the logits associated with the last token of the input and maximises the probability of all the target tokens simultaneously, each treated as a direct continuation of that single last position. This is not the standard fine-tuning behaviour, which would maximise the probability of the first target token as a continuation of the input, then the probability of the second target token as a continuation of the input plus the first target token, and so on.
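To make the difference concrete, here is a minimal sketch of the two loss computations using a toy stand-in for the language model (the embedding, head, and token IDs below are illustrative assumptions, not the repository's actual code):

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
vocab, d = 50, 16

# Toy stand-in for a language model: maps token ids to per-position logits.
emb = torch.nn.Embedding(vocab, d)
head = torch.nn.Linear(d, vocab)

def lm_logits(ids):
    # ids: (seq,) -> (seq, vocab); stands in for model(**inputs).logits
    return head(emb(ids))

prompt = torch.tensor([3, 7, 11])  # hypothetical tokenised prompt (with subject)
target = torch.tensor([20, 21])    # hypothetical multi-token target

# Behaviour as described in the issue: run the model on the prompt only,
# then score *every* target token against the single distribution at the
# last prompt position, as if they were simultaneous continuations.
last = lm_logits(prompt)[-1]                                # (vocab,)
simultaneous_loss = F.cross_entropy(last.expand(len(target), -1), target)

# Standard fine-tuning: teacher forcing over prompt + target, so the k-th
# target token is predicted from everything that precedes it.
full = lm_logits(torch.cat([prompt, target]))               # (seq, vocab)
# positions len(prompt)-1 .. len(full)-2 predict the target tokens
pred = full[len(prompt) - 1 : len(full) - 1]                # (len(target), vocab)
teacher_forced_loss = F.cross_entropy(pred, target)
```

Note that the loss on the first target token is identical in both variants; the two only diverge from the second target token onwards, which is why a one-token target masks the problem.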
Thank you for your assistance; I look forward to hearing back and to understanding whether I may have misunderstood an aspect of the implementation.