-
Notifications
You must be signed in to change notification settings - Fork 10.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server : fix format_infill #10724
server : fix format_infill #10724
Conversation
@@ -3509,6 +3516,21 @@ int main(int argc, char ** argv) { | |||
} | |||
data["input_extra"] = input_extra; // default to empty array if it's not exist | |||
|
|||
std::string prompt = json_value(data, "prompt", std::string()); | |||
std::vector<llama_tokens> tokenized_prompts = tokenize_input_prompts(ctx_server.ctx, prompt, true, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably return an error if there is more than 1 resulting prompt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because above we already checked if data["prompt"]
is string, here we can be sure that we only have one single prompt to deal with. Probably an GGML_ASSERT here make more sense?
(The expected behavior of tokenize_input_prompts
is that if prompt is a string, then output vector size is == 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. It's fine as it is.
@ggerganov I added a test using
Feel free to add more complicated test case(s) if you need! |
Ah, the infill endpoint should be used only with the So the tests should be updated to use the |
OK so I've tried the non-instruction model but I think the problem is related to placement of
If the prompt is placed at the beginning, it should work:
We can fix this in another PR, for now I'm gonna comment out the |
FIM should not add instructions such as "Complete this", as can be seen in the technical report. The |
Ok thanks for the clarification. I updated the test to reflect this. The
|
Yes, perfect. The |
* server : fix format_infill * fix * rename * update test * use another model * update test * update test * test_invalid_input_extra_req
Should fix #10691 (comment) , I remove the
format_chat/_infill/_rerank
fromhandle_completions_generic
but forgot to put it back inhandle_infill