Fix kimi-k2 tool call #996
Conversation
|
yeah I saw the mainline PR has changed another file (common/chat-parser-xml-toolcall.cpp). Otherwise, the rest was merged in another PR yesterday or the day before |
|
LGTM |
|
That is, it introduced the ':' at the beginning of the tool call? |
|
Actually, I just discovered that ':' is a null operator in bash. It always results in exit code 0. So if the model does something like : && echo ok, this is finicky but perfectly fine. The only problem is if the model just outputs the ':' and nothing else. That would be an error. But you didn't provide any prompt, so we can't debug. |
For starters, I don't see the exact logs. What tool call is the LLM trying to perform? The : && ls -la? That's a perfectly fine tool call. You can see it for yourself if you run it in bash. |
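For anyone following along, a minimal bash session (illustration only, not taken from the thread) showing why the ':' prefix is harmless on its own: it is the shell's null command, so it exits 0 and lets the right-hand side of && run.

```bash
# ':' is bash's built-in null command: it does nothing and always exits 0.
:
echo $?          # prints 0

# Because ':' succeeds, the right-hand side of '&&' always runs.
: && echo ok     # prints "ok"
: && ls -la      # equivalent in effect to plain 'ls -la'
```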
|
@calvin2021y tends to claim that something does not work and then not respond to requests for clarification. If there aren't more details by tomorrow morning I'll just delete the comment. In the meantime you can safely ignore it. |
|
Loading/unloading it takes more than 10 minutes, and testing takes even more time. It is also hard for me to test since I am not sure how to record the request body Zed sends to ik_llama.cpp. |
You do know about the --mlock option, right? |
Try this: |
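The command originally attached to "Try this:" was not captured in this extract. Purely as a sketch of the --mlock idea mentioned above, a launch might look like the following; the binary name, model path, and flags are placeholders, not the reviewer's actual suggestion.

```bash
# Sketch only: placeholder binary name, model path, and flags.
# --mlock asks the OS to keep the mapped model weights resident in RAM,
# avoiding long reload times between runs.
./build/bin/llama-server \
  -m /models/Kimi-K2-Thinking-Q4_X.gguf \
  --mlock
```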
|
I just tested @ubergarm's Kimi-K2-Thinking Q4_X on the latest main branch using a fresh install of Jan.AI with the MCP tools "sequential-thinking" (built-in) and Tavily (search, extract, crawl) enabled, and it was able to reason, search the web multiple times, extract relevant content from the news site, and return the results. It was also able to call the tools in subsequent requests, and the responses did not include the extra ':' prefix.

I am running ik-llama with the template file taken from the llama.cpp repo; the original one from Moonshot AI mentioned in @ubergarm's model README did not work.

I am not at all discounting @calvin2021y's or @Lissanro's findings in #955; I'm just adding another data point. Many thanks to everyone involved. |
|
Oh, now I see what you were talking about. I spent some time with various tool calls and K2-Thinking, and I can say that yes, not only does the problem of empty tool calls exist, but sometimes instead of the ':' bash operator it uses the '>' redirection, which empties the files it tries to write into, which looks like madness. Sometimes it's doing empty tool calls, etc. At this point I am not sure at all whether it's a problem of the ik_llama.cpp implementation or a problem of the model itself. Do we have a baseline anywhere? |
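For readers unfamiliar with why the stray '>' is so destructive, a small illustration (not from the thread itself): a bare redirection with no command truncates the target file to zero bytes.

```bash
# Create a file with some content.
echo "important data" > notes.txt
wc -c notes.txt      # 15 notes.txt

# A bare redirection, as the model sometimes emits, truncates the file.
> notes.txt
wc -c notes.txt      # 0 notes.txt
```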
|
Oh, this is crazy. What stupid tool calls is it doing right now? [EDIT]: sometimes instead of '> ' it's doing ': && '. WTF is that? This is embarrassing. |
|
Uh oh! You were completely right! I do see the stupid empty tool calls. Example: |
|
Okay. I suggest we do the following. There are two approaches.
What should we pursue? |


Port from ggml-org/llama.cpp#17376
Closes #955 (comment)