Function calling for OpenAI backend #573
base: main
Conversation
Force-pushed 0b7d7f1 to fbe44c2
Thanks for the PR! I left a few comments. The review is still in progress.
python/sglang/lang/interpreter.py (Outdated)

```diff
@@ -23,6 +23,7 @@
     SglFunction,
     SglGen,
     SglImage,
+    SglFuncCall,
```
Use dictionary (alphabetical) order for the imports instead.
Removing this, given we are moving it to be part of SglGen.
```python
def multi_turn_question(s, question_1, functions=[]):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question_1)
    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")
```
We may also want to retrieve the results from the function call. Add tests for `state["func_call_1"]` in the function `single()`.
python/sglang/backend/openai.py (Outdated)

```python
# OpenAI models require function call information to be sent to the model
# along with the prompt.
for function_call in s.function_calls:
    prompt.append(function_call)
else:
```
`s.messages_` should be updated after the function call finishes rather than in `generate`, and the append logic should happen in interpreter.py; see `_execute_role_end()` as a reference.
Additionally, changing `prompt` implicitly changes `s.messages_`. This is not safe. Changing `s.messages_` and then setting `prompt = s.messages_` is better.
Restructured the code a bit based on your suggestions (with some minor tweaks, but I can update if you think it's still better to move the function call generation outside of `generate`; we would just have a simpler `generate` call):

Within openai.py:
- `build_function_call_messages()`: a new function that builds function call messages. Since the function signature format is specific to OpenAI models, the logic to parse inputs and produce function call messages is kept in the backend code.
- `generate()`: since `prompt` is local to the `generate()` call, I directly added `function_call_messages` to it so that we can include the function call messages in the current completion call's prompt. The main intuition is to reuse the generate call logic; it also appends only the function call response (`comp`), without intermediate messages, into the final text/messages.

Within interpreter.py:
- Updated the `_execute_gen()` logic to build function call messages if tools are provided, and to handle both parallel and non-parallel function calling by calling `backend.generate` once for models that support parallel function calls, or multiple times when parallel calls are not supported.
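The parallel vs. non-parallel branching described above can be sketched roughly as follows. The names and the per-tool fallback loop are my assumptions for illustration, not the PR's exact code:

```python
def gen_with_tools(backend_generate, messages, tools, tool_choice, parallel_supported):
    """Call the backend once if the model can emit several tool calls in one
    response, otherwise once per candidate tool.  backend_generate is any
    callable (messages, tools, tool_choice) -> list of messages."""
    if parallel_supported:
        return backend_generate(messages, tools, tool_choice)
    out = []
    for tool in tools:
        out.extend(backend_generate(messages, [tool], tool_choice))
    return out

# Fake backend for illustration: returns one tool-call message per tool.
def fake_generate(messages, tools, tool_choice):
    return [{"role": "assistant", "tool": t.__name__} for t in tools]

def get_weather(city): ...
def get_time(city): ...

calls = gen_with_tools(fake_generate, [], [get_weather, get_time], "auto", False)
assert [c["tool"] for c in calls] == ["get_weather", "get_time"]
```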
python/sglang/backend/openai.py
Outdated
"gpt-3.5-turbo-0613", | ||
]: | ||
raise RuntimeError( | ||
"This model currently does not support function calling." |
Keep in mind that the sets of models that support function calling and parallel function calling are different.
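One way to encode that distinction is two separate allow-lists, with parallel function calling as a strict subset. The membership below reflects OpenAI's documentation at the time (the `-1106` models added parallel calling); treat it as illustrative, not exhaustive:

```python
FUNCTION_CALL_MODELS = {
    "gpt-3.5-turbo-0613",
    "gpt-3.5-turbo-1106",
    "gpt-4-0613",
    "gpt-4-1106-preview",
}
# Parallel function calling arrived later, so it is a subset here.
PARALLEL_FUNCTION_CALL_MODELS = {
    "gpt-3.5-turbo-1106",
    "gpt-4-1106-preview",
}

def check_function_call_support(model, parallel=False):
    """Raise early, with a message naming the missing capability."""
    allowed = PARALLEL_FUNCTION_CALL_MODELS if parallel else FUNCTION_CALL_MODELS
    if model not in allowed:
        kind = "parallel function calling" if parallel else "function calling"
        raise RuntimeError(f"Model {model!r} does not support {kind}.")
```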
Thanks for pointing that out! Updated to use different handling logic for each.
python/sglang/backend/openai.py (Outdated)

```python
cur_tool_choice = (
    tool_choice
    if tool_choice in ["auto", "required", "none"]
    else {"type": "function", "function": {"name": tool_choice}}
)
```
In this case, assert that `tool_choice` is among the names of the candidate functions.
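The suggested assertion could look like the following sketch, taking function names from the tools' `__name__` attributes as the diff does elsewhere (the helper name is hypothetical):

```python
def resolve_tool_choice(tool_choice, tools):
    """Pass through the special strings; otherwise require that the
    named function is actually one of the candidate tools."""
    if tool_choice in ("auto", "required", "none"):
        return tool_choice
    names = [tool.__name__ for tool in tools]
    assert tool_choice in names, (
        f"tool_choice {tool_choice!r} is not a candidate function: {names}"
    )
    return {"type": "function", "function": {"name": tool_choice}}

def get_weather(city): ...

assert resolve_tool_choice("auto", [get_weather]) == "auto"
assert resolve_tool_choice("get_weather", [get_weather]) == {
    "type": "function", "function": {"name": "get_weather"}
}
```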
python/sglang/backend/openai.py (Outdated)

```python
tool_calls = response_message.tool_calls
# Check if the model wanted to call a function
ret_messages = []
if tool_calls:
    # Call the function
    # Note: the JSON response may not always be valid; be sure to handle errors
    available_functions = {}
    for tool in tools:
        available_functions[tool.__name__] = tool
    ret_messages.append(response_message)
    # Send the info for each function call and function response to the model
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)
        function_response = function_to_call(**function_args)
        ret_messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": str(function_response),
            }
        )
return ret_messages
```
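The dispatch loop above can be exercised without the OpenAI client by mimicking the response objects with `SimpleNamespace` (a sketch; the real `tool_call` objects come from the `openai` SDK):

```python
import json
from types import SimpleNamespace

def execute_tool_calls(tool_calls, tools):
    """Run each requested function and build the 'tool' role messages,
    mirroring the loop in the diff above."""
    available = {tool.__name__: tool for tool in tools}
    messages = []
    for call in tool_calls:
        fn = available[call.function.name]
        args = json.loads(call.function.arguments)  # may raise on invalid JSON
        messages.append({
            "tool_call_id": call.id,
            "role": "tool",
            "name": call.function.name,
            "content": str(fn(**args)),
        })
    return messages

def add(a, b):
    return a + b

# Fake tool call shaped like the SDK object: .id, .function.name, .function.arguments
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="add", arguments='{"a": 2, "b": 3}'),
)
msgs = execute_tool_calls([fake_call], [add])
assert msgs == [{"tool_call_id": "call_1", "role": "tool",
                 "name": "add", "content": "5"}]
```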
I think it is better to put the logic of the real function call into the interpreter, so that it can be reused when we develop the feature for local models.
And remember to handle the logic of appending `s.messages_` and `s.text_` in the interpreter.
Makes sense; returning just the function call messages here so we can do the real function call separately.
python/sglang/lang/interpreter.py (Outdated)

```diff
@@ -554,6 +560,12 @@ def _execute_select(self, expr: SglSelect):
         self.variable_event[name].set()
         self.text_ += decision

+    def _execute_func_call(self, expr: SglFuncCall):
+        # TODO: Should we clear the previous function call states for the next function call
```
I think yes, by default. Although accumulating functions could be an option.
```python
def multi_turn_question(s, question_1, functions=[]):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question_1)
    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")
```
A design suggestion: it might be better to just have `sgl.gen` with `func_call` as an argument.
I agree, I think it's more straightforward to have it as part of `sgl.gen`. Would it make sense to have something like `sgl.gen("answer_1", max_tokens=256, sgl.func_call(...))`, or simply expose the parameters directly to `sgl.gen`, like `sgl.gen("answer_1", max_tokens=256, tools=[...])`?
Let's simply expose the parameters directly to `sgl.gen`.
Force-pushed e859d3e to edc30d2
Force-pushed 2463404 to d737da5
Force-pushed 5b3918a to 075b053
Force-pushed 7eda0c8 to 41d1f67
Force-pushed e1cbcf5 to 2a01552
Force-pushed f58f983 to 1e978c2
Force-pushed 16b1dbf to f1389dc
Adding skeleton code for function calling with OpenAI models.

Example output (when `tool_choice` is `"auto"` or `"required"`):

Example output (when `tool_choice` is `"none"`):

The current implementation does not support: