Function calling for OpenAI backend #573
base: main
Conversation
Force-pushed 0b7d7f1 to fbe44c2
Thanks for the PR! I left a few comments. The review is still in progress.
python/sglang/lang/interpreter.py (Outdated)

```diff
@@ -23,6 +23,7 @@
     SglFunction,
     SglGen,
     SglImage,
+    SglFuncCall,
```
Use dictionary (alphabetical) order for the imports instead.
Removing this, given we are moving it to be part of SglGen.
```python
def multi_turn_question(s, question_1, functions=[]):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question_1)
    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")
```
We may also want to retrieve the results from the function call. Add tests for `state["func_call_1"]` in the function `single()`.
python/sglang/backend/openai.py (Outdated)

```python
# OpenAI models require function call information to be sent to the model
# along with the prompt.
for function_call in s.function_calls:
    prompt.append(function_call)
else:
```
`s.messages_` should be updated after the function call finishes rather than in `generate`, and the append logic should happen in interpreter.py; see `_execute_role_end()` as a reference.
Additionally, changing `prompt` implicitly changes `s.messages_`. This is not safe. Changing `s.messages_` and then setting `prompt = s.messages_` is better.
Restructured the code a bit based on your suggestions (with some minor tweaks, but I can update if you think it's still better to move the function call generation outside of `generate`; we would just have a simpler `generate` call):

Within openai.py:
- `build_function_call_messages()`: a new function that builds function call messages. Since the function signature format is specific to OpenAI models, the logic to parse inputs and produce function call messages is kept in the backend code.
- `generate()`: since `prompt` is local to the `generate()` call, I directly added `function_call_messages` to it so that we can include the function call messages in the current completion call's prompt. The main intuition is to reuse the generate call logic; it also appends only the function call response (`comp`), without intermediate messages, into the final text/messages.

Within interpreter.py:
- Updated the `_execute_gen()` logic to build function call messages if tools are provided, and to handle both parallel and non-parallel function calling by calling `backend.generate` once for models that support parallel function calls, or multiple times when parallel calls are not supported.
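The parallel vs. non-parallel branching described above can be sketched roughly as follows. The names and the per-tool fallback loop are my assumptions for illustration, not the PR's exact code:

```python
def gen_with_tools(backend_generate, messages, tools, tool_choice, parallel_supported):
    """Call the backend once if the model can emit several tool calls in one
    response, otherwise once per candidate tool.  backend_generate is any
    callable (messages, tools, tool_choice) -> list of messages."""
    if parallel_supported:
        return backend_generate(messages, tools, tool_choice)
    out = []
    for tool in tools:
        out.extend(backend_generate(messages, [tool], tool_choice))
    return out

# Fake backend for illustration: returns one tool-call message per tool.
def fake_generate(messages, tools, tool_choice):
    return [{"role": "assistant", "tool": t.__name__} for t in tools]

def get_weather(city): ...
def get_time(city): ...

calls = gen_with_tools(fake_generate, [], [get_weather, get_time], "auto", False)
assert [c["tool"] for c in calls] == ["get_weather", "get_time"]
```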
python/sglang/backend/openai.py
Outdated
"gpt-3.5-turbo-0613", | ||
]: | ||
raise RuntimeError( | ||
"This model currently does not support function calling." |
Keep in mind that the sets of models that support function calling and parallel function calling are different.
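One way to encode that distinction is two separate allow-lists, with parallel function calling as a strict subset. The membership below reflects OpenAI's documentation at the time (the `-1106` models added parallel calling); treat it as illustrative, not exhaustive:

```python
FUNCTION_CALL_MODELS = {
    "gpt-3.5-turbo-0613",
    "gpt-3.5-turbo-1106",
    "gpt-4-0613",
    "gpt-4-1106-preview",
}
# Parallel function calling arrived later, so it is a subset here.
PARALLEL_FUNCTION_CALL_MODELS = {
    "gpt-3.5-turbo-1106",
    "gpt-4-1106-preview",
}

def check_function_call_support(model, parallel=False):
    """Raise early, with a message naming the missing capability."""
    allowed = PARALLEL_FUNCTION_CALL_MODELS if parallel else FUNCTION_CALL_MODELS
    if model not in allowed:
        kind = "parallel function calling" if parallel else "function calling"
        raise RuntimeError(f"Model {model!r} does not support {kind}.")
```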
Thanks for pointing that out! Updated to use different handling logic for each.
python/sglang/backend/openai.py (Outdated)

```python
cur_tool_choice = (
    tool_choice
    if tool_choice in ["auto", "required", "none"]
    else {"type": "function", "function": {"name": tool_choice}}
)
```
In this case, assert that `tool_choice` is among the names of the candidate functions.
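The suggested assertion could look like the following sketch, taking function names from the tools' `__name__` attributes as the diff does elsewhere (the helper name is hypothetical):

```python
def resolve_tool_choice(tool_choice, tools):
    """Pass through the special strings; otherwise require that the
    named function is actually one of the candidate tools."""
    if tool_choice in ("auto", "required", "none"):
        return tool_choice
    names = [tool.__name__ for tool in tools]
    assert tool_choice in names, (
        f"tool_choice {tool_choice!r} is not a candidate function: {names}"
    )
    return {"type": "function", "function": {"name": tool_choice}}

def get_weather(city): ...

assert resolve_tool_choice("auto", [get_weather]) == "auto"
assert resolve_tool_choice("get_weather", [get_weather]) == {
    "type": "function", "function": {"name": "get_weather"}
}
```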
python/sglang/backend/openai.py (Outdated)

```python
tool_calls = response_message.tool_calls
# Check if the model wanted to call a function
ret_messages = []
if tool_calls:
    # Call the function
    # Note: the JSON response may not always be valid; be sure to handle errors
    available_functions = {}
    for tool in tools:
        available_functions[tool.__name__] = tool
    ret_messages.append(response_message)
    # Send the info for each function call and function response to the model
    for tool_call in tool_calls:
        function_name = tool_call.function.name
        function_to_call = available_functions[function_name]
        function_args = json.loads(tool_call.function.arguments)
        function_response = function_to_call(**function_args)
        ret_messages.append(
            {
                "tool_call_id": tool_call.id,
                "role": "tool",
                "name": function_name,
                "content": str(function_response),
            }
        )
return ret_messages
```
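The dispatch loop above can be exercised without the OpenAI client by mimicking the response objects with `SimpleNamespace` (a sketch; the real `tool_call` objects come from the `openai` SDK):

```python
import json
from types import SimpleNamespace

def execute_tool_calls(tool_calls, tools):
    """Run each requested function and build the 'tool' role messages,
    mirroring the loop in the diff above."""
    available = {tool.__name__: tool for tool in tools}
    messages = []
    for call in tool_calls:
        fn = available[call.function.name]
        args = json.loads(call.function.arguments)  # may raise on invalid JSON
        messages.append({
            "tool_call_id": call.id,
            "role": "tool",
            "name": call.function.name,
            "content": str(fn(**args)),
        })
    return messages

def add(a, b):
    return a + b

# Fake tool call shaped like the SDK object: .id, .function.name, .function.arguments
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="add", arguments='{"a": 2, "b": 3}'),
)
msgs = execute_tool_calls([fake_call], [add])
assert msgs == [{"tool_call_id": "call_1", "role": "tool",
                 "name": "add", "content": "5"}]
```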
I think it is better to put the logic of the real function call into the interpreter, so that it can be reused when we develop the feature for local models.
And remember to handle the logic of appending `s.messages_` and `s.text_` in the interpreter.
Makes sense; returning just the function call messages here so we can do the real function call separately.
python/sglang/lang/interpreter.py (Outdated)

```diff
@@ -554,6 +560,12 @@ def _execute_select(self, expr: SglSelect):
         self.variable_event[name].set()
         self.text_ += decision

+    def _execute_func_call(self, expr: SglFuncCall):
+        # TODO: Should we clear the previous function call states for the next function call
```
I think yes, by default. Although accumulating functions could be an option.
```python
def multi_turn_question(s, question_1, functions=[]):
    s += sgl.system("You are a helpful assistant.")
    s += sgl.user(question_1)
    s += sgl.func_call("func_call_1", tools=functions, tool_choice="auto")
```
A design suggestion: it might be better to just have `sgl.gen` with `func_call` as an argument.
I agree, I think it's more straightforward to have it as part of `sgl.gen`. Would it make sense to have something like `sgl.gen("answer_1", max_tokens=256, sgl.func_call(...))`, or simply expose the parameters directly to `sgl.gen`, like `sgl.gen("answer_1", max_tokens=256, tools=[...])`?
Let's simply expose the parameters directly to `sgl.gen`.
Force-pushed e859d3e to edc30d2
Force-pushed 2463404 to d737da5
Force-pushed 5b3918a to 075b053
Force-pushed 7eda0c8 to 41d1f67
Force-pushed e1cbcf5 to 2a01552
Force-pushed f58f983 to 1e978c2
Force-pushed 16b1dbf to f1389dc
Adding skeleton code for function calling with OpenAI models.

Example output (when `tool_choice` is `"auto"` or `"required"`):

Example output (when `tool_choice` is `"none"`):

The current implementation does not support: