
[RFC] Support Function Calling agent type in Agent framework #3000

Open
xinyual opened this issue Sep 29, 2024 · 8 comments
Labels
enhancement New feature or request untriaged

Comments

@xinyual
Collaborator

xinyual commented Sep 29, 2024

Problem statement

In our current agent framework, we support both the flow agent and the REACT chat agent. Recently, a new chat agent type called function calling has emerged, offering more powerful capabilities for generating contextually relevant, structured text output. Developers can use this output to trigger method calls or API requests, referred to as tools. Function calling is an inherent capability of the LLM itself; it provides a more structured and organized approach, enabling deterministic results from large language models (LLMs) with a reduced error rate. This capability requires additional fine-tuning of pretrained models and is supported by the latest models, such as GPT-3.5/4 and the Claude 3 family.

Proposed Solution

The input and output are well structured when using function calling with an LLM. Take Claude 3 on Amazon Bedrock as an example (see the full function calling API):
Input:

{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": int,
    "system": string,
    "messages": [
      ...
    ],
    "tools": [
        {
            "name": string,
            "description": string,
            "input_schema": json
        }
    ],
    "stop_sequences": [string]
}

The tools field here is a parameter of the API call rather than part of the prompt. The output has the following structure:

{
  "type": "message",
  ...
  "content": [
    {
      "type": "text",
      "text": "..."
    },
    {
      "type": "tool_use",
      "name": "..",
      "input": {
        ...
      }
    }
  ],
  "stop_reason": "tool_use/end_turn"
}

If we detect the keyword tool_use, we should execute the tool; otherwise, we should return the result to the user. This input/output structure and execution logic are not compatible with the current REACT agent, so we propose adding another agent type to support this functionality.
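The dispatch logic described above can be sketched as follows. This is an illustrative sketch only, not the actual ml-commons implementation; the function name `handle_response` and the return shapes are hypothetical.

```python
def handle_response(response, tools):
    """Execute the requested tools, or return the text answer to the user.

    `response` is a Claude 3 style message dict; `tools` maps tool names
    to Python callables (a stand-in for the framework's tool registry).
    """
    if response.get("stop_reason") == "tool_use":
        results = []
        for block in response.get("content", []):
            if block.get("type") == "tool_use":
                tool = tools[block["name"]]             # look up the registered tool
                results.append(tool(**block["input"]))  # call it with the LLM-provided input
        return {"action": "tool_results", "results": results}
    # stop_reason == "end_turn": pass the text content back to the user
    text = "".join(b.get("text", "") for b in response.get("content", [])
                   if b.get("type") == "text")
    return {"action": "final_answer", "text": text}
```

In a real agent loop, the tool results would be appended to the conversation and sent back to the model until it returns end_turn.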

{
  "type": "function_calling",
  "llm": {
  },
  "memory": {
    "type": "conversation_index"
  },
  "tools": [
  ]
}

Implementation details

We could still reuse part of the code in class MLChatAgentRunner, such as the memory handling. But the logic that parses the LLM response and extracts the next step needs to change. The prompt-formatting logic also differs, since tools are now request parameters.
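To illustrate the formatting difference, the sketch below assembles a Bedrock Claude 3 style request body in which registered tools travel as a top-level parameter instead of being interpolated into the prompt. The helper name `build_request` and the shape of the `tools` input are assumptions for illustration.

```python
def build_request(system, messages, tools, max_tokens=1024):
    """Assemble a Claude 3 on Bedrock style request body.

    Unlike the REACT agent, tool metadata is passed in the structured
    `tools` field, not concatenated into the system prompt.
    """
    return {
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": max_tokens,
        "system": system,
        "messages": messages,
        "tools": [
            {
                "name": t["name"],
                "description": t["description"],
                "input_schema": t["input_schema"],
            }
            for t in tools
        ],
    }
```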

@xinyual xinyual added enhancement New feature or request untriaged labels Sep 29, 2024
@xinyual
Collaborator Author

xinyual commented Sep 29, 2024

I can be the assignee for this new feature.

@yuye-aws
Member

Can you provide more context, such as the request to call the agent or the formatted tool description?

@yuye-aws
Member

We could still reuse part of code in Class MLChatAgentRunner, like memory. But the logic to parse the response from LLM and extract next step should be changed. Also, logic to format the prompt should also be different since tools now are parameters.

I would prefer a new class rather than modifying the existing code in MLChatAgentRunner.

@xinyual
Collaborator Author

xinyual commented Sep 29, 2024

Can you provide more context? Such as the request to call the agent or the formatted tool description.

To call the agent, we still use the same parameters as the REACT one. The following is a tool description:

{
  "name": "get_weather",
  "description": "Get the current weather in a given location",
  "input_schema": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g. San Francisco, CA"
      }
    },
    "required": ["location"]
  }
}

@xinyual
Collaborator Author

xinyual commented Sep 29, 2024

We could still reuse part of code in Class MLChatAgentRunner, like memory. But the logic to parse the response from LLM and extract next step should be changed. Also, logic to format the prompt should also be different since tools now are parameters.

I would prefer a new class other than modifying existing code in MLChatAgentRunner.

My plan is to create two classes, ReactChatAgentRunner and FunctionCallingAgentRunner, both reusing the existing code in MLChatAgentRunner and each implementing its own logic to parse responses and format input.
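A rough sketch of that class split, assuming a shared base that owns the common plumbing (memory, tool registry) and two subclasses overriding the model-specific hooks. The class names mirror the plan above, but the method names and signatures are illustrative only.

```python
class MLChatAgentRunner:
    """Shared base: memory and tool plumbing live here (omitted)."""
    def parse_response(self, response):       # extract the next step
        raise NotImplementedError
    def format_input(self, question, tools):  # build the model request
        raise NotImplementedError

class ReactChatAgentRunner(MLChatAgentRunner):
    def format_input(self, question, tools):
        # REACT: tool descriptions are interpolated into the prompt text
        tool_text = "\n".join(f"{t['name']}: {t['description']}" for t in tools)
        return {"prompt": f"{tool_text}\n\nQuestion: {question}"}
    def parse_response(self, response):
        # REACT: the next action is parsed out of free-form text
        return "tool" if "Action:" in response.get("text", "") else "finish"

class FunctionCallingAgentRunner(MLChatAgentRunner):
    def format_input(self, question, tools):
        # function calling: tools travel as a structured request parameter
        return {"messages": [{"role": "user", "content": question}],
                "tools": tools}
    def parse_response(self, response):
        # function calling: the next step is signalled by stop_reason
        return "tool" if response.get("stop_reason") == "tool_use" else "finish"
```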

@yuye-aws
Member

yuye-aws commented Sep 29, 2024

To call agent, we still use the same parameters as REACT one. And following is a tool description:

{
  "name": "get_weather",
  "description": "Get the current weather in a given location",
  "input_schema": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g. San Francisco, CA"
      }
    },
    "required": ["location"]
  }
}

Thanks for sharing these. Should the user specify parameters like this when creating a tool? Or should we parse these parameters from the tools themselves?

@yuye-aws
Member

My plan is create two different class like ReactChatAgentRunner/ FunctionCallingAgentRunner, both reuse existing code of MLChatAgentRunner and then implement their own code to parse response and format input.

That makes sense. Just wondering whether ReactChatAgentRunner will be the original MLChatAgentRunner.

@zane-neo
Collaborator

zane-neo commented Oct 8, 2024

This looks good. One question: did we run any benchmark on this, e.g. comparing tool-selection accuracy between the agent framework and the API's native function calling?
