Commit

Merge pull request #100 from aymeric-roucher/main
Add custom LLM engine in agent cookbook
aymeric-roucher committed May 29, 2024
2 parents ae64664 + e0987b5 commit f7f57b3
Showing 2 changed files with 169 additions and 26 deletions.
193 changes: 168 additions & 25 deletions notebooks/en/agents.ipynb
@@ -7,13 +7,13 @@
"# Build an agent with tool-calling superpowers 🦸 using Transformers Agents\n",
"_Authored by: [Aymeric Roucher](https://huggingface.co/m-ric)_\n",
"\n",
"This notebook demonstrates how you can use [**Transformers Agents**](https://huggingface.co/docs/transformers/en/transformers_agents) to build awesome **agents**!\n",
"This notebook demonstrates how you can use [**Transformers Agents**](https://huggingface.co/docs/transformers/en/agents) to build awesome **agents**!\n",
"\n",
"What are **agents**? Agents are systems that are powered by an LLM and enable the LLM (with careful prompting and output parsing) to use specific *tools* to solve problems.\n",
"\n",
"These *tools* are basically functions that the LLM couldn't perform well by itself: for instance for a text-generation LLM like [Llama-3-70B](https://huggingface.co/meta-llama/Meta-Llama-3-70B-Instruct), this could be an image generation tool, a web search tool, a calculator...\n",
"\n",
"What is **Transformers Agents**? it's an extension of our `transformers` library that provides building blocks to build your own agents! Learn more about it in the [documentation](https://huggingface.co/docs/transformers/en/transformers_agents).\n",
"What is **Transformers Agents**? it's an extension of our `transformers` library that provides building blocks to build your own agents! Learn more about it in the [documentation](https://huggingface.co/docs/transformers/en/agents).\n",
"\n",
"Let's see how to use it, and which use cases it can solve.\n",
"\n",
@@ -35,7 +35,7 @@
"metadata": {},
"outputs": [],
"source": [
"!pip install datasets huggingface_hub langchain sentence-transformers faiss-cpu serpapi google-search-results -q"
"!pip install datasets huggingface_hub langchain sentence-transformers faiss-cpu serpapi google-search-results openai -q"
]
},
{
@@ -53,37 +53,33 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"You're loading a tool from the Hub from None. Please make sure this is a source that you trust as the code within that tool will be executed on your machine. Always verify the code of the tools that you load. We recommend specifying a `revision` to ensure you're loading the code that you have checked.\n",
"\u001b[33;1m======== New task ========\u001b[0m\n",
"\u001b[37;1mGenerate me a photo of the car that James bond drove in the latest movie.\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mlatest_bond_car\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7msearch\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mWhat car did James Bond drive in the latest movie?\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mLatest Bond Car:\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mlatest_bond_car\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mlatest_movie\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7msearch\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mWhat is the latest James Bond movie?\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mLatest James Bond movie:\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mlatest_movie\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;7mbond_car\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7msearch\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mWhat car did James Bond drive in the latest movie?\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mJames Bond\u001b[39m\u001b[38;5;144m'\u001b[39m\u001b[38;5;144ms car:\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mbond_car\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20mLatest Bond Car: Aston Martin DB5\n",
"\u001b[32;20mLatest James Bond movie: No Time to Die\n",
"James Bond's car: Aston Martin DB5\n",
"\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mimage\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mimage_generator\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mA high-resolution, photorealistic image of the Aston Martin DB5, similar to the one driven by James Bond\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mImage:\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mimage\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20mImage: /var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmpjsjhaszo/8388c8e4-21b1-403e-8006-39eb0d7600db.png\n",
"\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mfinal_answer\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mimage\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mimage\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mimage_generator\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;144mA high-res, photorealistic image of the Aston Martin DB5 driven by James Bond in No Time to Die\u001b[39m\u001b[38;5;144m\"\u001b[39m\u001b[38;5;7m)\u001b[39m\n",
"\u001b[38;5;7mfinal_answer\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mimage\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20m\u001b[0m\n",
"\u001b[33;1m>>> Final answer:\u001b[0m\n",
"\u001b[32;20m/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmprsyo3hzd/f7f52df0-bb78-4051-b006-5a1bdab8e97c.png\u001b[0m\n"
"\u001b[32;20m/var/folders/6m/9b1tts6d5w960j80wbw9tx3m0000gn/T/tmptcdd2ra6/2bf48fc0-6fff-4e86-8fb5-85b3221bc0c8.png\u001b[0m\n"
]
}
],
@@ -106,7 +102,10 @@
")\n",
"\n",
"# Run it!\n",
"agent.run(\"Generate me a photo of the car that James bond drove in the latest movie.\")"
"result = agent.run(\n",
" \"Generate me a photo of the car that James bond drove in the latest movie.\",\n",
")\n",
"result"
]
},
{
@@ -120,13 +119,18 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. 📚💬 Retrieval-Augmented Generation with source selection\n",
"## 2. 📚💬 RAG with Iterative query refinement & Source selection\n",
"\n",
"Quick definition: Retrieval-Augmented-Generation (RAG) is “using an LLM to answer a user query, but basing the answer on information retrieved from a knowledge base”. It has many advantages over using a vanilla or fine-tuned LLM: to name a few, it allows to ground the answer on true facts and reduce confabulations, it allows to provide the LLM with domain-specific knowledge, and it allows fine-grained control of access to information from the knowledge base.\n",
"Quick definition: Retrieval-Augmented-Generation (RAG) is ___“using an LLM to answer a user query, but basing the answer on information retrieved from a knowledge base”.___\n",
"\n",
"Now let’s say we want to perform RAG, but with the additional constraint that some parameters must be dynamically generated. For example, depending on the user query we could want to restrict the search to specific subsets of the knowledge base, or we could want to adjust the number of documents retrieved. The difficulty is: **how to dynamically adjust these parameters based on the user query?**\n",
"This method has many advantages over using a vanilla or fine-tuned LLM: to name a few, it allows to ground the answer on true facts and reduce confabulations, it allows to provide the LLM with domain-specific knowledge, and it allows fine-grained control of access to information from the knowledge base.\n",
"\n",
"🔧 Well, we can solve this by in a simple way: we will **give our agent control over these parameters!**\n",
"- Now let’s say we want to perform RAG, but with the additional constraint that some parameters must be dynamically generated. For example, depending on the user query we could want to restrict the search to specific subsets of the knowledge base, or we could want to adjust the number of documents retrieved. The difficulty is: **how to dynamically adjust these parameters based on the user query?**\n",
"\n",
"- A frequent failure case of RAG is when the retrieval based on the user query does not return any relevant supporting documents. **Is there a way to iterate by re-calling the retriever with a modified query in case the previous results were not relevant?**\n",
"\n",
"\n",
"🔧 Well, we can solve the points above in a simple way: we will **give our agent control over the retriever's parameters!**\n",
"\n",
"➡️ Let's show how to do this. We first load a knowledge base on which we want to perform RAG: this dataset is a compilation of the documentation pages for many `huggingface` packages, stored as markdown.\n"
]
@@ -213,7 +217,7 @@
},
{
"cell_type": "code",
"execution_count": 19,
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
@@ -362,7 +366,7 @@
"\n",
"Note that **using an LLM agent** that calls a retriever as a tool and can dynamically modify the query and other retrieval parameters **is a more general formulation of RAG**, which also covers many RAG improvement techniques like iterative query refinement.\n",
"\n",
"## 3. 💻 Debugging Python code\n",
"## 3. 💻 Debug Python code\n",
"Since the ReactCodeAgent has an built int Python code interpreter, we can use it to debug our faulty Python script!"
]
},
@@ -544,13 +548,152 @@
"print(final_answer)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 4. Create your own LLM engine (OpenAI)\n",
"\n",
"It's really easy to set up your own LLM engine:\n",
"it only needs a `__call__` method with these criteria:\n",
"1. Takes as input a list of messages in [ChatML format](https://huggingface.co/docs/transformers/main/en/chat_templating#introduction) and outputs the answer.\n",
"2. Accepts a `stop_sequences` arguments to pass sequences on which generation stops.\n",
"3. Depending on which kind of message roles your LLM accepts, you may also need to convert some message roles."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"\u001b[33;1m======== New task ========\u001b[0m\n",
"\u001b[37;1mI have some code that creates a bug: please debug it and return the final code\n",
"You have been provided with these initial arguments: {'code': '\\nlist=[0, 1, 2]\\n\\nfor i in range(4):\\n print(list(i))\\n'}.\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7m[\u001b[39m\u001b[38;5;139m0\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;139m1\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;139m2\u001b[39m\u001b[38;5;7m]\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Renamed the list to avoid using the built-in name\u001b[39;00m\n",
"\n",
"\u001b[38;5;109;01mfor\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mi\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01min\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;109mrange\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;109mlen\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m:\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Changed the range to be within the length of the list\u001b[39;00m\n",
"\u001b[38;5;7m \u001b[39m\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m[\u001b[39m\u001b[38;5;7mi\u001b[39m\u001b[38;5;7m]\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Corrected the list access syntax\u001b[39;00m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20m0\n",
"1\n",
"2\n",
"\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7m[\u001b[39m\u001b[38;5;139m0\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;139m1\u001b[39m\u001b[38;5;7m,\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;139m2\u001b[39m\u001b[38;5;7m]\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Renamed the list to avoid using the built-in name\u001b[39;00m\n",
"\n",
"\u001b[38;5;109;01mfor\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;7mi\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01min\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;109mrange\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;109mlen\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m:\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Changed the range to be within the length of the list\u001b[39;00m\n",
"\u001b[38;5;7m \u001b[39m\u001b[38;5;109mprint\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7mmy_list\u001b[39m\u001b[38;5;7m[\u001b[39m\u001b[38;5;7mi\u001b[39m\u001b[38;5;7m]\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;60;03m# Corrected the list access syntax\u001b[39;00m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20m0\n",
"1\n",
"2\n",
"\u001b[0m\n",
"\u001b[33;1m==== Agent is executing the code below:\u001b[0m\n",
"\u001b[0m\u001b[38;5;7mcorrected_code\u001b[39m\u001b[38;5;7m \u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7m \u001b[39m\u001b[38;5;144m'''\u001b[39m\n",
"\u001b[38;5;144mmy_list = [0, 1, 2] # Renamed the list to avoid using the built-in name\u001b[39m\n",
"\n",
"\u001b[38;5;144mfor i in range(len(my_list)): # Changed the range to be within the length of the list\u001b[39m\n",
"\u001b[38;5;144m print(my_list[i]) # Corrected the list access syntax\u001b[39m\n",
"\u001b[38;5;144m'''\u001b[39m\n",
"\n",
"\u001b[38;5;7mfinal_answer\u001b[39m\u001b[38;5;7m(\u001b[39m\u001b[38;5;7manswer\u001b[39m\u001b[38;5;109;01m=\u001b[39;00m\u001b[38;5;7mcorrected_code\u001b[39m\u001b[38;5;7m)\u001b[39m\u001b[0m\n",
"\u001b[33;1m====\u001b[0m\n",
"\u001b[33;1mPrint outputs:\u001b[0m\n",
"\u001b[32;20m\u001b[0m\n",
"\u001b[33;1m>>> Final answer:\u001b[0m\n",
"\u001b[32;20m\n",
"my_list = [0, 1, 2] # Renamed the list to avoid using the built-in name\n",
"\n",
"for i in range(len(my_list)): # Changed the range to be within the length of the list\n",
" print(my_list[i]) # Corrected the list access syntax\n",
"\u001b[0m\n"
]
}
],
"source": [
"import os\n",
"from openai import OpenAI\n",
"from transformers.agents.llm_engine import MessageRole, get_clean_message_list\n",
"\n",
"openai_role_conversions = {\n",
" MessageRole.TOOL_RESPONSE: \"user\",\n",
"}\n",
"\n",
"\n",
"class OpenAIEngine:\n",
" def __init__(self, model_name=\"gpt-4o-2024-05-13\"):\n",
" self.model_name = model_name\n",
" self.client = OpenAI(\n",
" api_key=os.getenv(\"OPENAI_API_KEY\"),\n",
" )\n",
"\n",
" def __call__(self, messages, stop_sequences=[]):\n",
" # Get clean message list\n",
" messages = get_clean_message_list(\n",
" messages, role_conversions=openai_role_conversions\n",
" )\n",
"\n",
" # Get LLM output\n",
" response = self.client.chat.completions.create(\n",
" model=self.model_name,\n",
" messages=messages,\n",
" stop=stop_sequences,\n",
" )\n",
" return response.choices[0].message.content\n",
"\n",
"\n",
"openai_engine = OpenAIEngine()\n",
"agent = ReactCodeAgent(llm_engine=openai_engine, tools=[])\n",
"\n",
"code = \"\"\"\n",
"list=[0, 1, 2]\n",
"\n",
"for i in range(4):\n",
" print(list(i))\n",
"\"\"\"\n",
"\n",
"final_answer = agent.run(\n",
" \"I have some code that creates a bug: please debug it and return the final code\",\n",
" code=code,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"my_list = [0, 1, 2] # Renamed the list to avoid using the built-in name\n",
"\n",
"for i in range(len(my_list)): # Changed the range to be within the length of the list\n",
" print(my_list[i]) # Corrected the list access syntax\n",
"\n"
]
}
],
"source": [
"print(final_answer)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## ➡️ Conclusion\n",
"\n",
"These three use cases should give you a glimpse into the possibilities of our Agents framework!\n",
"The use cases above should give you a glimpse into the possibilities of our Agents framework!\n",
"\n",
"For more advanced usage, read the [documentation](https://huggingface.co/docs/transformers/en/transformers_agents), and [this experiment](https://github.com/aymeric-roucher/agent_reasoning_benchmark/blob/main/benchmark_gaia.ipynb) that allowed us to build our own agent based on Llama-3-70B that beats many GPT-4 agents on the very difficult [GAIA Leaderboard](https://huggingface.co/spaces/gaia-benchmark/leaderboard)!\n",
"\n",
Expand Down
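The three criteria that section 4 of the notebook lists for a custom LLM engine can be exercised without any API call. What follows is a minimal sketch and not the notebook's code: `ToyEngine`, `convert_roles`, `ROLE_CONVERSIONS` and `fake_generate` are hypothetical names, roles are plain strings rather than `MessageRole` members, the `"tool-response"` role string and the `<end_action>` stop sequence are assumptions, and stop sequences are applied by truncating the text after generation, whereas the `OpenAIEngine` in the diff forwards them to the API.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]

# Assumption: the backend does not understand a "tool-response" role,
# so it is remapped to "user" before the messages are sent.
ROLE_CONVERSIONS: Dict[str, str] = {"tool-response": "user"}


def convert_roles(messages: List[Message], conversions: Dict[str, str]) -> List[Message]:
    """Return a copy of `messages` with roles remapped (criterion 3)."""
    return [
        {"role": conversions.get(m["role"], m["role"]), "content": m["content"]}
        for m in messages
    ]


class ToyEngine:
    """Hypothetical engine: a callable that takes messages and returns text."""

    def __init__(self, generate: Callable[[List[Message]], str]):
        # `generate` stands in for a real model or API client.
        self.generate = generate

    def __call__(
        self, messages: List[Message], stop_sequences: Optional[List[str]] = None
    ) -> str:
        # Criterion 1: take a list of chat messages, return the answer text.
        clean = convert_roles(messages, ROLE_CONVERSIONS)
        text = self.generate(clean)
        # Criterion 2: honor stop sequences by cutting at the earliest one.
        for stop in stop_sequences or []:
            cut = text.find(stop)
            if cut != -1:
                text = text[:cut]
        return text


def fake_generate(messages: List[Message]) -> str:
    """Stand-in for a model: echoes the last message and keeps writing."""
    last = messages[-1]
    return f"[{last['role']}] saw: {last['content']}<end_action>trailing text"


engine = ToyEngine(fake_generate)
answer = engine(
    [{"role": "tool-response", "content": "0 1 2"}],
    stop_sequences=["<end_action>"],
)
print(answer)  # prints "[user] saw: 0 1 2"
```

Using `None` as the default for `stop_sequences` avoids sharing one mutable list across calls. An object with this call signature would presumably be passed as `llm_engine` in the same way the diff passes `openai_engine` to `ReactCodeAgent`.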
2 changes: 1 addition & 1 deletion notebooks/en/structured_generation.ipynb
@@ -502,8 +502,8 @@
"For instance in your [LLM judge](llm_judge) workflows, you can also use constrained generation to output a JSON, as follows:\n",
"```\n",
"{\n",
" \"rationale\": \"The answer does not match the true answer at all.\"\n",
" \"score\": 1,\n",
" \"rationale\": \"The answer does not match the true answer at all.\"\n",
" \"confidence_level\": 0.85\n",
"}\n",
"```"
Expand Down
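Constrained generation is meant to fix the shape of the judge's JSON, but downstream code usually still parses and checks it. What follows is a minimal sketch using only the standard `json` module. `parse_judge_output` and `REQUIRED_KEYS` are hypothetical names, and the checks (an integer score, a confidence between 0 and 1) are assumptions about the judge schema, which the notebook does not spell out.

```python
import json

# Keys taken from the JSON example in the notebook hunk above.
REQUIRED_KEYS = {"score", "rationale", "confidence_level"}


def parse_judge_output(raw: str) -> dict:
    """Parse a judge response and check the fields the example shows."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    score = data["score"]
    # Assumption: scores are integers; bool is excluded explicitly
    # because bool is a subclass of int in Python.
    if isinstance(score, bool) or not isinstance(score, int):
        raise ValueError("score must be an integer")
    confidence = data["confidence_level"]
    # Assumption: confidence is a probability-like value in [0, 1].
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence_level must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence_level must be in [0, 1]")
    return data


raw_output = """
{
  "score": 1,
  "rationale": "The answer does not match the true answer at all.",
  "confidence_level": 0.85
}
"""

judgement = parse_judge_output(raw_output)
print(judgement["score"], judgement["confidence_level"])  # prints "1 0.85"
```

Because `json.JSONDecodeError` is a subclass of `ValueError`, a caller can catch `ValueError` alone to handle both malformed JSON and schema violations.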
