
Commit e8f2313

Integrate docker images into compose yaml file to simplify the run instructions. fix ui ip issue and add web search tool support (opea-project#1656)
Signed-off-by: Tsai, Louie <[email protected]>
Co-authored-by: alexsin368 <[email protected]>
1 parent 6d24c1c commit e8f2313

15 files changed: +466 −512 lines

AgentQnA/README.md

Lines changed: 89 additions & 119 deletions
@@ -2,7 +2,7 @@
## Overview

This example showcases a hierarchical multi-agent system for question-answering applications. The architecture diagram below shows a supervisor agent that interfaces with the user and dispatches tasks to two worker agents to gather information and come up with answers. The worker RAG agent uses the retrieval tool to retrieve relevant documents from a knowledge base (a vector database). The worker SQL agent retrieves relevant data from a SQL database. Although not included in this example by default, other tools such as a web search tool or a knowledge graph query tool can be used by the supervisor agent to gather information from additional sources.

![Architecture Overview](assets/img/agent_qna_arch.png)

The AgentQnA example is implemented using the component-level microservices defined in [GenAIComps](https://github.com/opea-project/GenAIComps). The flow chart below shows the information flow between different microservices for this example.
@@ -75,191 +75,161 @@ flowchart LR
```

### Why should AI Agents be used for question-answering?

1. **Improve relevancy of retrieved context.**
   RAG agents can rephrase user queries, decompose user queries, and iterate to get the most relevant context for answering a user's question. Compared to conventional RAG, RAG agents significantly improve the correctness and relevancy of the answer because of the iterations they go through.
2. **Expand scope of skills.**
   The supervisor agent interacts with multiple worker agents that specialize in different skills (e.g., retrieving documents, writing SQL queries). Thus, the system can answer questions that call for different methods.
3. **Hierarchical multi-agents improve performance.**
   Expert worker agents, such as RAG agents and SQL agents, can provide high-quality output for different aspects of a complex query, and the supervisor agent can aggregate the information to provide a comprehensive answer. If a single agent were given all the tools instead, it could incur large overhead or fail to pick the best tool, producing less accurate answers.

## Deploy with Docker

### 1. Set up environment

#### First, clone the `GenAIExamples` repo.

```bash
export WORKDIR=<your-work-directory>
cd $WORKDIR
git clone https://github.com/opea-project/GenAIExamples.git
```

#### Second, set up environment variables.

##### For proxy environments only

```bash
export http_proxy="Your_HTTP_Proxy"
export https_proxy="Your_HTTPS_Proxy"
# Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
export no_proxy="Your_No_Proxy"
```

##### For using open-source LLMs

```bash
export HUGGINGFACEHUB_API_TOKEN=<your-HF-token>
export HF_CACHE_DIR=<directory-where-llms-are-downloaded>  # so models do not need to be re-downloaded every time
```

##### [Optional] OPENAI_API_KEY to use OpenAI models

```bash
export OPENAI_API_KEY=<your-openai-key>
```

#### Third, set up environment variables for the selected hardware using the corresponding `set_env.sh`

##### Gaudi

```bash
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/set_env.sh
```

##### Xeon

```bash
source $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon/set_env.sh
```
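
After sourcing the script, it can help to spot-check that key variables are populated before moving on. This check is illustrative; the exact variable names are defined by the `set_env.sh` for your hardware, so adjust them to match:

```bash
# Print a few variables the script is expected to set (names may vary by script)
echo "ip_address=${ip_address:-<unset>}"
env | grep -i -E 'proxy|huggingface|openai' | sort
```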

### 2. Launch the multi-agent system

Two options are provided for the `llm_engine` of the agents: open-source LLMs served on Gaudi, or OpenAI models via API calls.

#### Gaudi

On Gaudi, `meta-llama/Meta-Llama-3.1-70B-Instruct` will be served using vLLM. (Access to the Meta Llama models on Hugging Face is gated, so make sure your Hugging Face token has been granted access to this model.)
By default, both the RAG agent and the SQL agent will be launched to support the React Agent.
The React Agent requires the DocIndexRetriever's [`compose.yaml`](../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml) file, so two `compose.yaml` files need to be passed to docker compose to start the multi-agent system.

> **Note**: To enable the web search tool, skip this step and proceed to the "[Optional] Web Search Tool Support" section.

```bash
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml up -d
```
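
Downloading and loading a 70B model can take a while, so the vLLM service may not be ready immediately. As an illustrative way to watch progress (the container name below is an assumption; check `docker ps` for the actual names in your deployment):

```bash
# List running containers and their status
docker ps

# Follow the vLLM serving logs until the model finishes loading (container name may differ)
docker logs -f vllm-gaudi-server
```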

##### [Optional] Web Search Tool Support

<details>
<summary>Instructions</summary>

A web search tool is supported in this example and can be enabled by running docker compose with the `compose.webtool.yaml` file.
The Google Search API is used. Follow the [instructions](https://python.langchain.com/docs/integrations/tools/google_search) to create an API key and enable the Custom Search API on a Google account. The environment variables `GOOGLE_CSE_ID` and `GOOGLE_API_KEY` need to be set.


```bash
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/hpu/gaudi/
export GOOGLE_CSE_ID="YOUR_ID"
export GOOGLE_API_KEY="YOUR_API_KEY"
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose.yaml -f compose.webtool.yaml up -d
```
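
Before launching, the Google credentials can be sanity-checked with a direct call to the Custom Search JSON API. This check is illustrative and independent of this example's tooling:

```bash
# A JSON payload with search results means the key and CSE ID work;
# an error object usually means the Custom Search API is not enabled or the key is wrong.
curl -s "https://www.googleapis.com/customsearch/v1?key=${GOOGLE_API_KEY}&cx=${GOOGLE_CSE_ID}&q=OPEA" | head -c 400
```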

</details>

#### Xeon

On Xeon, only OpenAI models are supported.
By default, both the RAG agent and the SQL agent will be launched to support the React Agent.
The React Agent requires the DocIndexRetriever's [`compose.yaml`](../DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml) file, so two `compose.yaml` files need to be passed to docker compose to start the multi-agent system.


```bash
export OPENAI_API_KEY=<your-openai-key>
cd $WORKDIR/GenAIExamples/AgentQnA/docker_compose/intel/cpu/xeon
docker compose -f $WORKDIR/GenAIExamples/DocIndexRetriever/docker_compose/intel/cpu/xeon/compose.yaml -f compose_openai.yaml up -d
```

### 3. Ingest data into the vector database

The `run_ingest_data.sh` script uses an example jsonl file to ingest sample documents into the vector database. Other ways to ingest data, and the other document types supported, can be found in the OPEA dataprep microservice in the [opea-project/GenAIComps](https://github.com/opea-project/GenAIComps) repo.


```bash
cd $WORKDIR/GenAIExamples/AgentQnA/retrieval_tool/
bash run_ingest_data.sh
```

> **Note**: This is a one-time operation.
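
To ingest your own documents instead of the sample file, one option is to post them to the dataprep microservice directly. The port and route below are assumptions based on typical DocIndexRetriever deployments; check the compose file you launched for the actual values:

```bash
# Illustrative upload of a local file to the dataprep endpoint (port and route are assumptions)
curl -X POST "http://${ip_address}:6007/v1/dataprep" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./your_document.pdf"
```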

## Launch the UI

Open a web browser to http://localhost:5173 to access the UI. Ensure the environment variable `AGENT_URL` is set to `http://$ip_address:9090/v1/chat/completions` in [ui/svelte/.env](./ui/svelte/.env), or else the UI may not work properly.
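
For reference, a minimal `.env` might contain a single line like the following (illustrative; substitute the IP address of the host running the supervisor agent):

```bash
AGENT_URL="http://192.168.1.1:9090/v1/chat/completions"
```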

The AgentQnA UI can be deployed locally or using Docker. To customize the deployment, refer to the [AgentQnA UI Guide](./ui/svelte/README.md).

## [Optional] Deploy using Helm Charts

Refer to the [AgentQnA helm chart](./kubernetes/helm/README.md) for instructions on deploying AgentQnA on Kubernetes.

## Validate Services

1. First, look at the logs for each of the agent docker containers:


```bash
# worker RAG agent
docker logs rag-agent-endpoint

# worker SQL agent
docker logs sql-agent-endpoint

# supervisor agent
docker logs react-agent-endpoint
```

Look for the message "HTTP server setup successful" to confirm that each agent docker container started successfully.
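
For example, to check the supervisor agent's log for that message (the same pattern works for the other two containers):

```bash
docker logs react-agent-endpoint 2>&1 | grep "HTTP server setup successful"
```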

2. Use python to validate that each agent is working properly:


```bash
# RAG worker agent
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "Tell me about Michael Jackson song Thriller" --agent_role "worker" --ext_port 9095

# SQL agent
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --prompt "How many employees in company" --agent_role "worker" --ext_port 9096

# supervisor agent: this will test a two-turn conversation
python $WORKDIR/GenAIExamples/AgentQnA/tests/test.py --agent_role "supervisor" --ext_port 9090
```
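
Since the supervisor agent exposes an OpenAI-compatible chat completions route (see the UI section above), it can also be probed directly with curl. The request body below is an illustrative assumption of the accepted schema:

```bash
curl -s http://${ip_address}:9090/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Tell me about Michael Jackson song Thriller"}], "stream": false}'
```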

## How to register other tools with the AI agent

The [tools](./tools) folder contains YAML and Python files for additional tools for the supervisor and worker agents. Refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/src/README.md) to add tools and customize the AI agents.
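
As a sketch of the pattern those files follow, a new tool is described in YAML and bound to a Python function. The entry below is hypothetical; match the exact schema used by the existing files in [tools](./tools):

```yaml
# Hypothetical entry in a tools YAML file
get_company_news:
  description: Search recent news about a company.
  callable_api: custom_tools.py:get_company_news
  args_schema:
    company_name:
      type: str
      description: name of the company
  return_output: news_items
```

Here `callable_api` points at the Python function that implements the tool, and `args_schema` declares the arguments the LLM should supply when calling it.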
