**ChatQnA/README.md** (11 additions, 0 deletions)
@@ -120,3 +120,14 @@ For ChatQnA specific tracing and metrics monitoring, follow [OpenTelemetry on Ch
## FAQ Generation Application
FAQ Generation Application leverages the power of large language models (LLMs) to revolutionize the way you interact with and comprehend complex textual data. By harnessing cutting-edge natural language processing techniques, the application automatically generates comprehensive, natural-sounding frequently asked questions (FAQs) from your documents, legal texts, customer queries, and other sources. FaqGen has been merged into the ChatQnA example, which uses LangChain to implement FAQ generation and runs LLM inference via Text Generation Inference on Intel Xeon and Gaudi2 processors.
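As a rough illustration only (the gateway route, port, and payload shape below are assumptions, not taken from this diff), a FAQ generation request against the merged ChatQnA deployment might look like:

```bash
# Hypothetical endpoint and payload; check the ChatQnA README for the actual route.
curl -s -X POST "http://${HOST_IP}:8888/v1/faqgen" \
  -H "Content-Type: application/json" \
  -d '{"messages": "Generate FAQs for the uploaded product manual."}'
```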
**CodeGen/docker_compose/intel/cpu/xeon/README.md** (61 additions, 61 deletions)
@@ -33,65 +33,67 @@ This guide focuses on running the pre-configured CodeGen service using Docker Co
## Quick Start Deployment
This uses the default vLLM-based deployment defined in `compose.yaml`.
1. **Configure Environment:**

   Set required environment variables in your shell:
   ```bash
   # Replace with your host's external IP address (do not use localhost or 127.0.0.1)
   export HOST_IP="your_external_ip_address"
   # Replace with your Hugging Face Hub API token
   export HF_TOKEN="your_huggingface_token"

   # Optional: Configure proxy if needed
   # export http_proxy="your_http_proxy"
   # export https_proxy="your_https_proxy"
   # export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary

   source intel/set_env.sh
   cd /intel/cpu/xeon
   ```
   _Note: The compose file might read additional variables from `set_env.sh`. Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
For instance, edit `set_env.sh` to change the LLM model.
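A single line in `set_env.sh` can be edited to point at a different coder model (the model ID below is only an illustration, not a documented default):

```bash
# Illustrative edit inside set_env.sh: serve a different Hugging Face model
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-32B-Instruct"
```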
Wait several minutes for models to download (especially the first time) and services to initialize. Check container logs (`docker compose logs -f <service_name>`) or proceed to the validation steps below.
### Available Deployment Options
Different Docker Compose files are available to select the LLM serving backend.
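Assuming both files sit in the current deployment directory, the backend is selected by the compose file passed on the command line:

```bash
# Default vLLM backend
docker compose up -d

# TGI backend instead
docker compose -f compose_tgi.yaml up -d
```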
| Environment Variable | Description | Default / Example |
| --- | --- | --- |
| `HOST_IP` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
| `HF_TOKEN` | Your Hugging Face Hub token for model access. **Required.** | `your_huggingface_token` |
| `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by the TGI/vLLM service). Configured within the compose files. | `Qwen/Qwen2.5-Coder-7B-Instruct` |
| `EMBEDDING_MODEL_ID` | Hugging Face model ID for the embedding model (used by the TEI service). Configured within the compose files. | `BAAI/bge-base-en-v1.5` |
| `DATAPREP_ENDPOINT` | Internal URL for the Data Preparation service. Configured in the compose files. | `http://codegen-dataprep-server:80/dataprep` |
| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `HOST_IP` and port `7778`. | `http://${HOST_IP}:7778/v1/codegen` |
| `*_PORT` (internal) | Internal container ports (e.g., `80`, `6379`). Defined in the compose files. | N/A |
Most of these parameters are defined in `set_env.sh`; you can either modify this file or override the variables by setting them in your environment.
```shell
source CodeGen/docker_compose/set_env.sh
```
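As a small sketch of the override route, an individual variable can simply be exported in the same shell (the port value is illustrative; whether the export must come before or after sourcing depends on how `set_env.sh` assigns its defaults):

```bash
# Illustrative override of one of the port variables named in the note above
export LLM_SERVICE_PORT=9000
```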
#### Compose Files

Different Docker Compose files (`compose.yaml`, `compose_tgi.yaml`) control which LLM serving backend (vLLM or TGI) and its associated dependencies are started. Choose the appropriate compose file based on your requirements.
## Building Custom Images (Optional)
127
129
@@ -130,19 +132,20 @@ If you need to modify the microservices:
1. Clone the [OPEA GenAIComps](https://github.com/opea-project/GenAIComps) repository.
2. Follow build instructions in the respective component directories (e.g., `comps/llms/text-generation`, `comps/codegen`, `comps/ui/gradio`, etc.). Use the provided Dockerfiles (e.g., `CodeGen/Dockerfile`, `CodeGen/ui/docker/Dockerfile.gradio`).
3. Tag your custom images appropriately (e.g., `my-custom-codegen:latest`).
4. Update the `image:` fields in the compose files (`compose.yaml` or `compose_tgi.yaml`) to use your custom image tags (see the sketch below).
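As a sketch of the build-and-tag half of these steps (the build context and exact Dockerfile path are assumptions and may differ for the component you modified):

```bash
# Build a custom image from a repository checkout and tag it for the compose files
# (context path is an assumption; adjust to the component being rebuilt)
docker build -t my-custom-codegen:latest -f CodeGen/Dockerfile .
```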
_Refer to the main [CodeGen README](../../../../README.md) for links to relevant GenAIComps components._
## Validate Services
### Check Container Status
Ensure all containers associated with the chosen compose file are running:

```bash
# Example: docker compose ps                       # for vLLM (compose.yaml)
# Example: docker compose -f compose_tgi.yaml ps   # for TGI
```
Check logs for specific services: `docker compose logs <service_name>`
@@ -173,7 +176,7 @@ Use `curl` commands to test the main service endpoints. Ensure `HOST_IP` is corr
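As a minimal smoke-test sketch, assuming the gateway listens at the `BACKEND_SERVICE_ENDPOINT` shown above and accepts a simple `messages` payload (the request body shape is an assumption, not confirmed by this diff):

```bash
# Hypothetical request body; adjust fields to the actual CodeGen API schema
curl -s "http://${HOST_IP}:7778/v1/codegen" \
  -H "Content-Type: application/json" \
  -d '{"messages": "Write a Python function that reverses a string."}'
```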
## Accessing the User Interface (UI)
Multiple UI options can be configured via the compose files.
### Gradio UI (Default)
@@ -186,16 +189,16 @@ _(Port `5173` is the default host mapping for `codegen-gradio-ui-server`)_
### Svelte UI (Optional)
1. Modify the compose file (either `compose.yaml` or `compose_tgi.yaml`): Comment out the `codegen-gradio-ui-server` service and uncomment/add the `codegen-xeon-ui-server` (Svelte) service definition, ensuring the port mapping is correct (e.g., `"- 5173:5173"`).
2. Restart Docker Compose: `docker compose up -d` or `docker compose -f compose_tgi.yaml up -d`
3. Access: `http://{HOST_IP}:5173` (or the host port you mapped).
### React UI (Optional)

1. Modify the compose file (either `compose.yaml` or `compose_tgi.yaml`): Comment out the default UI service and uncomment/add the `codegen-xeon-react-ui-server` definition, ensuring correct port mapping (e.g., `"- 5174:80"`).
2. Restart Docker Compose: `docker compose up -d` or `docker compose -f compose_tgi.yaml up -d`
3. Access: `http://{HOST_IP}:5174` (or the host port you mapped).
@@ -218,21 +221,18 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e
- **Model Download Issues:** Check `HF_TOKEN`. Ensure internet connectivity or correct proxy settings. Check logs of `tgi-service`/`vllm-service` and `tei-embedding-server`. Gated models need prior Hugging Face access.
- **Connection Errors:** Verify `HOST_IP` is correct and accessible. Check `docker ps` for port mappings. Ensure `no_proxy` includes `HOST_IP` if using a proxy. Check logs of the service failing to connect (e.g., `codegen-backend-server` logs if it can't reach `codegen-llm-server`).
- **"Container name is in use"**: Stop existing containers (`docker compose down`) or change `container_name` in the compose file.
- **Resource Issues:** CodeGen models can be memory-intensive. Monitor host RAM usage. Increase Docker resources if needed.
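A few generic checks that map to the issues above (standard Docker and Linux commands, shown only as a sketch; the service name is an example from this guide):

```bash
# Confirm containers are up and see their port mappings
docker compose ps
# Inspect the logs of a service that fails to respond (example service name)
docker compose logs codegen-llm-server
# Watch host memory while large models load
free -h
```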
## Stopping the Application
```bash
docker compose down                      # for vLLM (compose.yaml)
# or
docker compose -f compose_tgi.yaml down  # for TGI
```
## Next Steps
- Consult the [OPEA GenAIComps](https://github.com/opea-project/GenAIComps) repository for details on individual microservices.
- Refer to the main [CodeGen README](../../../../README.md) for links to benchmarking and Kubernetes deployment options.