
Commit 26cb531

yinghu5, pre-commit-ci[bot], and Copilot authored
Update README.md of model/port change (opea-project#1969)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <[email protected]>

1 parent e9153b8 commit 26cb531

File tree

3 files changed: +76 −37 lines changed


CodeGen/docker_compose/intel/cpu/xeon/README.md

Lines changed: 28 additions & 17 deletions
@@ -52,18 +52,29 @@ This uses the default vLLM-based deployment profile (`codegen-xeon-vllm`).

 ```bash
 # Replace with your host's external IP address (do not use localhost or 127.0.0.1)
-export HOST_IP="your_external_ip_address"
+export host_ip="your_external_ip_address"
 # Replace with your Hugging Face Hub API token
 export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

 # Optional: Configure proxy if needed
 # export http_proxy="your_http_proxy"
 # export https_proxy="your_https_proxy"
-# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+# export no_proxy="localhost,127.0.0.1,${host_ip}" # Add other hosts if necessary
 source ../../../set_env.sh
 ```

-_Note: The compose file might read additional variables from a `.env` file or expect them defined elsewhere. Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
+_Note: The compose file may read additional variables from `set_env.sh`. Ensure all required variables, such as ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.), are set if not using defaults from the compose file._
+For example, the default model
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-32B-Instruct"
+```
+
+can be changed to a smaller model if needed:
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
+```

 2. **Start Services (vLLM Profile):**

@@ -91,7 +102,7 @@ The `compose.yaml` file uses Docker Compose profiles to select the LLM serving b
 - **Services Deployed:** `codegen-tgi-server`, `codegen-llm-server`, `codegen-tei-embedding-server`, `codegen-retriever-server`, `redis-vector-db`, `codegen-dataprep-server`, `codegen-backend-server`, `codegen-gradio-ui-server`.
 - **To Run:**
   ```bash
-  # Ensure environment variables (HOST_IP, HUGGINGFACEHUB_API_TOKEN) are set
+  # Ensure environment variables (host_ip, HUGGINGFACEHUB_API_TOKEN) are set
   docker compose --profile codegen-xeon-tgi up -d
   ```

@@ -103,14 +114,14 @@ Key parameters are configured via environment variables set before running `dock

 | Environment Variable | Description | Default (Set Externally) |
 | :--- | :--- | :--- |
-| `HOST_IP` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
+| `host_ip` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
 | `HUGGINGFACEHUB_API_TOKEN` | Your Hugging Face Hub token for model access. **Required.** | `your_huggingface_token` |
 | `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-7B-Instruct` |
 | `EMBEDDING_MODEL_ID` | Hugging Face model ID for the embedding model (used by TEI service). Configured within `compose.yaml` environment. | `BAAI/bge-base-en-v1.5` |
 | `LLM_ENDPOINT` | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`. | `http://codegen-tgi-server:80/generate` or `http://codegen-vllm-server:8000/v1/chat/completions` |
 | `TEI_EMBEDDING_ENDPOINT` | Internal URL for the Embedding service. Configured in `compose.yaml`. | `http://codegen-tei-embedding-server:80/embed` |
 | `DATAPREP_ENDPOINT` | Internal URL for the Data Preparation service. Configured in `compose.yaml`. | `http://codegen-dataprep-server:80/dataprep` |
-| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `HOST_IP` and port `7778`. | `http://${HOST_IP}:7778/v1/codegen` |
+| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `host_ip` and port `7778`. | `http://${host_ip}:7778/v1/codegen` |
 | `*_PORT` (Internal) | Internal container ports (e.g., `80`, `6379`). Defined in `compose.yaml`. | N/A |
 | `http_proxy` / `https_proxy` / `no_proxy` | Network proxy settings (if required). | `""` |

@@ -150,23 +161,23 @@ Check logs for specific services: `docker compose logs <service_name>`

 ### Run Validation Script/Commands

-Use `curl` commands to test the main service endpoints. Ensure `HOST_IP` is correctly set in your environment.
+Use `curl` commands to test the main service endpoints. Ensure `host_ip` is correctly set in your environment.

-1. **Validate LLM Serving Endpoint (Example for vLLM on default port 8000 internally, exposed differently):**
+1. **Validate LLM Serving Endpoint (Example for vLLM on default port 9000 internally, exposed differently):**

    ```bash
    # This command structure targets the OpenAI-compatible vLLM endpoint
-   curl http://${HOST_IP}:8000/v1/chat/completions \
+   curl http://${host_ip}:9000/v1/chat/completions \
      -X POST \
      -H 'Content-Type: application/json' \
-     -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
+     -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
    ```

 - **Expected Output:** A JSON response with generated code in `choices[0].message.content`.

 2. **Validate CodeGen Gateway (MegaService on default port 7778):**
    ```bash
-   curl http://${HOST_IP}:7778/v1/codegen \
+   curl http://${host_ip}:7778/v1/codegen \
      -H "Content-Type: application/json" \
      -d '{"messages": "Write a Python function that adds two numbers."}'
    ```
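
To pull only the generated text out of the first validation response, the JSON field named in **Expected Output** can be extracted with `jq`. A minimal sketch, assuming `jq` is installed; the request itself mirrors the updated command above:

```bash
# Same request as the validation command above, piped through jq so that
# only choices[0].message.content (the generated code) is printed.
curl -s http://${host_ip}:9000/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}' \
  | jq -r '.choices[0].message.content'
```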
@@ -179,8 +190,8 @@ Multiple UI options can be configured via the `compose.yaml`.
 ### Gradio UI (Default)

 Access the default Gradio UI by navigating to:
-`http://{HOST_IP}:8080`
-_(Port `8080` is the default host mapping for `codegen-gradio-ui-server`)_
+`http://{host_ip}:5173`
+_(Port `5173` is the default host mapping for `codegen-gradio-ui-server`)_

 ![Gradio UI - Code Generation](../../../../assets/img/codegen_gradio_ui_main.png)
 ![Gradio UI - Resource Management](../../../../assets/img/codegen_gradio_ui_dataprep.png)
@@ -189,15 +200,15 @@ _(Port `8080` is the default host mapping for `codegen-gradio-ui-server`)_

 1. Modify `compose.yaml`: Comment out the `codegen-gradio-ui-server` service and uncomment/add the `codegen-xeon-ui-server` (Svelte) service definition, ensuring the port mapping is correct (e.g., `"- 5173:5173"`).
 2. Restart Docker Compose: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5173` (or the host port you mapped).
+3. Access: `http://{host_ip}:5173` (or the host port you mapped).

 ![Svelte UI Init](../../../../assets/img/codeGen_ui_init.jpg)

 ### React UI (Optional)

 1. Modify `compose.yaml`: Comment out the default UI service and uncomment/add the `codegen-xeon-react-ui-server` definition, ensuring correct port mapping (e.g., `"- 5174:80"`).
 2. Restart Docker Compose: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5174` (or the host port you mapped).
+3. Access: `http://{host_ip}:5174` (or the host port you mapped).

 ![React UI](../../../../assets/img/codegen_react.png)

@@ -207,7 +218,7 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e

 1. **Install:** Find and install `Neural Copilot` from the VS Code Marketplace.
    ![Install Copilot](../../../../assets/img/codegen_copilot.png)
-2. **Configure:** Set the "Service URL" in the extension settings to your CodeGen backend endpoint: `http://${HOST_IP}:7778/v1/codegen` (use the correct port if changed).
+2. **Configure:** Set the "Service URL" in the extension settings to your CodeGen backend endpoint: `http://${host_ip}:7778/v1/codegen` (use the correct port if changed).
    ![Configure Endpoint](../../../../assets/img/codegen_endpoint.png)
 3. **Usage:**
    - **Inline Suggestion:** Type a comment describing the code you want (e.g., `# Python function to read a file`) and wait for suggestions.
@@ -218,7 +229,7 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e
 ## Troubleshooting

 - **Model Download Issues:** Check `HUGGINGFACEHUB_API_TOKEN`. Ensure internet connectivity or correct proxy settings. Check logs of `tgi-service`/`vllm-service` and `tei-embedding-server`. Gated models need prior Hugging Face access.
-- **Connection Errors:** Verify `HOST_IP` is correct and accessible. Check `docker ps` for port mappings. Ensure `no_proxy` includes `HOST_IP` if using a proxy. Check logs of the service failing to connect (e.g., `codegen-backend-server` logs if it can't reach `codegen-llm-server`).
+- **Connection Errors:** Verify `host_ip` is correct and accessible. Check `docker ps` for port mappings. Ensure `no_proxy` includes `host_ip` if using a proxy. Check logs of the service failing to connect (e.g., `codegen-backend-server` logs if it can't reach `codegen-llm-server`).
 - **"Container name is in use"**: Stop existing containers (`docker compose down`) or change `container_name` in `compose.yaml`.
 - **Resource Issues:** CodeGen models can be memory-intensive. Monitor host RAM usage. Increase Docker resources if needed.
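
Taken together, the Xeon changes above imply the following end-to-end flow. This is a minimal sketch, not part of the committed README: it assumes the `codegen-xeon-vllm` profile, the port mappings shown in the diff, and that `set_env.sh` preserves an already-exported `LLM_MODEL_ID` rather than overwriting it; all IP and token values are placeholders.

```bash
#!/usr/bin/env bash
# Sketch of the updated Xeon setup flow; values are placeholders.
export host_ip="your_external_ip_address"
export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

# Optional: pick the smaller coder model before sourcing the environment
# script (assumes set_env.sh keeps a pre-exported LLM_MODEL_ID).
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
source ../../../set_env.sh

# Start the default vLLM profile, then smoke-test the gateway on host port 7778.
docker compose --profile codegen-xeon-vllm up -d
curl http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"messages": "Write a Python function that adds two numbers."}'
```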

CodeGen/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 29 additions & 18 deletions
@@ -53,18 +53,29 @@ This uses the default vLLM-based deployment profile (`codegen-gaudi-vllm`).

 ```bash
 # Replace with your host's external IP address (do not use localhost or 127.0.0.1)
-export HOST_IP="your_external_ip_address"
+export host_ip="your_external_ip_address"
 # Replace with your Hugging Face Hub API token
 export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

 # Optional: Configure proxy if needed
 # export http_proxy="your_http_proxy"
 # export https_proxy="your_https_proxy"
-# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+# export no_proxy="localhost,127.0.0.1,${host_ip}" # Add other hosts if necessary
 source ../../../set_env.sh
 ```

-_Note: Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
+_Note: The compose file may read additional variables from `set_env.sh`. Ensure all required variables, such as ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.), are set if not using defaults from the compose file._
+For example, the default model
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-32B-Instruct"
+```
+
+can be changed to a smaller model if needed:
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
+```

 2. **Start Services (vLLM Profile):**

@@ -94,7 +105,7 @@ The `compose.yaml` file uses Docker Compose profiles to select the LLM serving b
 - **Other Services:** Same CPU-based services as the vLLM profile.
 - **To Run:**
   ```bash
-  # Ensure environment variables (HOST_IP, HUGGINGFACEHUB_API_TOKEN) are set
+  # Ensure environment variables (host_ip, HUGGINGFACEHUB_API_TOKEN) are set
  docker compose --profile codegen-gaudi-tgi up -d
   ```

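
Switching between the vLLM and TGI profiles is not shown in the diff; since service container names overlap between profiles (see the Troubleshooting notes on "Container name is in use"), the running profile likely needs to come down first. A minimal sketch, assuming the profile names used in this README:

```bash
# Hypothetical profile switch: stop the default vLLM stack, then start TGI.
docker compose --profile codegen-gaudi-vllm down
docker compose --profile codegen-gaudi-tgi up -d
```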

@@ -106,14 +117,14 @@ Key parameters are configured via environment variables set before running `dock
106117

107118
| Environment Variable | Description | Default (Set Externally) |
108119
| :-------------------------------------- | :------------------------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------------------------------- |
109-
| `HOST_IP` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
120+
| `host_ip` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
110121
| `HUGGINGFACEHUB_API_TOKEN` | Your Hugging Face Hub token for model access. **Required.** | `your_huggingface_token` |
111-
| `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-7B-Instruct` |
122+
| `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-32B-Instruct` |
112123
| `EMBEDDING_MODEL_ID` | Hugging Face model ID for the embedding model (used by TEI service). Configured within `compose.yaml` environment. | `BAAI/bge-base-en-v1.5` |
113124
| `LLM_ENDPOINT` | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`. | `http://codegen-tgi-server:80/generate` or `http://codegen-vllm-server:8000/v1/chat/completions` |
114125
| `TEI_EMBEDDING_ENDPOINT` | Internal URL for the Embedding service. Configured in `compose.yaml`. | `http://codegen-tei-embedding-server:80/embed` |
115126
| `DATAPREP_ENDPOINT` | Internal URL for the Data Preparation service. Configured in `compose.yaml`. | `http://codegen-dataprep-server:80/dataprep` |
116-
| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `HOST_IP` and port `7778`. | `http://${HOST_IP}:7778/v1/codegen` |
127+
| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `host_ip` and port `7778`. | `http://${host_ip}:7778/v1/codegen` |
117128
| `*_PORT` (Internal) | Internal container ports (e.g., `80`, `6379`). Defined in `compose.yaml`. | N/A |
118129
| `http_proxy` / `https_proxy`/`no_proxy` | Network proxy settings (if required). | `""` |
119130

@@ -170,21 +181,21 @@ Check logs: `docker compose logs <service_name>`. Pay attention to `vllm-gaudi-s

 ### Run Validation Script/Commands

-Use `curl` commands targeting the main service endpoints. Ensure `HOST_IP` is correctly set.
+Use `curl` commands targeting the main service endpoints. Ensure `host_ip` is correctly set.

-1. **Validate LLM Serving Endpoint (Example for vLLM on default port 8000 internally, exposed differently):**
+1. **Validate LLM Serving Endpoint (Example for vLLM on default port 9000 internally, exposed differently):**

    ```bash
    # This command structure targets the OpenAI-compatible vLLM endpoint
-   curl http://${HOST_IP}:8000/v1/chat/completions \
+   curl http://${host_ip}:9000/v1/chat/completions \
      -X POST \
      -H 'Content-Type: application/json' \
-     -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
+     -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
    ```

 2. **Validate CodeGen Gateway (MegaService, default host port 7778):**
    ```bash
-   curl http://${HOST_IP}:7778/v1/codegen \
+   curl http://${host_ip}:7778/v1/codegen \
      -H "Content-Type: application/json" \
      -d '{"messages": "Implement a sorting algorithm in Python."}'
    ```
@@ -197,26 +208,26 @@ UI options are similar to the Xeon deployment.
 ### Gradio UI (Default)

 Access the default Gradio UI:
-`http://{HOST_IP}:8080`
-_(Port `8080` is the default host mapping)_
+`http://{host_ip}:5173`
+_(Port `5173` is the default host mapping)_

 ![Gradio UI](../../../../assets/img/codegen_gradio_ui_main.png)

 ### Svelte UI (Optional)

 1. Modify `compose.yaml`: Swap Gradio service for Svelte (`codegen-gaudi-ui-server`), check port map (e.g., `5173:5173`).
 2. Restart: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5173`
+3. Access: `http://{host_ip}:5173`

 ### React UI (Optional)

 1. Modify `compose.yaml`: Swap Gradio service for React (`codegen-gaudi-react-ui-server`), check port map (e.g., `5174:80`).
 2. Restart: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5174`
+3. Access: `http://{host_ip}:5174`

 ### VS Code Extension (Optional)

-Use the `Neural Copilot` extension configured with the CodeGen backend URL: `http://${HOST_IP}:7778/v1/codegen`. (See Xeon README for detailed setup screenshots).
+Use the `Neural Copilot` extension configured with the CodeGen backend URL: `http://${host_ip}:7778/v1/codegen`. (See Xeon README for detailed setup screenshots).

 ## Troubleshooting

@@ -226,7 +237,7 @@ Use the `Neural Copilot` extension configured with the CodeGen backend URL: `htt
 - Verify `runtime: habana` and volume mounts in `compose.yaml`.
 - Gaudi initialization can take significant time and memory. Monitor resource usage.
 - **Model Download Issues:** Check `HUGGINGFACEHUB_API_TOKEN`, internet access, proxy settings. Check LLM service logs.
-- **Connection Errors:** Verify `HOST_IP`, ports, and proxy settings. Use `docker ps` and check service logs.
+- **Connection Errors:** Verify `host_ip`, ports, and proxy settings. Use `docker ps` and check service logs.

 ## Stopping the Application
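
Because the rename from `HOST_IP` to `host_ip` is easy to miss in scripts written against the old READMEs, a guard like the following can fail fast before `docker compose` is even involved. A minimal sketch, assuming the gateway is mapped to host port 7778 as in the diffs above; the 10-second timeout is an arbitrary choice:

```bash
# Abort with a clear message if the renamed variable is unset or empty.
: "${host_ip:?host_ip is not set - export it before starting the services}"

# Probe the CodeGen gateway; a timeout or error suggests checking docker ps,
# port mappings, and no_proxy settings as described in Troubleshooting.
curl --max-time 10 http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"messages": "Implement a sorting algorithm in Python."}' \
  || echo "Gateway not reachable at ${host_ip}:7778"
```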
