
Commit 26cb531

yinghu5, pre-commit-ci[bot], and Copilot authored
Update README.md of model/port change (opea-project#1969)

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Copilot <[email protected]>

1 parent e9153b8 commit 26cb531

File tree

3 files changed: +76 −37 lines changed


CodeGen/docker_compose/intel/cpu/xeon/README.md

Lines changed: 28 additions & 17 deletions
@@ -52,18 +52,29 @@ This uses the default vLLM-based deployment profile (`codegen-xeon-vllm`).

 ```bash
 # Replace with your host's external IP address (do not use localhost or 127.0.0.1)
-export HOST_IP="your_external_ip_address"
+export host_ip="your_external_ip_address"
 # Replace with your Hugging Face Hub API token
 export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

 # Optional: Configure proxy if needed
 # export http_proxy="your_http_proxy"
 # export https_proxy="your_https_proxy"
-# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+# export no_proxy="localhost,127.0.0.1,${host_ip}" # Add other hosts if necessary
 source ../../../set_env.sh
 ```

-_Note: The compose file might read additional variables from a `.env` file or expect them defined elsewhere. Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
+_Note: The compose file may read additional variables from `set_env.sh`. Ensure all required variables, such as ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.), are set if not using defaults from the compose file._
+For example, the default model
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-32B-Instruct"
+```
+
+can be changed to a smaller model if needed:
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
+```

 2. **Start Services (vLLM Profile):**

@@ -91,7 +102,7 @@ The `compose.yaml` file uses Docker Compose profiles to select the LLM serving b
 - **Services Deployed:** `codegen-tgi-server`, `codegen-llm-server`, `codegen-tei-embedding-server`, `codegen-retriever-server`, `redis-vector-db`, `codegen-dataprep-server`, `codegen-backend-server`, `codegen-gradio-ui-server`.
 - **To Run:**
   ```bash
-  # Ensure environment variables (HOST_IP, HUGGINGFACEHUB_API_TOKEN) are set
+  # Ensure environment variables (host_ip, HUGGINGFACEHUB_API_TOKEN) are set
   docker compose --profile codegen-xeon-tgi up -d
   ```

@@ -103,14 +114,14 @@ Key parameters are configured via environment variables set before running `dock

 | Environment Variable | Description | Default (Set Externally) |
 | :--- | :--- | :--- |
-| `HOST_IP` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
+| `host_ip` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
 | `HUGGINGFACEHUB_API_TOKEN` | Your Hugging Face Hub token for model access. **Required.** | `your_huggingface_token` |
 | `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-7B-Instruct` |
 | `EMBEDDING_MODEL_ID` | Hugging Face model ID for the embedding model (used by TEI service). Configured within `compose.yaml` environment. | `BAAI/bge-base-en-v1.5` |
 | `LLM_ENDPOINT` | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`. | `http://codegen-tgi-server:80/generate` or `http://codegen-vllm-server:8000/v1/chat/completions` |
 | `TEI_EMBEDDING_ENDPOINT` | Internal URL for the Embedding service. Configured in `compose.yaml`. | `http://codegen-tei-embedding-server:80/embed` |
 | `DATAPREP_ENDPOINT` | Internal URL for the Data Preparation service. Configured in `compose.yaml`. | `http://codegen-dataprep-server:80/dataprep` |
-| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `HOST_IP` and port `7778`. | `http://${HOST_IP}:7778/v1/codegen` |
+| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `host_ip` and port `7778`. | `http://${host_ip}:7778/v1/codegen` |
 | `*_PORT` (Internal) | Internal container ports (e.g., `80`, `6379`). Defined in `compose.yaml`. | N/A |
 | `http_proxy` / `https_proxy` / `no_proxy` | Network proxy settings (if required). | `""` |

@@ -150,23 +161,23 @@ Check logs for specific services: `docker compose logs <service_name>`

 ### Run Validation Script/Commands

-Use `curl` commands to test the main service endpoints. Ensure `HOST_IP` is correctly set in your environment.
+Use `curl` commands to test the main service endpoints. Ensure `host_ip` is correctly set in your environment.

-1. **Validate LLM Serving Endpoint (Example for vLLM on default port 8000 internally, exposed differently):**
+1. **Validate LLM Serving Endpoint (Example for vLLM on default port 9000 internally, exposed differently):**

    ```bash
    # This command structure targets the OpenAI-compatible vLLM endpoint
-   curl http://${HOST_IP}:8000/v1/chat/completions \
+   curl http://${host_ip}:9000/v1/chat/completions \
      -X POST \
      -H 'Content-Type: application/json' \
-     -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
+     -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
    ```

 - **Expected Output:** A JSON response with generated code in `choices[0].message.content`.

 2. **Validate CodeGen Gateway (MegaService on default port 7778):**
    ```bash
-   curl http://${HOST_IP}:7778/v1/codegen \
+   curl http://${host_ip}:7778/v1/codegen \
      -H "Content-Type: application/json" \
      -d '{"messages": "Write a Python function that adds two numbers."}'
    ```
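
To pull only the generated text out of the first validation response, the JSON field named in **Expected Output** can be extracted with `jq`. A minimal sketch, assuming `jq` is installed; the request itself mirrors the updated command above:

```bash
# Same request as the validation command above, piped through jq so that
# only choices[0].message.content (the generated code) is printed.
curl -s http://${host_ip}:9000/v1/chat/completions \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}' \
  | jq -r '.choices[0].message.content'
```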
@@ -179,8 +190,8 @@ Multiple UI options can be configured via the `compose.yaml`.
 ### Gradio UI (Default)

 Access the default Gradio UI by navigating to:
-`http://{HOST_IP}:8080`
-_(Port `8080` is the default host mapping for `codegen-gradio-ui-server`)_
+`http://{host_ip}:5173`
+_(Port `5173` is the default host mapping for `codegen-gradio-ui-server`)_

 ![Gradio UI - Code Generation](../../../../assets/img/codegen_gradio_ui_main.png)
 ![Gradio UI - Resource Management](../../../../assets/img/codegen_gradio_ui_dataprep.png)
@@ -189,15 +200,15 @@ _(Port `8080` is the default host mapping for `codegen-gradio-ui-server`)_

 1. Modify `compose.yaml`: Comment out the `codegen-gradio-ui-server` service and uncomment/add the `codegen-xeon-ui-server` (Svelte) service definition, ensuring the port mapping is correct (e.g., `"- 5173:5173"`).
 2. Restart Docker Compose: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5173` (or the host port you mapped).
+3. Access: `http://{host_ip}:5173` (or the host port you mapped).

 ![Svelte UI Init](../../../../assets/img/codeGen_ui_init.jpg)

 ### React UI (Optional)

 1. Modify `compose.yaml`: Comment out the default UI service and uncomment/add the `codegen-xeon-react-ui-server` definition, ensuring correct port mapping (e.g., `"- 5174:80"`).
 2. Restart Docker Compose: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5174` (or the host port you mapped).
+3. Access: `http://{host_ip}:5174` (or the host port you mapped).

 ![React UI](../../../../assets/img/codegen_react.png)

@@ -207,7 +218,7 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e

 1. **Install:** Find and install `Neural Copilot` from the VS Code Marketplace.
    ![Install Copilot](../../../../assets/img/codegen_copilot.png)
-2. **Configure:** Set the "Service URL" in the extension settings to your CodeGen backend endpoint: `http://${HOST_IP}:7778/v1/codegen` (use the correct port if changed).
+2. **Configure:** Set the "Service URL" in the extension settings to your CodeGen backend endpoint: `http://${host_ip}:7778/v1/codegen` (use the correct port if changed).
    ![Configure Endpoint](../../../../assets/img/codegen_endpoint.png)
 3. **Usage:**
    - **Inline Suggestion:** Type a comment describing the code you want (e.g., `# Python function to read a file`) and wait for suggestions.
@@ -218,7 +229,7 @@ Users can interact with the backend service using the `Neural Copilot` VS Code e
 ## Troubleshooting

 - **Model Download Issues:** Check `HUGGINGFACEHUB_API_TOKEN`. Ensure internet connectivity or correct proxy settings. Check logs of `tgi-service`/`vllm-service` and `tei-embedding-server`. Gated models need prior Hugging Face access.
-- **Connection Errors:** Verify `HOST_IP` is correct and accessible. Check `docker ps` for port mappings. Ensure `no_proxy` includes `HOST_IP` if using a proxy. Check logs of the service failing to connect (e.g., `codegen-backend-server` logs if it can't reach `codegen-llm-server`).
+- **Connection Errors:** Verify `host_ip` is correct and accessible. Check `docker ps` for port mappings. Ensure `no_proxy` includes `host_ip` if using a proxy. Check logs of the service failing to connect (e.g., `codegen-backend-server` logs if it can't reach `codegen-llm-server`).
 - **"Container name is in use"**: Stop existing containers (`docker compose down`) or change `container_name` in `compose.yaml`.
 - **Resource Issues:** CodeGen models can be memory-intensive. Monitor host RAM usage. Increase Docker resources if needed.
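
Taken together, the Xeon changes above imply the following end-to-end flow. This is a minimal sketch, not part of the committed README: it assumes the `codegen-xeon-vllm` profile, the port mappings shown in the diff, and that `set_env.sh` preserves an already-exported `LLM_MODEL_ID` rather than overwriting it; all IP and token values are placeholders.

```bash
#!/usr/bin/env bash
# Sketch of the updated Xeon setup flow; values are placeholders.
export host_ip="your_external_ip_address"
export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

# Optional: pick the smaller coder model before sourcing the environment
# script (assumes set_env.sh keeps a pre-exported LLM_MODEL_ID).
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
source ../../../set_env.sh

# Start the default vLLM profile, then smoke-test the gateway on host port 7778.
docker compose --profile codegen-xeon-vllm up -d
curl http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"messages": "Write a Python function that adds two numbers."}'
```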

CodeGen/docker_compose/intel/hpu/gaudi/README.md

Lines changed: 29 additions & 18 deletions
@@ -53,18 +53,29 @@ This uses the default vLLM-based deployment profile (`codegen-gaudi-vllm`).

 ```bash
 # Replace with your host's external IP address (do not use localhost or 127.0.0.1)
-export HOST_IP="your_external_ip_address"
+export host_ip="your_external_ip_address"
 # Replace with your Hugging Face Hub API token
 export HUGGINGFACEHUB_API_TOKEN="your_huggingface_token"

 # Optional: Configure proxy if needed
 # export http_proxy="your_http_proxy"
 # export https_proxy="your_https_proxy"
-# export no_proxy="localhost,127.0.0.1,${HOST_IP}" # Add other hosts if necessary
+# export no_proxy="localhost,127.0.0.1,${host_ip}" # Add other hosts if necessary
 source ../../../set_env.sh
 ```

-_Note: Ensure all required variables like ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.) are set if not using defaults from the compose file._
+_Note: The compose file may read additional variables from `set_env.sh`. Ensure all required variables, such as ports (`LLM_SERVICE_PORT`, `MEGA_SERVICE_PORT`, etc.), are set if not using defaults from the compose file._
+For example, the default model
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-32B-Instruct"
+```
+
+can be changed to a smaller model if needed:
+
+```
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
+```

 2. **Start Services (vLLM Profile):**

@@ -94,7 +105,7 @@ The `compose.yaml` file uses Docker Compose profiles to select the LLM serving b
 - **Other Services:** Same CPU-based services as the vLLM profile.
 - **To Run:**
   ```bash
-  # Ensure environment variables (HOST_IP, HUGGINGFACEHUB_API_TOKEN) are set
+  # Ensure environment variables (host_ip, HUGGINGFACEHUB_API_TOKEN) are set
  docker compose --profile codegen-gaudi-tgi up -d
   ```

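
Switching between the vLLM and TGI profiles is not shown in the diff; since service container names overlap between profiles (see the Troubleshooting notes on "Container name is in use"), the running profile likely needs to come down first. A minimal sketch, assuming the profile names used in this README:

```bash
# Hypothetical profile switch: stop the default vLLM stack, then start TGI.
docker compose --profile codegen-gaudi-vllm down
docker compose --profile codegen-gaudi-tgi up -d
```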

@@ -106,14 +117,14 @@ Key parameters are configured via environment variables set before running `dock
106117

107118
| Environment Variable | Description | Default (Set Externally) |
108119
| :-------------------------------------- | :------------------------------------------------------------------------------------------------------------------ | :----------------------------------------------------------------------------------------------- |
109-
| `HOST_IP` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
120+
| `host_ip` | External IP address of the host machine. **Required.** | `your_external_ip_address` |
110121
| `HUGGINGFACEHUB_API_TOKEN` | Your Hugging Face Hub token for model access. **Required.** | `your_huggingface_token` |
111-
| `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-7B-Instruct` |
122+
| `LLM_MODEL_ID` | Hugging Face model ID for the CodeGen LLM (used by TGI/vLLM service). Configured within `compose.yaml` environment. | `Qwen/Qwen2.5-Coder-32B-Instruct` |
112123
| `EMBEDDING_MODEL_ID` | Hugging Face model ID for the embedding model (used by TEI service). Configured within `compose.yaml` environment. | `BAAI/bge-base-en-v1.5` |
113124
| `LLM_ENDPOINT` | Internal URL for the LLM serving endpoint (used by `codegen-llm-server`). Configured in `compose.yaml`. | `http://codegen-tgi-server:80/generate` or `http://codegen-vllm-server:8000/v1/chat/completions` |
114125
| `TEI_EMBEDDING_ENDPOINT` | Internal URL for the Embedding service. Configured in `compose.yaml`. | `http://codegen-tei-embedding-server:80/embed` |
115126
| `DATAPREP_ENDPOINT` | Internal URL for the Data Preparation service. Configured in `compose.yaml`. | `http://codegen-dataprep-server:80/dataprep` |
116-
| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `HOST_IP` and port `7778`. | `http://${HOST_IP}:7778/v1/codegen` |
127+
| `BACKEND_SERVICE_ENDPOINT` | External URL for the CodeGen Gateway (MegaService). Derived from `host_ip` and port `7778`. | `http://${host_ip}:7778/v1/codegen` |
117128
| `*_PORT` (Internal) | Internal container ports (e.g., `80`, `6379`). Defined in `compose.yaml`. | N/A |
118129
| `http_proxy` / `https_proxy`/`no_proxy` | Network proxy settings (if required). | `""` |
119130

@@ -170,21 +181,21 @@ Check logs: `docker compose logs <service_name>`. Pay attention to `vllm-gaudi-s

 ### Run Validation Script/Commands

-Use `curl` commands targeting the main service endpoints. Ensure `HOST_IP` is correctly set.
+Use `curl` commands targeting the main service endpoints. Ensure `host_ip` is correctly set.

-1. **Validate LLM Serving Endpoint (Example for vLLM on default port 8000 internally, exposed differently):**
+1. **Validate LLM Serving Endpoint (Example for vLLM on default port 9000 internally, exposed differently):**

    ```bash
    # This command structure targets the OpenAI-compatible vLLM endpoint
-   curl http://${HOST_IP}:8000/v1/chat/completions \
+   curl http://${host_ip}:9000/v1/chat/completions \
      -X POST \
      -H 'Content-Type: application/json' \
-     -d '{"model": "Qwen/Qwen2.5-Coder-7B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
+     -d '{"model": "Qwen/Qwen2.5-Coder-32B-Instruct", "messages": [{"role": "user", "content": "Implement a basic Python class"}], "max_tokens":32}'
    ```

 2. **Validate CodeGen Gateway (MegaService, default host port 7778):**
    ```bash
-   curl http://${HOST_IP}:7778/v1/codegen \
+   curl http://${host_ip}:7778/v1/codegen \
      -H "Content-Type: application/json" \
      -d '{"messages": "Implement a sorting algorithm in Python."}'
    ```
@@ -197,26 +208,26 @@ UI options are similar to the Xeon deployment.
 ### Gradio UI (Default)

 Access the default Gradio UI:
-`http://{HOST_IP}:8080`
-_(Port `8080` is the default host mapping)_
+`http://{host_ip}:5173`
+_(Port `5173` is the default host mapping)_

 ![Gradio UI](../../../../assets/img/codegen_gradio_ui_main.png)

 ### Svelte UI (Optional)

 1. Modify `compose.yaml`: Swap Gradio service for Svelte (`codegen-gaudi-ui-server`), check port map (e.g., `5173:5173`).
 2. Restart: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5173`
+3. Access: `http://{host_ip}:5173`

 ### React UI (Optional)

 1. Modify `compose.yaml`: Swap Gradio service for React (`codegen-gaudi-react-ui-server`), check port map (e.g., `5174:80`).
 2. Restart: `docker compose --profile <profile_name> up -d`
-3. Access: `http://{HOST_IP}:5174`
+3. Access: `http://{host_ip}:5174`

 ### VS Code Extension (Optional)

-Use the `Neural Copilot` extension configured with the CodeGen backend URL: `http://${HOST_IP}:7778/v1/codegen`. (See Xeon README for detailed setup screenshots).
+Use the `Neural Copilot` extension configured with the CodeGen backend URL: `http://${host_ip}:7778/v1/codegen`. (See Xeon README for detailed setup screenshots).

 ## Troubleshooting

@@ -226,7 +237,7 @@ Use the `Neural Copilot` extension configured with the CodeGen backend URL: `htt
 - Verify `runtime: habana` and volume mounts in `compose.yaml`.
 - Gaudi initialization can take significant time and memory. Monitor resource usage.
 - **Model Download Issues:** Check `HUGGINGFACEHUB_API_TOKEN`, internet access, proxy settings. Check LLM service logs.
-- **Connection Errors:** Verify `HOST_IP`, ports, and proxy settings. Use `docker ps` and check service logs.
+- **Connection Errors:** Verify `host_ip`, ports, and proxy settings. Use `docker ps` and check service logs.

 ## Stopping the Application
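
Because the rename from `HOST_IP` to `host_ip` is easy to miss in scripts written against the old READMEs, a guard like the following can fail fast before `docker compose` is even involved. A minimal sketch, assuming the gateway is mapped to host port 7778 as in the diffs above; the 10-second timeout is an arbitrary choice:

```bash
# Abort with a clear message if the renamed variable is unset or empty.
: "${host_ip:?host_ip is not set - export it before starting the services}"

# Probe the CodeGen gateway; a timeout or error suggests checking docker ps,
# port mappings, and no_proxy settings as described in Troubleshooting.
curl --max-time 10 http://${host_ip}:7778/v1/codegen \
  -H "Content-Type: application/json" \
  -d '{"messages": "Implement a sorting algorithm in Python."}' \
  || echo "Gateway not reachable at ${host_ip}:7778"
```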
