diff --git a/ChatQnA/docker_compose/intel/cpu/aipc/README_mariadb.md b/ChatQnA/docker_compose/intel/cpu/aipc/README_mariadb.md
new file mode 100644
index 0000000000..118dd26f63
--- /dev/null
+++ b/ChatQnA/docker_compose/intel/cpu/aipc/README_mariadb.md
@@ -0,0 +1,206 @@
+# Build Mega Service of ChatQnA on AIPC
+
+This document outlines the deployment process for a ChatQnA application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on AIPC. The steps include Docker image creation, container deployment via Docker Compose, and service execution to integrate microservices such as `embedding`, `retriever`, `rerank`, and `llm`.
+
+## Quick Start:
+
+1. Set up the environment variables.
+2. Run Docker Compose.
+3. Consume the ChatQnA Service.
+
+### Quick Start: 1. Set up environment variables
+
+1. Set the default environment:
+
+   ```bash
+   source ./set_env_mariadb.sh
+   ```
+
+2. You need to set a Hugging Face access token for your account:
+
+   ```bash
+   export HUGGINGFACEHUB_API_TOKEN="Your_Huggingface_API_Token"
+   ```
+
+3. _Set up the model (optional)_
+
+   _By default, **llama3.2** is used for LLM serving; the default can be changed to another LLM model. Please pick a [validated LLM model](https://github.com/opea-project/GenAIComps/tree/main/comps/llms/src/text-generation#validated-llm-models) from the table._
+   _To change the default model defined in `set_env_mariadb.sh`, override it by exporting `OLLAMA_MODEL` with the new model name or by modifying `set_env_mariadb.sh`._
+
+   _For example, to use the [DeepSeek-R1-Distill-Llama-8B model](https://ollama.com/library/deepseek-r1:8b), set:_
+
+   ```bash
+   export OLLAMA_MODEL="deepseek-r1:8b"
+   ```
+
+### Quick Start: 2. Run Docker Compose
+
+```bash
+docker compose -f compose_mariadb.yaml up -d
+```
+
+---
+
+_You should build the Docker images from source yourself if_:
+
+- _You are developing off the git main branch (as the container ports in the repo may differ from those in the published Docker image)._
+- _You can't download the Docker image._
+- _You want to use a specific Docker image version._
+
+Please refer to ['Build Docker Images'](#🚀-build-docker-images) below.
+
+---
+
+### Quick Start: 3. Consume the ChatQnA Service
+
+Once the services are up, open `http://localhost` in your browser and experiment with prompts.
+
+## 🚀 Build Docker Images
+
+```bash
+# Clone the AI components repository
+mkdir -p ~/OPEA
+cd ~/OPEA
+git clone https://github.com/opea-project/GenAIComps.git
+cd GenAIComps
+```
+
+If you are in a proxy environment, set the proxy-related environment variables:
+
+```bash
+export http_proxy="Your_HTTP_Proxy"
+export https_proxy="Your_HTTPs_Proxy"
+```
+
+### 1. Build Retriever Image
+
+```bash
+docker build --no-cache -t opea/retriever:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/retrievers/src/Dockerfile .
+```
+
+### 2. Build Dataprep Image
+
+```bash
+docker build --no-cache -t opea/dataprep:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/dataprep/src/Dockerfile .
+cd ..
+```
+
+### 3. Build MegaService Docker Image
+
+To construct the Mega Service, we utilize the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline within the `chatqna.py` Python script.
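+
+For reference, `chatqna.py` composes these microservices into a single RAG flow using the GenAIComps orchestration classes. The snippet below is a minimal, illustrative sketch only, assuming the `ServiceOrchestrator`/`MicroService` API from GenAIComps; the hostnames and ports simply mirror the services in `compose_mariadb.yaml`, and the actual wiring in `chatqna.py` is authoritative.
+
+```python
+# Illustrative sketch of the megaservice wiring -- not the literal contents of chatqna.py.
+from comps import MicroService, ServiceOrchestrator, ServiceType
+
+megaservice = ServiceOrchestrator()
+
+# Remote microservices, named after the compose_mariadb.yaml services (assumed here).
+embedding = MicroService(name="embedding", host="tei-embedding-service", port=80,
+                         endpoint="/embed", use_remote_service=True,
+                         service_type=ServiceType.EMBEDDING)
+retriever = MicroService(name="retriever", host="retriever", port=7000,
+                         endpoint="/v1/retrieval", use_remote_service=True,
+                         service_type=ServiceType.RETRIEVER)
+rerank = MicroService(name="rerank", host="tei-reranking-service", port=80,
+                      endpoint="/rerank", use_remote_service=True,
+                      service_type=ServiceType.RERANK)
+llm = MicroService(name="llm", host="ollama-service", port=11434,
+                   endpoint="/api/generate", use_remote_service=True,
+                   service_type=ServiceType.LLM)
+
+# Register the services and define the RAG pipeline: embed -> retrieve -> rerank -> generate.
+megaservice.add(embedding).add(retriever).add(rerank).add(llm)
+embedding.flow_to(retriever)
+retriever.flow_to(rerank)
+rerank.flow_to(llm)
+```
+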
+Build the MegaService Docker image with the command below:
+
+```bash
+cd ~/OPEA
+git clone https://github.com/opea-project/GenAIExamples.git
+cd GenAIExamples/ChatQnA
+docker build --no-cache -t opea/chatqna:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+```
+
+### 4. Build UI Docker Image
+
+Build the frontend Docker image with the command below:
+
+```bash
+cd ~/OPEA/GenAIExamples/ChatQnA/ui
+docker build --no-cache -t opea/chatqna-ui:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f ./docker/Dockerfile .
+```
+
+### 5. Build Nginx Docker Image
+
+```bash
+cd ~/OPEA/GenAIComps
+docker build -t opea/nginx:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/third_parties/nginx/src/Dockerfile .
+```
+
+Finally, `docker image ls` should list the following images:
+
+```
+opea/dataprep:latest
+opea/retriever:latest
+opea/chatqna:latest
+opea/chatqna-ui:latest
+opea/nginx:latest
+```
+
+After starting the services, if you want to check each service individually, please refer to the section below.
+
+### Validate Microservices
+
+You can validate each microservice by making individual requests.
+For details on how to verify the correctness of the response, refer to [how-to-validate_service](../../hpu/gaudi/how_to_validate_service.md).
+
+1. TEI Embedding Service
+
+   ```bash
+   curl ${host_ip}:6006/embed \
+       -X POST \
+       -d '{"inputs":"What is Deep Learning?"}' \
+       -H 'Content-Type: application/json'
+   ```
+
+2. Retriever Microservice
+
+   ```bash
+   # Mock the embedding vector
+   export your_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)")
+   ```

+   ```bash
+   # Perform a similarity search
+   curl http://${host_ip}:7000/v1/retrieval \
+     -X POST \
+     -d '{"text":"What is the revenue of Nike in 2023?","embedding":'"${your_embedding}"'}' \
+     -H 'Content-Type: application/json'
+   ```
+
+3. TEI Reranking Service
+
+   ```bash
+   curl http://${host_ip}:8808/rerank \
+       -X POST \
+       -d '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' \
+       -H 'Content-Type: application/json'
+   ```
+
+4. Ollama Service
+
+   ```bash
+   curl http://${host_ip}:11434/api/generate -d '{"model": "llama3.2", "prompt":"What is Deep Learning?"}'
+   ```
+
+5. MegaService
+
+   ```bash
+   curl http://${host_ip}:8888/v1/chatqna -H "Content-Type: application/json" -d '{
+        "messages": "What is the revenue of Nike in 2023?"
+        }'
+   ```
+
+6.
DataPrep Service + + Try the `ingest - get - delete` endpoints: + + ```bash + # Get a sample file to ingest + wget https://raw.githubusercontent.com/opea-project/GenAIComps/v1.1/comps/retrievers/redis/data/nke-10k-2023.pdf + ``` + + ```bash + # Ingest + curl -X POST "http://${host_ip}:6007/v1/dataprep/ingest" \ + -H "Content-Type: multipart/form-data" \ + -F "files=@./nke-10k-2023.pdf" + ``` + + ```bash + # Get + curl -X POST "http://${host_ip}:6007/v1/dataprep/get" \ + -H "Content-Type: application/json" + ``` + + ```bash + # Delete all + curl -X POST "http://${host_ip}:6007/v1/dataprep/delete" \ + -H "Content-Type: application/json" \ + -d '{"file_path": "all"}' + ``` diff --git a/ChatQnA/docker_compose/intel/cpu/aipc/compose_mariadb.yaml b/ChatQnA/docker_compose/intel/cpu/aipc/compose_mariadb.yaml new file mode 100644 index 0000000000..f53c2b8ea7 --- /dev/null +++ b/ChatQnA/docker_compose/intel/cpu/aipc/compose_mariadb.yaml @@ -0,0 +1,181 @@ +# Copyright (C) 2025 MariaDB Foundation +# SPDX-License-Identifier: Apache-2.0 + +services: + mariadb-server: + image: mariadb:latest + container_name: mariadb-server + ports: + - "3306:3306" + environment: + - MARIADB_DATABASE=${MARIADB_DATABASE} + - MARIADB_USER=${MARIADB_USER} + - MARIADB_PASSWORD=${MARIADB_PASSWORD} + - MARIADB_RANDOM_ROOT_PASSWORD=1 + healthcheck: + test: ["CMD", "healthcheck.sh", "--connect", "--innodb_initialized"] + start_period: 10s + interval: 10s + timeout: 5s + retries: 3 + + dataprep-mariadb-vector: + image: ${REGISTRY:-opea}/dataprep:${TAG:-latest} + container_name: dataprep-mariadb-vector + depends_on: + mariadb-server: + condition: service_healthy + tei-embedding-service: + condition: service_started + ports: + - "6007:5000" + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + DATAPREP_COMPONENT_NAME: "OPEA_DATAPREP_MARIADBVECTOR" + MARIADB_CONNECTION_URL: mariadb+mariadbconnector://${MARIADB_USER}:${MARIADB_PASSWORD}@mariadb-server:3306/${MARIADB_DATABASE} + TEI_ENDPOINT: http://tei-embedding-service:80 + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + healthcheck: + test: ["CMD-SHELL", "curl -f http://localhost:5000/v1/health_check || exit 1"] + interval: 10s + timeout: 5s + retries: 50 + restart: unless-stopped + tei-embedding-service: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 + container_name: tei-embedding-server + ports: + - "6006:80" + volumes: + - "./data:/data" + shm_size: 1g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + command: --model-id ${EMBEDDING_MODEL_ID} --auto-truncate + retriever: + image: ${REGISTRY:-opea}/retriever:${TAG:-latest} + container_name: retriever-mariadb-vector + depends_on: + mariadb-server: + condition: service_healthy + ports: + - "7000:7000" + ipc: host + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + MARIADB_CONNECTION_URL: mariadb+mariadbconnector://${MARIADB_USER}:${MARIADB_PASSWORD}@mariadb-server:3306/${MARIADB_DATABASE} + HUGGINGFACEHUB_API_TOKEN: ${HUGGINGFACEHUB_API_TOKEN} + LOGFLAG: ${LOGFLAG} + RETRIEVER_COMPONENT_NAME: "OPEA_RETRIEVER_MARIADBVECTOR" + restart: unless-stopped + tei-reranking-service: + image: ghcr.io/huggingface/text-embeddings-inference:cpu-1.6 + container_name: tei-reranking-server + ports: + - "8808:80" + volumes: + - "./data:/data" + shm_size: 1g + environment: + no_proxy: ${no_proxy} + http_proxy: ${http_proxy} + https_proxy: ${https_proxy} + HUGGINGFACEHUB_API_TOKEN: 
${HUGGINGFACEHUB_API_TOKEN} + HF_HUB_DISABLE_PROGRESS_BARS: 1 + HF_HUB_ENABLE_HF_TRANSFER: 0 + command: --model-id ${RERANK_MODEL_ID} --auto-truncate + ollama-service: + image: ollama/ollama + container_name: ollama + ports: + - "11434:11434" + volumes: + - ollama:/root/.ollama + entrypoint: ["bash", "-c"] + command: ["ollama serve & sleep 10 && ollama run ${OLLAMA_MODEL} & wait"] + environment: + no_proxy: ${no_proxy} + https_proxy: ${https_proxy} + OLLAMA_MODEL: ${OLLAMA_MODEL} + + chatqna-aipc-backend-server: + image: ${REGISTRY:-opea}/chatqna:${TAG:-latest} + container_name: chatqna-aipc-backend-server + depends_on: + mariadb-server: + condition: service_healthy + dataprep-mariadb-vector: + condition: service_healthy + tei-embedding-service: + condition: service_started + retriever: + condition: service_started + tei-reranking-service: + condition: service_started + ollama-service: + condition: service_started + ports: + - "8888:8888" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - MEGA_SERVICE_HOST_IP=chatqna-aipc-backend-server + - EMBEDDING_SERVER_HOST_IP=tei-embedding-service + - EMBEDDING_SERVER_PORT=80 + - RETRIEVER_SERVICE_HOST_IP=retriever + - RERANK_SERVER_HOST_IP=tei-reranking-service + - RERANK_SERVER_PORT=80 + - LLM_SERVER_HOST_IP=ollama-service + - LLM_SERVER_PORT=11434 + - LLM_MODEL=${OLLAMA_MODEL} + - LOGFLAG=${LOGFLAG} + ipc: host + restart: always + chatqna-aipc-ui-server: + image: ${REGISTRY:-opea}/chatqna-ui:${TAG:-latest} + container_name: chatqna-aipc-ui-server + depends_on: + - chatqna-aipc-backend-server + ports: + - "5173:5173" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + ipc: host + restart: always + chatqna-aipc-nginx-server: + image: ${REGISTRY:-opea}/nginx:${TAG:-latest} + container_name: chatqna-aipc-nginx-server + depends_on: + - chatqna-aipc-backend-server + - chatqna-aipc-ui-server + ports: + - "${NGINX_PORT:-80}:80" + environment: + - no_proxy=${no_proxy} + - https_proxy=${https_proxy} + - http_proxy=${http_proxy} + - FRONTEND_SERVICE_IP=chatqna-aipc-ui-server + - FRONTEND_SERVICE_PORT=5173 + - BACKEND_SERVICE_NAME=chatqna + - BACKEND_SERVICE_IP=chatqna-aipc-backend-server + - BACKEND_SERVICE_PORT=8888 + - DATAPREP_SERVICE_IP=dataprep-mariadb-vector + - DATAPREP_SERVICE_PORT=5000 + ipc: host + restart: always + +volumes: + ollama: + +networks: + default: + driver: bridge diff --git a/ChatQnA/docker_compose/intel/cpu/aipc/set_env_mariadb.sh b/ChatQnA/docker_compose/intel/cpu/aipc/set_env_mariadb.sh new file mode 100644 index 0000000000..4070051781 --- /dev/null +++ b/ChatQnA/docker_compose/intel/cpu/aipc/set_env_mariadb.sh @@ -0,0 +1,29 @@ +#!/usr/bin/env bash + +# Copyright (C) 2025 MariaDB Foundation +# SPDX-License-Identifier: Apache-2.0 + +pushd "../../../../../" > /dev/null +source .set_env.sh +popd > /dev/null + +export host_ip=$(hostname -I | awk '{print $1}') + +if [ -z "${HUGGINGFACEHUB_API_TOKEN}" ]; then + echo "Error: HUGGINGFACEHUB_API_TOKEN is not set. Please set HUGGINGFACEHUB_API_TOKEN." +fi + +if [ -z "${host_ip}" ]; then + echo "Error: host_ip is not set. Please set host_ip first." 
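+    # NOTE: these checks only print a warning and do not exit, since this script is
+    # meant to be sourced into the caller's shell; verify both values before continuing.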
+fi
+
+export MARIADB_DATABASE="vectordb"
+export MARIADB_USER="chatqna"
+export MARIADB_PASSWORD="password"
+export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
+export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+export OLLAMA_MODEL="llama3.2"
+# Set LOGFLAG to a non-empty string, such as true, to enable the logging facility;
+# otherwise, keep it as "" to disable it.
+export LOGFLAG=""
diff --git a/ChatQnA/tests/_test_compose_mariadb_on_aipc.sh b/ChatQnA/tests/_test_compose_mariadb_on_aipc.sh
new file mode 100755
index 0000000000..772dc624a6
--- /dev/null
+++ b/ChatQnA/tests/_test_compose_mariadb_on_aipc.sh
@@ -0,0 +1,206 @@
+#!/bin/bash
+# Copyright (C) 2025 MariaDB Foundation
+# SPDX-License-Identifier: Apache-2.0
+
+set -e
+IMAGE_REPO=${IMAGE_REPO:-"opea"}
+IMAGE_TAG=${IMAGE_TAG:-"latest"}
+echo "REGISTRY=IMAGE_REPO=${IMAGE_REPO}"
+echo "TAG=IMAGE_TAG=${IMAGE_TAG}"
+export REGISTRY=${IMAGE_REPO}
+export TAG=${IMAGE_TAG}
+export MODEL_CACHE=${model_cache:-"./data"}
+
+WORKPATH=$(dirname "$PWD")
+LOG_PATH="$WORKPATH/tests"
+ip_address=$(hostname -I | awk '{print $1}')
+
+function build_docker_images() {
+    opea_branch=${opea_branch:-"main"}
+
+    cd $WORKPATH/docker_image_build
+    git clone --depth 1 --branch ${opea_branch} https://github.com/opea-project/GenAIComps.git
+    pushd GenAIComps
+    echo "GenAIComps test commit is $(git rev-parse HEAD)"
+    docker build --no-cache -t ${REGISTRY}/comps-base:${TAG} --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f Dockerfile .
+    popd && sleep 1s
+
+    echo "Build all the images with --no-cache, check docker_image_build.log for details..."
+    service_list="chatqna chatqna-ui dataprep retriever nginx"
+    docker compose -f build.yaml build ${service_list} --no-cache > ${LOG_PATH}/docker_image_build.log
+
+    docker pull ghcr.io/huggingface/text-embeddings-inference:cpu-1.6
+    docker pull ollama/ollama
+
+    docker images && sleep 1s
+}
+
+function start_services() {
+    cd $WORKPATH/docker_compose/intel/cpu/aipc/
+    export MARIADB_DATABASE="vectordb"
+    export MARIADB_USER="chatqna"
+    export MARIADB_PASSWORD="test"
+    export no_proxy=${no_proxy},${ip_address}
+    export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
+    export RERANK_MODEL_ID="BAAI/bge-reranker-base"
+    export OLLAMA_MODEL="llama3.2"
+    export HUGGINGFACEHUB_API_TOKEN=${HUGGINGFACEHUB_API_TOKEN}
+    export LOGFLAG=true
+
+    # Start Docker Containers
+    docker compose -f compose_mariadb.yaml up -d > ${LOG_PATH}/start_services_with_compose.log
+
+    sleep 30s # Wait for services to be ready
+}
+
+function validate_service() {
+    local URL="$1"
+    local EXPECTED_RESULT="$2"
+    local SERVICE_NAME="$3"
+    local DOCKER_NAME="$4"
+    local INPUT_DATA="$5"
+
+    if [[ $SERVICE_NAME == *"dataprep_upload_file"* ]]; then
+        cd $LOG_PATH
+        HTTP_RESPONSE=$(curl --silent --write-out "HTTPSTATUS:%{http_code}" -X POST -F 'files=@./dataprep_file.txt' -H 'Content-Type: multipart/form-data' "$URL")
+    elif [[ $SERVICE_NAME == *"dataprep_del"* ]]; then
+        HTTP_RESPONSE=$(curl --silent --write-out "HTTPSTATUS:%{http_code}" -X POST -d '{"file_path": "all"}' -H 'Content-Type: application/json' "$URL")
+    else
+        echo $URL
+        HTTP_RESPONSE=$(curl --silent --write-out "HTTPSTATUS:%{http_code}" -X POST -d "$INPUT_DATA" -H 'Content-Type: application/json' "$URL")
+    fi
+    HTTP_STATUS=$(echo $HTTP_RESPONSE | tr -d '\n' | sed -e 's/.*HTTPSTATUS://')
+    RESPONSE_BODY=$(echo $HTTP_RESPONSE | sed -e 's/HTTPSTATUS\:.*//g')
+
+    docker logs ${DOCKER_NAME} >> 
${LOG_PATH}/${SERVICE_NAME}.log + + + # check response status + if [ "$HTTP_STATUS" -ne "200" ]; then + echo "[ $SERVICE_NAME ] HTTP status is not 200. Received status was $HTTP_STATUS" + exit 1 + else + echo "[ $SERVICE_NAME ] HTTP status is 200. Checking content..." + fi + echo "Response" + echo $RESPONSE_BODY + echo "Expected Result" + echo $EXPECTED_RESULT + # check response body + if [[ "$RESPONSE_BODY" != *"$EXPECTED_RESULT"* ]]; then + echo "[ $SERVICE_NAME ] Content does not match the expected result: $RESPONSE_BODY" + exit 1 + else + echo "[ $SERVICE_NAME ] Content is as expected." + fi + + sleep 1s +} + +function validate_microservices() { + # Check if the microservices are running correctly. + + # tei for embedding service + validate_service \ + "${ip_address}:6006/embed" \ + "[[" \ + "tei-embedding" \ + "tei-embedding-server" \ + '{"inputs":"What is Deep Learning?"}' + + sleep 1m # retrieval can't curl as expected, try to wait for more time + + # test /v1/dataprep/delete + validate_service \ + "http://${ip_address}:6007/v1/dataprep/delete" \ + '{"status":true}' \ + "dataprep_del" \ + "dataprep-mariadb-vector" + + + # test /v1/dataprep/ingest upload file + echo "Deep learning is a subset of machine learning that utilizes neural networks with multiple layers to analyze various levels of abstract data representations. It enables computers to identify patterns and make decisions with minimal human intervention by learning from large amounts of data." > $LOG_PATH/dataprep_file.txt + validate_service \ + "http://${ip_address}:6007/v1/dataprep/ingest" \ + "Data preparation succeeded" \ + "dataprep_upload_file" \ + "dataprep-mariadb-vector" + + + # retrieval microservice + test_embedding=$(python3 -c "import random; embedding = [random.uniform(-1, 1) for _ in range(768)]; print(embedding)") + validate_service \ + "${ip_address}:7000/v1/retrieval" \ + " " \ + "retrieval" \ + "retriever-mariadb-vector" \ + "{\"text\":\"What is the revenue of Nike in 2023?\",\"embedding\":${test_embedding}}" + + # tei for rerank microservice + echo "Validating reranking service" + validate_service \ + "${ip_address}:8808/rerank" \ + '{"index":1,"score":' \ + "tei-rerank" \ + "tei-reranking-server" \ + '{"query":"What is Deep Learning?", "texts": ["Deep Learning is not...", "Deep learning is..."]}' + + + # validate llm on ollama service + echo "Validating llm service" + validate_service \ + "${ip_address}:11434/api/chat" \ + "content" \ + "ollama-llm" \ + "ollama" \ + '{"model": "llama3.2", "messages": [{"role": "user", "content": "What is Deep Learning?"}], "options": {"num_predict": 17}}' +} + +function validate_megaservice() { + # Curl the Mega Service + validate_service \ + "${ip_address}:8888/v1/chatqna" \ + "data: " \ + "chatqna-megaservice" \ + "chatqna-aipc-backend-server" \ + '{"messages": "What is the revenue of Nike in 2023?"}' + +} + +function stop_docker() { + echo "In stop docker" + echo $WORKPATH + cd $WORKPATH/docker_compose/intel/cpu/aipc/ + docker compose -f compose_mariadb.yaml down +} + +function main() { + + echo "::group::stop_docker" + stop_docker + echo "::endgroup::" + + echo "::group::build_docker_images" + if [[ "$IMAGE_REPO" == "opea" ]]; then build_docker_images; fi + echo "::endgroup::" + + echo "::group::start_services" + start_services + echo "::endgroup::" + + echo "::group::validate_microservices" + validate_microservices + echo "::endgroup::" + + echo "::group::validate_megaservice" + validate_megaservice + echo "::endgroup::" + + echo "::group::stop_docker" + stop_docker 
+ echo "::endgroup::" + + docker system prune -f +} + +main