223 changes: 8 additions & 215 deletions evals/evaluation/rag_pilot/README.md
@@ -4,20 +4,6 @@

RAG Pilot provides a set of tuners to optimize various parameters in a retrieval-augmented generation (RAG) pipeline. Each tuner allows fine-grained control over key aspects of parsing, chunking, postprocessing, and generator selection, enabling better retrieval and response generation.

### 🧠 Available Tuners

| Tuner | Stage | Function | Configuration |
|---|---|---|---|
| **EmbeddingTuner** | Retrieval | Tune embedding model and related parameters | Allows selection and configuration of the embedding model used for vectorization, including model name and optional parameters like dimension or backend. |
| **NodeParserTuner** | Retrieval | Tune node parser parameters | General tuner for configuring node parsers, possibly extending to custom strategies or pre-processing logic. |
| **SimpleNodeParserChunkTuner** | Retrieval | Tune `SentenceSplitter`'s `chunk_size` and `chunk_overlap` | Configures chunking behavior for document parsing by adjusting the size of individual text chunks and their overlap to ensure context retention. |
| **RetrievalTopkTuner** | Retrieval | Tune `top_k` for retriever | Adjusts how many documents are retrieved before reranking, balancing recall and performance. |
| **RerankerTopnTuner** | Postprocessing | Tune `top_n` for reranking | Adjusts the number of top-ranked documents returned after reranking, optimizing relevance and conciseness. |


These tuners help in optimizing document parsing, chunking strategies, reranking efficiency, and embedding selection for improved RAG performance.


## 🌐 Quickstart Guide

### ⚙️ Dependencies and Environment Setup
@@ -31,12 +17,14 @@ Load documents in EdgeCraftRAG before running RAG Pilot.
#### Setup RAG Pilot

```bash
cd rag_pilot

# (Optional) Build RAG Pilot and UI docker images
# Build RAG Pilot and UI docker images
cd rag_pilot
docker build --build-arg HTTP_PROXY=$HTTP_PROXY --build-arg HTTPS_PROXY=$HTTPS_PROXY --build-arg NO_PROXY=$NO_PROXY -t opea/ragpilot:latest -f ./Dockerfile .
docker build --build-arg HTTP_PROXY=$HTTP_PROXY --build-arg HTTPS_PROXY=$HTTPS_PROXY --build-arg NO_PROXY=$NO_PROXY -t opea/ragpilot-ui:latest -f ./ui/Dockerfile.ui .

# Or build docker images with docker compose
cd ./rag_pilot/docker_image_build
docker compose -f build.yaml build
# Setup ENV
export ECRAG_SERVICE_HOST_IP=${HOST_IP} # HOST IP of EC-RAG Service, usually current host ip

@@ -49,205 +37,10 @@ export ECRAG_SERVICE_HOST_IP=${HOST_IP} # HOST IP of EC-RAG Service, usually current host ip
# If you want to change exposed RAG Pilot service port
#export RAGPILOT_SERVICE_PORT=

# Start RAG Pilot server
cd ./rag_pilot
docker compose -f docker_compose/intel/gpu/arc/compose.yaml up -d
```

### 🚦 Launch RAG Pilot in Online Mode

To launch RAG Pilot, create the following *required files* before running the command:

#### 🔹Input file: QA List File (`your_queries.csv`)

The input CSV file should contain queries and associated ground truth data (optional) used for evaluation or tuning. Each row corresponds to a specific query and context file. The CSV must include the following **columns**:

| Column | Required | Description |
|--------|----------|-------------|
| `query_id` | ✅ Yes | Unique identifier for the query. Can be used to group multiple context entries under the same query. |
| `query` | ✅ Yes (at least one per `query_id`) | The actual query string. If left empty for some rows sharing the same `query_id`, the query from the first row with a non-empty value will be used. |
| `file_name` | ✅ Yes | The name of the file or document where the context (for retrieval or grounding) is drawn from. |
| `gt_context` | ✅ Yes | The ground truth context string that should be retrieved or matched against. |
| `ground_truth` | ❌ Optional | The ideal answer or response for the query, used for optional answer-level evaluation. |

##### 📌 CSV File Example

```csv
query_id,query,file_name,gt_context,ground_truth
53,故障来源有哪些?,故障处理记录表.txt,故障来源:用户投诉、日志系统、例行维护中发现、其它来源。,故障来源:用户投诉、日志系统、例行维护中发现、其它来源。
93,uMAC网元VNFC有哪几种备份方式,index.txt,ZUF-76-04-005 VNFC支持1+1主备冗余,uMAC网元VFNC有3中备份方式: 支持1+1主备冗余,支持N+M负荷分担冗余, 支持1+1互备冗余。
93,,index.txt,ZUF-76-04-006 VNFC支持N+M负荷分担冗余,
93,,index.txt,ZUF-76-04-008 VNFC支持1+1互备冗余,
```
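The grouping rule above — rows sharing a `query_id` belong to one query, and rows with an empty `query` cell inherit the first non-empty query for that id — can be sketched in a few lines of Python. `load_qa_list` is a hypothetical helper for illustration, not part of RAG Pilot's API:

```python
import csv
from collections import defaultdict

def load_qa_list(path):
    """Group QA rows by query_id. Rows that leave `query` empty inherit
    the first non-empty query seen for the same query_id."""
    queries = {}
    contexts = defaultdict(list)
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            qid = row["query_id"]
            if row["query"] and qid not in queries:
                queries[qid] = row["query"]
            contexts[qid].append({
                "file_name": row["file_name"],
                "gt_context": row["gt_context"],
                # ground_truth is optional; normalize empty cells to None
                "ground_truth": row.get("ground_truth") or None,
            })
    return {qid: {"query": queries.get(qid), "gt": rows}
            for qid, rows in contexts.items()}
```

Applied to the example above, query `93` would collect one query string and three `gt_context` entries.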

#### ▶️ Run RAG Pilot

Run the following command to start the tuning process.

```bash
# Run pipeline tuning tool
export ECRAG_SERVICE_HOST_IP="ecrag_host_ip"
python3 -m run_pilot -q "your_queries.csv"
```

#### 📦 Output Files and Structure

Each tuning run in **RAG Pilot** generates a set of structured output files for analyzing and comparing different RAG pipeline configurations.

##### 📁 Directory Layout

- `rag_pilot_<timestamp>/`: Main folder for a tuning session.
- `summary.csv` – Overall performance metrics of all executed pipelines.
- `curr_pipeline.json` – Best pipeline configuration.
- `curr_rag_results.json` – Results of the best pipeline.
- `rag_summary.csv` – Query-wise summary.
- `rag_contexts.csv` – Detailed context analysis.
- `entry_<hash>/`: Subfolders for each tried pipeline with the same file structure:
- `pipeline.json`
- `rag_results.json`
- `rag_summary.csv`
- `rag_contexts.csv`

##### 🗂️ Output File Overview

| File Name | Description |
|----------------------|-----------------------------------------------------------------------------|
| `summary.csv` | Aggregated summary across all pipelines |
| `pipeline.json` | RAG pipeline configuration used in a specific trial |
| `rag_results.json` | List of results for each query, including metadata and context sets |
| `rag_summary.csv` | Summary of each query's outcome, including response and context hit counts |
| `rag_contexts.csv` | Breakdown of retrieved/reranked contexts and mapping to ground truth |

**Context Mapping Notes:**

- Contexts are categorized as `gt_contexts`, `retrieval_contexts`, or `postprocessing_contexts`.
- Mappings track which retrieved or postprocessed contexts hit the ground truth.
- Each context is associated with a `query_id` and indexed for traceability.
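As a sketch of how these outputs can be consumed downstream, the snippet below scans a `summary.csv` for the highest-scoring pipeline. The column names (`pipeline_id`, `recall`) are assumptions for illustration — check your actual `summary.csv` header and adjust:

```python
import csv

def best_pipeline(summary_csv, metric="recall"):
    """Return the summary row with the highest value in `metric`.
    Rows with an empty metric cell are skipped."""
    best_row, best_score = None, float("-inf")
    with open(summary_csv, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            cell = row.get(metric)
            if not cell:
                continue
            score = float(cell)
            if score > best_score:
                best_row, best_score = row, score
    return best_row
```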


## 📴 Offline RAG Tuning

RAG Pilot supports offline mode using a RAG configuration file.

### ⚙️ Environment Setup

Refer to the Dependencies and Environment Setup section in the Quickstart Guide above for setting up the environment before proceeding.

### 🚦 Launch RAG Pilot in Offline Mode

To be added in a later release.


## 🔧 How to Adjust RAG Pilot to Tune Your RAG Solution

### 🧩 What Are Nodes and Modules

RAG Pilot represents each stage of the RAG pipeline as a **node**, such as `node_parser`, `indexer`, `retriever`, etc. Each node can have different **modules** that define its type and configuration. The nodes and modules are specified in a YAML file, allowing users to switch between different implementations easily.

Here is an example of nodes and modules for EdgeCraftRAG.

![RAG Pilot Architecture](RAG_Pilot.png)

### ⚙️ How to Configure Nodes and Modules

The available nodes and their modules are stored in a YAML file (e.g., `configs/ecrag.yaml` for EdgeCraftRAG, shown below). Each node can have multiple modules, and both nodes and modules have configurable parameters that can be tuned.

```yaml
nodes:
- node: node_parser
modules:
- module_type: simple
chunk_size: 400
chunk_overlap: 48
- module_type: hierarchical
chunk_sizes:
- 256
- 384
- 512
- node: indexer
embedding_model:
- BAAI/bge-small-zh-v1.5
- BAAI/bge-small-en-v1.5
modules:
- module_type: vector
- module_type: faiss_vector
- node: retriever
retrieve_topk: 30
modules:
- module_type: vectorsimilarity
- module_type: auto_merge
- module_type: bm25
- node: postprocessor
modules:
- module_type: reranker
top_n: 3
reranker_model: BAAI/bge-reranker-large
- module_type: metadata_replace
- node: generator
model:
- Qwen/Qwen2-7B-Instruct
inference_type:
- local
- vllm
prompt: null
```
1. **Each Node Can Have Multiple Modules**
- A node represents a stage in the RAG pipeline, such as `node_parser`, `indexer`, or `retriever`.
- Each node can support different modules that define how it operates. For example, the `node_parser` node can use either a `simple` or `hierarchical` module.

2. **Nodes Have Parameters to Tune**
- Some nodes have global parameters that affect all modules within them. For instance, the `retriever` node has a `retrieve_topk` parameter that defines how many top results are retrieved.

3. **Modules Have Parameters to Tune**
- Each module within a node can have its own parameters. For example, the `simple` parser module has `chunk_size` and `chunk_overlap` parameters, while the `hierarchical` parser module supports multiple `chunk_sizes`.

4. **Each Node Selects Its Module Based on a Type Map**
- The tool uses an internal mapping to associate each module type with its corresponding function. The type of module selected for each node is defined in a mapping system like the one below:

```python
COMP_TYPE_MAP = {
"node_parser": "parser_type",
"indexer": "indexer_type",
"retriever": "retriever_type",
"postprocessor": "processor_type",
"generator": "inference_type",
}
```
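Putting points 2 and 3 together — node-level parameters plus per-module parameters — a small helper can enumerate everything tunable from a parsed config. This is an illustrative sketch operating on the dict produced by `yaml.safe_load`, not RAG Pilot code:

```python
def list_tunables(cfg):
    """Given a parsed tuner config (a dict shaped like configs/ecrag.yaml),
    return {node_name: {"params": {...}, "modules": {module_type: {...}}}}."""
    out = {}
    for node in cfg.get("nodes", []):
        name = node["node"]
        out[name] = {
            # keys other than "node"/"modules" are node-level parameters
            "params": {k: v for k, v in node.items()
                       if k not in ("node", "modules")},
            # each module keeps its own parameters, keyed by module_type
            "modules": {
                m["module_type"]: {k: v for k, v in m.items()
                                   if k != "module_type"}
                for m in node.get("modules", [])
            },
        }
    return out
```

For the config above, `list_tunables(cfg)["retriever"]` would report `retrieve_topk` as a node parameter and `vectorsimilarity`, `auto_merge`, and `bm25` as selectable modules.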

### 🧑‍💻 How to Use Nodes and Modules

Besides the YAML configuration file, the tool also uses a module map to associate each module with a runnable instance. This ensures that the tool correctly links each module type to its respective function within the pipeline.

#### 🧾 Example: Mapping Modules to Functions
The function below defines how different module types are mapped to their respective components in EdgeCraftRAG:

```python
def get_ecrag_module_map(ecrag_pl):
ecrag_modules = {
# root
"root": (ecrag_pl, ""),
# node_parser
"node_parser": (ecrag_pl, "node_parser"),
"simple": (ecrag_pl, "node_parser"),
"hierarchical": (ecrag_pl, "node_parser"),
"sentencewindow": (ecrag_pl, "node_parser"),
# indexer
"indexer": (ecrag_pl, "indexer"),
"vector": (ecrag_pl, "indexer"),
"faiss_vector": (ecrag_pl, "indexer"),
# retriever
"retriever": (ecrag_pl, "retriever"),
"vectorsimilarity": (ecrag_pl, "retriever"),
"auto_merge": (ecrag_pl, "retriever"),
"bm25": (ecrag_pl, "retriever"),
# postprocessor
"postprocessor": (ecrag_pl, "postprocessor[0]"),
"reranker": (ecrag_pl, "postprocessor[0]"),
"metadata_replace": (ecrag_pl, "postprocessor[0]"),
# generator
"generator": (ecrag_pl, "generator"),
}
return ecrag_modules
```
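Each mapped entry pairs a root object with an attribute-path string (e.g. `"postprocessor[0]"`, or `""` for the root itself). A minimal resolver for such paths might look like the following — a hypothetical sketch, since the actual resolution logic lives inside RAG Pilot:

```python
import re

def resolve_module(root, path):
    """Resolve an attribute path like "postprocessor[0]" against a pipeline
    object. An empty path returns the root object itself."""
    if not path:
        return root
    obj = root
    for part in path.split("."):
        # each segment is "name" or "name[index]"
        m = re.fullmatch(r"(\w+)(?:\[(\d+)\])?", part)
        obj = getattr(obj, m.group(1))
        if m.group(2) is not None:
            obj = obj[int(m.group(2))]
    return obj
```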


By modifying the YAML configuration file and understanding how modules are mapped to functions, you can experiment with different configurations and parameter settings to optimize your RAG pipeline effectively.
For more details on how to use RAG Pilot, please refer to the [Detail Guide](./docs/Detail_Guide.md).
2 changes: 1 addition & 1 deletion evals/evaluation/rag_pilot/VERSION
@@ -1 +1 @@
25.05-dev
25.07-dev
17 changes: 17 additions & 0 deletions evals/evaluation/rag_pilot/api/v1/pilot.py
@@ -182,6 +182,23 @@ async def run_pipeline_by_id(id: uuid.UUID):
return f"Error: Pipeline {id} does not exist"


@pilot_app.post(path="/v1/pilot/pipeline/restore")
async def restore_pipeline():
success = pilot.restore_curr_pl()
if success:
current_pl = pilot.get_curr_pl()
return {
"message": "Pipeline restored successfully",
"pipeline_id": str(current_pl.get_id()) if current_pl else None,
"restored_from": "EdgeCraftRAG active pipeline",
}
else:
raise HTTPException(
status_code=404,
detail="Failed to restore pipeline: No active pipeline found in EdgeCraftRAG service or restore operation failed",
)


@pilot_app.post(path="/v1/pilot/files")
async def add_files(request: DataIn):
ret = upload_files(request)
15 changes: 15 additions & 0 deletions evals/evaluation/rag_pilot/api/v1/tuner.py
@@ -158,6 +158,21 @@ async def get_stage_pipelines(stage: RAGStage = Path(...)):
return pipeline_list


@tuner_app.get(path="/v1/tuners/stage/{stage}/pipelines/best/id")
async def get_stage_pipelines_best(stage: RAGStage = Path(...)):
tuner_names = tunerMgr.get_tuners_by_stage(stage)
pl_id_list = []
for tuner_name in tuner_names:
record = tunerMgr.get_tuner_record(tuner_name)
if record is not None:
pl_id_list.extend(list(record.all_pipeline_ids))
if record.base_pipeline_id not in pl_id_list:
pl_id_list.append(record.base_pipeline_id)
pl_id_list = list(set(pl_id_list))
best_pl_id = get_best_pl_id(pl_id_list, stage)
return best_pl_id


@tuner_app.get(path="/v1/tuners/{tuner_name}/pipelines/best")
async def get_pipeline_best(tuner_name):
record = tunerMgr.get_tuner_record(tuner_name)
16 changes: 11 additions & 5 deletions evals/evaluation/rag_pilot/components/pilot/base.py
@@ -180,9 +180,6 @@ def get_prompt(self) -> Optional[str]:
def export_pipeline(self):
self._restore_model_instances()
exported_pl = copy.deepcopy(self.pl)
if hasattr(exported_pl, "generator") and exported_pl.generator:
if hasattr(exported_pl.generator, "prompt_content"):
delattr(exported_pl.generator, "prompt_content")
self._replace_model_with_id()
return exported_pl

@@ -372,7 +369,16 @@ class RAGResults(BaseModel):
finished: bool = False

def add_result(self, result):
self.results.append(result)
# If result.query_id already appears in self.results, update that result; otherwise append it
updated_existing = False
if result.query_id is not None:
for idx, r in enumerate(self.results):
if r.query_id == result.query_id:
self.results[idx] = result
updated_existing = True
break
if not updated_existing:
self.results.append(result)
self.cal_metadata()

def cal_recall(self):
@@ -418,7 +424,7 @@ def get_metrics(self):
return self.metadata or {}

def get_metric(self, metric: Metrics, default=float("-inf")):
return self.metadata.get(metric.value, default)
return (self.metadata or {}).get(metric.value, default)

def update_result_metrics(self, query_id: int, metrics: Dict[str, Union[float, int]]):
updated = False
@@ -25,6 +25,7 @@ class NodeParserIn(BaseModel):
class IndexerIn(BaseModel):
indexer_type: str
embedding_model: Optional[ModelIn] = None
vector_uri: Optional[str] = None


class RetrieverIn(BaseModel):
@@ -43,6 +44,7 @@ class GeneratorIn(BaseModel):
prompt_content: Optional[str] = None
model: Optional[ModelIn] = None
inference_type: Optional[str] = "local"
vllm_endpoint: Optional[str] = None


class PipelineCreateIn(BaseModel):
@@ -78,3 +80,7 @@ class KnowledgeBaseCreateIn(BaseModel):
name: str
description: Optional[str] = None
active: Optional[bool] = None


class MilvusConnectRequest(BaseModel):
vector_uri: str
@@ -50,6 +50,7 @@ class IndexerType(str, Enum):

FAISS_VECTOR = "faiss_vector"
DEFAULT_VECTOR = "vector"
MILVUS_VECTOR = "milvus_vector"


class RetrieverType(str, Enum):
11 changes: 11 additions & 0 deletions evals/evaluation/rag_pilot/components/pilot/pilot.py
@@ -185,6 +185,17 @@ def get_curr_pl(self):
else:
return None

def restore_curr_pl(self):
self.curr_pl_id = None
pl_raw = get_active_pipeline()
if pl_raw:
active_pl = RAGPipeline(pl_raw)
active_pl.regenerate_id()
self.add_rag_pipeline(active_pl)
return self.rag_pipeline_dict[self.curr_pl_id]
else:
return None

def get_curr_pl_id(self):
if self.curr_pl_id:
return self.curr_pl_id
@@ -2,7 +2,7 @@
# SPDX-License-Identifier: Apache-2.0

services:
ui:
ragpilot-ui:
image: ${REGISTRY:-opea}/ragpilot-ui:${TAG:-latest}
container_name: ragpilot-ui
environment: