Merged
2 changes: 2 additions & 0 deletions .github/code_spell_ignore.txt
Original file line number Diff line number Diff line change
@@ -1,3 +1,5 @@
rouge
Rouge
ROUGE
ModelIn
modelin
Binary file added evals/evaluation/rag_pilot/RAG_Pilot.png
209 changes: 209 additions & 0 deletions evals/evaluation/rag_pilot/README.md
@@ -0,0 +1,209 @@

# RAG Pilot - A RAG Pipeline Tuning Tool

## Overview

RAG Pilot provides a set of tuners that optimize parameters in a retrieval-augmented generation (RAG) pipeline. Each tuner gives fine-grained control over one aspect of the pipeline (parsing, chunking, postprocessing, or embedding model selection) to improve retrieval and response generation.

### Available Tuners

| Tuner | Function | Configuration |
|---|---|---|
| **NodeParserTypeTuner** | Switch between `simple` and `hierarchical` node parsers | The `simple` parser splits text into basic chunks using [`SentenceSplitter`](https://docs.llamaindex.ai/en/stable/api_reference/node_parsers/sentence_splitter/), while the `hierarchical` parser ([`HierarchicalNodeParser`](https://docs.llamaindex.ai/en/v0.10.17/api/llama_index.core.node_parser.HierarchicalNodeParser.html)) creates a structured hierarchy of nodes to maintain contextual relationships. |
| **SimpleNodeParserChunkTuner** | Tune `SentenceSplitter`'s `chunk_size` and `chunk_overlap` | Configures chunking behavior for document parsing by adjusting the size of individual text chunks and their overlap to ensure context retention. |
| **RerankerTopnTuner** | Tune `top_n` for reranking | Adjusts the number of top-ranked documents kept after reranking, optimizing the relevance of the results passed on for generation. |
| **EmbeddingLanguageTuner** | Select the embedding model | Configures the embedding model for retrieval, allowing users to select different models for vector representation. |

These tuners help in optimizing document parsing, chunking strategies, reranking efficiency, and embedding selection for improved RAG performance.
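
As a rough illustration of the kind of search space a tuner such as `SimpleNodeParserChunkTuner` works over, the sketch below enumerates `chunk_size` / `chunk_overlap` combinations. The candidate values and the `overlap < size` filter are assumptions made for this example; they are not taken from RAG Pilot's implementation.

```python
from itertools import product

# Illustrative candidate values only; RAG Pilot's real candidates may differ.
chunk_sizes = [256, 400, 512]
chunk_overlaps = [0, 48, 256]

# Keep only combinations where the overlap is smaller than the chunk itself.
grid = [
    {"chunk_size": size, "chunk_overlap": overlap}
    for size, overlap in product(chunk_sizes, chunk_overlaps)
    if overlap < size
]

for candidate in grid:
    print(candidate)
```

Each dictionary is one configuration that could be evaluated against the same QA list, so results stay comparable across candidates.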


## Online RAG Tuning

### Dependencies and Environment Setup

#### Set Up EdgeCraftRAG

Set up the EdgeCraftRAG pipeline by following the [EdgeCraftRAG guide](https://github.com/opea-project/GenAIExamples/tree/main/EdgeCraftRAG).

Load documents in EdgeCraftRAG before running RAG Pilot.

#### Create Running Environment

```bash
# Create a virtual environment
python3 -m venv tuning
source tuning/bin/activate

# Install dependencies
pip install -r requirements.txt
```

### Launch RAG Pilot in Online Mode

To launch RAG Pilot, create the following *required file* before running the command:

#### QA List File (`your_qa_list.json`)
Contains queries and optional ground truth answers. Below is a sample format:

```json
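[
  {
    "query": "Are dinosaurs the ancestors of birds? Which lesson text covers this topic?",
    "ground_truth": "Yes, the ancestors of birds are dinosaurs; this is discussed in the text 《飞向蓝天的恐龙》."
  },
  {
    "query": "In which season does peach blossom water (桃花水) flow?"
  }
]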
```
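
Before launching a run, it can help to sanity-check the QA list file. The sketch below is not part of RAG Pilot: `validate_qa_list` is a hypothetical helper that assumes only the format shown above, that is, a JSON array of objects with a required `query` string and an optional `ground_truth` string.

```python
import json


def validate_qa_list(text):
    """Parse a QA list and return its entries; raise ValueError on a malformed entry.

    Hypothetical helper, not part of RAG Pilot.
    """
    entries = json.loads(text)
    if not isinstance(entries, list):
        raise ValueError("QA list must be a JSON array")
    for i, entry in enumerate(entries):
        if not isinstance(entry, dict):
            raise ValueError(f"entry {i} is not an object")
        query = entry.get("query")
        if not isinstance(query, str) or not query.strip():
            raise ValueError(f"entry {i} has no non-empty 'query'")
        if "ground_truth" in entry and not isinstance(entry["ground_truth"], str):
            raise ValueError(f"entry {i}: 'ground_truth' must be a string")
    return entries


sample = (
    '[{"query": "What is RAG?", "ground_truth": "Retrieval-augmented generation."},'
    ' {"query": "What is a reranker?"}]'
)
qa = validate_qa_list(sample)
print(len(qa), "queries,", sum("ground_truth" in e for e in qa), "with ground truth")
```

To check a real file, pass the contents of `your_qa_list.json` (read as UTF-8) instead of `sample`.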

Run the following command to start the tuning process. The output RAG results will be stored in `rag_pipeline_out.json`:

```bash
# Run pipeline tuning tool
export ECRAG_SERVICE_HOST_IP="ecrag_host_ip"
python3 -m pipeline_tune -q "your_qa_list.json" -o "rag_pipeline_out.json"
```

## Offline RAG Tuning

RAG Pilot supports offline mode using a RAG configuration file.

### Environment Setup

Before proceeding, set up the environment as described in [Create Running Environment](#create-running-environment) in the Online RAG Tuning section.

### Launch RAG Pilot in Offline Mode

To launch RAG Pilot, create the following *required files* before running the command:

#### RAG Configuration File (`your_rag_pipeline.json`)
Settings for the RAG pipeline. Follow the format of `configs/pipeline_sample.json`, which is compatible with [EdgeCraftRAG](https://github.com/opea-project/GenAIExamples/tree/main/EdgeCraftRAG).

#### RAG Results File (`your_rag_results.json`)
Contains queries, responses, lists of contexts, and optional ground truth. Below is a sample format:

```json
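[
  {
    "query": "Are dinosaurs the ancestors of birds? Which lesson text covers this topic?",
    "contexts": ["Evidence that dinosaurs evolved into birds..."],
    "response": "Yes, the ancestors of birds are dinosaurs.",
    "ground_truth": "Yes, the ancestors of birds are dinosaurs; this is discussed in the text 《飞向蓝天的恐龙》."
  }
]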
```
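
If the results come from your own pipeline, a file in this format can be assembled with a few lines of Python. This is a minimal sketch under the assumption that only the four fields shown above matter; `make_result` is a hypothetical helper, not a RAG Pilot function.

```python
import json


def make_result(query, contexts, response, ground_truth=None):
    """Build one record in the results-file format shown above (hypothetical helper)."""
    record = {"query": query, "contexts": list(contexts), "response": response}
    if ground_truth is not None:  # ground truth is optional
        record["ground_truth"] = ground_truth
    return record


results = [
    make_result(
        "What is RAG?",
        ["RAG combines retrieval with generation."],
        "Retrieval-augmented generation.",
        ground_truth="Retrieval-augmented generation.",
    ),
    make_result(
        "What is a reranker?",
        ["A reranker reorders retrieved chunks."],
        "A model that reorders retrieved chunks.",
    ),
]

# ensure_ascii=False keeps non-ASCII text (for example Chinese queries) readable.
text = json.dumps(results, ensure_ascii=False, indent=2)
print(text)
```

Writing `text` to `your_rag_results.json` with UTF-8 encoding yields a file that can be passed to the `-r` option.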

Run the following command to start offline tuning. The output RAG results will be stored in `rag_pipeline_out.json`:

```bash
python3 -m pipeline_tune --offline -c "your_rag_pipeline.json" -r "your_rag_results.json" -o "rag_pipeline_out.json"
```

## How to use RAG Pilot to tune your RAG solution

### What Are Nodes and Modules

RAG Pilot represents each stage of the RAG pipeline as a **node**, such as `node_parser`, `indexer`, `retriever`, etc. Each node can have different **modules** that define its type and configuration. The nodes and modules are specified in a YAML file, allowing users to switch between different implementations easily.

Here is an example of nodes and modules for EdgeCraftRAG.

![RAG Pilot Architecture](RAG_Pilot.png)

### How to configure Nodes and Modules

The available nodes and their modules are stored in a YAML file (e.g. `configs/ecrag.yaml` for EdgeCraftRAG, shown below). Each node can have multiple modules, and both nodes and modules have configurable parameters that can be tuned.

```yaml
nodes:
  - node: node_parser
    modules:
      - module_type: simple
        chunk_size: 400
        chunk_overlap: 48
      - module_type: hierarchical
        chunk_sizes: [256, 384, 512]
  - node: indexer
    embedding_model: [BAAI/bge-small-zh-v1.5, BAAI/bge-small-en-v1.5]
    modules:
      - module_type: vector
      - module_type: faiss_vector
  - node: retriever
    retrieve_topk: 30
    modules:
      - module_type: vectorsimilarity
      - module_type: auto_merge
      - module_type: bm25
  - node: postprocessor
    modules:
      - module_type: reranker
        top_n: 3
        reranker_model: BAAI/bge-reranker-large
      - module_type: metadata_replace
  - node: generator
    model: [Qwen/Qwen2-7B-Instruct]
    inference_type: [local, vllm]
    prompt: null
```
```
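
The structure of this file can be walked programmatically. The sketch below separates node-level parameters from per-module parameters, in the same spirit as `Adaptor.parse_nodes` in `components/adaptor/adaptor.py`. To stay dependency-free it uses a hand-written dict holding a subset of the file, which is assumed to match what a YAML parser would return; `summarize` is a hypothetical name.

```python
# Subset of configs/ecrag.yaml written as the dict a YAML parser would return.
config = {
    "nodes": [
        {
            "node": "node_parser",
            "modules": [
                {"module_type": "simple", "chunk_size": 400, "chunk_overlap": 48},
                {"module_type": "hierarchical", "chunk_sizes": [256, 384, 512]},
            ],
        },
        {
            "node": "retriever",
            "retrieve_topk": 30,
            "modules": [
                {"module_type": "vectorsimilarity"},
                {"module_type": "auto_merge"},
                {"module_type": "bm25"},
            ],
        },
    ]
}


def summarize(config):
    """Return {node: (node_params, {module_type: module_params})}."""
    summary = {}
    for node in config.get("nodes", []):
        # Everything except the node name and module list is a node-level parameter.
        node_params = {k: v for k, v in node.items() if k not in ("node", "modules")}
        modules = {
            mod["module_type"]: {k: v for k, v in mod.items() if k != "module_type"}
            for mod in node.get("modules", [])
        }
        summary[node["node"]] = (node_params, modules)
    return summary


summary = summarize(config)
for name, (node_params, modules) in summary.items():
    print(name, node_params, list(modules))
```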

1. **Each Node Can Have Multiple Modules**
- A node represents a stage in the RAG pipeline, such as `node_parser`, `indexer`, or `retriever`.
- Each node can support different modules that define how it operates. For example, the `node_parser` node can use either a `simple` or `hierarchical` module.

2. **Nodes Have Parameters to Tune**
- Some nodes have global parameters that affect all modules within them. For instance, the `retriever` node has a `retrieve_topk` parameter that defines how many top results are retrieved.

3. **Modules Have Parameters to Tune**
- Each module within a node can have its own parameters. For example, the `simple` parser module has `chunk_size` and `chunk_overlap` parameters, while the `hierarchical` parser module supports multiple `chunk_sizes`.

4. **Each Node Selects Its Module Based on a Type Map**
- The tool determines which module is active for a node by reading an attribute of the corresponding pipeline component. The attribute name for each node is defined in a mapping like the one below:

```python
COMP_TYPE_MAP = {
    "node_parser": "parser_type",
    "indexer": "indexer_type",
    "retriever": "retriever_type",
    "postprocessor": "processor_type",
    "generator": "inference_type",
}
```
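
The lookup described in item 4 can be sketched as follows. This is a simplified stand-in rather than RAG Pilot's code: the pipeline is faked with `SimpleNamespace` objects, and `active_modules` is a hypothetical function that mirrors the `getattr`-based selection in `Adaptor.activate_modules_based_on_type`.

```python
from types import SimpleNamespace

COMP_TYPE_MAP = {
    "node_parser": "parser_type",
    "retriever": "retriever_type",
}

# Stand-in for a live pipeline: each node object exposes the attribute
# named in COMP_TYPE_MAP (assumed shape, for illustration only).
pipeline = SimpleNamespace(
    node_parser=SimpleNamespace(parser_type="hierarchical"),
    retriever=SimpleNamespace(retriever_type="bm25"),
)

# Module types configured per node, as in the YAML file.
configured = {
    "node_parser": ["simple", "hierarchical"],
    "retriever": ["vectorsimilarity", "auto_merge", "bm25"],
}


def active_modules(pipeline, configured, type_map):
    """Return {node: active module type}; None if unset or not configured."""
    active = {}
    for node, module_types in configured.items():
        node_obj = getattr(pipeline, node, None)
        current = getattr(node_obj, type_map[node], None) if node_obj is not None else None
        active[node] = current if current in module_types else None
    return active


print(active_modules(pipeline, configured, COMP_TYPE_MAP))
# {'node_parser': 'hierarchical', 'retriever': 'bm25'}
```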

### How to use Nodes and Modules

Besides the YAML configuration file, the tool also uses a module map to associate each module with a runnable instance. This ensures that the tool correctly links each module type to its respective function within the pipeline.

#### Example: Mapping Modules to Functions
The function below defines how different module types are mapped to their respective components in EdgeCraftRAG:

```python
def get_ecrag_module_map(ecrag_pl):
    ecrag_modules = {
        # root
        "root": (ecrag_pl, ""),
        # node_parser
        "node_parser": (ecrag_pl, "node_parser"),
        "simple": (ecrag_pl, "node_parser"),
        "hierarchical": (ecrag_pl, "node_parser"),
        "sentencewindow": (ecrag_pl, "node_parser"),
        # indexer
        "indexer": (ecrag_pl, "indexer"),
        "vector": (ecrag_pl, "indexer"),
        "faiss_vector": (ecrag_pl, "indexer"),
        # retriever
        "retriever": (ecrag_pl, "retriever"),
        "vectorsimilarity": (ecrag_pl, "retriever"),
        "auto_merge": (ecrag_pl, "retriever"),
        "bm25": (ecrag_pl, "retriever"),
        # postprocessor
        "postprocessor": (ecrag_pl, "postprocessor[0]"),
        "reranker": (ecrag_pl, "postprocessor[0]"),
        "metadata_replace": (ecrag_pl, "postprocessor[0]"),
        # generator
        "generator": (ecrag_pl, "generator"),
    }
    return ecrag_modules
```
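
Each entry in the map pairs a root object with a path string such as `"postprocessor[0]"`. How RAG Pilot resolves these paths is not shown here, so the following is only one plausible reading: `resolve` is a hypothetical helper that treats the path as dotted attribute access with an optional `[index]` suffix, and the pipeline is again a `SimpleNamespace` stand-in.

```python
import re
from types import SimpleNamespace

# One path segment: an attribute name, optionally followed by [index].
_PATH_PART = re.compile(r"^(\w+)(?:\[(\d+)\])?$")


def resolve(root, path):
    """Follow a dotted attribute path with optional [index] suffixes; "" returns root."""
    obj = root
    for part in filter(None, path.split(".")):
        match = _PATH_PART.match(part)
        if match is None:
            raise ValueError(f"unsupported path segment: {part!r}")
        name, index = match.groups()
        obj = getattr(obj, name)
        if index is not None:
            obj = obj[int(index)]
    return obj


pipeline = SimpleNamespace(
    retriever=SimpleNamespace(retriever_type="bm25"),
    postprocessor=[SimpleNamespace(processor_type="reranker", top_n=3)],
)
module_map = {
    "root": (pipeline, ""),
    "bm25": (pipeline, "retriever"),
    "reranker": (pipeline, "postprocessor[0]"),
}

obj, path = module_map["reranker"]
print(resolve(obj, path).top_n)  # 3
```

Under this reading, several module types (for example `reranker` and `metadata_replace`) can share one path because they are alternative configurations of the same pipeline component.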


By modifying the YAML configuration file and understanding how modules are mapped to functions, you can experiment with different configurations and parameter settings to optimize your RAG pipeline effectively.
2 changes: 2 additions & 0 deletions evals/evaluation/rag_pilot/components/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
2 changes: 2 additions & 0 deletions evals/evaluation/rag_pilot/components/adaptor/__init__.py
@@ -0,0 +1,2 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
75 changes: 75 additions & 0 deletions evals/evaluation/rag_pilot/components/adaptor/adaptor.py
@@ -0,0 +1,75 @@
# Copyright (C) 2025 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from typing import Callable, Optional

from components.adaptor.base import Module, Node, convert_tuple, get_support_modules


class Adaptor:

    def __init__(self, yaml_data: dict):
        # yaml_data is the parsed YAML mapping (it is accessed with .get below).
        self.nodes = self.parse_nodes(yaml_data)
        self.root_func: Optional[Callable] = None

    def parse_nodes(self, yaml_data):
        """Build {node_type: [Node, ...]} from the parsed YAML configuration."""
        parsed_nodes = {}
        for node in yaml_data.get("nodes", []):
            node_type = node.get("node")
            modules_dict = {
                mod.get("module_type"): Module(
                    type=mod.get("module_type", ""),
                    params={k: convert_tuple(v) for k, v in mod.items() if k not in ["module_type"]},
                )
                for mod in node.get("modules", [])
                if mod.get("module_type")
            }
            node_params = {k: convert_tuple(v) for k, v in node.items() if k not in ["node", "node_type", "modules"]}
            cur_node = Node(type=node_type, params=node_params, modules=modules_dict)
            if node_type in parsed_nodes:
                parsed_nodes[node_type].append(cur_node)
            else:
                parsed_nodes[node_type] = [cur_node]
        return parsed_nodes

    def get_node(self, node_type, idx=0):
        nodes = self.nodes[node_type] if node_type in self.nodes else None
        return nodes[idx] if nodes and idx < len(nodes) else None

    def get_modules_from_node(self, node_type, idx=0):
        node = self.get_node(node_type, idx)
        return node.modules if node else None

    def get_module(self, node_type, module_type, idx=0):
        # With no module type given, fall back to the node itself.
        if module_type is None:
            return self.get_node(node_type, idx)
        else:
            modules = self.get_modules_from_node(node_type, idx)
            return modules[module_type] if modules and module_type in modules else None

    def update_all_module_functions(self, module_map, node_type_map):
        self.root_func = get_support_modules("root", module_map)

        # Rebind every node and module, and reset activation before re-evaluating it.
        for node_list in self.nodes.values():
            for node in node_list:
                node.update_func(module_map)
                node.is_active = False
                for module in node.modules.values():
                    module.update_func(module_map)
                    module.is_active = False

        self.activate_modules_based_on_type(node_type_map)

    def activate_modules_based_on_type(self, node_type_map):
        if not self.root_func:
            return

        for node_list in self.nodes.values():
            for node in node_list:
                node_type = node.type
                # Skip nodes that the root pipeline object does not expose.
                if not getattr(self.root_func, node_type, None):
                    continue
                node.is_active = True
                active_module_type = getattr(node.func, node_type_map[node_type], None)
                if active_module_type and active_module_type in node.modules:
                    node.modules[active_module_type].is_active = True