Benchmark #11

spyrchat · 2025-09-07T12:08:04Z

This pull request introduces a production-ready, modular Retrieval-Augmented Generation (RAG) system with a configurable LangGraph agent and robust CI/CD and deployment support. The main changes include a comprehensive new workflow for pipeline testing and security, a Dockerfile for containerization, a detailed project README, and the initial implementation of the agent's modular graph and nodes.

Key highlights:

- Adds a modular, YAML-configurable LangGraph agent with clear separation of concerns (query interpretation, retrieval, generation, memory update).
- Introduces a GitHub Actions workflow for minimal, integration, end-to-end, and security tests.
- Provides a Dockerfile for reproducible production deployment.
- Supplies detailed documentation and usage instructions in the README.

1. CI/CD and Deployment

- Adds a GitHub Actions workflow (.github/workflows/pipeline-tests.yml) with jobs for minimal, integration (with Qdrant), end-to-end (with API), and security/config validation tests, including checks for hardcoded secrets and YAML config validation.
- Introduces a Dockerfile that builds a slim Python 3.11-based container, installs system and Python dependencies, and copies the full source code for production deployment.
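The description does not show how the workflow checks for hardcoded secrets. As a rough illustration only, a check of that kind can be approximated with a regex scan over file contents. The pattern and the `find_hardcoded_secrets` helper below are hypothetical and are not taken from pipeline-tests.yml.

```python
import re

# Hypothetical rule: a credential-like name assigned a quoted literal.
# Real secret scanners use many more rules than this single pattern.
SECRET_PATTERN = re.compile(
    r"""
    (api[_-]?key|secret|password|token)   # credential-like name
    \s*[:=]\s*                            # Python assignment or YAML mapping
    ["']([^"']{8,})["']                   # quoted literal of 8+ characters
    """,
    re.IGNORECASE | re.VERBOSE,
)


def find_hardcoded_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_name) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = SECRET_PATTERN.search(line)
        if match:
            hits.append((lineno, match.group(1)))
    return hits
```

Values read from the environment, such as `os.environ["QDRANT_API_KEY"]`, are not flagged because no quoted literal follows the assignment directly.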

2. Documentation

Replaces README.md with a detailed overview of the system, including features, architecture, quick start, configuration examples, project structure, testing, extension guides, and migration instructions from legacy code.

3. Agent Graph Implementation

Implements the main agent graph in agent/graph.py using LangGraph, with nodes for query interpretation, retrieval, generation, and memory update, all configurable via YAML.
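The control flow described here can be pictured without LangGraph. The sketch below is a plain-Python stand-in, not the code in agent/graph.py: the state keys, the `needs_retrieval` flag, the routing rule, and the stub node bodies are all assumptions made for illustration.

```python
from typing import Callable

State = dict  # hypothetical agent state: a plain dict passed between nodes


def query_interpreter(state: State) -> State:
    # Stub routing rule (assumption): questions mentioning "docs" need retrieval.
    return {**state, "needs_retrieval": "docs" in state["question"].lower()}


def retriever(state: State) -> State:
    # Stub: a real node would query the vector store here.
    return {**state, "context": ["<retrieved chunk>"]}


def generator(state: State) -> State:
    # Answer from retrieved context when present, otherwise fall back.
    source = "context" if state.get("context") else "fallback"
    return {**state, "answer": f"answer from {source}"}


def memory_updater(state: State) -> State:
    history = state.get("history", []) + [(state["question"], state["answer"])]
    return {**state, "history": history}


def run_agent(question: str) -> State:
    """Run the nodes in order, skipping retrieval when it is not needed."""
    state: State = {"question": question}
    state = query_interpreter(state)
    steps: list[Callable[[State], State]] = []
    if state["needs_retrieval"]:
        steps.append(retriever)
    steps += [generator, memory_updater]
    for step in steps:
        state = step(state)
    return state
```

Each node takes the state and returns an updated copy, which is the same shape of contract a graph framework would wire together with conditional edges.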

4. Modular Agent Nodes

Adds modular agent nodes:
- agent/nodes/query_interpreter.py: Determines if a question needs retrieval or can be answered directly, using a prompt and robust error handling.
- agent/nodes/generator.py: Generates answers using context or fallback, with logging and exception handling.
- agent/nodes/memory_updater.py: Maintains a rolling chat history in the agent state.
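A rolling chat history of the kind described for the memory updater can be sketched in a few lines. This is a minimal illustration, not the contents of agent/nodes/memory_updater.py: the `chat_history` key, the `update_memory` name, and the `max_turns` limit are assumptions.

```python
def update_memory(state: dict, question: str, answer: str, max_turns: int = 5) -> dict:
    """Append the latest turn and keep only the most recent `max_turns` turns."""
    history = list(state.get("chat_history", []))
    history.append({"question": question, "answer": answer})
    # Slicing from the end drops the oldest turns once the limit is exceeded.
    return {**state, "chat_history": history[-max_turns:]}
```

Returning a new state dict rather than mutating the input keeps the node free of side effects, which makes it easier to test in isolation.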

… logging for collection creation and document insertion; refactor insert_documents method for improved clarity. Update EmbeddingPipeline to use new splitter method. Modify SparseEmbedder to default to CUDA. Add hybrid_retriever.py file.

…ds to retrieve client and collection name. Update EmbeddingPipeline and SparseEmbedder to use Embeddings instead of BaseEmbedder. Add QdrantHybridRetriever for hybrid retrieval functionality and update test scripts accordingly.

- Added 'split' parameter to StackOverflowAdapter for better data segmentation.
- Introduced 'dense_embedding' and 'sparse_embedding' fields in ChunkMeta for improved embedding metadata.
- Updated EmbeddingPipeline to directly assign embeddings to respective fields.
- Made allowed characters in DocumentValidator more permissive for HTML/code content.

- Introduced a comprehensive Quick Start Guide for implementing an MLOps pipeline for RAG systems, covering project initialization, dataset adapters, configuration, processing components, and testing.
- Implemented a CSV dataset adapter for reading and converting CSV data into documents.
- Created a configuration schema for managing dataset, chunking, embedding, and vector store settings.
- Developed core processing components including a document chunker and an embedding pipeline.
- Added a simple CLI interface for ingestion with logging and configuration handling.
- Implemented a sparse embedding mechanism and integrated it into the embedding pipeline.
- Added inspection script for analyzing vector structures in Qdrant.
- Created smoke tests for validating ingestion processes and vector store uploads.
- Added test script for verifying sparse embedding serialization.
- Updated existing configurations for stackoverflow datasets to support hybrid and dense embedding strategies.

… framework
- Created `experimental.yml` for testing new components in the retrieval pipeline.
- Added `hybrid_multistage.yml` for hybrid retrieval with multi-stage reranking.
- Implemented tests for the new answer-focused adapter in `test_new_adapter.py`.
- Developed advanced reranking tests in `test_advanced_rerankers.py`.
- Introduced answer retrieval tests in `test_answer_retrieval.py`.
- Demonstrated retrieval pipeline extensibility in `test_extensibility.py`.
- Showcased modular pipeline features in `test_modular_pipeline.py`.
- Added a comprehensive test runner in `run_all_tests.py`.
- Updated agent retrieval tests to support configurable pipelines in `test_agent_retrieval.py`.

- Implemented unit tests for the RetrievalPipeline, RetrievalResult, and associated components (Retriever, Reranker, Filter) in `test_retrieval_pipeline.py`.
- Created mock classes for testing purposes to simulate retrieval, reranking, and filtering behaviors.
- Added tests for basic functionality, component addition/removal, and pipeline execution with various configurations.
- Introduced tests for the RetrievalPipelineFactory to validate pipeline creation with dense and hybrid configurations.
- Added minimal and example tests for the SOSum adapter to ensure basic functionality without heavy dependencies.
- Implemented smoke tests for the ingestion process and overall system quality checks.
- Updated the test runner to include new tests and organized the test structure for better clarity.

- Updated langchain-core to version 0.3.75
- Added new dependencies: cachetools, distro, filetype, google-ai-generativelanguage, google-api-core, google-auth, googleapis-common-protos, grpcio-status, jiter, langchain-google-genai, langchain-openai, langgraph, langgraph-checkpoint, langgraph-prebuilt, langgraph-sdk, openai, ormsgpack, proto-plus, psycopg2-binary, pyasn1, pyasn1_modules, rsa, tiktoken, xxhash
- Updated existing dependencies to their latest versions

test: Enhance tests for rerankers and retrieval pipeline
- Refactored test cases in test_rerankers.py for better readability and maintainability
- Added new tests for the RetrievalPipeline and its components in test_retrieval_pipeline.py
- Improved mock implementations for better isolation in tests

feat: Add debug scripts for StackOverflow adapter
- Introduced debug_row_order.py to check row order and types from StackOverflow data
- Added debug_stackoverflow_adapter.py to investigate issues with document reading in the StackOverflow adapter

test: Implement tests for ingestion pipeline and adapter functionality
- Created test_full_ingestion.py to validate the full ingestion pipeline with StackOverflow data
- Added test_adapter_fix.py to verify the StackOverflow adapter produces documents correctly

chore: Update test runner to include new tests
- Modified run_all_tests.py to include new test files for retrieval and ingestion

- Removed legacy retriever wrapping and introduced modern retriever classes.
- Updated `RetrievalPipelineFactory` to create dense, hybrid, sparse, and semantic pipelines using new retriever implementations.
- Created `ModernBaseRetriever` as a base class for all retrievers, providing common functionality and configuration handling.
- Implemented `QdrantDenseRetriever`, `QdrantHybridRetriever`, `QdrantSparseRetriever`, and `SemanticRetriever` with improved initialization and search methods.
- Removed deprecated `router.py` and integrated routing logic into the new retriever classes.
- Enhanced logging and error handling across retrievers for better debugging and monitoring.
- Updated imports and module structure to reflect the new architecture.

- Enhanced the `load_config` function to include detailed error handling and logging.
- Introduced `get_retriever_config`, `get_benchmark_config`, and `get_pipeline_config` functions for better configuration management.
- Added `load_config_with_overrides` to support configuration overrides.
- Updated `QdrantVectorDB` initialization to accept configuration parameters.
- Created example scripts for unified configuration usage and retriever configuration examples.
- Removed outdated retrieval pipeline configurations to streamline the codebase.
- Improved logging throughout the retriever classes for better traceability.
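The commit names `load_config_with_overrides` but does not show how overrides are applied. One plausible reading is a recursive merge of an override mapping into the loaded configuration. The sketch below assumes that behavior; `apply_overrides` and the config keys in the usage note are hypothetical.

```python
import copy


def apply_overrides(base: dict, overrides: dict) -> dict:
    """Return a copy of `base` with `overrides` merged in recursively."""
    merged = copy.deepcopy(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge key by key instead of replacing.
            merged[key] = apply_overrides(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

With this approach, overriding `{"retriever": {"top_k": 5}}` changes only `top_k` and leaves sibling keys such as the retriever type untouched, and the base configuration is never mutated.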

…ality
- Refactored benchmark runner to utilize unified configuration approach.
- Updated imports to reflect new module structure for benchmarks and metrics.
- Introduced new benchmark scripts for simple and full dataset evaluations.
- Enhanced retrieval pipeline initialization to support unified config.
- Created comprehensive dataset adapter for full StackOverflow dataset evaluation.
- Added real data benchmark runner for testing with actual StackOverflow queries.
- Updated configuration file to include new retrieval strategies and parameters.
- Documented changes in CONFIG_CONSOLIDATION_COMPLETE.md and UNIFIED_CONFIG.md.
- Added examples demonstrating the use of the new unified configuration system.

…and configuration
- Removed the full dataset benchmark script to streamline the benchmarking process.
- Updated the real benchmark script to ensure proper imports and functionality.
- Enhanced the configuration file to include new fusion methods and adjustable weights for hybrid retrieval.
- Refactored dense and sparse retrievers to improve embedding initialization and search processes.
- Implemented a new hybrid retriever that combines dense and sparse results using configurable fusion methods.
- Deleted the synthetic dataset text processing script to clean up unused code.
- Added a comprehensive test suite for all retrievers in the full benchmark pipeline to ensure reliability and performance.
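The commit does not say which fusion methods the hybrid retriever supports. As one common example of combining dense and sparse rankings with adjustable weights, the sketch below uses weighted reciprocal rank fusion. The function name, the parameter names, and the smoothing constant `k=60` are assumptions and are not taken from the repository.

```python
def weighted_rrf(
    dense_ids: list[str],
    sparse_ids: list[str],
    dense_weight: float = 0.5,
    sparse_weight: float = 0.5,
    k: int = 60,
) -> list[str]:
    """Fuse two ranked ID lists with weighted reciprocal rank fusion."""
    scores: dict[str, float] = {}
    for weight, ranking in ((dense_weight, dense_ids), (sparse_weight, sparse_ids)):
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes weight / (k + rank) for every ID it contains.
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    # Highest fused score first; ties are broken by ID for deterministic output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

Because the method uses ranks rather than raw scores, dense similarity scores and sparse scores do not need to be on the same scale before fusion.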

…flow
- Created `natural_questions.yml` for Google Natural Questions dataset with hybrid embedding strategy, chunking, validation, and evaluation settings.
- Created `stackoverflow.yml` for SOSum dataset with hybrid embedding strategy, chunking, validation, and evaluation settings.
- Added `stackoverflow_hybrid.yml` for hybrid dense and sparse embeddings configuration.
- Introduced dataset template `dataset_template.yml` for easy dataset configuration.
- Added retrieval configuration templates: `retrieval_template.yml` for agent retrieval setup.
- Implemented legacy configurations for various models: `stackoverflow_bge_large.yml`, `stackoverflow_e5_large.yml`, and `stackoverflow_minilm.yml`.
- Created high-performance retrieval configurations: `fast_hybrid.yml`, `modern_dense.yml`, and `modern_hybrid.yml`.
- Removed outdated retriever configurations: `dense_retriever.yml`, `hybrid_retriever.yml`, `semantic_retriever.yml`, and `sparse_retriever.yml`.
- Updated tests for agent retrieval and streamlined agent functionality, ensuring compatibility with new configurations.

…script
- Deleted the following test files:
  - test_full_ingestion.py
  - test_modular_pipeline.py
  - run_all_tests.py
  - test_adapter_fix.py
  - test_agent_retrieval.py
  - test_retriever_direct.py
  - test_streamlined_agent.py
- Added a new test file: test_local_setup.py
  - This script checks prerequisites and runs progressive tests for the pipeline.

…ation structure for improved clarity and maintainability.
- Deleted SYSTEM_EXTENSION_GUIDE.md, UNIFIED_CONFIG.md, agent_retrieval_upgrade_summary.md, config_reorganization_summary.md, integration_testing_setup.md, and sql_removal_summary.md.
- Simplified agent graph by removing SQL-related nodes and dependencies.
- Consolidated configuration files into a unified structure, enhancing usability and reducing clutter.

spyrchat added 30 commits April 26, 2025 17:00

Add SparseEmbedder class and update get_embedder function to support it; create hybrid-retriever.py file; update requirements.txt for new dependencies

d0d0668

Refactor QdrantVectorDB to support dense and sparse vectors; update init_collection method and enhance as_langchain_vectorstore for hybrid retrieval; improve BaseRetriever documentation and remove hybrid-retriever.py file.

7c6bf9c

Enhance EmbeddingPipeline to support optional sparse embeddings; refactor run method to return dense and sparse vectors; update test script to integrate new pipeline functionality and add metadata handling for documents.

f283739

Refactor QdrantVectorDB and embedding factory to enhance collection initialization and embedding retrieval; improve logging and document preparation in test_embedding_pipeline.

6cbb9e4

Refactor init_collection method in QdrantVectorDB to remove sparse_vector_size parameter for improved clarity.

c02e61a

Refactor BaseVectorDB to specify return types for methods and enhance docstrings for clarity.

14cbf21

Merge branch 'development' of https://github.com/spyrchat/Thesis into hybrid-retriever

862134b

Merge pull request #1 from spyrchat/hybrid-retriever

7f75ff9

Hybrid retriever is functional and ready to deploy to development branch

Remove BaseEmbedder inheritance from HuggingFaceEmbedder for improved clarity.

b5d9c2b

Remove BaseEmbedder inheritance from TitanEmbedder for improved clarity.

4b51eda

Merge pull request #2 from spyrchat/hybrid-retriever

9cca60c

Hybrid retriever

Add PostgresController and connection test script for PostgreSQL integration

df599be

Implement image and table asset insertion methods in PostgresController; add text processing pipeline for PDF documents

9045ddf

Add table extraction and SQL uploading functionality; refactor imports for clarity

31a2a7e

Add PDF processing, table extraction, and text chunking functionality; implement metadata handling and enrich documents for upload

bdb9037

Enhance embedding pipeline with dynamic embedding strategy; add PDF processing test script

bcc92da

Refactor import statements to use relative paths; update sandbox directory in test script

d400d85

Enhance Qdrant document insertion with error handling and logging; update requirements for PyMuPDF

40d43c9

Add table extraction functionality with logging; implement PDF processing test script

d6c07b5

Implement modular RAG pipeline with query interpretation, SQL planning, and memory updating; add retriever routing logic and logging

f75f74f

Add Dockerfile, docker-compose.yml, and main application logic; implement generator and retriever nodes

53d213b

Refactor QdrantVectorDB: remove unused import and add spacing; update requirements.txt with dependency version upgrades

985440e

Updated requirements.txt

20a99bd

full pipeline is functional

8c08f87

Added logging

b6feff5

Added docstrings for clarity

4187574

Added Docstrings

fa53084

added config.yml

346f0d6

spyrchat added 29 commits August 21, 2025 14:44

feat: Implement Stack Overflow adapter analysis and testing tools

811b2c6

feat: Add answer metadata tests and enhance answer retrieval output

663dbbd

feat: Enhance embedding strategy configuration and improve smoke test reporting

db65791

feat: Enhance benchmark evaluation by implementing NaN handling for missing ground truth and improving document ID extraction

8483973

feat: Improve document ID handling and external ID preservation in Qdrant integration

9acb29c

chore: Update Python version to 3.13 in pipeline tests

10b6620

chore: Update testing dependencies and Python version in CI workflows

056f007

refactor: Simplify dependency management by removing requirements-test.txt and updating pipeline tests to use requirements-minimal.txt

7323f4b

chore: Update requirements-minimal.txt to include missing dependencies for LangChain and dotenv

b39c51d

chore: Add missing dependencies for boto3, botocore, and langchain-qdrant in requirements-minimal.txt

e1beb8b

refactor: Enhance Qdrant connectivity tests and remove outdated requirements documentation

149ab30

chore: Remove outdated GitHub Actions CI configuration and local test setup script

63d29af

chore: Remove outdated example scripts and sample data files for retriever configuration

18903d8

Fix Google dependencies conflict in requirements.txt

3393ec5

fix: Remove unnecessary blank line in insert_documents method

9415e07

fix: Improve .env loading and add default values for Qdrant configuration

2748761

fix: Update Qdrant service configuration for improved health checks and connectivity

8d6d16f

spyrchat closed this Sep 7, 2025
