hemangjoshi37a

TLDR

Adds a complete RAG (Retrieval-Augmented Generation) system built on HippoRAG and the Milvus vector database, with automated file watching. Also fixes TypeScript module resolution issues and adds tests for the new RAG components.

Dive Deeper

This PR introduces a full-featured RAG system that enables intelligent code retrieval and context-aware responses. The implementation includes:

Core Components Added:

  • MilvusCodeStorage: Vector database integration for storing code chunks with embeddings
  • HippoRAG: High-level API wrapper for RAG operations
  • RAGService: Service layer for application integration
  • FileWatcherService: Automatic RAG updates when files change
  • retrieve-code & update-rag: CLI tools for manual RAG operations
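
The layering described above (a storage class underneath a high-level wrapper) can be sketched roughly as follows. This is a minimal illustration and not the PR's code: `ChunkStore`, `InMemoryChunkStore`, `RagFacade`, and `toyEmbed` are hypothetical names, the in-memory store stands in for Milvus, and the hashed bag-of-words embedding stands in for a real embedding model.

```typescript
// Hypothetical shapes: none of these names or signatures are taken from the PR.
interface CodeChunk {
  filePath: string;
  content: string;
  embedding: number[];
}

// Storage-layer contract; the PR's MilvusCodeStorage would sit behind something like this.
interface ChunkStore {
  upsert(chunks: CodeChunk[]): void;
  search(queryEmbedding: number[], topK: number): CodeChunk[];
}

function dot(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < Math.min(a.length, b.length); i++) sum += a[i] * b[i];
  return sum;
}

// In-memory stand-in for a vector database (assumption: brute-force search is fine for a demo).
class InMemoryChunkStore implements ChunkStore {
  private chunks: CodeChunk[] = [];

  upsert(chunks: CodeChunk[]): void {
    // Drop earlier chunks for the same files, then append the new ones.
    const paths = new Set(chunks.map((c) => c.filePath));
    this.chunks = this.chunks.filter((c) => !paths.has(c.filePath)).concat(chunks);
  }

  search(queryEmbedding: number[], topK: number): CodeChunk[] {
    return [...this.chunks]
      .sort((a, b) => dot(b.embedding, queryEmbedding) - dot(a.embedding, queryEmbedding))
      .slice(0, topK);
  }
}

// Toy embedding: hashed bag of words, L2-normalised. A real system would call a model.
function toyEmbed(text: string, dims = 64): number[] {
  const vec = new Array<number>(dims).fill(0);
  const tokens = text.toLowerCase().match(/[a-z0-9_]+/g) ?? [];
  for (const token of tokens) {
    let h = 0;
    for (let i = 0; i < token.length; i++) h = (h * 31 + token.charCodeAt(i)) >>> 0;
    vec[h % dims] += 1;
  }
  const norm = Math.hypot(...vec) || 1;
  return vec.map((v) => v / norm);
}

// High-level wrapper playing the role the PR gives to HippoRAG.
class RagFacade {
  private store: ChunkStore;

  constructor(store: ChunkStore) {
    this.store = store;
  }

  update(filePath: string, content: string): void {
    this.store.upsert([{ filePath, content, embedding: toyEmbed(content) }]);
  }

  retrieve(query: string, topK: number): CodeChunk[] {
    return this.store.search(toyEmbed(query), topK);
  }
}
```

A caller would construct `new RagFacade(new InMemoryChunkStore())`, call `update(path, source)` for each file, and then `retrieve(query, k)`. Keeping the store behind an interface is what lets tests swap the database for a mock.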

Technical Improvements:

  • Fixed ECMAScript module resolution by adding .js extensions to all relative imports in TypeScript files
  • Resolved test mocking issues in MilvusCodeStorage tests
  • Added proper error handling for undefined values in collection operations
  • Updated system prompt to include RAG integration instructions
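
To illustrate the import fix: under Node's ESM resolution, a relative specifier has to name the emitted file, so `./hippoRAG` becomes `./hippoRAG.js` even though the source file is `.ts`. The sketch below shows that rewrite as a small string transform. `addJsExtensions` is a hypothetical helper, not something the PR ships, and it deliberately ignores dynamic `import()` calls and directory imports.

```typescript
// Matches `from './x'` / `from "../x"` and side-effect imports such as `import './x'`.
const RELATIVE_SPECIFIER = /(\bfrom\s+|\bimport\s+)(['"])(\.{1,2}\/[^'"]+)\2/g;

// Appends `.js` to extensionless relative import specifiers in a source string.
function addJsExtensions(source: string): string {
  return source.replace(
    RELATIVE_SPECIFIER,
    (match: string, keyword: string, quote: string, spec: string) => {
      // Leave specifiers that already carry a runtime extension untouched.
      if (/\.(js|mjs|cjs|json|node)$/.test(spec)) return match;
      return `${keyword}${quote}${spec}.js${quote}`;
    },
  );
}
```

Bare package specifiers such as `node:fs` are left alone because the pattern only matches paths starting with `./` or `../`.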

Architecture:

  • Uses Milvus as the vector database for semantic code search
  • Implements chunked code storage with metadata
  • Provides both programmatic API and CLI tools
  • Includes automated file watching for real-time updates
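
As a rough picture of "chunked code storage with metadata", the sketch below splits a file into overlapping line windows and records where each chunk came from. It is an assumption-laden stand-in: the PR lists Chonkie as its chunking dependency, whereas this uses plain fixed-size line windows, and `ChunkWithMetadata` and `chunkByLines` are hypothetical names.

```typescript
// Hypothetical metadata shape; the PR does not spell out its actual fields.
interface ChunkWithMetadata {
  filePath: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  content: string;
}

// Splits source text into windows of at most `maxLines` lines,
// with `overlap` lines shared between consecutive windows.
function chunkByLines(
  filePath: string,
  source: string,
  maxLines = 40,
  overlap = 5,
): ChunkWithMetadata[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new RangeError('expected maxLines > 0 and 0 <= overlap < maxLines');
  }
  const lines = source.split('\n');
  const chunks: ChunkWithMetadata[] = [];
  const step = maxLines - overlap;
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      filePath,
      startLine: start + 1,
      endLine: end,
      content: lines.slice(start, end).join('\n'),
    });
    // Stop once a window reaches the end of the file to avoid a redundant tail chunk.
    if (end === lines.length) break;
  }
  return chunks;
}
```

Storing the line range next to each chunk lets a retrieval result point back to an exact location in the file rather than returning only raw text.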

Reviewer Test Plan

To validate this change works correctly:

  1. Run the test suite:

    npm run test --workspace=packages/core --if-present -- --testNamePattern="MilvusCodeStorage"
    npm run test --workspace=packages/core --if-present -- --testNamePattern="HippoRAG"
    npm run test --workspace=packages/core --if-present -- --testNamePattern="RAGService"
  2. Test the RAG tools:

    # Test retrieve functionality
    node -e "import('./packages/core/src/rag/index.js').then(async ({ retrieveCode }) => { console.log(await retrieveCode('function declaration', 3)); })"
    
    # Test update functionality
    node -e "import('./packages/core/src/rag/index.js').then(async ({ updateRag }) => { await updateRag('test.ts', 'const test = () => console.log(\"hello\");'); console.log('Updated'); })"
  3. Integration test:

    • Create a test file with some TypeScript code
    • Use the file watcher to verify automatic updates
    • Query for relevant code snippets
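
For the file-watcher step, one plausible shape is a listener that filters change events and batches the affected paths so each file is re-indexed once per batch. The sketch below is illustrative only: `PendingChanges` and `makeChangeListener` are hypothetical names and do not describe how the PR's FileWatcherService is written.

```typescript
// Collects changed paths between flushes so each file is re-indexed once per batch.
class PendingChanges {
  private paths = new Set<string>();

  record(path: string): void {
    this.paths.add(path);
  }

  // Returns the de-duplicated batch in sorted order and resets the queue.
  drain(): string[] {
    const batch = Array.from(this.paths).sort();
    this.paths.clear();
    return batch;
  }
}

// Builds a listener with the (eventType, filename) shape that Node's fs.watch passes to its callback.
function makeChangeListener(
  pending: PendingChanges,
  extensions: string[] = ['.ts', '.tsx'],
) {
  return (_eventType: string, filename: string | null): void => {
    if (filename === null) return; // some platforms omit the filename
    if (filename.includes('node_modules')) return; // skip dependencies
    if (extensions.some((ext) => filename.endsWith(ext))) pending.record(filename);
  };
}
```

Wiring it up might look like `watch(rootDir, { recursive: true }, makeChangeListener(pending))` with `watch` imported from `node:fs`, plus a timer that calls `pending.drain()` and hands each path to the update tool. Batching matters because editors often emit several events for a single save.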

Testing Matrix

🍏 🪟 🐧
npm run
npx
Docker
Podman - -
Seatbelt - -

Linked issues / bugs

  • Resolves TypeScript module resolution issues with .js extensions
  • Fixes undefined value handling in Milvus collection operations
  • Adds the RAG components that were missing from the codebase (FileWatcherService, retrieve-code, update-rag)
  • Prevents potential runtime errors in vector database interactions
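
The undefined-value handling mentioned above could take the form of a small normalisation step between the database client and the rest of the code. The sketch below assumes a hypothetical response shape (`SearchResponse`) purely for illustration. It is not the Milvus SDK's actual type, and `normalizeHits` is not a function from the PR.

```typescript
// Hypothetical response shape for illustration only; not the Milvus SDK's type.
interface SearchResponse {
  results?: Array<{ id?: string; score?: number; content?: string }>;
}

interface Hit {
  id: string;
  score: number;
  content: string;
}

// Converts a possibly incomplete response into fully defined hits.
function normalizeHits(res: SearchResponse | undefined): Hit[] {
  // Treat a missing response or missing result list as "no hits".
  const rows = res?.results ?? [];
  const hits: Hit[] = [];
  for (const row of rows) {
    // Skip rows without an id rather than propagating undefined downstream.
    if (row.id === undefined) continue;
    hits.push({ id: row.id, score: row.score ?? 0, content: row.content ?? '' });
  }
  return hits;
}
```

Centralising the checks in one place means callers can rely on `Hit` fields being present, and test mocks only need to return partial objects.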

- Fixed ECMAScript import path extensions by adding .js extensions to all relative imports
- Added missing RAG components: FileWatcherService, retrieve-code, and update-rag
- Fixed MilvusCodeStorage test mocks to properly handle undefined values
- Updated system prompt to include RAG integration instructions
- All RAG tests now passing: MilvusCodeStorage (5/5), HippoRAG (4/4), RAGService (4/4)
- Resolved TypeScript module resolution issues
@hemangjoshi37a changed the title from "Main" to "Feat : Implement RAG System with HippoRAG & Milvus Integration, Fix Module Resolution and Test Mocks" on Aug 27, 2025
@hemangjoshi37a

[Image: Screenshot from 2025-08-27 16-25-30]

@hemangjoshi37a

  • HippoRAG, Milvus, and Chonkie Integration:
    • Added Milvus and Chonkie as dependencies.
    • Created a rag directory in packages/core/src to house the RAG implementation.
    • Implemented a MilvusCodeStorage class to handle the interaction with the Milvus database.
    • Implemented a HippoRAG class to provide a high-level API for the RAG functionality.
    • Implemented a RAGService to integrate the RAG functionality with the rest of the application.
  • New Tools:
    • Created a retrieve_code tool to allow the user to retrieve relevant code snippets from the RAG system.
    • Created an update_rag tool to allow the user to manually update the RAG system.
  • File Watcher:
    • Implemented a FileWatcherService to automatically update the RAG system when files are changed.
  • System Prompt:
    • Updated the system prompt to include information about the new RAG functionality and tools.
  • Build System:
    • Updated the build system to handle .node files, which are used by the onnxruntime-node dependency of chonkie.

@hemangjoshi37a

[Image: Screenshot from 2025-08-27 18-17-20]
