Local-RAG is a Retrieval-Augmented Generation (RAG) application designed to work entirely with local resources.
- Vector Database: Qdrant
- Language Model: Ollama (Llama 3.1)
- Embedding Model: Snowflake/snowflake-arctic-embed-m-v1.5
- User Interface: Streamlit
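
These components fit together in the usual RAG loop: embed the user's question, retrieve the nearest chunks from Qdrant, and have the Ollama-served model answer over them. Below is a minimal sketch of that flow, not the app's actual code; the collection name, the `text` payload field, and the prompt wording are illustrative assumptions.

```python
# Minimal sketch of the retrieval-generation loop (illustrative, not the
# app's actual code). Assumes Qdrant and Ollama are running locally and
# `pip install sentence-transformers qdrant-client ollama`.
from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
import ollama

embedder = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")
qdrant = QdrantClient(host="localhost", port=6333)

question = "What do the documents say about deployment?"
# Embed the question and fetch the closest chunks ("my_docs" is illustrative).
# (Arctic-embed models recommend a query prefix for retrieval; omitted for brevity.)
hits = qdrant.search(
    collection_name="my_docs",
    query_vector=embedder.encode(question).tolist(),
    limit=3,
)
context = "\n\n".join(hit.payload["text"] for hit in hits)

# Ask the locally served Llama 3.1 to answer from the retrieved context.
reply = ollama.chat(
    model="llama3.1",
    messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
)
print(reply["message"]["content"])
```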
- Contextual Conversations
- History Tracking using SQLite
- Advanced RAG Features (reranking, query processing, handling complex queries, etc.); see the reranking sketch after this list
- Configurable Setup
- Robust error handling and logging
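
The repository doesn't spell out which reranker the advanced features use; a common way to implement a reranking step is a cross-encoder that re-scores the retrieved chunks against the query. A sketch of that approach (the model choice and the `rerank` helper are illustrative, not the repo's code):

```python
# Illustrative cross-encoder reranking, assuming `pip install sentence-transformers`.
# The model below is a common public choice, not necessarily the one this repo uses.
from sentence_transformers import CrossEncoder

reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def rerank(query: str, chunks: list[str], top_k: int = 3) -> list[str]:
    """Re-score retrieved chunks against the query and keep the best ones."""
    scores = reranker.predict([(query, chunk) for chunk in chunks])
    ranked = sorted(zip(scores, chunks), key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in ranked[:top_k]]
```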
- Clone the repository

  ```bash
  git clone https://github.com/pmgautam/local-rag.git
  cd local-rag
  ```
- Create virtual environment

  ```bash
  python -m venv venv
  source venv/bin/activate   # Linux/Mac
  # or
  .\venv\Scripts\activate    # Windows
  ```
- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Install and start Qdrant

  ```bash
  # Start Qdrant with persistent storage and default ports
  docker run -d \
    -p 6333:6333 \
    -p 6334:6334 \
    -v $(pwd)/qdrant_storage:/qdrant/storage:z \
    qdrant/qdrant
  ```
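
  Once the container is up, you can confirm Qdrant is reachable from Python (a quick check, using the `qdrant-client` package):

  ```python
  # Quick connectivity check for the local Qdrant instance.
  from qdrant_client import QdrantClient

  client = QdrantClient(host="localhost", port=6333)
  print(client.get_collections())  # prints an (initially empty) collection list
  ```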
- Install Ollama and download model

  ```bash
  # Install Ollama from ollama.com
  # Pull the Llama 3.1 model
  ollama pull llama3.1
  ```
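
  To confirm the model responds, run `ollama run llama3.1` in a terminal, or do a minimal smoke test from Python (assumes the official `ollama` client and Ollama's default port 11434):

  ```python
  # Minimal smoke test against the local Ollama server.
  import ollama

  reply = ollama.chat(
      model="llama3.1",
      messages=[{"role": "user", "content": "Say hello in one sentence."}],
  )
  print(reply["message"]["content"])
  ```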
```bash
# Index documents with default settings
python -m app.indexer --folder /path/to/documents --collection my_docs
```
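
An indexer of this kind typically splits each document into chunks, embeds them, and upserts the vectors into a Qdrant collection. The sketch below illustrates that flow; the real `app.indexer` may chunk, name, and store things differently. The 768-dimension vector size matches snowflake-arctic-embed-m-v1.5.

```python
# Illustrative indexing flow; the repo's app.indexer may differ in details.
import uuid
from pathlib import Path

from sentence_transformers import SentenceTransformer
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams

embedder = SentenceTransformer("Snowflake/snowflake-arctic-embed-m-v1.5")
client = QdrantClient(host="localhost", port=6333)

# snowflake-arctic-embed-m-v1.5 produces 768-dimensional vectors.
client.recreate_collection(
    collection_name="my_docs",
    vectors_config=VectorParams(size=768, distance=Distance.COSINE),
)

def chunk(text: str, size: int = 1000) -> list[str]:
    """Naive fixed-size chunking; real indexers often split on sentence boundaries."""
    return [text[i : i + size] for i in range(0, len(text), size)]

points = []
for path in Path("/path/to/documents").glob("*.txt"):
    for piece in chunk(path.read_text()):
        points.append(
            PointStruct(
                id=str(uuid.uuid4()),
                vector=embedder.encode(piece).tolist(),
                payload={"text": piece, "source": path.name},
            )
        )

client.upsert(collection_name="my_docs", points=points)
```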
```bash
# Start with default configuration
streamlit run app/chat_app.py

# Start with custom configuration
streamlit run app/chat_app.py -- --config path/to/config.yaml
```
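
The bare `--` matters: Streamlit stops parsing its own options there and forwards everything after it to the script via `sys.argv`, so ordinary argument parsing works. A sketch of how the script could pick up the flag (the actual handling in `app/chat_app.py` may differ):

```python
# Sketch: reading script-level flags inside a Streamlit app. Arguments after
# the standalone `--` reach the script unchanged, so plain argparse works.
# The actual flag handling in app/chat_app.py may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--config", default="config.yaml", help="path to a config file")
args, _ = parser.parse_known_args()
print(f"Loading configuration from {args.config}")
```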
Contributions are welcome! Please open a pull request with any improvements you would like to make.