A Retrieval-Augmented Generation (RAG) application that allows you to chat with your PDF documents using AI. Built with Streamlit, LangChain, ChromaDB, and Ollama.
- 📄 PDF Document Processing: Upload and process multiple PDF files
- 🔍 Intelligent Search: Vector-based semantic search through your documents
- 💬 Interactive Chat: Natural language conversations with your documents
- 🤖 Local AI: Uses Ollama for privacy-focused local AI inference
- 📊 Document Management: View, manage, and delete processed documents
- ⚡ Streaming Responses: Real-time streaming of AI responses
- 🎛️ Model Selection: Choose from available Ollama models
- UI Framework: Streamlit
- LLM Host: Ollama (local AI models)
- Vector Database: ChromaDB
- Embeddings: HuggingFace Sentence Transformers
- Document Processing: LangChain + PyPDF2
- Dependency Management: UV
- Python 3.9+
- Ollama: Install and run Ollama locally
  ```bash
  # Install Ollama (visit https://ollama.ai for instructions)
  # Pull a model (e.g., llama2)
  ollama pull llama2
  ```
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd rag-chat-app
  ```

- Install UV (if not already installed):

  ```bash
  pip install uv
  ```

- Install dependencies:

  ```bash
  uv sync
  ```

- Set up environment variables:

  ```bash
  cp .env.example .env
  # Edit .env with your preferences
  ```

- Start Ollama (in a separate terminal):

  ```bash
  ollama serve
  ```

- Start the application:

  ```bash
  uv run streamlit run app.py
  ```

- Open your browser and navigate to http://localhost:8501

- Upload PDF documents:
  - Use the file uploader in the left panel
  - Click "Process Documents" to add them to your knowledge base

- Start chatting:
  - Ask questions about your documents in the chat interface
  - View sources and relevance scores for transparency (see the retrieval sketch below)
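The sources and relevance scores shown in the chat come from vector retrieval against ChromaDB. Below is a minimal sketch of that step, assuming the LangChain Chroma wrapper over the persisted database; the project's own rag_pipeline.py may implement it differently.

```python
# Sketch of relevance-scored retrieval against the persisted ChromaDB store.
# Assumes the LangChain Chroma wrapper; scores are normalized to 0-1, and chunks
# below SIMILARITY_THRESHOLD (default 0.7) would be dropped from the answer.
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma

store = Chroma(
    persist_directory="data/chroma_db",
    embedding_function=HuggingFaceEmbeddings(),
)
results = store.similarity_search_with_relevance_scores("What is the refund policy?", k=5)
for doc, score in results:
    if score >= 0.7:  # SIMILARITY_THRESHOLD
        print(f"{score:.2f}  {doc.metadata.get('source', '?')}  {doc.page_content[:80]}")
```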
```
rag-chat-app/
├── pyproject.toml               # UV dependency management
├── README.md                    # Project documentation
├── .env.example                 # Environment variables template
├── app.py                       # Main Streamlit application
├── config/
│   └── settings.py              # Application settings
├── src/
│   ├── document_processor.py    # PDF processing and chunking
│   ├── vector_store.py          # ChromaDB operations
│   ├── embeddings.py            # Embedding functionality
│   ├── llm_client.py            # Ollama LLM client
│   └── rag_pipeline.py          # RAG query pipeline
├── data/
│   ├── uploads/                 # Uploaded PDF files
│   └── chroma_db/               # ChromaDB storage
└── utils/
    └── helpers.py               # Helper functions
```
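To make the data flow through these modules concrete, here is a minimal, self-contained sketch of the ingest-and-query path. The import paths assume langchain_community-style packages and PyPDF2 3.x; the project's actual classes in src/ may look different.

```python
# Minimal sketch of the ingest -> retrieve -> generate flow. Assumes
# langchain_community imports and PyPDF2 3.x; the project's own modules
# (document_processor.py, vector_store.py, rag_pipeline.py) may differ.
from PyPDF2 import PdfReader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_community.llms import Ollama

# 1. Extract text from an uploaded PDF and split it into overlapping chunks.
reader = PdfReader("data/uploads/example.pdf")
text = "\n".join(page.extract_text() or "" for page in reader.pages)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = splitter.split_text(text)

# 2. Embed the chunks and persist them in ChromaDB.
store = Chroma.from_texts(
    chunks,
    HuggingFaceEmbeddings(),
    metadatas=[{"source": "example.pdf"}] * len(chunks),
    persist_directory="data/chroma_db",
)

# 3. Retrieve the most relevant chunks and ask the local Ollama model.
question = "What does the document say about pricing?"
docs = store.similarity_search(question, k=5)
context = "\n\n".join(d.page_content for d in docs)
llm = Ollama(model="llama2", base_url="http://localhost:11434")
print(llm.invoke(f"Answer from this context only:\n{context}\n\nQuestion: {question}"))
```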
Key configuration options in .env:

- OLLAMA_BASE_URL: Ollama server URL (default: http://localhost:11434)
- OLLAMA_MODEL: Default model to use (default: llama2)
- CHUNK_SIZE: Text chunk size for processing (default: 1000)
- CHUNK_OVERLAP: Overlap between chunks (default: 200)
- TOP_K_RESULTS: Number of relevant chunks to retrieve (default: 5)
- SIMILARITY_THRESHOLD: Minimum similarity score for relevance (default: 0.7)
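For reference, a hypothetical sketch of how config/settings.py might load these values; the variable names and defaults come from the list above, but the actual implementation may differ.

```python
# Hypothetical sketch of config/settings.py; assumes python-dotenv is installed.
import os
from dotenv import load_dotenv

load_dotenv()  # read values from .env in the project root

OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama2")
CHUNK_SIZE = int(os.getenv("CHUNK_SIZE", "1000"))
CHUNK_OVERLAP = int(os.getenv("CHUNK_OVERLAP", "200"))
TOP_K_RESULTS = int(os.getenv("TOP_K_RESULTS", "5"))
SIMILARITY_THRESHOLD = float(os.getenv("SIMILARITY_THRESHOLD", "0.7"))
```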
The application supports any model available in Ollama. Popular options:
- llama2: General-purpose model
- codellama: Code-focused model
- mistral: Fast and efficient model
- llama2:13b: Larger model for better quality

Pull models using:

```bash
ollama pull <model-name>
```
- "Ollama Not Connected":
  - Ensure Ollama is running: ollama serve
  - Check the URL in your .env file
  - Verify no firewall is blocking the connection (a quick connectivity check is sketched after this list)

- "No models found":
  - Pull at least one model: ollama pull llama2
  - Restart the application

- PDF processing errors:
  - Ensure PDFs are not password-protected
  - Check file size and format
  - Try with a different PDF

- Memory issues:
  - Reduce CHUNK_SIZE in .env
  - Process fewer documents at once
  - Use a smaller embedding model
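If the "Ollama Not Connected" error persists, the small script below can confirm whether the server is reachable and which models it has. It assumes the requests package is available and uses Ollama's /api/tags model-listing endpoint.

```python
# Quick check that Ollama is reachable and has at least one model installed.
import requests

base_url = "http://localhost:11434"  # keep in sync with OLLAMA_BASE_URL in .env
try:
    resp = requests.get(f"{base_url}/api/tags", timeout=5)
    resp.raise_for_status()
    models = [m["name"] for m in resp.json().get("models", [])]
    if models:
        print("Ollama is reachable. Installed models:", ", ".join(models))
    else:
        print("Ollama is reachable but has no models - run: ollama pull llama2")
except requests.RequestException as exc:
    print(f"Cannot reach Ollama at {base_url}: {exc}")
```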
- Use smaller models for faster responses (e.g., mistral)
- Adjust chunk size based on your documents
- Clear the knowledge base periodically to free memory
- Process documents in batches for large collections
- Custom Document Processors: Extend document_processor.py
- New Vector Stores: Implement an interface similar to vector_store.py
- Additional LLM Providers: Create a new client similar to llm_client.py (see the sketch below)
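As an illustration, here is a hypothetical provider client. The method names (generate/stream) are assumptions about the interface in src/llm_client.py; check that file for the real contract before wiring a new provider in.

```python
# Hypothetical example of an alternative LLM provider client.
from typing import Iterator


class EchoLLMClient:
    """Toy provider that echoes the prompt back - handy for wiring tests."""

    def __init__(self, model: str = "echo") -> None:
        self.model = model

    def generate(self, prompt: str) -> str:
        return f"[{self.model}] {prompt}"

    def stream(self, prompt: str) -> Iterator[str]:
        # Mirrors the app's streaming-responses feature by yielding tokens.
        for token in self.generate(prompt).split():
            yield token + " "
```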
```bash
# Run tests
uv run pytest tests/

# Format code
uv run black src/ utils/ config/

# Check code quality
uv run flake8 src/ utils/ config/

# Type checking
uv run mypy src/ utils/ config/
```
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.