🎈 Anveshak AI - Query English Documents in Sanskrit

Anveshak AI is a Streamlit application built on Retrieval-Augmented Generation (RAG) that lets users query English documents in Sanskrit. It builds a vector database for the selected document and answers queries using Ollama and LangChain.

🌟 Features

Upload & Process PDFs - Converts documents into vectorized data for efficient retrieval.

Multi-Language Query Support - Users can ask questions in Sanskrit, and the system retrieves relevant English information.

Ollama-Backed Models - Uses Ollama embedding models and LLMs to retrieve relevant passages and generate responses.

Seamless Integration - Built with Streamlit, allowing for an interactive and user-friendly experience.

Efficient Query Handling - Uses LangChain to connect retrieval and generation so that answers are grounded in the document's content.
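To make the "vectorized data" step more concrete, here is a minimal sketch of how extracted PDF text might be split into overlapping chunks before embedding. The function name, chunk size, and overlap are assumptions for illustration only; the project may well rely on a LangChain text splitter with different settings.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for piece in chunk_text("abcdefghij", chunk_size=4, overlap=1):
        print(piece)
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which usually helps retrieval.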

🚀 Installation & Setup

1️⃣ Clone the Repository

 git clone https://github.com/PythonicVarun/Anveshak-AI.git
 cd Anveshak-AI

2️⃣ Set Up Virtual Environment (Recommended)

 python -m venv venv
 source venv/bin/activate   # For Linux/macOS
 venv\Scripts\activate      # For Windows

3️⃣ Install Dependencies

 pip install -r requirements.txt

4️⃣ Pull the Required Ollama Models

Ensure you have the required models before running the application:

 ollama pull nomic-embed-text
 ollama pull llama2
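If you want to confirm that both models are present before launching the app, a small check along these lines may help. It is only a sketch: it assumes that `ollama list` prints a header row followed by one model per line with `name:tag` in the first column, and the helper names are hypothetical, not part of this repository.

```python
import subprocess

# Models named in the step above.
REQUIRED_MODELS = ("nomic-embed-text", "llama2")


def installed_models(listing: str) -> set[str]:
    """Extract model names (without the tag) from `ollama list` output."""
    names = set()
    for line in listing.splitlines()[1:]:  # skip the assumed header row
        parts = line.split()
        if parts:
            names.add(parts[0].split(":")[0])
    return names


def missing_models(listing: str, required=REQUIRED_MODELS) -> list[str]:
    """Return the required models that do not appear in the listing."""
    have = installed_models(listing)
    return [name for name in required if name not in have]


def check_ollama() -> list[str]:
    """Run `ollama list` and report which required models are absent."""
    result = subprocess.run(
        ["ollama", "list"], capture_output=True, text=True, check=True
    )
    return missing_models(result.stdout)
```

Calling `check_ollama()` returns an empty list when nothing is missing; otherwise, run `ollama pull` for each name it reports.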

5️⃣ Create Environment File

Copy the provided .env.example to a new file named .env. This file contains the default environment settings including Ollama host, vector DB path, and logging level. You can modify it if needed.

 cp .env.example .env
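For readers unfamiliar with .env files, the sketch below shows one way such a file could be read into a dictionary. It is an illustration under stated assumptions: the application may load its settings differently (for example through a dotenv library), and the key names mentioned in the comments are hypothetical; the real ones are listed in .env.example.

```python
from pathlib import Path


def read_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines, ignoring blanks and # comments."""
    settings: dict[str, str] = {}
    env_file = Path(path)
    if not env_file.exists():
        return settings

    for raw in env_file.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


if __name__ == "__main__":
    config = read_env_file()
    # Hypothetical key name; check .env.example for the actual one.
    print(config.get("LOG_LEVEL", "INFO"))
```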

6️⃣ Set Environment Variables

Set the following environment variable to avoid issues with Protocol Buffers:

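 export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python          # For Linux/macOS
 $env:PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION = "python"        # For Windows (PowerShell)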
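As an alternative to setting the variable in the shell, it can be set from Python, provided this happens before anything imports protobuf. The snippet below is a sketch of that idea; placing it at the very top of an entry script is an assumption, and this repository's run.py may or may not already do something similar.

```python
import os

# This must run before any module that imports protobuf is loaded.
# setdefault leaves a value already chosen in the shell untouched.
os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "python")
```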

▶️ Running the Application

 python run.py

🔧 How to Use

1️⃣ Upload a PDF or Select a Sample Document from the provided list.

2️⃣ Choose an LLM from the available Ollama models.

3️⃣ Type your query in Sanskrit 📜 in the chatbox.

4️⃣ The system processes the question and returns an answer based on the document's content.

5️⃣ Click "Delete Collection" if you want to clear uploaded documents from memory.
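What happens between steps 3 and 4 can be sketched roughly as follows: the query is embedded, the most similar document chunks are retrieved, and they are placed into the prompt sent to the LLM. This is a simplified illustration, not the app's code. In the application the vectors would come from the Ollama embedding model and the similarity search would be handled by ChromaDB; here the vectors are hand-written, and all function names and the prompt wording are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], chunks: list[tuple[str, list[float]]], k: int = 2) -> list[str]:
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine the retrieved chunks and the question into a single prompt."""
    context = "\n\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Toy 2-D vectors standing in for real embeddings.
    store = [("A", [1.0, 0.0]), ("B", [0.0, 1.0]), ("C", [0.7, 0.7])]
    print(build_prompt("example question", top_k([1.0, 0.1], store)))
```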

📦 Dependencies

  • Python 3.9+ 🐍
  • Streamlit (for UI)
  • Ollama & LangChain (for AI processing)
  • ChromaDB (for vector storage)
  • PDFPlumber (for PDF parsing)

📜 Submission for Hackademia 2k25

Anveshak AI is built as part of the Hackademia 2k25 hackathon challenge to push the boundaries of AI-assisted multilingual knowledge retrieval! 🚀


🌟 Star ⭐ this repo if you like this project!

"Unlock the power of Sanskrit queries with AI-powered retrieval!" 🚀

Built with ❤️ by Varun Agnihotri!

Follow me on GitHub | X | LinkedIn | Instagram