🎈 Anveshak AI - Query English Documents in Sanskrit

Anveshak AI is a Retrieval-Augmented Generation (RAG) application built with Streamlit that lets users query English documents in Sanskrit. It builds a vector database for the selected document and answers queries using Ollama and LangChain.

🌟 Features

✅ Upload & Process PDFs - Converts documents into vectorized data for efficient retrieval.

✅ Multi-Language Query Support - Users can ask questions in Sanskrit, and the system retrieves relevant English information.

✅ Advanced AI Models - Utilizes Ollama embeddings and LLM models to enhance query responses.

✅ Seamless Integration - Built with Streamlit, allowing for an interactive and user-friendly experience.

✅ Efficient Query Handling - Uses LangChain to retrieve relevant passages from the document and supply them to the model as context.
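The retrieval step behind these features can be illustrated with a small, dependency-free sketch. It is not the project's code: the function names are hypothetical, and a toy word-count embedding stands in for the Ollama embedding model and the ChromaDB store. Because that toy embedding has no cross-lingual ability, the example query is in English rather than Sanskrit.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (hypothetical helper)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


chunks = [
    "Streamlit builds the user interface.",
    "ChromaDB stores the document vectors.",
    "Ollama runs the language model locally.",
]
print(retrieve("Where are vectors stored?", chunks, k=1))
```

In a real RAG pipeline the retrieved chunks would then be passed to the LLM together with the user's question.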

🚀 Installation & Setup

1️⃣ Clone the Repository

 git clone https://github.com/PythonicVarun/Anveshak-AI.git
 cd Anveshak-AI

2️⃣ Set Up Virtual Environment (Recommended)

 python -m venv venv
 source venv/bin/activate   # For Linux/macOS
 venv\Scripts\activate      # For Windows

3️⃣ Install Dependencies

 pip install -r requirements.txt

4️⃣ Pull the Required Ollama Models

Ensure you have the required models before running the application:

 ollama pull nomic-embed-text
 ollama pull llama2

5️⃣ Create Environment File

Copy the provided .env.example to a new file named .env. This file contains the default environment settings including Ollama host, vector DB path, and logging level. You can modify it if needed.

 cp .env.example .env
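For readers curious how a .env file like this can be consumed, here is a minimal sketch using only the standard library. It is an assumption, not the project's loader (the application may well rely on a package such as python-dotenv), and `load_env_file` and `apply_env` are hypothetical names.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def apply_env(values: dict[str, str]) -> None:
    """Export parsed values without overriding variables that are already set."""
    for key, value in values.items():
        os.environ.setdefault(key, value)


apply_env(load_env_file(".env"))
```

Using `setdefault` means a value exported in the shell takes precedence over the file, which is a common convention for local overrides.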

6️⃣ Set Environment Variables

Set the following environment variable to avoid issues with Protocol Buffers (in Windows Command Prompt, use set instead of export):

 export PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python
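As an alternative to exporting the variable in every shell session, it can be set from Python at startup. This is a sketch, not something the repository is known to do, and it only helps if it runs before any module that imports protobuf is loaded.

```python
import os

# Must run before importing any library that loads google.protobuf,
# otherwise the setting is read too late to take effect.
os.environ.setdefault("PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION", "python")

print(os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"])
```

`setdefault` leaves any value already exported in the shell untouched.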

▶️ Running the Application

 python run.py

🔧 How to Use

1️⃣ Upload a PDF or select a sample document from the provided list.

2️⃣ Choose an LLM from the available Ollama models.

3️⃣ Type your query in Sanskrit 📜 in the chatbox.

4️⃣ The system processes the question and returns an answer based on the document's content.

5️⃣ Click "Delete Collection" if you want to clear uploaded documents from memory.
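To make steps 3 and 4 more concrete, the sketch below shows one way a Sanskrit question and retrieved English passages could be combined into a single prompt for the model. The prompt wording and the `build_prompt` helper are assumptions for illustration; the application's actual prompt is not documented here.

```python
def build_prompt(question: str, context_chunks: list[str]) -> str:
    """Combine retrieved English passages and a Sanskrit question (hypothetical)."""
    context = "\n\n".join(
        f"[{i}] {chunk}" for i, chunk in enumerate(context_chunks, start=1)
    )
    return (
        "Answer the question using only the context below.\n"
        "The question is written in Sanskrit; the context is in English.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_prompt(
    "अस्य ग्रन्थस्य विषयः कः?",  # "What is the subject of this book?"
    ["This book introduces vector databases.", "Chapter 2 covers embeddings."],
)
print(prompt)
```

Numbering the passages makes it easier to ask the model to point to the passage it relied on.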

📦 Dependencies

Python 3.9+ 🐍
Streamlit (for UI)
Ollama & LangChain (for AI processing)
ChromaDB (for vector storage)
PDFPlumber (for PDF parsing)

📜 Submission for Hackademia 2k25

Anveshak AI is built as part of the Hackademia 2k25 hackathon challenge to push the boundaries of AI-assisted multilingual knowledge retrieval! 🚀

🌟 Star ⭐ this repo if you like this project!

"Unlock the power of Sanskrit queries with AI-powered retrieval!" 🚀

Built with ❤️ by Varun Agnihotri!

Follow me on GitHub | X | LinkedIn | Instagram
