shivamworld0608/Ollabot

🔍 Smart RAG-Based Q&A Assistant

An AI-powered assistant that answers natural-language questions about uploaded PDFs using semantic retrieval and LLM-based re-ranking. Each answer is shown with relevance scores and the source passage highlighted in the PDF, so it can be traced back to the document.



🧠 Features

| Feature | Description |
| --- | --- |
| 📄 PDF Upload | Upload and parse PDF with page-level metadata |
| 🔍 Semantic Search | Embed + retrieve most relevant chunks using similarity scoring |
| 🧠 Re-Ranking | Use LLM to sort top chunks before answering |
| 💬 Chat Support | Ask questions via text or voice |
| 📊 Similarity Scores | View how relevant each chunk is to your query |
| 🔦 PDF Highlighting | See exactly which paragraph the answer came from |
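
The re-ranking feature can be pictured with a minimal sketch. This README does not show how the project prompts the LLM, so the scorer is passed in as a function; `rerank` and `keyword_overlap` are hypothetical names, and the keyword scorer is only a stand-in for a model-produced relevance score.

```python
def rerank(question, chunks, score_fn, top_n=3):
    """Re-order retrieved chunks by a relevance score and keep the best top_n.

    score_fn(question, chunk) -> float is an assumption: in the project it
    would wrap a call to the LLM, which is not shown in this README.
    """
    scored = [(score_fn(question, chunk), chunk) for chunk in chunks]
    # Sort on the score only; Python's sort is stable, so ties keep retrieval order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_n]]


def keyword_overlap(question, chunk):
    """Stand-in scorer: fraction of question words that also appear in the chunk."""
    q_words = set(question.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / len(q_words) if q_words else 0.0
```

Keeping the scorer pluggable means the sorting logic can be tested without a running model.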

🛠️ Tech Stack

| Layer | Tools |
| --- | --- |
| Frontend | React.js, TailwindCSS, React-PDF, Web Speech API |
| Backend (Node.js) | Node.js, Express.js, OAuth, JWT |
| Backend (AI-Engine) | Python, FastAPI, PyMuPDF |
| LLM | Ollama |
| Embeddings | all-MiniLM-L6-v2 |
| Vector DB | ChromaDB |

⚙️ Project Workflow

A clear separation of responsibilities ensures maintainability and scalability. The project is divided into two major flows:


🛠️ Admin Workflow (Document Management & Embedding)

| Step | Description |
| --- | --- |
| 1️⃣ | PDF Upload: Admin uploads PDF files via the dashboard. |
| 2️⃣ | PDF Parsing: The system extracts text from each page using PyMuPDF. |
| 3️⃣ | Text Chunking: Extracted text is split into smaller chunks (with token limits). |
| 4️⃣ | Metadata Addition: Each chunk is enriched with metadata: chunk_index, page_number, text_start, text_end, source, etc. |
| 5️⃣ | Embedding Generation: Each chunk is passed through an embedding model (e.g., all-MiniLM-L6-v2) and stored in a vector database like ChromaDB. |
| ✅ | Ready for Querying: Admin-processed files are now available for user interaction. |
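
The chunking and metadata steps above can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: page text is taken as a plain list of strings (in the project it would come from PyMuPDF, e.g. `page.get_text()`), words stand in for tokens, and `Chunk` and `chunk_pages` are hypothetical names.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_index: int  # running index across the whole document
    page_number: int  # 1-based page the chunk came from
    text_start: int   # character offset of the chunk within its page
    text_end: int     # exclusive end offset within the page
    source: str       # file name of the originating PDF
    text: str


def chunk_pages(pages, source, max_words=50):
    """Split each page's text into chunks of at most max_words words.

    Assumption: a word count approximates the token limit; a real tokenizer
    would change the limit but not the offset bookkeeping.
    """
    chunks = []
    for page_number, page_text in enumerate(pages, start=1):
        # Character span of every whitespace-separated word on the page.
        spans = [(m.start(), m.end()) for m in re.finditer(r"\S+", page_text)]
        for i in range(0, len(spans), max_words):
            group = spans[i:i + max_words]
            start, end = group[0][0], group[-1][1]
            chunks.append(
                Chunk(len(chunks), page_number, start, end, source, page_text[start:end])
            )
    return chunks
```

Storing `text_start` and `text_end` per page is what would later let a viewer locate the exact passage to highlight.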

🙋 User Workflow (Querying via Text or Voice)

| Step | Description |
| --- | --- |
| 1️⃣ | File Selection: User selects a specific PDF file to query from the list of uploaded documents. |
| 2️⃣ | Input Method: User types a question or uses voice input (handled via react-speech-recognition). |
| 3️⃣ | Embedding Query: User query is converted into an embedding and searched in the vector database (ChromaDB). |
| 4️⃣ | Top-k Retrieval: Most similar chunks are retrieved based on cosine similarity. |
| 5️⃣ | Contextual Prompt Construction: Retrieved chunks and their metadata are appended to the query as context. |
| 6️⃣ | Answer Generation: The query and the retrieved context are sent to a language model (e.g., a model served through Ollama, as listed in the tech stack) to generate the answer. |
| 7️⃣ | Answer Display: The answer is rendered in the UI, together with metadata such as page number and similarity score. |
| 8️⃣ | PDF Viewer Sync (Bonus): The source chunk is highlighted in the PDF viewer with react-pdf. |
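
Steps 4 and 5 can be illustrated with a small, dependency-free sketch. In the project the similarity search is handled by ChromaDB; the code below only spells out the arithmetic by hand. The record layout (`text`, `page_number`, `embedding`), the prompt wording, and the function names are assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, records, k=3):
    """Return the k (score, record) pairs most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, r["embedding"]), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(question, hits):
    """Join the retrieved chunks, tagged with page and score, ahead of the question."""
    context = "\n".join(
        f"[page {r['page_number']}, score {score:.2f}] {r['text']}"
        for score, r in hits
    )
    return (
        "Answer the question using only the context below.\n\n"
        + "Context:\n" + context + "\n\n"
        + f"Question: {question}"
    )
```

Carrying the score and page number alongside each chunk is what makes the later display of similarity scores and PDF highlighting possible.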

✅ Bonus Features Implemented

  • 🔎 Similarity Score Display
  • 📄 PDF Viewer with Highlighted Chunks
  • 🎙️ Voice Input Support
  • 🔐 Admin-only Access for Upload & Embedding

📷 Screenshots

[Screenshots: Chat Interface, PDF Highlight]

Setup Instructions

  1. Clone the Repository

    git clone https://github.com/shivamworld0608/Ollabot.git
    cd Ollabot
    
  2. Create a .env file in backend

    # OAuth credentials (replace the placeholders with your own values)
    GOOGLE_CLIENT_ID=<your-google-client-id>
    GOOGLE_CLIENT_SECRET=<your-google-client-secret>

    # JWT credentials
    JWT_SECRET="<a-long-random-secret>"
    JWT_EXPIRES_IN="30d"
    JWT_COOKIE_EXPIRES_IN=30

    # Basic server settings
    MONGO_URI="mongodb+srv://<username>:<password>@<cluster-host>/"
    CLIENT_URL="http://localhost:5173"
    AI_ENGINE_URL="http://localhost:8000"
    SERVER_URL="http://localhost:5000"
    PORT=5000
    
    
    
  3. Create a .env file in frontend

    VITE_APP_BASE_URL='http://localhost:5000'
    
    
  4. AI-Engine Setup (FastAPI)

    cd ai-engine
    pip install -r requirements.txt
    python main.py
    
  5. Backend Setup (Node.js, Express)

    cd backend
    npm i
    npx nodemon server.js
    
  6. Frontend Setup (React)

    cd frontend
    npm install
    npm run dev
    

Make sure the frontend .env points to the correct backend URL (VITE_APP_BASE_URL) and the backend .env points to the correct AI engine URL (AI_ENGINE_URL).
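
As an optional sanity check before starting the three services, the two .env files can be compared with a short script. This is a sketch, not part of the repository: `parse_env` and `check_env` are hypothetical helpers, and the rule that VITE_APP_BASE_URL should equal SERVER_URL is an assumption based on both being set to http://localhost:5000 in the examples above.

```python
def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments; surrounding quotes are stripped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_env(backend_text, frontend_text):
    """Return a list of human-readable problems; an empty list means the files agree."""
    backend = parse_env(backend_text)
    frontend = parse_env(frontend_text)
    problems = []
    for key in ("SERVER_URL", "AI_ENGINE_URL", "CLIENT_URL", "PORT"):
        if key not in backend:
            problems.append(f"backend .env is missing {key}")
    # Assumption: the frontend talks to the backend at SERVER_URL.
    if frontend.get("VITE_APP_BASE_URL") != backend.get("SERVER_URL"):
        problems.append("VITE_APP_BASE_URL does not match SERVER_URL")
    return problems
```

The functions take file contents rather than paths, so they can be called with `open("backend/.env").read()` and `open("frontend/.env").read()` from the repository root.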

🤝 Contributing

We welcome contributions from the community! Here's how you can help:

📌 Guidelines

  • 📝 Open an Issue
    For major features or changes, please open an issue first to discuss your ideas.

  • 📂 Follow Standards
    Stick to the existing project structure and naming conventions for consistency.

  • ✅ Test Before Push
    Ensure all features are tested and stable before submitting a pull request.


📬 Contact

Have questions, feedback, or just want to connect? Feel free to reach out!

| Platform | Link |
| --- | --- |
| GitHub | @shivamworld0608 |
| Email | [email protected] |
| LinkedIn | linkedin.com/in/pandey-shivam- |
