
# RAG from Scratch

Building a Retrieval-Augmented Generation pipeline from the ground up — chunking, embedding, vector search, and LLM-powered Q&A with RAGAS evaluation metrics.



## Why This Project?

Most RAG tutorials use high-level abstractions that hide the mechanics. This project builds every component from scratch so you understand exactly what happens at each stage — from raw text to cited answers.


## Architecture

```mermaid
graph LR
    A[PDF / Text Documents] --> B[Document Loader]
    B --> C[Text Chunking]
    C --> D[Embedding Generation]
    D --> E[ChromaDB Vector Store]

    F[User Query] --> G[Query Embedding]
    G --> H[Similarity Search]
    E --> H
    H --> I[Context Assembly]
    I --> J[LLM Generation]
    J --> K[Answer + Sources]

    K --> L[RAGAS Evaluation]
```

## Key Features

| Feature | Description |
| --- | --- |
| Document Loading | PDF and text file ingestion with metadata extraction |
| Semantic Chunking | Recursive text splitting with configurable overlap |
| Vector Embeddings | OpenAI/HuggingFace embedding models |
| Similarity Search | Cosine similarity over ChromaDB vector store |
| LLM Generation | Context-grounded answer generation with source citations |
| RAGAS Evaluation | Automated metrics — faithfulness, relevance, context precision |
| Flask Chat UI | Interactive web interface for document Q&A |

## Pipeline Deep-Dive

### 1. Document Ingestion

- Load PDFs and text files using custom parsers
- Extract metadata (filename, page number, section)
- Handle encoding edge cases
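The encoding-fallback idea can be sketched as follows. This is an illustrative loader for plain-text files, not the project's actual parser; the function name `load_text_document` and the returned dict shape are assumptions for the example:

```python
from pathlib import Path


def load_text_document(path: str) -> dict:
    """Load a text file, trying UTF-8 first and falling back to Latin-1
    (which accepts any byte sequence), and attach basic metadata."""
    raw = Path(path).read_bytes()
    for encoding in ("utf-8", "latin-1"):
        try:
            text = raw.decode(encoding)
            break
        except UnicodeDecodeError:
            continue
    return {
        "text": text,
        "metadata": {"filename": Path(path).name, "encoding": encoding},
    }
```

A real ingestion layer would add per-page records for PDFs (e.g. via `pypdf`) so page numbers survive into the chunk metadata.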

### 2. Chunking Strategy

- Recursive character text splitter
- Configurable chunk size (default: 1000 tokens) and overlap (200 tokens)
- Preserves paragraph boundaries where possible
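The role of overlap can be shown with a simplified sliding-window chunker. The project's splitter is recursive and boundary-aware; this sketch drops those refinements to make the overlap arithmetic visible (`chunk_text` and its character-based sizing are assumptions for the example):

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping fixed-size character windows.
    Each chunk starts (chunk_size - overlap) characters after the previous
    one, so neighbouring chunks share `overlap` characters of context."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last window already reached the end of the text
    return chunks
```

Because consecutive chunks share their boundary region, a sentence straddling a cut point is still fully contained in at least one chunk, which is exactly why overlap improves retrieval at boundaries.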

### 3. Embedding & Indexing

- Generate dense vector embeddings (OpenAI text-embedding-3-small or HuggingFace alternatives)
- Store in ChromaDB with metadata for filtered retrieval
- Persistent storage for production use
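What ChromaDB does under the hood can be illustrated with a toy in-memory store: vectors plus metadata, cosine-similarity search, and optional metadata filtering. This is a didactic stand-in, not the project's retriever, and the class and method names are invented for the sketch:

```python
import math


class InMemoryVectorStore:
    """Toy stand-in for ChromaDB: stores (id, vector, metadata) records and
    supports cosine-similarity search with an optional metadata filter."""

    def __init__(self):
        self.records = []  # list of (doc_id, vector, metadata) tuples

    def add(self, doc_id, vector, metadata):
        self.records.append((doc_id, vector, metadata))

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def query(self, vector, top_k=3, where=None):
        # Filter first (ChromaDB's `where` clause), then rank by similarity.
        candidates = [
            r for r in self.records
            if where is None or all(r[2].get(k) == v for k, v in where.items())
        ]
        ranked = sorted(candidates, key=lambda r: self._cosine(vector, r[1]), reverse=True)
        return [(doc_id, meta) for doc_id, _, meta in ranked[:top_k]]
```

A production store replaces the linear scan with an approximate-nearest-neighbour index and persists records to disk, which is what ChromaDB's persistent client provides.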

### 4. Retrieval & Generation

- Embed user query → cosine similarity search → top-k retrieval
- Context window assembly with source tracking
- LLM generates grounded answers with inline citations
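The context-assembly step can be sketched as a prompt builder that numbers each retrieved chunk so the model can cite sources inline. The function name and prompt wording are assumptions; the project's actual template may differ:

```python
def assemble_context(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Build an LLM prompt from (chunk_text, source) pairs, numbering each
    chunk so the model can cite it inline as [1], [2], ..."""
    numbered = [
        f"[{i}] ({source}) {chunk}"
        for i, (chunk, source) in enumerate(retrieved, start=1)
    ]
    context = "\n".join(numbered)
    return (
        "Answer the question using only the context below. "
        "Cite sources inline with their bracketed numbers.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
```

Keeping the `[n] → source` mapping alongside the prompt is what lets the pipeline turn the model's inline citations back into filenames and page numbers in the final answer.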

### 5. Evaluation (RAGAS)

- **Faithfulness**: Is the answer supported by retrieved context?
- **Answer Relevancy**: Does the answer address the question?
- **Context Precision**: Are the retrieved chunks relevant?
- **Context Recall**: Did retrieval capture all needed information?
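To make the faithfulness idea concrete, here is a deliberately crude word-overlap proxy. RAGAS itself uses an LLM judge to verify each claim against the context; this sketch only illustrates the shape of the metric (a 0-to-1 score of how grounded the answer is) and is not the RAGAS implementation:

```python
def naive_faithfulness(answer: str, context: str) -> float:
    """Crude proxy for faithfulness: the fraction of answer words that
    also appear in the retrieved context. 1.0 means every answer word is
    grounded; low scores hint at hallucinated content."""
    normalize = lambda text: {w.lower().strip(".,!?") for w in text.split()}
    answer_words = normalize(answer)
    if not answer_words:
        return 0.0
    return len(answer_words & normalize(context)) / len(answer_words)
```

A word-overlap score misses paraphrase and negation, which is exactly why RAGAS delegates the claim-by-claim check to an LLM.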

## Quick Start

### Prerequisites

- Python 3.11+
- OpenAI API key

### Setup

```bash
git clone https://github.com/Nagavenkatasai7/rag-from-scratch.git
cd rag-from-scratch/rag-project
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
```

### Configure

```bash
export OPENAI_API_KEY="your-key-here"
```

### Run

```bash
python app.py
```

Open http://localhost:5000 to use the chat interface.


## Project Structure

```
rag-from-scratch/
├── rag-project/
│   ├── app.py              # Flask chat UI
│   ├── rag_pipeline.py     # Core RAG pipeline
│   ├── chunking.py         # Text splitting logic
│   ├── embeddings.py       # Embedding generation
│   ├── retriever.py        # Vector search & retrieval
│   ├── evaluation.py       # RAGAS evaluation metrics
│   ├── templates/          # HTML templates for Flask UI
│   └── data/               # Sample documents
├── RAG-FROM-SCRATCH-GUIDE.md  # Detailed implementation guide
├── requirements.txt
└── README.md
```

## Tech Stack

| Component | Technology |
| --- | --- |
| Language | Python 3.11+ |
| Framework | LangChain, Flask |
| Vector Store | ChromaDB |
| Embeddings | OpenAI / HuggingFace |
| LLM | GPT-4o / GPT-3.5-turbo |
| Evaluation | RAGAS |
| Frontend | Flask + HTML/CSS |

## What I Learned

- How chunking strategy directly impacts retrieval quality
- Why overlap matters — without it, answers miss context at chunk boundaries
- RAGAS evaluation reveals failure modes invisible to manual testing
- ChromaDB's metadata filtering enables precise document-scoped queries

## License

MIT — see LICENSE for details.
