Building a Retrieval-Augmented Generation pipeline from the ground up — chunking, embedding, vector search, and LLM-powered Q&A with RAGAS evaluation metrics.
Most RAG tutorials use high-level abstractions that hide the mechanics. This project builds every component from scratch so you understand exactly what happens at each stage — from raw text to cited answers.
```mermaid
graph LR
    A[PDF / Text Documents] --> B[Document Loader]
    B --> C[Text Chunking]
    C --> D[Embedding Generation]
    D --> E[ChromaDB Vector Store]
    F[User Query] --> G[Query Embedding]
    G --> H[Similarity Search]
    E --> H
    H --> I[Context Assembly]
    I --> J[LLM Generation]
    J --> K[Answer + Sources]
    K --> L[RAGAS Evaluation]
```
| Feature | Description |
|---|---|
| Document Loading | PDF and text file ingestion with metadata extraction |
| Semantic Chunking | Recursive text splitting with configurable overlap |
| Vector Embeddings | OpenAI/HuggingFace embedding models |
| Similarity Search | Cosine similarity over ChromaDB vector store |
| LLM Generation | Context-grounded answer generation with source citations |
| RAGAS Evaluation | Automated metrics — faithfulness, relevance, context precision |
| Flask Chat UI | Interactive web interface for document Q&A |
- Load PDFs and text files using custom parsers
- Extract metadata (filename, page number, section)
- Handle encoding edge cases (a loader sketch follows this list)
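A minimal loader sketch, assuming `pypdf` for PDF parsing; the repo's own parser and metadata fields may differ:

```python
# Hypothetical loader sketch (pypdf assumed); the repo's parser may differ.
from pathlib import Path

from pypdf import PdfReader


def load_document(path: str) -> list[dict]:
    """Return one {"text", "metadata"} record per PDF page or text file."""
    p = Path(path)
    if p.suffix.lower() == ".pdf":
        reader = PdfReader(str(p))
        return [
            {
                "text": page.extract_text() or "",
                "metadata": {"filename": p.name, "page": i + 1},
            }
            for i, page in enumerate(reader.pages)
        ]
    # Text files: decode leniently so odd encodings don't crash ingestion.
    text = p.read_bytes().decode("utf-8", errors="replace")
    return [{"text": text, "metadata": {"filename": p.name, "page": 1}}]
```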
- Recursive character text splitter
- Configurable chunk size (default: 1000 tokens) and overlap (200 tokens)
- Preserves paragraph boundaries where possible (see the sketch below)
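A simplified sketch of the splitting idea, measured in characters rather than tokens for brevity; the repo's `chunking.py` may differ:

```python
# Greedy splitter sketch: prefers paragraph breaks, keeps overlap between chunks.
def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    chunks: list[str] = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        if end < len(text):
            # Break at the last blank line inside the window, if there is one.
            cut = text.rfind("\n\n", start, end)
            if cut > start:
                end = cut
        chunks.append(text[start:end].strip())
        if end >= len(text):
            break
        # Step back by `overlap` characters so neighbouring chunks share context.
        start = max(end - overlap, start + 1)
    return [c for c in chunks if c]
```

Without the step-back, a sentence straddling two chunks would be fully retrievable from neither.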
- Generate dense vector embeddings (OpenAI `text-embedding-3-small` or HuggingFace alternatives)
- Store in ChromaDB with metadata for filtered retrieval
- Persistent storage for production use (indexing sketch below)
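A sketch of the embed-and-index step, assuming the official `openai` and `chromadb` clients; the collection name and storage path are illustrative:

```python
# Embed chunks with OpenAI and persist them in ChromaDB.
# "rag_docs" and "./chroma_db" are illustrative names, not the repo's.
import chromadb
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment
chroma = chromadb.PersistentClient(path="./chroma_db")
collection = chroma.get_or_create_collection(
    name="rag_docs",
    metadata={"hnsw:space": "cosine"},  # rank by cosine distance
)


def index_chunks(chunks: list[str], metadatas: list[dict]) -> None:
    resp = openai_client.embeddings.create(
        model="text-embedding-3-small", input=chunks
    )
    collection.add(
        ids=[f"chunk-{i}" for i in range(len(chunks))],
        embeddings=[item.embedding for item in resp.data],
        documents=chunks,
        metadatas=metadatas,
    )
```

ChromaDB defaults to L2 distance, so the `hnsw:space` setting is what makes the search cosine-based, matching the similarity search described here.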
- Embed user query → cosine similarity search → top-k retrieval
- Context window assembly with source tracking
- LLM generates grounded answers with inline citations (query sketch below)
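A query-time sketch reusing the clients from the indexing example; the prompt wording and citation format are illustrative:

```python
# Retrieve the top-k chunks for a query and generate a cited answer.
def answer(question: str, k: int = 4) -> str:
    q_emb = openai_client.embeddings.create(
        model="text-embedding-3-small", input=[question]
    ).data[0].embedding
    results = collection.query(query_embeddings=[q_emb], n_results=k)
    docs, metas = results["documents"][0], results["metadatas"][0]
    # Label each chunk with its source so the model can cite it inline.
    context = "\n\n".join(
        f"[{m['filename']} p.{m['page']}] {d}" for d, m in zip(docs, metas)
    )
    completion = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "Answer using only the provided context. "
                "Cite sources with the bracketed labels.",
            },
            {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return completion.choices[0].message.content
```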
- Faithfulness: Is the answer supported by retrieved context?
- Answer Relevancy: Does the answer address the question?
- Context Precision: Are the retrieved chunks relevant?
- Context Recall: Did retrieval capture all needed information? (scoring sketch below)
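A scoring sketch, assuming the ragas 0.1-style API (newer releases rename some imports) and the `datasets` package; all field values are made up for illustration:

```python
# Score one Q&A example on the four RAGAS metrics above.
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    answer_relevancy,
    context_precision,
    context_recall,
    faithfulness,
)

dataset = Dataset.from_dict({
    "question": ["What is the refund policy?"],
    "answer": ["Refunds are issued within 30 days [policy.pdf p.2]."],
    "contexts": [["Customers may request a refund within 30 days of purchase."]],
    "ground_truth": ["Refunds are available for 30 days after purchase."],
})

scores = evaluate(
    dataset,
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
)
print(scores)  # per-metric averages between 0 and 1
```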
- Python 3.11+
- OpenAI API key
```bash
git clone https://github.com/Nagavenkatasai7/rag-from-scratch.git
cd rag-from-scratch/rag-project
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
export OPENAI_API_KEY="your-key-here"
python app.py
```

Open http://localhost:5000 to use the chat interface.
```
rag-from-scratch/
├── rag-project/
│   ├── app.py                 # Flask chat UI
│   ├── rag_pipeline.py        # Core RAG pipeline
│   ├── chunking.py            # Text splitting logic
│   ├── embeddings.py          # Embedding generation
│   ├── retriever.py           # Vector search & retrieval
│   ├── evaluation.py          # RAGAS evaluation metrics
│   ├── templates/             # HTML templates for Flask UI
│   └── data/                  # Sample documents
├── RAG-FROM-SCRATCH-GUIDE.md  # Detailed implementation guide
├── requirements.txt
└── README.md
```
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Framework | LangChain, Flask |
| Vector Store | ChromaDB |
| Embeddings | OpenAI / HuggingFace |
| LLM | GPT-4o / GPT-3.5-turbo |
| Evaluation | RAGAS |
| Frontend | Flask + HTML/CSS |
- How chunking strategy directly impacts retrieval quality
- Why overlap matters — without it, answers miss context at chunk boundaries
- RAGAS evaluation reveals failure modes invisible to manual testing
- ChromaDB's metadata filtering enables precise document-scoped queries (filter example below)
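For example, ChromaDB's `where` parameter scopes a similarity search to a single source document; the `filename` field assumes the metadata schema sketched earlier:

```python
# Restrict retrieval to chunks from one file via a metadata filter.
results = collection.query(
    query_embeddings=[q_emb],
    n_results=4,
    where={"filename": "policy.pdf"},  # illustrative field name
)
```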
MIT — see LICENSE for details.