Nutrition RAG Chatbot: From Scratch (Manual RAG Engineering)

A fully engineered Retrieval‑Augmented Generation (RAG) chatbot built from first principles. The system answers user questions grounded strictly in the textbook:

Human Nutrition (University of Hawai‘i at Mānoa)

PDF: https://pressbooks.oer.hawaii.edu/humannutrition2/open/download?type=pdf

Demo

Below is a demo of the chatbot in action, showcasing its ability to provide accurate, grounded responses based on the textbook content.

Introduction
Project Objectives
System Architecture Overview
Dataset and Source Material
Ingestion and Extraction
Chunking Approaches
Embedding and Vector Storage
PostgreSQL + pgvector Setup
Frontend and Backend
Setup Instructions (Local Installation)
Repository Structure
How Query Processing Works
Detailed Walkthrough Notebook
RAG Evaluation
Future Improvements
License

1. Introduction

This repository implements a simple chatbot that provides grounded responses based on information from the referenced nutrition textbook. The project is intentionally engineered without relying on turnkey RAG abstractions, enabling full visibility and control over:

Ingestion pipeline
Chunking logic
Embedding failure modes
Vector storage and retrieval
Response generation

2. Project Objectives

Build a fully manual RAG implementation end‑to‑end
Understand how ingestion affects downstream retrieval performance
Implement multiple chunking strategies and evaluate their impact
Store embeddings directly in PostgreSQL using pgvector
Build a minimal full‑stack application (Next.js frontend + backend API)
Document the entire process in a reproducible notebook

3. System Architecture Overview

Processing flow:

PDF → Extracted Text
→ Exploratory Data Analysis (token lengths, truncation risks)
→ Chunking (multiple engineering methods)
→ Embedding
→ PostgreSQL + pgvector storage
→ SQL‑based similarity search
→ LLM response generation (grounded output)

4. Dataset and Source Material

The chatbot is grounded on:

Human Nutrition, University of Hawai‘i at Mānoa

Full PDF: https://pressbooks.oer.hawaii.edu/humannutrition2/open/download?type=pdf

The PDF is stored locally at: data/human_nutrition_text_book.pdf

5. Ingestion and Extraction

Different document formats require different extraction pipelines:

Digital PDFs: PyMuPDF
Scanned documents: Tesseract OCR
Hybrid documents (tables, charts, layouts): DOCkling or layout‑aware OCR

Extraction quality directly affects tokenization, chunking behavior, embedding quality, and retrieval recall.

The ingestion pipeline is implemented in: scripts/ingest.py

6. Chunking Approaches

Six different chunking strategies were implemented and tested:

Method	Avg Tokens	Notes
Fixed‑size	~65	Predictable, but ignores meaning
Structure‑based	~1342	Matches chapter hierarchy but exceeds embedding windows
Semantic	~13	Very coherent but over‑fragmented
Recursive	~89	Best practical trade‑off
LLM‑based	~92	High quality, but costly
Hybrid	Variable	Combines structure‑awareness with window control

Key lessons:

Most real‑world RAG failures originate in chunking and ingestion, not the LLM.
Without performing dataset‑level EDA, many chunks silently truncate before embedding.

Detailed chunking experiments are documented in: notebooks/rag_chunking_strategies.ipynb

7. Embedding and Vector Storage

Embeddings are generated using:

all‑mpnet‑base‑v2

Each chunk is embedded and stored directly inside PostgreSQL using the pgvector extension, eliminating the need for external vector databases while enabling efficient similarity search within SQL.

8. PostgreSQL + pgvector Setup

Enable vector support:

create extension if not exists vector;

Create table:

create table if not exists public.chunks (
  id bigserial primary key,
  doc_id text not null,
  chunk_index int not null,
  content text not null,
  metadata jsonb default '{}'::jsonb,
  embedding vector(1024)
);

Create IVFFlat index:

create index if not exists idx_chunks_embedding
on public.chunks using ivfflat (embedding vector_cosine_ops)
with (lists=100);

Similarity search function:

create or replace function public.match_documents(
    query_embedding vector(1024),
    match_count int default 5,
    filter jsonb default '{}'::jsonb
) returns table (
  id bigint,
  doc_id text,
  chunk_index int,
  content text,
  metadata jsonb,
  similarity float
) language plpgsql stable as $$
begin
  return query
  select
    c.id,
    c.doc_id,
    c.chunk_index,
    c.content,
    c.metadata,
    1 - (c.embedding <=> query_embedding) as similarity
  from public.chunks c
  where (filter = '{}'::jsonb) or (c.metadata @> filter)
  order by (c.embedding <=> query_embedding)
  limit match_count;
end;
$$;

9. Frontend and Backend

The application uses:

Next.js for both UI and backend endpoints
Groq for securely handling model API keys
Backend endpoints handle:
- Query embedding
- SQL similarity search
- LLM grounded response generation

Frontend application: rag-chat/

Backend API route: rag-chat/src/app/api/chat/route.ts

Main chat interface: rag-chat/src/app/page.tsx

10. Setup Instructions (Local Installation)

Step 1 – Clone

git clone https://github.com/KushalRegmi61/rag.git
cd rag

Step 2 – Python Environment

python3 -m venv .venv
source .venv/bin/activate         # Linux/Mac
.venv\Scripts\activate            # Windows

pip install -r requirements.txt

See: requirements.txt

Step 3 – PostgreSQL + pgvector

Install pgvector:

sudo apt install postgresql-16-pgvector

Run the SQL scripts above to create tables and indexes.

Step 4 – Run Notebook

jupyter lab

Open: notebooks/production_level_from_scratch.ipynb

Execute fully to:

Extract book
Analyze token distribution
Chunk
Compute embeddings
Store in database

Step 5 – Frontend Installation

cd rag-chat
npm install

Create .env.local:

DATABASE_URL=postgres://...
GROQ_API_KEY=your_api_key

Run the development server:

npm run dev

Access the application at: http://localhost:3000

11. Repository Structure

rag/
├── data/
│   └── human_nutrition_text_book.pdf
├── notebooks/
│   ├── production_level_from_scratch.ipynb
│   └── rag_chunking_strategies.ipynb
├── scripts/
│   └── ingest.py
├── test/
│   └── test_embeddings.py
├── rag-chat/
│   ├── src/
│   │   └── app/
│   │       ├── api/
│   │       │   └── chat/
│   │       │       └── route.ts
│   │       ├── page.tsx
│   │       ├── layout.tsx
│   │       └── globals.css
│   ├── public/
│   ├── package.json
│   └── tsconfig.json
├── .venv/
├── requirements.txt
├── package.json
├── .env
├── .gitignore
├── LICENSE
└── README.md

Key files:

data/human_nutrition_text_book.pdf – Source textbook
notebooks/production_level_from_scratch.ipynb – Main implementation notebook
notebooks/rag_chunking_strategies.ipynb – Chunking experiments
scripts/ingest.py – Document ingestion pipeline
test/test_embeddings.py – Embedding tests
rag-chat/src/app/api/chat/route.ts – Backend API endpoint
rag-chat/src/app/page.tsx – Frontend chat interface
requirements.txt – Python dependencies
LICENSE – MIT License

12. How Query Processing Works

User query
→ Query embedding
→ SQL vector similarity search
→ Top chunks returned
→ LLM generates grounded output
→ Rendered in chat interface

The complete flow is implemented in:

Frontend: rag-chat/src/app/page.tsx
Backend API: rag-chat/src/app/api/chat/route.ts

13. Detailed Walkthrough Notebook

All the engineering and reasoning is documented step‑by‑step in:

notebooks/production_level_from_scratch.ipynb

This is the primary reference for the implementation.

Additional experiments and chunking strategy comparisons:

notebooks/rag_chunking_strategies.ipynb

14. RAG Evaluation

The RAG system was evaluated using the Ragas library. Below are the overall average scores:

Metric	Description	Value	Remarks
Faithfulness	Measures the factual consistency of the generated answer with respect to the provided context.	0.200	Poor factual consistency
Answer Relevancy	Evaluates how relevant the generated answer is to the user's original question.	0.199	Poor relevance
Context Recall	Determines the extent to which all relevant information from the ground truth is retrieved within the context.	0.900	Excellent recall
Context Precision	Assesses the proportion of retrieved context that is actually relevant to the question.	1.000	Excellent precision

15. Future Improvements

Retrieval re‑ranking
Embedding‑quality comparison
Multi‑vector per chunk scoring
Structured citations
Deployment using containerization

16. License

This repository is released under the MIT License. See LICENSE for details.

Author: Kushal Regmi

GitHub: https://github.com/KushalRegmi61

Project Repository: https://github.com/KushalRegmi61/rag

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Nutrition RAG Chatbot: From Scratch (Manual RAG Engineering)

Demo

Table of Contents

1. Introduction

2. Project Objectives

3. System Architecture Overview

4. Dataset and Source Material

5. Ingestion and Extraction

6. Chunking Approaches

7. Embedding and Vector Storage

8. PostgreSQL + pgvector Setup

9. Frontend and Backend

10. Setup Instructions (Local Installation)

Step 1 – Clone

Step 2 – Python Environment

Step 3 – PostgreSQL + pgvector

Step 4 – Run Notebook

Step 5 – Frontend Installation

11. Repository Structure

12. How Query Processing Works

13. Detailed Walkthrough Notebook

14. RAG Evaluation

15. Future Improvements

16. License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
data		data
notebooks		notebooks
rag-chat		rag-chat
results		results
scripts		scripts
test		test
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Nutrition RAG Chatbot: From Scratch (Manual RAG Engineering)

Demo

Table of Contents

1. Introduction

2. Project Objectives

3. System Architecture Overview

4. Dataset and Source Material

5. Ingestion and Extraction

6. Chunking Approaches

7. Embedding and Vector Storage

8. PostgreSQL + pgvector Setup

9. Frontend and Backend

10. Setup Instructions (Local Installation)

Step 1 – Clone

Step 2 – Python Environment

Step 3 – PostgreSQL + pgvector

Step 4 – Run Notebook

Step 5 – Frontend Installation

11. Repository Structure

12. How Query Processing Works

13. Detailed Walkthrough Notebook

14. RAG Evaluation

15. Future Improvements

16. License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages