
🔬 Nexus Research

Parallel Multi-Agent AI Research System

One query. Four AI agents running in parallel. Real-time web search, structured debate analysis, historical timelines, interactive knowledge graphs, and fact verification — all delivered via live WebSocket streaming.


GitHub Profile Repository YouTube Demo

Python FastAPI Groq Tavily ChromaDB Docker vis-network ReportLab MIT


🔗 Quick Access

Overview • Quick Start • Demo Video • Environment Variables • Troubleshooting • API • Roadmap

Why This Matters: Multi-agent orchestration, semantic memory persistence, real-time streaming, and end-to-end product thinking — all in one portfolio project.

🚀 Overview

Nexus Research is a multi-agent AI research platform that analyzes any topic through four distinct analytical lenses simultaneously. Instead of a single LLM response, you get a comprehensive, multi-dimensional research report — complete with interactive visualizations, source-grounded facts, and exportable reports in PDF, Markdown, and HTML formats.

What Makes It Different

| Traditional Research Tools | Nexus Research |
| --- | --- |
| Single LLM response | 4 parallel AI agents, each with a distinct analytical lens |
| No source grounding | Real-time Tavily web search feeds every agent with live data |
| Stateless conversations | ChromaDB vector memory persists & semantically retrieves past sessions |
| Text-only output | Interactive knowledge graph (vis-network) + PDF / Markdown / HTML export |
| Sequential processing | Async parallel execution — all 4 agents run simultaneously via asyncio.gather |
| No progress feedback | WebSocket streaming — real-time stage updates as each agent completes |

🧠 The Four Research Dimensions

Every query is analyzed through four specialized AI agents running in parallel:

```
                         ┌──────────────────┐
                         │    USER QUERY    │
                         └────────┬─────────┘
                                  │
                     ┌────────────┴────────────┐
                     │   Tavily Web Search     │
                     │   5 results · basic/deep│
                     └────────────┬────────────┘
                                  │
              ┌───────────┬───────┴───────┬───────────┐
              ▼           ▼               ▼           ▼
        ┌──────────┐ ┌──────────┐ ┌────────────┐ ┌──────────┐
        │  DEBATE  │ │ TIMELINE │ │ KNOWLEDGE  │ │   FACT   │
        │  AGENT   │ │  AGENT   │ │   GRAPH    │ │ VERIFIER │
        └────┬─────┘ └────┬─────┘ └─────┬──────┘ └────┬─────┘
             │            │             │             │
             └────────────┴──────┬──────┴─────────────┘
                                 ▼
                    ┌────────────────────────┐
                    │  Unified JSON Report   │
                    │  ChromaDB · PDF Export │
                    └────────────────────────┘
```
| Dimension | Agent | Output |
| --- | --- | --- |
| ⚖️ Debate Analysis | DebateAgent | Mainstream view, devil's advocate contrarian view, synthesis & verdict |
| 📅 Historical Timeline | TimelineAgent | 8–12 chronological events with type badges, era summary, future outlook |
| 🕸️ Knowledge Graph | MindmapAgent | 10–14 typed nodes + 12–18 weighted edges → interactive vis-network map |
| ✅ Fact Verification | VerifyAgent | Per-claim status (verified / disputed / misleading), confidence score, key uncertainties |
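The fan-out above can be sketched in a few lines of asyncio. The `_safe_run` wrapper is named in this README; the agent bodies below are placeholders (the real agents call Groq), so treat this as a minimal, self-contained sketch rather than the repo's actual code:

```python
import asyncio

async def _safe_run(name, coro):
    """Run one agent; a failure yields an error payload instead of crashing the report."""
    try:
        return {"agent": name, "ok": True, "data": await coro}
    except Exception as exc:
        return {"agent": name, "ok": False, "error": str(exc)}

# Placeholder agents standing in for the Groq-backed ones:
async def debate_agent(query):   return f"debate({query})"
async def timeline_agent(query): return f"timeline({query})"
async def mindmap_agent(query):  return f"mindmap({query})"
async def verify_agent(query):   raise RuntimeError("simulated agent failure")

async def research(query):
    # All four agents start at once; total latency is roughly the slowest agent.
    return await asyncio.gather(
        _safe_run("debate",   debate_agent(query)),
        _safe_run("timeline", timeline_agent(query)),
        _safe_run("mindmap",  mindmap_agent(query)),
        _safe_run("verify",   verify_agent(query)),
    )

results = asyncio.run(research("AGI and jobs"))
print([r["ok"] for r in results])  # [True, True, True, False]
```

Note how the simulated failure in `verify_agent` still produces a structured result, so the other three dimensions reach the report intact.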

🔄 Research Flowchart

```mermaid
flowchart TD
  A[User Query] --> B[FastAPI /api/research]
  B --> C[Tavily Search]
  B --> D[ChromaDB Related Context]
  C --> E[DebateAgent]
  C --> F[TimelineAgent]
  C --> G[MindmapAgent]
  C --> H[VerifyAgent]
  D --> E
  D --> G
  E --> I[Unified Result]
  F --> I
  G --> I
  H --> I
  I --> J[ChromaDB Save]
  I --> K[Frontend UI]
  I --> L[PDF / Markdown / HTML Exports]
```

🧭 Search Lifecycle

```mermaid
sequenceDiagram
  participant U as User
  participant F as Frontend
  participant A as FastAPI
  participant S as Tavily
  participant M as ChromaDB
  participant G as Groq Agents

  U->>F: Enter query
  F->>A: POST /api/research
  A->>S: Fetch live sources
  A->>M: Load related memory
  par Parallel execution
    A->>G: Debate analysis
    A->>G: Timeline extraction
    A->>G: Graph construction
    A->>G: Fact verification
  end
  G-->>A: Structured outputs
  A->>M: Persist session
  A-->>F: Unified response JSON
  F-->>U: Debate + Timeline + Graph + Verification
```

πŸ› οΈ Technology Stack

Core LLM & Search

  • Groq LLaMA 3.3 70B β€” Ultra-fast token generation (300+ tokens/sec) powering all 4 agents
  • Tavily Search API β€” Real-time web retrieval with configurable depth (basic or advanced)

Backend Architecture

  • FastAPI + Uvicorn — Async REST API with WebSocket streaming and asyncio.gather orchestration
  • asyncio — True parallel agent execution with stage-based progress updates
  • Rate Limiting — Per-IP throttle middleware to prevent abuse

Data & Memory

  • ChromaDB — Local persistent vector database for semantic search across research history
  • JSON Serialization — Efficient result storage with metadata indexing

Frontend & Visualization

  • Vanilla JavaScript — Zero-build SPA with particle background, glassmorphism UI, dark/light theme toggle
  • vis-network — Interactive, physics-based knowledge graph with zoom, fit, and fullscreen controls
  • ReportLab — Professional PDF generation with styled sections
  • Markdown & HTML — Multi-format export for portability

Deployment Stack

  • Docker + Docker Compose — One-command containerized setup
  • Nginx Alpine — Reverse proxy with same-origin API/WebSocket forwarding
  • Environment-Based Config — CORS allowlist, rate limit, and model selection via .env

⚡ Key Competitive Advantages

| Feature | Impact | Differentiator |
| --- | --- | --- |
| True Parallel Execution | 4x faster research than sequential agents | All agents via asyncio.gather, not fake concurrency |
| Live Stage Streaming | User sees progress in real-time | WebSocket sends stage completion as it happens |
| Semantic Memory | Retrieves contextually similar past sessions | ChromaDB vector search, not keyword matching |
| Multi-Format Export | PDF, Markdown, HTML from one result | No need for user to convert or refactor |
| Fault Tolerance | One agent failure doesn't crash report | Each agent wrapped in _safe_run() error handler |
| Zero-Build Frontend | Single HTML file, opens instantly | No npm, webpack, or build step required |
| Production Hardening | Environment-scoped CORS, configurable rate limits | Ready for real world, not just demos |
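The stage updates streamed to the browser can be modeled as small JSON frames. The field names below are illustrative assumptions — the repo's actual WebSocket payload may differ:

```python
import json

def stage_update(stage, done, total):
    """Build one progress frame to push over the research WebSocket."""
    return json.dumps({
        "type": "stage",
        "stage": stage,            # e.g. "debate", "timeline", "mindmap", "verify"
        "completed": done,         # agents finished so far
        "total": total,
        "progress": round(done / total, 2),
    })

# As each agent finishes, the server would push a frame like:
print(stage_update("debate", 1, 4))
```

Because each frame is self-describing, the frontend can update its progress UI statelessly as frames arrive, in any order the agents happen to finish.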

🚀 Performance Metrics

  • Research Latency: ~2–4 seconds (all 4 agents in parallel)
  • Token Generation: 300+ tokens/sec via Groq LLaMA 3.3 70B
  • API Response Time: <100ms for history/export endpoints
  • Memory Footprint: ~150MB base (ChromaDB + dependencies)
  • Concurrent Users: Supports 15 queries/min per IP (configurable)

📊 Learning Outcomes (Why Build This?)

  • Systems Design: Async orchestration of multiple LLM agents
  • Real-time UX: WebSocket streaming + reactive UI updates
  • Semantic Search: Vector databases for contextual retrieval
  • Full-Stack: Backend API, database, frontend, exports, Docker
  • Production Practices: Rate limiting, error handling, logging, config management
  • Multi-dimensional Analysis: Structuring complex outputs (debate, timeline, graph, verification)

πŸ—οΈ Architecture

frontend/index.html             ← Particle BG Β· Glassmorphism Β· vis-network Β· Dark/Light Theme (zero build)
        β”‚
        β”‚  REST API + WebSocket (CORS enabled)
        β–Ό
backend/main.py                 ← FastAPI app Β· asyncio.gather Β· REST + WS endpoints Β· Rate Limiting Β· Logging
β”œβ”€β”€ search.py                   ← Tavily web search (5 results max, client reuse)
β”œβ”€β”€ memory.py                   ← ChromaDB PersistentClient (vector store)
β”œβ”€β”€ pdf_export.py               ← ReportLab PDF generation
└── agents/
    β”œβ”€β”€ debate.py               ← Mainstream vs contrarian + synthesis
    β”œβ”€β”€ timeline.py             ← Chronological events + era summary
    β”œβ”€β”€ mindmap.py              ← Knowledge graph nodes / edges / types
    └── verify.py               ← Per-claim fact verification + trust score

⚡ Quick Start

Option A: Local Setup

1. Clone & Install

```bash
git clone https://github.com/Yashaswini-V21/Nexus-Research.git
cd Nexus-Research
py -m pip install -r requirements.txt
```

2. Configure API Keys

Create a .env file in the project root:

```env
GROQ_API_KEY=gsk_your_groq_api_key_here
TAVILY_API_KEY=tvly-your_tavily_api_key_here
```
Free tiers available: Groq Console (free) · Tavily Dashboard (1,000 free searches/month)

3. Start the Backend

```bash
py -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
```

3.1 Run Tests (Optional but Recommended)

```bash
py -m pip install -r requirements-dev.txt
py -m pytest -q
```

4. Open the Frontend

Open frontend/index.html directly in your browser — no build step, no Node.js required.

Option B: Docker (One Command)

```bash
# Set your API keys in .env first, then:
docker compose up --build
```

When running with Docker Compose, Nginx serves the frontend and reverse-proxies:

  • /api/* → nexus-api:8000
  • /ws/* → nexus-api:8000

This gives the frontend same-origin API calls and smoother browser behavior.

CORS Configuration

Set CORS_ORIGINS as a comma-separated list in .env for non-Docker deployments:

```env
CORS_ORIGINS=http://localhost:3000,http://127.0.0.1:3000
```

🎥 Demo Video

Use this section to share your walkthrough once uploaded:

Watch the Nexus Research Demo

⚡ Try In 60 Seconds

  1. Start the backend: py -m uvicorn backend.main:app --reload --host 0.0.0.0 --port 8000
  2. Open frontend/index.html
  3. Run this sample query in the UI:
     Will AGI create more jobs than it replaces by 2035?

Expected result:

  • Debate output with mainstream vs contrarian arguments
  • Timeline with key milestones
  • Interactive knowledge graph
  • Claim verification with confidence and uncertainties

πŸ” Environment Variables

Variable Required Default Description
GROQ_API_KEY Yes None API key for Groq LLaMA inference
TAVILY_API_KEY Yes None API key for Tavily search retrieval
CORS_ORIGINS No http://localhost:3000,http://127.0.0.1:3000 Comma-separated frontend origins
RATE_LIMIT_RPM No 15 Requests per minute per IP
MODEL_NAME No Project default Override Groq model selection
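Reading these variables with safe defaults keeps startup predictable. The helper below mirrors the table above as a sketch; the function name is an assumption (the repo may read env vars inline):

```python
import os

def load_settings():
    """Hypothetical config loader; defaults match the documented table above."""
    return {
        "groq_api_key": os.environ["GROQ_API_KEY"],      # required: fail fast if missing
        "tavily_api_key": os.environ["TAVILY_API_KEY"],  # required
        "cors_origins": os.getenv(
            "CORS_ORIGINS", "http://localhost:3000,http://127.0.0.1:3000"
        ).split(","),
        "rate_limit_rpm": int(os.getenv("RATE_LIMIT_RPM", "15")),
        "model_name": os.getenv("MODEL_NAME"),           # None means project default
    }
```

Using `os.environ[...]` for the two required keys makes a misconfigured deployment fail at startup with a clear KeyError, rather than failing later on the first API call.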

If you add new env variables in code later, extend this table to keep deployment docs production-ready.

🧰 Troubleshooting

| Issue | Likely Cause | Fix |
| --- | --- | --- |
| API returns auth errors | Missing/invalid GROQ_API_KEY or TAVILY_API_KEY | Recheck .env keys and restart the server |
| Frontend cannot call API | CORS origin mismatch | Add the frontend URL to CORS_ORIGINS |
| Docker app not loading on :3000 | Containers not healthy or still building | Run docker compose ps and check the logs |
| Empty/weak results | Search depth too shallow or query too vague | Use a deeper search depth and a more specific prompt |
| WebSocket updates not appearing | Reverse proxy path or WS route mismatch | Ensure /ws/* is proxied to the backend in Nginx |

❓ FAQ

Q: Can I swap the LLM model?
Yes. Configure your model setting (for example through MODEL_NAME) and restart the backend.

Q: Is research history stored locally?
Yes. Sessions are persisted in local ChromaDB storage.

Q: Can I deploy this without Docker?
Yes. Run FastAPI directly and open frontend/index.html in the browser.

Q: Is this production-ready?
It includes core production practices (rate limits, logging, CORS, health checks) and can be extended with auth and observability.

🖥️ Frontend Experience

  • Landing page introduces the four-dimension research model
  • Workspace separates output into debate, timeline, graph, and verification tabs
  • Knowledge graph includes zoom, fit, fullscreen, screenshot, physics toggle, legend, and node detail panel
  • Verification cards show confidence bars and uncertainty summaries
  • Sources are clickable and reports can be exported as PDF, Markdown, and HTML
  • Theme toggle persists with local storage

📡 API Reference

| Method | Endpoint | Description |
| --- | --- | --- |
| POST | /api/research | Run full 4D research on a query (rate-limited) |
| GET | /api/health | Health and runtime configuration status |
| GET | /api/history | List all past research sessions |
| GET | /api/history/{id} | Retrieve a specific session by ID |
| DELETE | /api/history/{id} | Delete a session from ChromaDB |
| POST | /api/export/pdf/{id} | Download a session as a formatted PDF |
| GET | /api/export/markdown/{id} | Download a session as Markdown |
| GET | /api/export/html/{id} | Download a session as styled HTML |
| WS | /ws/research | WebSocket — real-time streaming with stage updates |
Example Request & Response

Request:

```json
{
  "query": "Impact of AGI on the global economy",
  "depth": "deep"
}
```

Response:

```json
{
  "id": "uuid",
  "query": "...",
  "timestamp": "ISO 8601",
  "search_summary": [{ "title": "...", "url": "...", "content": "..." }],
  "debate": {
    "mainstream_view": {},
    "contrarian_view": {},
    "synthesis": "...",
    "verdict": "..."
  },
  "timeline": {
    "events": [],
    "era_summary": "...",
    "future_outlook": "..."
  },
  "mindmap": {
    "nodes": [],
    "edges": [],
    "central_insight": "..."
  },
  "verify": {
    "claims": [],
    "overall_confidence": 0.0,
    "key_uncertainties": []
  }
}
```

πŸ“ Project Structure

Nexus-Research/
β”œβ”€β”€ requirements.txt              # Python dependencies
β”œβ”€β”€ README.md
β”œβ”€β”€ Dockerfile                    # Container image for the API
β”œβ”€β”€ docker-compose.yml            # One-command deployment (API + Nginx)
β”œβ”€β”€ .dockerignore
β”œβ”€β”€ .env                          # API keys (GROQ + TAVILY)
β”œβ”€β”€ backend/
β”‚   β”œβ”€β”€ main.py                   # FastAPI app β€” REST + WebSocket + rate limiting + logging
β”‚   β”œβ”€β”€ search.py                 # Tavily search wrapper (client reuse)
β”‚   β”œβ”€β”€ memory.py                 # ChromaDB vector memory
β”‚   β”œβ”€β”€ pdf_export.py             # ReportLab PDF exporter
β”‚   └── agents/
β”‚       β”œβ”€β”€ __init__.py           # Clean agent imports
β”‚       β”œβ”€β”€ debate.py             # DebateAgent (Groq)
β”‚       β”œβ”€β”€ mindmap.py            # MindmapAgent (Groq)
β”‚       β”œβ”€β”€ timeline.py           # TimelineAgent (Groq)
β”‚       └── verify.py             # VerifyAgent (Groq)
β”œβ”€β”€ frontend/
β”‚   └── index.html                # Full SPA (dark/light theme, WebSocket, vis-network)
└── chroma_db/                    # Auto-created on first run

🎯 Design Philosophy

| Decision | Rationale |
| --- | --- |
| Parallel agents via asyncio.gather | 4x faster than sequential — all agents run simultaneously |
| WebSocket streaming | Real-time progress — users see each agent complete live |
| Fault-tolerant agents | Each agent wrapped in _safe_run() — one failure won't crash the whole report |
| Rate limiting | Per-IP throttle (configurable via RATE_LIMIT_RPM env var) protects the API |
| Structured logging | Python logging module across all files — production-ready observability |
| ChromaDB for memory | Semantic similarity search across past research; fully local, zero cloud dependency |
| Zero-build frontend | Single HTML file — no npm, no webpack, no React. Opens instantly in any browser |
| Dark / Light theme | Persistent theme toggle with localStorage — respects user preference |
| Multi-format export | PDF (styled), Markdown (portable), HTML (self-contained) — one click each |
| Groq inference | LLaMA 3.3 70B at 300+ tokens/sec — near-instant agent responses |
| Docker Compose | One-command deployment — API + Nginx frontend, persistent ChromaDB volume |

πŸ—ΊοΈ Roadmap

Phase 1 β€” MVP Complete βœ…

  • WebSocket progress streaming
  • Docker Compose support
  • Dark/light theme toggle
  • Markdown and HTML export
  • Rate limiting and logging
  • Fault-tolerant per-agent execution
  • Graph controls and node inspection
  • Test suite with CI/CD ready
  • Production hardening (CORS, health checks, timezone-aware timestamps)

Phase 2 — Future Enhancements (Optional)

  • Multi-model comparison (GPT-4, Claude, Mistral)
  • Shared collaborative sessions
  • Scheduled recurring research
  • Authentication and multi-user support
  • Benchmark dashboard (latency, token cost, confidence trends)

Built with curiosity, rigor, and a builder's mindset.

If this project helped you, consider starring the repository and connecting on GitHub.

GitHub: @Yashaswini-V21 • Project Repository

🤝 Contributing

Contributions, issues, and feature requests are welcome! Feel free to open an issue or submit a pull request.

🎓 Summary

Nexus Research demonstrates full-stack AI engineering: multi-agent LLM orchestration, async task scheduling, semantic memory systems, real-time frontend streaming, and production deployment practices. Ideal for roles in AI systems, backend engineering, or full-stack AI product development.

📬 Contact: yashasyashu0987@gmail.com

Built with Groq · Tavily · FastAPI · ChromaDB · vis-network · ReportLab · Docker · Tested with pytest
