🚧 This project is work in progress. Parts — or even the entire platform — may not function correctly at this stage. Expect bugs, unfinished features, and potential breaking changes. 🛠️🚀
An AI-powered, self-hosted Ask-Me-Anything system for Inovus Labs. Ask questions about Inovus Labs and get accurate, grounded answers based on official Inovus documents and knowledge.
This project implements a custom Retrieval-Augmented Generation (RAG) pipeline using:
✅ Gemini API for both embeddings and answer generation
✅ Pinecone for efficient, scalable vector search
✅ Cloudflare R2 for secure document storage
✅ A modern Nuxt 3 frontend for seamless user interaction
✅ Node.js + Hono API backend for orchestrating the RAG flow
❎ Planned MCP Server integration for real-time knowledge
All answers are generated only from your private document collection, with strict topic control: the system is designed to refuse unrelated questions and to avoid hallucinated information.
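As an illustration of what topic control can look like at the prompt level, here is a minimal sketch in TypeScript. It is not the project's actual prompt: the `Chunk` shape, the `buildGroundedPrompt` name, and the instruction wording are all assumptions made for this example.

```typescript
// Hypothetical shape of a retrieved chunk; the real project may store different fields.
interface Chunk {
  source: string; // assumed: a document name or key identifying where the text came from
  text: string;
}

// Build a prompt that restricts the model to the supplied context.
// The instruction wording below is illustrative, not the project's real prompt.
function buildGroundedPrompt(question: string, chunks: Chunk[]): string {
  // Number each chunk so an answer could refer back to its source.
  const context = chunks
    .map((c, i) => `[${i + 1}] (${c.source}) ${c.text}`)
    .join("\n");

  return [
    "You answer questions about Inovus Labs.",
    "Use only the context below. If the context does not contain the answer,",
    "or the question is unrelated to Inovus Labs, say that you do not know.",
    "",
    "Context:",
    context || "(no relevant documents found)",
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

Keeping the instructions, context, and question in fixed positions makes the prompt easy to inspect when an answer looks off-topic.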
Check out the live demo at Inovus Labs AMA (currently in development, may be unstable).
Layer | Technology |
---|---|
Frontend | Nuxt 3 (Vue 3) + Tailwind CSS |
RAG Backend | Node.js + Hono |
Vector Storage | Pinecone |
Document Storage | Cloudflare R2 |
Embeddings | Gemini API (embedding-001) |
Completion | Gemini API (models/gemini-2.5-flash) |
Deployment | Cloudflare Workers / Pages |
Planned | MCP Server for live data |
✅ Conversation History & Context Awareness
✅ Intelligent Follow-up Question Detection
✅ Smart Conversation Summarization (token optimization)
✅ Dynamic Follow-up Suggestions
✅ Clear Conversation functionality
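One way summarization could reduce token usage is to keep the most recent turns verbatim and fold older ones into a single summary. The sketch below assumes that approach. The `Turn` type, the `compactHistory` function, and the injected `summarize` callback are hypothetical names for this example and do not describe the project's actual code.

```typescript
interface Turn {
  role: "user" | "assistant";
  content: string;
}

// Hypothetical summarizer; in practice this could wrap a call to the completion model.
type Summarize = (turns: Turn[]) => string;

// Keep the last `keepRecent` turns verbatim and fold all older turns into one summary.
function compactHistory(
  history: Turn[],
  keepRecent: number,
  summarize: Summarize,
): { summary: string | null; recent: Turn[] } {
  const keep = Math.max(0, keepRecent); // guard against negative input
  if (history.length <= keep) {
    // Nothing old enough to summarize.
    return { summary: null, recent: history };
  }
  const cut = history.length - keep;
  return {
    summary: summarize(history.slice(0, cut)),
    recent: history.slice(cut),
  };
}
```

The summary and the recent turns can then be placed ahead of the new question, so the prompt grows with `keepRecent` rather than with the full conversation length.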
- User asks a question via the Nuxt frontend
- API server generates a question embedding (Gemini)
- Pinecone returns relevant knowledge chunks
- System composes a grounded prompt
- Gemini API generates a final answer
- Response with source references is displayed in the chat UI
In the future, the system will also pull real-time knowledge from the Inovus MCP Server.
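The request flow above can be sketched as a single orchestration function. This is a rough outline under stated assumptions, not the project's implementation: `RagDeps`, `Match`, `Answer`, and `answerQuestion` are hypothetical names, and the three injected functions stand in for the Gemini and Pinecone clients rather than reproducing their real SDK signatures.

```typescript
interface Match {
  text: string;
  source: string;
  score: number;
}

// Injected dependencies (all hypothetical wrappers around external services).
interface RagDeps {
  embed: (text: string) => Promise<number[]>; // stands in for the Gemini embedding call
  search: (vector: number[], topK: number) => Promise<Match[]>; // stands in for a Pinecone query
  generate: (prompt: string) => Promise<string>; // stands in for the Gemini completion call
}

interface Answer {
  text: string;
  sources: string[];
}

async function answerQuestion(
  question: string,
  deps: RagDeps,
  topK = 5,
): Promise<Answer> {
  // 1. Embed the question.
  const vector = await deps.embed(question);
  // 2. Retrieve the most similar knowledge chunks.
  const matches = await deps.search(vector, topK);
  // 3. Compose a grounded prompt from the retrieved text.
  const context = matches.map((m, i) => `[${i + 1}] ${m.text}`).join("\n");
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
  // 4. Generate the final answer.
  const text = await deps.generate(prompt);
  // 5. Return the answer with de-duplicated source references.
  const sources = matches
    .map((m) => m.source)
    .filter((s, i, all) => all.indexOf(s) === i);
  return { text, sources };
}
```

Passing the service calls in as functions keeps the flow testable with stubs and would leave room for an additional source, such as the planned MCP Server, to be merged into the retrieval step.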
This project is licensed under the MIT License. See the LICENSE file for details.
- Streaming answers to frontend
- Advanced rate limiting with Cloudflare Workers
- MCP Server integration for real-time knowledge
- Source citations and traceability