# LynxEngine: Neural Data Retrieval System

LynxEngine is a high-performance, AI-powered semantic search engine built on the MERN stack. Unlike traditional keyword-based search, LynxEngine understands user intent using high-dimensional vector embeddings, and features an automated "Hunter" logic that scrapes and indexes Wikipedia in real time when local data is insufficient.


## Tech Stack

| Component | Technology |
| --- | --- |
| Frontend | React.js, React Router, Lucide-React |
| Backend | Node.js, Express.js, Axios, Cheerio |
| Database | MongoDB Atlas (Vector Search Index) |
| AI Models | Ollama (`mxbai-embed-large` & `gemma2:2b`) |
| Styling | Custom CSS (Windows XP "Luna" Design System) |

## System Architecture

LynxEngine operates through a Three-Stage Retrieval Pipeline designed for maximum precision:

### 1. Semantic Normalization

The user query is processed by Gemma 2:2b to resolve entities.

- Example: "lui hamilto" → `Lewis_Hamilton`.
- This ensures the "Hunter" lands on the correct Wikipedia URL immediately.
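As a sketch, the normalization step might look like the following. The prompt wording, the `normalizeQuery` helper, and the Ollama endpoint usage are assumptions for illustration, not code taken from this repo; only the slug logic is guaranteed behavior:

```javascript
// Turn a model-normalized entity name into a Wikipedia URL slug,
// e.g. "Lewis Hamilton" -> "Lewis_Hamilton".
function toWikiSlug(entityName) {
  return entityName.trim().replace(/\s+/g, "_");
}

function wikiUrl(entityName) {
  return `https://en.wikipedia.org/wiki/${encodeURIComponent(toWikiSlug(entityName))}`;
}

// Hypothetical call to a local Ollama instance (assumed prompt and endpoint):
async function normalizeQuery(query) {
  const res = await fetch("http://localhost:11434/api/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "gemma2:2b",
      prompt: `Correct this entity name and reply with only the canonical name: "${query}"`,
      stream: false,
    }),
  });
  const data = await res.json();
  return toWikiSlug(data.response.trim());
}
```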

### 2. Vector Search (Semantic Mapping)

The normalized query is converted into a 1024-dimension vector using `mxbai-embed-large`. The system performs a `$vectorSearch` against MongoDB to find matches based on cosine similarity.
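A minimal sketch of such an aggregation stage is shown below, assuming the `vector_index` name from the prerequisites; the `embedding` field name and the candidate/limit values are assumptions, not confirmed from the repo's schema:

```javascript
// Build a MongoDB Atlas $vectorSearch aggregation pipeline for a query vector.
function buildVectorSearchPipeline(queryVector, limit = 5) {
  return [
    {
      $vectorSearch: {
        index: "vector_index",       // index name from the prerequisites
        path: "embedding",           // assumed field holding the 1024-dim vector
        queryVector,
        numCandidates: limit * 20,   // oversample candidates for better recall
        limit,
      },
    },
    {
      $project: {
        title: 1,
        content: 1,
        score: { $meta: "vectorSearchScore" }, // cosine similarity score
      },
    },
  ];
}

// Usage (with the official MongoDB driver):
// const results = await collection.aggregate(buildVectorSearchPipeline(vec, 5)).toArray();
```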

### 3. The Hunter Protocol (Auto-Ingestion)

If the highest similarity score falls below a dynamic threshold (e.g., 0.67), the engine:

- Triggers a real-time crawl of Wikipedia.
- Parses and cleans the content using Cheerio.
- Generates new embeddings and indexes the data for future users.
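The trigger decision can be sketched as a small pure function; the function name is hypothetical, and the 0.67 value simply mirrors the example threshold above (the repo describes it as dynamic):

```javascript
// Decide whether the Hunter should crawl: trigger when no local results
// exist, or when the best similarity score falls below the threshold.
const HUNTER_THRESHOLD = 0.67; // example value; described as dynamic in practice

function shouldTriggerHunter(results, threshold = HUNTER_THRESHOLD) {
  if (results.length === 0) return true; // nothing indexed yet
  const bestScore = Math.max(...results.map((r) => r.score));
  return bestScore < threshold;
}
```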

## Key Features

- **Vector Stability Check**: Uses a custom scoring algorithm to determine whether search results are high-quality or a fresh crawl is required.
- **Flag Prioritization**: Specialized scraper logic detects and prioritizes national flags and high-resolution SVG media for country-based queries.
- **Resilient Async Flow**: Implements `AbortController` watchdogs to prevent system hangs during local AI generation.
- **Nostalgic UX**: A complete, custom-built UI inspired by the Windows XP "Luna" desktop environment, optimized for low-glare viewing.
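The `AbortController` watchdog pattern mentioned above can be sketched as a generic wrapper; the function name and the timeout value are illustrative assumptions, not the repo's actual implementation:

```javascript
// Wrap any async task in an AbortController watchdog so a hung
// local-model call cannot stall the request indefinitely.
async function withWatchdog(task, timeoutMs = 30000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    // The task receives the signal so it can pass it to fetch/axios.
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // avoid a stray abort after the task settles
  }
}

// Example usage with a fetch-based Ollama call (hypothetical endpoint):
// const res = await withWatchdog(
//   (signal) => fetch("http://localhost:11434/api/generate", { method: "POST", signal }),
//   30000
// );
```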

## Getting Started

### Prerequisites

- Ollama installed and running.
- A MongoDB Atlas account with a Vector Search Index named `vector_index`.
- Node.js & npm.

### Installation

1. Clone the repository:

   ```bash
   git clone https://github.com/swayamvaza/AI-Search-Engine.git
   cd AI-Search-Engine
   ```

2. Pull the AI models:

   ```bash
   ollama pull mxbai-embed-large
   ollama pull gemma2:2b
   ```

3. Launch the backend engine:

   ```bash
   cd backend
   npm install
   node server.js
   ```

4. Launch the frontend engine (in a separate terminal):

   ```bash
   cd frontend
   npm install
   npm start
   ```

### Project Structure

```
├── client/     # React Frontend (XP Design System)
├── server.js   # Node.js Server & Scraper Logic
├── Item.js     # Mongoose Schema for Vector Documents
├── .env        # System Environment Variables
└── README.md   # System Documentation
```


## Author

**Swayam Kumar**

- Project Repository: [AI-Search-Engine](https://github.com/swayamvaza/AI-Search-Engine)
- Tech Stack: MERN + AI (Ollama)

*LynxEngine v0.1-alpha*
