
Tech Stack — Restaurant Recommendation v2

Table of Contents

  1. Overview
  2. FalkorDB
  3. CocoIndex
  4. PyTorch Geometric (PyG)
  5. GPT-4o
  6. LangChain
  7. PostgreSQL
  8. Supporting Technologies

Overview

| Layer | Component | Version | Role |
|---|---|---|---|
| Graph DB | FalkorDB | 4.x | CRKG storage, Cypher queries |
| Data Pipeline | CocoIndex | latest | CDC sync, indexing, retrieval |
| GNN Training | PyTorch + PyG | 2.x / 2.x | GCN embeddings, multi-task loss |
| LLM | OpenAI GPT-4o | gpt-4o | Grounded response generation |
| Orchestration | LangChain | 0.2.x | Chain design, memory management |
| Raw Data | PostgreSQL | 16 | Relational source of truth |
| API Gateway | FastAPI | 0.111.x | Async REST + SSE + WebSocket |
| Cache | Redis | 7.x | Short-TTL query result cache |

FalkorDB

Why FalkorDB Instead of Neo4j?

| Criterion | FalkorDB | Neo4j |
|---|---|---|
| Storage | Fully in-memory (RAM) | Disk-based with page cache |
| Query language | Cypher (openCypher-compatible) | Cypher |
| Latency (typical graph traversal) | Sub-millisecond | 10–200 ms |
| Open source | Yes (BSD-3) | Community edition only |
| Redis ecosystem | Native (evolved from RedisGraph) | No |
| Horizontal scaling | Redis Cluster | Causal Cluster (paid) |
| Embedding support | Via property arrays | Via additional plugins |

Why FalkorDB Fits This Project

  • Real-time Graph-RAG requires sub-ms Cypher queries so that the end-to-end chatbot latency stays below 2 s.
  • CRKG fits comfortably in RAM: ~10 M nodes × 200 bytes ≈ 2 GB, easily hosted on a modern server.
  • Cypher compatibility means the team reuses skills and query patterns from the Neo4j/graph ecosystem.
  • Snapshot persistence (RDB/AOF) provides durability without sacrificing in-memory speed.
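The "fits comfortably in RAM" claim above is easy to sanity-check. A quick back-of-envelope calculation, using the node count and per-node footprint from the bullet above (both estimates, not measurements):

```python
# Back-of-envelope check that the CRKG fits in RAM. The node count and
# per-node footprint come from the sizing estimate above; both are
# assumptions, not measured values.
NODES = 10_000_000          # ~10 M nodes in the CRKG
BYTES_PER_NODE = 200        # rough per-node footprint incl. properties

total_gb = NODES * BYTES_PER_NODE / 1024**3
print(f"Estimated CRKG size: {total_gb:.2f} GB")  # ≈ 1.86 GB, i.e. ~2 GB
```

Even with a generous safety factor for edges, indices, and Redis overhead, the graph stays well within a single commodity server's memory.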

Alternatives Considered

| Alternative | Reason Rejected |
|---|---|
| Neo4j | Disk-based, higher latency, expensive enterprise licence |
| Amazon Neptune | Managed but higher latency, no in-memory mode, costly |
| TigerGraph | Proprietary, complex deployment |
| Memgraph | Good alternative; FalkorDB chosen for Redis ecosystem maturity |

CocoIndex

Dual Role

  1. Data Pipeline — listens to PostgreSQL CDC events, orchestrates LLM-based extraction, generates triplets, writes to FalkorDB.
  2. Retrieval Backbone — provides semantic query understanding, Cypher query generation, and subgraph packaging for the Graph-RAG flow.
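To make role 1 concrete, here is a minimal sketch of the row-to-triplet step, assuming hypothetical field names and a simple `(subject, predicate, object)` schema. In the real pipeline CocoIndex drives an LLM-based extraction step; this hand-written mapping only illustrates the shape of the output:

```python
# Illustrative sketch of the pipeline's triplet-generation step. Field
# names and the triplet schema are hypothetical; the actual pipeline uses
# LLM-based extraction rather than this hand-written mapping.
def row_to_triplets(table: str, row: dict) -> list[tuple[str, str, str]]:
    if table == "reviews":
        return [
            (f"user:{row['user_id']}", "WROTE", f"review:{row['id']}"),
            (f"review:{row['id']}", "ABOUT", f"restaurant:{row['restaurant_id']}"),
        ]
    if table == "recipes":
        return [
            (f"recipe:{row['id']}", "CONTAINS", f"ingredient:{ing}")
            for ing in row.get("ingredients", [])
        ]
    return []  # tables with no graph mapping are skipped

triplets = row_to_triplets("reviews", {"id": 7, "user_id": 3, "restaurant_id": 12})
print(triplets)
```

Each emitted triplet then becomes a `MERGE` against FalkorDB, keeping the graph write idempotent under CDC replays.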

Why CocoIndex Instead of FAISS / Pinecone / Weaviate?

| Criterion | CocoIndex | FAISS | Pinecone | Weaviate |
|---|---|---|---|---|
| Graph-native retrieval | ✅ Yes | ❌ Vector only | ❌ Vector only | Partial (via modules) |
| Built-in CDC / sync | ✅ Yes | ❌ Manual | ❌ Manual | Partial |
| Incremental indexing | ✅ Automatic | ❌ Manual rebuild | ✅ Yes | ✅ Yes |
| Cypher query generation | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Subgraph serialisation | ✅ Yes | ❌ No | ❌ No | ❌ No |
| Self-hosted | ✅ Yes | ✅ Yes | ❌ SaaS | ✅ Yes |

CocoIndex is the only component that bridges all three: change detection → indexing → graph-aware retrieval.

Alternatives Considered

| Alternative | Reason Rejected |
|---|---|
| FAISS + custom ETL | No graph awareness; heavy custom code |
| Pinecone | SaaS only, no Cypher integration |
| Weaviate | Good vector search but weak graph traversal |
| LlamaIndex + custom | More boilerplate; CocoIndex is graph-native |

PyTorch Geometric (PyG)

Why PyG for GNN?

| Criterion | PyG | DGL | Spektral |
|---|---|---|---|
| Heterogeneous graph support | ✅ HeteroData | ✅ Heterograph API | ❌ Limited |
| GCN / GAT / GraphSAGE | ✅ All built-in | ✅ All built-in | ✅ All built-in |
| Mini-batch training | ✅ NeighborLoader | ✅ Neighbor sampling | ❌ Limited |
| PyTorch ecosystem | ✅ Native | Partial | ❌ TF/Keras |
| Research community | Very active | Active | Less active |
| Production deployments | Many | Many | Few |

GNN Models Supported

  • GCN (Graph Convolutional Network) — primary model for embedding generation.
  • GAT (Graph Attention Network) — optional upgrade for weighted neighbourhood aggregation.
  • GraphSAGE — inductive learning for cold-start new nodes.
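The propagation rule behind the primary GCN model can be shown in a few lines of NumPy. This is an illustrative sketch of what PyG's `GCNConv` computes (H' = ReLU(D^-1/2 (A + I) D^-1/2 H W)), not the PyG API itself:

```python
import numpy as np

# Minimal NumPy sketch of one GCN layer (Kipf & Welling propagation rule).
# PyG's GCNConv computes the same transform with sparse ops and learned W.
def gcn_layer(A: np.ndarray, H: np.ndarray, W: np.ndarray) -> np.ndarray:
    A_hat = A + np.eye(A.shape[0])       # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(d ** -0.5)      # D^-1/2
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU

# Toy graph: 3 nodes in a path, 4-dim input features, 2-dim embeddings.
rng = np.random.default_rng(0)
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H = rng.normal(size=(3, 4))
W = rng.normal(size=(4, 2))
print(gcn_layer(A, H, W).shape)  # (3, 2)
```

Stacking two or three such layers (with learned weights) yields the node embeddings used for Top-K recommendation.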

Alternatives Considered

| Alternative | Reason Rejected |
|---|---|
| DGL | PyTorch-native, but team prefers PyG's API style |
| Spektral | TensorFlow-based; team uses PyTorch stack |
| StellarGraph | Less active development |

GPT-4o

Why GPT-4o?

| Criterion | GPT-4o | GPT-3.5-turbo | Claude 3 Opus | Gemini 1.5 Pro |
|---|---|---|---|---|
| Instruction following | Excellent | Good | Excellent | Good |
| Structured output (JSON) | ✅ Reliable | Inconsistent | ✅ Reliable | ✅ Reliable |
| Context window | 128K tokens | 16K tokens | 200K tokens | 1M tokens |
| Latency | ~1–2 s | ~0.5–1 s | ~2–3 s | ~1–3 s |
| Cost (input / 1M tokens) | $5 | $0.50 | $15 | $3.50 |
| OpenAI ecosystem | ✅ Native | ✅ Native | ❌ No | ❌ No |

Cost Analysis

Assuming 10 000 chatbot sessions/day, average 500 input tokens (system prompt + graph context) + 200 output tokens per turn:

  • Input cost: 10 000 × 500 / 1 000 000 × $5 = $25/day
  • Output cost: 10 000 × 200 / 1 000 000 × $15 = $30/day
  • Total ≈ $55/day ≈ $1 650/month — modest at early scale, but worth monitoring as traffic grows.
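The same arithmetic as a checkable snippet (prices are GPT-4o list prices at time of writing: $5 / 1M input tokens, $15 / 1M output tokens):

```python
# Daily cost estimate for GPT-4o at the assumed traffic level.
SESSIONS = 10_000
IN_TOK, OUT_TOK = 500, 200        # tokens per turn (assumed averages)
IN_PRICE, OUT_PRICE = 5.0, 15.0   # $ per 1M tokens (GPT-4o list prices)

input_cost = SESSIONS * IN_TOK / 1_000_000 * IN_PRICE     # $25.0/day
output_cost = SESSIONS * OUT_TOK / 1_000_000 * OUT_PRICE  # $30.0/day
print(input_cost, output_cost, input_cost + output_cost)  # 25.0 30.0 55.0
```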

Fallback Strategy

  1. Primary: GPT-4o via OpenAI API.
  2. Fallback (rate limit / outage): GPT-3.5-turbo (lower quality, still grounded).
  3. Offline fallback: Return pre-computed Top-K GNN results with static template text, no LLM call.
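The three tiers above reduce to a simple try-in-order loop. A minimal sketch, with stub functions standing in for the real OpenAI calls and the pre-computed GNN results (all names here are hypothetical):

```python
# Tiered fallback sketch. The two LLM stubs simulate an outage so the
# request falls through to the static GNN-backed tier.
def call_gpt4o(prompt: str) -> str:
    raise RuntimeError("rate limited")       # simulate an API outage

def call_gpt35(prompt: str) -> str:
    raise RuntimeError("rate limited")       # simulate an API outage

def static_top_k(prompt: str) -> str:
    return "Top picks near you: ..."         # pre-computed Top-K GNN results

def answer(prompt: str) -> str:
    for backend in (call_gpt4o, call_gpt35, static_top_k):
        try:
            return backend(prompt)
        except RuntimeError:
            continue                         # try the next tier
    return "Service unavailable."

print(answer("vegan ramen nearby?"))  # served by the static tier
```

In production each tier would also carry its own timeout and retry budget so a slow primary cannot eat the whole 2 s latency target.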

LangChain

Role in the Architecture

LangChain serves as the orchestration layer connecting CocoIndex retrieval, conversation memory, and GPT-4o generation.

| Component | Purpose |
|---|---|
| ConversationChain | Maintains multi-turn dialogue context |
| RetrievalChain | Wires CocoIndex retriever into the LLM chain |
| ConversationBufferWindowMemory | Keeps last N turns in context window |
| Tool definitions | Expose CocoIndex + FalkorDB as callable tools |
| Streaming callbacks | Enable SSE streaming of GPT-4o output |

Chain Design

User Query
    └─▶ Intent Router
            ├─▶ [graph_query tool] → CocoIndex → FalkorDB subgraph
            └─▶ [history tool]    → ConversationMemory
                        └─▶ Prompt Assembly → GPT-4o → Response
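The intent-router step above can be sketched in plain Python. The real system would make this decision with LangChain tool-calling; keyword routing here is only to make the control flow concrete (all names hypothetical):

```python
# Plain-Python sketch of the intent router in the diagram above.
def route(query: str) -> str:
    graph_keywords = ("restaurant", "recipe", "ingredient", "near", "dish")
    if any(k in query.lower() for k in graph_keywords):
        return "graph_query"   # → CocoIndex → FalkorDB subgraph
    return "history"           # → ConversationMemory

print(route("Any vegan restaurants near me?"))  # graph_query
print(route("What did I ask you earlier?"))     # history
```

Whatever the router chooses, both branches converge on prompt assembly, so the GPT-4o call always sees either a grounded subgraph or the relevant conversation history.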

Alternatives Considered

Alternative Reason Rejected
LlamaIndex Good RAG but weaker agent/tool orchestration
Semantic Kernel .NET-first; team uses Python
Custom orchestration High maintenance overhead

PostgreSQL

Role

PostgreSQL is the raw data source of truth. It stores operational data before it is transformed into knowledge graph form.

| Table | Content |
|---|---|
| restaurants | Restaurant metadata |
| recipes | Recipe details, ingredients list |
| users | User accounts and health profiles |
| orders | Transactional order history |
| reviews | User ratings and text reviews |
| ingredients | Ingredient master data |

CDC (Change Data Capture) for CocoIndex

Tool: Debezium or PostgreSQL logical replication.

  1. PostgreSQL writes a change to the WAL (Write-Ahead Log).
  2. Debezium captures the change event and publishes to a message queue (Kafka/Redis Streams).
  3. CocoIndex consumes the event, triggers the appropriate pipeline step (LLM extraction → triplet → FalkorDB write).

This ensures FalkorDB is eventually consistent with the operational database within minutes of any change.


Supporting Technologies

| Technology | Purpose |
|---|---|
| Redis 7 | Short-TTL Cypher result cache (TTL 1 h) |
| FastAPI 0.111 | Async API gateway, SSE streaming, WebSocket |
| Docker / Kubernetes | Container orchestration |
| Prometheus + Grafana | Metrics and dashboards |
| ELK Stack | Structured log aggregation and search |
| Debezium | PostgreSQL CDC connector |
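The Redis cache entry above is a standard cache-aside pattern: look up the Cypher query result, recompute on miss, store with a 1 h expiry. In production this would be redis-py with `r.set(key, value, ex=3600)`; the in-process dict version below just shows the logic:

```python
import time

TTL_SECONDS = 3600                                  # 1 h, matching the table above
_cache: dict[str, tuple[float, object]] = {}        # key → (stored_at, result)

def cached_query(cypher: str, run_query) -> object:
    now = time.monotonic()
    hit = _cache.get(cypher)
    if hit and now - hit[0] < TTL_SECONDS:
        return hit[1]                               # fresh cache hit
    result = run_query(cypher)                      # miss or stale: hit FalkorDB
    _cache[cypher] = (now, result)
    return result

# Demo: the second identical query never reaches the backend.
calls = []
run = lambda q: calls.append(q) or f"rows for {q}"
cached_query("MATCH (r:Restaurant) RETURN r", run)
cached_query("MATCH (r:Restaurant) RETURN r", run)
print(len(calls))  # 1
```

A short TTL keeps cached subgraphs from drifting too far behind the CDC-driven updates flowing into FalkorDB.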