FormulAI is a modern full-stack application built with a clean separation of concerns, following microservices patterns with queue-based processing for scalability and reliability.
```mermaid
graph TB
    subgraph "Client Layer"
        A[React Frontend<br/>Port 3000]
        A1[Form Builder UI]
        A2[Analytics Dashboard]
        A3[Response View]
    end
    subgraph "API Layer"
        B[NestJS Backend<br/>Port 3001]
        B1[REST Controllers]
        B2[JWT Auth Guard]
        B3[Swagger API Docs]
    end
    subgraph "Service Layer"
        C1[Forms Service]
        C2[Auth Service]
        C3[Response Service]
        C4[Analytics Service]
        C5[Email Service]
        C6[AI Service]
    end
    subgraph "Queue Processing Layer"
        D1[Orchestration Queue]
        D2[Response Processing Queue]
        D3[Topic Clustering Queue]
        D4[Aggregation Queue]
        D5[AI Generation Queue]
    end
    subgraph "Analytics Processors"
        E1[Response Processor<br/>Extract topics & sentiment]
        E2[Topic Clustering<br/>Merge similar topics]
        E3[Aggregation Processor<br/>Calculate statistics]
        E4[Summary Generator<br/>LLM-based summary]
        E5[Findings Generator<br/>Key insights]
        E6[Recommendations Generator<br/>Action items]
    end
    subgraph "Data Layer"
        F[(MongoDB<br/>Port 27017)]
        F1[Forms Collection]
        F2[Responses Collection]
        F3[Users Collection]
        F4[Analytics Cache]
    end
    subgraph "External Services"
        G1[OpenAI API<br/>GPT-4o]
        G2[SMTP Server<br/>Email delivery]
    end
    subgraph "Infrastructure"
        H1[Redis/Bull Queue<br/>Job management]
        H2[SSE Streaming<br/>Real-time updates]
    end
    A --> B
    A1 --> C1
    A2 --> C4
    A3 --> C3
    B1 --> B2
    B1 --> C1
    B1 --> C2
    B1 --> C3
    B1 --> C4
    C1 --> F1
    C3 --> F2
    C2 --> F3
    C4 --> D1
    C5 --> G2
    C6 --> G1
    D1 --> D2
    D1 --> D3
    D1 --> D4
    D1 --> D5
    D2 --> E1
    D3 --> E2
    D4 --> E3
    D5 --> E4
    D5 --> E5
    D5 --> E6
    E1 --> F2
    E2 --> F4
    E3 --> F4
    E4 --> F4
    E5 --> F4
    E6 --> F4
    D1 --> H1
    D2 --> H1
    D3 --> H1
    D4 --> H1
    D5 --> H1
    C4 --> H2
```
```
┌─────────────────────┐
│    Client Layer     │
│  React 19 + Vite    │
│   Tailwind CSS      │
└─────────────────────┘
          │
┌─────────────────────┐
│    API Gateway      │
│  NestJS REST API    │
│ JWT Authentication  │
│    Swagger Docs     │
└─────────────────────┘
          │
  ┌───────┼──────────────────────┐
  │       │                      │
┌───────────────┐ ┌────────────┐ ┌────────────┐
│ Form Service  │ │   Auth     │ │  Response  │
│               │ │  Service   │ │  Service   │
└───────────────┘ └────────────┘ └────────────┘
          │
┌───────────────────────────────────────────────┐
│        Analytics Orchestration Layer          │
│  Queue-based parallel processing with Redis   │
└───────────────────────────────────────────────┘
          │
  ┌───────┴───────────────┐
  │                       │
┌───────────────┐ ┌──────────────┐
│  Processing   │ │ AI Generation│
│    Queues     │ │    Queues    │
│ - Response    │ │ - Summary    │
│ - Clustering  │ │ - Findings   │
│ - Aggregation │ │ - Recommends │
└───────────────┘ └──────────────┘
        │                 │
        └────────┬────────┘
                 │
        ┌───────────────┐
        │  AI Service   │
        │ OpenAI GPT-4o │
        └───────────────┘
                 │
        ┌───────────────┐
        │  Data Layer   │
        │   MongoDB     │
        │ - Forms       │
        │ - Responses   │
        │ - Analytics   │
        └───────────────┘
```
Technology: React 19, TypeScript, Vite, Tailwind CSS
Components:
- Form Builder: Drag-and-drop interface for creating forms
- Analytics Dashboard: 13 interactive cards displaying insights
- Response Management: Real-time tracking and filtering
- Authentication: JWT-based secure login
Key Files:
- `client/src/pages/FormEditor.tsx` - Form builder UI
- `client/src/pages/FormAnalytics.tsx` - Analytics dashboard
- `client/src/components/analytics/` - Analytics visualization components
- `client/src/services/formsService.ts` - API client
Technology: NestJS, TypeScript, Express
Controllers:
- `FormsController` - CRUD operations for forms
- `AuthController` - User authentication
- `ResponseController` - Response submission and retrieval
Guards & Middleware:
- JWT Authentication Guard
- Request validation
- Error handling
- Logging middleware
Key Files:
- `server/src/forms/forms.controller.ts`
- `server/src/auth/jwt-auth.guard.ts`
- `server/src/main.ts` - Application bootstrap
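The guard's core checks can be sketched without NestJS. The real `jwt-auth.guard.ts` delegates to a JWT library; this simplified stand-in only illustrates the two checks every guard performs — HS256 signature verification and expiry:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Simplified HS256 verification. Illustration only: the actual guard uses a
// JWT library rather than hand-rolled crypto.
function verifyToken(token: string, secret: string): Record<string, unknown> | null {
  const [header, payload, signature] = token.split(".");
  if (!header || !payload || !signature) return null;
  const expected = createHmac("sha256", secret)
    .update(`${header}.${payload}`)
    .digest("base64url");
  const a = Buffer.from(signature);
  const b = Buffer.from(expected);
  // Constant-time comparison to avoid timing side channels.
  if (a.length !== b.length || !timingSafeEqual(a, b)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString());
  // Reject expired tokens (exp is a Unix timestamp in seconds).
  if (typeof claims.exp === "number" && claims.exp < Date.now() / 1000) return null;
  return claims;
}
```

A request with a tampered or expired token gets `null` back, which the guard turns into a 401 response.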
Business Logic & Orchestration
Core Services:
- FormsService: Form CRUD, validation, ownership
- AnalyticsService: Analytics retrieval and caching
- ResponseService: Response processing and storage
- AIService: OpenAI API integration
- EmailService: Invitation and notification emails
Key Files:
- `server/src/forms/forms.service.ts`
- `server/src/analytics/analytics.service.ts`
- `server/src/ai/ai.service.ts`
Technology: Bull (Redis-backed job queues)
Queue Architecture:
```
Orchestration Queue (Master)
├── Response Processing Queue (Workers)
├── Topic Clustering Queue (Workers)
├── Aggregation Queue (Workers)
└── AI Generation Queue (Workers)
```
Queue Responsibilities:
- Orchestration Queue:
  - Coordinates the entire analytics pipeline
  - Manages workflow stages
  - Publishes SSE progress updates
- Response Processing Queue:
  - Batch processes responses (20 per job)
  - Extracts topics and sentiment per response
  - Uses AI for thematic coding
- Topic Clustering Queue:
  - Merges similar topics into canonical themes
  - Reduces 50+ raw topics to 8-15 clusters
  - Semantic similarity matching
- Aggregation Queue:
  - Calculates topic frequencies and distributions
  - Computes sentiment statistics
  - Generates co-occurrence matrices
- AI Generation Queue:
  - Three parallel jobs: Summary, Findings, Recommendations
  - LLM-based executive summary
  - Evidence-based key findings
  - Prioritized recommendations
Key Files:
- `server/src/analytics/queues/orchestration.consumer.ts`
- `server/src/analytics/queues/response-processing.consumer.ts`
- `server/src/analytics/queues/topic-clustering.consumer.ts`
- `server/src/analytics/queues/aggregation.consumer.ts`
- `server/src/analytics/queues/ai-generation.consumer.ts`
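The three AI generation jobs are independent of each other, which is why they can run in parallel. A minimal sketch of that fan-out/fan-in shape (the generator stubs are illustrative; in FormulAI each is a separate Bull job backed by an LLM call):

```typescript
// Illustrative stand-ins for the Summary, Findings, and Recommendations jobs.
type Insights = { summary: string; findings: string[]; recommendations: string[] };

async function generateInsights(topics: string[]): Promise<Insights> {
  const summarize = async () => `Responses cluster around ${topics.length} themes.`;
  const findFindings = async () => topics.map((t) => `Theme "${t}" appears frequently.`);
  const recommend = async () => topics.map((t) => `Investigate ${t}.`);

  // The three jobs run concurrently; the stage completes when all resolve.
  const [summary, findings, recommendations] = await Promise.all([
    summarize(),
    findFindings(),
    recommend(),
  ]);
  return { summary, findings, recommendations };
}
```

With Bull, the orchestration consumer gets the same effect by enqueueing three jobs and waiting for all of them before advancing the pipeline.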
Specialized Processing Logic:
- Response Processor: Extract topics and sentiment from individual responses
- Batch Processor: Adaptive chunking (10-25 responses per AI call)
- Topic Clustering: Semantic similarity and merging algorithm
- Aggregation: Statistical calculations and correlation analysis
- Summary Generator: LLM-based narrative generation
- Findings Generator: Algorithmic key insights extraction
- Recommendations Generator: Prioritized action items
Key Files:
- `server/src/analytics/processors/response.processor.ts`
- `server/src/analytics/processors/batch.processor.ts`
- `server/src/analytics/generators/summary.generator.ts`
- `server/src/analytics/generators/findings.generator.ts`
- `server/src/analytics/generators/recommendations.generator.ts`
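The adaptive chunking rule (10-25 responses per AI call, based on response length) can be sketched as a pure function. The 10-25 bounds come from the text above; the character thresholds are illustrative assumptions, not FormulAI's exact values:

```typescript
// Pick a batch size between 10 and 25 responses per AI call:
// long answers get small batches to stay within the model's context window,
// short answers get large batches to cut API round-trips.
// The character thresholds are illustrative, not the actual configuration.
function adaptiveChunkSize(responses: string[]): number {
  if (responses.length === 0) return 10;
  const avgLength = responses.reduce((sum, r) => sum + r.length, 0) / responses.length;
  if (avgLength > 1000) return 10; // long-form answers
  if (avgLength > 300) return 15;  // medium answers
  return 25;                       // short answers
}
```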
Technology: MongoDB with Mongoose ODM
Collections:
- Forms Collection:
  - Form metadata (title, description, settings)
  - Questions array
  - Analytics cache (embedded document)
- Responses Collection:
  - User submissions
  - Metadata (topics, sentiment, quotes)
  - Processing status flags
- Users Collection:
  - Authentication credentials
  - JWT tokens
  - User preferences
- Analytics Tasks (embedded in forms):
  - Cached results
  - Topic distributions
  - Sentiment analysis
  - Quotes and insights
Key Files:
- `server/src/schemas/form.schema.ts`
- `server/src/schemas/response.schema.ts`
- `server/src/schemas/user.schema.ts`
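Under stated assumptions about field names (the authoritative definitions live in the schema files above), the embedded-analytics document shape might look like:

```typescript
// Hypothetical TypeScript shapes mirroring the Mongoose schemas.
// Field names are illustrative, not copied from form.schema.ts.
interface ResponseMetadata {
  topics: string[];
  sentiment: "positive" | "neutral" | "negative";
  quotes: string[];
  processed: boolean; // processing status flag set by the response processor
}

interface FormAnalytics {
  topicDistribution: Record<string, number>;
  sentimentStats: Record<string, number>;
  insights: { summary: string; findings: string[]; recommendations: string[] };
  lastUpdated: Date;
}

interface FormDocument {
  title: string;
  description?: string;
  questions: { id: string; label: string; type: string }[];
  analytics?: FormAnalytics; // embedded cache, invalidated on new responses
}
```

Embedding the analytics cache in the form document keeps dashboard reads to a single query, at the cost of rewriting the form document whenever analytics regenerate.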
OpenAI API (GPT-4o):
- Topic extraction from responses
- Sentiment and emotion classification
- Topic clustering (semantic similarity)
- Summary generation
- Prompt construction and response parsing
SMTP Server:
- Form invitations
- Notification emails
- Response confirmations
```
User → Form Editor UI → POST /forms
  → FormsController → FormsService
  → MongoDB (forms collection)
  → Response (form ID + metadata)
```
```
Respondent → Public Form View → POST /responses
  → ResponseController → ResponseService
  → MongoDB (responses collection)
  → Email notification (optional)
```
- Initiation (user clicks "Generate Analytics"):

  ```
  Analytics Dashboard → GET /forms/:id/analytics/stream
    → SSE connection established
    → OrchestrationProducer.startAnalytics(formId)
    → Orchestration Queue (job created)
  ```

- Stage 1: Response Processing (parallel batches):

  ```
  Orchestration Consumer → Response Processing Queue
    → 4 parallel jobs (20 responses each)
    → AI extracts topics + sentiment per response
    → Save to responses.metadata
    → Progress: 0% → 60%
  ```

- Stage 2: Topic Clustering:

  ```
  Orchestration Consumer → Topic Clustering Queue
    → Load all extracted topics
    → AI merges similar topics (50+ → 8-15)
    → Save canonical topics mapping
    → Progress: 60% → 70%
  ```

- Stage 3: Aggregation:

  ```
  Orchestration Consumer → Aggregation Queue
    → Calculate topic frequencies
    → Compute sentiment distributions
    → Generate co-occurrence matrix
    → Calculate correlations
    → Save to form.analytics
    → Progress: 70% → 75%
  ```

- Stage 4: AI Insights (parallel generation):

  ```
  Orchestration Consumer → AI Generation Queue (3 jobs)
    ├── Summary Job → LLM generates executive summary
    ├── Findings Job → Algorithmic key findings
    └── Recommendations Job → LLM generates action items
    → Save to form.analytics.insights
    → Progress: 75% → 95%
  ```

- Stage 5: Completion:

  ```
  Orchestration Consumer → Save final results
    → Update form.analytics.lastUpdated
    → Publish 'complete' event via SSE
    → Progress: 100%
    → Close SSE connection
  ```
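The stage boundaries above (0 → 60 → 70 → 75 → 95 → 100) can be captured in a single mapping, which is roughly what the SSE progress events encode (stage names here are illustrative labels, not the exact event payloads):

```typescript
// Map pipeline stages to the progress ranges described above.
// Within a stage, `fraction` (0..1) interpolates between the bounds.
const STAGE_BOUNDS: Record<string, [number, number]> = {
  processing: [0, 60],
  clustering: [60, 70],
  aggregation: [70, 75],
  insights: [75, 95],
  completion: [95, 100],
};

function progressFor(stage: keyof typeof STAGE_BOUNDS, fraction: number): number {
  const [start, end] = STAGE_BOUNDS[stage];
  // Clamp fraction so a worker reporting >100% of its batch can't overshoot.
  return Math.round(start + (end - start) * Math.min(Math.max(fraction, 0), 1));
}
```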
```
User → Analytics Dashboard → GET /forms/:id/analytics
  → AnalyticsService.getFormAnalytics(formId)
  → Load from MongoDB (cached)
  → Return analytics object
  → Render 13 analytics cards
```
- Provides a single Axios instance with request and response interceptors.
- Handles JWT attachment, global 401 response handling (clears auth state and redirects to login), and centralized error handling.
- Centralized functions to extract user‑friendly error messages from Axios errors and generic errors.
- Used across services and components to ensure consistent error display.
- Simple wrapper around `console` with log levels (`debug`, `info`, `warn`, `error`).
- Allows easy replacement with a more sophisticated logging library in the future.
- React error boundary component that catches rendering errors in the component tree.
- Displays a fallback UI and logs the error via the logger utility.
- Framework: React 19
- Language: TypeScript 5
- Build Tool: Vite 5
- Styling: Tailwind CSS 3
- State Management: React Context + Hooks
- Routing: React Router 6
- HTTP Client: Axios
- UI Components: Custom components with Lucide icons
- See detailed client utilities in architecture-overview.md
- Framework: NestJS 10
- Language: TypeScript 5
- Runtime: Node.js 20+
- API Style: REST with Swagger documentation
- Authentication: JWT (jsonwebtoken)
- Validation: class-validator, class-transformer
- Queue System: Bull (Redis-backed)
- Database: MongoDB 5.0+
- ODM: Mongoose 8
- Queue Backend: Redis 7
- Containerization: Docker + Docker Compose
- Development: Nodemon, Concurrently
- Primary: OpenAI GPT-4o (via API)
- Alternative: Ollama (local models)
- Prompting: Custom prompt templates for each task
- Parsing: Structured JSON responses with fallbacks
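"Structured JSON responses with fallbacks" typically means stripping markdown fences and tolerating malformed model output. A minimal sketch of that idea (an assumption about the approach, not FormulAI's actual parser):

```typescript
// LLMs often wrap JSON in ```json fences or surround it with prose.
// Strip fences, locate the outermost JSON object, and parse it.
// Returns the fallback value instead of throwing on unparseable output.
function parseModelJson<T>(raw: string, fallback: T): T {
  const stripped = raw.replace(/`{3}(?:json)?/g, "").trim();
  const start = stripped.indexOf("{");
  const end = stripped.lastIndexOf("}");
  if (start === -1 || end <= start) return fallback;
  try {
    return JSON.parse(stripped.slice(start, end + 1)) as T;
  } catch {
    return fallback;
  }
}
```

Combined with retries, this lets one malformed completion degrade a single batch rather than fail the whole pipeline.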
- Each queue consumer is an independent worker
- Loose coupling via message queues
- Horizontal scalability (add more workers)
- Asynchronous task execution
- Retry mechanisms and dead letter queues
- Job progress tracking
- Graceful failure handling
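Bull configures retries declaratively via job options such as `attempts` and `backoff`; the hand-rolled equivalent below is only an illustration of the behavior, not how the consumers are wired up:

```typescript
// Retry an async job up to `attempts` times with exponential backoff,
// mirroring Bull's { attempts: 3, backoff: { type: "exponential" } } options.
async function withRetry<T>(
  job: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await job();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Exponential backoff: 100ms, 200ms, 400ms, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  // After exhausting retries, Bull would move the job to a dead letter queue.
  throw lastError;
}
```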
- Mongoose models as repositories
- Service layer abstracts data access
- Schema validation at database level
- Prompt builders for different AI tasks
- Generator classes for insights
- Processor classes for analytics stages
- SSE for real-time progress updates
- Event-driven queue processing
- Frontend subscribes to backend events
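On the wire, each SSE progress update is just a text frame. Formatting one follows the standard SSE wire format; the event name and payload fields below are assumptions, not FormulAI's actual contract:

```typescript
// Format a server-sent event frame per the SSE wire format:
// an optional "event:" line, a "data:" line, terminated by a blank line.
function sseFrame(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

The browser side consumes these frames with a plain `EventSource`, dispatching on the `event:` name.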
- Different AI providers (OpenAI, Ollama)
- Adaptive batch sizing
- Configurable processing strategies
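The provider swap (OpenAI vs. Ollama) is a classic strategy interface. A sketch under assumed method and class names (the real abstraction lives in `server/src/ai/ai.service.ts` and may differ):

```typescript
// Strategy interface: each AI provider implements the same contract,
// so the analytics pipeline never depends on a concrete vendor API.
// Names here are illustrative, not FormulAI's actual code.
interface CompletionProvider {
  complete(prompt: string): Promise<string>;
}

class StubProvider implements CompletionProvider {
  constructor(private readonly canned: string) {}
  async complete(_prompt: string): Promise<string> {
    return this.canned; // a real provider would call OpenAI or Ollama here
  }
}

async function extractTopics(provider: CompletionProvider, text: string): Promise<string[]> {
  const raw = await provider.complete(`List the topics in: ${text}`);
  return raw.split(",").map((t) => t.trim()).filter(Boolean);
}
```

Besides vendor portability, the stub provider makes the pipeline testable without network calls.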
- API Layer: Stateless, can run multiple instances
- Queue Workers: Add more consumers for parallel processing
- Database: MongoDB replica sets and sharding
- Batch Size: Configurable (currently 20 responses/job)
- Concurrency: Configurable parallel workers (currently 4)
- Chunk Size: Adaptive based on response length
- Analytics Cache: Stored in MongoDB (form.analytics)
- Incremental Updates: Only reprocess new responses
- Cache Invalidation: On manual refresh or new responses
- Parallel Processing: 4 concurrent batches
- Adaptive Chunking: 10-25 responses per AI call based on length
- Smart Clustering: Merge 50+ topics to 8-15 in one pass
- Streaming Updates: SSE prevents blocking UI
- Background Jobs: Queue system prevents API timeouts
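FormulAI's clustering is LLM-driven semantic matching; as a rough illustration of the merge step only, here is a greedy token-overlap (Jaccard) version showing how 50+ raw topics collapse into fewer canonical themes:

```typescript
// Jaccard similarity over word sets: 1.0 means identical vocabularies.
function jaccard(a: string, b: string): number {
  const wa = new Set(a.toLowerCase().split(/\s+/));
  const wb = new Set(b.toLowerCase().split(/\s+/));
  const inter = [...wa].filter((w) => wb.has(w)).length;
  return inter / (wa.size + wb.size - inter);
}

// Greedy single-pass merge: each topic joins the first existing cluster
// whose canonical label is similar enough, otherwise it founds a new one.
// The actual system uses LLM-based semantic matching, not token overlap.
function clusterTopics(topics: string[], threshold = 0.5): Map<string, string[]> {
  const clusters = new Map<string, string[]>();
  for (const topic of topics) {
    let merged = false;
    for (const [canonical, members] of clusters) {
      if (jaccard(canonical, topic) >= threshold) {
        members.push(topic);
        merged = true;
        break;
      }
    }
    if (!merged) clusters.set(topic, [topic]);
  }
  return clusters;
}
```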
- JWT tokens with expiration
- Route guards on protected endpoints
- User ownership validation for forms/responses
- Password hashing (bcrypt)
- Environment variable secrets
- CORS configuration
- Input validation and sanitization
- Rate limiting (planned)
- Request size limits
- SQL/NoSQL injection prevention via ODM
- XSS protection via React
- Console logging with timestamps
- Queue job progress tracking
- Error stack traces
- AI API call logging
- Real-time SSE updates
- Job status in UI (Not started, Pending, Complete, Failed)
- Stage-by-stage progress (0-100%)
- Detailed substage information
- Dead letter queues for failed jobs
- Retry mechanisms (3 attempts)
- Graceful degradation
- User-friendly error messages
```
localhost:3000 (Frontend) → localhost:3001 (Backend) → localhost:27017 (MongoDB)
```
```
CDN (Frontend) → Load Balancer → NestJS Instances → MongoDB Atlas
                                       ↓
                                  Redis Queue
                                       ↓
                          Queue Workers (auto-scale)
```
- Minimum: 1GB RAM, 1 CPU core, 10GB storage
- Recommended: 2GB RAM, 2 CPU cores, 25GB storage
- High Load: 4GB+ RAM, 4+ CPU cores, auto-scaling workers
- Redis caching layer for frequently accessed data
- Database indexing optimization
- Custom AI model fine-tuning
- Comparative analytics across forms
- Team collaboration and sharing
- Advanced export formats (PDF, PPTX)
- CI/CD pipeline (GitHub Actions)
- Distributed tracing (Jaeger)
- Auto-scaling based on queue depth
- NestJS Documentation
- Bull Queue Documentation
- MongoDB Best Practices
- OpenAI API Reference
- React Documentation
Last Updated: November 6, 2025
Version: 1.0.0