Production

Deploy Hexis to a cloud environment with managed services.

Architecture Overview

Managed Postgres  <--  N stateless workers  <--  App services
     (RDS/Cloud SQL)       (polling external_calls)    (CLI/API/MCP)

Workers are stateless -- scale them horizontally by running multiple instances. All state lives in Postgres.

Use any managed PostgreSQL service (AWS RDS, Google Cloud SQL, Azure Database for PostgreSQL, etc.) that meets these requirements:

Extensions required: pgvector, age (Apache AGE), btree_gist, pg_trgm
Minimum version: PostgreSQL 14
Check that your managed service supports Apache AGE -- not all do
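
One way to act on the last point is to compare the extensions your service offers against the required list before applying any schema. The following is only a sketch, not part of Hexis: missing_extensions is a hypothetical helper, and it assumes pgvector is registered under the extension name "vector" (as upstream pgvector does) and Apache AGE under "age".

```shell
#!/bin/sh
# Extension names as PostgreSQL registers them; pgvector installs as "vector".
REQUIRED_EXTENSIONS="vector age btree_gist pg_trgm"

# missing_extensions (hypothetical helper): reads available extension names
# on stdin, one per line, and prints each required extension that is absent.
missing_extensions() {
  available=$(cat)
  for ext in $REQUIRED_EXTENSIONS; do
    printf '%s\n' "$available" | grep -qxF "$ext" || printf '%s\n' "$ext"
  done
}

# Usage against a live server (placeholders as in the schema step):
#   psql -h <host> -U <user> -d <database> -Atc \
#     "SELECT name FROM pg_available_extensions" | missing_extensions
```

Empty output means every required extension is available to install. Any name printed is one the service does not offer, which for Apache AGE usually means choosing a different provider or self-hosting.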

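Apply the schema files from db/*.sql in filename order (the order in which the shell expands the glob), stopping at the first failure:

# ON_ERROR_STOP makes psql exit non-zero on the first SQL error;
# "|| break" then stops the loop so later files are not applied after a failed one.
for f in db/*.sql; do
  psql -h <host> -U <user> -d <database> -v ON_ERROR_STOP=1 -f "$f" || break
done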

Embedding service options for production:

Option	Pros	Cons
Ollama on host	Simple, fast for small scale	Single point of failure
HuggingFace TEI	Docker-based, scalable	CPU-only (float32)
OpenAI Embeddings	No infrastructure	Cost per request, latency
vLLM / LiteLLM	GPU support, OpenAI-compatible	More infrastructure
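
Whichever option you pick, it can help to smoke-test the endpoint before pointing workers at it. The sketch below assumes the service exposes an OpenAI-style /v1/embeddings route; confirm that for your chosen option. EMBEDDINGS_URL, EMBEDDINGS_API_KEY, the default port, and both helper functions are hypothetical names for illustration, not Hexis settings.

```shell
#!/bin/sh
# Assumed default: adjust the URL to wherever your embedding service listens.
EMBEDDINGS_URL="${EMBEDDINGS_URL:-http://localhost:8080/v1/embeddings}"

# embedding_request_body MODEL TEXT (hypothetical helper): prints an
# OpenAI-style request body. No JSON escaping is done, so keep TEXT free of
# quotes and backslashes.
embedding_request_body() {
  printf '{"model":"%s","input":"%s"}\n' "$1" "$2"
}

# embedding_smoke_test MODEL TEXT: POSTs one request and prints the raw JSON
# response; with --fail, curl exits non-zero on an HTTP error status.
embedding_smoke_test() {
  body=$(embedding_request_body "$1" "$2")
  set -- -H 'Content-Type: application/json'
  # Hosted APIs need a key; local services usually do not.
  if [ -n "${EMBEDDINGS_API_KEY:-}" ]; then
    set -- "$@" -H "Authorization: Bearer ${EMBEDDINGS_API_KEY}"
  fi
  curl --fail --silent --show-error "$@" -d "$body" "$EMBEDDINGS_URL"
}
```

Calling embedding_smoke_test with your model name and a short string should return a JSON document containing an embedding vector; timing a few calls gives a rough feel for the latency trade-off in the table.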

Run workers as long-lived processes (systemd, Docker, Kubernetes):

hexis-worker --mode heartbeat --instance production
hexis-worker --mode maintenance --instance production
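
For the systemd route, a unit per worker mode is one possible layout. The sketch below prints such a unit; render_worker_unit is a hypothetical helper, and the service user "hexis", the unit file names, and the use of /usr/bin/env to locate hexis-worker on PATH are all assumptions to adapt to your install. Database connection settings are omitted because they depend on how your deployment is configured.

```shell
#!/bin/sh
# render_worker_unit MODE INSTANCE (hypothetical helper): prints a systemd
# unit that runs one worker and restarts it whenever it exits.
render_worker_unit() {
  cat <<EOF
[Unit]
Description=Hexis $1 worker ($2)
After=network-online.target
Wants=network-online.target

[Service]
User=hexis
ExecStart=/usr/bin/env hexis-worker --mode $1 --instance $2
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
}

# Usage (as root):
#   render_worker_unit heartbeat production > /etc/systemd/system/hexis-heartbeat.service
#   render_worker_unit maintenance production > /etc/systemd/system/hexis-maintenance.service
#   systemctl daemon-reload
#   systemctl enable --now hexis-heartbeat hexis-maintenance
```

Because the workers are stateless, scaling out means running the same unit on additional hosts; under Docker or Kubernetes the equivalent is a restart policy plus a replica count.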

Operational recommendations:

Memory consolidation: run every 4-6 hours; the maintenance worker handles this
Database optimization: schedule during off-peak hours
Vector indexes: monitor HNSW index performance once the dataset grows large (10K+ memories)
Connection pooling: tune with HEXIS_POOL_MIN_SIZE and HEXIS_POOL_MAX_SIZE
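
When several workers run against one database, the pool sizes add up. A rough way to pick HEXIS_POOL_MAX_SIZE is to divide the server's connection budget across workers, as sketched below. This assumes each worker process keeps its own pool; pool_max_per_worker is a hypothetical helper, and the numbers (including the minimum of 2) are illustrative only.

```shell
#!/bin/sh
# pool_max_per_worker MAX_CONNECTIONS RESERVED WORKERS (hypothetical helper):
# prints the largest per-worker pool size that keeps the combined pools
# within the server's connection limit, leaving RESERVED connections for
# app services, migrations, and admin sessions.
pool_max_per_worker() {
  echo $(( ($1 - $2) / $3 ))
}

# Example: max_connections=100, 20 reserved, 4 workers.
HEXIS_POOL_MAX_SIZE=$(pool_max_per_worker 100 20 4)   # (100 - 20) / 4 = 20
HEXIS_POOL_MIN_SIZE=2                                 # assumed small floor
export HEXIS_POOL_MIN_SIZE HEXIS_POOL_MAX_SIZE
```

Recompute the value whenever you add workers, since the per-worker share shrinks as the worker count grows.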