Skip to content

Latest commit

 

History

History
84 lines (60 loc) · 2.5 KB

File metadata and controls

84 lines (60 loc) · 2.5 KB

Production

Deploy Hexis to a cloud environment with managed services.

Architecture Overview

Managed Postgres  <--  N stateless workers  <--  App services
     (RDS/Cloud SQL)       (polling external_calls)    (CLI/API/MCP)

Workers are stateless -- scale them horizontally by running multiple instances. All state lives in Postgres.

Managed PostgreSQL

Use any managed PostgreSQL service (AWS RDS, Google Cloud SQL, Azure Database, etc.):

  • Extensions required: pgvector, age (Apache AGE), btree_gist, pg_trgm
  • Minimum version: PostgreSQL 14+
  • Check that your managed service supports Apache AGE -- not all do

Schema Initialization

Apply schema files from db/*.sql in order:

for f in db/*.sql; do
  psql -h <host> -U <user> -d <database> -f "$f"
done

Embedding Service

Options for production:

Option Pros Cons
Ollama on host Simple, fast for small scale Single point of failure
HuggingFace TEI Docker-based, scalable CPU-only (float32)
OpenAI Embeddings No infrastructure Cost per request, latency
vLLM / LiteLLM GPU support, OpenAI-compatible More infrastructure

Workers

Run workers as long-lived processes (systemd, Docker, Kubernetes):

hexis-worker --mode heartbeat --instance production
hexis-worker --mode maintenance --instance production

Key properties:

  • Stateless -- kill and restart freely
  • Advisory locks -- prevent double-execution
  • Horizontal scaling -- run N heartbeat workers for N instances

Security

  • Set HEXIS_API_KEY for API server authentication
  • Set HEXIS_BIND_ADDRESS=127.0.0.1 and use a reverse proxy
  • Store API keys in environment variables, not in the database
  • Use separate database credentials per environment

Scaling Considerations

  • Memory consolidation: recommended every 4-6 hours (handled by maintenance worker)
  • Database optimization: schedule during off-peak hours
  • Vector indexes: monitor HNSW index performance with large datasets (10K+ memories)
  • Connection pooling: use HEXIS_POOL_MIN_SIZE / HEXIS_POOL_MAX_SIZE to tune

Related