Bug Description
When Hindsight runs against Azure PostgreSQL Flexible Server and the pg_trgm database extension is not available, it fails in two stages.
Step 1 — Startup crash: On first boot, the database migration c1a2b3d4e5f6 fails immediately because it tries to create search indexes that depend on pg_trgm. Azure's managed Postgres doesn't always have this extension installed, so the whole app crashes at startup.
Step 2 — Silent runtime failure after bypass: If you work around Step 1 by setting HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION=true, the app starts fine — but background "retain" jobs start failing right away with the error: operator does not exist: text % text.
This happens because even after skipping the migration, Hindsight still defaults to a trigram-based search strategy (DEFAULT_RETAIN_ENTITY_LOOKUP) at runtime, which also requires pg_trgm. So the failure just moves from startup to runtime, and it's not obvious why things are broken unless you already know about the extension dependency.
Database: Azure PostgreSQL Flexible Server (managed)
Migration affected: c1a2b3d4e5f6
First seen: 2026-03-16
Steps to Reproduce
- Set up an Azure PostgreSQL Flexible Server where the pg_trgm extension is not installed (this is common on Azure managed Postgres; the extension is not always available by default).
- Point a fresh Hindsight deployment at this database and start the API.
- Watch the startup logs: you'll see a crash during database migration c1a2b3d4e5f6 with errors related to a missing operator or index class that requires pg_trgm.
- Add the environment variable HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION=true and restart. The app now starts successfully.
- Trigger any retain operation (for example, store some content so Hindsight can retrieve it later).
- Check the background worker logs: you'll see repeated errors like operator does not exist: text % text.
Confirmed workaround (both env vars needed):
- HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION=true (skips the broken migration at boot)
- HINDSIGHT_API_DEFAULT_RETAIN_ENTITY_LOOKUP=full (switches runtime search away from trigram, stopping the background errors)
Expected Behavior
If pg_trgm is not installed on the database, Hindsight should handle this automatically — no manual env vars should be needed.
- At migration time: Hindsight checks whether pg_trgm is available (e.g. SELECT * FROM pg_extension WHERE extname = 'pg_trgm') and, if it's not there, skips or gracefully downgrades the trigram-related migration without crashing.
- At runtime: Hindsight detects on startup that pg_trgm is missing and automatically falls back to a simpler full-scan entity lookup instead of trigram. Background retain jobs should run without errors.
In short: the app should start and work correctly on Azure managed Postgres out of the box, just using a slightly less optimized search path when the extension isn't available.
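A rough sketch of the detection described above, not Hindsight's actual code. The function names are hypothetical, the cursor is assumed to be any DB-API 2.0 style cursor (psycopg, for example), and the mode name "trigram" is an assumption; only "full" appears in the workaround above. The same check could guard both the migration and the runtime lookup choice.

```python
from typing import Any

# Catalog query from the expected-behavior list above, narrowed to SELECT 1.
# pg_extension lists extensions that are installed in the current database.
PG_TRGM_CHECK_SQL = "SELECT 1 FROM pg_extension WHERE extname = 'pg_trgm'"


def pg_trgm_installed(cursor: Any) -> bool:
    """Return True if pg_trgm is installed in the connected database.

    `cursor` is assumed to expose DB-API 2.0 `execute` and `fetchone`.
    """
    cursor.execute(PG_TRGM_CHECK_SQL)
    return cursor.fetchone() is not None


def choose_entity_lookup(cursor: Any) -> str:
    """Pick the entity lookup mode based on what the database supports.

    "trigram" is an assumed name for the trigram mode; "full" is the
    value used in the confirmed workaround.
    """
    return "trigram" if pg_trgm_installed(cursor) else "full"
```

A migration could call pg_trgm_installed() first and skip creating the trigram indexes when it returns False, instead of raising.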
Actual Behavior
With default settings (no env vars): The app crashes at startup during migration c1a2b3d4e5f6. The error points to a missing pg_trgm-backed operator or index class.
After setting HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION=true: Startup succeeds, but background retain jobs start failing immediately with:
operator does not exist: text % text
This is because the default entity lookup mode is still set to trigram, which runs pg_trgm-backed SQL queries at runtime even though the migration was skipped. The connection between "skipping the migration" and "disabling trigram at runtime" is not automatic, so the app appears healthy but silently fails on every retain operation.
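One way the two settings could be tied together is sketched below. This is an illustration, not how Hindsight resolves its config today: resolve_entity_lookup is a hypothetical helper, the truthy-value parsing is an assumption, and "trigram" is an assumed name for the default mode. An explicit HINDSIGHT_API_DEFAULT_RETAIN_ENTITY_LOOKUP always wins; otherwise skipping the migration implies the "full" lookup.

```python
import os
from typing import Mapping, Optional

# Assumed set of values treated as "true"; Hindsight's real parsing may differ.
_TRUTHY = {"1", "true", "yes", "on"}


def resolve_entity_lookup(env: Optional[Mapping[str, str]] = None) -> str:
    """Derive the runtime entity lookup mode from the environment.

    An explicitly configured lookup mode is returned unchanged. If none is
    set and the pg_trgm migration was disabled, fall back to "full" so that
    retain jobs do not issue pg_trgm-backed queries.
    """
    env = os.environ if env is None else env

    explicit = env.get("HINDSIGHT_API_DEFAULT_RETAIN_ENTITY_LOOKUP", "").strip()
    if explicit:
        return explicit

    flag = env.get("HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION", "").strip().lower()
    return "full" if flag in _TRUTHY else "trigram"
```

With this coupling, setting only HINDSIGHT_API_DISABLE_PG_TRGM_MIGRATION=true would be enough to avoid the operator does not exist: text % text errors.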
Note: This is the same pattern seen with the pg_diskann / vectorscale issue (fixed in commits 9f9dd25 and 34fba3d) — Hindsight assumes optional Postgres extensions are always present without checking first.
Version
Migration revision c1a2b3d4e5f6 present — Azure PostgreSQL Flexible Server
LLM Provider
OpenAI