Problem
Pituitary's vector search runs at full stored dimension today. For small corpora (< ~5k chunks) that's fine. For larger corpora — especially the cross-repo governance use case (#173) and multi-repo workspaces (#228) — full-dim cosine across thousands of chunks becomes the dominant query cost.
Proposal
Adopt stroma v2's matryoshka prefilter via SearchParams.SearchDimension:
- Run the first-pass vector scan at a smaller truncated dimension (e.g. 256 for an embedder trained with matryoshka loss).
- Rescore the shortlist with full-dimension cosine.
Result: latency drops roughly proportionally to the prefilter dimension, recall stays within ~1-2% of full-dim for matryoshka-trained embedders.
Why this matters for Pituitary
Implementation notes
- Configuration surface:
runtime.search.prefilter_dimension = 256 (or similar). Default 0 = full-dim, no prefilter.
- Only meaningful for matryoshka-trained embedders (most OpenAI models, some OSS models). Gate behind embedder capability detection or explicit opt-in to avoid recall regressions on non-matryoshka models.
- Three call sites today embed
VectorSearchQuery: internal/index/search.go, internal/analysis/semantic_terminology.go, internal/analysis/repository_similarity.go. All three need to thread the prefilter config.
- Benchmark against representative corpus to confirm recall stays within tolerance.
Acceptance criteria
Context
Unlocked by stroma v2.0.0 (merged in #337). Part of the Phase 4 stroma adoption plan.
Problem
Pituitary's vector search runs at full stored dimension today. For small corpora (< ~5k chunks) that's fine. For larger corpora — especially the cross-repo governance use case (#173) and multi-repo workspaces (#228) — full-dim cosine across thousands of chunks becomes the dominant query cost.
Proposal
Adopt stroma v2's matryoshka prefilter via
SearchParams.SearchDimension:Result: latency drops roughly proportionally to the prefilter dimension, recall stays within ~1-2% of full-dim for matryoshka-trained embedders.
Why this matters for Pituitary
Implementation notes
runtime.search.prefilter_dimension = 256(or similar). Default 0 = full-dim, no prefilter.VectorSearchQuery:internal/index/search.go,internal/analysis/semantic_terminology.go,internal/analysis/repository_similarity.go. All three need to thread the prefilter config.Acceptance criteria
SearchParams.SearchDimensionthreaded through all three vector-search call sitesContext
Unlocked by stroma v2.0.0 (merged in #337). Part of the Phase 4 stroma adoption plan.