Problem
Stroma snapshots store embeddings as float32 blobs. On a modest corpus (~200 spec/doc records, ~3k chunks, dim=1536) that is already roughly 18 MB of raw vector data per snapshot (3,000 × 1,536 × 4 bytes), and because snapshots are content-addressed, every non-trivial corpus change writes a fresh file.
This starts to matter when:
Users commit snapshot fixtures to repos (dogfooding, tests, reproducible CI).
CI artifacts carry snapshots across jobs.
Pre-commit hooks rebuild and discard many snapshot revisions per day.
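As a rough check on the corpus figures in the problem statement, the sketch below computes the raw vector payload at 32, 8, and 1 bits per dimension. It assumes exactly 3,000 chunks at dim=1536 and ignores per-vector metadata, chunk text, and database overhead, so real snapshot sizes will differ. `vectorBytes` is a hypothetical helper, not part of stroma.

```go
package main

import "fmt"

// vectorBytes returns the raw payload size, in bytes, of `chunks` vectors with
// `dim` dimensions stored at `bitsPerDim` bits per dimension. Each vector is
// rounded up to a whole number of bytes.
func vectorBytes(chunks, dim, bitsPerDim int) int {
	bytesPerVector := (dim*bitsPerDim + 7) / 8
	return chunks * bytesPerVector
}

func main() {
	const chunks, dim = 3000, 1536 // assumed from "~3k chunks, dim=1536"
	encodings := []struct {
		name string
		bits int
	}{
		{"float32", 32},
		{"int8", 8},
		{"binary", 1},
	}
	for _, enc := range encodings {
		b := vectorBytes(chunks, dim, enc.bits)
		fmt.Printf("%-8s %10d bytes (%.2f MB)\n", enc.name, b, float64(b)/1e6)
	}
}
```

Under these assumptions the float32 payload is 18,432,000 bytes, int8 is 4,608,000 bytes, and binary is 576,000 bytes, which matches the 4× and 32× ratios quoted in the proposal.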
Proposal
Expose stroma v2's quantization options through BuildOptions:
int8 quantization — 4× smaller, near-identical recall in typical conditions.
binary quantization — 32× smaller, 1-bit sign + full-precision rescore. More aggressive; worth measuring recall impact on spec/doc corpora specifically.
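To make the two options concrete, here is a minimal sketch of one common way to do each: symmetric int8 quantization with a single per-vector scale, and sign-bit packing for binary. This is an illustration under assumptions, not stroma's actual encoding; the function names are hypothetical and stroma's blob layout may differ.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// quantizeInt8 maps each component into [-127, 127] using one scale per vector
// (max absolute value / 127). The scale has to be stored next to the codes so
// the vector can be approximately reconstructed as code * scale.
func quantizeInt8(v []float32) (codes []int8, scale float32) {
	var maxAbs float32
	for _, x := range v {
		if x < 0 {
			x = -x
		}
		if x > maxAbs {
			maxAbs = x
		}
	}
	codes = make([]int8, len(v))
	if maxAbs == 0 {
		return codes, 1 // all-zero vector: any positive scale works
	}
	scale = maxAbs / 127
	for i, x := range v {
		codes[i] = int8(math.Round(float64(x / scale)))
	}
	return codes, scale
}

// quantizeBinary keeps only the sign of each component, packed 64 per word.
// A set bit means the component was strictly positive.
func quantizeBinary(v []float32) []uint64 {
	words := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x > 0 {
			words[i/64] |= uint64(1) << uint(i%64)
		}
	}
	return words
}

// hamming counts the sign bits that differ between two packed codes of equal length.
func hamming(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

func main() {
	v := []float32{0.5, -1.0, 0.25, 0}
	codes, scale := quantizeInt8(v)
	fmt.Println("int8 codes:", codes, "scale:", scale)

	signs := quantizeBinary(v)
	fmt.Printf("sign bits: %04b\n", signs[0])
	fmt.Println("hamming distance to an all-negative vector:", hamming(signs, []uint64{0}))
}
```

The per-vector scale adds a few bytes on top of the 1-byte-per-dimension codes, which is why int8 is "about" 4× smaller rather than exactly 4×.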
Configure via the runtime block (e.g. runtime.quantization = "float32" | "int8" | "binary"), default float32 so nothing changes without opt-in.
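A minimal sketch of how the config value could be validated, assuming it arrives as a plain string. The `Quantization` type and `ParseQuantization` function are hypothetical names for illustration; the only behaviour taken from the proposal is the set of three values and the float32 default when nothing is set.

```go
package main

import (
	"fmt"
	"strings"
)

// Quantization is a hypothetical type for the proposed runtime.quantization key.
type Quantization string

const (
	QuantFloat32 Quantization = "float32"
	QuantInt8    Quantization = "int8"
	QuantBinary  Quantization = "binary"
)

// ParseQuantization validates a raw config value. An empty value falls back to
// float32 so existing configs keep their current behaviour without opt-in.
func ParseQuantization(raw string) (Quantization, error) {
	switch q := Quantization(strings.TrimSpace(raw)); q {
	case "":
		return QuantFloat32, nil
	case QuantFloat32, QuantInt8, QuantBinary:
		return q, nil
	default:
		return "", fmt.Errorf("runtime.quantization: unsupported value %q (want float32, int8, or binary)", raw)
	}
}

func main() {
	for _, raw := range []string{"", "int8", "binary", "fp16"} {
		q, err := ParseQuantization(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%q -> %s\n", raw, q)
	}
}
```

Rejecting unknown values at config-load time, rather than silently falling back, keeps a typo from producing a full-size snapshot that the user believed was quantized.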
Why this matters for Pituitary
Makes snapshot-in-repo a realistic pattern — a 32× smaller .stroma.db fits in a repo without bloating it.
Enables snapshot-as-CI-artifact without paying bandwidth + storage taxes.
Positions Pituitary for larger corpora (multi-repo governance, the cross-repo work in #173, "RFC: Cross-repo spec governance") where full-precision vectors stop being free.
Implementation notes
Use stroma/v2/store.{Encode,Decode}VectorBlob{,Int8,Binary}.
Binary quantization's rescore step adds a cosine-similarity pass at full dim. Confirm it stays within the SearchParams latency budget on a representative corpus before binary is ever considered as a default.
The embedder stays float32 on the query side; quantization is a storage-and-prefilter concern.
Verify the reuse-probe path works correctly across quantization changes (quantization change should be equivalent to an embedder-fingerprint mismatch → forces rebuild).
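The prefilter-then-rescore flow described in the notes above can be sketched as two stages: rank everything by Hamming distance on sign bits, then rescore a shortlist with a cosine pass over all dimensions. How stroma actually rescores is not specified here. This sketch assumes only sign bits are stored, so the rescore is asymmetric: the float32 query is compared against the ±1 vector reconstructed from each code. All names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
	"sort"
)

// signBits packs the sign of each component into 64-bit words (1 = positive).
func signBits(v []float32) []uint64 {
	words := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x > 0 {
			words[i/64] |= uint64(1) << uint(i%64)
		}
	}
	return words
}

// hamming counts differing sign bits between two codes of equal length.
func hamming(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

// asymmetricCosine scores a float32 query against the +1/-1 vector implied by
// a sign-bit code. This is the full-dimension pass that costs extra latency.
func asymmetricCosine(query []float32, code []uint64) float64 {
	var dot, qNorm float64
	for i, q := range query {
		s := -1.0
		if (code[i/64]>>uint(i%64))&1 == 1 {
			s = 1.0
		}
		dot += float64(q) * s
		qNorm += float64(q) * float64(q)
	}
	if qNorm == 0 {
		return 0
	}
	return dot / (math.Sqrt(qNorm) * math.Sqrt(float64(len(query))))
}

// hit pairs a document index with its current score.
type hit struct {
	ID    int
	Score float64
}

// search keeps the `shortlist` nearest codes by Hamming distance, rescores
// them, and returns the best k. shortlist and k are assumed to be positive.
func search(query []float32, codes [][]uint64, shortlist, k int) []hit {
	qc := signBits(query)
	cands := make([]hit, len(codes))
	for i, c := range codes {
		cands[i] = hit{ID: i, Score: float64(hamming(qc, c))}
	}
	// Stage 1: cheap prefilter, smaller Hamming distance first.
	sort.SliceStable(cands, func(a, b int) bool { return cands[a].Score < cands[b].Score })
	if shortlist < len(cands) {
		cands = cands[:shortlist]
	}
	// Stage 2: rescore the shortlist, larger cosine first.
	for i := range cands {
		cands[i].Score = asymmetricCosine(query, codes[cands[i].ID])
	}
	sort.SliceStable(cands, func(a, b int) bool { return cands[a].Score > cands[b].Score })
	if k < len(cands) {
		cands = cands[:k]
	}
	return cands
}

func main() {
	docs := [][]float32{{1, 1, -1, 1}, {-1, -1, 1, -1}, {1, 1, 1, 1}, {1, -1, -1, 1}}
	codes := make([][]uint64, len(docs))
	for i, d := range docs {
		codes[i] = signBits(d)
	}
	query := []float32{1, 0.9, -0.2, 0.1}
	for _, h := range search(query, codes, 3, 2) {
		fmt.Printf("doc %d score %.3f\n", h.ID, h.Score)
	}
}
```

The rescore cost grows with shortlist × dim, so the shortlist size is the main knob to tune when checking the latency budget.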
Acceptance criteria
runtime.quantization config surface
int8 and binary paths exercised through rebuild → search end-to-end
Benchmark: snapshot size reduction AND precision@k/recall@k delta on a representative corpus
Quantization change triggers rebuild (not a reuse-compatible delta)
Documentation recommending int8 as the near-always-safe choice and calling out binary's rescore overhead
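For the benchmark criterion, one simple way to measure the quality delta is to treat the float32 top-k as ground truth and count how many of those results a quantized snapshot also returns in its top-k. The sketch below assumes search results are ordered slices of unique record IDs; `recallAtK` is a hypothetical helper. With this definition the ground-truth set has exactly k items, so precision@k and recall@k coincide; a labelled relevance set would be needed to separate them.

```go
package main

import "fmt"

// recallAtK returns the fraction of the baseline's top-k IDs that also appear
// in the candidate's top-k. IDs are assumed to be unique within each list.
func recallAtK(baseline, candidate []string, k int) float64 {
	if k > len(baseline) {
		k = len(baseline)
	}
	if k <= 0 {
		return 0
	}
	want := make(map[string]struct{}, k)
	for _, id := range baseline[:k] {
		want[id] = struct{}{}
	}
	limit := k
	if limit > len(candidate) {
		limit = len(candidate)
	}
	hits := 0
	for _, id := range candidate[:limit] {
		if _, ok := want[id]; ok {
			hits++
		}
	}
	return float64(hits) / float64(k)
}

func main() {
	// Made-up IDs purely to exercise the function; not benchmark data.
	float32Results := []string{"a", "b", "c", "d"}
	int8Results := []string{"a", "c", "e", "b"}
	for _, k := range []int{1, 3, 4} {
		fmt.Printf("recall@%d = %.2f\n", k, recallAtK(float32Results, int8Results, k))
	}
}
```

Averaging this value over a fixed query set, once per quantization mode, gives a single number to report next to the snapshot size reduction.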
Context
Unlocked by stroma v2.0.0 (merged in #337). Part of the Phase 4 stroma adoption plan.