Evaluate quantized vector storage (int8/binary) for snapshot size reduction #340

## Problem

Stroma snapshots store embeddings as `float32` blobs. On a modest corpus (~200 spec/doc records, ~3k chunks, dim=1536) the vectors alone come to roughly 18 MB per snapshot (3,000 × 1,536 × 4 bytes). Snapshots are content-addressed, so every non-trivial corpus change writes a fresh file.
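
For a rough sense of scale, the arithmetic can be sketched as below. This counts only the raw vector payload for the corpus figures quoted above; chunk text, metadata, and any per-vector scale factors are ignored, and `rawVectorBytes` is a hypothetical helper, not part of stroma.

```go
package main

import "fmt"

// rawVectorBytes returns the size of the vector payload alone for a given
// number of bits per component. Each vector is rounded up to whole bytes.
// Hypothetical helper for illustration; not a stroma API.
func rawVectorBytes(chunks, dim, bitsPerComponent int) int {
	bytesPerVector := (dim*bitsPerComponent + 7) / 8
	return chunks * bytesPerVector
}

func main() {
	const chunks, dim = 3000, 1536
	fmt.Println("float32:", rawVectorBytes(chunks, dim, 32)) // 18432000
	fmt.Println("int8:", rawVectorBytes(chunks, dim, 8))     // 4608000
	fmt.Println("binary:", rawVectorBytes(chunks, dim, 1))   // 576000
}
```

The real on-disk saving will be smaller than these ratios suggest, since the non-vector parts of a snapshot do not shrink.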

This starts to matter when:

- Users commit snapshot fixtures to repos (dogfooding, tests, reproducible CI).
- CI artifacts carry snapshots across jobs.
- Pre-commit hooks rebuild and discard many snapshot revisions per day.

## Proposal

Expose stroma v2's quantization options through `BuildOptions`:

- **int8 quantization**: 4× smaller vector payload (1 byte per component instead of 4), near-identical recall in typical conditions.
- **binary quantization**: 32× smaller vector payload (1 sign bit per component), paired with a full-precision rescore. More aggressive; worth measuring recall impact on spec/doc corpora specifically.
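
To make the two options concrete, here is a minimal sketch of the generic techniques: symmetric per-vector int8 scalar quantization, and 1-bit sign packing compared by Hamming distance. It is an assumption about how such encodings typically work, not a description of stroma's blob format; all function names are hypothetical.

```go
package main

import (
	"fmt"
	"math"
	"math/bits"
)

// quantizeInt8 maps each component into [-127, 127] using a single
// per-vector scale (symmetric scalar quantization).
func quantizeInt8(v []float32) (q []int8, scale float32) {
	var maxAbs float32
	for _, x := range v {
		if a := float32(math.Abs(float64(x))); a > maxAbs {
			maxAbs = a
		}
	}
	q = make([]int8, len(v))
	if maxAbs == 0 {
		return q, 0 // all-zero vector: nothing to scale
	}
	scale = maxAbs / 127
	for i, x := range v {
		q[i] = int8(math.Round(float64(x / scale)))
	}
	return q, scale
}

// packSigns stores one bit per component: 1 if the value is positive.
func packSigns(v []float32) []uint64 {
	out := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x > 0 {
			out[i/64] |= uint64(1) << (uint(i) % 64)
		}
	}
	return out
}

// hamming counts differing sign bits; a cheap prefilter score.
// Both slices must have the same length.
func hamming(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

func main() {
	v := []float32{0.5, -1.0, 0.25, 0}
	other := []float32{-0.5, -1.0, 0.25, 1}
	q, scale := quantizeInt8(v)
	fmt.Println("int8 codes:", q, "scale:", scale)
	fmt.Println("hamming distance:", hamming(packSigns(v), packSigns(other)))
}
```

In a binary setup, the Hamming prefilter would select candidates cheaply and the full-precision rescore would then reorder them.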

Configure via the `runtime` block (e.g. `runtime.quantization = "float32" | "int8" | "binary"`), default `float32` so nothing changes without opt-in.
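
One possible shape for the config surface is sketched below. The `Quantization` type and `ParseQuantization` function are hypothetical names for illustration; the only behavior taken from this proposal is the three accepted values and the `float32` default when the key is unset.

```go
package main

import "fmt"

// Quantization is a hypothetical enum for runtime.quantization.
type Quantization string

const (
	QuantFloat32 Quantization = "float32"
	QuantInt8    Quantization = "int8"
	QuantBinary  Quantization = "binary"
)

// ParseQuantization validates a config value. An empty string means the key
// was not set, so it falls back to float32 and nothing changes without opt-in.
func ParseQuantization(s string) (Quantization, error) {
	switch q := Quantization(s); q {
	case "":
		return QuantFloat32, nil
	case QuantFloat32, QuantInt8, QuantBinary:
		return q, nil
	default:
		return "", fmt.Errorf("runtime.quantization: unsupported value %q (want float32, int8, or binary)", s)
	}
}

func main() {
	for _, raw := range []string{"", "int8", "int4"} {
		q, err := ParseQuantization(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, q, err)
	}
}
```

Rejecting unknown values at config load keeps a typo from silently falling back to full precision.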

## Why this matters for Pituitary

- Makes snapshot-in-repo a realistic pattern: a `.stroma.db` whose vector payload is up to 32× smaller fits in a repo without bloating it.
- Enables snapshot-as-CI-artifact without paying bandwidth + storage taxes.
- Positions Pituitary for larger corpora (multi-repo governance, #173 cross-repo work) where full-precision vectors stop being free.

## Implementation notes

- Use `stroma/v2/store.{Encode,Decode}VectorBlob{,Int8,Binary}`.
- Binary quantization's rescore step adds a cosine-similarity pass at full dim — confirm it stays within the `SearchParams` latency budget on a representative corpus before defaulting.
- The embedder stays `float32` on the query side; quantization is a storage-and-prefilter concern.
- Verify the `reuse-probe` path works correctly across quantization changes: a quantization change should be treated like an embedder-fingerprint mismatch and force a rebuild.
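
One way to get the rebuild behavior in the last note is to fold the quantization mode into whatever identity the reuse check compares. The sketch below is an assumption about the approach, not the actual `reuse-probe` implementation; `snapshotIdentity`, `key`, and `canReuse` are hypothetical names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// snapshotIdentity is a hypothetical record of everything that must match
// before an existing snapshot's vectors can be reused.
type snapshotIdentity struct {
	EmbedderFingerprint string
	Dim                 int
	Quantization        string
}

// key hashes all identity fields together, so a change to any one of them
// (including the quantization mode) produces a different key.
func (s snapshotIdentity) key() string {
	raw := fmt.Sprintf("%s\x00%d\x00%s", s.EmbedderFingerprint, s.Dim, s.Quantization)
	sum := sha256.Sum256([]byte(raw))
	return hex.EncodeToString(sum[:])
}

// canReuse reports whether stored vectors are compatible with the wanted config.
func canReuse(existing, wanted snapshotIdentity) bool {
	return existing.key() == wanted.key()
}

func main() {
	stored := snapshotIdentity{EmbedderFingerprint: "embedder-a", Dim: 1536, Quantization: "float32"}
	wanted := stored
	wanted.Quantization = "int8"
	fmt.Println("same config reusable:", canReuse(stored, stored))
	fmt.Println("after quantization change:", canReuse(stored, wanted))
}
```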

## Acceptance criteria

- [ ] `runtime.quantization` config surface
- [ ] `int8` and `binary` paths exercised through rebuild → search end-to-end
- [ ] Benchmark: snapshot size reduction AND precision@k/recall@k delta on a representative corpus
- [ ] Quantization change triggers rebuild (not a reuse-compatible delta)
- [ ] Documentation recommending `int8` as the near-always-safe choice and calling out binary's rescore overhead
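
For the benchmark item, the recall delta could be measured by treating the `float32` top-k as ground truth and checking how much of it each quantized mode recovers. A minimal sketch, assuming result IDs are unique strings; `recallAtK` is a hypothetical helper.

```go
package main

import "fmt"

// recallAtK returns the fraction of the baseline top-k IDs that also appear
// in the candidate top-k. IDs are assumed to be unique within each list.
func recallAtK(baseline, candidate []string, k int) float64 {
	if k > len(baseline) {
		k = len(baseline)
	}
	if k <= 0 {
		return 0
	}
	want := make(map[string]struct{}, k)
	for _, id := range baseline[:k] {
		want[id] = struct{}{}
	}
	limit := k
	if limit > len(candidate) {
		limit = len(candidate)
	}
	hits := 0
	for _, id := range candidate[:limit] {
		if _, ok := want[id]; ok {
			hits++
		}
	}
	return float64(hits) / float64(k)
}

func main() {
	float32Top := []string{"a", "b", "c", "d"} // baseline ranking
	int8Top := []string{"a", "c", "x", "b"}    // ranking from a quantized index
	fmt.Printf("recall@4 = %.2f\n", recallAtK(float32Top, int8Top, 4)) // 0.75
}
```

Averaging this over a fixed query set for each mode gives the delta to report alongside the snapshot size.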

## Context

Unlocked by stroma v2.0.0 (merged in #337). Part of the Phase 4 stroma adoption plan.
