feat(embed): add Gemini Embedding 2 multimodal provider #365
- Add google-embed.ts with gemini-embedding-2-preview API client
- Support text, image, PDF, and interleaved multimodal embeddings
- Add GoogleHybridLLM: Gemini API for embeddings, local llama.cpp for reranking
- Extend collections to support image/PDF file patterns
- Add --provider and --dimensions CLI flags
- Add cross-modal search support in store
- Add google-embed tests

WIP: needs review and integration testing
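For reviewers unfamiliar with the request shape, here is a minimal sketch of how an embedding request body for text and image parts might be assembled. It is not the code in src/google-embed.ts. The helper names (`buildEmbedRequest`, `imagePart`) are hypothetical, and the field names (`taskType`, `outputDimensionality`, `inlineData`) are assumed to follow the public Gemini REST conventions. Only the model id, the two task types, and the three dimension values are taken from this PR.

```typescript
// Sketch only: helper names are hypothetical, not the PR's actual API.
type TaskType = "RETRIEVAL_DOCUMENT" | "RETRIEVAL_QUERY";

// A part is either plain text or base64-encoded binary data (image, PDF).
type Part =
  | { text: string }
  | { inlineData: { mimeType: string; data: string } };

interface EmbedRequest {
  model: string;
  content: { parts: Part[] };
  taskType: TaskType;
  outputDimensionality: number;
}

const MODEL_ID = "gemini-embedding-2-preview";
// The three output sizes listed in this PR; 3072 is the default.
const LISTED_DIMENSIONS: number[] = [3072, 1536, 768];

// Wrap already base64-encoded image bytes as an inline part.
function imagePart(base64Data: string, mimeType: "image/png" | "image/jpeg"): Part {
  return { inlineData: { mimeType: mimeType, data: base64Data } };
}

// Build the JSON body; interleaved multimodal input is simply several parts.
function buildEmbedRequest(
  parts: Part[],
  taskType: TaskType,
  dimensions: number = 3072
): EmbedRequest {
  if (parts.length === 0) {
    throw new Error("at least one part is required");
  }
  if (LISTED_DIMENSIONS.indexOf(dimensions) === -1) {
    throw new Error("unsupported dimensions: " + dimensions);
  }
  return {
    model: "models/" + MODEL_ID,
    content: { parts: parts },
    taskType: taskType,
    outputDimensionality: dimensions,
  };
}
```

Under these assumptions, indexing would pass `RETRIEVAL_DOCUMENT` and search would pass `RETRIEVAL_QUERY`, with the same `dimensions` value on both sides so that stored and query vectors stay comparable.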
…ider setting
Reverts the resolvedEmbedProvider fallback that caused dimension mismatch when GEMINI_API_KEY was set in the environment but tests used local 768d vectors.
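To illustrate the failure mode this revert addresses, here is a rough sketch of provider resolution that depends only on an explicit setting, paired with a dimension guard. Both function names are hypothetical and the PR's real logic may differ. The idea is that an API key present in the environment should not by itself switch providers, because the index may already hold 768d local vectors.

```typescript
// Sketch only: function names are hypothetical, not taken from the PR.
type Provider = "local" | "google";

// Resolve the provider from the explicit setting alone.
// Deliberately no fallback to "google" just because an API key exists.
function resolveProvider(explicit: string | undefined): Provider {
  if (explicit === undefined) {
    return "local";
  }
  if (explicit === "local" || explicit === "google") {
    return explicit;
  }
  throw new Error("unknown provider: " + explicit);
}

// Refuse to mix vectors of different sizes in one index.
function checkDimensions(indexDimensions: number, vector: readonly number[]): void {
  if (vector.length !== indexDimensions) {
    throw new Error(
      "dimension mismatch: index has " + indexDimensions +
      "d vectors, got " + vector.length + "d"
    );
  }
}
```

With a guard like this, a 3072d Gemini vector written into a 768d index fails loudly at insert time instead of surfacing later as a confusing search error.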
CREATE INDEX on content_type was running before ALTER TABLE added the column, crashing on existing databases without the column.
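The ordering constraint can be sketched as a small planning function: add the column if it is missing, and only then create the index. The table name `documents`, the index name, and the `TEXT` column type are assumptions for illustration. Only the `content_type` column comes from this PR.

```typescript
// Sketch only: table name, index name and column type are assumed.
// Returns the SQL statements to run, in order, for the content_type migration.
function planContentTypeMigration(existingColumns: readonly string[]): string[] {
  const statements: string[] = [];
  // Step 1: older databases lack the column, so add it first.
  if (existingColumns.indexOf("content_type") === -1) {
    statements.push("ALTER TABLE documents ADD COLUMN content_type TEXT");
  }
  // Step 2: the index is created last, once the column is guaranteed to exist.
  statements.push(
    "CREATE INDEX IF NOT EXISTS idx_documents_content_type ON documents(content_type)"
  );
  return statements;
}
```

On a fresh database the column is already part of the schema, so the plan reduces to the single idempotent `CREATE INDEX IF NOT EXISTS` statement.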
Quick cleanup pass landed on this branch. What changed since the initial draft:
Focused verification run on the updated branch covered:
There is still some node-llama/Vulkan fallback noise on this Linux host during local runtime setup, but the Google embedding path itself is working and the PR body has been updated to reflect the current branch accurately.
Production-readiness review complete
Rebased onto
Changes made
Test results
The only failing test file (
Branch
Pushed to:
How to use Gemini Embedding 2
Dimensions
Switching back to local
qmd embed --provider local --force
Summary
Adds Gemini Embedding 2 (gemini-embedding-2-preview) as a cloud embedding provider for QMD, including multimodal embedding support for text, images, and small PDFs.
What's new
- google: use qmd embed --provider google
- Images (png, jpg, jpeg)
- PDFs (<= 6 pages)
- Output dimensions:
  - 3072 (default)
  - 1536
  - 768
- Retry-After handling
- RETRIEVAL_DOCUMENT for indexing
- RETRIEVAL_QUERY for search/query embedding
Important behavior notes
- Existing local embeddings (embeddinggemma, 768d) are not compatible with Gemini's default 3072d vectors.
- Collections still default to Markdown patterns (**/*.md). This PR adds multimodal embedding support, but does not silently change all existing collections to index binaries by default.
Main implementation areas
- src/google-embed.ts
- src/llm.ts
- src/store.ts
- src/cli/qmd.ts
  - --provider
  - --dimensions
Bug fixes included
- Fixed content_type index creation order so fresh/existing DB migration paths don't crash
Testing performed
Verified on this branch with focused tests covering the changed behavior:
- test/google-embed.test.ts
- test/generate-embeddings.multimodal.test.ts
Specifically verified:
**/*.md
Known limitations / follow-ups