feat: add meilisearch search plugin #14

Open
Ryrahul wants to merge 7 commits into vendurehq:main from Ryrahul:feat/add-meilisearch-plugin

@Ryrahul

@Ryrahul Ryrahul commented Mar 10, 2026

Add Meilisearch-powered search plugin as a drop-in replacement for the default search. Supports full-text search with typo tolerance, synonyms, stop words, AI hybrid search (semantic + keyword) via OpenAI/HuggingFace/Ollama/REST embedders, faceted search, price range filtering, similar document recommendations, and custom product/variant field mappings.

Also adds mock data and updated populate script for dev server testing.

@Ryrahul Ryrahul marked this pull request as draft March 11, 2026 06:05
@Ryrahul
Author

Ryrahul commented Mar 11, 2026

E2E tests are still outstanding.

@Ryrahul Ryrahul marked this pull request as ready for review April 3, 2026 18:41
@vendure-developer-hub

vendure-developer-hub Bot commented Apr 3, 2026

Community Plugins: View preview

d8dfa21

@grolmus

grolmus commented May 11, 2026

@michaelbromley — pulled this branch locally to get a feel for the scope. The plugin itself is in good shape on the keyword-search side, but the AI/hybrid-search surface is non-trivial and I'd like your call on it before we go further. Summary below; happy to take any direction.

(Separately, the PR also needs a rebase — dev-server/dev-config.ts conflicts with main — and the build is failing on a single lint error in meilisearch.service.ts:608. Both are quick fixes and orthogonal to the scope question.)

cc @Ryrahul — thanks for the very thorough work here. Not asking for changes yet; the comments below are mostly questions for the maintainers about how broad the AI surface should be in a v1 community plugin.


AI/embedder surface — what's actually in the PR

The plugin exposes a configurable embedder layer that forwards to one of five sources:

source: 'openAi' | 'huggingFace' | 'ollama' | 'rest' | 'userProvided'

Per-embedder config (in AiSearchConfig.embedders: Record<string, EmbedderConfig>):

  • model, dimensions, documentTemplate (Liquid), documentTemplateMaxBytes
  • apiKey (required for OpenAI; optional for others)
  • url (required for ollama and rest)
  • request / response / headers (free-form Record<string, any> for rest)
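
For readers who have not opened options.ts, here is a rough sketch of the config shape as summarised above. The names `EmbedderConfigSketch` and `validateEmbedder` are hypothetical, the field typings are assumptions, and the real interfaces in the PR may differ; the only rules encoded are the two "required" notes from the list.

```typescript
// Hypothetical reconstruction of the per-embedder config described above.
// The actual EmbedderConfig in options.ts may use different names or types.
type EmbedderSource = 'openAi' | 'huggingFace' | 'ollama' | 'rest' | 'userProvided';

interface EmbedderConfigSketch {
    source: EmbedderSource;
    model?: string;
    dimensions?: number;
    documentTemplate?: string; // Liquid template
    documentTemplateMaxBytes?: number;
    apiKey?: string; // required for openAi, optional otherwise
    url?: string; // required for ollama and rest
    request?: Record<string, any>; // free-form, rest only
    response?: Record<string, any>; // free-form, rest only
    headers?: Record<string, string>; // typing is an assumption
}

// Returns a list of problems; an empty list means the two "required"
// rules from the summary are satisfied.
function validateEmbedder(name: string, cfg: EmbedderConfigSketch): string[] {
    const problems: string[] = [];
    if (cfg.source === 'openAi' && !cfg.apiKey) {
        problems.push(`${name}: apiKey is required for openAi`);
    }
    if ((cfg.source === 'ollama' || cfg.source === 'rest') && !cfg.url) {
        problems.push(`${name}: url is required for ${cfg.source}`);
    }
    return problems;
}
```

A check like this at plugin bootstrap would surface a missing key or URL before anything is forwarded to Meilisearch; whether the PR already does something similar is not something I verified.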

Two consumer-facing capabilities:

  1. Hybrid search — every regular search() call gains a semanticRatio (0.0 keyword-only → 1.0 semantic-only, default 0.5). Pushed straight to Meilisearch's hybrid endpoint.
  2. similarDocuments API — a "More like this / customers also viewed" recommender that takes a document id and runs index.searchSimilarDocuments({ id, embedder, ... }).

Important: we don't call OpenAI/HF/Ollama ourselves. Meilisearch is the integration point; credentials are forwarded to Meilisearch which calls the upstream provider. The plugin's job is to pass config through and shape request/response.
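
To make the passthrough concrete, this is roughly what "pushed straight to Meilisearch's hybrid endpoint" amounts to. It is a sketch, not the PR's code: `buildSearchParams` and `SearchParamsSketch` are hypothetical names, and the clamping of the ratio is my own assumption. The `hybrid: { embedder, semanticRatio }` block is the shape Meilisearch's search parameters accept.

```typescript
// The hybrid block as accepted by Meilisearch's search parameters.
interface HybridParams {
    embedder: string;
    semanticRatio: number;
}

// Hypothetical, trimmed-down search params for illustration only.
interface SearchParamsSketch {
    q: string;
    hybrid?: HybridParams;
}

// Builds search params. An undefined `embedder` stands for "AI not
// configured", in which case the request stays keyword-only.
function buildSearchParams(
    q: string,
    embedder?: string,
    semanticRatio: number = 0.5,
): SearchParamsSketch {
    const params: SearchParamsSketch = { q };
    if (embedder) {
        // Assumption: clamp to [0, 1] so 0 is keyword-only and 1 is semantic-only.
        const ratio = Math.min(1, Math.max(0, semanticRatio));
        params.hybrid = { embedder, semanticRatio: ratio };
    }
    return params;
}
```

With no embedder the result is just `{ q }`, which matches the opt-in behaviour described further down.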

Things to weigh

  1. Blast radius is bounded. Because the plugin doesn't hold the AI vendor surface, an API change at OpenAI/HF lands as a Meilisearch issue, not ours. We're not vendor-locked into any single embedder.

  2. Secret handling is the user's problem. apiKey: string is plaintext-forwarded to Meilisearch. No envelope encryption, no rotation hook, no secret-store integration. README would need to be unambiguous: load from env or secret manager, rotate on Meilisearch restart. Standard, worth saying explicitly.

  3. rest source is an open passthrough. request: Record<string, any> and headers: Record<string, string> mean any HTTP API can be wired. From our side it's fine, but the security surface (e.g. pointing it at an internal service with weak auth) is on the user.

  4. No usage-cost guardrails. Embeddings are generated for every indexed document; hybrid search runs on every query. For a large catalog with text-embedding-3-large this is real money. The plugin has no throttling, batching config, or "AI off in dev" switch beyond "don't configure ai.embedders." Suggest a doc warning with rough cost-per-1k-docs, and possibly an env-aware default.

  5. No telemetry / audit hooks. Users who want to log "which queries used semantic, what semanticRatio was applied, how often does the keyword fallback fire" have to wrap the resolver themselves. Probably acceptable for a v1; a richer SearchEvent payload exposing AI params would be a small addition later.

  6. Fallback on embedder failure is silent. In meilisearch.service.ts around line 218, when Meilisearch can't find the configured embedder during a search (e.g. embedder settings absent during a reindex-swap window), the plugin retries without hybrid. A warning is logged; no metric. So a misconfiguration silently degrades AI search to keyword and you only see it in logs.

  7. similarDocuments is a soft no-op when AI is unconfigured. It logs a warning and returns { items: [], totalItems: 0 } rather than throwing. If a frontend ships a "Similar products" carousel and AI config is misconfigured in prod, the carousel silently renders empty. Probably should throw or return an error result type a frontend can branch on.

  8. Maintenance surface. No new transitive deps beyond meilisearch (^0.55.0). Config surface in options.ts is 811 lines, ~150 of which are AI/embedder. Every Meilisearch release touching the embedder API may need a tracking update — bounded but real.

  9. No tests covering the AI paths. e2e/meilisearch-plugin.e2e-spec.ts (749 lines) covers indexing + full-text but not hybrid search or similarDocuments. Real gap — hybrid is the main new capability and there's no regression guard. Doable without external network using the userProvided embedder mode.

  10. AI is fully opt-in. Users who don't configure ai.embedders get a regular Meilisearch plugin; isAiSearchEnabled gates every AI-touching path. So the maintenance burden is bounded — if we ever want to remove AI, the keyword side keeps working.
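
On point 6, one possible way to keep the keyword fallback but make it visible to callers. This is a sketch under assumptions: `searchWithVisibleFallback`, `run`, `isEmbedderError` and `onFallback` are hypothetical names, and `run` stands in for the real Meilisearch call so the example works without a server.

```typescript
interface SearchOutcome<T> {
    result: T;
    // True when the hybrid attempt failed and the keyword retry was used.
    degradedToKeyword: boolean;
}

// Hypothetical stand-in for the actual search call; `useHybrid` controls
// whether the hybrid block is included in the request.
type SearchFn<T> = (useHybrid: boolean) => Promise<T>;

async function searchWithVisibleFallback<T>(
    run: SearchFn<T>,
    isEmbedderError: (e: unknown) => boolean,
    onFallback: (e: unknown) => void,
): Promise<SearchOutcome<T>> {
    try {
        return { result: await run(true), degradedToKeyword: false };
    } catch (e) {
        // Anything that is not an embedder problem is rethrown untouched.
        if (!isEmbedderError(e)) {
            throw e;
        }
        // Hook for a counter, event, or log line, so the degradation is
        // observable somewhere other than a warning in the logs.
        onFallback(e);
        return { result: await run(false), degradedToKeyword: true };
    }
}
```

The `degradedToKeyword` flag could feed the richer SearchEvent payload mentioned in point 5, if we go that way.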

My take

The AI bit is a thin passthrough to Meilisearch's native vector capability, not a homegrown embedding stack. Risk-wise it's defensible. What I'd want before merge:

  • A userProvided e2e exercising the hybrid path end-to-end (no external network).
  • README section explicit about secret handling, cost, and the userProvided escape hatch for testing.
  • similarDocuments should error or return a result type a frontend can branch on, not silently return empty.
  • Decision on whether AI lives in the same plugin or a sibling @vendure-community/meilisearch-ai-plugin. Either is defensible.
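
For the similarDocuments point, this is the kind of result type I have in mind. It is only a sketch with hypothetical names (`SimilarDocumentsResult`, `similarDocumentsSketch`, `shouldRenderCarousel`); the PR's GraphQL schema would need its own union or error type.

```typescript
interface SimilarItem {
    productId: string;
}

// A discriminated union so that "AI not configured" is distinguishable
// from "no similar products found".
type SimilarDocumentsResult =
    | { kind: 'ok'; items: SimilarItem[]; totalItems: number }
    | { kind: 'aiNotConfigured'; message: string };

// `lookup` is a hypothetical stand-in for the Meilisearch similar-documents call.
function similarDocumentsSketch(
    aiEnabled: boolean,
    lookup: () => SimilarItem[],
): SimilarDocumentsResult {
    if (!aiEnabled) {
        return { kind: 'aiNotConfigured', message: 'No AI embedder is configured' };
    }
    const items = lookup();
    return { kind: 'ok', items, totalItems: items.length };
}

// Frontend-side branching: hide the carousel when there is nothing to
// show, but treat a missing AI config as something to report.
function shouldRenderCarousel(
    result: SimilarDocumentsResult,
    report: (message: string) => void,
): boolean {
    if (result.kind === 'aiNotConfigured') {
        report(result.message);
        return false;
    }
    return result.items.length > 0;
}
```

The empty-but-healthy case and the misconfigured case both hide the carousel, but only the second one triggers `report`.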

Curious what you think on scope — keep AI in v1, split it out, or trim it down to just hybrid search and drop similarDocuments for a later release?

@michaelbromley
Member

Hey @Ryrahul - before we proceed here, can you give me some background around your use of this plugin and the degree to which you or your company intend to maintain it?

@Ryrahul
Author

Ryrahul commented May 11, 2026

> Curious what you think on scope: keep AI in v1, split it out, or trim it down to just hybrid search and drop similarDocuments for a later release?

Yeah, I think separating the AI/vector functionality into an optional companion plugin could make sense for the initial v1.
Something like a dedicated meilisearch-ai layer also gives us room to iterate independently on embeddings, hybrid search, recommendations, provider support, telemetry, etc. without increasing the surface area of the base plugin too early.

The core Meilisearch plugin already provides solid value with indexing + keyword/full-text search, while the AI side introduces a much broader surface area (embedders, provider configs, hybrid ranking, recommendations, cost/ops concerns, etc.).

Keeping those concerns isolated in something like a meilisearch-ai plugin might make the maintenance and adoption story cleaner, while still allowing the current architecture and AI work to be reused almost as-is.

@Ryrahul
Author

Ryrahul commented May 11, 2026

> Hey @Ryrahul - before we proceed here, can you give me some background around your use of this plugin and the degree to which you or your company intend to maintain it?

Hey @michaelbromley, we do use this plugin in one of our core client projects, which is partly why I ended up investing quite a bit into extending it.

I can’t really speak on behalf of the company regarding long-term ownership commitments, but personally I’m happy to continue maintaining and contributing to it going forward. Even outside of immediate project needs, I’d still be able to spare time for fixes, compatibility updates, and maintenance around the plugin itself.

Ryrahul added 7 commits May 12, 2026 00:50
Add Meilisearch-powered search plugin as a drop-in replacement for the
default search. Supports full-text search with typo tolerance, synonyms,
stop words, AI hybrid search (semantic + keyword) via OpenAI/HuggingFace/
Ollama/REST embedders, faceted search, price range filtering, similar
document recommendations, and custom product/variant field mappings.

Also adds mock data and updated populate script for dev server testing.
- Revert dev-server config to match main, keep only commented meilisearch entries
- Change output dir from dist/ to lib/ to match other plugins
- Align peerDependencies and devDependencies versions
- Add provenance, rimraf, typescript to match elasticsearch-plugin
- Add CHANGELOG.md for initial 1.0.0 release
- Register meilisearch-plugin in docs pipeline and generate docs
- Update .gitignore to match repo convention
- Add comprehensive e2e test suite (51 tests) mirroring elasticsearch plugin
- Fix index swap to not pass rename:false which prevented settings (embedders)
  from being carried over during reindex
- Add graceful fallback to keyword search when AI embedder is temporarily
  unavailable during reindex swap window
- Fix groupByProduct totalItems using facetDistribution for accurate counts
- Add collectionIds/collectionSlugs (plural) filter support
- Fix grouped facetValues/collections to use product-level fields with distinct
- Add e2e/watch scripts to package.json
These files were added for local testing but are not needed by the plugin.
The dev-server uses the punchout-gateway fixture data from main.
Remove all AI/embedder/hybrid-search/similarDocuments code to ship a
clean keyword-search-only Meilisearch plugin for v1. The AI layer can
be added back later as a companion plugin.

Removed:
- EmbedderConfig and AiSearchConfig interfaces from options
- Hybrid search params injection and fallback logic from service
- similarDocuments method, resolver, and GraphQL schema
- AI embedder configuration from index setup
- All AI-related exports, docs, and README sections
… fields

- Synonyms are now automatically expanded to be bidirectional so users
  only need to define e.g. laptop: ['notebook'] and the reverse mapping
  is generated automatically.
- Add formattedProductName and formattedDescription fields to SearchResult
  to expose Meilisearch highlight/crop data to frontends.
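
As an illustration of the bidirectional synonym expansion described in the last commit message, here is a minimal sketch. `expandSynonyms` is a hypothetical name and this is not the plugin's actual implementation; in particular it only links each key with its listed alternatives and does not link the alternatives to each other, which may or may not match what the plugin does.

```typescript
type Synonyms = Record<string, string[]>;

// Adds the reverse mapping for every term -> alternative pair, so that
// { laptop: ['notebook'] } also yields notebook -> laptop.
function expandSynonyms(input: Synonyms): Synonyms {
    const out: Synonyms = {};
    const link = (from: string, to: string): void => {
        if (from === to) {
            return;
        }
        const list = out[from] || (out[from] = []);
        // Avoid duplicates when the reverse mapping was already defined.
        if (list.indexOf(to) === -1) {
            list.push(to);
        }
    };
    for (const term of Object.keys(input)) {
        for (const alternative of input[term]) {
            link(term, alternative);
            link(alternative, term);
        }
    }
    return out;
}
```

For example, `expandSynonyms({ laptop: ['notebook'] })` produces a `laptop` entry listing `notebook` and a `notebook` entry listing `laptop`.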
@Ryrahul Ryrahul force-pushed the feat/add-meilisearch-plugin branch from e586862 to d8dfa21 Compare May 11, 2026 20:03