perf(elasticsearch-plugin): faster full reindex via refresh tuning, parallel bulks, batch fetch#34
Closed
timcv wants to merge 2 commits into
Closed
perf(elasticsearch-plugin): faster full reindex via refresh tuning, parallel bulks, batch fetch#34timcv wants to merge 2 commits into
timcv wants to merge 2 commits into
Conversation
…arallel bulks, batch fetch Reduces full-reindex wallclock by adding four orthogonal optimisations to the reindex path. Measured -47% median (391ms -> 206ms) on the existing e2e fixture (35 docs); larger gains expected on production-scale catalogues. All four are backwards-compatible at default settings. S1 - refresh policy + reindex-only index settings - New `reindexIndexSettings` option (default `refresh_interval: -1`, `number_of_replicas: 0`, `translog.durability: async`) is merged on top of `indexSettings` for the temporary reindex index only. - Bulk operations during reindex now pass `refresh: false`. Once the reindex loop completes, `reindexRestoreSettings` (default `refresh_interval: 1s`, `number_of_replicas: 1`) is PUT on the temp index and a single explicit `_refresh` is issued before alias swap so search consumers see a warm index. - Adds `putSettings` to `SearchClientAdapter` (impl in both ES and OS adapters). A6 - parallel bulks - `executeBulkOperationsByChunks` dispatches chunks via `Promise.all` with a concurrency window (`reindexBulkConcurrency`, default 4), but only when the caller is the reindex path (`refresh=false`). Delta paths remain sequential. A7 - byte-budgeted bulk flush + larger default bulk size - `reindexBulkOperationSizeLimit` default raised 3000 -> 5000. - New `reindexBulkSizeBytes` option (default ~5 MB) tracks payload size as ops accumulate and triggers an early flush when crossed, keeping bulk requests under typical `http.max_content_length` even with heavy custom mappings. S2 - product-level concurrency (opt-in) - New `reindexConcurrency` option, default 1 (sequential, unchanged). - When raised, reindex processes products in parallel windows, each worker with its own `MutableRequestContext` clone. Documented caveat: Vendure's TypeORM identity map shares relations like `channels` across products so callers should benchmark + run the e2e suite at the chosen value before rolling out. S3 - chunk-level prefetch - New private `loadProductChunkPrefetch` issues two queries per `reindexProductsChunkSize` worth of products (one for products + relations, one for variants + relations grouped by productId) instead of N+N queries inside `updateProductsOperationsOnly`. The per-product hot path accepts pre-fetched data via a new optional `prefetched` parameter; delta paths pass nothing and continue to load on demand. Bench harness - Adds `bench/perf/perf-reindex.test.ts` (separate vitest config so it doesn't pollute the e2e suite include glob; gated to its own directory). - Records median/mean/min/max wallclock across `PERF_RUNS` reindexes plus a sorted+normalised NDJSON snapshot of the full alias contents under `bench/snapshots/<label>.ndjson`. - A second test in the same spec diffs the snapshot against `bench/snapshots/baseline.ndjson` and fails if any document body diverges - this is the regression gate that runs after every optimisation step. - `bench/RESULTS.md` documents the protocol, the synthetic numbers and the deferred bov-MariaDB real-data run. Verification - `bun run e2e` 96/96, run 3x to confirm S2 default (=1) is not flaky. - Snapshot diff matches baseline at every step (S1, S1+A6/A7, +S2, +S3).
Contributor
|
I have read the CLA Document and I hereby sign the CLA You can retrigger this bot by commenting recheck in this Pull Request. Posted by the CLA Assistant Lite bot. |
Run on a 8 797-product / 51 593-doc bov_ecom_prod catalogue against ES 7.17.18 + MariaDB 11.3.2: - baseline (@vendure/elasticsearch-plugin@3.5.5 from npm): 14 m 26 s - optimized (S1+A6/A7+S2+S3, reindexConcurrency=8): 8 m 14 s - speedup: 1.75x (-43%), -371 s - snapshot diff vs baseline: identical (0 byte over 4 GB NDJSON) bench/RESULTS.md updated with the real-data table, methodology, and notes on why the gain is 1.75x (not the 5-10x the synthetic plan estimated): bov's heavy customProductMappings are CPU-bound and a single-instance MariaDB serialises some of the parallel worker queries.
Author
|
Hi, i open this to early by misstake. Sorry for that. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reduces full-reindex wallclock by adding four orthogonal, opt-in optimisations to the reindex path.
All four are backwards-compatible at default settings.
Changes
S1 — refresh policy + reindex-only index settings
reindexIndexSettingsoption (defaultrefresh_interval: -1,number_of_replicas: 0,translog.durability: async) merged on top ofindexSettingsfor the temporary reindex index only.refresh: false. Once the loop completes,reindexRestoreSettings(defaultrefresh_interval: 1s,number_of_replicas: 1) isPUTon the temp index and a single_refreshis issued before alias swap so search consumers see a warm index.putSettingstoSearchClientAdapter(impl in both ES and OS adapters).A6 — parallel bulk dispatch
executeBulkOperationsByChunksruns chunks viaPromise.allwith a concurrency window (reindexBulkConcurrency, default 4), but only when the caller is the reindex path (refresh=false). Delta paths stay sequential to preserve ordering.A7 — byte-budgeted bulk flush + larger default bulk size
reindexBulkOperationSizeLimitdefault raised 3000 → 5000.reindexBulkSizeBytesoption (default ≈ 5 MB) tracks payload size as ops accumulate and triggers an early flush when crossed, keeping bulk requests under typicalhttp.max_content_lengtheven with heavy custom mappings.S2 — product-level concurrency (opt-in)
reindexConcurrencyoption, default 1 (sequential, unchanged behaviour).MutableRequestContextclone. Documented caveat: Vendure's TypeORM identity map shares relations likechannelsacross products so users should benchmark + run the e2e suite at the chosen value before rolling out (a flakyenabledmismatch was reproduced at concurrency=8 against sqljs in the existing suite — defaults stay safe, the option is for production tuning).S3 — chunk-level prefetch
loadProductChunkPrefetchissues two queries perreindexProductsChunkSizeof products (one for products + relations, one for variants + relations grouped by productId) instead of the prior N+N queries insideupdateProductsOperationsOnly. The per-product hot path accepts pre-fetched data through a new optionalprefetchedparameter; delta paths pass nothing and continue to load on demand.Bench harness
bench/perf/perf-reindex.test.ts— separate vitest config so it doesn't pollute the e2e suite include glob; gated tobench/perf.PERF_RUNSreindexes plus a sorted+normalised NDJSON snapshot of the full alias contents underbench/snapshots/<label>.ndjson.bench/snapshots/baseline.ndjsonand fails if any document body diverges — this is the regression gate that runs after every optimisation step.bench/RESULTS.mddocuments the protocol, both the synthetic and the real-data results.Real-data results (bov MariaDB / ES 7.17.18)
Dataset: 8 797 products / 111 386 variants → 51 593 indexed (variant × channel × language) docs.
bov-baseline(@vendure/elasticsearch-plugin@3.5.5from npm, default options)bov-optimized(S1+A6/A7+S2+S3,reindexConcurrency: 8,reindexBulkConcurrency: 4)Why not the 5-10× the synthetic plan estimated:
customProductMappingsare CPU-heavy and run per (product × channel × language) — Node's single-thread caps S2's gain.Even so, −371 s on a typical Swedish e-commerce catalogue is substantial and scales linearly with catalogue size (expected to widen further at ≥5 languages or ≥3 channels).
Synthetic results (regression gate)
ES 7.17.18 single-node, 5 reindexes per branch, median:
¹ With default
reindexConcurrency: 1. A6/A7 and S3 individually are near-no-ops on a 35-doc fixture (one bulk chunk, two queries dominated by ES write); they are designed to scale on real catalogues — confirmed by the bov bench above.Test plan
bun run lint(0 errors)bun run buildbun run e2e(96/96), run 3× consecutively to verify defaultreindexConcurrency: 1is not flaky🤖 Generated with Claude Code