
Conversation

@mayya-sharipova
Contributor

@mayya-sharipova mayya-sharipova commented Oct 24, 2025

Use IVF_PQ fallback for insufficient GPU memory

Add adaptive fallback to IVF_PQ algorithm when GPU memory is insufficient for
NN_DESCENT. Include distance type awareness to avoid Cosine distance with
IVF_PQ (unsupported in CUVS 25.10). Add TODO for CUVS 25.12+ upgrade.

Use IVF_PQ algorithm for GPU index building for large datasets (>= 5M vectors).
Temporarily add a factory for calculating IVF_PQ params (to be
removed with the CUVS 25.12+ upgrade).

Use IVF_PQ algorithm for GPU index building for large datasets (>= 1M vectors).
Temporarily add a factory for calculating IVF_PQ params.
Also skip the estimation of needed memory when IVF_PQ is used.
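The fallback described in the commits above can be sketched roughly as follows. This is a hypothetical illustration, not the PR's actual code: the class and method names, the memory-estimate formula, and the graph degree are all assumptions; only the NN_DESCENT-to-IVF_PQ switch on insufficient memory and the cosine restriction (IVF_PQ does not support cosine in CUVS 25.10) come from the PR description.

```java
// Hypothetical sketch of the adaptive fallback: prefer NN_DESCENT, but
// switch to IVF_PQ when the estimated working set exceeds available GPU
// memory -- unless the distance is cosine, which IVF_PQ does not support
// in CUVS 25.10.
public class GraphAlgoChooser {
    enum GraphAlgo { NN_DESCENT, IVF_PQ }
    enum Distance { EUCLIDEAN, DOT_PRODUCT, COSINE }

    // Rough (assumed) estimate: raw dataset plus the intermediate
    // NN_DESCENT graph of int neighbor ids, in bytes.
    static long estimateNnDescentBytes(long numVectors, int dims, int bytesPerDim, int graphDegree) {
        long dataset = numVectors * (long) dims * bytesPerDim;
        long graph = numVectors * (long) graphDegree * Integer.BYTES;
        return dataset + graph;
    }

    static GraphAlgo choose(long numVectors, int dims, int bytesPerDim, int graphDegree,
                            long freeGpuBytes, Distance distance) {
        long needed = estimateNnDescentBytes(numVectors, dims, bytesPerDim, graphDegree);
        if (needed <= freeGpuBytes || distance == Distance.COSINE) {
            // Cosine must stay on NN_DESCENT in CUVS 25.10.
            return GraphAlgo.NN_DESCENT;
        }
        return GraphAlgo.IVF_PQ;
    }

    public static void main(String[] args) {
        // 5.2M float32 vectors at 768 dims (~15 GB) on an 8 GB card:
        // falls back to IVF_PQ.
        System.out.println(choose(5_200_000L, 768, 4, 64, 8L << 30, Distance.DOT_PRODUCT));
    }
}
```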
@mayya-sharipova mayya-sharipova added >enhancement auto-backport Automatically create backport pull requests when merged :Search Relevance/Vectors Vector search v9.2.1 v9.3.0 labels Oct 24, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Oct 24, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @mayya-sharipova, I've created a changelog YAML for you.

@mayya-sharipova mayya-sharipova marked this pull request as draft October 24, 2025 19:17
@mayya-sharipova
Contributor Author

With these params (1M byte vectors):

@achirkin Notice how here, when switching from NN_DESCENT, we got worse graph building time and denser graphs.

gist: 1,000,000 docs; 960 dims; euclidean metric

| index_type | force_merge_time (ms) | QPS (1 seg) | recall (1 seg) |
|---|---|---|---|
| cpu | 130129 | 421 | 0.91 |
| gpu NN_DESCENT | 20643 | 467 | 0.92 |
| gpu IVF_PQ | 36536 | 149 | 1 |

@mayya-sharipova mayya-sharipova marked this pull request as ready for review October 26, 2025 20:21
@mayya-sharipova
Contributor Author

mayya-sharipova commented Oct 26, 2025

These are benchmarks on an NVIDIA GeForce RTX 4060 with 8 GB of memory, which before this PR could not force-merge to 1 segment because of insufficient GPU memory. With this PR:

openai: 2.6M docs; float32; 1536 dims; dot_product

| index_type | index_time (ms) | force_merge_time (ms) | QPS (1 seg) | recall (1 seg) |
|---|---|---|---|---|
| cpu | 638066 | 807401 | 139 | 0.99 |
| gpu | 121300 | 191283 | 140 | 1 |

hotpotqa-arctic: 5.2M docs; float32; 768 dims; dot_product

| index_type | index_time (ms) | force_merge_time (ms) | QPS (1 seg) | recall (1 seg) |
|---|---|---|---|---|
| cpu | 666102 | 1231002 | 430 | 0.69 |
| gpu | 238028 | 506233 | 137 | 0.95 |

```java
double kmeansTrainsetFraction = Math.clamp(1.0 / Math.sqrt(nRows * 1e-5), minKmeansTrainsetFraction, maxKmeansTrainsetFraction);

// Calculate number of probes based on number of lists
int nProbes = Math.round((float) (Math.sqrt(nLists) / 20.0 + 4.0));
```
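For context, the two formulas quoted above can be pulled into a small, self-contained factory sketch. The clamp bounds here (0.05 and 1.0) and the example inputs are assumptions for illustration; only the two formulas themselves come from the PR. The PR uses `Math.clamp` (JDK 21+); the sketch below uses `Math.min`/`Math.max` so it runs on older JDKs.

```java
// Hedged sketch of the temporary IVF_PQ params factory formulas.
public final class IvfPqParamsFactory {
    static final double MIN_KMEANS_TRAINSET_FRACTION = 0.05; // assumed bound
    static final double MAX_KMEANS_TRAINSET_FRACTION = 1.0;  // assumed bound

    static double kmeansTrainsetFraction(long nRows) {
        // Use a smaller k-means training subset as the dataset grows;
        // equivalent to Math.clamp(...) on JDK 21+.
        double f = 1.0 / Math.sqrt(nRows * 1e-5);
        return Math.min(MAX_KMEANS_TRAINSET_FRACTION,
                Math.max(MIN_KMEANS_TRAINSET_FRACTION, f));
    }

    static int nProbes(int nLists) {
        // More lists -> probe slightly more of them at search time.
        return Math.round((float) (Math.sqrt(nLists) / 20.0 + 4.0));
    }

    public static void main(String[] args) {
        System.out.println(kmeansTrainsetFraction(1_000_000L)); // ~0.316
        System.out.println(nProbes(1024));                      // 6
    }
}
```

For 1M rows the trainset fraction works out to about 0.316, i.e. k-means trains on roughly a third of the data, and it shrinks toward the lower bound as the dataset grows.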


In the cuVS version, this parameter is overridden by CAGRA after calling the IVF_PQ params constructor to:

```cpp
std::round(2 + std::sqrt(ivf_pq_params.build_params.n_lists) / 20 + ef_construction / 16);
```

The difference is rather small, though.
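To see how close the two formulas land, here is a small hypothetical comparison harness; the `ef_construction` value of 32 is an assumed example, and the formulas are transcribed from the snippets above.

```java
// Compare the PR's n_probes formula with the value CAGRA overrides it to.
public class NProbesComparison {
    static int prFormula(int nLists) {
        // round(sqrt(n_lists) / 20 + 4), as in the PR's params factory.
        return Math.round((float) (Math.sqrt(nLists) / 20.0 + 4.0));
    }

    static int cagraOverride(int nLists, int efConstruction) {
        // Mirrors std::round(2 + std::sqrt(n_lists) / 20 + ef_construction / 16);
        // note ef_construction / 16 is integer division in the C++ original.
        return (int) Math.round(2 + Math.sqrt(nLists) / 20.0 + efConstruction / 16);
    }

    public static void main(String[] args) {
        for (int nLists : new int[] {256, 1024, 4096}) {
            System.out.println(nLists + ": PR=" + prFormula(nLists)
                    + " CAGRA=" + cagraOverride(nLists, 32));
        }
    }
}
```

With a small `ef_construction`, the two formulas agree for typical list counts, which matches the observation that the difference is small.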

Contributor Author


Addressed in fec345f

Contributor

@ldematte ldematte left a comment


The params factory looks good, but I have reservations about the resource manager changes. Let's see if we can figure out a more robust/cleaner way!

@ldematte
Contributor

ldematte commented Nov 5, 2025

One good option could be to limit this PR to the changes to CagraIndexParams creation, and we can worry about how to include the fallback for datasets larger than total GPU memory later.

- Implement automatic switching to IVF_PQ algorithm when NN_DESCENT requires
  more GPU memory than available.
- Cache GPU total memory in the GPUSupport module
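The "cache GPU total memory" step in the commit above could look roughly like this. Everything here is a hypothetical sketch: the class name, and the `LongSupplier` standing in for whatever native call actually reads the device's total memory, are assumptions.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch: query total GPU memory once and cache it, since the
// value does not change over the life of the process.
public final class GpuMemoryCache {
    private final LongSupplier query; // stands in for the native GPU query
    private volatile long cached = -1;

    GpuMemoryCache(LongSupplier query) {
        this.query = query;
    }

    long totalGpuBytes() {
        long v = cached;
        if (v < 0) {
            synchronized (this) {
                if (cached < 0) {
                    cached = query.getAsLong(); // native query runs once
                }
                v = cached;
            }
        }
        return v;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        GpuMemoryCache cache = new GpuMemoryCache(() -> { calls[0]++; return 8L << 30; });
        System.out.println(cache.totalGpuBytes());
        System.out.println(cache.totalGpuBytes());
        System.out.println(calls[0]); // the supplier ran exactly once
    }
}
```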
@mayya-sharipova
Contributor Author

@ldematte Thanks for the review so far. I have updated the PR; it is ready for another round of review.

