Threadpool merge scheduler #120869

albertzaharovits · 2025-01-26T13:59:14Z

This adds a new merge scheduler implementation that uses a (new) dedicated thread pool to run the merges. This way the number of concurrent merges is limited to the number of threads in the pool (i.e. the number of processors allocated to the ES JVM).

It implements dynamic IO throttling (the same target IO rate for all merges, roughly, with caveats) that's adjusted based on the number of currently active (queued + running) merges.
Smaller merges are always preferred to larger ones, irrespective of the index shard that they're coming from.
The implementation also supports the per-shard "max thread count" and "max merge count" settings, the latter being used today for indexing throttling.
Note that IO throttling, max merge count, and max thread count work similarly, but not identically, to their siblings in the ConcurrentMergeScheduler.
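
To illustrate the "smaller merges first, irrespective of shard" ordering described above, here is a minimal sketch. It assumes a queue keyed only on an estimated merge size; `SmallestFirstMergeQueue`, `MergeTask`, and `estimatedBytes` are hypothetical names for illustration and not the classes added by this PR.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical sketch; names are illustrative, not the PR's actual code.
final class SmallestFirstMergeQueue {
    // Stand-in for a merge task: the shard it came from and its estimated size.
    record MergeTask(String shardId, long estimatedBytes) {}

    // Orders tasks by estimated size only, so the originating shard plays no role.
    private final PriorityBlockingQueue<MergeTask> queue =
        new PriorityBlockingQueue<>(11, Comparator.comparingLong(MergeTask::estimatedBytes));

    void submit(MergeTask task) {
        queue.add(task);
    }

    // Returns the smallest queued task, or null when nothing is queued.
    MergeTask pollSmallest() {
        return queue.poll();
    }

    int queued() {
        return queue.size();
    }
}
```

With tasks of 500, 10, and 200 bytes submitted from two different shards, `pollSmallest()` hands them out in the order 10, 200, 500.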

The per-shard merge statistics are not affected, and the thread-pool statistics should reflect the merge ones (i.e. the thread pool's completed count reflects the total number of merges, across shards, per node).

henningandersen

Read through the main parts, left some comments, let me know if we need to discuss any of them.

henningandersen · 2025-03-07T06:39:31Z

server/src/main/java/org/elasticsearch/index/engine/ThreadPoolMergeExecutorService.java

+                            smallestMergeTask = null;
+                            // the merge task is backlogged by the merge scheduler, try to get the next smallest one
+                            // it's then the duty of the said merge scheduler to re-enqueue the backlogged merge task when it can be run
+                        } catch (InterruptedException e) {


If we only use interrupt at shutdown time, perhaps we should not loop then but rather exit? Or do we need to run a merge regardless? It seems like at shutdown we may never execute the runnable (it might be rejected), so it feels slightly inconsistent?

henningandersen · 2025-03-07T07:29:11Z

server/src/main/java/org/elasticsearch/index/engine/ThreadPoolMergeExecutorService.java

+        }
+    }
+
+    private void maybeUpdateIORateBytesPerSec(int currentlySubmittedIOThrottledMergeTasks) {


I think this method is more or less updateAndGet with a special case for when the calculation results in the same value (since then we do not want to update all threads).

I wonder if it would be clearer to have an updateAndGet method that is a copy of the original updateAndGet, with the if (prev == next) condition returning -1 or Long.MIN_VALUE as a sentinel value?

As it is now, this seems quite custom and thus hard to read vs something that is just an updateAndGet and then the function as separate parts.

Good point! I've pushed 507896e
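
For readers following the thread, the suggestion above (an `updateAndGet` copy that returns a sentinel when the value does not change) could look roughly like the sketch below. `SentinelAtomicRate`, `updateAndGetIfChanged`, and `UNCHANGED` are hypothetical names, and this is not the code pushed in 507896e. It assumes the update function never legitimately produces `Long.MIN_VALUE`.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongUnaryOperator;

// Hypothetical sketch; not the class added by the PR.
final class SentinelAtomicRate {
    // Sentinel returned when the computed value equals the current one.
    static final long UNCHANGED = Long.MIN_VALUE;

    private final AtomicLong value;

    SentinelAtomicRate(long initial) {
        this.value = new AtomicLong(initial);
    }

    long get() {
        return value.get();
    }

    // Same CAS loop as AtomicLong#updateAndGet, except that it returns UNCHANGED
    // (and writes nothing) when the function maps the current value to itself.
    long updateAndGetIfChanged(LongUnaryOperator updateFunction) {
        while (true) {
            long prev = value.get();
            long next = updateFunction.applyAsLong(prev);
            if (prev == next) {
                return UNCHANGED;
            }
            if (value.compareAndSet(prev, next)) {
                return next;
            }
        }
    }
}
```

A caller can then skip pushing the rate to the running merge threads whenever the result is `UNCHANGED`, which keeps the "compute the new rate" function separate from the "apply it" step.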

henningandersen · 2025-03-07T07:52:24Z

server/src/main/java/org/elasticsearch/index/engine/ThreadPoolMergeScheduler.java

+            }
+        }
+
+        void abortOnGoingMerge() {


The name is slightly confusing. I think this only works if the task did not start yet. earlyAbort or abort would seem more suitable to me.

Can we verify that mergeStartTimeNS is not set yet as an assertion? And maybe set it to ensure run is not called?

And document in javadoc, that we expect one or the other only, never both on the same task.

Addressed in 34ab7f6 (renamed to abort).
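
The "one or the other, never both" contract requested above can be sketched as follows. This is an illustration under assumptions: `OneShotMergeTask` is a hypothetical class, and an `AtomicBoolean` stands in for the "is `mergeStartTimeNS` already set" check; the PR's own task class may enforce this differently (for example with assertions).

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; names and fields are illustrative, not the PR's code.
final class OneShotMergeTask {
    // Stand-in for the "start time is set" check: flips once, on whichever call comes first.
    private final AtomicBoolean claimed = new AtomicBoolean(false);
    private volatile boolean aborted;

    /** Runs the merge. Expected only on a task that was never aborted and never run before. */
    void run() {
        claimOrFail("run");
        // ... the actual merge work would happen here ...
    }

    /** Aborts a task that has not started. Expected instead of run(), never in addition to it. */
    void abort() {
        claimOrFail("abort");
        aborted = true;
    }

    boolean isAborted() {
        return aborted;
    }

    private void claimOrFail(String caller) {
        if (claimed.compareAndSet(false, true) == false) {
            throw new IllegalStateException(caller + "() called on a task that was already run or aborted");
        }
    }
}
```

Calling `abort()` and then `run()` on the same task fails fast in this sketch, which is the behavior the review comment asks to verify.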

henningandersen · 2025-03-07T08:15:47Z

server/src/main/java/org/elasticsearch/index/engine/ThreadPoolMergeScheduler.java

+        if (closed) {
+            // Do not backlog or execute tasks when closing the merge scheduler, instead abort them.
+            mergeTask.abortOnGoingMerge();
+            throw new ElasticsearchException("merge task aborted because scheduler is shutting down");


I am somewhat in doubt about throwing here or returning true. I get that if we return true we should not call doMerge in abortOnGoingMerge when called from this callsite. I'd probably find it slightly more intuitive, since the exceptional termination here seems like a 3rd return value.

I've refactored the runNowOrBacklog method to a different schedule method, that indeed has 3 return values: a217d12.
Aborting and running merge tasks have to do some accounting and cleanup. Backlogging merge tasks doesn't need to do anything.
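
A three-valued result like the one described in this reply might be modeled as below. Only the type name `Schedule` appears in the thread (in the title of a217d12); the constants `ABORT`, `RUN`, and `BACKLOG`, the decision rule, and the helper names are assumptions made for illustration.

```java
// Hypothetical sketch; the enum constants and the decision rule are assumptions.
final class ScheduleSketch {
    enum Schedule { ABORT, RUN, BACKLOG }

    // Assumed rule: abort when the scheduler is closed, backlog when the
    // per-shard thread limit is reached, otherwise run now.
    static Schedule schedule(boolean schedulerClosed, int runningMerges, int maxThreadCount) {
        if (schedulerClosed) {
            return Schedule.ABORT;
        }
        if (runningMerges >= maxThreadCount) {
            return Schedule.BACKLOG;
        }
        return Schedule.RUN;
    }

    // Aborted and run tasks need accounting and cleanup; backlogged ones do not.
    static boolean needsCleanup(Schedule schedule) {
        switch (schedule) {
            case ABORT:
            case RUN:
                return true;
            case BACKLOG:
                return false;
            default:
                throw new AssertionError("unknown schedule: " + schedule);
        }
    }
}
```

Returning an enum instead of throwing keeps the shutdown case on the normal control-flow path, so the caller handles all three outcomes in one place.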

albertzaharovits and others added 30 commits January 16, 2025 17:27

ExecutorMergeScheduler

bf557d2

Merge branch 'main' into threadpool-merge-scheduler

a3f87df

[CI] Auto commit changes from spotless

f5a1a8d

wrap for merge in the executor merge scheduler

f0b72fe

spotless

9b03950

Merge branch 'main' into threadpool-merge-scheduler

26e4043

Fix InternalEngineTests

aba69d0

Merge branch 'main' into threadpool-merge-scheduler

52796b5

implemented Throttling

c0667bf

Merge branch 'main' into threadpool-merge-scheduler

2da753f

[CI] Auto commit changes from spotless

2c8dc7f

Checkstyle

81cc0f1

Fix threadpool size for SnapshotResiliencyTests

f58120f

Spotless

5ca992d

Nit

3c203cb

Implemented max thread setting

6c21654

Throttling ?

68079d9

Checkstyle

7b68ba9

Indexing throttling !

9e467a1

Better throttling logging

a8f5297

Merge branch 'main' into threadpool-merge-scheduler

928fd32

Don't wrap errors during merging

3f5b4a8

Merge branch 'main' into threadpool-merge-scheduler

0e714a1

Merge branch 'main' into threadpool-merge-scheduler

0297cce

Refresh config

2b79809

Nit

57c2a5c

WIP

60a71b8

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges

68db209

IO throttling

4099ac5

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges

5554bc2

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges-take-2

910289f

henningandersen reviewed Mar 7, 2025

albertzaharovits added 2 commits March 7, 2025 14:24

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges-take-2

669a87e

abortOnGoingMerge -> abort

34ab7f6

albertzaharovits added 7 commits March 7, 2025 18:45

Nit

669b349

nits

c2b0cce

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges-take-2

d03d808

currentlyRunningMergeTasks -> runningMergeTasks

1d7a94e

Merge branch 'main' into threadpool-merge-scheduler-sort-all-merges-take-2

d975a1c

nit

c35518a

ThreadPoolMergeScheduler.Schedule schedule = smallestMergeTask.schedule();

a217d12

[CI] Auto commit changes from spotless

1fba8a1

albertzaharovits added 2 commits March 9, 2025 14:24

currentlySubmittedIOThrottledMergeTasksCount -> ioThrottledMergeTasksCount

7ac3328

Fix ThreadPoolMergeExecutorServiceTests

fe7e9eb

[CI] Auto commit changes from spotless

171fc22

Enhance ThreadPoolMergeExecutorServiceTests post MergeTask#abort

79e6abf

Checkstyle

a25940d

AtomicIORate

507896e

albertzaharovits requested a review from henningandersen March 9, 2025 18:16

[CI] Auto commit changes from spotless

eb6279e
