Configure final reduce phase threads for heavy aggregation functions #14662
pinot-core/src/main/java/org/apache/pinot/core/data/table/IndexedTable.java
Object[] values = record.getValues();
for (int i = 0; i < numAggregationFunctions; i++) {
  int colId = i + _numKeyColumns;
  values[colId] = _aggregationFunctions[i].extractFinalResult(values[colId]);
I think it'd make sense to either:
- put an upper limit on _numThreadsForFinalReduce (e.g. 2 or 3 * Runtime.getRuntime().availableProcessors()), or
- change the variable to a boolean flag, enableParallelFinalReduce, and use a sensible number of tasks

to prevent an excessive number of futures or various error modes, e.g. if _numThreadsForFinalReduce is Integer.MAX_VALUE then chunkSize is going to be negative.
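To make the concern concrete, here is a minimal sketch of how an unbounded thread count can break chunk sizing and how a cap avoids it. The class and method names are hypothetical, and the ceiling-division formula is an assumption, since the PR's actual chunkSize computation is not shown in this thread. With this particular formula the intermediate sum wraps around and the result is a degenerate, non-positive chunk size; the exact failure depends on the real formula.

```java
final class FinalReduceThreads {
  // Hypothetical helper: clamp a user-supplied thread count to [1, 2 * cores].
  static int clampThreads(int requested, int availableProcessors) {
    int upperBound = Math.max(1, 2 * availableProcessors);
    return Math.max(1, Math.min(requested, upperBound));
  }

  // Assumed ceiling division in int arithmetic; the sum overflows for a huge numThreads.
  static int naiveChunkSize(int numRecords, int numThreads) {
    return (numRecords + numThreads - 1) / numThreads;
  }

  // Same ceiling division, but the sum is computed as a long so it cannot wrap around.
  static int safeChunkSize(int numRecords, int numThreads) {
    return (int) (((long) numRecords + numThreads - 1) / numThreads);
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    int threads = clampThreads(Integer.MAX_VALUE, cores);
    System.out.println("naive chunk size:   " + naiveChunkSize(100, Integer.MAX_VALUE));
    System.out.println("clamped threads:    " + threads);
    System.out.println("safe chunk size:    " + safeChunkSize(100, threads));
  }
}
```

Clamping first and dividing second keeps both the number of futures and the chunk size in a sane range, whichever upper bound is eventually chosen.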
If the shared thread pool is overwhelmed by running tasks, it might be good to use the current thread not only to wait but also to process tasks, stealing tasks until there's nothing left and only then waiting for the futures to finish.
> If shared thread pool is overwhelmed by running tasks it might be good to use current thread not only to wait but also task processing, stealing tasks until there's nothing left and only then waiting for futures to finish.
Potentially, and this can be done transparently by configuring the executor's rejected execution handler to CallerRunsPolicy. However, beware: if the executor, which does non-blocking work, is sized to the number of available processors, then an overwhelmed thread pool means the available CPUs are overwhelmed too. Performing reductions on the caller thread would only lead to excessive context switching, and it might be better, from a global perspective, for the task to wait for capacity to become available.
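For reference, a minimal sketch of the CallerRunsPolicy behavior mentioned above, using only the standard java.util.concurrent API. The pool size, queue capacity, and class name are illustrative assumptions and not Pinot's executor configuration. A single worker is kept busy and the one-slot queue is filled, so the third task is rejected and runs on the submitting thread.

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class CallerRunsDemo {
  // Returns the names of the threads that ran each task, in completion order.
  static List<String> runOnSaturatedPool() {
    ThreadPoolExecutor executor = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS,
        new ArrayBlockingQueue<>(1),                  // bounded queue so the pool can saturate
        new ThreadPoolExecutor.CallerRunsPolicy());   // rejected tasks run on the submitter
    List<String> ranOn = new CopyOnWriteArrayList<>();
    CountDownLatch release = new CountDownLatch(1);
    Runnable blockedTask = () -> {
      try {
        release.await();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      ranOn.add(Thread.currentThread().getName());
    };
    try {
      executor.execute(blockedTask);  // occupies the only worker
      executor.execute(blockedTask);  // waits in the queue
      // Pool and queue are full: this task is rejected and runs on the calling thread.
      executor.execute(() -> ranOn.add(Thread.currentThread().getName()));
      release.countDown();
      executor.shutdown();
      if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return ranOn;
  }

  public static void main(String[] args) {
    System.out.println(runOnSaturatedPool());
  }
}
```

The caveat from the comment still applies: when the pool is already sized to the CPU count, running the rejected task on the caller adds load rather than capacity.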
@@ -232,6 +232,12 @@ public static Integer getGroupTrimThreshold(Map<String, String> queryOptions) {
    return uncheckedParseInt(QueryOptionKey.GROUP_TRIM_THRESHOLD, groupByTrimThreshold);
  }
Would it be possible to show in the explain output that the final reduce is parallelized?
pinot-core/src/main/java/org/apache/pinot/core/data/table/IndexedTable.java
Can we do this automatically if the number of keys > X and for specific aggregation functions like funnel, etc.?
1f6e6b6 to 30d28c3
Put some heuristic logic here.
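The heuristic itself is not visible in this thread, so the following is only a sketch of what such logic could look like: parallelize when the final reduce is expensive and there are enough groups to justify it. The class, method, and parameter names are hypothetical, and the 10,000-group threshold is an assumption borrowed from the DEFAULT_PARALLEL_CHUNK_SIZE_FOR_FINAL_REDUCE constant that appears later in this review.

```java
final class FinalReduceHeuristic {
  // Assumed threshold: minimum number of groups worth handing to one extra thread.
  static final int PARALLEL_CHUNK_SIZE = 10_000;

  // Hypothetical heuristic: one thread per PARALLEL_CHUNK_SIZE groups, capped at maxThreads.
  static int chooseNumThreads(int numGroups, boolean hasExpensiveFinalReduce, int maxThreads) {
    if (!hasExpensiveFinalReduce || numGroups <= PARALLEL_CHUNK_SIZE) {
      return 1;  // cheap functions or few groups: stay single-threaded
    }
    long threadsByWork = ((long) numGroups + PARALLEL_CHUNK_SIZE - 1) / PARALLEL_CHUNK_SIZE;
    return (int) Math.max(1, Math.min(threadsByWork, maxThreads));
  }

  public static void main(String[] args) {
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println(chooseNumThreads(25_000, true, cores));
  }
}
```

An explicit query option would presumably take precedence over any such automatic choice.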
51be961 to 36dbcca
f189992 to fcb643d
fb0b6f8 to ceac8f5
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #14662      +/-   ##
============================================
+ Coverage     61.75%   63.73%   +1.98%
- Complexity      207     1469    +1262
============================================
  Files          2436     2708     +272
  Lines        133233   151490   +18257
  Branches      20636    23389    +2753
============================================
+ Hits          82274    96551   +14277
- Misses        44911    47683    +2772
- Partials       6048     7256    +1208
/**
 * Base implementation of Map-based Table for indexed lookup
 */
@SuppressWarnings({"rawtypes", "unchecked"})
public abstract class IndexedTable extends BaseTable {
  private static final int THREAD_POOL_SIZE = Math.max(Runtime.getRuntime().availableProcessors(), 1);
(minor) Some constants are available in ResourceManager
This name is also confusing. It seems this is the upper bound when _numThreadsForServerFinalReduce is not configured. Why not use the same upper bound?
True, reusing QueryMultiThreadingUtils.MAX_NUM_THREADS_PER_QUERY
ceac8f5 to 4d67c0b
LGTM with minor comments
@@ -84,6 +94,10 @@ protected IndexedTable(DataSchema dataSchema, boolean hasFinalInput, QueryContex
    assert _hasOrderBy || (trimSize == Integer.MAX_VALUE && trimThreshold == Integer.MAX_VALUE);
    _trimSize = trimSize;
    _trimThreshold = trimThreshold;
    // NOTE: The upper limit of threads number for final reduce is set to 2 * number of available processors by default
    _numThreadsExtractFinalResult = Math.min(queryContext.getNumThreadsExtractFinalResult(),
        Math.max(1, 2 * Runtime.getRuntime().availableProcessors()));
We should probably cap it at the number of CPU cores because this is a CPU-heavy operation.
for (int threadId = 0; threadId < numThreadsExtractFinalResult; threadId++) {
  int startIdx = threadId * chunkSize;
  int endIdx = Math.min(startIdx + chunkSize, topRecordsList.size());
  if (startIdx < endIdx) {
Is this always true?
Not always; it does not hold in the test with a very small segment.
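A small worked example of why the guard matters, assuming chunkSize is computed by ceiling division (the PR's actual formula is not shown here, and the class name is hypothetical). With 5 records and 4 threads, chunkSize is 2, so the last thread would start at index 6, past the end of the list.

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkBounds {
  // Returns the [startIdx, endIdx) range each thread would process, skipping empty ranges.
  static List<int[]> chunks(int numRecords, int numThreads) {
    int chunkSize = (numRecords + numThreads - 1) / numThreads;  // assumed ceiling division
    List<int[]> ranges = new ArrayList<>();
    for (int threadId = 0; threadId < numThreads; threadId++) {
      int startIdx = threadId * chunkSize;
      int endIdx = Math.min(startIdx + chunkSize, numRecords);
      if (startIdx < endIdx) {  // false once earlier chunks have covered every record
        ranges.add(new int[]{startIdx, endIdx});
      }
    }
    return ranges;
  }

  public static void main(String[] args) {
    for (int[] range : chunks(5, 4)) {
      System.out.println("[" + range[0] + ", " + range[1] + ")");
    }
  }
}
```

Without the check, the trailing thread would be handed a range whose start exceeds its end, which is exactly the small-segment case described in the reply.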
public static final int DEFAULT_NUM_THREADS_FOR_FINAL_REDUCE = 1;
public static final int DEFAULT_PARALLEL_CHUNK_SIZE_FOR_FINAL_REDUCE = 10_000;
Rename them
done
4d67c0b to bcb01ca
…pache#14662)
* Configure final reduce phase threads for heavy aggregation functions
* Address comments
* Add tests with numThreadsForFinalReduce
Add a new query option, numThreadsForFinalReduce, to allow customizing the number of threads per aggregate/reduce call. This will significantly reduce the execution time of group-by aggregations that have many groups and a costly per-group final reduce, as with funnel functions.
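As a rough illustration of the idea, the sketch below applies a final-result function to one value column of every record, split across a fixed number of threads. It is not the PR's implementation: the class and method names are hypothetical, a dedicated pool is created here only to keep the example self-contained (the review discussion refers to a shared pool), and the UnaryOperator stands in for an aggregation function's extractFinalResult.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

final class ParallelFinalReduce {
  // Replaces records[i][valueColumn] with extractFinal(records[i][valueColumn]) for every record.
  static void finalizeInPlace(List<Object[]> records, int valueColumn,
      UnaryOperator<Object> extractFinal, int numThreads) {
    ExecutorService executor = Executors.newFixedThreadPool(numThreads);
    try {
      int chunkSize = (records.size() + numThreads - 1) / numThreads;
      List<Future<?>> futures = new ArrayList<>();
      for (int threadId = 0; threadId < numThreads; threadId++) {
        int start = threadId * chunkSize;
        int end = Math.min(start + chunkSize, records.size());
        if (start >= end) {
          continue;  // nothing left for this thread
        }
        // Each task owns a disjoint index range, so no synchronization is needed.
        futures.add(executor.submit(() -> {
          for (int i = start; i < end; i++) {
            Object[] values = records.get(i);
            values[valueColumn] = extractFinal.apply(values[valueColumn]);
          }
        }));
      }
      for (Future<?> future : futures) {
        future.get();  // wait for completion and surface task failures
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } catch (ExecutionException e) {
      throw new IllegalStateException(e.getCause());
    } finally {
      executor.shutdown();
    }
  }
}
```

The speed-up depends on the per-group cost: for cheap final results, the task-submission overhead can outweigh the benefit, which is consistent with the default of one thread.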