@pgandhi999
Member

Description

Previously, @pettyjamesm implemented a vectorized approach to combined columnar hash calculation, generated via codegen, for the FlatGroupByHash operator in OSS Trino (PR: #19302). This PR extends that work to the Partitioned Exchange and Local Exchange operators.
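
To illustrate the general idea, here is a minimal sketch contrasting row-at-a-time hashing with column-at-a-time (vectorized) hashing over plain `long` columns. It is not the code from this PR or from #19302: the class `ColumnarHashSketch`, the methods `hashValue`, `combine`, and `partitionOf`, and the mixing constants are all hypothetical and chosen only for illustration. The real implementation operates on Trino blocks and generates the loops via codegen.

```java
public class ColumnarHashSketch {
    // Hypothetical per-value hash; stands in for a type-specific hash function.
    static long hashValue(long value) {
        long h = value * 0x9E3779B97F4A7C15L;
        return h ^ (h >>> 32);
    }

    // Hypothetical combine step (an assumption, not necessarily what Trino uses).
    static long combine(long previousHash, long valueHash) {
        return 31 * previousHash + valueHash;
    }

    // Row-at-a-time: the outer loop walks positions, the inner loop walks columns.
    static long[] hashRowWise(long[][] columns, int positionCount) {
        long[] hashes = new long[positionCount];
        for (int position = 0; position < positionCount; position++) {
            long hash = 0;
            for (long[] column : columns) {
                hash = combine(hash, hashValue(column[position]));
            }
            hashes[position] = hash;
        }
        return hashes;
    }

    // Column-at-a-time: the outer loop walks columns, the inner loop is a tight
    // pass over one column that folds its values into the running hash array.
    static long[] hashColumnWise(long[][] columns, int positionCount) {
        long[] hashes = new long[positionCount];
        for (long[] column : columns) {
            for (int position = 0; position < positionCount; position++) {
                hashes[position] = combine(hashes[position], hashValue(column[position]));
            }
        }
        return hashes;
    }

    // Maps a hash to a partition index in [0, partitionCount).
    static int partitionOf(long hash, int partitionCount) {
        return (int) Math.floorMod(hash, (long) partitionCount);
    }

    public static void main(String[] args) {
        long[][] columns = {{1, 2, 3}, {10, 20, 30}};
        long[] rowWise = hashRowWise(columns, 3);
        long[] columnWise = hashColumnWise(columns, 3);
        // Both strategies combine the columns in the same order, so the hashes match.
        System.out.println(java.util.Arrays.equals(rowWise, columnWise));
        System.out.println(partitionOf(columnWise[0], 256));
    }
}
```

Both methods produce the same hashes; the column-wise form simply reorders the loops so that each inner loop touches a single column, which tends to be friendlier to the JIT and to CPU caches.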

The results from running BenchmarkPartitionedOutputOperator.verifyAddPage with a varying number of columns are summarized below:

Environment: MacOS Local Machine
Partition Count: 256
Runtime JDK: Java 25
Page Count: 5000

| Number of Columns | Baseline (InterpretedHashGenerator), seconds | Prototype (Codegen-based Vectorized HashGenerator), seconds | % Gain |
| --- | --- | --- | --- |
| 1 | 3.1 | 2.6 | 16.13 |
| 5 | 5.4 | 3.8 | 29.63 |
| 10 | 8 | 6.3 | 21.25 |
| 50 | 40 | 28 | 30.00 |
| 100 | 93 | 69 | 25.81 |
| 500 | 603 | 343 | 43.12 |
| 1000 | 1065 | 623 | 41.50 |
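
The % Gain column is consistent with the relative reduction in runtime, (baseline − prototype) / baseline × 100. A small sketch of that arithmetic follows; the class name `GainSketch` and method `gainPercent` are hypothetical and not part of the benchmark code.

```java
public class GainSketch {
    // Relative runtime reduction of the prototype versus the baseline, in percent.
    static double gainPercent(double baselineSeconds, double prototypeSeconds) {
        return (baselineSeconds - prototypeSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // Baseline and prototype timings taken from the results above.
        double[][] timings = {
            {3.1, 2.6}, {5.4, 3.8}, {8, 6.3}, {40, 28}, {93, 69}, {603, 343}, {1065, 623}
        };
        for (double[] t : timings) {
            System.out.printf("%.1f -> %.1f: %.2f%%%n", t[0], t[1], gainPercent(t[0], t[1]));
        }
    }
}
```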
       

Additional context and related issues

Release notes

( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
( ) Release notes are required, with the following suggested text:

## Section
* Fix some things. ({issue}`issuenumber`)

@pgandhi999
Member Author

The old PR (#27607) was accidentally closed when I force-pushed my branch, so I had to create a new pull request.

@starburstdata-automation

starburstdata-automation commented Dec 11, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_part.

Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report

@starburstdata-automation

starburstdata-automation commented Dec 11, 2025

Started benchmark workflow for this PR with test type = iceberg/sf1000_parquet_unpart.

Building Trino finished with status: failure
Building Trino finished with status: success
Benchmark finished with status: success
Comparing results to the static baseline values, follow above workflow link for more details/logs.
Status message: NO Regression found.
Benchmark Comparison to the closest run from Master: Report
