Skip to content

Commit

Permalink
[CALCITE-6038] Remove 'ORDER BY ... LIMIT n' when input has at most o…
Browse files Browse the repository at this point in the history
…ne row, n >= 1, and there is no 'OFFSET' clause

This change extends SortRemoveRedundantRule, which was added
in [CALCITE-5994].

Following [CALCITE-5940], which added SortMergeRule, rename
field LIMIT_MREGE to LIMIT_MERGE.

Close apache#3467
  • Loading branch information
LakeShen authored and julianhyde committed Oct 25, 2023
1 parent 345961a commit 79f2f61
Show file tree
Hide file tree
Showing 5 changed files with 201 additions and 47 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -722,7 +722,7 @@ private CoreRules() {}

/** Rule that merge a {@link Sort} representing the Limit semantics and
* another {@link Sort} representing the Limit or TOPN semantics. */
public static final SortMergeRule LIMIT_MREGE =
public static final SortMergeRule LIMIT_MERGE =
SortMergeRule.Config.LIMIT_MERGE.toRule();

/** Rule that removes keys from a {@link Sort}
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@
*
* <p> In the future,we could also extend other sort merge logic in this rule.
*
* @see CoreRules#LIMIT_MREGE
* @see CoreRules#LIMIT_MERGE
*/
@Value.Enclosing
public class SortMergeRule
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,39 +26,66 @@

import java.util.Optional;

/** Rule that removes redundant {@code Order By} or {@code Limit} when its input RelNode's
* max row count is less than or equal to specified row count.All of them
* are represented by {@link Sort}
/**
* Rule that removes redundant {@code ORDER BY} or {@code LIMIT} when its input
* RelNode's maximum row count is less than or equal to specified row count.
* All of them are represented by {@link Sort}.
*
* <p> If a {@code Sort} is pure order by,and its offset is null,when its input RelNode's
* max row count is less than or equal to 1,then we could remove the redundant sort.
* <p>If a {@code Sort} is an {@code ORDER BY} with no {@code OFFSET}, and if
* input's maximum row count is less than or equal to 1, then the sort is
* redundant (because every relation with 1 or fewer rows is sorted), and the
* rule removes the redundant sort.
*
* <p>For example:
*
* <p> For example:
* <blockquote><pre>{@code
* select max(totalprice) from orders order by 1}
* </pre></blockquote>
* SELECT max(totalprice)
* FROM orders
* ORDER BY 1
* }</pre></blockquote>
*
* <p>could be converted to
*
* <p> could be converted to
* <blockquote><pre>{@code
* select max(totalprice) from orders}
* </pre></blockquote>
* SELECT max(totalprice)
* FROM orders
* }</pre></blockquote>
*
* <p> If a {@code Sort} is pure limit,and its offset is null, when its input
* RelNode's max row count is less than or equal to the limit's fetch,then we could
* remove the redundant sort.
* <p>For example:
*
* <p> For example:
* <blockquote><pre>{@code
* SELECT * FROM (VALUES 1,2,3,4,5,6) AS t1 LIMIT 10}
* </pre></blockquote>
* SELECT count(*)
* FROM orders
* ORDER BY 1 LIMIT 10
* }</pre></blockquote>
*
* <p>could be converted to
*
* <blockquote><pre>{@code
* SELECT count(*)
* FROM orders
* }</pre></blockquote>
*
* <p>If a {@code Sort} is a pure {@code LIMIT} (with no {@code ORDER BY} or
* {@code OFFSET}), and its input RelNode's maximum row count is less than or
* equal to the limit's fetch, the rule removes the redundant {@code LIMIT}.
*
* <p>For example:
*
* <p> The above values max row count is 6 rows, and the limit's fetch is 10,
* <blockquote><pre>{@code
* SELECT *
* FROM (VALUES 1, 2, 3, 4, 5, 6) AS t1
* LIMIT 10
* }</pre></blockquote>
*
* <p>The above {@code VALUES} row count is 6 rows, and the limit's fetch is 10,
* so we could remove the redundant sort.
*
* <p> It could be converted to:
* <p>It could be converted to:
*
* <blockquote><pre>{@code
* SELECT * FROM (VALUES 1,2,3,4,5,6) AS t1}
* </pre></blockquote>
* SELECT * FROM (VALUES 1,2,3,4,5,6) AS t1
* }</pre></blockquote>
*
* @see CoreRules#SORT_REMOVE_REDUNDANT
*/
Expand All @@ -77,34 +104,48 @@ protected SortRemoveRedundantRule(final SortRemoveRedundantRule.Config config) {
return;
}

// Get the max row count for sort's input RelNode.
final Double inputMaxRowCount = call.getMetadataQuery().getMaxRowCount(sort.getInput());
// Get the maximum row count for sort's input RelNode.
final Double inputMaxRowCount =
call.getMetadataQuery().getMaxRowCount(sort.getInput());

// Get the target max row count with sort's semantics.
// If sort is pure order by, the target max row count is 1.
// If sort is pure limit, the target max row count is the limit's fetch.
final Optional<Double> targetMaxRowCount = getSortInputSpecificMaxRowCount(sort);
// Get the target threshold with sort's semantics.
// If sort is 'order by x' or 'order by x limit n', target threshold is 1.
// If sort is pure limit, the target threshold is the limit's fetch.
// If the limit's fetch is 0, we could use
// CoreRules.SORT_FETCH_ZERO_INSTANCE to deal with it, so we don't need to
// deal with it in this rule.
final Optional<Integer> rowCountThreshold = getRowCountThreshold(sort);

if (!targetMaxRowCount.isPresent()) {
if (!rowCountThreshold.isPresent()) {
return;
}

// If the max row count is not null and less than or equal to targetMaxRowCount,
// If the threshold is not null and less than or equal to targetMaxRowCount,
// then we could remove the redundant sort.
if (inputMaxRowCount != null && inputMaxRowCount <= targetMaxRowCount.get()) {
if (inputMaxRowCount != null && inputMaxRowCount <= rowCountThreshold.get()) {
call.transformTo(sort.getInput());
}
}

private static Optional<Double> getSortInputSpecificMaxRowCount(Sort sort) {
// If the sort is pure limit, the specific max row count is limit's fetch.
if (RelOptUtil.isPureLimit(sort)) {
final double limit =
sort.fetch instanceof RexLiteral ? RexLiteral.intValue(sort.fetch) : -1D;
return Optional.of(limit);
private Optional<Integer> getRowCountThreshold(Sort sort) {
if (RelOptUtil.isLimit(sort)) {
final int fetch =
sort.fetch instanceof RexLiteral ? RexLiteral.intValue(sort.fetch) : 0;

// We don't need to deal with fetch is 0.
if (fetch == 0) {
return Optional.empty();
}

// If sort is 'order by x limit n', the target threshold is 1.
if (RelOptUtil.isOrder(sort)) {
return Optional.of(1);
}

// If sort is 'limit n', the target threshold is the limit's fetch.
return Optional.of(fetch);
} else if (RelOptUtil.isPureOrder(sort)) {
// If the sort is pure order by, the specific max row count is 1.
return Optional.of(1D);
return Optional.of(1);
}
return Optional.empty();
}
Expand Down
62 changes: 55 additions & 7 deletions core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -1368,6 +1368,54 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
.checkUnchanged();
}

/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-6038">[CALCITE-6038]
* Remove 'ORDER BY ... LIMIT n' when input has at most one row, n >= 1,
* and there is no 'OFFSET' clause</a>. */
@Test void testSortRemoveWhenIsOrderAndLimit() {
final String sql = "SELECT count(*) FROM sales.emp ORDER BY 1 LIMIT 10";
sql(sql)
.withRule(CoreRules.SORT_REMOVE_REDUNDANT)
.check();
}

/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-6038">[CALCITE-6038]
* Remove 'ORDER BY ... LIMIT n' when input has at most one row, n >= 1,
* and there is no 'OFFSET' clause</a>. */
@Test void testSortNotRemoveWhenIsOrderAndLimit() {
final String sql = "select * from\n"
+ "(SELECT * FROM sales.emp limit 100)\n"
+ "ORDER BY 1 LIMIT 10";
sql(sql)
.withRule(CoreRules.SORT_REMOVE_REDUNDANT)
.checkUnchanged();
}

/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-6038">[CALCITE-6038]
* Remove 'ORDER BY ... LIMIT n' when input has at most one row, n >= 1,
* and there is no 'OFFSET' clause</a>. */
@Test void testSortNotRemoveWhenLimitFetchIsZeroHasOrder() {
final String sql = "select * from\n"
+ "(SELECT * FROM sales.emp limit 1)\n"
+ "ORDER BY 1 LIMIT 0";
sql(sql)
.withRule(CoreRules.SORT_REMOVE_REDUNDANT)
.checkUnchanged();
}

/** Test case for
* <a href="https://issues.apache.org/jira/browse/CALCITE-6038">[CALCITE-6038]
* Remove 'ORDER BY ... LIMIT n' when input has at most one row, n >= 1,
* and there is no 'OFFSET' clause</a>. */
@Test void testSortNotRemoveWhenLimitFetchIsZeroWithoutOrder() {
final String sql = "SELECT count(*) FROM sales.emp LIMIT 0";
sql(sql)
.withRule(CoreRules.SORT_REMOVE_REDUNDANT)
.checkUnchanged();
}

/** Tests that an {@link EnumerableLimit} and {@link EnumerableSort} are
* replaced by an {@link EnumerableLimitSort}, per
* <a href="https://issues.apache.org/jira/browse/CALCITE-3920">[CALCITE-3920]
Expand Down Expand Up @@ -1417,7 +1465,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "limit 10";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_REMOVE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_REMOVE)
.check();
}

Expand All @@ -1430,7 +1478,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "limit 10";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_MERGE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_MERGE)
.check();
}

Expand All @@ -1443,7 +1491,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "limit 10000";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_MERGE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_MERGE)
.check();
}

Expand All @@ -1456,7 +1504,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "order by deptno desc";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_REMOVE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_REMOVE)
.check();
}

Expand All @@ -1469,7 +1517,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "order by deptno limit 10000";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_MERGE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_MERGE)
.check();
}

Expand All @@ -1482,7 +1530,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "order by deptno limit 10";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE, CoreRules.PROJECT_MERGE)
.withRule(CoreRules.LIMIT_MERGE, CoreRules.PROJECT_MERGE)
.check();
}

Expand All @@ -1495,7 +1543,7 @@ private void checkSemiOrAntiJoinProjectTranspose(JoinRelType type) {
+ "limit 10";
sql(sql)
.withPreRule(CoreRules.SORT_PROJECT_TRANSPOSE)
.withRule(CoreRules.LIMIT_MREGE)
.withRule(CoreRules.LIMIT_MERGE)
.checkUnchanged();
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -14100,6 +14100,51 @@ LogicalProject(DEPTNO=[$0], EMPNO=[$2])
LogicalJoin(condition=[=($0, $9)], joinType=[left])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
<TestCase name="testSortNotRemoveWhenIsOrderAndLimit">
<Resource name="sql">
<![CDATA[select * from
(SELECT * FROM sales.emp limit 100)
ORDER BY 1 LIMIT 10]]>
</Resource>
<Resource name="planBefore">
<![CDATA[
LogicalSort(sort0=[$0], dir0=[ASC], fetch=[10])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalSort(fetch=[100])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
<TestCase name="testSortNotRemoveWhenLimitFetchIsZeroHasOrder">
<Resource name="sql">
<![CDATA[select * from
(SELECT * FROM sales.emp limit 1)
ORDER BY 1 LIMIT 0]]>
</Resource>
<Resource name="planBefore">
<![CDATA[
LogicalSort(sort0=[$0], dir0=[ASC], fetch=[0])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalSort(fetch=[1])
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
<TestCase name="testSortNotRemoveWhenLimitFetchIsZeroWithoutOrder">
<Resource name="sql">
<![CDATA[SELECT count(*) FROM sales.emp LIMIT 0]]>
</Resource>
<Resource name="planBefore">
<![CDATA[
LogicalSort(fetch=[0])
LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
LogicalProject($f0=[0])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
Expand Down Expand Up @@ -14295,6 +14340,26 @@ LogicalSort(fetch=[10])
<![CDATA[
LogicalProject(T1=[$0])
LogicalValues(tuples=[[{ 1 }, { 2 }, { 3 }, { 4 }, { 5 }, { 6 }]])
]]>
</Resource>
</TestCase>
<TestCase name="testSortRemoveWhenIsOrderAndLimit">
<Resource name="sql">
<![CDATA[SELECT count(*) FROM sales.emp ORDER BY 1 LIMIT 10]]>
</Resource>
<Resource name="planBefore">
<![CDATA[
LogicalSort(sort0=[$0], dir0=[ASC], fetch=[10])
LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
LogicalProject($f0=[0])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
<Resource name="planAfter">
<![CDATA[
LogicalAggregate(group=[{}], EXPR$0=[COUNT()])
LogicalProject($f0=[0])
LogicalTableScan(table=[[CATALOG, SALES, EMP]])
]]>
</Resource>
</TestCase>
Expand Down

0 comments on commit 79f2f61

Please sign in to comment.