[V1] Adds APIs for Strategy and Pattern to compiler. #1625

RCHowell · 2024-10-22T00:09:41Z

Description

This PR begins the strategy-based compiler. Users provide patterns that match sub-trees of logical operators; each match is then sent to a strategy, which returns a custom physical operator.

This is similar to both Calcite's rule-based logical optimizer and Spark's physical operator strategies.
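
To make the flow above concrete (a pattern matches a logical operator, the match goes to a strategy, and the strategy returns a physical operator), here is a minimal Java sketch. Every type in it (`LogicalOp`, `PhysicalOp`, `Pattern`, `Strategy`, `MiniCompiler`) is a hypothetical stand-in invented for illustration, not the API this PR adds, and the pattern only matches a single node by class.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-ins for illustration only; not the partiql-eval API.
interface LogicalOp { }

final class Scan implements LogicalOp {
    final String table;
    Scan(String table) { this.table = table; }
}

final class Filter implements LogicalOp {
    final String predicate;
    final LogicalOp input;
    Filter(String predicate, LogicalOp input) {
        this.predicate = predicate;
        this.input = input;
    }
}

interface PhysicalOp {
    String describe();
}

// A pattern decides whether a logical operator is of interest to a strategy.
final class Pattern {
    private final Class<? extends LogicalOp> type;
    Pattern(Class<? extends LogicalOp> type) { this.type = type; }
    boolean matches(LogicalOp op) { return type.isInstance(op); }
}

// A strategy pairs a pattern with a factory for the custom physical operator.
final class Strategy {
    private final Pattern pattern;
    private final Function<LogicalOp, PhysicalOp> factory;
    Strategy(Pattern pattern, Function<LogicalOp, PhysicalOp> factory) {
        this.pattern = pattern;
        this.factory = factory;
    }
    // Returns null when the pattern does not match.
    PhysicalOp apply(LogicalOp op) {
        return pattern.matches(op) ? factory.apply(op) : null;
    }
}

final class MiniCompiler {
    private final List<Strategy> strategies;
    MiniCompiler(List<Strategy> strategies) { this.strategies = strategies; }

    PhysicalOp compile(LogicalOp op) {
        for (Strategy strategy : strategies) {
            PhysicalOp custom = strategy.apply(op);
            if (custom != null) {
                return custom; // first match wins
            }
        }
        // Fallback standing in for the built-in physical operators.
        String name = op.getClass().getSimpleName();
        return () -> "default(" + name + ")";
    }
}

final class MiniCompilerDemo {
    public static void main(String[] args) {
        Strategy customScan = new Strategy(
            new Pattern(Scan.class),
            op -> () -> "custom-scan(" + ((Scan) op).table + ")");
        MiniCompiler compiler = new MiniCompiler(Arrays.asList(customScan));
        System.out.println(compiler.compile(new Scan("t")).describe());
        System.out.println(compiler.compile(new Filter("t.a = 49", new Scan("t"))).describe());
    }
}
```

Running `MiniCompilerDemo` prints `custom-scan(t)` for the scan, which the strategy claims, and `default(Filter)` for the filter, which no strategy matches.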

Other Information

Updated Unreleased Section in CHANGELOG: NO
Any backward-incompatible changes? NO
Any new external dependencies? NO
Do your changes comply with the Contributing Guidelines and Code Style Guidelines? YES

License Information

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

github-actions · 2024-10-22T00:17:17Z

CROSS-ENGINE-REPORT ❌

	BASE (LEGACY-V0.14.8)	TARGET (EVAL-095FD52)	+/-
% Passing	89.67%	94.39%	4.72% ✅
Passing	5287	5565	278 ✅
Failing	609	50	-559 ✅
Ignored	0	281	281 🔶
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: v0.14.8
Base Engine: LEGACY
Target Commit: 095fd52
Target Engine: EVAL

Result Details

❌ REGRESSION DETECTED. See Now Failing/Ignored Tests. ❌
Passing in both: 2643
Failing in both: 17
Ignored in both: 0
PASSING in BASE but now FAILING in TARGET: 3
PASSING in BASE but now IGNORED in TARGET: 108
FAILING in BASE but now PASSING in TARGET: 180
IGNORED in BASE but now PASSING in TARGET: 0

Now FAILING Tests ❌

The following 3 test(s) were previously PASSING in BASE but are now FAILING in TARGET:

undefinedUnqualifiedVariableWithUndefinedVariableBehaviorMissing, compileOption: PERMISSIVE
undefinedUnqualifiedVariableIsNullExprWithUndefinedVariableBehaviorMissing, compileOption: PERMISSIVE
undefinedUnqualifiedVariableIsMissingExprWithUndefinedVariableBehaviorMissing, compileOption: PERMISSIVE

Now IGNORED Tests ❌

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

Now Passing Tests

180 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-095FD52). Before merging, confirm they are intended to pass.

The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact.

CROSS-COMMIT-REPORT ✅

	BASE (EVAL-F0569BD)	TARGET (EVAL-095FD52)	+/-
% Passing	94.39%	94.39%	0.00% ✅
Passing	5565	5565	0 ✅
Failing	50	50	0 ✅
Ignored	281	281	0 ✅
Total Tests	5896	5896	0 ✅

Testing Details

Base Commit: f0569bd
Base Engine: EVAL
Target Commit: 095fd52
Target Engine: EVAL

Result Details

Passing in both: 5565
Failing in both: 50
Ignored in both: 281
PASSING in BASE but now FAILING in TARGET: 0
PASSING in BASE but now IGNORED in TARGET: 0
FAILING in BASE but now PASSING in TARGET: 0
IGNORED in BASE but now PASSING in TARGET: 0

partiql-eval/src/main/java/org/partiql/eval/Environment.java

johnedquinn

The changes seem intuitive, though adding tests for replacing a single node and multiple nodes would garner some confidence in the public APIs exposed.

partiql-eval/src/main/java/org/partiql/eval/compiler/Strategy.java

johnedquinn · 2024-10-29T22:28:45Z

partiql-eval/src/main/kotlin/org/partiql/eval/internal/compiler/StandardCompiler.kt

+            for (strategy in strategies) {
+                val op = strategy.apply(operator)
+                if (op != null) {
+                    // first match
+                    return op
+                }
+            }


I know this PR only intends for strategies to handle single-node replacements, but I'm trying to see how the public APIs would allow for multi-node replacements.

Let's say a tree looks like:

SELECT var(0)
  \__ PROJECT t.c
    \__ FILTER t.b > 127
      \__ FILTER t.a == 49
        \__ SCAN t

Perhaps you want to consolidate two of those relational operators to a single node:

SELECT_IMPL var(0)
  \__ PROJECT_IMPL t.c
    \__ FILTER_IMPL t.b > 127 AND t.a == 49
      \__ SCAN_IMPL t

With the existing APIs, I don't know if you'd be able to do this. I could be reading it wrong, but it would be useful to have a test that applies a multi-node strategy to replace a subtree. I think each strategy might need a reference to all of the existing strategies so that it can replace its children.

RCHowell · Oct 30, 2024

This is a good point and a bit more interesting, since the Calcite model is based only on logical nodes, so we need to recursively invoke the compiler, whereas Spark has a "plan later" stub for its strategies. This model is somewhat in between and not complete. The simplest solution would be to pass the compiler instance to apply(..).

In the interest of time, I will do this and add a test to show it in practice. Interestingly, the Cascades paper blends physical and logical, so an optimizer rule need only return a top-level physical operator and the rest can remain logical. I don't think I have a strong grasp of these various models yet, but it seems Spark is more like Cascades insofar as it returns exec nodes (physical impls) but can stub out logical ones for "plan later".

We might consider this, but the benefit isn't immediate - it's simple enough now to pass the compiler in apply.
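
A minimal Java sketch of the idea of passing the compiler into `apply`, under stated assumptions: the types below (`Rel`, `Expr`, `PlanCompiler`, `Strategy`, `MergeFilters`, `SketchCompiler`) are hypothetical and are not the signatures in this PR. The strategy matches two stacked filters, emits one physical filter, and uses the compiler it was handed to plan the remaining child.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical types for illustration only; not the partiql-eval API.
abstract class Rel { }

final class Scan extends Rel {
    final String table;
    Scan(String table) { this.table = table; }
}

final class Filter extends Rel {
    final String predicate;
    final Rel input;
    Filter(String predicate, Rel input) {
        this.predicate = predicate;
        this.input = input;
    }
}

interface Expr {
    String explain();
}

interface PlanCompiler {
    Expr compile(Rel rel);
}

interface Strategy {
    // Returns null on no match; the compiler is passed in so children can be planned.
    Expr apply(Rel rel, PlanCompiler compiler);
}

// Multi-node strategy: FILTER over FILTER becomes one physical filter.
final class MergeFilters implements Strategy {
    @Override
    public Expr apply(Rel rel, PlanCompiler compiler) {
        if (!(rel instanceof Filter)) return null;
        Filter outer = (Filter) rel;
        if (!(outer.input instanceof Filter)) return null;
        Filter inner = (Filter) outer.input;
        // Recursively compile whatever sits below the two matched nodes.
        Expr child = compiler.compile(inner.input);
        String predicate = outer.predicate + " AND " + inner.predicate;
        return () -> "FILTER_IMPL[" + predicate + "](" + child.explain() + ")";
    }
}

final class SketchCompiler implements PlanCompiler {
    private final List<Strategy> strategies;
    SketchCompiler(List<Strategy> strategies) { this.strategies = strategies; }

    @Override
    public Expr compile(Rel rel) {
        for (Strategy strategy : strategies) {
            Expr custom = strategy.apply(rel, this);
            if (custom != null) return custom; // first match
        }
        // Defaults standing in for the built-in operators.
        if (rel instanceof Scan) {
            String table = ((Scan) rel).table;
            return () -> "SCAN_IMPL[" + table + "]";
        }
        if (rel instanceof Filter) {
            Filter filter = (Filter) rel;
            Expr input = compile(filter.input);
            return () -> "FILTER_IMPL[" + filter.predicate + "](" + input.explain() + ")";
        }
        throw new IllegalArgumentException("Unsupported operator: " + rel.getClass().getSimpleName());
    }
}

final class MergeFiltersDemo {
    public static void main(String[] args) {
        Rel plan = new Filter("t.b > 127", new Filter("t.a == 49", new Scan("t")));
        PlanCompiler compiler = new SketchCompiler(Arrays.<Strategy>asList(new MergeFilters()));
        System.out.println(compiler.compile(plan).explain());
    }
}
```

With the strategy registered, the two filters from the example tree compile to `FILTER_IMPL[t.b > 127 AND t.a == 49](SCAN_IMPL[t])`; with an empty strategy list the defaults produce two nested `FILTER_IMPL` nodes instead.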

johnedquinn · Oct 30, 2024

I haven't read the Cascades paper, but it seems similar to how Substrait handles logical/physical: they are all part of the same tree. From my interpretation, the compilation of those nodes (either logical or physical) would therefore be 1:1 with the executable each is compiled to.

So, a plan might go through your strategies (a more comprehensive implementation of physical planner pass) and produce a plan, not an executable. So, from the example above, it could look like:

Select -- var(0)
  \__ Project -- t.c
    \__ Filter -- t.b > 127
      \__ ScanKey : Custom -- expr = t, key = a, value = 49

And, the compiler would have a registry of 1:1 plan nodes (logical and physical) to implementations:

Select::class -> SelectImplDefault(it)
Project::class -> ProjectImplDefault(it)
Filter::class -> FilterImplDefault(it)
ScanKey::class -> SomeCustomImplProvidedByUser(it)

Not proposing this -- just conveying another option on how we can leverage strategies by introducing a "custom" rel node -- which would just be an empty interface (maybe with a string identifier).
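
For readers who want to see the registry option in code, here is a rough Java sketch of a 1:1 mapping from plan node classes to implementations. It is an assumption-laden illustration, not a proposal or an existing API: `PlanNode`, `Impl`, `ImplFactory`, and `Registry` are hypothetical names, and only `Filter` and the custom `ScanKey` node from the example are modeled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical node and implementation types; not the partiql-plan or partiql-eval API.
interface PlanNode { }

final class Filter implements PlanNode {
    final String predicate;
    final PlanNode input;
    Filter(String predicate, PlanNode input) {
        this.predicate = predicate;
        this.input = input;
    }
}

// A "custom" node that a user might introduce through a logical plan rewrite.
final class ScanKey implements PlanNode {
    final String expr;
    final String key;
    final String value;
    ScanKey(String expr, String key, String value) {
        this.expr = expr;
        this.key = key;
        this.value = value;
    }
}

interface Impl {
    String explain();
}

// Builds the implementation for one node type; the registry is passed in so children can be compiled.
interface ImplFactory<T extends PlanNode> {
    Impl create(T node, Registry registry);
}

final class Registry {
    private final Map<Class<?>, ImplFactory<PlanNode>> factories = new HashMap<>();

    <T extends PlanNode> void register(Class<T> type, ImplFactory<T> factory) {
        // Wrap the typed factory so the lookup side only deals with PlanNode.
        factories.put(type, (node, registry) -> factory.create(type.cast(node), registry));
    }

    Impl compile(PlanNode node) {
        ImplFactory<PlanNode> factory = factories.get(node.getClass());
        if (factory == null) {
            throw new IllegalStateException("No implementation for " + node.getClass().getSimpleName());
        }
        return factory.create(node, this);
    }
}

final class RegistryDemo {
    static Registry defaults() {
        Registry registry = new Registry();
        registry.register(Filter.class, (f, r) -> {
            Impl input = r.compile(f.input);
            return () -> "FilterImplDefault[" + f.predicate + "](" + input.explain() + ")";
        });
        registry.register(ScanKey.class, (s, r) ->
            () -> "SomeCustomImplProvidedByUser[" + s.expr + "." + s.key + " = " + s.value + "]");
        return registry;
    }

    public static void main(String[] args) {
        PlanNode plan = new Filter("t.b > 127", new ScanKey("t", "a", "49"));
        System.out.println(defaults().compile(plan).explain());
    }
}
```

The trade-off visible in the sketch is that every physical variation needs its own node class in the plan before the registry can dispatch on it.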

RCHowell · Oct 30, 2024

I see, but this is what rules do in the logical planning phase. Strategies are different from rules: while a rule maps plan to plan (Calcite/Volcano), a strategy maps a plan to an expression (physical).

The 1:1 mapping to physical implementations is an interesting idea, but then we limit custom operators to what is available in the logical plans, meaning we may need more custom operators. This might be the simplest approach, though it would introduce many more logical operators.

johnedquinn · Oct 30, 2024

Just for clarity for users reading this in the future -- we discussed offline, and this implementation doesn't prevent us from introducing a custom logical node with a logical plan rewrite. This PR is the second half of the processing -- converting plan node(s) (either the built-in ones or the potential custom ones) into physical implementations. If users, however, wanted to introduce custom logical operators, they are not barred from doing that -- they can use their custom plan rewrites in combination with these strategies for whatever their use-case is.
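
The distinction drawn in this thread (a rule maps plan to plan, a strategy maps plan to something executable) and the way the two can be combined can be sketched in Java as follows. This is only an illustration under assumptions: `Plan`, `EqFilter`, `ScanKey`, `Exec`, and the two constants are hypothetical names, and the rewrite handles just the equality-filter-over-scan case from the example.

```java
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical types for illustration only; not the partiql-plan or partiql-eval API.
abstract class Plan { }

final class Scan extends Plan {
    final String table;
    Scan(String table) { this.table = table; }
}

// An equality filter of the form key = value over some input.
final class EqFilter extends Plan {
    final String key;
    final String value;
    final Plan input;
    EqFilter(String key, String value, Plan input) {
        this.key = key;
        this.value = value;
        this.input = input;
    }
}

// Custom logical node introduced by a rewrite.
final class ScanKey extends Plan {
    final String table;
    final String key;
    final String value;
    ScanKey(String table, String key, String value) {
        this.table = table;
        this.key = key;
        this.value = value;
    }
}

interface Exec {
    String explain();
}

final class RuleVsStrategy {
    // Rule: plan in, plan out. Runs during logical planning.
    static final UnaryOperator<Plan> PUSH_KEY_INTO_SCAN = plan -> {
        if (plan instanceof EqFilter && ((EqFilter) plan).input instanceof Scan) {
            EqFilter filter = (EqFilter) plan;
            Scan scan = (Scan) filter.input;
            return new ScanKey(scan.table, filter.key, filter.value);
        }
        return plan; // no match: leave the plan unchanged
    };

    // Strategy: plan in, executable out (null when it does not match).
    static final Function<Plan, Exec> SCAN_KEY_STRATEGY = plan -> {
        if (!(plan instanceof ScanKey)) return null;
        ScanKey scanKey = (ScanKey) plan;
        return () -> "lookup(" + scanKey.table + "." + scanKey.key + " = " + scanKey.value + ")";
    };

    public static void main(String[] args) {
        Plan logical = new EqFilter("a", "49", new Scan("t"));
        Plan rewritten = PUSH_KEY_INTO_SCAN.apply(logical);
        System.out.println(SCAN_KEY_STRATEGY.apply(rewritten).explain());
    }
}
```

The only structural difference between the two is the return type, which is why a custom rewrite and a strategy compose without either needing to know about the other.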

Base automatically changed from v1-udop to v1 October 22, 2024 23:00

RCHowell requested a review from johnedquinn October 25, 2024 17:06

johnedquinn requested changes Oct 29, 2024

RCHowell force-pushed the v1-udop-strats branch from a531e1b to 38b199b Compare October 29, 2024 22:48

RCHowell changed the title ~~Adds APIs for Strategy and Pattern to compiler.~~ [V1] Adds APIs for Strategy and Pattern to compiler. Nov 4, 2024

RCHowell added 4 commits November 4, 2024 15:23

Adds APIs for Strategies and Patterns to compiler

e8a27d4

Backup pattern tree builders

6344b6f

Backup

05657aa

Backup

4e6c177

RCHowell force-pushed the v1-udop-strats branch from ea9af3c to 4e6c177 Compare November 5, 2024 00:19

RCHowell mentioned this pull request Nov 5, 2024

[V1] Adds basic strategies to compiler (see #1625) #1644

Merged

johnedquinn added a commit that referenced this pull request Nov 5, 2024

Merge pull request #1644 from partiql/v1-udop

e2e7735

[V1] Adds basic strategies to compiler (see #1625)

johnedquinn marked this pull request as draft November 15, 2024 17:26
