-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[V1] Adds APIs for Strategy and Pattern to compiler. #1625
base: v1
Are you sure you want to change the base?
Conversation
CROSS-ENGINE-REPORT ❌
Testing Details
Result Details
Now FAILING Tests ❌The following 3 test(s) were previously PASSING in BASE but are now FAILING in TARGET: Click here to see
Now IGNORED Tests ❌The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. Now Passing Tests180 test(s) were previously failing in BASE (LEGACY-V0.14.8) but now pass in TARGET (EVAL-095FD52). Before merging, confirm they are intended to pass. The complete list can be found in GitHub CI summary, either from Step Summary or in the Artifact. CROSS-COMMIT-REPORT ✅
Testing DetailsResult Details
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changes seem intuitive, though adding tests for replacing a single node and multiple nodes would garner some confidence in the public APIs exposed.
partiql-eval/src/main/java/org/partiql/eval/compiler/Strategy.java
Outdated
Show resolved
Hide resolved
partiql-eval/src/main/java/org/partiql/eval/compiler/Strategy.java
Outdated
Show resolved
Hide resolved
partiql-eval/src/main/java/org/partiql/eval/compiler/Strategy.java
Outdated
Show resolved
Hide resolved
for (strategy in strategies) { | ||
val op = strategy.apply(operator) | ||
if (op != null) { | ||
// first match | ||
return op | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that this PR is only intending for the strategy to work for single node replacements, but I'm trying to see how the public APIs will allow for multi-node replacements.
Let's say a tree looks like:
SELECT var(0)
\__ PROJECT t.c
\__ FILTER t.b > 127
\__ FILTER t.a == 49
\__ SCAN t
Perhaps you want to consolidate two of those relational operators to a single node:
SELECT_IMPL var(0)
\__ PROJECT_IMPL t.c
\__ FILTER_IMPL t.b > 127 AND t.a == 49
\__ SCAN_IMPL t
With the existing APIs, I don't know if you'd be able to do this. I could be reading it wrong, but it would be useful to have a test applying a multi-node strategy to replace a subtree. I think each strategy might need to have a reference to all of the existing strategies to allow for replacing of its children.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good point and a bit more interesting since the calcite model is only based on logical nodes, so need to recursively invoke the compiler – whereas spark has a "plan later" stub for its strategies. This model is somewhat in-between and not completed. The simplest solution would be to pass the compiler instance to apply(..)
.
In the interest on time, I will do this and add the test to show it in practice. Interestingly the cascades paper blends physical and logical, so an optimizer rule need only return a top-level physical and the remaining can be logical. I don't think I have a strong grasp of these various models yet, but seems Spark is more like cascades insofar as it returns exec nodes (physical impls) but can stub out logical ones for "plan later".
We might consider this, but the benefit isn't immediate - it's simple enough now to pass the compiler in apply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't read the cascades paper, but it seems similar to how substrait handles logical/physical. They are all part of the same tree. From my interpretation, the compilation of those nodes (either logical/physical) would therefore be 1:1 with the executable that it is compiled to.
So, a plan might go through your strategies (a more comprehensive implementation of physical planner pass) and produce a plan, not an executable. So, from the example above, it could look like:
Select -- var(0)
\__ Project -- t.c
\__ Filter -- t.b > 127
\__ ScanKey : Custom -- expr = t, key = a, value = 49
And, the compiler would have a registry of 1:1 plan nodes (logical and physical) to implementations:
Select::class -> SelectImplDefault(it)
Project::class -> ProjectImplDefault(it)
Filter::class -> FilterImplDefault(it)
ScanKey::class -> SomeCustomImplProvidedByUser(it)
Not proposing this -- just conveying another option on how we can leverage strategies by introducing a "custom" rel node -- which would just be an empty interface (maybe with a string identifier).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, this is what the rules are though in the logical planning phase. Strategies are different than rules. While a rule maps plan-to-plan (calcite/volcano), the strategies map plans to expression (physical).
An interesting idea is the 1:1 to physical, but then we limit custom operators to what is available in the logical plans – meaning we may need more custom operators. This might be simplest though, but would introduce many more logical operators.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for clarity for users reading this in the future -- we discussed offline, and this implementation doesn't limit ourselves from introducing a custom logical node with a logical plan rewrite. This PR is the second half of the processing -- converting plan node(s) (either the built-in ones or the potential custom ones) into physical implementations. If users, however, wanted to introduce custom logical operators, they are not barred from doing that -- they can use their custom plan rewrites in combination with these strategies for whatever their use-case is.
a531e1b
to
38b199b
Compare
ea9af3c
to
4e6c177
Compare
[V1] Adds basic strategies to compiler (see #1625)
Description
This PR begins the strategy-based compiler which enables users to provide patterns for matching logical sub-tree operators, then the match is sent to a strategy which returns a custom physical operators.
This is similar to both calcite's rule-based logical optimizer and the physical operator strategies from spark.
Other Information
and Code Style Guidelines? YES
License Information
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.