feat(native): Add endpoint for expression optimization in sidecar #26475
base: master
Sorry @pramodsatya, your pull request is larger than the review limit of 150000 diff characters
d3f5a74 to 4224f36
@aditi-pandit, @tdcmeehan, could you please take a look?
case core::ExprKind::kDereference: {
  const auto* dereferenceTypedExpr =
      expr->asUnchecked<core::DereferenceTypedExpr>();
  return getRowExpression(dereferenceTypedExpr->inputs().at(0));
RowExpression models dereference as a special form. If the dereference is optimized away, wouldn't we expect that to happen in Velox? Here we should instead create a special form expression for that dereference.
Thanks for pointing it out, updated to return a special form expression of Dereference type.
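To make the suggestion above concrete, here is a minimal sketch of keeping a dereference in the converted output as a special form rather than returning only its input. The types are hypothetical stand-ins, not the real protocol classes, and the argument layout (row-valued input followed by a constant field index) is an assumption about how the DEREFERENCE form is shaped.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the Presto protocol types; the real
// protocol::RowExpression hierarchy is richer than this.
struct RowExpression {
  virtual ~RowExpression() = default;
};
using RowExpressionPtr = std::shared_ptr<RowExpression>;

struct VariableReference : RowExpression {
  std::string name;
  explicit VariableReference(std::string n) : name(std::move(n)) {}
};

struct ConstantInt : RowExpression {
  int value;
  explicit ConstantInt(int v) : value(v) {}
};

struct SpecialForm : RowExpression {
  std::string form;
  std::vector<RowExpressionPtr> arguments;
};

// Builds a DEREFERENCE special form instead of dropping the field access.
// Assumed argument layout: (row-valued input, index of the accessed field).
std::shared_ptr<SpecialForm> makeDereference(
    RowExpressionPtr input,
    int fieldIndex) {
  assert(input != nullptr);
  auto result = std::make_shared<SpecialForm>();
  result->form = "DEREFERENCE";
  result->arguments.push_back(std::move(input));
  result->arguments.push_back(std::make_shared<ConstantInt>(fieldIndex));
  return result;
}
```

With this shape, the converter would recurse into the dereference input as before and wrap the result, so the field access survives the round trip back to the coordinator.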
#include "velox/core/Expressions.h"
#include "velox/serializers/PrestoSerializer.h"

using namespace facebook::velox;
Velox guidelines disallow using namespace directives in header files. Please specify the fully qualified names below.
if (type->isPrimitiveType()) {
  boost::algorithm::to_lower(typeSignature);
} else {
  // toString for Row type results in characters like `"":` for constants,
Would be great to give some examples of Row type signatures.
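As an illustration of the kind of examples requested, the sketch below renders Presto-style signatures from a tiny, hypothetical type model. It does not use Velox's Type classes, and the separator and field-name conventions are assumptions rather than the converter's actual output.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical miniature type model, not Velox's Type hierarchy.
struct SimpleType {
  std::string name;                     // e.g. "bigint", "array", "row"
  std::vector<std::string> fieldNames;  // row only; "" marks an unnamed field
  std::vector<SimpleType> children;     // empty for primitive types
};

// Renders a lowercase signature; unnamed row fields print only their type.
std::string toSignature(const SimpleType& type) {
  if (type.children.empty()) {
    return type.name;
  }
  std::string out = type.name + "(";
  for (std::size_t i = 0; i < type.children.size(); ++i) {
    if (i > 0) {
      out += ",";
    }
    if (i < type.fieldNames.size() && !type.fieldNames[i].empty()) {
      out += type.fieldNames[i] + " ";
    }
    out += toSignature(type.children[i]);
  }
  return out + ")";
}
```

Under these assumptions, a row with fields a and b renders as row(a bigint,b varchar), and a row with a single unnamed bigint field renders as row(bigint).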
  return typeSignature;
}

std::shared_ptr<protocol::VariableReferenceExpression>
Maybe add a shorthand for this shared_ptr type also.
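A shorthand of this kind is usually a type alias. The sketch below shows the idea with a hypothetical stand-in struct; the alias name VariableReferenceExpressionPtr and the helper makeVariable are illustrative only and are not taken from the PR.

```cpp
#include <cassert>
#include <memory>
#include <string>

namespace protocol_sketch {
// Hypothetical stand-in for protocol::VariableReferenceExpression.
struct VariableReferenceExpression {
  std::string name;
  std::string type;
};
} // namespace protocol_sketch

// The alias keeps function signatures on a single line.
using VariableReferenceExpressionPtr =
    std::shared_ptr<protocol_sketch::VariableReferenceExpression>;

// Illustrative helper that returns the aliased pointer type.
VariableReferenceExpressionPtr makeVariable(
    const std::string& name,
    const std::string& type) {
  auto variable =
      std::make_shared<protocol_sketch::VariableReferenceExpression>();
  variable->name = name;
  variable->type = type;
  return variable;
}
```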
const std::string kWhen = "WHEN";

protocol::TypeSignature getTypeSignature(const TypePtr& type) {
  std::string typeSignature = type->toString();
Nit: shorten the variable name to signature.
// so they are constructed separately.
if (name == kSwitch) {
  result.arguments = getSwitchSpecialFormExpressionArgs(expr);
} else {
It seems this code presumes that expr is a special form expression. If so, it would be better to add a VELOX_CHECK for it.
There is only one caller for this function and the check if (isPrestoSpecialForm(exprName)) is present there:
case velox::core::ExprKind::kCall: {
  const auto* callTypedExpr =
      expr->asUnchecked<velox::core::CallTypedExpr>();
  // Check if special form expression or call expression.
  auto exprName = callTypedExpr->name();
  boost::algorithm::to_lower(exprName);
  if (isPrestoSpecialForm(exprName)) {
    return getSpecialFormExpression(callTypedExpr);
  }
  return getCallExpression(callTypedExpr);
}
Could you please confirm whether this is sufficient, or whether an additional check inside getSpecialFormExpression would be better?
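For comparison, the second option discussed in this thread (checking the precondition inside the function itself) could look roughly like the sketch below. It uses a plain exception in place of VELOX_CHECK so that it compiles without Velox; the special form list and the function names isSpecialFormName and convertSpecialForm are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <set>
#include <stdexcept>
#include <string>

// Lowercases a copy of the input, mirroring the to_lower call in the caller.
std::string toLower(std::string s) {
  std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return s;
}

// Hypothetical, incomplete list of special form names.
bool isSpecialFormName(const std::string& name) {
  static const std::set<std::string> kNames = {
      "and", "or", "if", "switch", "coalesce", "in"};
  return kNames.count(toLower(name)) > 0;
}

// Validates its own precondition instead of relying on the single caller.
// In Velox code the throw would be a VELOX_CHECK.
std::string convertSpecialForm(const std::string& name) {
  if (!isSpecialFormName(name)) {
    throw std::invalid_argument("Not a special form: " + name);
  }
  return "SPECIAL_FORM(" + toLower(name) + ")";
}
```

The duplicated check costs one set lookup per call, and it protects the function if a second caller is added later.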
Description
To support constant folding and consistent semantics between the Presto Java coordinator and the Presto C++ worker, expressions must be evaluated consistently on both sides. To that end, a native expression evaluation endpoint, v1/expressions, has been added to the Presto C++ sidecar. This endpoint leverages the ExprOptimizer in Velox to optimize Presto expressions.
The optimized Velox core::TypedExpr returned by Velox's ExprOptimizer is converted to a Presto protocol::RowExpression in the Presto native sidecar with the helper class VeloxToPrestoExprConverter. The end to end flow between the coordinator and sidecar is as follows:
If an error is encountered during Velox to Presto expression conversion, it is logged for further analysis and the unoptimized input RowExpression is returned instead. With the fuzzer testing (see test plan), we expect this endpoint to be ready for production workloads.
When the OptimizerLevel is EVALUATED, the endpoint throws for any error encountered during expression evaluation.
Motivation and Context
The native-sidecar-plugin will implement the ExpressionOptimizer interface from Presto SPI to utilize the v1/expressions endpoint on the sidecar for optimizing Presto expressions using the native expression evaluation engine.
The primary goal is to achieve consistency between C++ and Java semantics for expression optimization. With this change, C++ functions including UDFs can be used for constant folding of expressions in the Presto planner.
Please refer to RFC-0006.
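The error handling described in the Description (log and return the unoptimized input, or throw when the level is EVALUATED) can be sketched as follows. Expressions are modeled as plain strings, the Converter type and convertOrFallback function are hypothetical, and a single rethrow flag stands in for the OptimizerLevel check; the actual sidecar code may distinguish conversion errors from evaluation errors differently.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical converter signature; expressions are plain strings here.
using Converter = std::function<std::string(const std::string&)>;

// Returns the converted expression. On failure, either rethrows (standing in
// for the EVALUATED level) or logs the error and returns the input unchanged.
std::string convertOrFallback(
    const std::string& input,
    const Converter& convert,
    bool rethrow) {
  try {
    return convert(input);
  } catch (const std::exception& e) {
    if (rethrow) {
      throw;
    }
    std::cerr << "Conversion failed, returning input unchanged: " << e.what()
              << '\n';
    return input;
  }
}
```

Returning the input on failure keeps planning correct, since an unoptimized expression is still valid, at the cost of a missed optimization.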
Test Plan
Tests have been added by abstracting test cases from TestRowExpressionInterpreter into an interface, AbstractTestExpressionInterpreter. This test interface is implemented in TestNativeExpressionInterpreter to test the v1/expressions endpoint on the sidecar end to end.
Unit tests for simple expression conversions are also added in VeloxToPrestoExprConverter.cpp.
This feature is still in Beta. To support production workloads with confidence, the Velox expression fuzzer will be extended in a follow-up PR to test this endpoint with fuzzer-generated expressions, which should surface any remaining bugs.
Release Notes