Skip to content

Conversation

@pramodsatya
Copy link
Contributor

@pramodsatya pramodsatya commented Oct 30, 2025

Description

To support constant folding and consistent semantics between the Presto Java coordinator and the Presto C++ worker, it is necessary to use consistent expression evaluation. To support this, a native expression evaluation endpoint, v1/expressions, has been added to the Presto C++ sidecar. This endpoint leverages the ExprOptimizer in Velox to optimize Presto expressions.

The optimized Velox core::TypedExpr returned by Velox's ExprOptimizer is converted to a Presto protocol::RowExpression in the Presto native sidecar with helper class VeloxToPrestoExprConverter. The end to end flow between the coordinator and sidecar is as follows:

RowExpression == (PrestoToVeloxExpr) ==> TypedExpr == (Velox ExprOptimizer optimize() API) ==> optimized TypedExpr == (VeloxToPrestoExprConverter) ==> optimized RowExpression

If an error is encountered during Velox to Presto expression conversion, it is logged for further analysis and the unoptimized input RowExpression is returned instead. With the fuzzer testing (see test plan), we expect this endpoint to be ready for production workloads.
When the OptimizerLevel is EVALUATED, the endpoint throws for any error encountered during expression evaluation.

Motivation and Context

The native-sidecar-plugin will implement the ExpressionOptimizer interface from Presto SPI to utilize the v1/expressions endpoint on the sidecar for optimizing Presto expressions using the native expression evaluation engine.

The primary goal is to achieve consistency between C++ and Java semantics for expression optimization. With this change, C++ functions including UDFs can be used for constant folding of expressions in the Presto planner.
Please refer to RFC-0006.

Test Plan

Tests have been added by abstracting testcases from TestRowExpressionInterpreter to an interface AbstractTestExpressionInterpreter. This test interface is implemented in TestNativeExpressionInterpreter to test the v1/expressions endpoint on the sidecar end to end.
Unit tests for simple expression conversions are also added in VeloxToPrestoExprConverter.cpp.

This feature is still in Beta, and to support production workloads with complete certainty, the Velox expression fuzzer will be extended to test this endpoint with fuzzer generated expressions in a follow-up PR. This should surface any remaining bugs.

Release Notes

== NO RELEASE NOTE ==

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Oct 30, 2025
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @pramodsatya, your pull request is larger than the review limit of 150000 diff characters

@pramodsatya pramodsatya force-pushed the expr_opt branch 2 times, most recently from d3f5a74 to 4224f36 Compare October 30, 2025 23:00
@pramodsatya pramodsatya marked this pull request as ready for review October 30, 2025 23:00
@prestodb-ci prestodb-ci requested review from a team and pratyakshsharma and removed request for a team October 30, 2025 23:00
Copy link
Contributor

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @pramodsatya, your pull request is larger than the review limit of 150000 diff characters

@pramodsatya
Copy link
Contributor Author

@aditi-pandit, @tdcmeehan, could you please take a look?

Comment on lines 252 to 255
case core::ExprKind::kDereference: {
const auto* dereferenceTypedExpr =
expr->asUnchecked<core::DereferenceTypedExpr>();
return getRowExpression(dereferenceTypedExpr->inputs().at(0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RowExpression models dereference as a special form. If the dereference is optimized away, wouldn't we expect that to happen in Velox, and here would should instead create a special form expression for that dereference?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out, updated to return a special form expression of Dereference type.

#include "velox/core/Expressions.h"
#include "velox/serializers/PrestoSerializer.h"

using namespace facebook::velox;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Velox guidelines disallow using namespace directives in header files. Please specify the fully qualified names below.

if (type->isPrimitiveType()) {
boost::algorithm::to_lower(typeSignature);
} else {
// toString for Row type results in characters like `"":` for constants,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to give some examples of Row type signatures.

return typeSignature;
}

std::shared_ptr<protocol::VariableReferenceExpression>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a shorthand for this shared_ptr type also.

const std::string kWhen = "WHEN";

protocol::TypeSignature getTypeSignature(const TypePtr& type) {
std::string typeSignature = type->toString();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit : shorten the variable name to signature

// so they are constructed separately.
if (name == kSwitch) {
result.arguments = getSwitchSpecialFormExpressionArgs(expr);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this code presumes that expr is a special forma expression. If yes, then better to add a VELOX_CHECK for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is only one caller for this function and the check if (isPrestoSpecialForm(exprName)) is present there:

    case velox::core::ExprKind::kCall: {
      const auto* callTypedExpr =
          expr->asUnchecked<velox::core::CallTypedExpr>();
      // Check if special form expression or call expression.
      auto exprName = callTypedExpr->name();
      boost::algorithm::to_lower(exprName);
      if (isPrestoSpecialForm(exprName)) {
        return getSpecialFormExpression(callTypedExpr);
      }
      return getCallExpression(callTypedExpr);
    }

Could you please confirm if this is good or another check in getSpecialFormExpression would be better?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

from:IBM PR from IBM

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants