
Add OPL query-engine starts_with and ends_with functions#2825

Draft
truffle-dev wants to merge 1 commit into open-telemetry:main from
truffle-dev:feat-opl-starts-with-ends-with


@truffle-dev

Closes #2819

Wires the upstream datafusion starts_with and ends_with UDFs into the OPL query engine via the existing InvokeFunctionExpr path. Each function adds:

  • A function-name constant in consts.rs
  • A parser registration with two parameter placeholders in parser.rs::default_parser_options
  • A from_func_name arm in DataFusionFunctionDef (expr.rs) returning ExprLogicalType::Boolean with requires_dict_downcast: true, matching the sha256 wiring
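The three pieces listed above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code from this PR: the constant names, the struct fields, and the shape of `ExprLogicalType` are hypothetical stand-ins, and the real definitions in `consts.rs` and `expr.rs` may differ. Only `from_func_name`, `ExprLogicalType::Boolean`, and `requires_dict_downcast: true` come from the description.

```rust
// Hypothetical stand-ins for the function-name constants in consts.rs.
const STARTS_WITH_FUNC_NAME: &str = "starts_with";
const ENDS_WITH_FUNC_NAME: &str = "ends_with";

// Assumed, simplified shape of the logical result type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ExprLogicalType {
    Boolean,
}

// Assumed, simplified shape of the function definition record.
#[derive(Debug, PartialEq)]
struct DataFusionFunctionDef {
    name: &'static str,
    return_type: ExprLogicalType,
    requires_dict_downcast: bool,
}

impl DataFusionFunctionDef {
    // Maps a parsed function name to its definition; unknown names yield None.
    fn from_func_name(name: &str) -> Option<Self> {
        match name {
            STARTS_WITH_FUNC_NAME => Some(Self {
                name: STARTS_WITH_FUNC_NAME,
                return_type: ExprLogicalType::Boolean,
                requires_dict_downcast: true,
            }),
            ENDS_WITH_FUNC_NAME => Some(Self {
                name: ENDS_WITH_FUNC_NAME,
                return_type: ExprLogicalType::Boolean,
                requires_dict_downcast: true,
            }),
            _ => None,
        }
    }
}

fn main() {
    for name in ["starts_with", "ends_with", "no_such_function"] {
        println!("{name}: {:?}", DataFusionFunctionDef::from_func_name(name));
    }
}
```

Keeping both functions as separate arms with identical settings mirrors the description's note that each one is wired the same way as `sha256`, differing only in name.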

Example queries that now work:

logs | where starts_with(attributes["x"], "prefix")
logs | where ends_with(event_name, "suffix")
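As a rough mental model of what these filters compute per row, the sketch below applies the predicates to optional string values in plain Rust. It assumes that a null input produces a null result and that `where` keeps only rows whose predicate is true; the helper names and sample rows are hypothetical, and the real engine evaluates Arrow arrays through DataFusion rather than row by row.

```rust
// Per-row model of starts_with: a missing (null) input gives a null result.
fn starts_with(value: Option<&str>, prefix: Option<&str>) -> Option<bool> {
    Some(value?.starts_with(prefix?))
}

// Per-row model of ends_with with the same null handling.
fn ends_with(value: Option<&str>, suffix: Option<&str>) -> Option<bool> {
    Some(value?.ends_with(suffix?))
}

// Model of `where starts_with(column, prefix)`: keep rows where the
// predicate is exactly true; null and false rows are dropped.
fn where_starts_with<'a>(rows: &[Option<&'a str>], prefix: &str) -> Vec<&'a str> {
    rows.iter()
        .filter_map(|row| match starts_with(*row, Some(prefix)) {
            Some(true) => *row,
            _ => None,
        })
        .collect()
}

fn main() {
    let event_names = [Some("prefix-login"), None, Some("logout"), Some("prefix")];
    println!("{:?}", where_starts_with(&event_names, "prefix"));
    println!("{:?}", ends_with(Some("user.suffix"), Some("suffix")));
}
```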

Tests

  • Unit tests in expr.rs build the InvokeFunctionScalarExpression directly, plan, execute against a Logs record batch, and assert a BooleanArray result. Patterned on test_function_invocation_sha256.
  • End-to-end OPL filter tests in filter.rs cover event_name and attributes["..."] arguments, with the column on either side of the predicate.

Validation

  • cargo check -p otap-df-query-engine: clean
  • cargo test -p otap-df-query-engine: 548 passed (4 new filter tests, 2 new expr tests)
  • cargo clippy -p otap-df-query-engine --all-targets -- -D warnings: clean
  • cargo fmt --all -- --check: clean
  • cargo xtask quick-check: clean

Notes

Tests on the body field are intentionally omitted because the OTLP body is heterogeneous (an AnyValue with string and int variants), and the upstream datafusion UDFs reject mixed types directly. contains works there because it has a custom string-coercing wrapper UDF; aligning starts_with/ends_with to that wrapper pattern is a follow-up beyond the scope of #2819, which asks specifically for the upstream UDFs.
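To illustrate the difference between the two approaches, here is a small sketch contrasting a strict predicate that rejects non-string values with a wrapper that coerces to a string first. The `AnyValue` enum, both function names, and the choice to stringify integers are assumptions made for illustration; the actual `contains` wrapper is not shown in this PR and may behave differently.

```rust
// Simplified, hypothetical model of a heterogeneous body value.
#[derive(Debug, Clone, PartialEq)]
enum AnyValue {
    Str(String),
    Int(i64),
}

// Strict variant: only string inputs are accepted, loosely modelling a
// UDF that rejects mixed types.
fn starts_with_strict(body: &AnyValue, prefix: &str) -> Result<bool, String> {
    match body {
        AnyValue::Str(s) => Ok(s.starts_with(prefix)),
        other => Err(format!("unsupported input type: {other:?}")),
    }
}

// Coercing variant: every value is rendered as a string before matching
// (assumed behaviour for a string-coercing wrapper).
fn starts_with_coerced(body: &AnyValue, prefix: &str) -> bool {
    let text = match body {
        AnyValue::Str(s) => s.clone(),
        AnyValue::Int(i) => i.to_string(),
    };
    text.starts_with(prefix)
}

fn main() {
    let bodies = [AnyValue::Str("error: disk full".to_string()), AnyValue::Int(404)];
    for body in &bodies {
        println!(
            "{body:?}: strict={:?} coerced={}",
            starts_with_strict(body, "4"),
            starts_with_coerced(body, "4")
        );
    }
}
```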

Wires the upstream datafusion `starts_with` and `ends_with` UDFs into
the OPL query engine via the existing `InvokeFunctionExpr` path. Both
return Boolean and use `requires_dict_downcast: true`, matching the
sha256 wiring. Each adds a function-name constant, a parser registration
with two parameter placeholders, and a `from_func_name` arm in the
DataFusion planner.

Tests: unit tests in `expr.rs` (direct `InvokeFunctionExpr` planning +
execution against logs `event_name`) and end-to-end OPL filter tests in
`filter.rs` covering both string and attribute scopes, with column on
either side of the predicate.

Closes open-telemetry#2819

Signed-off-by: truffle <truffleagent@gmail.com>
@truffle-dev truffle-dev requested a review from a team as a code owner May 4, 2026 20:45
@linux-foundation-easycla

linux-foundation-easycla Bot commented May 4, 2026

CLA Not Signed

@github-actions github-actions Bot added labels on May 4, 2026: rust (Pull requests that update Rust code), query-engine (Query Engine / Transform related tasks), query-engine-columnar (Columnar query engine which uses DataFusion to process OTAP Batches)
Member

@albertlockett albertlockett left a comment


@truffle-dev can you please sign the CLA? #2825 (comment)

@codecov

codecov Bot commented May 5, 2026

Codecov Report

❌ Patch coverage is 97.76536% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 86.05%. Comparing base (b911578) to head (4f609e6).
⚠️ Report is 27 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2825      +/-   ##
==========================================
+ Coverage   86.04%   86.05%   +0.01%     
==========================================
  Files         704      704              
  Lines      264654   264833     +179     
==========================================
+ Hits       227719   227903     +184     
+ Misses      36411    36406       -5     
  Partials      524      524              
| Components | Coverage Δ |
|---|---|
| otap-dataflow | 86.99% <97.76%> (+0.01%) ⬆️ |
| query_abstraction | 80.61% <ø> (ø) |
| query_engine | 90.76% <ø> (ø) |
| otel-arrow-go | 52.45% <ø> (ø) |
| quiver | 92.25% <ø> (ø) |

@jmacd
Contributor

jmacd commented May 7, 2026

I will mark this draft @truffle-dev; this is a welcome change, thank you.

@jmacd jmacd marked this pull request as draft May 7, 2026 04:27
auto-merge was automatically disabled May 7, 2026 04:27

Pull request was converted to draft

@truffle-dev
Author

Thanks for the patience. The EasyCLA failure traces to LFX needing a fresh GitHub OAuth login on truffle-dev, which the account can't complete until 2FA is enabled. Working on getting that flipped. Marking it draft is reasonable while that's pending; I'll sign and re-request review once 2FA is in place.



Development

Successfully merging this pull request may close these issues.

[OPL/OTAP query-engine functions] starts_with / ends_with

3 participants