@gwenaskell gwenaskell commented Oct 20, 2025

Summary

The Datadog query parser misinterprets queries that mix AND and OR operators.

For example:

  • it interprets A OR NOT B as A AND NOT B
  • it interprets A AND B OR C as A AND (B OR C)
  • it interprets A OR B AND C as B AND C
  • it interprets A OR B C as C
  • it interprets A B OR C, as well as A AND B OR C as A

This PR fixes this behavior, while simplifying the parser logic.
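To make the intended groupings concrete, here is a minimal sketch of a precedence-aware parser. It is not the code in this PR. It assumes the conventional precedence (NOT binds tightest, then AND, then OR) and that two adjacent terms mean an implicit AND. The names `Expr`, `Parser`, `render`, and `parse` are hypothetical and exist only for this illustration.

```rust
// Illustrative only: a tiny recursive-descent parser, not the parser in this PR.
// Assumption: NOT binds tightest, then AND (explicit or implicit), then OR.

#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Term(String),
    Not(Box<Expr>),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

struct Parser<'a> {
    tokens: Vec<&'a str>,
    pos: usize,
}

impl<'a> Parser<'a> {
    fn new(input: &'a str) -> Self {
        Parser { tokens: input.split_whitespace().collect(), pos: 0 }
    }

    fn peek(&self) -> Option<&'a str> {
        self.tokens.get(self.pos).copied()
    }

    // or_expr := and_expr ("OR" and_expr)*
    fn parse_or(&mut self) -> Option<Expr> {
        let mut left = self.parse_and()?;
        while self.peek() == Some("OR") {
            self.pos += 1;
            let right = self.parse_and()?;
            left = Expr::Or(Box::new(left), Box::new(right));
        }
        Some(left)
    }

    // and_expr := unary (("AND")? unary)*
    fn parse_and(&mut self) -> Option<Expr> {
        let mut left = self.parse_unary()?;
        loop {
            match self.peek() {
                None | Some("OR") => break,
                Some("AND") => self.pos += 1, // explicit AND: consume the keyword
                Some(_) => {}                 // adjacency: implicit AND
            }
            let right = self.parse_unary()?;
            left = Expr::And(Box::new(left), Box::new(right));
        }
        Some(left)
    }

    // unary := "NOT" unary | term
    fn parse_unary(&mut self) -> Option<Expr> {
        match self.peek()? {
            "NOT" => {
                self.pos += 1;
                Some(Expr::Not(Box::new(self.parse_unary()?)))
            }
            "AND" | "OR" => None, // an operator where a term was expected
            term => {
                self.pos += 1;
                Some(Expr::Term(term.to_string()))
            }
        }
    }
}

// Renders the tree with explicit parentheses so the grouping is visible.
fn render(e: &Expr) -> String {
    match e {
        Expr::Term(t) => t.clone(),
        Expr::Not(inner) => format!("NOT {}", render(inner)),
        Expr::And(l, r) => format!("({} AND {})", render(l), render(r)),
        Expr::Or(l, r) => format!("({} OR {})", render(l), render(r)),
    }
}

// Parses a whitespace-separated query; returns None on malformed input.
fn parse(input: &str) -> Option<String> {
    let mut p = Parser::new(input);
    let expr = p.parse_or()?;
    if p.pos == p.tokens.len() { Some(render(&expr)) } else { None }
}
```

Under these assumptions, `parse("A AND B OR C")` groups as `((A AND B) OR C)` and `parse("A OR B C")` groups as `(A OR (B AND C))`, which are the shapes the examples in the list above fail to produce.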

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Added and updated unit tests

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on
    our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Our CONTRIBUTING.md is a good starting place.
  • If this PR introduces changes to LICENSE-3rdparty.csv, please
    run dd-rust-license-tool write and commit the changes. More details here.
  • For new VRL functions, please also create a sibling PR in Vector to document the new function.

References

@gwenaskell gwenaskell requested a review from a team as a code owner October 20, 2025 11:57
@gwenaskell gwenaskell force-pushed the yoenn.burban/OPA-3967-fix-datadog-query-parser branch from 1541953 to e82fc30 Compare October 20, 2025 12:03
@gwenaskell gwenaskell requested a review from bruceg October 20, 2025 13:57
Member

bruceg commented Oct 20, 2025

Before reviewing, I would argue that, as written above, this is a breaking change, because it could change the behavior of existing queries.

Member

@bruceg bruceg left a comment

This does look like a definite fix and good simplification of the logic. It does have me wondering if it would need to be something recursive, however, to fit all possible cases of nested parentheses. Could you comment on that?

Rule::PLUS => (),
Rule::NOT => {
    modifier = Some(LuceneOccur::MustNot);
    is_not = true;

Member

Does this need to toggle? i.e. is_not = !is_not;

Author

@gwenaskell gwenaskell Oct 21, 2025

Like a query containing "NOT NOT"? I just checked: the lexer does not allow it. It fails on "NOT -foo" and interprets NOT NOT foo as NOT "NOT" foo.

Member

That still makes me nervous, given the prevalence of double-negation in other such expression parsers, but it's good to know it's at least not wrong.
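The difference between the two options discussed here can be sketched as follows. This is a hypothetical illustration, not code from the PR: the `Token` enum stands in for the `Rule` variants in the diff, and both function names are invented. The two approaches agree whenever at most one NOT precedes a term, which is the only case the lexer is reported to produce. They diverge only if a grammar ever allowed consecutive NOT tokens.

```rust
// Hypothetical stand-in for the `Rule` variants shown in the diff.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Not,
    Term,
}

// Mirrors the diff: any NOT in the prefix sets the flag.
fn negated_by_assignment(prefix: &[Token]) -> bool {
    let mut is_not = false;
    for t in prefix {
        if *t == Token::Not {
            is_not = true;
        }
    }
    is_not
}

// The reviewer's alternative: each NOT flips the flag, so two NOTs cancel out.
fn negated_by_toggle(prefix: &[Token]) -> bool {
    let mut is_not = false;
    for t in prefix {
        if *t == Token::Not {
            is_not = !is_not;
        }
    }
    is_not
}
```

For a single `Token::Not` both functions return `true`. For two consecutive `Token::Not` values, assignment still returns `true` while toggling returns `false`, which is the double-negation case the reviewer is worried about.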

Author

gwenaskell commented Oct 21, 2025

> It does have me wondering if it would need to be something recursive, however, to fit all possible cases of nested parentheses. Could you comment on that?

Parentheses are interpreted by the lexer as the beginning of a sub-clause, and the parser is already recursive (visit_query -> visit_clause -> visit_query).
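The mutual recursion described above can be sketched like this. Only the names `visit_query` and `visit_clause` come from the comment. The `Query` and `Clause` types, the function signatures, and the depth tracking are assumptions made for illustration and do not reflect the real parser's types.

```rust
// Hypothetical parse tree: assume the lexer has already turned "( ... )"
// into a nested Query stored inside a Clause.
struct Query {
    clauses: Vec<Clause>,
}

enum Clause {
    Term(String),
    Sub(Query), // a parenthesized sub-query
}

// Visits every clause of a query at the current nesting depth.
fn visit_query(query: &Query, depth: usize, out: &mut Vec<(String, usize)>) {
    for clause in &query.clauses {
        visit_clause(clause, depth, out);
    }
}

// A term is recorded with its depth; a sub-clause recurses back into
// visit_query one level deeper, so arbitrary nesting is handled.
fn visit_clause(clause: &Clause, depth: usize, out: &mut Vec<(String, usize)>) {
    match clause {
        Clause::Term(t) => out.push((t.clone(), depth)),
        Clause::Sub(inner) => visit_query(inner, depth + 1, out),
    }
}
```

Because `visit_clause` calls back into `visit_query` for every sub-clause, a query shaped like `A (B (C))` is walked to any depth without special-casing the number of parentheses.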

Author

@pront could you review this as well?

@gwenaskell gwenaskell requested a review from pront October 27, 2025 17:11
Member

@pront pront left a comment

Looks good. Thanks for improving the tests.
