@yujun777
Contributor

@yujun777 yujun777 commented Nov 20, 2025

What problem does this PR solve?

cherry-pick: #56424, #56469, #56756, #56932, #57025, #56899, #56941, #57537

Release note

None

Check List (For Author)

  • Test

    • Regression test
    • Unit Test
    • Manual test (add detailed scripts or steps below)
    • No need to test or manual test. Explain why:
      • This is a refactor/code format and no logic has been changed.
      • Previous test can cover this change.
      • No code files have been changed.
      • Other reason
  • Behavior changed:

    • No.
    • Yes.
  • Does this need documentation?

    • No.
    • Yes.

Check List (For Reviewer who merge this PR)

  • Confirm the release note
  • Confirm test cases
  • Confirm document
  • Add branch pick label

…ache#56424)

For a CASE WHEN condition, an evaluation result of NULL or FALSE has the
same effect: the branch is not hit.

In most cases, NULL cannot be folded in a logical expression; for example,
`null and a = 1` and `null or a = 1` cannot be folded.
FALSE, however, can be folded: `false and a = 1` folds to
false, and `false or a = 1` folds to `a = 1`.

So if we replace NULL with FALSE in a CASE WHEN condition, the
expression may fold into a simpler one.

In fact, in a CASE/IF condition, NULL can be replaced with FALSE when it is
the expression root or all of its ancestors are AND / OR / CASE / IF conditions;
this rewrite does not change whether the branch is hit.

For example, for the SQL 'case when null and a > 1 then ...':
1. this rule rewrites it to 'case when false and a > 1 then ...',
2. then the constant folding rule rewrites it to 'case when false then ...',
3. then CASE WHEN can remove this branch, since its condition is false.
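
The argument above can be checked with a small model of SQL three-valued logic. This is only a sketch in Python, not the Nereids implementation: `None` stands in for SQL NULL, and the helper names `and3`, `or3`, and `branch_hit` are made up for illustration.

```python
# SQL three-valued logic, with Python None standing in for SQL NULL.
def and3(x, y):
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def or3(x, y):
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def branch_hit(cond):
    # A CASE WHEN branch is taken only when its condition is TRUE.
    return cond is True

VALUES = (True, False, None)

# Under AND / OR, NULL and FALSE lead to the same hit / no-hit decision.
for p in VALUES:
    assert branch_hit(and3(None, p)) == branch_hit(and3(False, p))
    assert branch_hit(or3(None, p)) == branch_hit(or3(False, p))

# After the replacement the constant folder has something to work with:
# FALSE AND p is FALSE for every p, whereas NULL AND p still depends on p.
assert all(and3(False, p) is False for p in VALUES)
assert {and3(None, p) for p in VALUES} == {None, False}
```

The last two assertions show why the replacement helps: `false and a > 1` is constant, while `null and a > 1` is not.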
…to true/false (apache#56469)

For a nested CASE WHEN, replace a condition in the inner CASE with
TRUE/FALSE when the same condition also appears in an outer CASE condition:

1. if it exists in outer case's current branch condition, replace it with TRUE:
case when A then (case when A and B then 1 else 2 end)
 ...
end

The inner case condition A is then replaced with TRUE:
case when A then  (case when TRUE and B then 1 else 2 end)
...
end

2. if it exists in outer case's previous branch condition, replace it with FALSE:
case when A then C 
     when B then (case when A and D then 1 else 2 end)                 
 ...
end

The inner case condition A is then replaced with FALSE:
case when A then C 
     when B then (case when FALSE and D then 1 else 2 end)                    
...
end
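
The two cases above can be sketched on a toy expression tree. The representation is an assumption made for illustration only: a condition is a tuple of atom names joined by AND, `Case` and `rewrite` are hypothetical names, and only single-atom outer conditions are tracked as "previous branch" conditions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Case:
    branches: tuple  # ((condition, value), ...); a condition is a tuple of ANDed atoms
    default: object = None

def rewrite(case, true_atoms=frozenset(), false_atoms=frozenset()):
    """Replace atoms already decided by enclosing CASE branches with TRUE / FALSE."""
    out = []
    earlier = set(false_atoms)  # conditions of branches that were not hit
    for cond, value in case.branches:
        new_cond = tuple(
            True if a in true_atoms else False if a in false_atoms else a
            for a in cond
        )
        if isinstance(value, Case):
            # Inside this branch every conjunct of cond holds, and every
            # earlier single-atom condition failed to hit.
            value = rewrite(value, true_atoms | frozenset(cond), frozenset(earlier))
        out.append((new_cond, value))
        if len(cond) == 1:
            earlier.add(cond[0])
    default = case.default
    if isinstance(default, Case):
        default = rewrite(default, true_atoms, frozenset(earlier))
    return Case(tuple(out), default)

# 1. case when A then (case when A and B then 1 else 2 end) end
ex1 = Case(((("A",), Case(((("A", "B"), 1),), 2)),))
# 2. case when A then C when B then (case when A and D then 1 else 2 end) end
ex2 = Case(((("A",), "C"), (("B",), Case(((("A", "D"), 1),), 2))))

assert rewrite(ex1).branches[0][1].branches[0][0] == (True, "B")
assert rewrite(ex2).branches[1][1].branches[0][0] == (False, "D")
```

After this step a constant folder can reduce `TRUE and B` to `B` and `FALSE and D` to `FALSE`.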

This PR also optimizes folding of CASE WHEN and IF. For a CASE WHEN /
IF expression, if all of its branch values are equal, it is rewritten to
that value.
For a boolean CASE WHEN expression, if all of its WHEN clauses' results
are true / false literals, the CASE WHEN can be rewritten to an AND
/ OR expression.

For example:

case when a = 1 then true when b = 1 then false else c = 1 end
is rewritten to:  (a = 1) <=> true or (not((b = 1) <=> true) and c = 1)

if (a = 1, true, b = 1)
is rewritten to:  (a = 1) <=> true or b = 1
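
Both rewrites can be checked exhaustively over TRUE / FALSE / NULL. The following is a sketch in Python with `None` as SQL NULL; `p`, `q`, and `r` stand for the results of `a = 1`, `b = 1`, and `c = 1`, and all function names are illustrative.

```python
from itertools import product

def not3(x):
    return None if x is None else (not x)

def and3(x, y):
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def or3(x, y):
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def nse(x, y):
    """Null-safe equal (<=>) on truth values: never returns NULL."""
    return x is y

def case_original(p, q, r):
    # case when p then true when q then false else r end
    if p is True:
        return True
    if q is True:
        return False
    return r

def case_rewritten(p, q, r):
    # p <=> true or (not(q <=> true) and r)
    return or3(nse(p, True), and3(not3(nse(q, True)), r))

def if_original(p, q):
    # if(p, true, q)
    return True if p is True else q

def if_rewritten(p, q):
    # p <=> true or q
    return or3(nse(p, True), q)

VALUES = (True, False, None)
assert all(case_original(*v) is case_rewritten(*v) for v in product(VALUES, repeat=3))
assert all(if_original(*v) is if_rewritten(*v) for v in product(VALUES, repeat=2))
```

The `<=> true` wrapper is what makes the rewrite exact: it turns a NULL condition into FALSE, matching the fact that a NULL condition does not hit its branch.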
1. Add constant folding for the nullif function;
2. Optimize folding of nvl: `nvl(a, a)` => `a`, `nvl(a, null)` => `a`;
3. Make AggregateFunction and TableGeneratingFunction non-foldable.
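
The nvl identities and the nullif folding can be sanity-checked with a small model. This sketch assumes the usual SQL semantics of the two functions and uses Python `None` for NULL; it does not reflect how the folding is implemented in Doris.

```python
def nvl(a, b):
    """SQL nvl: the first argument unless it is NULL, else the second."""
    return b if a is None else a

def nullif(a, b):
    """SQL nullif: NULL when a = b evaluates to TRUE, otherwise a."""
    if a is not None and b is not None and a == b:
        return None
    return a

DOMAIN = (None, 0, 1, 2)
for a in DOMAIN:
    assert nvl(a, a) == a      # nvl(a, a)    => a
    assert nvl(a, None) == a   # nvl(a, null) => a

# With literal arguments nullif has a constant result, so it can be folded.
assert nullif(1, 1) is None
assert nullif(1, 2) == 1
assert nullif(None, 1) is None
```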
…when (apache#57025)

For a condition expression, replace NULL with FALSE and null-safe equal
with equal. Conditions include filter conditions, join conditions, IF
conditions, and CASE WHEN conditions.

Within a condition expression, only replace those sub-expressions whose
ancestors, up to the condition root, are all AND / OR / CASE WHEN /
IF. So, for a filter expression containing two nulls where the second sits
under a NOT, the first null can be replaced, but the second cannot, because
its parent is NOT, which is not among AND / OR / CASE WHEN / IF.
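
Why NOT blocks the replacement can be seen on a made-up filter. The expression `(null or p) and not(null or q)` is a hypothetical example chosen for this sketch, not the one from the original commit message; `None` stands in for SQL NULL.

```python
def not3(x):
    return None if x is None else (not x)

def and3(x, y):
    if x is False or y is False:
        return False
    if x is None or y is None:
        return None
    return True

def or3(x, y):
    if x is True or y is True:
        return True
    if x is None or y is None:
        return None
    return False

def keeps_row(cond):
    # A filter keeps a row only when the condition is TRUE.
    return cond is True

def original(p, q):
    return and3(or3(None, p), not3(or3(None, q)))

def first_null_replaced(p, q):
    # The first null has only AND / OR ancestors.
    return and3(or3(False, p), not3(or3(None, q)))

def second_null_replaced(p, q):
    # The second null sits under NOT.
    return and3(or3(None, p), not3(or3(False, q)))

VALUES = (True, False, None)
pairs = [(p, q) for p in VALUES for q in VALUES]

# Replacing the first null never changes which rows pass the filter.
assert all(keeps_row(original(p, q)) == keeps_row(first_null_replaced(p, q)) for p, q in pairs)
# Replacing the second null does: NOT(NULL) is NULL, but NOT(FALSE) is TRUE.
assert not keeps_row(original(True, False))
assert keeps_row(second_null_replaced(True, False))
```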

For an expression `x <=> y` in a condition, if one of x and y is
not-nullable, it can be rewritten to `x = y`. Note that if the expression
is not a condition, rewriting `x <=> y` to `x = y` requires that both x
and y are not-nullable.
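
The difference between the two requirements shows up when comparing values against hit decisions. This is a minimal sketch with `None` as SQL NULL; `eq3`, `nse`, and `hit` are illustrative names.

```python
def eq3(x, y):
    """SQL `=`: NULL if either side is NULL."""
    return None if x is None or y is None else x == y

def nse(x, y):
    """SQL `<=>` (null-safe equal): never returns NULL."""
    if x is None or y is None:
        return x is None and y is None
    return x == y

def hit(cond):
    return cond is True

NULLABLE = (None, 1, 2)
NOT_NULL = (1, 2)

# One side not-nullable: the two operators agree on hit / no hit ...
assert all(hit(nse(x, y)) == hit(eq3(x, y)) for x in NOT_NULL for y in NULLABLE)
# ... but not on the value, so outside a condition this is not enough.
assert nse(1, None) is False and eq3(1, None) is None
# Both sides not-nullable: identical values, safe anywhere.
assert all(nse(x, y) is eq3(x, y) for x in NOT_NULL for y in NOT_NULL)
# Both sides nullable: even the hit decision differs.
assert hit(nse(None, None)) and not hit(eq3(None, None))
```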
…che#56899)

Simplify equality expressions that compare against a true / false literal.
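
The exact rewrites are not listed here, so the following is only a guess at two plausible instances: `x = true` => `x` and `x = false` => `not x`. The sketch checks that both hold under three-valued logic, with `None` as SQL NULL.

```python
def eq3(x, y):
    """SQL `=`: NULL if either side is NULL."""
    return None if x is None or y is None else x == y

def not3(x):
    return None if x is None else (not x)

VALUES = (True, False, None)

for x in VALUES:
    assert eq3(x, True) is x          # x = true   =>  x
    assert eq3(x, False) is not3(x)   # x = false  =>  not x
```

Both identities hold for NULL as well, so these two rewrites would not need a nullability check.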
…6941)

Push the enclosing function into the CASE WHEN branches.

For an expression f with n arguments a1, a2, ..., an, if one of its
arguments is case when/if/nvl/nullif
and the others are literals, then f can be pushed into each branch of the
case when/if/nvl/nullif.
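
A rough sketch of the push-down follows, under simplifying assumptions: the CASE is the first argument of f, its branch values are literals (so `f(value, literals)` can be folded eagerly), and f is deterministic. The tuple representation and the names `evaluate` and `push_into_branches` are hypothetical.

```python
def evaluate(case, row):
    """Evaluate a CASE given as ([(condition_fn, value), ...], default)."""
    branches, default = case
    for cond, value in branches:
        if cond(row) is True:
            return value
    return default

def push_into_branches(f, case, *literals):
    """Rewrite f(CASE ..., literals) into a CASE whose branch values are f(value, literals)."""
    branches, default = case
    return ([(cond, f(v, *literals)) for cond, v in branches], f(default, *literals))

# Hypothetical input: concat(case when a = 1 then 'x' when a = 2 then 'y' else 'z' end, '!')
case = ([(lambda r: r["a"] == 1, "x"), (lambda r: r["a"] == 2, "y")], "z")
concat = lambda s, suffix: s + suffix
pushed = push_into_branches(concat, case, "!")

# Applying f outside the CASE and evaluating the pushed-down CASE agree on every row.
for a in (1, 2, 3):
    row = {"a": a}
    assert concat(evaluate(case, row), "!") == evaluate(pushed, row)
```

The payoff is that each branch now holds a function call over literals only, which constant folding can reduce to a single literal.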
@yujun777 yujun777 requested a review from yiguolei as a code owner November 20, 2025 08:50
@Thearas
Contributor

Thearas commented Nov 20, 2025

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@yujun777
Contributor Author

run buildall

@yujun777
Contributor Author

run feut

@yujun777 yujun777 marked this pull request as draft November 21, 2025 02:49
@yujun777 yujun777 changed the title from "branch-4.0: [feat](nereids) optimize case when expression" to "branch-4.0: [draft](nereids) optimize case when expression" Nov 21, 2025


2 participants