Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCEV] Disallow simplifying phi(undef, X) to X #115109

Merged
merged 4 commits into from
Nov 7, 2024

Conversation

dtcxzyw
Copy link
Member

@dtcxzyw dtcxzyw commented Nov 6, 2024

See the following case:

@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
  1. simplifyInstruction(%retval.2.peel) returns @GlobIntONE. Thus, ScalarEvolution::createNodeForPHI returns SCEV expr @GlobIntONE for %retval.2.peel.
  2. SimplifyIndvar::replaceIVUserWithLoopInvariant tries to replace the use of %retval.2.peel in %retval.2.lcssa.ph with @GlobIntONE.
  3. simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand reuses %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ] to generate code for @GlobIntONE. It is incorrect.

This patch disallows simplifying phi(undef, X) to X by setting CanUseUndef to false.
Closes #114879.

@llvmbot
Copy link
Collaborator

llvmbot commented Nov 6, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Yingwei Zheng (dtcxzyw)

Changes

See the following case:

@<!-- -->GlobIntONE = global i32 0, align 4

define ptr @<!-- -->src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @<!-- -->GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @<!-- -->GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @<!-- -->tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @<!-- -->GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
  1. simplifyInstruction(%retval.2.peel) returns @<!-- -->GlobIntONE. Thus, ScalarEvolution::createNodeForPHI returns SCEV expr @<!-- -->GlobIntONE for %retval.2.peel.
  2. SimplifyIndvar::replaceIVUserWithLoopInvariant tries to replace the use of %retval.2.peel in %retval.2.lcssa.ph with @<!-- -->GlobIntONE.
  3. simplifyLoopAfterUnroll -&gt; simplifyLoopIVs -&gt; SCEVExpander::expand reuses %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @<!-- -->GlobIntONE, %cleanup.loopexit.peel ] to generate code for @<!-- -->GlobIntONE. It is incorrect.

This patch disallows simplifying phi(undef, X) to X by setting CanUseUndef to false.
Closes #114879.


Full diff: https://github.com/llvm/llvm-project/pull/115109.diff

7 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+5-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll (+2-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/no-iv-rewrite.ll (+4-2)
  • (modified) llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll (+8-8)
  • (added) llvm/test/Transforms/LoopUnroll/pr114879.ll (+62)
  • (modified) llvm/test/Transforms/LoopVectorize/uniform-blend.ll (+32-9)
  • (modified) llvm/unittests/Analysis/ScalarEvolutionTest.cpp (+1-1)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 58e23e9556f144..b10811133770e1 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -6023,7 +6023,11 @@ const SCEV *ScalarEvolution::createNodeForPHI(PHINode *PN) {
   if (const SCEV *S = createAddRecFromPHI(PN))
     return S;
 
-  if (Value *V = simplifyInstruction(PN, {getDataLayout(), &TLI, &DT, &AC}))
+  // We do not allow simplifying phi (undef, X) to X here, to avoid reusing the
+  // phi node for X.
+  if (Value *V = simplifyInstruction(
+          PN, {getDataLayout(), &TLI, &DT, &AC, /*CtxI=*/nullptr,
+               /*UseInstrInfo=*/true, /*CanUseUndef=*/false}))
     return getSCEV(V);
 
   if (const SCEV *S = createNodeFromSelectLikePHI(PN))
diff --git a/llvm/test/Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll b/llvm/test/Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll
index 856fc376204995..0538c1c64de34d 100644
--- a/llvm/test/Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll
+++ b/llvm/test/Transforms/IndVarSimplify/invalidate-modified-lcssa-phi.ll
@@ -48,13 +48,14 @@ define i8 @test_pr52023(i1 %c.1, i1 %c.2) {
 ; CHECK-NEXT:    br label [[LOOP_1:%.*]]
 ; CHECK:       loop.1:
 ; CHECK-NEXT:    [[INC79:%.*]] = phi i8 [ [[TMP0:%.*]], [[LOOP_1_LATCH:%.*]] ], [ 0, [[ENTRY:%.*]] ]
-; CHECK-NEXT:    [[TMP0]] = add i8 [[INC79]], 1
+; CHECK-NEXT:    [[TMP1:%.*]] = add i8 [[INC79]], 1
 ; CHECK-NEXT:    br label [[LOOP_2:%.*]]
 ; CHECK:       loop.2:
 ; CHECK-NEXT:    br i1 [[C_1:%.*]], label [[LOOP_2_LATCH:%.*]], label [[LOOP_1_LATCH]]
 ; CHECK:       loop.2.latch:
 ; CHECK-NEXT:    br label [[LOOP_1_LATCH]]
 ; CHECK:       loop.1.latch:
+; CHECK-NEXT:    [[TMP0]] = phi i8 [ [[TMP1]], [[LOOP_2_LATCH]] ], [ undef, [[LOOP_2]] ]
 ; CHECK-NEXT:    br i1 [[C_2:%.*]], label [[EXIT:%.*]], label [[LOOP_1]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    [[INC_LCSSA_LCSSA:%.*]] = phi i8 [ [[TMP0]], [[LOOP_1_LATCH]] ]
diff --git a/llvm/test/Transforms/IndVarSimplify/no-iv-rewrite.ll b/llvm/test/Transforms/IndVarSimplify/no-iv-rewrite.ll
index 579b8536cedf03..5339468fe8036b 100644
--- a/llvm/test/Transforms/IndVarSimplify/no-iv-rewrite.ll
+++ b/llvm/test/Transforms/IndVarSimplify/no-iv-rewrite.ll
@@ -260,11 +260,13 @@ define void @identityphi(i32 %limit) nounwind {
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    br label [[LOOP:%.*]]
 ; CHECK:       loop:
-; CHECK-NEXT:    br i1 undef, label [[IF_THEN:%.*]], label [[CONTROL:%.*]]
+; CHECK-NEXT:    [[IV:%.*]] = phi i32 [ 0, [[ENTRY:%.*]] ], [ [[IV_NEXT:%.*]], [[CONTROL:%.*]] ]
+; CHECK-NEXT:    br i1 undef, label [[IF_THEN:%.*]], label [[CONTROL]]
 ; CHECK:       if.then:
 ; CHECK-NEXT:    br label [[CONTROL]]
 ; CHECK:       control:
-; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 0, [[LIMIT:%.*]]
+; CHECK-NEXT:    [[IV_NEXT]] = phi i32 [ [[IV]], [[LOOP]] ], [ undef, [[IF_THEN]] ]
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[IV_NEXT]], [[LIMIT:%.*]]
 ; CHECK-NEXT:    br i1 [[CMP]], label [[LOOP]], label [[EXIT:%.*]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret void
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
index 492746615d8461..38f18f68e2db72 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/2012-01-13-phielim.ll
@@ -156,29 +156,29 @@ define fastcc void @test3(ptr nocapture %u) nounwind uwtable ssp {
 ; CHECK:       for.inc8.us.i:
 ; CHECK-NEXT:    br i1 true, label [[MESHBB1_LOOPEXIT:%.*]], label [[MESHBB:%.*]]
 ; CHECK:       for.body3.us.i:
-; CHECK-NEXT:    [[INDVARS_IV_I_SV_PHI:%.*]] = phi i64 [ [[INDVARS_IV_NEXT_I:%.*]], [[MESHBB]] ], [ 0, [[FOR_BODY3_LR_PH_US_I:%.*]] ]
+; CHECK-NEXT:    [[TMP:%.*]] = phi i32 [ [[LSR_IV_NEXT:%.*]], [[MESHBB]] ], [ [[TMP3:%.*]], [[FOR_BODY3_LR_PH_US_I:%.*]] ]
+; CHECK-NEXT:    [[SCEVGEP:%.*]] = phi ptr [ [[SCEVGEP1:%.*]], [[MESHBB]] ], [ [[U:%.*]], [[FOR_BODY3_LR_PH_US_I]] ]
 ; CHECK-NEXT:    [[OPQ_SA_CALC12:%.*]] = sub i32 undef, 227
-; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[LSR_IV:%.*]], [[INDVARS_IV_I_SV_PHI]]
-; CHECK-NEXT:    [[TMP:%.*]] = trunc i64 [[TMP0]] to i32
 ; CHECK-NEXT:    [[MUL_I_US_I:%.*]] = mul nsw i32 0, [[TMP]]
-; CHECK-NEXT:    [[TMP1:%.*]] = shl nuw nsw i64 [[INDVARS_IV_I_SV_PHI]], 3
-; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[U:%.*]], i64 [[TMP1]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = load double, ptr [[SCEVGEP]], align 8
 ; CHECK-NEXT:    br i1 undef, label [[FOR_INC8_US_I:%.*]], label [[MESHBB]]
 ; CHECK:       for.body3.lr.ph.us.i.loopexit:
-; CHECK-NEXT:    [[LSR_IV_NEXT:%.*]] = add i64 [[LSR_IV]], 1
 ; CHECK-NEXT:    br label [[FOR_BODY3_LR_PH_US_I]]
 ; CHECK:       for.body3.lr.ph.us.i:
-; CHECK-NEXT:    [[LSR_IV]] = phi i64 [ [[LSR_IV_NEXT]], [[FOR_BODY3_LR_PH_US_I_LOOPEXIT:%.*]] ], [ undef, [[MESHBB1]] ]
+; CHECK-NEXT:    [[LSR_IV:%.*]] = phi i64 [ undef, [[MESHBB1]] ], [ [[INDVARS_IV8_I_SV_PHI24:%.*]], [[FOR_BODY3_LR_PH_US_I_LOOPEXIT:%.*]] ]
 ; CHECK-NEXT:    [[ARRAYIDX_US_I:%.*]] = getelementptr inbounds double, ptr undef, i64 [[LSR_IV]]
+; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[LSR_IV]], 1
+; CHECK-NEXT:    [[TMP3]] = trunc i64 [[LSR_IV]] to i32
 ; CHECK-NEXT:    br label [[FOR_BODY3_US_I:%.*]]
 ; CHECK:       for.inc8.us.i2:
 ; CHECK-NEXT:    unreachable
 ; CHECK:       eval_At_times_u.exit:
 ; CHECK-NEXT:    ret void
 ; CHECK:       meshBB:
+; CHECK-NEXT:    [[INDVARS_IV8_I_SV_PHI24]] = phi i64 [ undef, [[FOR_BODY3_US_I]] ], [ [[TMP1]], [[FOR_INC8_US_I]] ]
 ; CHECK-NEXT:    [[MESHSTACKVARIABLE_PHI:%.*]] = phi i32 [ [[OPQ_SA_CALC12]], [[FOR_BODY3_US_I]] ], [ undef, [[FOR_INC8_US_I]] ]
-; CHECK-NEXT:    [[INDVARS_IV_NEXT_I]] = add i64 [[INDVARS_IV_I_SV_PHI]], 1
+; CHECK-NEXT:    [[SCEVGEP1]] = getelementptr i8, ptr [[SCEVGEP]], i64 8
+; CHECK-NEXT:    [[LSR_IV_NEXT]] = add i32 [[TMP]], 1
 ; CHECK-NEXT:    br i1 true, label [[FOR_BODY3_LR_PH_US_I_LOOPEXIT]], label [[FOR_BODY3_US_I]]
 ; CHECK:       meshBB1.loopexit:
 ; CHECK-NEXT:    br label [[MESHBB1]]
diff --git a/llvm/test/Transforms/LoopUnroll/pr114879.ll b/llvm/test/Transforms/LoopUnroll/pr114879.ll
new file mode 100644
index 00000000000000..4cd91a0f6c09f7
--- /dev/null
+++ b/llvm/test/Transforms/LoopUnroll/pr114879.ll
@@ -0,0 +1,62 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=loop-unroll-full -unroll-full-max-count=1 %s | FileCheck %s
+
+@GlobIntONE = global i32 0, align 4
+
+; Make sure we don't reuse the phi (undef, X) for X.
+
+define ptr @test() {
+; CHECK-LABEL: define ptr @test() {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    br label %[[FOR_BODY_PEEL_BEGIN:.*]]
+; CHECK:       [[FOR_BODY_PEEL_BEGIN]]:
+; CHECK-NEXT:    br label %[[FOR_BODY_PEEL:.*]]
+; CHECK:       [[FOR_BODY_PEEL]]:
+; CHECK-NEXT:    br i1 true, label %[[CLEANUP_PEEL:.*]], label %[[CLEANUP_LOOPEXIT_PEEL:.*]]
+; CHECK:       [[CLEANUP_LOOPEXIT_PEEL]]:
+; CHECK-NEXT:    br label %[[CLEANUP_PEEL]]
+; CHECK:       [[CLEANUP_PEEL]]:
+; CHECK-NEXT:    [[RETVAL_2_PEEL:%.*]] = phi ptr [ undef, %[[FOR_BODY_PEEL]] ], [ @GlobIntONE, %[[CLEANUP_LOOPEXIT_PEEL]] ]
+; CHECK-NEXT:    br i1 true, label %[[FOR_BODY_PEEL_NEXT:.*]], label %[[CLEANUP2:.*]]
+; CHECK:       [[FOR_BODY_PEEL_NEXT]]:
+; CHECK-NEXT:    br label %[[FOR_BODY_PEEL_NEXT1:.*]]
+; CHECK:       [[FOR_BODY_PEEL_NEXT1]]:
+; CHECK-NEXT:    br label %[[ENTRY_PEEL_NEWPH:.*]]
+; CHECK:       [[ENTRY_PEEL_NEWPH]]:
+; CHECK-NEXT:    br label %[[FOR_BODY:.*]]
+; CHECK:       [[FOR_BODY]]:
+; CHECK-NEXT:    br i1 false, label %[[CLEANUP:.*]], label %[[CLEANUP_LOOPEXIT:.*]]
+; CHECK:       [[CLEANUP_LOOPEXIT]]:
+; CHECK-NEXT:    br label %[[CLEANUP]]
+; CHECK:       [[CLEANUP]]:
+; CHECK-NEXT:    br i1 false, label %[[FOR_BODY]], label %[[CLEANUP2_LOOPEXIT:.*]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       [[CLEANUP2_LOOPEXIT]]:
+; CHECK-NEXT:    [[RETVAL_2_LCSSA_PH:%.*]] = phi ptr [ @GlobIntONE, %[[CLEANUP]] ]
+; CHECK-NEXT:    br label %[[CLEANUP2]]
+; CHECK:       [[CLEANUP2]]:
+; CHECK-NEXT:    [[RETVAL_2_LCSSA:%.*]] = phi ptr [ [[RETVAL_2_PEEL]], %[[CLEANUP_PEEL]] ], [ [[RETVAL_2_LCSSA_PH]], %[[CLEANUP2_LOOPEXIT]] ]
+; CHECK-NEXT:    ret ptr [[RETVAL_2_LCSSA]]
+;
+entry:
+  br label %for.body
+
+for.body:
+  %retval.0 = phi ptr [ undef, %entry ], [ %retval.2, %cleanup ]
+  %cmp1.not = phi i1 [ true, %entry ], [ false, %cleanup ]
+  br i1 %cmp1.not, label %cleanup, label %cleanup.loopexit
+
+cleanup.loopexit:
+  br label %cleanup
+
+cleanup:
+  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
+  br i1 %cmp1.not, label %for.body, label %cleanup2
+
+cleanup2:
+  %retval.2.lcssa = phi ptr [ %retval.2, %cleanup ]
+  ret ptr %retval.2.lcssa
+}
+;.
+; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]]}
+; CHECK: [[META1]] = !{!"llvm.loop.peeled.count", i32 1}
+;.
diff --git a/llvm/test/Transforms/LoopVectorize/uniform-blend.ll b/llvm/test/Transforms/LoopVectorize/uniform-blend.ll
index c9fc8beb006d9b..05919cdcf93c8c 100644
--- a/llvm/test/Transforms/LoopVectorize/uniform-blend.ll
+++ b/llvm/test/Transforms/LoopVectorize/uniform-blend.ll
@@ -9,16 +9,27 @@ define void @blend_uniform_iv_trunc(i1 %c) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
 ; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i1> poison, i1 [[C]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i1> [[BROADCAST_SPLATINSERT]], <4 x i1> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT:    [[TMP0:%.*]] = trunc i64 [[INDEX]] to i16
-; CHECK-NEXT:    [[TMP1:%.*]] = add i16 [[TMP0]], 0
-; CHECK-NEXT:    [[PREDPHI:%.*]] = select i1 [[C]], i16 [[TMP1]], i16 undef
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i16> [ <i16 0, i16 1, i16 2, i16 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[PREDPHI1:%.*]] = select <4 x i1> [[BROADCAST_SPLAT]], <4 x i16> [[VEC_IND]], <4 x i16> undef
+; CHECK-NEXT:    [[PREDPHI:%.*]] = extractelement <4 x i16> [[PREDPHI1]], i32 0
 ; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i16 [[PREDPHI]]
-; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds i16, ptr [[TMP2]], i32 0
-; CHECK-NEXT:    store <4 x i16> zeroinitializer, ptr [[TMP3]], align 2
+; CHECK-NEXT:    [[TMP8:%.*]] = extractelement <4 x i16> [[PREDPHI1]], i32 1
+; CHECK-NEXT:    [[TMP3:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i16 [[TMP8]]
+; CHECK-NEXT:    [[TMP9:%.*]] = extractelement <4 x i16> [[PREDPHI1]], i32 2
+; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i16 [[TMP9]]
+; CHECK-NEXT:    [[TMP6:%.*]] = extractelement <4 x i16> [[PREDPHI1]], i32 3
+; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i16 [[TMP6]]
+; CHECK-NEXT:    store i16 0, ptr [[TMP2]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP3]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP5]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP7]], align 2
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i16> [[VEC_IND]], <i16 4, i16 4, i16 4, i16 4>
 ; CHECK-NEXT:    [[TMP4:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32
 ; CHECK-NEXT:    br i1 [[TMP4]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK:       [[MIDDLE_BLOCK]]:
@@ -71,15 +82,27 @@ define void @blend_uniform_iv(i1 %c) {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
 ; CHECK-NEXT:    br i1 false, label %[[SCALAR_PH:.*]], label %[[VECTOR_PH:.*]]
 ; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    [[BROADCAST_SPLATINSERT:%.*]] = insertelement <4 x i1> poison, i1 [[C]], i64 0
+; CHECK-NEXT:    [[BROADCAST_SPLAT:%.*]] = shufflevector <4 x i1> [[BROADCAST_SPLATINSERT]], <4 x i1> poison, <4 x i32> zeroinitializer
 ; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
 ; CHECK:       [[VECTOR_BODY]]:
 ; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], %[[VECTOR_BODY]] ]
-; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[INDEX]], 0
-; CHECK-NEXT:    [[PREDPHI:%.*]] = select i1 [[C]], i64 [[TMP0]], i64 undef
+; CHECK-NEXT:    [[VEC_IND:%.*]] = phi <4 x i64> [ <i64 0, i64 1, i64 2, i64 3>, %[[VECTOR_PH]] ], [ [[VEC_IND_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[PREDPHI1:%.*]] = select <4 x i1> [[BROADCAST_SPLAT]], <4 x i64> [[VEC_IND]], <4 x i64> undef
+; CHECK-NEXT:    [[PREDPHI:%.*]] = extractelement <4 x i64> [[PREDPHI1]], i32 0
 ; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i64 [[PREDPHI]]
-; CHECK-NEXT:    [[TMP2:%.*]] = getelementptr inbounds i16, ptr [[TMP1]], i32 0
-; CHECK-NEXT:    store <4 x i16> zeroinitializer, ptr [[TMP2]], align 2
+; CHECK-NEXT:    [[TMP2:%.*]] = extractelement <4 x i64> [[PREDPHI1]], i32 1
+; CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i64 [[TMP2]]
+; CHECK-NEXT:    [[TMP4:%.*]] = extractelement <4 x i64> [[PREDPHI1]], i32 2
+; CHECK-NEXT:    [[TMP5:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i64 [[TMP4]]
+; CHECK-NEXT:    [[TMP6:%.*]] = extractelement <4 x i64> [[PREDPHI1]], i32 3
+; CHECK-NEXT:    [[TMP7:%.*]] = getelementptr inbounds [32 x i16], ptr @dst, i16 0, i64 [[TMP6]]
+; CHECK-NEXT:    store i16 0, ptr [[TMP1]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP8]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP5]], align 2
+; CHECK-NEXT:    store i16 0, ptr [[TMP7]], align 2
 ; CHECK-NEXT:    [[INDEX_NEXT]] = add nuw i64 [[INDEX]], 4
+; CHECK-NEXT:    [[VEC_IND_NEXT]] = add <4 x i64> [[VEC_IND]], <i64 4, i64 4, i64 4, i64 4>
 ; CHECK-NEXT:    [[TMP3:%.*]] = icmp eq i64 [[INDEX_NEXT]], 32
 ; CHECK-NEXT:    br i1 [[TMP3]], label %[[MIDDLE_BLOCK:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP4:![0-9]+]]
 ; CHECK:       [[MIDDLE_BLOCK]]:
diff --git a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
index 37c61e4e4fa714..3450302f36f617 100644
--- a/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
+++ b/llvm/unittests/Analysis/ScalarEvolutionTest.cpp
@@ -139,7 +139,7 @@ TEST_F(ScalarEvolutionsTest, SimplifiedPHI) {
   auto *Ty = Type::getInt32Ty(Context);
   auto *PN = PHINode::Create(Ty, 2, "", LoopBB->begin());
   PN->addIncoming(Constant::getNullValue(Ty), EntryBB);
-  PN->addIncoming(UndefValue::get(Ty), LoopBB);
+  PN->addIncoming(PoisonValue::get(Ty), LoopBB);
   ScalarEvolution SE = buildSE(*F);
   const SCEV *S1 = SE.getSCEV(PN);
   const SCEV *S2 = SE.getSCEV(PN);

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dtcxzyw dtcxzyw merged commit 0b9f1cc into llvm:main Nov 7, 2024
6 of 8 checks passed
@dtcxzyw dtcxzyw deleted the perf/fix-114879 branch November 7, 2024 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

WRONG code: GVN? Loop opts?
3 participants