Implement a mechanism to disable GEP->PHI InstCombiner. #231
base: aie-public
QoR results:
This affects other benchmarks whose results are not visible, for example Add2D_bf16_1.
aie-public:
This PR:
Looking at the post-swp code, we have:
; CHECK: bb3:
; CHECK-NEXT: [[PHI:%.*]] = phi ptr [ [[TMP10]], [[BB1]] ], [ [[TMP4]], [[BB2]] ]
; CHECK-NEXT: [[TMP25:%.*]] = load i32, ptr [[PHI]], align 4
; CHECK-NEXT: ret i32 [[TMP25]]
Ah, the combine is probably better for very local alias analysis that doesn't want to follow too many PHI nodes.
In our case the PHI in bb3 dominates one of the GEPs?
In our case, one GEP should reach the load. The combiner basically does the following for this case:
%struct = type { i32, i32 }

define i32 @dontFoldGEPs(ptr %dm, i1 %arg4, i64 %arg9, i64 %arg19) {
bb:
  %0 = load ptr, ptr %dm, align 8
  br i1 %arg4, label %bb1, label %bb2

bb1:                                              ; preds = %bb
  %1 = trunc i64 %arg9 to i20
  br label %bb3

bb2:                                              ; preds = %bb
  %2 = trunc i64 %arg19 to i20
  br label %bb3

bb3:                                              ; preds = %bb2, %bb1
  %.pn = phi i20 [ %1, %bb1 ], [ %2, %bb2 ]
  %phi = getelementptr inbounds %struct, ptr %0, i20 %.pn
  %3 = load i32, ptr %phi, align 4
  ret i32 %3
}
Disabling this combine helps to generate cleaner MIR for several kernels.
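For contrast with the folded IR above, here is a rough sketch of the shape that the CHECK lines expect when the fold is disabled: each predecessor keeps its own GEP and bb3 merges the two pointers with a PHI. This is reconstructed from the CHECK lines and the folded example, not copied from the actual test, so the value names (%base, %idx1, %gep1, and so on) are hypothetical.

```llvm
%struct = type { i32, i32 }

; Sketch only: value names are illustrative, not taken from the real test file.
define i32 @dontFoldGEPs(ptr %dm, i1 %arg4, i64 %arg9, i64 %arg19) {
bb:
  %base = load ptr, ptr %dm, align 8
  br i1 %arg4, label %bb1, label %bb2

bb1:
  %idx1 = trunc i64 %arg9 to i20
  ; The GEP stays in the predecessor block.
  %gep1 = getelementptr inbounds %struct, ptr %base, i20 %idx1
  br label %bb3

bb2:
  %idx2 = trunc i64 %arg19 to i20
  ; Same base pointer, different index.
  %gep2 = getelementptr inbounds %struct, ptr %base, i20 %idx2
  br label %bb3

bb3:
  ; PHI of pointers, matching "[[PHI:%.*]] = phi ptr ..." in the CHECK lines.
  %phi = phi ptr [ %gep1, %bb1 ], [ %gep2, %bb2 ]
  %val = load i32, ptr %phi, align 4
  ret i32 %val
}
```

In this form the address computation is finished in each predecessor, whereas the folded form moves a single GEP into bb3 and turns the PHI into a PHI of indices.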