Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs #113637

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

tex3d
Copy link
Contributor

@tex3d tex3d commented Oct 25, 2024

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

  • Return true for atan2 from isTriviallyVectorizable
  • Add atan2 to VecFuncs.def for massv and accelerate libraries.
  • Add atan2 to hasOptimizedCodeGen
  • Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite and update vectorization tests
  • Add atan2 name check to isLoweredToCall in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

Thanks to @jroelofs for the atan2 accelerate veclib and associated test additions, plus the hasOptimizedCodeGen addition.

Part of: Implement the atan2 HLSL Function #70096.

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 25, 2024

@llvm/pr-subscribers-llvm-analysis

@llvm/pr-subscribers-llvm-transforms

Author: Tex Riddell (tex3d)

Changes

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv.

Part of: Implement the atan2 HLSL Function #70096.


Full diff: https://github.com/llvm/llvm-project/pull/113637.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+2)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (+1)
  • (modified) llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll (+46)
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
 // Hyperbolic Functions
 TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
   case Intrinsic::asin:
   case Intrinsic::acos:
   case Intrinsic::atan:
+  case Intrinsic::atan2:
   case Intrinsic::sin:
   case Intrinsic::cos:
   case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
   ret void
 }
 
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to double
+  %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+  %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+  store double %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to float
+  %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+  %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+  store float %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
 define void @sinh_f64(ptr nocapture %varray) {
 ; CHECK-LABEL: @sinh_f64(
 ; CHECK: __sinhd2{{.*}}<2 x double>

@llvmbot
Copy link
Collaborator

llvmbot commented Oct 25, 2024

@llvm/pr-subscribers-backend-powerpc

Author: Tex Riddell (tex3d)

Changes

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic. It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv.

Part of: Implement the atan2 HLSL Function #70096.


Full diff: https://github.com/llvm/llvm-project/pull/113637.diff

3 Files Affected:

  • (modified) llvm/include/llvm/Analysis/VecFuncs.def (+2)
  • (modified) llvm/lib/Analysis/VectorUtils.cpp (+1)
  • (modified) llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll (+46)
diff --git a/llvm/include/llvm/Analysis/VecFuncs.def b/llvm/include/llvm/Analysis/VecFuncs.def
index c4586894e3e490..6f0ba3f9e406c4 100644
--- a/llvm/include/llvm/Analysis/VecFuncs.def
+++ b/llvm/include/llvm/Analysis/VecFuncs.def
@@ -289,7 +289,9 @@ TLI_DEFINE_VECFUNC("acosf", "__acosf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan", "__atand2", FIXED(2), "_ZGV_LLVM_N2v")
 TLI_DEFINE_VECFUNC("atanf", "__atanf4", FIXED(4), "_ZGV_LLVM_N4v")
 TLI_DEFINE_VECFUNC("atan2", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f64", "__atan2d2", FIXED(2), "_ZGV_LLVM_N2vv")
 TLI_DEFINE_VECFUNC("atan2f", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
+TLI_DEFINE_VECFUNC("llvm.atan2.f32", "__atan2f4", FIXED(4), "_ZGV_LLVM_N4vv")
 
 // Hyperbolic Functions
 TLI_DEFINE_VECFUNC("sinh", "__sinhd2", FIXED(2), "_ZGV_LLVM_N2v")
diff --git a/llvm/lib/Analysis/VectorUtils.cpp b/llvm/lib/Analysis/VectorUtils.cpp
index 37c443011719b6..f3ffddeb40d4b4 100644
--- a/llvm/lib/Analysis/VectorUtils.cpp
+++ b/llvm/lib/Analysis/VectorUtils.cpp
@@ -69,6 +69,7 @@ bool llvm::isTriviallyVectorizable(Intrinsic::ID ID) {
   case Intrinsic::asin:
   case Intrinsic::acos:
   case Intrinsic::atan:
+  case Intrinsic::atan2:
   case Intrinsic::sin:
   case Intrinsic::cos:
   case Intrinsic::tan:
diff --git a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
index ba6e85e2852952..badc67866d2bad 100644
--- a/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
+++ b/llvm/test/Transforms/LoopVectorize/PowerPC/massv-calls.ll
@@ -1244,6 +1244,52 @@ for.end:
   ret void
 }
 
+define void @atan2_f64_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f64_intrinsic(
+; CHECK: __atan2d2{{.*}}<2 x double>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to double
+  %call = tail call double @llvm.atan2.f64(double %conv, double %conv)
+  %arrayidx = getelementptr inbounds double, ptr %varray, i64 %iv
+  store double %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
+define void @atan2_f32_intrinsic(ptr nocapture %varray) {
+; CHECK-LABEL: @atan2_f32_intrinsic(
+; CHECK: __atan2f4{{.*}}<4 x float>
+; CHECK: ret void
+;
+entry:
+  br label %for.body
+
+for.body:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %for.body ]
+  %tmp = trunc i64 %iv to i32
+  %conv = sitofp i32 %tmp to float
+  %call = tail call float @llvm.atan2.f32(float %conv, float %conv)
+  %arrayidx = getelementptr inbounds float, ptr %varray, i64 %iv
+  store float %call, ptr %arrayidx, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond = icmp eq i64 %iv.next, 1000
+  br i1 %exitcond, label %for.end, label %for.body
+
+for.end:
+  ret void
+}
+
 define void @sinh_f64(ptr nocapture %varray) {
 ; CHECK-LABEL: @sinh_f64(
 ; CHECK: __sinhd2{{.*}}<2 x double>

Copy link
Contributor

@jroelofs jroelofs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tex3d tex3d changed the title [Analysis] isTriviallyVectorizable add vectorization for atan2 intrinsic [Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs Oct 29, 2024
@farzonl
Copy link
Member

farzonl commented Nov 6, 2024

  1. Add libfunc mapping to intrinsic in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite
  2. Add atan2 name check in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h isLoweredToCall
  3. I noticed there were some examples that use a amdlibm vector form below. Should we do a follow on PR that adds atan2 amdlibm and updates llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll?

amd_vrs4_atan2f
amd_vrs8_atan2f
amd_vrd2_atan2
amd_vrd4_atan2

This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This returns true from isTriviallyVectorizable for llvm.atan2 intrinsic.
It also adds llvm atan2 intrinsic equivalents to VecFuncs.def for massv.

Part of: Implement the atan2 HLSL Function llvm#70096.
…sis/TargetTransformInfoImpl.h`

TBD: Add testing for this change.
@tex3d
Copy link
Contributor Author

tex3d commented Nov 7, 2024

@farzonl

  • Add libfunc mapping to intrinsic in llvm/lib/Analysis/ValueTracking.cpp llvm::getIntrinsicForCallSite

Added.

  • Add atan2 name check in llvm/include/llvm/Analysis/TargetTransformInfoImpl.h isLoweredToCall

Added, but it had no test impact, so I'm looking for feedback on how this can be tested.

  • I noticed there were some examples that use a amdlibm vector form below. Should we do a follow on PR that adds atan2 amdlibm and updates llvm/test/Transforms/LoopVectorize/X86/amdlibm-calls.ll?

It turns out these example files have all the code under #if 0, and there doesn't appear to be any declarations of these vector operations in the header. I'll leave this for a later update when the operations are added to amdlibm.

@farzonl
Copy link
Member

farzonl commented Nov 7, 2024

See this pr: https://github.com/llvm/llvm-project/pull/95517/files

tests like this should show that the libcall name to intrinsic mapping in the check line

llvm/test/Transforms/SLPVectorizer/X86/call.ll

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants