-
Notifications
You must be signed in to change notification settings - Fork 223
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add emulation of (s/u)_(add/sub)_sat intrinsics if needed #2138
base: main
Are you sure you want to change the base?
Conversation
While the translator was built around OpenCL - other targets that support SPIR-V and don't support OpenCL might use it. This patch adds --spirv-use-ocl-math-for-llvm-intrinsic option to control whether we should translate them as math intrinsics to OpenCL ext math instructions or emulate. Default is true aka translate as math instructions. I don't really want to end up implementing Quake's sqrt algorithm, but it's a possible scenario as well. Plans are: 1. merge existing --spirv-replace-fmuladd-with-ocl-mad with the new option; 2. optionally introduce InstCombine pass in reverse translation. Signed-off-by: Sidorov, Dmitry <[email protected]>
@LU-JOHN @asudarsa @maksimsab please take a look |
Signed-off-by: Sidorov, Dmitry <[email protected]>
lib/SPIRV/SPIRVWriter.cpp
Outdated
SPIRVValue *Add = | ||
BM->addBinaryInst(OpIAdd, Ty, FirstArgVal, SecondArgVal, BB); | ||
SPIRVValue *Cmp = | ||
BM->addCmpInst(OpUGreaterThan, SPVBoolTy, Add, FirstArgVal, BB); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think OpUGreaterThan is the right op. But, I am not able to reason out with an example. Can you please provide one if possible? Not a PR blocker.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be OpUGreaterThanEqual according to your comment from above
I've also checked generated IR after the translation by running inst combine - it's able to restore uadd_sat intrinsic
BM->addBinaryInst(OpISub, Ty, Max, SecondArgVal, BB); | ||
SPIRVValue *CanPosOverflow = | ||
BM->addCmpInst(OpSGreaterThan, SPVBoolTy, FirstArgVal, MaxSubB, BB); | ||
SPIRVValue *PosOverflow = BM->addInstTemplate( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the order of the operand could impact performance here. Again just a request for quick check. Not a PR blocker.
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH I'm not expecting this implementation be performant at all. Ideally we should generate such code that is collapsing to sadd/sub_sat intrinsic after we run InstCombine , but it's not happening. That is because InstCombine checks overflow by extending integer to a larger one and comparing the result with MAX/MIN of the previous type. We can't do this in general case, because we can't cast integer to a wider one in SPIR-V without extensions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible correction required. Please address. Thanks for this change.
Title can be: Improve translation support for (s/u)_(add/sub)_sat intrinsics |
tools/llvm-spirv/llvm-spirv.cpp
Outdated
@@ -258,6 +258,12 @@ static cl::opt<SPIRV::BuiltinFormat> SPIRVBuiltinFormat( | |||
clEnumValN(SPIRV::BuiltinFormat::Global, "global", | |||
"Use globals to represent SPIR-V builtin variables"))); | |||
|
|||
static cl::opt<bool> SPIRVUseOpenCLExtInstructionsForLLVMIntrinsic( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this option sounds very generic. It sounds as if every llvm intrinsic is covered under this option. Can you please elaborate?
Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to SPIRVUseOpenCLExtInstructionsForLLVMMathIntrinsic
Signed-off-by: Sidorov, Dmitry <[email protected]>
Signed-off-by: Sidorov, Dmitry <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.Thanks
@@ -258,6 +258,12 @@ static cl::opt<SPIRV::BuiltinFormat> SPIRVBuiltinFormat( | |||
clEnumValN(SPIRV::BuiltinFormat::Global, "global", | |||
"Use globals to represent SPIR-V builtin variables"))); | |||
|
|||
static cl::opt<bool> SPIRVUseOpenCLExtInstructionsForLLVMMathIntrinsic( | |||
"spirv-use-ocl-for-llvm-math-intrinsic", cl::init(true), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@svenvh may I ask you to take a look if this option is OK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can generalize the option? If I understand correctly, you essentially want to control whether the translator is allowed to emit OpenCL.std OpExtInst
instructions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, exactly. So is your suggesting to rename it or something different? We kinda can find counterparts of https://registry.khronos.org/SPIR-V/specs/unified1/OpenCL.ExtendedInstructionSet.100.html in https://llvm.org/docs/LangRef.html#standard-c-c-library-intrinsics , so would it be spirv-use-std-ocl-for-llvm-std-intrinsics
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The interface would be similar to --spirv-ext
. So for your use case it could look like --spirv-ext-inst=-OpenCL.std
for example.
It may require a bit of thinking/refactoring first though, so I don't want to hold up this patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm, I see. I will spend some time thinking about it, I'd prefer not to add an option (and enable it in some compiler's frontend driver), only to remove it in a couple of weeks, replacing with a better solution :) So we can keep PR open for a while
The prospect of growing a math library inside llvm-spirv doesn't sound exciting indeed... ;-) |
BM->addCmpInst(OpSLessThan, SPVBoolTy, SecondArgVal, Zero, BB); | ||
|
||
if (IID == Intrinsic::sadd_sat) { | ||
// sadd.sat(a, b) -> if (b > 0) && a > MAX - b => overflow -> MAX |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use this sequence:
res = a+b
if (b>0 && a>res) res = MAX
if (b<0 && a<res) res = MIN
This makes it unnecessary to calculate "MAX - b" and "MIN - b"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another implementation with a single comparison:
sum = a+b;
if ( ( ~(a^b) & // addends have same sign
(sum^a) // result has different sign )
// test MSB
< 0)
sum = (0x7fffffff^(a>>31)); // saturated result
Some of the emulation sequences are long. Could they be modified (e.g. if we know an addend is always positive) or re-ordered so that InstCombine cannot recognize the sequence anymore? |
SPIRVValue *FirstSelect = BM->addSelectInst(PosOverflow, Max, Add, BB); | ||
return BM->addSelectInst(NegOverflow, Min, FirstSelect, BB); | ||
} | ||
// ssub.sat(a, b) -> if (b > 0) && a < MIN + b => overflow -> MIN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be:
res = a-b
if (b<0 && a>res) res = MAX
if (b>0 && a<res) res = MIN
This makes it unnecessary to calculate "MAX + b" and "MIN + b"
@@ -193,6 +193,14 @@ class TranslatorOpts { | |||
ReplaceLLVMFmulAddWithOpenCLMad = Value; | |||
} | |||
|
|||
void setUseOpenCLExtInstructionsForLLVMMathIntrinsic(bool Value) noexcept { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit regarding noexcept
:
LLVM itself prohibits C++ exceptions and they are turned off in CMakeLists by compiler flag -fno-exceptions
.
If we use exceptions in translator, then it would be better to stop doing it and start using the compiler flag -fno-exceptions
as well.
While the translator was built around OpenCL - other targets that support SPIR-V and don't support OpenCL might use it.
This patch adds --spirv-use-ocl-math-for-llvm-intrinsic option to control whether we should translate them as math intrinsics to OpenCL ext math instructions or emulate. Default is true aka translate as math instructions.
I don't really want to end up implementing Quake's sqrt algorithm, but it's a possible scenario as well.
Plans are: