Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add emulation of (s/u)_(add/sub)_sat intrinsics if needed #2138

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

MrSidims
Copy link
Contributor

@MrSidims MrSidims commented Aug 25, 2023

While the translator was built around OpenCL - other targets that support SPIR-V and don't support OpenCL might use it.

This patch adds --spirv-use-ocl-math-for-llvm-intrinsic option to control whether we should translate them as math intrinsics to OpenCL ext math instructions or emulate. Default is true aka translate as math instructions.

I don't really want to end up implementing Quake's sqrt algorithm, but it's a possible scenario as well.

Plans are:

  1. merge existing --spirv-replace-fmuladd-with-ocl-mad with the new option;
  2. introduce InstCombine pass in reverse translation under an option.

While the translator was built around OpenCL - other targets that
support SPIR-V and don't support OpenCL might use it.

This patch adds --spirv-use-ocl-math-for-llvm-intrinsic option to
control whether we should translate them as math intrinsics to OpenCL ext math
instructions or emulate. Default is true aka translate as math
instructions.

I don't really want to end up implementing Quake's sqrt algorithm, but
it's a possible scenario as well.

Plans are:
1. merge existing --spirv-replace-fmuladd-with-ocl-mad with the new
   option;
2. optionally introduce InstCombine pass in reverse translation.

Signed-off-by: Sidorov, Dmitry <[email protected]>
@MrSidims MrSidims requested review from svenvh and vmaksimo August 25, 2023 13:43
@MrSidims MrSidims changed the title Add emulation of {s/u)_(add/sub)_sat intrinsics if needed Add emulation of (s/u)_(add/sub)_sat intrinsics if needed Aug 25, 2023
@MrSidims
Copy link
Contributor Author

@LU-JOHN @asudarsa @maksimsab please take a look

Signed-off-by: Sidorov, Dmitry <[email protected]>
test/llvm-intrinsics/add_sub.sat.ll Outdated Show resolved Hide resolved
lib/SPIRV/SPIRVWriter.cpp Outdated Show resolved Hide resolved
SPIRVValue *Add =
BM->addBinaryInst(OpIAdd, Ty, FirstArgVal, SecondArgVal, BB);
SPIRVValue *Cmp =
BM->addCmpInst(OpUGreaterThan, SPVBoolTy, Add, FirstArgVal, BB);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think OpUGreaterThan is the right op. But, I am not able to reason out with an example. Can you please provide one if possible? Not a PR blocker.

Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be OpUGreaterThanEqual according to your comment from above
I've also checked generated IR after the translation by running inst combine - it's able to restore uadd_sat intrinsic

lib/SPIRV/SPIRVWriter.cpp Outdated Show resolved Hide resolved
BM->addBinaryInst(OpISub, Ty, Max, SecondArgVal, BB);
SPIRVValue *CanPosOverflow =
BM->addCmpInst(OpSGreaterThan, SPVBoolTy, FirstArgVal, MaxSubB, BB);
SPIRVValue *PosOverflow = BM->addInstTemplate(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if the order of the operand could impact performance here. Again just a request for quick check. Not a PR blocker.

Thanks

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH I'm not expecting this implementation be performant at all. Ideally we should generate such code that is collapsing to sadd/sub_sat intrinsic after we run InstCombine , but it's not happening. That is because InstCombine checks overflow by extending integer to a larger one and comparing the result with MAX/MIN of the previous type. We can't do this in general case, because we can't cast integer to a wider one in SPIR-V without extensions.

Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible correction required. Please address. Thanks for this change.

@asudarsa
Copy link
Contributor

Title can be: Improve translation support for (s/u)_(add/sub)_sat intrinsics

@@ -258,6 +258,12 @@ static cl::opt<SPIRV::BuiltinFormat> SPIRVBuiltinFormat(
clEnumValN(SPIRV::BuiltinFormat::Global, "global",
"Use globals to represent SPIR-V builtin variables")));

static cl::opt<bool> SPIRVUseOpenCLExtInstructionsForLLVMIntrinsic(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that this option sounds very generic. It sounds as if every llvm intrinsic is covered under this option. Can you please elaborate?

Thanks

Copy link
Contributor Author

@MrSidims MrSidims Aug 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to SPIRVUseOpenCLExtInstructionsForLLVMMathIntrinsic

Signed-off-by: Sidorov, Dmitry <[email protected]>
Signed-off-by: Sidorov, Dmitry <[email protected]>
Copy link
Contributor

@asudarsa asudarsa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.Thanks

@@ -258,6 +258,12 @@ static cl::opt<SPIRV::BuiltinFormat> SPIRVBuiltinFormat(
clEnumValN(SPIRV::BuiltinFormat::Global, "global",
"Use globals to represent SPIR-V builtin variables")));

static cl::opt<bool> SPIRVUseOpenCLExtInstructionsForLLVMMathIntrinsic(
"spirv-use-ocl-for-llvm-math-intrinsic", cl::init(true),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@svenvh may I ask you to take a look if this option is OK?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can generalize the option? If I understand correctly, you essentially want to control whether the translator is allowed to emit OpenCL.std OpExtInst instructions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, exactly. So is your suggesting to rename it or something different? We kinda can find counterparts of https://registry.khronos.org/SPIR-V/specs/unified1/OpenCL.ExtendedInstructionSet.100.html in https://llvm.org/docs/LangRef.html#standard-c-c-library-intrinsics , so would it be spirv-use-std-ocl-for-llvm-std-intrinsics ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface would be similar to --spirv-ext. So for your use case it could look like --spirv-ext-inst=-OpenCL.std for example.

It may require a bit of thinking/refactoring first though, so I don't want to hold up this patch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, I see. I will spend some time thinking about it, I'd prefer not to add an option (and enable it in some compiler's frontend driver), only to remove it in a couple of weeks, replacing with a better solution :) So we can keep PR open for a while

@svenvh
Copy link
Member

svenvh commented Aug 30, 2023

I don't really want to end up implementing Quake's sqrt algorithm, but it's a possible scenario as well.

The prospect of growing a math library inside llvm-spirv doesn't sound exciting indeed... ;-)

BM->addCmpInst(OpSLessThan, SPVBoolTy, SecondArgVal, Zero, BB);

if (IID == Intrinsic::sadd_sat) {
// sadd.sat(a, b) -> if (b > 0) && a > MAX - b => overflow -> MAX
Copy link
Contributor

@LU-JOHN LU-JOHN Sep 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use this sequence:

res = a+b
if (b>0 && a>res) res = MAX
if (b<0 && a<res) res = MIN

This makes it unnecessary to calculate "MAX - b" and "MIN - b"

Copy link
Contributor

@LU-JOHN LU-JOHN Sep 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another implementation with a single comparison:

sum = a+b;
if ( ( ~(a^b)   &   // addends have same sign
       (sum^a)      // result has different sign )
    
     // test MSB
     < 0)
    sum = (0x7fffffff^(a>>31));  // saturated result

@LU-JOHN
Copy link
Contributor

LU-JOHN commented Sep 7, 2023

2. introduce InstCombine pass in reverse translation under an option.

Some of the emulation sequences are long. Could they be modified (e.g. if we know an addend is always positive) or re-ordered so that InstCombine cannot recognize the sequence anymore?

SPIRVValue *FirstSelect = BM->addSelectInst(PosOverflow, Max, Add, BB);
return BM->addSelectInst(NegOverflow, Min, FirstSelect, BB);
}
// ssub.sat(a, b) -> if (b > 0) && a < MIN + b => overflow -> MIN
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be:

res = a-b
if (b<0 && a>res) res = MAX
if (b>0 && a<res) res = MIN

This makes it unnecessary to calculate "MAX + b" and "MIN + b"

@@ -193,6 +193,14 @@ class TranslatorOpts {
ReplaceLLVMFmulAddWithOpenCLMad = Value;
}

void setUseOpenCLExtInstructionsForLLVMMathIntrinsic(bool Value) noexcept {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit regarding noexcept:
LLVM itself prohibits C++ exceptions and they are turned off in CMakeLists by compiler flag -fno-exceptions.
If we use exceptions in translator, then it would be better to stop doing it and start using the compiler flag -fno-exceptions as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants