Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RFC] Refine design of CIR_BitOp #1392

Open
FantasqueX opened this issue Feb 23, 2025 · 2 comments
Open

[RFC] Refine design of CIR_BitOp #1392

FantasqueX opened this issue Feb 23, 2025 · 2 comments

Comments

@FantasqueX
Copy link
Collaborator

When I try to add support for __lzcnt #1382, I encountered some issues.

The first problem is how to treat is_poison_zero. __builtin_clz returns undefined if input is 0, while __lzcnt returns the width of the input if input is 0. However, current implementation of CIR_ClzOp always lower to llvm.ctlz. {{%.*}} true which allows undefined result. If lowering cir dialect to math dialect and then to llvmir dialect, it will be math.ctlz and "llvm.intr.ctlz"(%0) <{is_zero_poison = false}> : (i16) -> i16 which doesn't allow undefined result.

The second problem is the return type. LLVM IR and Microsoft intrinsics require that the type of result matches the type of input, while gcc intrinsics always returns a signed 32-bit integer. If we design CIR_ClzOp to return a signed 32-bit integer, there will be two redundant CastOp when implementing __lzcnt. If we design CIR_ClzOp to return an integer whose type matches the input, there will be no problem for both __builtin_clz and __lzcnt

The last problem is about new IR design. The first way to solve the problem is how I showed in #1382 which modified the result type of CIR_BitOp base class, added a new IR CIR_LzcntOp. We can get a clean codegen and a complex lowering. If we want to support __builtin_clzg, we need to add another CIR. Another way to solve the problem is to reserve three Op in CIR, CountLeadingZerosOp, CountTrailingZerosOp and CtPopOp. Their definition is similar to LLVM IR.

@FantasqueX FantasqueX changed the title Refine design of CIR_BitOp [RFC] Refine design of CIR_BitOp Feb 23, 2025
@bcardosolopes
Copy link
Member

If lowering cir dialect to math dialect and then to llvmir dialect

I'm a bit confused, is this about direct LLVM lowering or through MLIR. I recommend solving one problem at a time.
You could have a undefined attribute or some other name to decide how CIR_ClzOp should be lowered.

If we design CIR_ClzOp to return a signed 32-bit integer, there will be two redundant CastOp when implementing __lzcnt.

Looks like you cannot get rid of the cast anyways? If the llvm counter part usually expectes 32-bit integer you have to make it fit.

If we design CIR_ClzOp to return an integer whose type matches the input, there will be no problem for both __builtin_clz and __lzcnt

This is probably best tho seems orthogonal to the above point.

@bcardosolopes
Copy link
Member

The last problem is about new IR design. The first way to solve the problem is how I showed in #1382 which modified the result type of CIR_BitOp base class, added a new IR CIR_LzcntOp. We can get a clean codegen and a complex lowering. If we want to support __builtin_clzg, we need to add another CIR. Another way to solve the problem is to reserve three Op in CIR, CountLeadingZerosOp, CountTrailingZerosOp and CtPopOp. Their definition is similar to LLVM IR.

There are two ways to lower builtins: (a) we map them to CIR operations or (b) we use a thin wrapper that can directly translate to any existing LLVM intrinsic via cir::LLVMIntrinsicCallOp. My take is to try (a) if it's an useful (popular) operation to have around, for things too low level I wouldn't bother as much (e.g. AArch64 intrinsics). YMMV because they have different trade offs. For this draft PR specifically, the approach sounds good to me. The example from another issue where there's a x86 flush sounds like is better off using cir::LLVMIntrinsicCallOp.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants