[RFC] Refine design of `CIR_BitOp` #1392

FantasqueX · 2025-02-23T10:44:44Z

When I try to add support for __lzcnt #1382, I encountered some issues.

The first problem is how to treat is_poison_zero. __builtin_clz returns undefined if input is 0, while __lzcnt returns the width of the input if input is 0. However, current implementation of CIR_ClzOp always lower to llvm.ctlz. {{%.*}} true which allows undefined result. If lowering cir dialect to math dialect and then to llvmir dialect, it will be math.ctlz and "llvm.intr.ctlz"(%0) <{is_zero_poison = false}> : (i16) -> i16 which doesn't allow undefined result.

The second problem is the return type. LLVM IR and Microsoft intrinsics require that the type of result matches the type of input, while gcc intrinsics always returns a signed 32-bit integer. If we design CIR_ClzOp to return a signed 32-bit integer, there will be two redundant CastOp when implementing __lzcnt. If we design CIR_ClzOp to return an integer whose type matches the input, there will be no problem for both __builtin_clz and __lzcnt

The last problem is about new IR design. The first way to solve the problem is how I showed in #1382 which modified the result type of CIR_BitOp base class, added a new IR CIR_LzcntOp. We can get a clean codegen and a complex lowering. If we want to support __builtin_clzg, we need to add another CIR. Another way to solve the problem is to reserve three Op in CIR, CountLeadingZerosOp, CountTrailingZerosOp and CtPopOp. Their definition is similar to LLVM IR.

The text was updated successfully, but these errors were encountered:

bcardosolopes · 2025-02-25T02:55:03Z

If lowering cir dialect to math dialect and then to llvmir dialect

I'm a bit confused, is this about direct LLVM lowering or through MLIR. I recommend solving one problem at a time.
You could have a undefined attribute or some other name to decide how CIR_ClzOp should be lowered.

If we design CIR_ClzOp to return a signed 32-bit integer, there will be two redundant CastOp when implementing __lzcnt.

Looks like you cannot get rid of the cast anyways? If the llvm counter part usually expectes 32-bit integer you have to make it fit.

If we design CIR_ClzOp to return an integer whose type matches the input, there will be no problem for both __builtin_clz and __lzcnt

This is probably best tho seems orthogonal to the above point.

bcardosolopes · 2025-02-27T01:59:38Z

The last problem is about new IR design. The first way to solve the problem is how I showed in #1382 which modified the result type of CIR_BitOp base class, added a new IR CIR_LzcntOp. We can get a clean codegen and a complex lowering. If we want to support __builtin_clzg, we need to add another CIR. Another way to solve the problem is to reserve three Op in CIR, CountLeadingZerosOp, CountTrailingZerosOp and CtPopOp. Their definition is similar to LLVM IR.

There are two ways to lower builtins: (a) we map them to CIR operations or (b) we use a thin wrapper that can directly translate to any existing LLVM intrinsic via cir::LLVMIntrinsicCallOp. My take is to try (a) if it's an useful (popular) operation to have around, for things too low level I wouldn't bother as much (e.g. AArch64 intrinsics). YMMV because they have different trade offs. For this draft PR specifically, the approach sounds good to me. The example from another issue where there's a x86 flush sounds like is better off using cir::LLVMIntrinsicCallOp.

FantasqueX changed the title ~~Refine design of CIR_BitOp~~ [RFC] Refine design of CIR_BitOp Feb 23, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[RFC] Refine design of `CIR_BitOp` #1392

[RFC] Refine design of `CIR_BitOp` #1392

FantasqueX commented Feb 23, 2025

bcardosolopes commented Feb 25, 2025

bcardosolopes commented Feb 27, 2025

[RFC] Refine design of CIR_BitOp #1392

[RFC] Refine design of CIR_BitOp #1392

Comments

FantasqueX commented Feb 23, 2025

bcardosolopes commented Feb 25, 2025

bcardosolopes commented Feb 27, 2025

[RFC] Refine design of `CIR_BitOp` #1392

[RFC] Refine design of `CIR_BitOp` #1392