
chlo.lgamma const prop #182

Open · wants to merge 11 commits into base: main
Conversation

@vimarsh6739 (Member) commented Dec 9, 2024

for #179

// Bail out unless the operand is a compile-time constant.
if (!matchPattern(op.getOperand(), m_Constant(&inputAttr)))
  return failure();

// Reuse the chlo-to-stablehlo expansion so the result is stablehlo ops over constants.
Value result = materializeLgamma(rewriter, op.getLoc(), op->getOperands());
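For context, here is a minimal sketch of how these lines could sit inside the GammaConstProp pattern referenced later in this thread. The struct layout, the DenseElementsAttr type for inputAttr, and the rewriter.replaceOp call are assumptions rather than the PR's exact code; materializeLgamma is the (currently static) stablehlo helper discussed below.

// Sketch only: the surrounding struct and the replaceOp call are assumptions.
struct GammaConstProp final : OpRewritePattern<chlo::LgammaOp> {
  using OpRewritePattern<chlo::LgammaOp>::OpRewritePattern;

  LogicalResult matchAndRewrite(chlo::LgammaOp op,
                                PatternRewriter &rewriter) const override {
    // Only fire when the operand is a compile-time constant.
    DenseElementsAttr inputAttr;
    if (!matchPattern(op.getOperand(), m_Constant(&inputAttr)))
      return failure();

    // Expand lgamma into stablehlo ops over constants; downstream folders
    // can then evaluate the expansion to a single constant.
    Value result = materializeLgamma(rewriter, op.getLoc(), op->getOperands());
    rewriter.replaceOp(op, result);
    return success();
  }
};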
Member

So this is fun: I thought stablehlo at least had an eval method with the literal constant implemented. Apparently it doesn't =/.

Since this code looks like it comes from the chlo-to-stablehlo lowering, maybe we can just run the relevant lowering function here when the op has a constant operand (rather than copying the lowering here).


Member

Yeah, but now that I look at it, it's marked static.

GleasonK

Would be happy to expose the functions in some shareable way! Could make a header ChloDecompositionUtils.h and can expose the individual decomp pattern, or the materialize function of interest.
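A minimal sketch of what such a header could declare, assuming the declaration mirrors the materializeLgamma call from the PR description; the include guard, include paths, and namespace here are assumptions:

// ChloDecompositionUtils.h (sketch; exact location and namespace are assumptions)
#ifndef STABLEHLO_TRANSFORMS_CHLODECOMPOSITIONUTILS_H
#define STABLEHLO_TRANSFORMS_CHLODECOMPOSITIONUTILS_H

#include "mlir/IR/Builders.h"
#include "mlir/IR/Location.h"
#include "mlir/IR/ValueRange.h"

namespace mlir {
namespace chlo {

// Expands chlo.lgamma at `loc` into stablehlo ops and returns the result value.
Value materializeLgamma(OpBuilder &rewriter, Location loc, ValueRange args);

}  // namespace chlo
}  // namespace mlir

#endif  // STABLEHLO_TRANSFORMS_CHLODECOMPOSITIONUTILS_H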

@vimarsh6739 (Member Author) Dec 9, 2024

That would be helpful! It'd also be good to have a single implementation for the function definition.

@wsmoses requested a review from Pangoraw on December 9, 2024.
@vimarsh6739 changed the title from "[draft] chlo::lgamma const prop" to "[draft] chlo.lgamma const prop" on Dec 11, 2024.
@vimarsh6739 changed the title from "[draft] chlo.lgamma const prop" to "chlo.lgamma const prop" on Dec 11, 2024.
GleasonK pushed a commit to openxla/stablehlo that referenced this pull request on Dec 15, 2024:

We want to perform constant propagation through `chlo.lgamma` in Enzyme-JAX.

[Kevin](EnzymeAD/Enzyme-JAX#182 (comment)) mentioned he was open to exposing some materialize functions (which are currently static, and not callable from [our pass](https://github.com/EnzymeAD/Enzyme-JAX/blob/main/src/enzyme_ad/jax/Passes/EnzymeHLOOpt.cpp) atm).

@wsmoses @GleasonK
@wsmoses (Member) commented Jan 3, 2025

@vimarsh6739 anything blocking here (I think we've since updated to a stablehlo version with the interface)?

@vimarsh6739 (Member Author)

Not really, I'll clean up and update the PR in a bit (need to add a lit test).
We also need to propagate through log and log1p, but that can be a separate PR.
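A rough sketch of what that lit test could look like; the RUN line mirrors the pipeline used elsewhere in this thread, and the CHECK expectation is deliberately minimal, since the expansion only folds down to a single constant once log/log1p const prop also lands:

// RUN: enzymexlamlir-opt --pass-pipeline="builtin.module(enzyme-hlo-opt)" %s | FileCheck %s

func.func @lgamma_f32() -> tensor<f32> {
  // CHECK-NOT: chlo.lgamma
  %cst = stablehlo.constant dense<1.000000e+00> : tensor<f32>
  %0 = chlo.lgamma %cst : tensor<f32> -> tensor<f32>
  return %0 : tensor<f32>
}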

@vimarsh6739 (Member Author)

Interestingly, it seems like some other optimization is messing with chlo.lgamma after rebasing to main, because by itself the constprop lowering works. I have attached the lit test output for test/lit_tests/chlo_lower_to_stablehlo.mlir below (with basically only GammaConstProp enabled in the patterns):

module {
  func.func @lgamma_f32() -> tensor<f32> {
    %cst = stablehlo.constant dense<0x7F800000> : tensor<f32>
    %cst_0 = stablehlo.constant dense<1.14472985> : tensor<f32>
    %cst_1 = stablehlo.constant dense<3.14159274> : tensor<f32>
    %cst_2 = stablehlo.constant dense<0.918938517> : tensor<f32>
    %cst_3 = stablehlo.constant dense<2.01490307> : tensor<f32>
    %cst_4 = stablehlo.constant dense<7.500000e+00> : tensor<f32>
    %cst_5 = stablehlo.constant dense<8.000000e+00> : tensor<f32>
    %cst_6 = stablehlo.constant dense<1.50563267E-7> : tensor<f32>
    %cst_7 = stablehlo.constant dense<7.000000e+00> : tensor<f32>
    %cst_8 = stablehlo.constant dense<9.98436917E-6> : tensor<f32>
    %cst_9 = stablehlo.constant dense<6.000000e+00> : tensor<f32>
    %cst_10 = stablehlo.constant dense<-0.138571098> : tensor<f32>
    %cst_11 = stablehlo.constant dense<5.000000e+00> : tensor<f32>
    %cst_12 = stablehlo.constant dense<12.5073433> : tensor<f32>
    %cst_13 = stablehlo.constant dense<4.000000e+00> : tensor<f32>
    %cst_14 = stablehlo.constant dense<-176.615036> : tensor<f32>
    %cst_15 = stablehlo.constant dense<3.000000e+00> : tensor<f32>
    %cst_16 = stablehlo.constant dense<771.323425> : tensor<f32>
    %cst_17 = stablehlo.constant dense<2.000000e+00> : tensor<f32>
    %cst_18 = stablehlo.constant dense<-1259.13916> : tensor<f32>
    %cst_19 = stablehlo.constant dense<676.520386> : tensor<f32>
    %cst_20 = stablehlo.constant dense<1.000000e+00> : tensor<f32>
    %cst_21 = stablehlo.constant dense<5.000000e-01> : tensor<f32>
    %0 = stablehlo.compare  LT, %cst_20, %cst_21 : (tensor<f32>, tensor<f32>) -> tensor<i1>
    %1 = stablehlo.negate %cst_20 : tensor<f32>
    %2 = stablehlo.subtract %cst_20, %cst_20 : tensor<f32>
    %3 = stablehlo.select %0, %1, %2 : tensor<i1>, tensor<f32>
    %4 = stablehlo.add %3, %cst_20 : tensor<f32>
    %5 = stablehlo.divide %cst_19, %4 : tensor<f32>
    %6 = stablehlo.add %cst_20, %5 : tensor<f32>
    %7 = stablehlo.add %3, %cst_17 : tensor<f32>
    %8 = stablehlo.divide %cst_18, %7 : tensor<f32>
    %9 = stablehlo.add %6, %8 : tensor<f32>
    %10 = stablehlo.add %3, %cst_15 : tensor<f32>
    %11 = stablehlo.divide %cst_16, %10 : tensor<f32>
    %12 = stablehlo.add %9, %11 : tensor<f32>
    %13 = stablehlo.add %3, %cst_13 : tensor<f32>
    %14 = stablehlo.divide %cst_14, %13 : tensor<f32>
    %15 = stablehlo.add %12, %14 : tensor<f32>
    %16 = stablehlo.add %3, %cst_11 : tensor<f32>
    %17 = stablehlo.divide %cst_12, %16 : tensor<f32>
    %18 = stablehlo.add %15, %17 : tensor<f32>
    %19 = stablehlo.add %3, %cst_9 : tensor<f32>
    %20 = stablehlo.divide %cst_10, %19 : tensor<f32>
    %21 = stablehlo.add %18, %20 : tensor<f32>
    %22 = stablehlo.add %3, %cst_7 : tensor<f32>
    %23 = stablehlo.divide %cst_8, %22 : tensor<f32>
    %24 = stablehlo.add %21, %23 : tensor<f32>
    %25 = stablehlo.add %3, %cst_5 : tensor<f32>
    %26 = stablehlo.divide %cst_6, %25 : tensor<f32>
    %27 = stablehlo.add %24, %26 : tensor<f32>
    %28 = stablehlo.add %cst_4, %3 : tensor<f32>
    %29 = stablehlo.divide %3, %cst_4 : tensor<f32>
    %30 = stablehlo.log_plus_one %29 : tensor<f32>
    %31 = stablehlo.add %cst_3, %30 : tensor<f32>
    %32 = stablehlo.divide %28, %31 : tensor<f32>
    %33 = stablehlo.add %3, %cst_21 : tensor<f32>
    %34 = stablehlo.subtract %33, %32 : tensor<f32>
    %35 = stablehlo.multiply %34, %31 : tensor<f32>
    %36 = stablehlo.log %27 : tensor<f32>
    %37 = stablehlo.add %cst_2, %35 : tensor<f32>
    %38 = stablehlo.add %37, %36 : tensor<f32>
    %39 = stablehlo.abs %cst_20 : tensor<f32>
    %40 = stablehlo.floor %39 : tensor<f32>
    %41 = stablehlo.subtract %39, %40 : tensor<f32>
    %42 = stablehlo.compare  LT, %cst_21, %41 : (tensor<f32>, tensor<f32>) -> tensor<i1>
    %43 = stablehlo.subtract %cst_20, %41 : tensor<f32>
    %44 = stablehlo.select %42, %43, %41 : tensor<i1>, tensor<f32>
    %45 = stablehlo.multiply %cst_1, %44 : tensor<f32>
    %46 = stablehlo.sine %45 : tensor<f32>
    %47 = stablehlo.log %46 : tensor<f32>
    %48 = stablehlo.subtract %cst_0, %47 : tensor<f32>
    %49 = stablehlo.subtract %48, %38 : tensor<f32>
    %50 = stablehlo.is_finite %47 : (tensor<f32>) -> tensor<i1>
    %51 = stablehlo.negate %47 : tensor<f32>
    %52 = stablehlo.select %50, %49, %51 : tensor<i1>, tensor<f32>
    %53 = stablehlo.select %0, %52, %38 : tensor<i1>, tensor<f32>
    %54 = chlo.is_inf %cst_20 : tensor<f32> -> tensor<i1>
    %55 = stablehlo.select %54, %cst, %53 : tensor<i1>, tensor<f32>
    return %55 : tensor<f32>
  }
}

After re-enabling all the other patterns, though, I get this:

PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: bazel-bin/enzymexlamlir-opt --pass-pipeline=builtin.module(enzyme-hlo-opt) test/lit_tests/chlo_to_stablehlo.mlir
[1]    58336 segmentation fault  bazel-bin/enzymexlamlir-opt --pass-pipeline="builtin.module(enzyme-hlo-opt)" 

Currently trying to sweep through the list of emitted ops, but I would appreciate pointers on why this is crashing.

@wsmoses (Member) commented Jan 4, 2025

If you run it in gdb, can you tell which pattern is triggering it?

Another option is to pass --debug, which will spew output for every transform.
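For example, something along these lines (the exact invocation is an assumption, mirroring the crashing command above):

bazel-bin/enzymexlamlir-opt --debug --pass-pipeline="builtin.module(enzyme-hlo-opt)" test/lit_tests/chlo_to_stablehlo.mlir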

@vimarsh6739 (Member Author)

Haven't enabled debug symbols (laptop build), but it seems to crash in DivideSqrtToMultiplyRsqrt:

% lldb bazel-bin/enzymexlamlir-opt -- --pass-pipeline="builtin.module(enzyme-hlo-opt)" test/lit_tests/debugger.mlir
(lldb) target create "bazel-bin/enzymexlamlir-opt"
Current executable set to '/Users/vsathia/dev/Enzyme-JAX/bazel-bin/enzymexlamlir-opt' (arm64).
(lldb) settings set -- target.run-args  "--pass-pipeline=builtin.module(enzyme-hlo-opt)" "test/lit_tests/debugger.mlir"
(lldb) run
Process 59839 launched: '/Users/vsathia/dev/Enzyme-JAX/bazel-bin/enzymexlamlir-opt' (arm64)
Process 59839 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x24)
    frame #0: 0x00000001003db86c enzymexlamlir-opt`DivideSqrtToMultiplyRsqrt::matchAndRewrite(mlir::stablehlo::DivOp, mlir::PatternRewriter&) const + 80
enzymexlamlir-opt`DivideSqrtToMultiplyRsqrt::matchAndRewrite:
->  0x1003db86c <+80>: ldr    w8, [x8]
    0x1003db870 <+84>: cmp    w8, #0x0
    0x1003db874 <+88>: mov    x9, #-0x10 ; =-16 
    0x1003db878 <+92>: csel   x9, xzr, x9, eq
(lldb) 

@vimarsh6739 (Member Author)

Which is again strange, as the expanded op doesn't have a sqrt....

@wsmoses (Member) commented Jan 4, 2025

cc @avik-pal

@vimarsh6739 (Member Author)

Ok, disabling it works for now. @avik-pal, you can use the above example as a test case.

@wsmoses (Member) commented Jan 4, 2025

@vimarsh6739 does #216 fix it for you?

@vimarsh6739 (Member Author)

Let me check.

@vimarsh6739 (Member Author)

That works.

@wsmoses (Member) left a review comment

LGTM, but maybe it would make sense to add the log, is_inf, and log_plus_one constprop ones first (so this test just becomes a single const return)?

@vimarsh6739 (Member Author)

Yep, will add those as a separate PR. Let's hold off on merging for now then.

@vimarsh6739 (Member Author)

log and log1p are handled in #218.
