8346664: C2: Optimize mask check with constant offset #22856

mernst-github · 2024-12-21T14:08:02Z

Fixes JDK-8346664: extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.

Such constant values arise in an expression of the following form, for example from MemorySegmentImpl#isAlignedForElement:

(base + (index + 1) << 8) & 255
=> MulNode
(base + (index << 8 + 256)) & 255
=> AddNode
((base + index << 8) + 256) & 255

Currently, 256 is not being recognized as a shifted value. This PR enables further reduction:

((base + index << 8) + 256) & 255
=> MulNode (this PR)
(base + index << 8) & 255
=> MulNode (PR #6697)
base & 255 (loop invariant)

Implementation notes:

I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.
~~in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.~~
~~I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.~~

Progress

Change must be properly reviewed (1 review required, with at least 1 Reviewer)
Change must not contain extraneous whitespace
Commit message must refer to an issue

Issue

JDK-8346664: C2: Optimize mask check with constant offset (Enhancement - P4)

Reviewers

Quan Anh Mai (@merykitty - Committer) Review applies to 58375582
Emanuel Peter (@eme64 - Reviewer)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22856/head:pull/22856
$ git checkout pull/22856

Update a local copy of the PR:
$ git checkout pull/22856
$ git pull https://git.openjdk.org/jdk.git pull/22856/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22856

View PR using the GUI difftool:
$ git pr show -t 22856

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22856.diff

Using Webrev

Link to Webrev Comment

Extends the optimization of masked sums introduced in openjdk#6697 to cover constant values, which currently break the optimization. Such constant values arise in an expression of the following form, for example from MemorySegmentImpl#isAlignedForElement: (base + (index + 1) << 8) & 255 => MulNode (base + (index << 8 + 256)) & 255 => AddNode ((base + index << 8) + 256) & 255 Currently, "256" is not being recognized as a shifted value. This PR enables: ((base + index << 8) + 256) & 255 => MulNode (base + index << 8) & 255 => MulNode (PR openjdk#6697) base & 255

bridgekeeper · 2024-12-21T14:08:55Z

Hi @mernst-github, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user mernst-github" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

openjdk · 2024-12-21T14:09:02Z

@mernst-github This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8346664: C2: Optimize mask check with constant offset

Reviewed-by: epeter, qamai

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 20 new commits pushed to the master branch:

b3a4026: 8349764: RISC-V: C1: Improve Class.isInstance intrinsic
071c8f5: 8349909: jdk.internal.jimage.decompressor.ZipDecompressor does not close the Inflater in exceptional cases
f1258f9: 8349755: Fix corner case issues in async UL
b1b4828: 8350086: Inline hot Method accessors for faster task selection
5e9d72e: 8350094: Linux gcc 13.2.0 build fails when ubsan is enabled
2192723: 8350162: ProblemList compiler/tiered/Level2RecompilationTest.java
7d11418: 8350147: Replace example in KEM class with the one from JEP 452
5cf1132: 8350098: jpackage test lib erroneously will run methods without @test annotation as tests
6234536: 8349915: CTW: Lots of level 3 compiles are done at level 2 after JDK-8348570
b6443f6: 8348347: Cleanup JavaThread subclass support in SA
... and 10 more: https://git.openjdk.org/jdk/compare/ff52859d2ad65b97c56dd19323213a0d07be47ae...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@merykitty, @eme64) but any other Committer may sponsor as well.

➡️ To flag this PR as ready for integration with the above commit message, type /integrate in a new comment. (Afterwards, your sponsor types /sponsor in a new comment to perform the integration).

openjdk · 2024-12-21T14:09:31Z

@mernst-github The following label will be automatically applied to this pull request:

hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

merykitty · 2024-12-23T16:22:18Z

src/hotspot/share/opto/mulnode.cpp

@@ -2052,94 +2052,88 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const {
  }
 }

-// Given an expression (AndX shift mask) or (AndX mask shift),
+// Returns a lower bound of the number of trailing zeros in expr.
+jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {


This could be a static function, I don't see much value in it being a method in MulNode.

merykitty · 2024-12-23T16:22:33Z

src/hotspot/share/opto/mulnode.cpp

+// Returns a lower bound of the number of trailing zeros in expr.
+jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
+  expr = expr->uncast();
+  if (expr == nullptr) {


This should not be nullptr, you can safely remove it.

merykitty · 2024-12-23T16:23:31Z

src/hotspot/share/opto/mulnode.cpp

+
+  if (type->is_con()) {
+    long con = type->get_con_as_long(type->basic_type());
+    return con == 0L ? 0 : count_trailing_zeros(con);


For the sake of consistency, we should return the type width for con == 0, you can obtain this by type2aelementbytes(bt) * 8

merykitty · 2024-12-23T16:25:03Z

src/hotspot/share/opto/mulnode.cpp

+
+  if (expr->Opcode() == Op_ConvI2L) {
+    expr = expr->in(1);
+    if (expr == nullptr) {


This cannot be nullptr, you can safely remove it, the same for expr->uncast() below. In general, the only case when the input of a ConvI2L (and other nodes) not being an int is when it is top, which means it is empty. A.k.a unreachable code.

Done for all (also original inputs expr and mask). I don't think I understand under which conditions null/top may occur, so pls double-check.

merykitty · 2024-12-23T16:26:17Z

src/hotspot/share/opto/mulnode.cpp

+    if (expr == nullptr) {
+      return 0;
+    }
+    type = phase->type(expr)->isa_int();


You are trying to look through a ConvI2L, I think for the sake of consistency, you can reassign bt to T_INT at this point.

merykitty · 2024-12-23T16:27:28Z

src/hotspot/share/opto/mulnode.cpp

+      return 0;
+    }
+    const TypeInt* rhs_t = phase->type(rhs)->isa_int();
+    if (!rhs_t || !rhs_t->is_con()) {


We are trying to avoid implicit conversion to bool, you can use an explicit rhs_t != nullptr here.

merykitty · 2024-12-23T16:28:19Z

src/hotspot/share/opto/mulnode.cpp

+    if (!rhs_t || !rhs_t->is_con()) {
+      return 0;
+    }
+    return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1);


If you reassign bt, you can do type2aelementbytes(bt), which IMO is clearer.

merykitty · 2024-12-23T16:36:54Z

src/hotspot/share/opto/mulnode.cpp

-bool MulNode::AndIL_shift_and_mask_is_always_zero(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) {
-  if (mask == nullptr || shift == nullptr) {
+// mask M, we check for both operand orders.
+bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) {


Actually you cannot conclude that ((x + y) & m) == 0 iff (x & m) == 0 when (y & m) == 0 because the addition x + y can carry some bit into the positions at which m is set. Consider this example for illustration:

(0b1010 + 0b0010) & 0b0100 == 0b1100 & 0b0100 == 0b0100 != 0

even when

0b1010 & 0b0100 == 0 0b0010 & 0b0100 == 0

The most trivial sufficient condition we are using here is that the lowest bit set of y is larger than the highest bit set of m. Because then adding y into x does not carry any bit into the result that is set in m but not set in x. This method can be a static function, too IMO.

Very good point. Adjusted naming and comments a bit, and added a negative test case.

merykitty · 2024-12-23T16:38:29Z

test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java

@@ -120,7 +120,7 @@ public static void checkShiftNonConstMaskLong(long res) {
    @IR(counts = { IRNode.AND_I, "1" })
    @IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I })
    public static int addShiftMaskInt(int i, int j) {
-        return (j + (i << 2)) & 3; // transformed to: return j & 3;
+        return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3;


I would prefer you adding other test cases instead of modifying existing ones.

* make helpers static * remove null checks * separate test cases * naming / reword comments readability: * move reverse checks to (some) callsites

mernst-github

Appreciate the review!

mernst-github · 2024-12-24T09:02:20Z

src/hotspot/share/opto/mulnode.cpp

@@ -2052,94 +2052,88 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const {
  }
 }

-// Given an expression (AndX shift mask) or (AndX mask shift),
+// Returns a lower bound of the number of trailing zeros in expr.
+jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {


mernst-github · 2024-12-24T09:05:13Z

src/hotspot/share/opto/mulnode.cpp

+// Returns a lower bound of the number of trailing zeros in expr.
+jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
+  expr = expr->uncast();
+  if (expr == nullptr) {


mernst-github · 2024-12-24T09:06:14Z

src/hotspot/share/opto/mulnode.cpp

+
+  if (expr->Opcode() == Op_ConvI2L) {
+    expr = expr->in(1);
+    if (expr == nullptr) {


Done for all (also original inputs expr and mask). I don't think I understand under which conditions null/top may occur, so pls double-check.

mernst-github · 2024-12-24T09:06:20Z

src/hotspot/share/opto/mulnode.cpp

+
+  if (type->is_con()) {
+    long con = type->get_con_as_long(type->basic_type());
+    return con == 0L ? 0 : count_trailing_zeros(con);


mernst-github · 2024-12-24T09:06:30Z

src/hotspot/share/opto/mulnode.cpp

+    if (expr == nullptr) {
+      return 0;
+    }
+    type = phase->type(expr)->isa_int();


mernst-github · 2024-12-24T09:06:36Z

src/hotspot/share/opto/mulnode.cpp

+      return 0;
+    }
+    const TypeInt* rhs_t = phase->type(rhs)->isa_int();
+    if (!rhs_t || !rhs_t->is_con()) {


mernst-github · 2024-12-24T09:06:46Z

src/hotspot/share/opto/mulnode.cpp

+    if (!rhs_t || !rhs_t->is_con()) {
+      return 0;
+    }
+    return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1);


mernst-github · 2024-12-24T09:07:30Z

src/hotspot/share/opto/mulnode.cpp

-bool MulNode::AndIL_shift_and_mask_is_always_zero(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) {
-  if (mask == nullptr || shift == nullptr) {
+// mask M, we check for both operand orders.
+bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) {


Very good point. Adjusted naming and comments a bit, and added a negative test case.

mernst-github · 2024-12-24T09:09:08Z

src/hotspot/share/opto/mulnode.cpp

+  if (AndIL_is_zero_element(phase, in(1), in(2), T_INT) ||
+      AndIL_is_zero_element(phase, in(2), in(1), T_INT)) {


I find it easier to reason about the "reverse" check when we simply expand it here.

mernst-github · 2024-12-24T09:15:15Z

test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java

@@ -120,7 +120,7 @@ public static void checkShiftNonConstMaskLong(long res) {
    @IR(counts = { IRNode.AND_I, "1" })
    @IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I })
    public static int addShiftMaskInt(int i, int j) {
-        return (j + (i << 2)) & 3; // transformed to: return j & 3;
+        return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3;


mernst-github · 2025-01-01T09:42:36Z

/signed

bridgekeeper · 2025-01-01T09:43:35Z

Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

merykitty

LGTM, thanks a lot.

merykitty · 2025-01-02T11:30:52Z

src/hotspot/share/opto/mulnode.cpp

 // when the shift value N is large enough to zero out
 // all the set positions of the and-mask M.
 //   (AndI (LShiftI _ #N) #M) => #0
 //   (AndL (LShiftL _ #N) #M) => #0
 //   (AndL (ConvI2L (LShiftI _ #N)) #M) => #0
+//   as well as for constant operands:
+//   (AndI (ConI [+-] _ << #N) #M) => #0


(AndI (ConI (_ << #N)) #M)

I think writing like this is clearer, it is confusing talking about signs in bitwise operations. Also, please remove => #0 in these.

openjdk · 2025-01-02T11:41:57Z

@mernst-github Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information.

eme64 · 2025-01-08T07:32:47Z

test/hotspot/jtreg/compiler/c2/irTests/TestShiftAndMask.java

+            throw new RuntimeException("incorrect result: " + res);
+        }
+    }
+


Nice to see that you have some examples here!

I think it would be great to have some more though. The divil hides in the details. In the edge cases usually.

You currently have patterns like this:
(j + ((i + c1) << c2)) & c3;
What if you generate the constants c1, c2, c3 randomly:
public static final int C1 = random.nextInt() (or some other random distribution that makes more sense).
Then the compiler will see them as constants (because final), and attempt constant folding.

You can then do result verification: You create a method copy that you restrict to the interpreter, and the other copy can be compiled. Then you test the method with all sorts of random inputs for i, j, and verify the results of the two methods (compiled vs interpreted).

Maybe you can add some more patterns as well, just to have a better test coverage.

Does that make sense?

I believe I understand the intent, and I've now randomized all constant masks / shifts / consts in this file. But just to make sure: IIUC the tests are only compiled once per invocation, there is no way I can tell the framework to "C2 compile this x times with different random constants". I.e. I can make test this a hundred times locally, but I cannot create large coverage via the framework, right?

Also not quite sure I understand the verification proposal. How would that be different from the current comparisons if (result != expected simplified form) ? Now if the framework supported an automatic comparison of compiled vs interpreted invocation, that would be nice.

Yeah, creating large coverage with a single run under the IR framework is not currently possible I think.

Generally, there are other tricks to get "changing constants", see what I did with setConstant and int_con in this test:
test/hotspot/jtreg/compiler/loopopts/superword/TestAlignVectorFuzzer.java

But the tests are rerun a lot anyway, so that is not super necessary.

I am working on a Template framework that makes using random constants much easier, and also generating multiple methods where only the constants differ. That should make things a little easier.

I suppose that works: if (result != expected simplified form)
Though only for cases where we have a valid simplification. If you also want to test the cases that have a very similar pattern, but should not accidentally wrongly optimize, then you would have to do the compiled/interpreted comparison.

eme64 · 2025-01-08T07:40:04Z

src/hotspot/share/opto/mulnode.cpp

+  }
+
+  jint zeros = AndIL_min_trailing_zeros(phase, expr, bt);
+  return zeros > 0 && ((((jlong)1) << zeros) > mask_t->hi_as_long() && mask_t->lo_as_long() >= 0);


This line indicates that the mask could be a variable. You should make sure to add some tests for that in your patterns. You can create a variable in a specific range like this Math.min(5, Math.max(1, x)), should get you x clamped into the region 1..5.

Added a variant for adding consts using the same pattern as the other "NonConstMask" tests.

Can you please point me to the tests where mask_t is a range, and not a constant?

The tests addConstNonConstMask[Int,Long] (https://github.com/openjdk/jdk/pull/22856/files#diff-2c6beb2b7bcb76601adb439471e786963c6e0d5cb6db132381f64e10df5819daR207), copied from the existing NonConstMask tests exercise this.

eme64 · 2025-01-08T07:41:39Z

src/hotspot/share/opto/mulnode.cpp

+// Is expr a neutral element wrt addition under mask?
+static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt);


I would prefer a more expressive function name over a comment here. The comment is a little confusing to me too.

The old name at least talked about shift and mask - is that not relevant any more?

Or you just decide to name it AndIL_is_zero, and drop out the comment. Because who knows someone might add other things that check for zero in that method, and then your comment would be out-dated (but probably people would forget to adjust it).

Renamed to is_zero_element_under_mask. "zero element" for me drives down that it's neither

checking for expr == 0

nor checking for expr & mask == 0
but really (X + expr) & mask == X & mask for all X.

There is no requirement for shift node, e.g, we recognize is_zero_under_mask(192, 7). However, the constants that are recognized here are "shifts in spirit" (e.g. expanded from (i + 24) << 3). If you can think of a good term for this that doesn't suggest there's an actual "shift node" we could try and incorporate that.

Since this is a forward declaration, the elaborate comment is below. Went and dropped the short one here.

eme64 · 2025-01-08T07:42:41Z

src/hotspot/share/opto/mulnode.cpp

@@ -670,9 +670,13 @@ const Type *AndINode::mul_ring( const Type *t0, const Type *t1 ) const {
  return and_value<TypeInt>(r0, r1);
 }

+// Is expr a neutral element wrt addition under mask?
+static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt);
+
 const Type* AndINode::Value(PhaseGVN* phase) const {
  // patterns similar to (v << 2) & 3


Should this comment be updated now for a more general pattern?

I think it's better to drop this and refer the reader to the definition.

eme64 · 2025-01-08T07:47:54Z

src/hotspot/share/opto/mulnode.cpp

+// Returns a lower bound on the number of trailing zeros in expr.
+static jint AndIL_min_trailing_zeros(const PhaseGVN* phase, const Node* expr, BasicType bt) {


Is this method restricted to use in AndIL? Because it looks like it is doing something more generic: trying to figure out a lower bound on the trailing zeros of an expression.

If that is the case: Why not put it in Node::get_trailing_zeros_lower_bound(phase, bt), so it can be used elsewhere too?

I would argue that while this might be incidentally reusable outside of the scope of "And" nodes, as long as there is no actual demand to reuse this, I would rather not add it to the rather prominent Node class to avoid api bloat.

Iff the notion of "is known to be a multiple of a certain power of two" is really of general interest, I would expect it to become part of TypeInteger.

Sure, just leave it where it is for now. I'm ok with it.

mernst-github · 2025-02-14T07:19:47Z

I like it, thanks a lot for finishing that up! Incorporated and pushed your version.

I was trying to avoid a % operator and argue with "congruence", e.g. if a and b equiv mod m, then a & mask = b & mask, but did not succeed at it :-)

eme64 · 2025-02-14T08:56:08Z

@mernst-github Glad you like it 😊

I ran testing one more time, since the VM code changed slightly since last time, I think. If @merykitty doesn't disagree we can hopefully integrate after the weekend :)

eme64

@mernst-github Tests are passing. Nice work, and thanks for sticking with us! Approved.

mernst-github · 2025-02-17T12:18:23Z

Super, glad this worked out! I want to return the compliment, thanks for sticking with me :-)

mernst-github · 2025-02-17T12:18:31Z

/integrate

openjdk · 2025-02-17T12:19:29Z

@mernst-github
Your change (at version b7a16a1) is now ready to be sponsored by a Committer.

eme64 · 2025-02-17T13:06:03Z

/sponsor

openjdk · 2025-02-17T13:07:24Z

Going to push as commit 7f3ecb4.
Since your change was applied there have been 20 commits pushed to the master branch:

b3a4026: 8349764: RISC-V: C1: Improve Class.isInstance intrinsic
071c8f5: 8349909: jdk.internal.jimage.decompressor.ZipDecompressor does not close the Inflater in exceptional cases
f1258f9: 8349755: Fix corner case issues in async UL
b1b4828: 8350086: Inline hot Method accessors for faster task selection
5e9d72e: 8350094: Linux gcc 13.2.0 build fails when ubsan is enabled
2192723: 8350162: ProblemList compiler/tiered/Level2RecompilationTest.java
7d11418: 8350147: Replace example in KEM class with the one from JEP 452
5cf1132: 8350098: jpackage test lib erroneously will run methods without @test annotation as tests
6234536: 8349915: CTW: Lots of level 3 compiles are done at level 2 after JDK-8348570
b6443f6: 8348347: Cleanup JavaThread subclass support in SA
... and 10 more: https://git.openjdk.org/jdk/compare/ff52859d2ad65b97c56dd19323213a0d07be47ae...master

Your commit was automatically rebased without conflicts.

openjdk · 2025-02-17T13:07:33Z

@eme64 @mernst-github Pushed as commit 7f3ecb4.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

merykitty · 2025-02-17T13:35:43Z

@mernst-github Thanks a lot for going with us through this process, your contribution will be engraved in the (git) history 😂

TobiHartmann · 2025-02-24T09:34:21Z

This caused a regression in CCP: JDK-8350563 @mernst-github, could you please have a look? Thanks.

mernst-github · 2025-02-24T10:05:42Z

Sorry for that. Super-quick look: I have no idea what CCP is, but it appears that it contains redundant logic and knows what kind of optimizations are performed elsewhere. https://github.com/openjdk/jdk21u-dev/blob/fa896c742b494520e6c652539e8218b7127e877a/src/hotspot/share/opto/phaseX.cpp#L1972 looks salient, just from pattern matching I wonder if it's sufficient to add Op_ConI/L to this list.

mernst-github · 2025-02-24T10:08:52Z

Another place that seems to understand the Shift-And pattern: https://github.com/openjdk/jdk21u-dev/blob/master/src/hotspot/share/opto/phaseX.cpp#L1611

TobiHartmann · 2025-02-24T13:45:58Z

Yes, both CCP and IGVN need special logic to make sure that a node A is added to the worklist if another node B that is not a direct input to B changed such that node A could now be further optimized. You added such an optimization but the code to add the node to the worklist is missing. It's exactly the two places you referred to. I think it should be possible to create a test for IGVN as well, i.e. come up with a case where the optimization is not performed although it could.

mernst-github · 2025-02-25T06:09:54Z

Related: https://bugs.openjdk.org/browse/JDK-8288683

I'm still trying to understand:

why this is intermittent, it seems to me that it should trigger deterministically
what's the severity of this? do we need to back out or is this merely an indicator for a missed downstream optimization. 8288683 has a case where execution would be incorrect, also not sure I understand how that can happen.

eme64 · 2025-02-25T07:21:03Z

Related: https://bugs.openjdk.org/browse/JDK-8288683

I'm still trying to understand:

why this is intermittent, it seems to me that it should trigger deterministically

Sometimes that is due to timing, i.e. differences in profiling, what code gets inlined etc. That can change the shape of the IR and the order in which it is processed. That can change the order in which things get folded and propagated, and so in some cases the bug triggers and in others not. It would be good to extract a smaller reproducer where it triggers more reliably. Here StressIGVN and StressCCP with RepeatCompilation can be your friends.

what's the severity of this? do we need to back out or is this merely an indicator for a missed downstream optimization. 8288683 has a case where execution would be incorrect, also not sure I understand how that can happen.

Let me explain:

In IGVN, we only narrow types. If you miss an optimization, then the result is suboptimal, but still correct.
In CCP, we start with empty types and widen them until the types are correct. If we miss a step, then the type is too narrow and the result is incorrect. Imagine a type is too narrow, and then we have some condition based on it. Imagine the range is wrongly [0,5] but should be [0,10]. If we check if the value could be 9, we think it is false, but it could actually be true. We wrongly constant fold a condition and get wrong results.

Not sure if any of this applies in that example, but it might.

eme64 · 2025-02-25T07:22:56Z

About severity: As long as we find and integrate a fix during JDK25 it is fine (the issue does not break the CI that badly at the moment). If we get close to rampdown, then we have to consider if we want to backout 8346664 or if we defer the bug to JDK26.

mernst-github · 2025-02-28T06:54:52Z

The good news: it's easy to reproduce the issue by adding a loop to CheckProp, and I have a fix up for CCP that appears(!) straightforward. I do not quite understand the interplay with IGVN, the set of "push" rules does not look consistent between the two phases to me. But re-reading @eme64's response, CCP is necessary for correctness, IGVN is optimization potential(?).

That said, I have taken up new employment and need to sort out the contribution framework first before I can contribute the fix.

merykitty · 2025-02-28T07:20:57Z

@mernst-github IGVN and CCP try to infer information about a node from its inputs. As a result, when an input of a node is changed by IGVN or CCP, you need to do the inference on that node again. Theoretically, you need to push all nodes below a changed node to the worklist, but doing so is expensive and unnecessary. So, we use the fairly convoluted approach of pushing all immediate uses, and indirect uses are pushed in an ad-hoc manner depending on the way particular nodes do their inference. For example, in this case, since AndNode::Value looks through a LShiftNode, you need to do the opposite when pushing them to the IGVN worklist, i.e. pushing all AndNode that is a use of a direct use LShiftNode of the current node.

I do not quite understand the interplay with IGVN, the set of "push" rules does not look consistent between the two phases to me.

It is because IGVN uses Ideal, Identity, and Value, which means that the pushing logic needs to take into consideration the inference of all these methods. While CCP only uses Value, so the pushing logic only needs to know which nodes the Value methods try to look through.

CCP is necessary for correctness, IGVN is optimization potential(?).

Not really, we assume a graph is stable after IGVN, so missed idealisation here can also be considered a bug (especially missed transformations with CFG nodes). The difference is that these assumptions do not practically cover all nodes so we may get away with some of them.

That said, I have taken up new employment and need to sort out the contribution framework first before I can contribute the fix.

Don't worry, RDP1 of JDK-25 is not so urgent. So take your time.

mernst-github · 2025-03-03T12:54:02Z

Concretely, my understanding of what's breaking here is the following:

AndI ( CastII ( ConI ) ) : AndI reaches through the cast and, since 8346664, also recognizes constants in addition to shifts and can eliminate. This currently gets stuck on the Cast node in CCP.

Pushing the And node fixes the crash and looks straightforward, but it makes me wonder why it is necessary, and why the change doesn't bubble through the Cast already: shouldn't the cast node be re-pushed when its input changes,CastII(Con).Value() now return the constant, and then bubble further to the And? It appears to me that either there's something off in Cast's Value handling (there's more to it than expected ;-), OR - if it is somehow necessary to preserve this cast - it is not clear to me why And (and others) reaching through the Cast is legal in the first place. Does this make sense?

merykitty · 2025-03-03T16:52:00Z

The issue is interesting. The code shape looks like this AndI(CastII(Phi), ConI). Phi is a constant 0 while CastII is top. CastII is top because its control input is dead. And, as a result, CastII does not change after CCP and we do not push its inputs (in this case the AndI) to the worklist, resulting in the AndI remains to be a top. On the other hand, AndINode::Value looks through CastIINodes (in AndIL_min_trailing_zeros we do expr = expr->uncast()). Which means that it sees the Phi being a constant 0, and returns TypeInt::ZERO.

Of course, in this particular case, a solution is to check for top inputs before proceeding with any action. However, I think that it is not sufficient. Given that we often look through CastNodes when doing inference, I think it is suitable we push nodes to the worklist through CastNodes, too.

mernst-github · 2025-03-03T21:03:01Z

My proposed fix for this is in #23871 .

bridgekeeper bot added the oca Needs verification of OCA signatory status label Dec 21, 2024

openjdk bot added the hotspot-compiler [email protected] label Dec 21, 2024

mernst-github changed the title ~~8346664: C2: optimize constant addends in masked sums~~ 8346664: C2: Optimize mask check with constant offset Dec 21, 2024

merykitty reviewed Dec 23, 2024

View reviewed changes

mernst-github added 6 commits December 24, 2024 09:19

review:

2b8da62

* make helpers static * remove null checks * separate test cases * naming / reword comments readability: * move reverse checks to (some) callsites

undo method reordering

bc3c4f0

T != X

ba2f82b

counter

cabc0d8

--more null checks

df79a96

consistently use bt

2a79c24

mernst-github commented Dec 24, 2024

View reviewed changes

bridgekeeper bot added the oca-verify Needs verification of OCA signatory status label Jan 1, 2025

merykitty approved these changes Jan 2, 2025

View reviewed changes

comment readability

1555846

mernst-github force-pushed the mernst/JDK-8346664 branch from 0713345 to 1555846 Compare January 2, 2025 11:41

eme64 reviewed Jan 8, 2025

View reviewed changes

mernst-github added 3 commits January 8, 2025 13:50

naming, comment

e0f44b4

randomize, test "(j + const) & nonconst_mask".

6df13ff

addConstNonConstMaskLong

9d3ca06

eme64 approved these changes Feb 17, 2025

View reviewed changes

openjdk bot added the ready Pull request is ready to be integrated label Feb 17, 2025

openjdk bot added the sponsor Pull request is ready to be sponsored label Feb 17, 2025

openjdk bot added the integrated Pull request has been integrated label Feb 17, 2025

openjdk bot closed this Feb 17, 2025

openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review sponsor Pull request is ready to be sponsored labels Feb 17, 2025

cushon mentioned this pull request Mar 5, 2025

8350563: C2 compilation fails because PhaseCCP does not reach a fixpoint #23871

Open

3 tasks

		if (AndIL_is_zero_element(phase, in(1), in(2), T_INT) \|\|
		AndIL_is_zero_element(phase, in(2), in(1), T_INT)) {

		// Is expr a neutral element wrt addition under mask?
		static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt);

		// Returns a lower bound on the number of trailing zeros in expr.
		static jint AndIL_min_trailing_zeros(const PhaseGVN* phase, const Node* expr, BasicType bt) {

8346664: C2: Optimize mask check with constant offset #22856

8346664: C2: Optimize mask check with constant offset #22856

Conversation

mernst-github commented Dec 21, 2024 • edited by openjdk bot Loading

Progress

Issue

Reviewers

Reviewing

bridgekeeper bot commented Dec 21, 2024

openjdk bot commented Dec 21, 2024 • edited Loading

openjdk bot commented Dec 21, 2024

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github left a comment • edited Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github Dec 24, 2024 • edited by bridgekeeper bot Loading

Choose a reason for hiding this comment

mernst-github commented Jan 1, 2025

bridgekeeper bot commented Jan 1, 2025

merykitty left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

openjdk bot commented Jan 2, 2025

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mernst-github commented Dec 21, 2024 •

edited by openjdk bot

Loading

openjdk bot commented Dec 21, 2024 •

edited

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github left a comment •

edited

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

mernst-github Dec 24, 2024 •

edited by bridgekeeper bot

Loading

eme64 commented Feb 25, 2025 •

edited

Loading

mernst-github commented Feb 28, 2025 •

edited

Loading