-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8346664: C2: Optimize mask check with constant offset #22856
Conversation
Extends the optimization of masked sums introduced in openjdk#6697 to cover constant values, which currently break the optimization. Such constant values arise in an expression of the following form, for example from MemorySegmentImpl#isAlignedForElement: (base + (index + 1) << 8) & 255 => MulNode (base + (index << 8 + 256)) & 255 => AddNode ((base + index << 8) + 256) & 255 Currently, "256" is not being recognized as a shifted value. This PR enables: ((base + index << 8) + 256) & 255 => MulNode (base + index << 8) & 255 => MulNode (PR openjdk#6697) base & 255
Hi @mernst-github, welcome to this OpenJDK project and thanks for contributing! We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user mernst-github" as summary for the issue. If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing |
@mernst-github This change now passes all automated pre-integration checks. ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details. After integration, the commit message for the final commit will be:
You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed. At the time when this comment was updated there had been 20 new commits pushed to the
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details. As you do not have Committer status in this project an existing Committer must agree to sponsor your change. Possible candidates are the reviewers of this PR (@merykitty, @eme64) but any other Committer may sponsor as well. ➡️ To flag this PR as ready for integration with the above commit message, type |
@mernst-github The following label will be automatically applied to this pull request:
When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command. |
src/hotspot/share/opto/mulnode.cpp
Outdated
@@ -2052,94 +2052,88 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const { | |||
} | |||
} | |||
|
|||
// Given an expression (AndX shift mask) or (AndX mask shift), | |||
// Returns a lower bound of the number of trailing zeros in expr. | |||
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a static function, I don't see much value in it being a method in MulNode
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
// Returns a lower bound of the number of trailing zeros in expr. | ||
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) { | ||
expr = expr->uncast(); | ||
if (expr == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be nullptr
, you can safely remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
|
||
if (type->is_con()) { | ||
long con = type->get_con_as_long(type->basic_type()); | ||
return con == 0L ? 0 : count_trailing_zeros(con); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the sake of consistency, we should return the type width for con == 0
, you can obtain this by type2aelementbytes(bt) * 8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
|
||
if (expr->Opcode() == Op_ConvI2L) { | ||
expr = expr->in(1); | ||
if (expr == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This cannot be nullptr
, you can safely remove it, the same for expr->uncast()
below. In general, the only case when the input of a ConvI2L
(and other nodes) not being an int
is when it is top
, which means it is empty. A.k.a unreachable code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done for all (also original inputs expr and mask). I don't think I understand under which conditions null/top
may occur, so pls double-check.
if (expr == nullptr) { | ||
return 0; | ||
} | ||
type = phase->type(expr)->isa_int(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are trying to look through a ConvI2L
, I think for the sake of consistency, you can reassign bt
to T_INT
at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
return 0; | ||
} | ||
const TypeInt* rhs_t = phase->type(rhs)->isa_int(); | ||
if (!rhs_t || !rhs_t->is_con()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are trying to avoid implicit conversion to bool
, you can use an explicit rhs_t != nullptr
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
if (!rhs_t || !rhs_t->is_con()) { | ||
return 0; | ||
} | ||
return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you reassign bt
, you can do type2aelementbytes(bt)
, which IMO is clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
bool MulNode::AndIL_shift_and_mask_is_always_zero(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) { | ||
if (mask == nullptr || shift == nullptr) { | ||
// mask M, we check for both operand orders. | ||
bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually you cannot conclude that ((x + y) & m) == 0 iff (x & m) == 0
when (y & m) == 0
because the addition x + y
can carry some bit into the positions at which m
is set. Consider this example for illustration:
(0b1010 + 0b0010) & 0b0100 == 0b1100 & 0b0100 == 0b0100 != 0
even when
0b1010 & 0b0100 == 0
0b0010 & 0b0100 == 0
The most trivial sufficient condition we are using here is that the lowest bit set of y
is larger than the highest bit set of m
. Because then adding y
into x
does not carry any bit into the result that is set in m
but not set in x
. This method can be a static function, too IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point. Adjusted naming and comments a bit, and added a negative test case.
@@ -120,7 +120,7 @@ public static void checkShiftNonConstMaskLong(long res) { | |||
@IR(counts = { IRNode.AND_I, "1" }) | |||
@IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I }) | |||
public static int addShiftMaskInt(int i, int j) { | |||
return (j + (i << 2)) & 3; // transformed to: return j & 3; | |||
return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer you adding other test cases instead of modifying existing ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Appreciate the review!
src/hotspot/share/opto/mulnode.cpp
Outdated
@@ -2052,94 +2052,88 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const { | |||
} | |||
} | |||
|
|||
// Given an expression (AndX shift mask) or (AndX mask shift), | |||
// Returns a lower bound of the number of trailing zeros in expr. | |||
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
// Returns a lower bound of the number of trailing zeros in expr. | ||
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) { | ||
expr = expr->uncast(); | ||
if (expr == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
|
||
if (expr->Opcode() == Op_ConvI2L) { | ||
expr = expr->in(1); | ||
if (expr == nullptr) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done for all (also original inputs expr and mask). I don't think I understand under which conditions null/top
may occur, so pls double-check.
src/hotspot/share/opto/mulnode.cpp
Outdated
|
||
if (type->is_con()) { | ||
long con = type->get_con_as_long(type->basic_type()); | ||
return con == 0L ? 0 : count_trailing_zeros(con); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
if (expr == nullptr) { | ||
return 0; | ||
} | ||
type = phase->type(expr)->isa_int(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
return 0; | ||
} | ||
const TypeInt* rhs_t = phase->type(rhs)->isa_int(); | ||
if (!rhs_t || !rhs_t->is_con()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
if (!rhs_t || !rhs_t->is_con()) { | ||
return 0; | ||
} | ||
return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/hotspot/share/opto/mulnode.cpp
Outdated
bool MulNode::AndIL_shift_and_mask_is_always_zero(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) { | ||
if (mask == nullptr || shift == nullptr) { | ||
// mask M, we check for both operand orders. | ||
bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very good point. Adjusted naming and comments a bit, and added a negative test case.
src/hotspot/share/opto/mulnode.cpp
Outdated
if (AndIL_is_zero_element(phase, in(1), in(2), T_INT) || | ||
AndIL_is_zero_element(phase, in(2), in(1), T_INT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it easier to reason about the "reverse" check when we simply expand it here.
@@ -120,7 +120,7 @@ public static void checkShiftNonConstMaskLong(long res) { | |||
@IR(counts = { IRNode.AND_I, "1" }) | |||
@IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I }) | |||
public static int addShiftMaskInt(int i, int j) { | |||
return (j + (i << 2)) & 3; // transformed to: return j & 3; | |||
return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
/signed |
Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks a lot.
src/hotspot/share/opto/mulnode.cpp
Outdated
// when the shift value N is large enough to zero out | ||
// all the set positions of the and-mask M. | ||
// (AndI (LShiftI _ #N) #M) => #0 | ||
// (AndL (LShiftL _ #N) #M) => #0 | ||
// (AndL (ConvI2L (LShiftI _ #N)) #M) => #0 | ||
// as well as for constant operands: | ||
// (AndI (ConI [+-] _ << #N) #M) => #0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(AndI (ConI (_ << #N)) #M)
I think writing like this is clearer, it is confusing talking about signs in bitwise operations. Also, please remove => #0
in these.
0713345
to
1555846
Compare
@mernst-github Please do not rebase or force-push to an active PR as it invalidates existing review comments. Note for future reference, the bots always squash all changes into a single commit automatically as part of the integration. See OpenJDK Developers’ Guide for more information. |
throw new RuntimeException("incorrect result: " + res); | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice to see that you have some examples here!
I think it would be great to have some more though. The divil hides in the details. In the edge cases usually.
You currently have patterns like this:
(j + ((i + c1) << c2)) & c3;
What if you generate the constants c1, c2, c3
randomly:
public static final int C1 = random.nextInt()
(or some other random distribution that makes more sense).
Then the compiler will see them as constants (because final), and attempt constant folding.
You can then do result verification: You create a method copy that you restrict to the interpreter, and the other copy can be compiled. Then you test the method with all sorts of random inputs for i, j
, and verify the results of the two methods (compiled vs interpreted).
Maybe you can add some more patterns as well, just to have a better test coverage.
Does that make sense?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe I understand the intent, and I've now randomized all constant masks / shifts / consts in this file. But just to make sure: IIUC the tests are only compiled once per invocation, there is no way I can tell the framework to "C2 compile this x times with different random constants". I.e. I can make test
this a hundred times locally, but I cannot create large coverage via the framework, right?
Also not quite sure I understand the verification proposal. How would that be different from the current comparisons if (result != expected simplified form)
? Now if the framework supported an automatic comparison of compiled vs interpreted invocation, that would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, creating large coverage with a single run under the IR framework is not currently possible I think.
Generally, there are other tricks to get "changing constants", see what I did with setConstant
and int_con
in this test:
test/hotspot/jtreg/compiler/loopopts/superword/TestAlignVectorFuzzer.java
But the tests are rerun a lot anyway, so that is not super necessary.
I am working on a Template framework that makes using random constants much easier, and also generating multiple methods where only the constants differ. That should make things a little easier.
I suppose that works: if (result != expected simplified form)
Though only for cases where we have a valid simplification. If you also want to test the cases that have a very similar pattern, but should not accidentally wrongly optimize, then you would have to do the compiled/interpreted comparison.
src/hotspot/share/opto/mulnode.cpp
Outdated
} | ||
|
||
jint zeros = AndIL_min_trailing_zeros(phase, expr, bt); | ||
return zeros > 0 && ((((jlong)1) << zeros) > mask_t->hi_as_long() && mask_t->lo_as_long() >= 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line indicates that the mask
could be a variable. You should make sure to add some tests for that in your patterns. You can create a variable in a specific range like this Math.min(5, Math.max(1, x))
, should get you x
clamped into the region 1..5
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a variant for adding consts using the same pattern as the other "NonConstMask" tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please point me to the tests where mask_t
is a range, and not a constant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests addConstNonConstMask[Int,Long]
(https://github.com/openjdk/jdk/pull/22856/files#diff-2c6beb2b7bcb76601adb439471e786963c6e0d5cb6db132381f64e10df5819daR207), copied from the existing NonConstMask tests exercise this.
src/hotspot/share/opto/mulnode.cpp
Outdated
// Is expr a neutral element wrt addition under mask? | ||
static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a more expressive function name over a comment here. The comment is a little confusing to me too.
The old name at least talked about shift and mask - is that not relevant any more?
Or you just decide to name it AndIL_is_zero
, and drop out the comment. Because who knows someone might add other things that check for zero in that method, and then your comment would be out-dated (but probably people would forget to adjust it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to is_zero_element_under_mask. "zero element" for me drives down that it's neither
- checking for expr == 0
- nor checking for expr & mask == 0
but really (X + expr) & mask == X & mask for all X.
There is no requirement for shift node, e.g, we recognize is_zero_under_mask(192, 7). However, the constants that are recognized here are "shifts in spirit" (e.g. expanded from (i + 24) << 3
). If you can think of a good term for this that doesn't suggest there's an actual "shift node" we could try and incorporate that.
Since this is a forward declaration, the elaborate comment is below. Went and dropped the short one here.
src/hotspot/share/opto/mulnode.cpp
Outdated
@@ -670,9 +670,13 @@ const Type *AndINode::mul_ring( const Type *t0, const Type *t1 ) const { | |||
return and_value<TypeInt>(r0, r1); | |||
} | |||
|
|||
// Is expr a neutral element wrt addition under mask? | |||
static bool AndIL_is_zero_element(const PhaseGVN* phase, const Node* expr, const Node* mask, BasicType bt); | |||
|
|||
const Type* AndINode::Value(PhaseGVN* phase) const { | |||
// patterns similar to (v << 2) & 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this comment be updated now for a more general pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to drop this and refer the reader to the definition.
// Returns a lower bound on the number of trailing zeros in expr. | ||
static jint AndIL_min_trailing_zeros(const PhaseGVN* phase, const Node* expr, BasicType bt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method restricted to use in AndIL? Because it looks like it is doing something more generic: trying to figure out a lower bound on the trailing zeros of an expression.
If that is the case: Why not put it in Node::get_trailing_zeros_lower_bound(phase, bt)
, so it can be used elsewhere too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would argue that while this might be incidentally reusable outside of the scope of "And" nodes, as long as there is no actual demand to reuse this, I would rather not add it to the rather prominent Node class to avoid api bloat.
Iff the notion of "is known to be a multiple of a certain power of two" is really of general interest, I would expect it to become part of TypeInteger
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, just leave it where it is for now. I'm ok with it.
I like it, thanks a lot for finishing that up! Incorporated and pushed your version. I was trying to avoid a |
@mernst-github Glad you like it 😊 I ran testing one more time, since the VM code changed slightly since last time, I think. If @merykitty doesn't disagree we can hopefully integrate after the weekend :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mernst-github Tests are passing. Nice work, and thanks for sticking with us! Approved.
Super, glad this worked out! I want to return the compliment, thanks for sticking with me :-) |
/integrate |
@mernst-github |
/sponsor |
Going to push as commit 7f3ecb4.
Your commit was automatically rebased without conflicts. |
@eme64 @mernst-github Pushed as commit 7f3ecb4. 💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored. |
@mernst-github Thanks a lot for going with us through this process, your contribution will be engraved in the (git) history 😂 |
This caused a regression in CCP: JDK-8350563 @mernst-github, could you please have a look? Thanks. |
Sorry for that. Super-quick look: I have no idea what CCP is, but it appears that it contains redundant logic and knows what kind of optimizations are performed elsewhere. https://github.com/openjdk/jdk21u-dev/blob/fa896c742b494520e6c652539e8218b7127e877a/src/hotspot/share/opto/phaseX.cpp#L1972 looks salient, just from pattern matching I wonder if it's sufficient to add Op_ConI/L to this list. |
Another place that seems to understand the Shift-And pattern: https://github.com/openjdk/jdk21u-dev/blob/master/src/hotspot/share/opto/phaseX.cpp#L1611 |
Yes, both CCP and IGVN need special logic to make sure that a node A is added to the worklist if another node B that is not a direct input to B changed such that node A could now be further optimized. You added such an optimization but the code to add the node to the worklist is missing. It's exactly the two places you referred to. I think it should be possible to create a test for IGVN as well, i.e. come up with a case where the optimization is not performed although it could. |
Related: https://bugs.openjdk.org/browse/JDK-8288683 I'm still trying to understand:
|
Sometimes that is due to timing, i.e. differences in profiling, what code gets inlined etc. That can change the shape of the IR and the order in which it is processed. That can change the order in which things get folded and propagated, and so in some cases the bug triggers and in others not. It would be good to extract a smaller reproducer where it triggers more reliably. Here
Let me explain:
Not sure if any of this applies in that example, but it might. |
About severity: As long as we find and integrate a fix during |
The good news: it's easy to reproduce the issue by adding a loop to CheckProp, and I have a fix up for CCP that appears(!) straightforward. I do not quite understand the interplay with IGVN, the set of "push" rules does not look consistent between the two phases to me. But re-reading @eme64's response, CCP is necessary for correctness, IGVN is optimization potential(?). That said, I have taken up new employment and need to sort out the contribution framework first before I can contribute the fix. |
@mernst-github IGVN and CCP try to infer information about a node from its inputs. As a result, when an input of a node is changed by IGVN or CCP, you need to do the inference on that node again. Theoretically, you need to push all nodes below a changed node to the worklist, but doing so is expensive and unnecessary. So, we use the fairly convoluted approach of pushing all immediate uses, and indirect uses are pushed in an ad-hoc manner depending on the way particular nodes do their inference. For example, in this case, since AndNode::Value looks through a LShiftNode, you need to do the opposite when pushing them to the IGVN worklist, i.e. pushing all AndNode that is a use of a direct use LShiftNode of the current node.
It is because IGVN uses Ideal, Identity, and Value, which means that the pushing logic needs to take into consideration the inference of all these methods. While CCP only uses Value, so the pushing logic only needs to know which nodes the Value methods try to look through.
Not really, we assume a graph is stable after IGVN, so missed idealisation here can also be considered a bug (especially missed transformations with CFG nodes). The difference is that these assumptions do not practically cover all nodes so we may get away with some of them.
Don't worry, RDP1 of JDK-25 is not so urgent. So take your time. |
Concretely, my understanding of what's breaking here is the following:
Pushing the And node fixes the crash and looks straightforward, but it makes me wonder why it is necessary, and why the change doesn't bubble through the Cast already: shouldn't the cast node be re-pushed when its input changes, |
The issue is interesting. The code shape looks like this Of course, in this particular case, a solution is to check for top inputs before proceeding with any action. However, I think that it is not sufficient. Given that we often look through |
My proposed fix for this is in #23871 . |
Fixes JDK-8346664: extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.
Such constant values arise in an expression of the following form, for example from
MemorySegmentImpl#isAlignedForElement
:Currently,
256
is not being recognized as a shifted value. This PR enables further reduction:Implementation notes:
in order to stay with the flow of the current implementation, I refrained from solving general (const & mask)==0 cases, but only those where const == _ << shift.I modified existing test cases adding/subtracting from the index var (which would fail with current C2). Let me know if would like to see separate cases for these.Progress
Issue
Reviewers
Reviewing
Using
git
Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22856/head:pull/22856
$ git checkout pull/22856
Update a local copy of the PR:
$ git checkout pull/22856
$ git pull https://git.openjdk.org/jdk.git pull/22856/head
Using Skara CLI tools
Checkout this PR locally:
$ git pr checkout 22856
View PR using the GUI difftool:
$ git pr show -t 22856
Using diff file
Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22856.diff
Using Webrev
Link to Webrev Comment