8346664: C2: Optimize mask check with constant offset #22856

Open
wants to merge 1 commit into base: master

@mernst-github mernst-github commented Dec 21, 2024

Fixes JDK-8346664: extends the optimization of masked sums introduced in #6697 to cover constant values, which currently break the optimization.

Such constant values arise in an expression of the following form, for example from MemorySegmentImpl#isAlignedForElement:

(base + ((index + 1) << 8)) & 255
=> MulNode
(base + ((index << 8) + 256)) & 255
=> AddNode
((base + (index << 8)) + 256) & 255

Currently, 256 is not being recognized as a shifted value. This PR enables further reduction:

((base + (index << 8)) + 256) & 255
=> MulNode (this PR)
(base + (index << 8)) & 255
=> MulNode (PR #6697)
base & 255 (loop invariant)
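
As a plain-Java illustration of why this chain of reductions is value-preserving (it does not exercise C2 itself), the sketch below compares the original expression with its fully reduced form. The class and method names are hypothetical and are not part of the PR.

```java
// Sketch only: MaskedSumDemo and its methods are hypothetical names for illustration.
final class MaskedSumDemo {
    // Original shape: the constant +1 sits inside the shifted index.
    static long masked(long base, long index) {
        return (base + ((index + 1) << 8)) & 255;
    }

    // Shape after both reductions: only the loop-invariant part remains.
    static long reduced(long base) {
        return base & 255;
    }

    public static void main(String[] args) {
        long[] bases = {0L, 1L, 255L, 256L, -1L, Long.MAX_VALUE, Long.MIN_VALUE};
        for (long base : bases) {
            for (long index = -1000; index <= 1000; index++) {
                // (index + 1) << 8 is a multiple of 256, so it never touches the low 8 bits.
                if (masked(base, index) != reduced(base)) {
                    throw new AssertionError("mismatch for base=" + base + ", index=" + index);
                }
            }
        }
        System.out.println("all combinations agree");
    }
}
```

The equality also holds when the addition overflows, because two's-complement wraparound leaves the low 8 bits of the sum unchanged.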

Implementation notes:

  • To stay with the flow of the current implementation, I did not handle general (const & mask) == 0 cases, only those where const == _ << shift.
  • I modified existing test cases by adding to/subtracting from the index var (these would fail with current C2). Let me know if you would like to see separate cases for these.
  • I verified that the originating issue "scaled varhandle indexed with i+1" (https://mail.openjdk.org/pipermail/panama-dev/2024-December/020835.html) is resolved with this PR.

Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Error

 ⚠️ OCA signatory status must be verified

Issue

  • JDK-8346664: C2: Optimize mask check with constant offset (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/22856/head:pull/22856
$ git checkout pull/22856

Update a local copy of the PR:
$ git checkout pull/22856
$ git pull https://git.openjdk.org/jdk.git pull/22856/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 22856

View PR using the GUI difftool:
$ git pr show -t 22856

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/22856.diff

@bridgekeeper bridgekeeper bot added the oca (Needs verification of OCA signatory status) label Dec 21, 2024

bridgekeeper bot commented Dec 21, 2024

Hi @mernst-github, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user mernst-github" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

openjdk bot commented Dec 21, 2024

❗ This change is not yet ready to be integrated.
See the Progress checklist in the description for automated requirements.

openjdk bot commented Dec 21, 2024

@mernst-github The following label will be automatically applied to this pull request:

  • hotspot-compiler

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@mernst-github mernst-github changed the title from "8346664: C2: optimize constant addends in masked sums" to "8346664: C2: Optimize mask check with constant offset" Dec 21, 2024
@@ -2052,94 +2052,88 @@ const Type* RotateRightNode::Value(PhaseGVN* phase) const {
}
}

// Given an expression (AndX shift mask) or (AndX mask shift),
// Returns a lower bound of the number of trailing zeros in expr.
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {

This could be a static function; I don't see much value in it being a method in MulNode.

// Returns a lower bound of the number of trailing zeros in expr.
jint MulNode::AndIL_min_trailing_zeros(PhaseGVN* phase, Node* expr, BasicType bt) {
expr = expr->uncast();
if (expr == nullptr) {

This should not be nullptr; you can safely remove the check.


if (type->is_con()) {
long con = type->get_con_as_long(type->basic_type());
return con == 0L ? 0 : count_trailing_zeros(con);

For the sake of consistency, we should return the type width for con == 0; you can obtain this with type2aelementbytes(bt) * 8.
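
For comparison, the Java library already follows this convention: numberOfTrailingZeros returns the full type width for zero. The sketch below mirrors the suggested behaviour in plain Java; minTrailingZeros and its widthInBits parameter are hypothetical stand-ins, not HotSpot code.

```java
// Sketch only: a hypothetical Java analogue of the constant case discussed above.
final class TrailingZerosOfZero {
    // For con == 0 there is no set bit, so every bit position counts as a trailing zero.
    static int minTrailingZeros(long con, int widthInBits) {
        return con == 0L ? widthInBits : Long.numberOfTrailingZeros(con);
    }

    public static void main(String[] args) {
        System.out.println(Integer.numberOfTrailingZeros(0)); // library convention for int
        System.out.println(Long.numberOfTrailingZeros(0L));   // library convention for long
        System.out.println(minTrailingZeros(0L, 32));
        System.out.println(minTrailingZeros(256L, 64));
    }
}
```

Returning 0 for a zero constant is still a valid lower bound, but it is needlessly pessimistic: a zero addend can never disturb any masked bit.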


if (expr->Opcode() == Op_ConvI2L) {
expr = expr->in(1);
if (expr == nullptr) {

This cannot be nullptr; you can safely remove the check, and the same goes for expr->uncast() below. In general, the only case in which the input of a ConvI2L (or of other nodes) is not an int is when it is top, which means it is empty, a.k.a. unreachable code.

if (expr == nullptr) {
return 0;
}
type = phase->type(expr)->isa_int();

You are trying to look through a ConvI2L. For the sake of consistency, I think you can reassign bt to T_INT at this point.
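
A small Java sketch of why looking through the int-to-long conversion is sound: sign extension only adds high bits, so the trailing zeros of a non-zero int survive the widening. The class name and helper are hypothetical and only illustrate the arithmetic, not the C2 code.

```java
// Sketch only: hypothetical illustration of widening an int that has known trailing zeros.
final class WidenKeepsLowBits {
    // An int shifted left by 8, widened to long, then masked with the low 8 bits.
    static long maskedAfterWidening(int i) {
        return ((long) (i << 8)) & 255L;
    }

    public static void main(String[] args) {
        int[] samples = {1, 256, -256, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int x : samples) {
            // For non-zero x, the count is identical before and after widening.
            if (Long.numberOfTrailingZeros((long) x) != Integer.numberOfTrailingZeros(x)) {
                throw new AssertionError("trailing zeros changed for " + x);
            }
            if (maskedAfterWidening(x) != 0L) {
                throw new AssertionError("low bits not zero for " + x);
            }
        }
        System.out.println("widening preserved the trailing zeros");
    }
}
```

The only value where the int and long counts differ is zero (32 versus 64), which is one place where the choice of bt becomes visible.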

return 0;
}
const TypeInt* rhs_t = phase->type(rhs)->isa_int();
if (!rhs_t || !rhs_t->is_con()) {

We are trying to avoid implicit conversions to bool; you can use an explicit nullptr comparison here (rhs_t == nullptr in place of !rhs_t).

if (!rhs_t || !rhs_t->is_con()) {
return 0;
}
return rhs_t->get_con() & ((type->isa_int() ? BitsPerJavaInteger : BitsPerJavaLong) - 1);

If you reassign bt, you can do type2aelementbytes(bt), which IMO is clearer.
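
The masking of the shift count in the line above matches Java's shift semantics, where only the low 5 bits of the count are used for int and the low 6 bits for long. A minimal sketch in plain Java, with effectiveShift as a hypothetical helper name:

```java
// Sketch only: effectiveShift is a hypothetical helper mirroring Java's shift-count masking.
final class ShiftCountMasking {
    // widthInBits is 32 for int shifts and 64 for long shifts.
    static int effectiveShift(int count, int widthInBits) {
        return count & (widthInBits - 1);
    }

    public static void main(String[] args) {
        int one = 1;
        long oneL = 1L;
        // An int shift by 33 behaves like a shift by 1; a long shift by 33 is a real 33-bit shift.
        System.out.println((one << 33) == (one << effectiveShift(33, 32)));
        System.out.println((oneL << 33) == (oneL << effectiveShift(33, 64)));
        System.out.println((oneL << 65) == (oneL << effectiveShift(65, 64)));
    }
}
```

This is why the width used for the mask has to match the type of the shift being inspected, which is the point of keeping bt in sync.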

bool MulNode::AndIL_shift_and_mask_is_always_zero(PhaseGVN* phase, Node* shift, Node* mask, BasicType bt, bool check_reverse) {
if (mask == nullptr || shift == nullptr) {
// mask M, we check for both operand orders.
bool MulNode::AndIL_is_always_zero(PhaseGVN* phase, Node* expr, Node* mask, BasicType bt, bool check_reverse) {

Actually, you cannot conclude that ((x + y) & m) == 0 iff (x & m) == 0 when (y & m) == 0, because the addition x + y can carry a bit into the positions at which m is set. Consider this example for illustration:

(0b1010 + 0b0010) & 0b0100 == 0b1100 & 0b0100 == 0b0100 != 0

even when

0b1010 & 0b0100 == 0
0b0010 & 0b0100 == 0

The simplest sufficient condition, and the one we are using here, is that the lowest set bit of y is higher than the highest set bit of m, because then adding y to x cannot carry into any bit position that is set in m. This method can be a static function too, IMO.
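
The counterexample and the sufficient condition can be checked with a few lines of plain Java. This is a sketch under the assumption of 32-bit int operands; addendIsAboveMask is a hypothetical helper, not the function in the PR.

```java
// Sketch only: hypothetical check of the "lowest bit of y above highest bit of m" condition.
final class CarryCheck {
    // True if adding y can never change a bit position that is set in m.
    static boolean addendIsAboveMask(int y, int m) {
        if (y == 0 || m == 0) {
            return true; // nothing is added, or nothing is masked
        }
        int lowestBitOfY = Integer.numberOfTrailingZeros(y);
        int highestBitOfM = 31 - Integer.numberOfLeadingZeros(m);
        return lowestBitOfY > highestBitOfM;
    }

    public static void main(String[] args) {
        int x = 0b1010, m = 0b0100;

        // The counterexample: (y & m) == 0, yet the carry out of bit 1 sets bit 2.
        int bad = 0b0010;
        System.out.println(addendIsAboveMask(bad, m));  // false
        System.out.println(((x + bad) & m) == (x & m)); // false

        // An addend whose lowest set bit (3) is above the mask's highest set bit (2).
        int good = 0b1000;
        System.out.println(addendIsAboveMask(good, m));  // true
        System.out.println(((x + good) & m) == (x & m)); // true
    }
}
```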

@@ -120,7 +120,7 @@ public static void checkShiftNonConstMaskLong(long res) {
@IR(counts = { IRNode.AND_I, "1" })
@IR(failOn = { IRNode.ADD_I, IRNode.LSHIFT_I })
public static int addShiftMaskInt(int i, int j) {
- return (j + (i << 2)) & 3; // transformed to: return j & 3;
+ return (j + ((i + 1) << 2)) & 3; // transformed to: return j & 3;

I would prefer that you add new test cases instead of modifying existing ones.
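
One possible shape for such separate cases, sketched in plain Java without the IR test framework annotations: the existing method stays untouched and the constant-offset variants get their own methods. The two new method names are hypothetical.

```java
// Sketch only: the *PlusConst / *MinusConst method names are hypothetical.
final class AddConstShiftMaskCases {
    // Existing case, left as it was.
    static int addShiftMaskInt(int i, int j) {
        return (j + (i << 2)) & 3; // expected to reduce to: j & 3
    }

    // New case: constant added to the index before the shift.
    static int addShiftPlusConstMaskInt(int i, int j) {
        return (j + ((i + 1) << 2)) & 3; // expected to reduce to: j & 3
    }

    // New case: constant subtracted from the index before the shift.
    static int addShiftMinusConstMaskInt(int i, int j) {
        return (j + ((i - 1) << 2)) & 3; // expected to reduce to: j & 3
    }

    public static void main(String[] args) {
        int[] values = {0, 1, 2, 3, -1, 12345, Integer.MAX_VALUE, Integer.MIN_VALUE};
        for (int i : values) {
            for (int j : values) {
                int expected = j & 3;
                if (addShiftMaskInt(i, j) != expected
                        || addShiftPlusConstMaskInt(i, j) != expected
                        || addShiftMinusConstMaskInt(i, j) != expected) {
                    throw new AssertionError("mismatch for i=" + i + ", j=" + j);
                }
            }
        }
        System.out.println("all cases reduce to j & 3");
    }
}
```

In the real test file each new method would presumably carry the same @IR annotations as the existing one, so that a failure points at the constant-offset pattern rather than at the original one.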
