@rvansa
Collaborator

@rvansa rvansa commented Oct 14, 2025

A CRaC engine can support storing additional metadata about the image. This helps the infrastructure further refine the set of feasible images (already constrained by CPU architecture and features) and select the image that is expected to perform best.
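As a rough illustration of how infrastructure might use such metadata, the sketch below filters candidate images by a feasibility flag and then picks the one with the highest value of a chosen metric. Everything in it is hypothetical and invented for the example: the `ImageSelector` class, the `ImageInfo` record, the `feasible` flag, and the metric name `example.metric`. This PR does not define a selection API.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;

class ImageSelector {
    // Hypothetical description of a stored image; not a CRaC type.
    record ImageInfo(String path, boolean feasible, Map<String, Double> metrics) {}

    // Returns the feasible image with the highest value of the given metric.
    // Images that do not record the metric are skipped.
    static Optional<ImageInfo> selectBest(List<ImageInfo> images, String metric) {
        return images.stream()
                .filter(ImageInfo::feasible)
                .filter(i -> i.metrics().containsKey(metric))
                .max(Comparator.comparingDouble(i -> i.metrics().get(metric)));
    }

    public static void main(String[] args) {
        List<ImageInfo> images = List.of(
                new ImageInfo("image-a", true, Map.of("example.metric", 10.0)),
                new ImageInfo("image-b", true, Map.of("example.metric", 25.0)),
                // Highest metric, but not feasible on this machine, so it is ignored.
                new ImageInfo("image-c", false, Map.of("example.metric", 99.0)));
        System.out.println(selectBest(images, "example.metric").map(ImageInfo::path).orElse("none"));
    }
}
```

A real policy would likely combine several metrics; a single "highest value wins" rule is used here only to keep the sketch short.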


Progress

  • Change must not contain extraneous whitespace

Issue

  • JDK-8369566: CRaC: Record metrics during checkpoint (Enhancement - P4)

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/crac.git pull/269/head:pull/269
$ git checkout pull/269

Update a local copy of the PR:
$ git checkout pull/269
$ git pull https://git.openjdk.org/crac.git pull/269/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 269

View PR using the GUI difftool:
$ git pr show -t 269

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/crac/pull/269.diff

Using Webrev

Link to Webrev Comment

@bridgekeeper

bridgekeeper bot commented Oct 14, 2025

👋 Welcome back rvansa! A progress list of the required criteria for merging this PR into crac will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk

openjdk bot commented Oct 14, 2025

@rvansa This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8369566: CRaC: Record metrics during checkpoint

Reviewed-by: tpushkin

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 1 new commit pushed to the crac branch:

  • e2d4ede: 8369729: [CRaC] CRaC restore fails with different CPUs

Please see this link for an up-to-date comparison between the source branch of this pull request and the crac branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the crac branch, type /integrate in a new comment.

@openjdk openjdk bot added the rfr (Pull request is ready for review) label Oct 14, 2025
@mlbridge

mlbridge bot commented Oct 14, 2025

@rvansa rvansa changed the title from "8368929: CRaC: Record metrics during checkpoint" to "8369566: CRaC: Record metrics during checkpoint" Oct 14, 2025
@openjdk openjdk bot added the ready (Pull request is ready to be integrated) label Oct 14, 2025
@rvansa
Collaborator Author

rvansa commented Oct 14, 2025

It seems that the PathPatternTest introduced in #264 is unstable :-/

@rvansa rvansa requested a review from TimPushkin October 15, 2025 12:25
@rvansa
Collaborator Author

rvansa commented Oct 16, 2025

@TimPushkin I've piggy-backed a fix in PathPatternTest, so now this is green. Could you give an ETA on review, please?

@TimPushkin
Collaborator

@rvansa I'll try to look today if time allows; otherwise, I'll look tomorrow.

Collaborator

@TimPushkin TimPushkin left a comment

There should be some way for the user to know which metrics are available because the names will be needed to write a selection policy. Ideally some documentation for the value ranges is also needed.

    return false;
  }
  _score.foreach([&](const Score& score){
    fprintf(f, "%s=%f\n", score._name, score._value);
Collaborator

There's no validation that = and \n are not present in the name, so there will probably be problems with parsing.

Collaborator Author

As for =, you can simply say that the last = is the separator. A \n in the name is the user shooting themselves in the foot.
However, in general, this format (and its limitations) is engine-specific, and there is no way to report the unsuitability. When the score is reported at the Java level, the engine has not been consulted yet, and later on we could only fail the checkpoint, which seems excessive to me.
If you want, I can truncate the metric name at the first \n when recording it here.
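The "last = is the separator" rule can be sketched as follows. This is a minimal Java illustration, not the engine's actual reader: the class `ScoreLineParser` and the method `parseLine` are hypothetical, and the sketch assumes the value is a plain decimal number as produced by the `%f` format in the snippet under review.

```java
import java.util.Map;

class ScoreLineParser {
    // Splits one "name=value" line at the LAST '=' so that names
    // containing '=' still parse unambiguously.
    static Map.Entry<String, Double> parseLine(String line) {
        int sep = line.lastIndexOf('=');
        if (sep < 0) {
            throw new IllegalArgumentException("missing '=' in: " + line);
        }
        String name = line.substring(0, sep);
        // Assumption: the value is a decimal number, e.g. "1.500000".
        double value = Double.parseDouble(line.substring(sep + 1).trim());
        return Map.entry(name, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, Double> e = parseLine("a=b=1.500000");
        System.out.println(e.getKey() + " -> " + e.getValue()); // prints "a=b -> 1.5"
    }
}
```

This only works because the value cannot itself contain =; a newline inside the name would still split the record across two lines.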

Collaborator

I forgot that the value is not a generic string. With that, = is indeed not a problem; for \n we can truncate with a warning.
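A minimal sketch of "truncate with a warning", written in Java for illustration only. The code under review is C++ inside the engine, and the names `MetricNames`, `sanitize`, and `formatLine`, as well as the warning text, are hypothetical.

```java
import java.util.Locale;

class MetricNames {
    // Cuts the name at the first '\n' (if any) and warns on stderr when it does.
    static String sanitize(String name) {
        int nl = name.indexOf('\n');
        if (nl < 0) {
            return name;
        }
        String truncated = name.substring(0, nl);
        System.err.println("Warning: metric name contains a newline, truncated to '" + truncated + "'");
        return truncated;
    }

    // Formats one record the same way as the "%s=%f\n" call in the snippet under review.
    // Locale.ROOT keeps the decimal separator a '.' regardless of the default locale.
    static String formatLine(String name, double value) {
        return String.format(Locale.ROOT, "%s=%f\n", sanitize(name), value);
    }

    public static void main(String[] args) {
        System.out.print(formatLine("heap\nsize", 1.5)); // prints "heap=1.500000"
    }
}
```

Truncating keeps every record on a single line, so the checkpoint does not have to fail just because a name was malformed.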

@rvansa
Collaborator Author

rvansa commented Oct 22, 2025

@TimPushkin I've added the test and changed how the global context size is reported in scores: we had been reporting the practically constant internal global context, not the other (!) global context exposed to the user.

Collaborator

@TimPushkin TimPushkin left a comment

I believe that after our internal discussions we decided to document the format of crexec's score output somewhere?

private static void setJdkResourceScore() {
    int resources = 0;
    for (var p : Core.Priority.values()) {
        if (p.getContext() instanceof OrderedContext<?> octx) {
Collaborator

I would prefer a cast just to ensure we don't miss a resource, but for now this should be equivalent.

Collaborator Author

Core.Priority.SCORE.getContext() is not an OrderedContext.

Collaborator

@TimPushkin TimPushkin Oct 23, 2025

We could check p != Core.Priority.SCORE then. I am suggesting this because it'll make sure we won't miss resources if we add some other type of context in the future. But if you think this is unlikely, I'm OK to leave it as is.
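The trade-off between the two approaches can be sketched with stand-in types. Everything below is hypothetical: `Context`, `OrderedContext`, `ScoreContext`, the `Priority` enum, and the `size` field only mimic the shape of the JDK-internal classes discussed here, and how the real method counts resources is an assumption.

```java
class ResourceCount {
    // Hypothetical stand-ins for the JDK-internal context types.
    interface Context {}

    static final class OrderedContext implements Context {
        final int size;
        OrderedContext(int size) { this.size = size; }
    }

    static final class ScoreContext implements Context {}

    enum Priority {
        FIRST(new OrderedContext(2)),
        NORMAL(new OrderedContext(3)),
        SCORE(new ScoreContext());

        private final Context context;
        Priority(Context context) { this.context = context; }
        Context getContext() { return context; }
    }

    // instanceof variant: silently skips any context that is not an OrderedContext,
    // including context types added in the future.
    static int countWithInstanceof() {
        int resources = 0;
        for (Priority p : Priority.values()) {
            if (p.getContext() instanceof OrderedContext octx) {
                resources += octx.size;
            }
        }
        return resources;
    }

    // Explicit-exclusion variant: skips only SCORE and casts everything else,
    // so an unexpected context type fails loudly with ClassCastException.
    static int countWithCast() {
        int resources = 0;
        for (Priority p : Priority.values()) {
            if (p != Priority.SCORE) {
                resources += ((OrderedContext) p.getContext()).size;
            }
        }
        return resources;
    }

    public static void main(String[] args) {
        System.out.println(countWithInstanceof() + " " + countWithCast()); // prints "5 5"
    }
}
```

With today's set of contexts both variants return the same count; they differ only in how they behave if a new kind of context appears.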

@rvansa
Collaborator Author

rvansa commented Oct 24, 2025

/integrate

@openjdk

openjdk bot commented Oct 24, 2025

Going to push as commit 1bf79d7.
Since your change was applied there has been 1 commit pushed to the crac branch:

  • e2d4ede: 8369729: [CRaC] CRaC restore fails with different CPUs

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated (Pull request has been integrated) label Oct 24, 2025
@openjdk openjdk bot closed this Oct 24, 2025
@openjdk openjdk bot removed the ready (Pull request is ready to be integrated) and rfr (Pull request is ready for review) labels Oct 24, 2025
@openjdk

openjdk bot commented Oct 24, 2025

@rvansa Pushed as commit 1bf79d7.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.
