Skip to content

Conversation

wks
Copy link
Collaborator

@wks wks commented Oct 9, 2025

We remove OnAllocationFail and add two boolean fields to AllocationOptions:

  • allow_overcommit: whether we allow overcommit
  • at_safepoint: whether this allocation is at a safepoint

Now Space::acquire always polls before trying to get new pages. Particularly, when allow_overcommit == true, polling and over-committing will happen in one allocation attempt. If we also set at_safepoint == false, the current mutator will be able to allocate normally in this allocation, but block for GC at the nearest safepoint. This is useful for certain VMs.

@wks wks mentioned this pull request Oct 10, 2025
wks added 2 commits October 13, 2025 17:54
We remove OnAllocationFail and add three boolean fields to
AllocationOptions.
@wks wks force-pushed the feature/overcommit-still-triggers-gc3 branch from 59ff447 to fa47af7 Compare October 13, 2025 10:04
@wks
Copy link
Collaborator Author

wks commented Oct 13, 2025

After this PR, the decision tree becomes: (assuming this is a mutator thread, and GC is already initialized)

  • Does the allocation option allow eager polling
    • If yes, poll, and move on.
    • If no, just move on.
  • Is any of the following true? (a) The poll above didn't trigger GC (Consider it "not triggered" if we skipped polling), or (2) the allocation option allows over-commit
    • If yes, try to get pages from the page resource, and move on.
    • If no, just move on.
  • Have we got pages from the page resource?
    • If yes, do the mmapping and return the address.
    • If no, are we at safepoint?
      • If yes, then
        • Have we tried getting new pages from the page resource?
          • If yes, force a GC, and move on.
          • If no, just move on.
        • block for GC.
        • return NULL.
      • If no, then return NULL immediately.

The control flow is more linear than before, with three steps, each using one boolean option.

By combining the three options, we can replicate the behaviors of the previous OnAllocationFail variants.

Variant eager_polling allow_overcommit at_safepoint
RequestGC true false true
ReturnFailure true false false
OverCommit false true false

We can make a new combination to poll (scheduling GC in the background) and overcommit at the same time, and postpone blocking for GC to the next safepoint. This can be useful for VMs where allocation never happens at safepoints.

new behavior eager_polling allow_overcommit at_safepoint
both poll and overcommit true true false

But I wonder whether we can remove the eager_polling option (i.e. always making it true). I can't think of any use cases where we don't want to poll. Polling only affects GC threads in the background. Even if GC is not initialized at this time, GC workers will be able to start the first GC immediately after GC is initialized. So it seems to be harmless to let it poll all the time.

@wks wks marked this pull request as ready for review October 14, 2025 07:54
@wks wks requested a review from qinsoon October 14, 2025 07:54
@wks
Copy link
Collaborator Author

wks commented Oct 15, 2025

We discussed in today's meeting. We should either

  • remove eager_polling (making the first polling compulsory, unless it is not a mutator or GC is not enabled), or
  • allow disabling the second polling (the one after trying to get pages from pr and failing), too.

I am in favor for removing eager_polling. I added eager_polling for the purpose of letting the user replicating the old OnAllocationFailure::OverCommit behavior. But we think we can change behavior as long as it is more reasonable. As I mentioned before, the GC is scheduled in the background without affecting the execution of the mutator, and I can't think of any reason why the mutator would try not to trigger GC. If it needs "critical section" semantics, we have a separate issue discussing this: #1398

The only thing it may affect may be NoGC. NoGC panics immediately in NoGC::schedule_collection. So the eager polling has a potential to let the process panic earlier than before, if allowing over-committing. I think this is actually reasonable because otherwise the program will simply ignore the heap size if it always uses over-committing.

Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other than #1400 (comment), this PR looks good to me.

@@ -1,6 +1,8 @@
// GITHUB-CI: MMTK_PLAN=NoGC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this test only for NoGC?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it doesn't have to. The test is about the behavior of allocation after exceeding the heap size, and it is not about the GC. So it doesn't really matter which plan it is. But I changed it to "all" just in case any plan triggers GC differently (mainly ConcurrentImmix).

@wks
Copy link
Collaborator Author

wks commented Oct 15, 2025

I removed eager_polling and added migration guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants