-
Notifications
You must be signed in to change notification settings - Fork 43
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Change the way the MemoryManager runs. #618
Comments
Follow up after discovering new issues with our current design and profiling JNI overhead more extensively. I don't have this issues with regular scriptless instances, or even with scripts Object. This issue only exists for RefCounted instances with a JVM script. So far this model applied equally to wrappers and scripts, I can change it to be only for scripts. So far so good, it's a relatively simple task to handle, just use the refcounted callbacks Godot already provide and do that switch whenever the counter reaches that value of 1. Except we recently discovered a nasty side effect of that design, some cases create a high frequency pingpong. The one example I have in hand is when you use a Godot resource only referenced by the JVM side. In that situation, the C++ side only hold a weak reference to it. Now a Godot resource needs to be used somehow at some point, mostly by passing it as a parameter to a call to the C++ API and that when we kill our performances. Let's take that bit of code from our most efficient bunnymark benchmark: @RegisterFunction
override fun _draw() {
for (bunny in bunnies) {
drawTexture(bunnyTexture, bunny.position)
}
}
Here we use a basic
Every single call to the Godot API is causing 4 calls to the JNI API, which as you probably know are expensive. How much expensive ? Well it may be a bit unfair to use a benchmark made only for the sake of stressing this particular usecase, but the numbers are quite telling nonetheless. If I choose to run the benchmark with this pingpong behavior disabled (and so creating a potential memory leak), the final score on my computer goes from barely 35k to 80k+. It's more than twice the performances. As I explain earlier, I couldn't think of a way to get rid of the pingpong. At least not one that wouldn't require silly things like asking Kotlin users to do like in C++ and wrap all their RefCounted instances inside a Ref<> wrapper, which is a huge no for me when we are on a JVM with a GC, it's too big of an antipattern for me to accept this kind of solution. Now I think I can still create a design that would make the pingpong frequency so slow that it wouldn't matter any more. The current design is to promote/demote the JNI reference at the same time as the counter increments/decrements. We need to break this model. My suggestion is to "delay" those switches as long as possible. First, we need to identify when and how those switches happen. Fortunately, we have very little of them:
Summary: |
Part 2 of the follow-up focus on the implementation details. If our main issue is crossing the boundary too often, then we have to adapt the logic so it happens the least possible.
Instead of relying on a strong JNI reference to keep a JVM instance, what if we were just using this giant database as a way to keep a strong reference ? We wouldn't even need to directly switch them with a weak reference when we want to break the cycle; we can simply remove them for the DB itself. You may wonder: "If we remove a JVM instance from the DB, how do we ever get it back using its ID?". The answer is simple, we don't need it. Think about it. We switch to a weak reference whenever the C++ doesn't use this instance anymore. So how would C++ even query a JVM instance from an ID it can't have in the first place ? With that in mind, we can transform the ping-pong into a simple add to/remove from the DB. The new behaviour would be the following:
This new design also makes a serial Memory Manager easier to implement. We no longer need the Memory Manager to send its JVM instance to C++ to create a binding. Instead, the C++ side accumulates over a whole frame objects that needs to have their status changed. Then a call from C++ to the JVM is made sending the list of instances that have to be removed from the DB and returns the list of RefCounted instances whose counted must be decremented. I think all our needs are covered by that design. I hope I don't forget some sneaky details that would invalid the idea. |
Implemented by #661 |
Reminder:
The MemoryManager is our central database that holds a weak reference to all Godot related object created on the JVM side, being core types, wrappers for native types or even scripts.
When created, all those objects are registered to a ReferenceQueue that allows us to know when they have been collected by the GC.
This callback is necessary so we can decrement the counter of RefCounted objects on the Godot side. That operation is currently done in a separate thread.
Whenever that thread is woken up by the OS's scheduler, it's checking the queue to see if there is any RefCounted instance to decrement. This design has 3 consequences:
My proposal holds in 2 points:
frame()
method. Like the name suggests, it's called at the end of every frame once the Scene (and so scripts) have been fully processed, which also include the secondary threads running in the new multithreaded scene system. Instead of relying on a JVM thread calling C++ once for every collected reference. We can instead have the module directly query the memory manager for the instances that have been GCs during that frame (if any) and get all the relevant data in a single JNI call (made even faster now that we can convert a whole Kotlin collection in a single call). It technically means more work for the main thread, but the pacing will be stable (at least as stable as the game itself) and more granular. This will greatly reduce the amount of data race, the call happening after the scene system ran. It won't stop all of them, because Godot also has its ThreadPool and users can create their own threads as well. But it's safe to say that the scene threads are the ones that will interact the most with the Godot Script system anyway, and so with our module.All in all, I think the proposal will simplify the memory management a bit and make it way more resilient.
The text was updated successfully, but these errors were encountered: