QJS hooks and GC hooks #747

KaruroChori · 2024-12-10T07:36:49Z

Partial replacement of #407 with a different hook mechanism in place.

saghul

I feel like the current API proposallooks too broad while at the same time being very limited.

IMHO I'd rather have a GC specific API without the apparent flexibility.

saghul · 2024-12-10T08:03:31Z

quickjs.c

@@ -305,6 +305,9 @@ struct JSRuntime {
    void *user_opaque;
    void *libc_opaque;
    JSRuntimeFinalizerState *finalizers;
+
+    /* Unified handler for all the engine hooks */


We have something similar for finalizers, how is this generic? It only deals with GC. I'd call it GC hooks.

Alternatively we could fold finalizers in there too, but I'm not too thrilled about it.

saghul · 2024-12-10T08:03:58Z

quickjs.c

@@ -1440,7 +1443,17 @@ static void js_trigger_gc(JSRuntime *rt, size_t size)
                   (uint64_t)rt->malloc_state.malloc_size);
        }
 #endif
-        JS_RunGC(rt);
+        //To ensure JS_RunGC cannot be executed again within callbacks, disable and restore it after.


Please make the style match the surroundings, here and elsewhere.

saghul · 2024-12-10T08:05:05Z

quickjs.c

+        size_t tmp_threshold = rt->malloc_gc_threshold;
+        rt->malloc_gc_threshold=-1;
+
+        if((rt->hooks == NULL) || rt->hooks(JS_HOOK_GC_BEFORE,NULL)){


What's the reason for not running GC if the before handler returns FALSE? There is an API to disable GC already, what's the use case here?

The hook could change the threshold, but you are then re-setting it.

Yes, but while functionally similar they have different applications. Setting the GC on/off in specific critical regions can get a bit tricky with async/await. Not only that, but doing so would mask that a request was done, preventing the application from saying "oh yes, I had to delay it, I am now in a safe region, let's clean up things". Ensuring the before event can say, "hey no, please don't" is a fail-safe. Ideally the custom callback could set up alternative mechanism to run that gc event, just a bit later when no critical task it taking place.

Sounds very theoretical ;-) Why can't the mechanism that knows if we are safe or not enable / disable GC while in a critical region?

Based on the application requirements:

Option 1: I don't care, gc events as needed and scheduled by the engine. Not good for real time applications.

Option 2: gc events only when I determine regions are safe (possibly never). I will trigger it myself when and if I want it. The only feasible approach for the strictest real time applications.

Option 3: I would prefer gc to take place based on the engine's suggestions, but I cannot do it right now. I take notice of that, and skip it for now. In the meanwhile I ask the system to move itself in a safe state when possible, schedule the gc event in that safe state, and recover the current job from where it was left.

Having the option to skip gc events while tracking them enables option 3 which is better than option 2 for a good class of real time applications.
It is not theoretical, it is literally how my application works.
It does not mean it is good design, just that at the very least it is slightly above a purely theoretical scenario.

Thanks for elaborating! Looks like option 3 could work as follows:

GC event hits

app checks if it can afford it, if not it disabled GC and tries to move to the safe state

Once the safe state is hit, enable GC and run it

goto 1

Wouldn't that work?

By the time the app can check if it is ok to run a GC event, it is already in the body of before.
If before does not stop it by failing, and the caller obliges, what does?
Unless you plan to split before and evaluate in two separate hooks, but it would be virtually identical just for more complexity since there are no further places where they would be called.

Before can stop GC from running by disabling it.

So before get hit, but because you disable it it doesn't run. Then when you want GC the app can enable it and run it.

What wouldn't work?

saghul · 2024-12-10T08:05:35Z

quickjs.c

@@ -1810,6 +1823,8 @@ JSRuntime *JS_NewRuntime2(const JSMallocFunctions *mf, void *opaque)
    rt->malloc_state = ms;
    rt->malloc_gc_threshold = 256 * 1024;

+    rt->hooks = NULL;


I'd call this gc_handlers, gc_hooks or gc_callbacks, just hooks feels wrong.

saghul · 2024-12-10T08:06:30Z

quickjs.h

@@ -324,6 +324,13 @@ typedef void JSRuntimeFinalizer(JSRuntime *rt, void *arg);

 typedef struct JSGCObjectHeader JSGCObjectHeader;

+/* Engine hooks */
+typedef enum JSRuntimeHooks{
+    JS_HOOK_NONE,           //Don't use, just to catch potential bugs


I don't understand why this needs to exist. You want to avoid 0? Make JS_HOOK_GC_BEFORE start at 1?

Just a habit of mine. I usually keep 0 as an error value for enums, and I usually give them a mnemonic in place of starting with 1. It helps when testing debug builds where memory is initialized to 0.
In this specific context I agree it is not really useful.

saghul · 2024-12-10T08:08:12Z

quickjs.h

@@ -332,6 +339,8 @@ JS_EXTERN void JS_SetMemoryLimit(JSRuntime *rt, size_t limit);
 JS_EXTERN void JS_SetDumpFlags(JSRuntime *rt, uint64_t flags);
 JS_EXTERN size_t JS_GetGCThreshold(JSRuntime *rt);
 JS_EXTERN void JS_SetGCThreshold(JSRuntime *rt, size_t gc_threshold);
+/* register a hook dispatcher for this runtime. The handler should always return true for hooks which are not supported. */


The API suggests one could register multiple hooks, but you really cannot. The finalizers API can, on the contrary.

Now, if you add the ability to register multiple handlers, then the whole boolean return thing doesn't work...

saghul · 2024-12-10T08:11:02Z

quickjs.c

@@ -1932,6 +1948,11 @@ void JS_SetGCThreshold(JSRuntime *rt, size_t gc_threshold)
    rt->malloc_gc_threshold = gc_threshold;
 }

+void JS_SetHooksHandler(JSRuntime *rt, JS_BOOL (*fn)(JSRuntimeHooks type, void* opaque))


Why take an opaque if you are not doing anything with it? An opaque would make sense if you support multiple callbacks, so each could get their own.

KaruroChori · 2024-12-10T09:22:30Z

Yes, I agree it is a layer of complexity which makes no sense to me either, and I tried to sustain my point in the original PR. Still, you both insisted on having some kind of dispatching and a single entry point for those events, instead of normal function pointers for callbacks.

As I see it, there are only two reasonable options:

A generic mechanism to handle hooks engine-wide, of which those two for GC are the first to be implemented and possibly others will converge (but not those which are exposed as part of the js specs). I still don't think it should be a unified function to handle them all, but I would be very much in favour of a single entrypoint to set them based on the enum value. Still internally the engine should have a function table, and not how it was implemented in this PR.
Otherwise, we recognize the scope if just to cover those two for GC... so why all the downstream complexity of a unified event, and an additional enum to record just two entries? At that point the initial proposal of having two simple callback made much more sense to me.

Regardless of the adopted option, I don't see the need of supporting multiple hooks registered under the same event type (I am not talking from a JS perspective, but from the point of view of the engine). This would only require QJS to be much more opinionated on how to handle failures, their dependencies.... at which point it should be on the shoulders of the integrator not the engine itself.

saghul · 2024-12-10T14:36:49Z

Still, you both insisted on having some kind of dispatching and a single entry point for those events, instead of normal function pointers for callbacks.

We insisted there be a single callback with 2 events, not something that sounds generic for arbitrary events we don't currently have.

Otherwise, we recognize the scope if just to cover those two for GC... so why all the downstream complexity of a unified event, and an additional enum to record just two entries? At that point the initial proposal of having two simple callback made much more sense to me.

I guess that's personal peference, perhaps. A tangible argument is that a single function means enlarging the JSRuntime structure less.

I don't see the need of supporting multiple hooks registered under the same event type (I am not talking from a JS perspective, but from the point of view of the engine). This would only require QJS to be much more opinionated on how to handle failures, their dependencies.... at which point it should be on the shoulders of the integrator not the engine itself.

I don't have anything specific in mind, but it's easy to come up with something like:

one hook to log GC events somewhere
another hook to enable / disable GC

Could you do it with one? Sure. Does it make sense to have a separation of concerns? Depends. Just thinking out loud here.

KaruroChori · 2024-12-10T16:28:08Z

A tangible argument is that a single function means enlarging the JSRuntime structure less.

At the expense of crosstalk between logically distinct operations and a harder integration downstream, with a pattern which is not replicated anywhere else in the codebase for what I can tell.
I honestly fail to see how having a smaller JSRuntime structure would be beneficial, if not for extreme embedded scenarios. That being the case, it would always be possible to disable hooks via preprocessor if those bytes are needed.
Implementing a linear dispatcher downstream would still use up memory, so if there is not enough space for one strategy there is likely none for the other as well.

I don't have anything specific in mind, but it's easy to come up with something like

Those are totally valid options, and yet I am quite sure the rest of the runtime, not just the gc, would have similar cases of useful hooks to expose. So why are you excluding an implementation which can serve the entire runtime in favour of one for each subsystem?

Could you do it with one? Sure. Does it make sense to have a separation of concerns? Depends. Just thinking out loud here.

I don't follow. You are proposing a single event handler for all gc hooks. To avoid wasting space in the runtime structure. I don't agree, but at the very least I can follow the rationale behind. And yet, you suggest supporting multiple hooks for the same type of event. Would it not be a downstream decision for the person implementing the actual event handler? As per your suggestion of a unified handler, I cannot see how this could be a concern for qjs even if we wanted it to be. It would be quite literally code we are not writing.

checkroom added 2 commits December 10, 2024 06:59

Implementing registration for hooks

f73b7db

Implemented hooks for GC

4c60f3d

saghul reviewed Dec 10, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

QJS hooks and GC hooks #747

QJS hooks and GC hooks #747

KaruroChori commented Dec 10, 2024

saghul left a comment

saghul Dec 10, 2024

saghul Dec 10, 2024

saghul Dec 10, 2024

saghul Dec 10, 2024

KaruroChori Dec 10, 2024 •

edited

Loading

saghul Dec 10, 2024

KaruroChori Dec 10, 2024 •

edited

Loading

saghul Dec 10, 2024

KaruroChori Dec 10, 2024 •

edited

Loading

saghul Dec 10, 2024

saghul Dec 10, 2024

saghul Dec 10, 2024

KaruroChori Dec 10, 2024

saghul Dec 10, 2024

saghul Dec 10, 2024

KaruroChori commented Dec 10, 2024 •

edited

Loading

saghul commented Dec 10, 2024

KaruroChori commented Dec 10, 2024 •

edited

Loading

QJS hooks and GC hooks #747

Are you sure you want to change the base?

QJS hooks and GC hooks #747

Conversation

KaruroChori commented Dec 10, 2024

saghul left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KaruroChori Dec 10, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KaruroChori Dec 10, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KaruroChori Dec 10, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

KaruroChori commented Dec 10, 2024 • edited Loading

saghul commented Dec 10, 2024

KaruroChori commented Dec 10, 2024 • edited Loading

KaruroChori Dec 10, 2024 •

edited

Loading

KaruroChori Dec 10, 2024 •

edited

Loading

KaruroChori Dec 10, 2024 •

edited

Loading

KaruroChori commented Dec 10, 2024 •

edited

Loading

KaruroChori commented Dec 10, 2024 •

edited

Loading