Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QJS hooks and GC hooks #747

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

KaruroChori
Copy link
Contributor

Partial replacement of #407 with a different hook mechanism in place.

Copy link
Contributor

@saghul saghul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the current API proposallooks too broad while at the same time being very limited.

IMHO I'd rather have a GC specific API without the apparent flexibility.

@@ -305,6 +305,9 @@ struct JSRuntime {
void *user_opaque;
void *libc_opaque;
JSRuntimeFinalizerState *finalizers;

/* Unified handler for all the engine hooks */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have something similar for finalizers, how is this generic? It only deals with GC. I'd call it GC hooks.

Alternatively we could fold finalizers in there too, but I'm not too thrilled about it.

@@ -1440,7 +1443,17 @@ static void js_trigger_gc(JSRuntime *rt, size_t size)
(uint64_t)rt->malloc_state.malloc_size);
}
#endif
JS_RunGC(rt);
//To ensure JS_RunGC cannot be executed again within callbacks, disable and restore it after.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make the style match the surroundings, here and elsewhere.

size_t tmp_threshold = rt->malloc_gc_threshold;
rt->malloc_gc_threshold=-1;

if((rt->hooks == NULL) || rt->hooks(JS_HOOK_GC_BEFORE,NULL)){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for not running GC if the before handler returns FALSE? There is an API to disable GC already, what's the use case here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hook could change the threshold, but you are then re-setting it.

Copy link
Contributor Author

@KaruroChori KaruroChori Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but while functionally similar they have different applications. Setting the GC on/off in specific critical regions can get a bit tricky with async/await. Not only that, but doing so would mask that a request was done, preventing the application from saying "oh yes, I had to delay it, I am now in a safe region, let's clean up things". Ensuring the before event can say, "hey no, please don't" is a fail-safe. Ideally the custom callback could set up alternative mechanism to run that gc event, just a bit later when no critical task it taking place.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds very theoretical ;-) Why can't the mechanism that knows if we are safe or not enable / disable GC while in a critical region?

Copy link
Contributor Author

@KaruroChori KaruroChori Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the application requirements:

  • Option 1: I don't care, gc events as needed and scheduled by the engine. Not good for real time applications.
  • Option 2: gc events only when I determine regions are safe (possibly never). I will trigger it myself when and if I want it. The only feasible approach for the strictest real time applications.
  • Option 3: I would prefer gc to take place based on the engine's suggestions, but I cannot do it right now. I take notice of that, and skip it for now. In the meanwhile I ask the system to move itself in a safe state when possible, schedule the gc event in that safe state, and recover the current job from where it was left.

Having the option to skip gc events while tracking them enables option 3 which is better than option 2 for a good class of real time applications.
It is not theoretical, it is literally how my application works.
It does not mean it is good design, just that at the very least it is slightly above a purely theoretical scenario.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for elaborating! Looks like option 3 could work as follows:

  • GC event hits
  • app checks if it can afford it, if not it disabled GC and tries to move to the safe state
  • Once the safe state is hit, enable GC and run it
  • goto 1

Wouldn't that work?

Copy link
Contributor Author

@KaruroChori KaruroChori Dec 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By the time the app can check if it is ok to run a GC event, it is already in the body of before.
If before does not stop it by failing, and the caller obliges, what does?
Unless you plan to split before and evaluate in two separate hooks, but it would be virtually identical just for more complexity since there are no further places where they would be called.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before can stop GC from running by disabling it.

So before get hit, but because you disable it it doesn't run. Then when you want GC the app can enable it and run it.

What wouldn't work?

@@ -1810,6 +1823,8 @@ JSRuntime *JS_NewRuntime2(const JSMallocFunctions *mf, void *opaque)
rt->malloc_state = ms;
rt->malloc_gc_threshold = 256 * 1024;

rt->hooks = NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this gc_handlers, gc_hooks or gc_callbacks, just hooks feels wrong.

@@ -324,6 +324,13 @@ typedef void JSRuntimeFinalizer(JSRuntime *rt, void *arg);

typedef struct JSGCObjectHeader JSGCObjectHeader;

/* Engine hooks */
typedef enum JSRuntimeHooks{
JS_HOOK_NONE, //Don't use, just to catch potential bugs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this needs to exist. You want to avoid 0? Make JS_HOOK_GC_BEFORE start at 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a habit of mine. I usually keep 0 as an error value for enums, and I usually give them a mnemonic in place of starting with 1. It helps when testing debug builds where memory is initialized to 0.
In this specific context I agree it is not really useful.

@@ -332,6 +339,8 @@ JS_EXTERN void JS_SetMemoryLimit(JSRuntime *rt, size_t limit);
JS_EXTERN void JS_SetDumpFlags(JSRuntime *rt, uint64_t flags);
JS_EXTERN size_t JS_GetGCThreshold(JSRuntime *rt);
JS_EXTERN void JS_SetGCThreshold(JSRuntime *rt, size_t gc_threshold);
/* register a hook dispatcher for this runtime. The handler should always return true for hooks which are not supported. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The API suggests one could register multiple hooks, but you really cannot. The finalizers API can, on the contrary.

Now, if you add the ability to register multiple handlers, then the whole boolean return thing doesn't work...

@@ -1932,6 +1948,11 @@ void JS_SetGCThreshold(JSRuntime *rt, size_t gc_threshold)
rt->malloc_gc_threshold = gc_threshold;
}

void JS_SetHooksHandler(JSRuntime *rt, JS_BOOL (*fn)(JSRuntimeHooks type, void* opaque))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why take an opaque if you are not doing anything with it? An opaque would make sense if you support multiple callbacks, so each could get their own.

@KaruroChori
Copy link
Contributor Author

KaruroChori commented Dec 10, 2024

Yes, I agree it is a layer of complexity which makes no sense to me either, and I tried to sustain my point in the original PR. Still, you both insisted on having some kind of dispatching and a single entry point for those events, instead of normal function pointers for callbacks.

As I see it, there are only two reasonable options:

  • A generic mechanism to handle hooks engine-wide, of which those two for GC are the first to be implemented and possibly others will converge (but not those which are exposed as part of the js specs). I still don't think it should be a unified function to handle them all, but I would be very much in favour of a single entrypoint to set them based on the enum value. Still internally the engine should have a function table, and not how it was implemented in this PR.
  • Otherwise, we recognize the scope if just to cover those two for GC... so why all the downstream complexity of a unified event, and an additional enum to record just two entries? At that point the initial proposal of having two simple callback made much more sense to me.

Regardless of the adopted option, I don't see the need of supporting multiple hooks registered under the same event type (I am not talking from a JS perspective, but from the point of view of the engine). This would only require QJS to be much more opinionated on how to handle failures, their dependencies.... at which point it should be on the shoulders of the integrator not the engine itself.

@saghul
Copy link
Contributor

saghul commented Dec 10, 2024

Still, you both insisted on having some kind of dispatching and a single entry point for those events, instead of normal function pointers for callbacks.

We insisted there be a single callback with 2 events, not something that sounds generic for arbitrary events we don't currently have.

  • Otherwise, we recognize the scope if just to cover those two for GC... so why all the downstream complexity of a unified event, and an additional enum to record just two entries? At that point the initial proposal of having two simple callback made much more sense to me.

I guess that's personal peference, perhaps. A tangible argument is that a single function means enlarging the JSRuntime structure less.

I don't see the need of supporting multiple hooks registered under the same event type (I am not talking from a JS perspective, but from the point of view of the engine). This would only require QJS to be much more opinionated on how to handle failures, their dependencies.... at which point it should be on the shoulders of the integrator not the engine itself.

I don't have anything specific in mind, but it's easy to come up with something like:

  • one hook to log GC events somewhere
  • another hook to enable / disable GC

Could you do it with one? Sure. Does it make sense to have a separation of concerns? Depends. Just thinking out loud here.

@KaruroChori
Copy link
Contributor Author

KaruroChori commented Dec 10, 2024

A tangible argument is that a single function means enlarging the JSRuntime structure less.

At the expense of crosstalk between logically distinct operations and a harder integration downstream, with a pattern which is not replicated anywhere else in the codebase for what I can tell.
I honestly fail to see how having a smaller JSRuntime structure would be beneficial, if not for extreme embedded scenarios. That being the case, it would always be possible to disable hooks via preprocessor if those bytes are needed.
Implementing a linear dispatcher downstream would still use up memory, so if there is not enough space for one strategy there is likely none for the other as well.

I don't have anything specific in mind, but it's easy to come up with something like

Those are totally valid options, and yet I am quite sure the rest of the runtime, not just the gc, would have similar cases of useful hooks to expose. So why are you excluding an implementation which can serve the entire runtime in favour of one for each subsystem?

Could you do it with one? Sure. Does it make sense to have a separation of concerns? Depends. Just thinking out loud here.

I don't follow. You are proposing a single event handler for all gc hooks. To avoid wasting space in the runtime structure. I don't agree, but at the very least I can follow the rationale behind. And yet, you suggest supporting multiple hooks for the same type of event. Would it not be a downstream decision for the person implementing the actual event handler? As per your suggestion of a unified handler, I cannot see how this could be a concern for qjs even if we wanted it to be. It would be quite literally code we are not writing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants