-
Notifications
You must be signed in to change notification settings - Fork 14
[OF-1524] Cross-thread Callbacks #806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This creates a callback queue using the emscripten api to be able to push callbacks to the main thread if we are not on that thread
Logic is now reversed now that we have fixed this test
We aere alwaysd logging that the connection was interrupted, even for a regular disconnect
🚀 Pull Request Review GuidelinesThank you for taking the time to review this Pull Request. The following is a summary of our Pull Request guidelines. The full guidelines can be found here. 💬 How to Provide FeedbackWe use a comment ladder when leaving review comments to avoid any ambiguity.
All comments should be prefixed with one of the above tags, for example: 🎯 PR Author Focus Areas
Thanks again for taking the time to review this Pull Request. |
| catch (const std::exception& e) | ||
| { | ||
| INVOKE_IF_NOT_NULL(NetworkInterruptionCallback, e.what()); | ||
| LogSystem.LogMsg(csp::common::LogLevel::Log, "Connection Interrupted."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Log should be in here, as this is when we call the NetworkInterruptionCallback
| {{# parameters }} | ||
| {{# type.is_string }} | ||
| _{{ name }} | ||
| (const char*)_{{ name }} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was getting compile errors for strings, as were were creating chars. but we are expected cont char
| {{ name }}_StateObject, | ||
| {{# parameters }} | ||
| CallCallback({{ name }}, {{ name }}_StateObject, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now calls the wrapper function instead
| errors.some(e => e.message.includes('table index is out of bounds')), | ||
| 'Expected cross-thread `table index is out of bounds` error. Message not found, did you fix the bug (OB-3782)? Nice job! ', | ||
| ); | ||
| !errors.some(e => e.message.includes('table index is out of bounds'))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can flip this test logic now its fixed!
🚀 Pull Request Review GuidelinesThank you for taking the time to review this Pull Request. The following is a summary of our Pull Request guidelines. The full guidelines can be found here. 💬 How to Provide FeedbackWe use a comment ladder when leaving review comments to avoid any ambiguity.
All comments should be prefixed with one of the above tags, for example: 🎯 PR Author Focus Areas
Thanks again for taking the time to review this Pull Request. |
|
I don't think this PR is complete, there is not enough testing. We need at least one test for each of the known cases. And given that this is an emscripten implementation, I was also expecting tests demonstrating what happens when functions are called from web workers, or at least some broadcasting of what the model is on this.
I'm don't think this is acceptable, the way the testing framework or CSP is setup isn't immutable, did you experiment with a proxying server & changing the interruption timeout time to be much lower/instant? The below is ideological critique, and wouldn't neccesarily block a merge, that would be up to @MAG-AdamThorn as tech lead Primarily though, looking at this implementation I struggle to agree with the tradeoffs of using the emscripten api for this, rather than a hand-rolled queue. It seems almost equally large in terms of code footprint, with the disadvantages of :
Clients needing to call |
It's pretty well documented, with 1 emscripten api call, and a few small functions
Agreed. However, if we move to embind, it would be a better fit for this, and would be easy to move
The tick solution is limited, as it will push callbacks onto the main thread, ,so if clients want to call these functions on a worker in the future, it will still crash With the emscripten solution, we can update this to call back to the thread the client originally called it on. Also, I think the tick solution creates a confusing api, where certain functions will only callback if you are calling tick, especially since this affects functions during the csp initialization process (Login). This will break all of our tests and likely client implementations. |
Why is that true? The example I showed you only pushed data back onto the main thread via a copy, callbacks never moved. It was a simpler model imo which completely sidestepped the problem of moving capture state and should have worked from any calling environment.
I think this is a vast overstatement. Nonetheless, as we disagree on this there's not much to be done. I'll reluctantly review the existing implementation presuming @MAG-AdamThorn wants to press ahead with this, although my critiques around the level of test coverage still stands. I wish there was a way of moving this out of the mustache files still ... it's bound to just be forgotten about. |
The issue we are experiencing is that the callbacks are called on a different thread from which it was created, not that its called on a non-main thread. So, if clients call these functions on a worker thread, we will push to the main thread, which will crash again. This current solution doesn't support this. However, it can be updated to support if needed. |
|
I agree with the test coverage. As mentioned in the standup thread earlier, I will continue with this |
Ah, I see. Yeah, the tick solution suffers from the same issue as this implementation then without additional work, I'll admit this weakens my argument somewhat. |
|
We shouldn't be having this discussion at PR time. Next time we realize we disagree on implementation, lets try to properly resolve it earlier. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Despite my hesitation around solving the problem in this area at the moment, given the roughness of the terrain, I think the implementation is rather good, albeit underdocumented for the level of complexity on display.
| static pthread_t MainThread = 0; | ||
|
|
||
| // Creates an emscripten queue and sets the main thread if we haven't already initialized. | ||
| void Emscripten_InitializeProxyQueue() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question] I don't understand why this logic is in a mustache file, and not in a proper .cpp with #EMSCRIPTEN guards like all of the rest of our emscripten specific code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. It would be a lot nicer in a cpp
| if (ProxyQueue != nullptr) | ||
| { | ||
| return; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question] Does this imply this Initialize method is called lazily? Is this a necessity, [consider] if it could be called during the standard initialize flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. I was worried about doing it when we first created the variable because its global. Maybe it's ok to do so.
| // Structure to hold the callback and arguments. | ||
| template <typename ...T> | ||
| struct CallbackData | ||
| { | ||
| void (*Callback)(T...); | ||
| std::tuple<T...> Args; | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question] How does captured state play into this, seeing that this is a function pointer? Normally function pointer based callback approaches have a *UserContext sort of thing to work as a shim for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Consider] Ah I see, the StateObject is always the first arg of the pack. Perhaps make that explicit and having the parameter pack only apply to user arguments in order to reduce arcanity, it's not possible to know how this works currently without also exploring the C-wrapper.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Comment] This makes me worry about the "This will be trivial in embind" thing, we'll have to re-solve this. I suppose it's not the hugest thing.
| static void Emscripten_CallbackWrapper(void* InData) | ||
| { | ||
| auto* Data = static_cast<CallbackData<T...>*>(InData); | ||
| std::apply(Data->Callback, Data->Args); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Comment] Pretty neat, I almost never see std::apply used. I think its only real purpose is to do this sort of parameter unpack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Nit] You might want to consider moving/forwarding the args, just for a bit of optimality for callbacks with value signatures. I struggle to trace the InData ownership however, so this might not be the thing to do.
| // Generic callback function which will either call the callback directly, or, | ||
| // if we're on wasm, push to the main thread if we're not on the main thread. | ||
| template <typename ...T> | ||
| static void CallCallback(void (*Callback)(T...), T... Args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Consider] Naming this more specifically so in the C-api it's known why it exists. CallCallbackOnMainThread or something.
CSP sometimes fires certain callbacks on a different thread the function was invoked. This causes crashes in WASM builds due to it not supporting this behaviour.
I decided to fix this at the binding level (wrapper generator), as we only need to fix this on wasm, and not anywhere else. Also, creating a callback queue that needs to be polled may change expected behavior, as clients would need to call CSPFoundation::Tick before logging in to get the callback.
I used the emscripten api to create a callback queue, which pushes callbacks onto the main thread if we are not on main. Doing it here also means we don't have to poll, as Emscripten resolved the inter-thread messaging for us using worker-thread messaging.
I also fixed a small logging issue in the NetworkInterruptionCallback, as we were always logging, even if it was a regular disconnect.
I was unfortunatly unable to reproduce the NetworkInterruptionCallback crash, both via testing and using a web build. Our testing framework crashes when trying to block to wait for the callback, so it isn't easily testable. Also, using a web build showed the callback logs, but no crash.
I also tested my new changes on a local web build to make sure there were no obvious issues/