Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't invoke the RustFutureContinuationCallback while holding the mutex #1901

Closed

Conversation

bendk
Copy link
Contributor

@bendk bendk commented Dec 13, 2023

As described in the comments, this could lead to a deadlock if the foreign code immediately polled the future. This doesn't currently happen with any of the core bindings, but it might with other bindings and it seems like a footgun in general.

To accomplish this, I made a separate class to handle the state and moved the mutex to only wrap that state. The state mutation happens with the mutex locked and the callback invocation happens after it's unlocked.

@bendk bendk requested a review from a team as a code owner December 13, 2023 20:19
@bendk bendk requested review from jeddai and a team and removed request for a team December 13, 2023 20:19
@bendk bendk force-pushed the dont-invoke-async-callback-with-mutex branch from 1dcc634 to 3f7095b Compare December 13, 2023 20:22
As described in the comments, this could lead to a deadlock if the
foreign code immediately polled the future.  This doesn't currently
happen with any of the core bindings, but it might with other bindings
and it seems like a footgun in general.

To accomplish this, I made a separate class to handle the state and
moved the mutex to only wrap that state.  The state mutation happens
with the mutex locked and the callback invocation happens after it's
unlocked.
@bendk bendk force-pushed the dont-invoke-async-callback-with-mutex branch from 3f7095b to e120ddc Compare December 13, 2023 21:02
@bendk
Copy link
Contributor Author

bendk commented Dec 13, 2023

Actually, let's not do this.

I ran into this issue when working on #1837 and I believe this change fixes things on our side. However, I'm still getting deadlocks. I believe the issue is that the futures themselves don't expect to be polled while they are calling the waker function and will deadlock if that happens.

Instead of fixing this in code, I'll just update the docs in that branch to point this out.

@bendk bendk closed this Dec 13, 2023
jhugman added a commit to jhugman/uniffi-bindgen-react-native that referenced this pull request Sep 17, 2024
According to [The Big O of Code
Reviews](https://www.egorand.dev/the-big-o-of-code-reviews/), this is a
O(_n_) change.

This PR moves the polling of futures to outside of the continuation
callback; this has shown to be a source of deadlocks.

```
      // From https://github.com/mozilla/uniffi-rs/pull/1837/files#diff-8a28c9cf1245b4f714d406ea4044d68e1000099928eaca1afb504ccbc008fe9fR35-R37
      //
      // > WARNING: the call to [rust_future_poll] must be scheduled to happen soon after the callback is
      // > called, but not inside the callback itself.  If [rust_future_poll] is called inside the
      // > callback, some futures will deadlock and our scheduler code might as well.
      //
      // This delay is to ensure that `uniffiFutureContinuationCallback` returns before the next poll, i.e.
      // so that the next poll is outside of this callback.
      //
      // The length of the delay seems to be significant (at least in tests which hammer a network).
      // I would like to understand this more: I am still seeing deadlocks when this drops below its current
      // delay, but these maybe related to a different issue, as alluded to in
      // mozilla/uniffi-rs#1901
```

Unfortunately, the tests for this aren't likely to end up in this repo,
since it relies on a fork of the matrix-sdk.

As said in the comment, I'm don't think this has fixed all possible
deadlocks, but I'm _hoping_ this has at least fixed the proximal
problem.

Fixes #89
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant