Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: implement relay websocket connection state #128

Closed
wants to merge 3 commits into from

Conversation

geekbrother
Copy link
Contributor

@geekbrother geekbrother commented Oct 17, 2023

Description

This PR introduces the following changes:

  • WebSocketClientState enum for tracking the websocket connection state.
    • The Connecting state includes a tokio notifier to notify waiting for the connection/reconnection threads.
  • respawn_connection function that connects to the WebSocket and changes the connection state.
  • publish_message relay client publish function wrapper that tracks a connection state and spawns a reconnection in case of an error during the publish call.
  • Replacing old functions with the new wrappers and passing down additional parameters.

Resolves #74

How Has This Been Tested?

The concept was tested by the simple isolated code, and implementation was tested using the unit and integration tests.

Due Diligence

  • Breaking change
  • Requires a documentation update
  • Requires a e2e/integration test update

@geekbrother geekbrother force-pushed the max/fix/add_connection_state branch from a95fcb1 to a0bf2f2 Compare October 17, 2023 17:07
@geekbrother geekbrother marked this pull request as ready for review October 17, 2023 17:14
@@ -2,6 +2,12 @@
All notable changes to this project will be documented in this file. See [conventional commits](https://www.conventionalcommits.org/) for commit guidelines.

- - -
## v0.15.1 - 2023-10-17
Copy link
Member

@chris13524 chris13524 Oct 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once my PR is merged can you reset this CHANGELOG, revert the version in Cargo.toml, and delete any created GitHub releases before merging?

Comment on lines 344 to +346
client,
client_state,
app_state,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot to have to pass around. Can we wrap all 3 of these in a struct (maybe called RobustRelayClient) and put publish_message(&self) in its implementation?

#[derive(Debug, Clone)]
pub enum WebSocketClientState {
Connected,
Disconnected,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking: I'm a little scared of this state being here. It doesn't look like this is ever set other than during startup, but it would be nice if there was an approach that avoided this extra state so it was always either Connected or Connecting.

}

if let Err(e) = wsclient
.publish(topic.clone(), message, tag, ttl, prompt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's possible the relay gets disconnected after getting current_client_state_clone above and calling .publish() here. I think we should always assume it is connected, call .publish(), and if there is a connection error (not just any error), then start waiting for the reconnect to succeed.

Furthermore by calling respawn_connection() in this function and with the race condition mentioned above, it's possible for multiple calls to also call respawn_connection() for the same disconnect. I'd suggest putting respawn_connection() inside of the handler for wsclient::RelayClientEvent::Disconnected instead so that it happens 1 time and also automatically whenever the relay is disconnected, not just when a publish fails.

I don't think we need a connected/not-connected state variable actually, just a RwLock<Notify>. Whenever a disconnected-related publish() error happens, then it does a read-lock and calls notified(). And whenever the Disconnected and respawn_connection() complete, then it does a write-lock, calls notify_waiters(), replaces the Notify with a fresh one, and unlocks.

The assumption is that a disconnected-related publish() errors are strongly correlated with wsclient::RelayClientEvent::Disconnected events.

@@ -23,7 +24,11 @@ use {
},
serde::{Deserialize, Serialize},
sha2::Sha256,
std::{sync::Arc, time::Instant},
std::{
sync::{Arc, Mutex},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we always use tokio Mutex not blocking ones?

@geekbrother
Copy link
Contributor Author

After discussion, we've decided to switch publishing in handlers to the HTTP relay client.
This PR closing in favor of #142

@geekbrother geekbrother deleted the max/fix/add_connection_state branch November 8, 2023 11:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

fix: handler functions do not anticipate websocket disconnection
2 participants