@benjaminfrueh
Contributor

Fix WebSocket race condition causing "Still in CONNECTING state" errors

Problem

When listen() is called more than once while the WebSocket is still connecting, a race condition occurs:

  • First listen() call initializes WebSocket (state: CONNECTING)
  • Second listen() call sees existing WebSocket object and immediately sends without checking connection state
  • Results in InvalidStateError: Failed to execute 'send' on 'WebSocket': Still in CONNECTING state

This affects apps like Nextcloud Collectives where the page fails to render due to these WebSocket errors.
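
The sequence in the list above can be reproduced without a network connection. This is only a minimal sketch: FakeSocket, state and listenUnsafe are hypothetical stand-ins invented for illustration, not the library's actual code, and the event names are arbitrary examples.

```javascript
// Hypothetical stand-in for the browser WebSocket, so the sketch runs offline.
class FakeSocket {
	static CONNECTING = 0
	static OPEN = 1

	constructor() {
		this.readyState = FakeSocket.CONNECTING
		this.sent = []
	}

	send(message) {
		// Browsers throw an InvalidStateError DOMException here;
		// a plain Error stands in for it in this sketch.
		if (this.readyState === FakeSocket.CONNECTING) {
			throw new Error('InvalidStateError: Still in CONNECTING state')
		}
		this.sent.push(message)
	}
}

const state = { ws: null, listeners: {} }

// Simplified pre-fix logic (assumption): send whenever a socket object exists.
function listenUnsafe(name, handler) {
	state.listeners[name] = handler
	if (state.ws) {
		state.ws.send('listen ' + name)
	} else {
		// The first call only starts connecting.
		state.ws = new FakeSocket()
	}
}

listenUnsafe('notify_file', () => {})

let raceError = null
try {
	// The second call arrives while the socket is still CONNECTING.
	listenUnsafe('notify_activity', () => {})
} catch (error) {
	raceError = error
}
console.log(raceError.message)
```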

Solution

Add a readyState check so that listen() only sends when the WebSocket is OPEN. While the WebSocket is CONNECTING, listen() calls are queued in window._notify_push_listeners and processed by the onopen handler in setupSocket() once the connection is established.
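
A simplified sketch of that idea follows. It assumes a hypothetical FakeSocket in place of the browser WebSocket and a plain state object in place of the window globals; the real listen() and setupSocket() do more (authentication, reconnects), so this shows the shape of the check rather than the actual diff.

```javascript
// Hypothetical stand-in for the browser WebSocket.
class FakeSocket {
	static CONNECTING = 0
	static OPEN = 1

	constructor() {
		this.readyState = FakeSocket.CONNECTING
		this.sent = []
		this.onopen = null
	}

	send(message) {
		if (this.readyState !== FakeSocket.OPEN) {
			throw new Error('InvalidStateError: Still in CONNECTING state')
		}
		this.sent.push(message)
	}

	// Sketch-only helper: simulate the connection being established.
	open() {
		this.readyState = FakeSocket.OPEN
		if (this.onopen) this.onopen()
	}
}

// Stands in for window._notify_push_ws and window._notify_push_listeners.
const state = { ws: null, listeners: {} }

function setupSocket() {
	state.ws = new FakeSocket()
	state.ws.onopen = () => {
		// Flush everything that was registered while connecting.
		for (const name of Object.keys(state.listeners)) {
			state.ws.send('listen ' + name)
		}
	}
}

function listen(name, handler) {
	state.listeners[name] = handler
	if (!state.ws) {
		setupSocket()
		return
	}
	// Only send right away when the socket is OPEN;
	// otherwise the onopen handler sends the queued name later.
	if (state.ws.readyState === FakeSocket.OPEN) {
		state.ws.send('listen ' + name)
	}
}

listen('notify_file', () => {})
listen('notify_activity', () => {}) // queued, no exception
const sentWhileConnecting = state.ws.sent.length

state.ws.open()
listen('notify_notification', () => {}) // sent immediately
console.log(sentWhileConnecting, state.ws.sent)
```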

Contributor

@mejo- mejo- left a comment

Code changes look good to me. I didn't test it though.

@blizzz

blizzz commented Sep 23, 2025

We have good evidence that this solves the issue where documents sometimes fail to render in Collectives when notify-push is in use.

Contributor

@ShGKme ShGKme left a comment

Concurrent calls of setupSocket are covered by

	if (window._notify_push_ws) {
		return true;
	}
	window._notify_push_ws = true;

So it is safe to call setupSocket twice; the setup itself is executed only once.


However, the onopen event listener is async (a new macrotask), so there is a (tiny) chance that
window._notify_push_ws.readyState === WebSocket.OPEN already holds, but window._notify_push_ws.onopen has not been executed yet.

Although the listen message is sent again later in onopen anyway, authentication might not have happened yet when this send('listen ' + name) call inside listen runs, and the same message would then be sent a second time.

I cannot say whether this is a problem; I would need to try it or check the notify_push server source.

For example, I can imagine the response to listen being Invalid credentials, which would break JSON.parse in the onmessage handler. Or it adds complexity to the auth process.


Proposal: instead of relying on window._notify_push_ws.readyState (which does not indicate that the setup is complete), add a new flag that actually indicates the notify_push client ready state (the setup being complete).

Then we don't need to rely on listen being safe to call before auth, or to call twice.
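
To illustrate the proposal, here is a minimal sketch of a separate ready flag. All names (FakeSocket, client, the 'auth-placeholder' message) are hypothetical and only model the gap described above, where readyState is already OPEN but onopen has not run yet.

```javascript
// Hypothetical stand-in for the browser WebSocket.
class FakeSocket {
	static CONNECTING = 0
	static OPEN = 1

	constructor() {
		this.readyState = FakeSocket.CONNECTING
		this.sent = []
		this.onopen = null
	}

	send(message) {
		if (this.readyState !== FakeSocket.OPEN) {
			throw new Error('InvalidStateError: Still in CONNECTING state')
		}
		this.sent.push(message)
	}
}

// `ready` tracks setup completion, independent of the socket's readyState.
const client = { ws: new FakeSocket(), ready: false, listeners: new Set() }

client.ws.onopen = () => {
	// Placeholder for whatever authentication message the client really sends.
	client.ws.send('auth-placeholder')
	client.ready = true
	for (const name of client.listeners) {
		client.ws.send('listen ' + name)
	}
}

function listen(name) {
	client.listeners.add(name)
	// Gate on the setup flag, not on readyState.
	if (client.ready) {
		client.ws.send('listen ' + name)
	}
}

// Simulate the gap: the connection is OPEN, but the onopen task has not run.
client.ws.readyState = FakeSocket.OPEN
listen('notify_file') // only queued, although readyState is OPEN
const sentDuringGap = [...client.ws.sent]

client.ws.onopen() // the queued open task finally runs
console.log(sentDuringGap, client.ws.sent)
```

With the flag, the listen message goes out exactly once and only after the authentication placeholder, regardless of when listen() was called.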

@benjaminfrueh
Contributor Author

benjaminfrueh commented Sep 24, 2025

@ShGKme thank you for the review and for the feedback.

In the recent commit, I added the window._notify_push_ready flag to track setup completion. This flag is reset to false in networkOffline and onclose of the WebSocket.

I'm setting the ready flag after the preauth is sent and before the existing listeners loop runs. This could potentially cause duplicate sends if a listen() call happens in between. Setting it after the listeners loop could mean that, if one send() throws an error, the ready flag never gets set, and future listen() calls would just add to the queue.

What would be the preferred approach for handling this timing?
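
One possible way to handle the second concern, sketched under assumptions and not necessarily what was merged: wrap each send in its own try/catch, so that a single failing send cannot prevent the flag from being set after the loop. flushListeners and flakySocket are hypothetical names. Note also that if the onopen handler is fully synchronous (no await inside it), JavaScript runs it to completion, so no listen() call can interleave between setting the flag and the loop.

```javascript
// Hypothetical helper: send every queued listener name, isolating failures.
function flushListeners(socket, names) {
	const failed = []
	for (const name of names) {
		try {
			socket.send('listen ' + name)
		} catch (error) {
			// Remember the failure, but keep flushing the remaining names.
			failed.push(name)
		}
	}
	return failed
}

// Fake socket whose send() throws for one specific message.
const sent = []
const flakySocket = {
	send(message) {
		if (message === 'listen broken') {
			throw new Error('send failed')
		}
		sent.push(message)
	},
}

const client = { ready: false }
const failed = flushListeners(flakySocket, ['a', 'broken', 'b'])
// Set after the loop; still reached even though one send threw.
client.ready = true

console.log(sent, failed, client.ready)
```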

@susnux susnux force-pushed the fix/websocket-race-condition branch from fabdefd to fa363d2 Compare October 22, 2025 10:04
@susnux susnux added the bug Something isn't working label Oct 22, 2025
@susnux susnux merged commit 85300cd into nextcloud-libraries:main Oct 22, 2025
6 checks passed
@ShGKme ShGKme mentioned this pull request Oct 22, 2025