Allow WebSocket and native clients on the same socket #362

Neverlord · 2023-06-11T08:04:14Z

This goes back to a discussion with @ckreibich. Having a single port for Broker would make it easier to use the WebSocket API for cluster management.

ckreibich · 2024-01-30T17:15:34Z

There's debate about whether this feature is really needed. Original motivation was to avoid needing to open up multiple ports to support either transport.

ckreibich · 2024-01-31T21:49:09Z

The above was a quick note during an update meeting. Here's where the idea came from.

There's no plan for the new Broker Python bindings (#380) to support anything but the WebSocket transport. Supporting the binary protocol would require linking bindings with the Broker library, or reimplementing the binary transport — neither of which is attractive. This implies that users of the bindings speak to other endpoints via WebSockets. In Zeek this requires an explicit Broker::listen_websocket().

One of the pain points folks running large clusters mentioned during Management framework research was the administrative overhead of opening up ports to their clusters. ZeekControl uses the current Broker bindings for Control framework event I/O with all nodes in the cluster. So at least during an interim phase where we have the new bindings but the management framework isn't yet ready, cluster operators using ZeekControl would have to open an additional port on every node, and we'd need to adjust the cluster code to listen on the additional port.

It seems easier to do away with the distinction at the API level and leave it to the implementation to sort it out. I agree that it's a bit unusal, but it's easy to do: binary goop vs an HTTP handshake at start of (post-TLS) payload. It implies a certain loss of control — let's say what if you had expected an endpoint to use the binary protocol and now something shows up speaking WebSockets — but that's unlikely to begin with and can be countered with appropriate logging. Pub/sub is already independent of individual connections.

Thoughts, given the above?

awelzel · 2024-02-01T10:11:58Z

Would much prefer recommending and pushing for an architecture where there's just a single WebSocket port open for the whole cluster (served by the manager - or, if failover is actually a topic, listen on the proxies with an nginx loadbalancer in front). And define "internal cluster communication" running separately over the existing ports. For example, this would allow to run WebSocket connections over SSL and internal cluster communication non-encrypted (less overhead and tcpdumping broker traffic appears still one of the most effective ways to troubleshoot cluster communication in production). Firewall rules can also be more specific: External hosts may connect to the WebSocket port, but can not fuzz around with the binary protocol.

One of the pain points folks running large clusters mentioned during Management framework research was the administrative overhead of opening up ports to their clusters.

I'm not in that space, but yes, I'm sure it's annoying to have another port open. It should be a one-time thing though. We've changed the complete port cluster port range with zeek/zeekctl#41. This must have affected people, but I haven't heard uproar - though maybe the motivation for the change was good enough.

ZeekControl would have to open an additional port on every node, and we'd need to adjust the cluster code to listen on the additional port.

I have not fully thought this through or looked at what it does, but if we change ZeekControl to use WebSockets, maybe the eventing it does can be threaded through the manager instead of connecting to individual nodes (which has caused issues like #312 for large clusters)

do: binary goop vs an HTTP handshake at start of (post-TLS) payload.

If this is implemented, maybe at least we should use ALPN as a mechanism rather than magic signature matching on payload. Though that won't work with plaintext.

My opinion - will not further comment given other opinions exist :-)

ckreibich · 2024-02-02T03:16:07Z

All good thoughts, thanks Arne. I just had another thought: I think it's actually okay to do nothing. When the new bindings come out, we'll obviously deprecate the current ones. But the deprecation window is up to us (the closest would be deprecating with 7.0 and pulling with 7.1, which arguably is quite short), and while everyone migrates to the new ones, we can simply keep using the old ones for zeekctl, which in itself is approaching deprecation. The downside is that we can't pull the old bindings from Broker until we're ready to move on from zeekctl, but I don't think that's the end of the world — it's not like we currently spend much if any effort maintaining them. Plus, it increases the incentive to get the management framework ready for production use.

ckreibich · 2024-02-02T21:25:00Z

Btw — I agree with the centralization idea (one point to connect into the cluster). The management framework takes a few steps in that direction. When you launch a cluster via it, workers currently don't listen, because nothing needs to connect to them — at least so far.

Regarding the TLS/policy arguments, also duly noted. I just looked at this a bit and Zeek's Broker use will need some tweaking to allow different per-port security policies ... looks like a Broker endpoint gets a single configuration (with certs etc), and can then listen on a port with the traditional protocol and/or another one with WebSockets. It matches how we currently define the certs etc as globals in the script layer, not tied to the individual listens.

Anyway — I'm closing this as nothing to do.

Neverlord added the Type: Enhancement label Jun 11, 2023

Neverlord self-assigned this Jun 11, 2023

Neverlord added this to Zeek 6.1 Jul 6, 2023

Neverlord moved this to In Progress in Zeek 6.1 Jul 6, 2023

Neverlord added this to Zeek 6.2 Oct 2, 2023

Neverlord removed this from Zeek 6.1 Oct 2, 2023

ckreibich added this to Zeek 7.0 Jan 30, 2024

ckreibich added the Invalid / Nothing to do label Feb 2, 2024

ckreibich removed this from Zeek 7.0 Feb 2, 2024

ckreibich closed this as completed Feb 2, 2024

github-project-automation bot moved this to Done in Zeek 6.2 Feb 2, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow WebSocket and native clients on the same socket #362

Allow WebSocket and native clients on the same socket #362

Neverlord commented Jun 11, 2023

ckreibich commented Jan 30, 2024

ckreibich commented Jan 31, 2024

awelzel commented Feb 1, 2024

ckreibich commented Feb 2, 2024

ckreibich commented Feb 2, 2024

Allow WebSocket and native clients on the same socket #362

Allow WebSocket and native clients on the same socket #362

Comments

Neverlord commented Jun 11, 2023

ckreibich commented Jan 30, 2024

ckreibich commented Jan 31, 2024

awelzel commented Feb 1, 2024

ckreibich commented Feb 2, 2024

ckreibich commented Feb 2, 2024