horizontal scaling for self-hosted realtime servers, for broadcast and presence features #760
Comments
While researching this I found the DNS_NODES param, but I'm not sure whether it works outside of Fly yet. I'm looking for something that works in Kubernetes or Docker Swarm.
Hi @menasheh, you might be having issues because you actually need to connect the Erlang nodes. We're using the libcluster Postgres strategy to achieve that in Realtime. Where are you deploying your system? In theory, the only thing you need is to be able to ping another machine's hostname, and that should be enough for the nodes to connect to each other. More details on the strategy: https://github.com/supabase/libcluster_postgres
Thanks @filipecabaco. If I understood this correctly, the default cluster strategy is DNS, but I can set
We're deploying on Linux servers in a company-owned private cloud environment, using either Docker or Kubernetes. Attaching a section of logs from one of the Kubernetes pods:
How does realtime get the hostname?
That might be the issue. We're using a VPC to achieve it, so the IPs are discoverable between all nodes, which means that it works for the Postgres libcluster strategy to just do a quick
Are you able to ping the hostnames / IPs from one container to another?
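The reachability question above can be checked from inside a container with something like the following. This is a minimal sketch, not part of Realtime: `check_peers` and the `PROBE` variable are hypothetical names, and the default probe assumes a Linux `ping` that accepts `-c` and `-W`. A successful ping only shows that the address is routable; it does not prove that the Erlang distribution ports are open.

```shell
#!/bin/sh
# Hypothetical helper: report which peer hostnames or IPs answer a probe.
# PROBE can be overridden; by default a single ping with a 2 second
# timeout is used (Linux ping flags assumed).
check_peers() {
  probe="${PROBE:-ping -c 1 -W 2}"
  failed=0
  for peer in "$@"; do
    # $probe is intentionally unquoted so the command and its flags split.
    if $probe "$peer" >/dev/null 2>&1; then
      echo "reachable: $peer"
    else
      echo "unreachable: $peer"
      failed=1
    fi
  done
  # Non-zero exit status if any peer could not be reached.
  return "$failed"
}
```

Run it from one container with the other pods' IPs or hostnames as arguments, for example `check_peers 10.0.0.12 10.0.0.13` (placeholder addresses).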
Hello, I have 3 pods of realtime. Following this thread, I have configured the pods' hostnames to be their FQDN. I have also configured the following env vars:
CLUSTER_STRATEGIES: POSTGRES
POSTGRES_CLUSTER_CHANNEL_NAME: realtime_broadcast_cluster
I have tried a
Do I need to manually create a tenant via a curl request to trigger realtime to set itself up properly?
Edit:
Final edit:
Ok, I've had a play. I've tested this, and it seems to be working for the POSTGRES strategy. The following lines:
if [ -z $ip ]; then
ip=127.0.0.1
fi
were overwriting anything that might be able to bootstrap the cluster connectivity, unless the environment exactly matches Fly's 6pn thingy or the AWS Fargate environment.
Patching: https://github.com/supabase/realtime/blob/cd04f2f744834296b5a4b3e360e95c3fab5f9165/rel/env.sh.eex
#!/bin/sh
# Set the release to work across nodes. If using the long name format like
# the one below ([email protected]), you need to also uncomment the
# RELEASE_DISTRIBUTION variable below. Must be "sname", "name" or "none".
# Only auto-detect the IP when the 'ip' env var has not been set explicitly.
if [ -z "$ip" ]; then
  # for Fly.io
  ip=$(grep fly-local-6pn /etc/hosts | cut -f 1)
  echo "No EnvVar 'ip' set, trying fly-local-6pn '${ip}'"
else
  echo "ip set from EnvVar 'ip' '${ip}'"
fi
# for AWS ECS Fargate
if [ "$AWS_EXECUTION_ENV" = "AWS_ECS_FARGATE" ]; then
  ip=$(hostname -I | awk '{print $3}')
  echo "AWS_EXECUTION_ENV is AWS_ECS_FARGATE. Overriding 'ip' '${ip}'"
fi
# default to localhost
if [ -z "$ip" ]; then
  ip=127.0.0.1
  echo "No EnvVar 'ip' set and could not auto-configure, defaulting to '${ip}'"
fi
# assign the value of NODE_NAME if it exists, else assign the value of FLY_APP_NAME,
# and if that doesn't exist either, assign "realtime" to node_name
node_name="${NODE_NAME:=${FLY_APP_NAME:=realtime}}"
if [ -z "$RELEASE_DISTRIBUTION" ]; then
  export RELEASE_DISTRIBUTION=name
  echo "No RELEASE_DISTRIBUTION set. Using '${RELEASE_DISTRIBUTION}'"
fi
if [ -z "$RELEASE_NODE" ]; then
  export RELEASE_NODE="$node_name@$ip"
  echo "No RELEASE_NODE set. Using '${RELEASE_NODE}'"
fi
I've tested this on my local k8s cluster with the following env vars set:
- name: ip
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
- name: CLUSTER_STRATEGIES
  value: POSTGRES
- name: FLY_APP_NAME
  value: testing
- name: SLOT_NAME_SUFFIX
  valueFrom:
    fieldRef:
      fieldPath: metadata.labels['apps.kubernetes.io/pod-index']
and I am seeing goodies like these in the pod logs:
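To make the effect of the patch easier to see, the precedence it applies to the node address can be pulled out into small functions and tried on their own. This is only an illustrative sketch: `resolve_ip` and `build_release_node` are hypothetical helpers, not functions from the Realtime repository, and the AWS Fargate override from the script is left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper mirroring the precedence in the patched script:
# 1. an explicitly provided ip (e.g. the pod IP from the env var 'ip'),
# 2. an auto-detected fallback (e.g. the fly-local-6pn address),
# 3. 127.0.0.1 when neither is available.
resolve_ip() {
  explicit="$1"
  fallback="$2"
  if [ -n "$explicit" ]; then
    printf '%s\n' "$explicit"
  elif [ -n "$fallback" ]; then
    printf '%s\n' "$fallback"
  else
    printf '%s\n' "127.0.0.1"
  fi
}

# Hypothetical helper: join the node name and address the same way the
# script builds RELEASE_NODE, defaulting the name to "realtime".
build_release_node() {
  printf '%s@%s\n' "${1:-realtime}" "$2"
}
```

With the env vars from the comment above (FLY_APP_NAME set to `testing` and `ip` taken from `status.podIP`), `build_release_node testing "$(resolve_ip "$ip" "")"` yields a node name of the form `testing@<pod IP>`, which is an address other pods can reach, rather than `testing@127.0.0.1`.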
If we spin up a single realtime server, presence feature works beautifully for the most part, with the possible exception of long-opened tabs not counting as present after a while. However, if we spin up multiple parallel realtime servers and use a load balancer, we're seeing presence smaller than the number of tabs we have open. It seems some are connected to one server, and some to the other, so the presence becomes out of sync.
Is there a way to keep multiple realtime servers in sync for broadcast and presence features?