Rack-aware load balancing #31

Open
nyh opened this issue Aug 28, 2024 · 14 comments
Labels
enhancement New feature or request type/epic


@nyh
Contributor

nyh commented Aug 28, 2024

Currently, all the Alternator load-balancing implementations in this repository ignore rack (a.k.a. Amazon availability zone, AZ) information: we use the "/localnodes" API to get a list of all live Scylla servers in this data center (a.k.a. Amazon region), and send the request to one of them.

But when the Scylla DC has multiple racks on different Amazon AZs, cross-AZ traffic costs money. It is cheaper for the client running on a specific AZ to send the request to a random node on the same AZ - and not to nodes on other AZs. This issue requests that the load balancers do this: Prefer to send requests to a node on the client's rack, not a node on other racks.

See scylladb/scylladb#12147 for a server-side modification to "/localnodes" that can help us get the list of nodes in the current AZ.
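
As a rough illustration of the client side of this, the sketch below builds a "/localnodes" request URI and extracts node addresses from the response. It rests on assumptions that are not confirmed in this issue: the `rack` query parameter name is hypothetical (it depends on how scylladb/scylladb#12147 ends up being implemented), the response body is assumed to be a flat JSON array of address strings, and the `LocalNodes` class and port 8000 are made up for the example.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for talking to the "/localnodes" endpoint. */
public final class LocalNodes {
    // Matches one double-quoted string that contains no escapes, e.g. "10.0.0.1".
    private static final Pattern QUOTED = Pattern.compile("\"([^\"\\\\]*)\"");

    private LocalNodes() {}

    /**
     * Builds the request URI. With a null or empty rack, this is the plain
     * DC-wide "/localnodes" request; otherwise a rack filter is appended.
     * ASSUMPTION: the "rack" parameter name is illustrative only.
     */
    public static URI uri(String baseUrl, String rack) {
        String url = baseUrl + "/localnodes";
        if (rack != null && !rack.isEmpty()) {
            url += "?rack=" + URLEncoder.encode(rack, StandardCharsets.UTF_8);
        }
        return URI.create(url);
    }

    /**
     * Extracts addresses from a response body.
     * ASSUMPTION: the body is a flat JSON array of strings such as
     * ["10.0.0.1","10.0.0.2"]; a real client would use a JSON library.
     */
    public static List<String> parse(String json) {
        List<String> nodes = new ArrayList<>();
        Matcher m = QUOTED.matcher(json);
        while (m.find()) {
            nodes.add(m.group(1));
        }
        return nodes;
    }
}
```

For example, `LocalNodes.uri("http://10.0.0.1:8000", "us-east-1a")` would request only the nodes of one rack, while passing `null` as the rack keeps today's DC-wide behavior.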

Beyond server-side modifications, the following two points will also need to be considered:

  1. If the client itself is not balanced across the different AZs (e.g., is only running on a single AZ), using current-AZ-only load balancing would be inefficient as it would lead the Scylla nodes on this AZ to be more loaded than others. If we can't recognize this situation automatically, we should at least make rack-aware load balancing optional.
  2. If the cluster has just 3 nodes across 3 AZs, then each AZ has just a single node. If this node is temporarily down, listing the nodes on this AZ will return no nodes. So whenever the current-AZ node list is empty, we should fall back to getting all the nodes in the whole DC, and use that list.
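
The two points above can be sketched as a small selection routine: rack awareness is an opt-in flag, and an empty same-rack list falls back to the DC-wide list. This is only a minimal sketch; the `RackAwareNodeSelector` class and its method names are hypothetical, and it assumes the caller has already fetched both node lists.

```java
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical helper: decides which node list to balance over. */
public final class RackAwareNodeSelector {
    private RackAwareNodeSelector() {}

    /**
     * Returns the same-rack nodes when rack awareness is enabled and at least
     * one same-rack node is live; otherwise returns all live nodes in the DC.
     */
    public static List<String> candidates(boolean rackAware,
                                          List<String> rackNodes,
                                          List<String> dcNodes) {
        if (rackAware && rackNodes != null && !rackNodes.isEmpty()) {
            return rackNodes;
        }
        // Rack awareness disabled, or the local rack has no live node.
        return dcNodes == null ? List.of() : dcNodes;
    }

    /** Picks a random node from the candidate list. */
    public static String pick(boolean rackAware,
                              List<String> rackNodes,
                              List<String> dcNodes) {
        List<String> c = candidates(rackAware, rackNodes, dcNodes);
        if (c.isEmpty()) {
            throw new IllegalStateException("no live nodes known");
        }
        return c.get(ThreadLocalRandom.current().nextInt(c.size()));
    }
}
```

With a 3-node, 3-AZ cluster, `candidates(true, List.of(), dcNodes)` returns the full DC list when the single local node is down, so requests keep flowing at the price of cross-AZ traffic.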
@dkropachev
Collaborator

@nyh, I don't know the details of the Alternator implementation, but I know that it uses LWT under the hood. Recently we were asked to make sure that the regular gocql drivers avoid load-balancing logic for LWT, because it creates extra congestion on the server side: if two queries that target the same PK land on two different replicas, the replicas have to go through a reconciliation process to serialize the queries properly.
So I wonder if the same applies to Alternator.

If so, then we need to do token-aware load balancing as well to get more performance out of the cluster.

@nyh
Contributor Author

nyh commented Oct 1, 2024

@dkropachev this is true - we have #11 for a token-aware load balancer, but as I noted there, there is a difficulty: it will mean we'll need to monkey-patch the AWS SDK at a different place than we do today, to let it see the full query - and parse it (unfortunately) - to decide where to route a write request (for reads, LWT is not relevant).

We also have scylladb/scylladb#5703 on the Scylla side, which says that if the AWS SDK isn't token-aware (as is the case today), we can mitigate the contention problem by forwarding writes to the "right" node.

But you're right - if the load balancer is rack-aware (as this issue proposes), and different racks will send writes to different nodes, we will end up with more LWT contention. I don't know what to do about this - other than making rack-aware load balancing optional. Personally, I think the LWT whole-partition-contention problems need to be fixed (scylladb/scylladb#16261) instead of trying to work around them in the load balancer.
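
To make the routing difficulty concrete, here is a rough sketch of what a client-side workaround could look like. It is not token-aware routing: it only recognizes write operations from the DynamoDB `X-Amz-Target` header and sends all writes for one partition key to the same node, chosen by hashing the key over the sorted DC-wide node list. The `WriteAffinity` class and its methods are hypothetical, the hash has nothing to do with Scylla's partitioner or replica placement, and extracting the partition key from the request body (the hard part mentioned above) is assumed to have happened already.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: keeps writes for one partition key on one node. */
public final class WriteAffinity {
    // DynamoDB API operations that modify items.
    private static final Set<String> WRITE_OPS =
            Set.of("PutItem", "UpdateItem", "DeleteItem", "BatchWriteItem");

    private WriteAffinity() {}

    /** True if an X-Amz-Target value such as "DynamoDB_20120810.PutItem" names a write. */
    public static boolean isWrite(String amzTarget) {
        if (amzTarget == null) {
            return false;
        }
        String op = amzTarget.substring(amzTarget.lastIndexOf('.') + 1);
        return WRITE_OPS.contains(op);
    }

    /**
     * Maps a partition key to one node of the DC-wide list. The list is sorted
     * first so that clients on different racks agree on the mapping.
     * ASSUMPTION: String.hashCode() is a stand-in, not Scylla's token function,
     * so the chosen node is not necessarily a replica for the key.
     */
    public static String nodeForKey(String partitionKey, List<String> dcNodes) {
        if (dcNodes.isEmpty()) {
            throw new IllegalStateException("no live nodes known");
        }
        List<String> sorted = new ArrayList<>(dcNodes);
        Collections.sort(sorted);
        int index = Math.floorMod(partitionKey.hashCode(), sorted.size());
        return sorted.get(index);
    }
}
```

Reads could keep using the rack-local list, while writes would go through `nodeForKey` and accept some cross-AZ traffic. The mapping shifts whenever the live node list changes, which is one reason a server-side fix is more robust than this kind of workaround.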

@mykaul
Contributor

mykaul commented Oct 2, 2024

CC @kostja - thoughts?

@dkropachev
Collaborator

If we plan to enable LWT on tablets, then either we need to bring drivers to the Alternator load balancers, or we need to expose routing info via an API.

@mykaul
Contributor

mykaul commented Oct 29, 2024

@dkropachev - any progress on this? (regardless of tablets)

@mykaul
Contributor

mykaul commented Nov 4, 2024

@dkropachev ?

@dkropachev
Collaborator

@mykaul, it is scheduled for the next sprint

@mykaul
Contributor

mykaul commented Nov 5, 2024

> @mykaul, it is scheduled for the next sprint

It'd be great if it can be prioritized and delivered sooner. It has a material impact.

@mykaul
Contributor

mykaul commented Nov 19, 2024

> > @mykaul, it is scheduled for the next sprint
>
> It'd be great if it can be prioritized and delivered sooner. It has a material impact.

@dkropachev, @roydahan - what's the status of this?

@nyh
Contributor Author

nyh commented Nov 19, 2024

> @dkropachev, @roydahan - what's the status of this?

@mykaul do you know which of the languages and SDK versions that we support you want this feature to appear in first?

@mykaul
Contributor

mykaul commented Nov 19, 2024

> > @dkropachev, @roydahan - what's the status of this?
>
> @mykaul do you know which of the languages and SDK versions that we support you want this feature to appear in first?

Java for sure. Not sure about SDK version.

@roydahan
Collaborator

The first PRs are only for Java.

@dkropachev
Collaborator

> > @dkropachev, @roydahan - what's the status of this?
>
> @mykaul do you know which of the languages and SDK versions that we support you want this feature to appear in first?

Java, both SDK versions.

@nyh
Contributor Author

nyh commented Dec 1, 2024

Implemented in Java, see pull request #40.
Not yet implemented in other languages.
