Switch to ReplicaOrdering.RANDOM for select LBPs #32

Merged: 1 commit, Nov 19, 2024
@@ -103,22 +103,25 @@ public JavaDriverClient(StressSettings settings, List<String> hosts, int port, E
     private LoadBalancingPolicy loadBalancingPolicy(StressSettings settings)
     {
         LoadBalancingPolicy ret = null;
+        ReplicaOrdering replicaOrdering = null;

         if (settings.node.rack != null) {
             RackAwareRoundRobinPolicy.Builder policyBuilder = RackAwareRoundRobinPolicy.builder();
             if (settings.node.datacenter != null)
                 policyBuilder.withLocalDc(settings.node.datacenter);
             policyBuilder = policyBuilder.withLocalRack(settings.node.rack);
             ret = policyBuilder.build();
+            replicaOrdering = ReplicaOrdering.NEUTRAL;

So does this mean that mixing a rack-aware policy with tablets would be imbalanced?

And would that need to be fixed on the driver end?

@Bouncheck (Contributor, Author), Nov 19, 2024:

All round-robin policies (rack, DC) used with TokenAwarePolicy can be imbalanced with tablets when using neutral ordering. I think this combination should not be used if we want the load to be as balanced as possible. Long story short: say we have RF=3 and 6 nodes [A,B,C,D,E,F], and the tablets are spread evenly but only across A, B, and C (this can happen while the cluster is growing). If round robin happens to start at D, E, F, or A, the request hits replica A first. As a result, A gets 4/6 of the load, while B and C get 1/6 each. However, if RF=3 and the cluster has only A, B, and C, the load is nearly perfectly balanced.

I'll try to post a comment with a broader explanation of what happens in scylladb/scylladb#19107, to better illustrate this issue with neutral ordering.

The rack-aware policy just does not work correctly with random ordering (it ignores rack awareness in favor of the local DC), so it has to stay neutral for now. This is the part that needs fixing on the driver's end.
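
The 4/6, 1/6, 1/6 split described above can be reproduced with a small standalone simulation. This is only a sketch under simplifying assumptions, not the driver's implementation: nodes are plain indices (0 = A ... 5 = F), every request targets the same replica set A, B, C, the child policy is an idealized round robin that advances its start position by one per request, and "random" ordering is modeled as a uniform pick among the replicas. The class and method names (`ReplicaOrderingSketch`, `firstHopCounts`) are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

public class ReplicaOrderingSketch {

    /**
     * Counts which node is contacted first for each request.
     * Assumption: the replicas are the nodes with index < replicaCount.
     */
    static int[] firstHopCounts(int nodeCount, int replicaCount,
                                boolean randomOrdering, int requests, long seed) {
        int[] counts = new int[nodeCount];
        Random rng = new Random(seed);
        for (int request = 0; request < requests; request++) {
            int first = -1;
            if (randomOrdering) {
                // Modeled RANDOM ordering: any replica is equally likely to go first.
                first = rng.nextInt(replicaCount);
            } else {
                // Modeled NEUTRAL ordering: walk the round-robin plan from its
                // current start and take the first node that is a replica.
                int start = request % nodeCount;
                for (int step = 0; step < nodeCount && first < 0; step++) {
                    int candidate = (start + step) % nodeCount;
                    if (candidate < replicaCount) {
                        first = candidate;
                    }
                }
            }
            counts[first]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // 6 nodes, RF=3, tablets only on A, B, C.
        System.out.println("neutral: " + Arrays.toString(firstHopCounts(6, 3, false, 6000, 42L)));
        System.out.println("random:  " + Arrays.toString(firstHopCounts(6, 3, true, 6000, 42L)));
        // 3 nodes, RF=3: neutral ordering is balanced again.
        System.out.println("neutral, 3 nodes: " + Arrays.toString(firstHopCounts(3, 3, false, 6000, 42L)));
    }
}
```

In this model, neutral ordering with 6 nodes sends 4000 of 6000 first hops to A and 1000 each to B and C, because round-robin starts at D, E, and F all fall through to A. With only 3 nodes each replica gets 2000.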

         } else {
             DCAwareRoundRobinPolicy.Builder policyBuilder = DCAwareRoundRobinPolicy.builder();
             if (settings.node.datacenter != null)
                 policyBuilder.withLocalDc(settings.node.datacenter);
             ret = policyBuilder.build();
+            replicaOrdering = ReplicaOrdering.RANDOM;
         }
         if (settings.node.isWhiteList)
             ret = new WhiteListPolicy(ret, settings.node.resolveAll(settings.port.nativePort));
-        return new TokenAwarePolicy(ret, ReplicaOrdering.NEUTRAL);
+        return new TokenAwarePolicy(ret, replicaOrdering);
     }

public PreparedStatement prepare(String query)