The skip count gets incredibly high in a specific situation with high volume.
The setup is a set of clusters with cascading replication enabled and realtime replication running between the clusters.
Puts are normally made to cluster A. The realtime queue that goes from cluster C to cluster B will then normally see those puts as skips, because B will already be in the routed clusters list.
Until something is delivered, the skip count is not used.
The first time something is put to cluster C, it gets replicated to B without a problem. However, after a significant number of puts are sent from A -> B -> C without any objects going the reverse path (C -> B -> A), the next object sent from C -> B will carry an incredibly large skip count that has built up in the meantime.
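To make the build-up concrete, here is a toy model of a single realtime queue consumer for the C -> B direction. It is only a sketch and not riak_repl code: the class `ToyConsumer`, the method `offer`, and the assumption that the counter is reset when an object is finally delivered are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyConsumer:
    """Hypothetical stand-in for one realtime queue consumer (e.g. C -> B)."""
    remote: str                      # name of the cluster this consumer sends to
    skip_count: int = 0              # skips accumulated since the last delivery
    delivered: list = field(default_factory=list)

    def offer(self, obj_id, routed_clusters):
        """Offer one queued object; return the skip count sent with it, or None if skipped."""
        if self.remote in routed_clusters:
            # The remote cluster has already seen this object, so it is skipped.
            self.skip_count += 1
            return None
        # Assumption: the accumulated skip count is only used, and reset, on delivery.
        reported = self.skip_count
        self.skip_count = 0
        self.delivered.append((obj_id, reported))
        return reported


c_to_b = ToyConsumer(remote="B")

# First put made directly to C: nothing has been skipped yet.
first = c_to_b.offer("local-0", routed_clusters={"C"})

# Many puts arrive via A -> B -> C; B is already in the routed list, so each is a skip.
for i in range(1000):
    c_to_b.offer(f"from-a-{i}", routed_clusters={"A", "B"})

# The next put made directly to C carries the whole backlog of skips.
second = c_to_b.offer("local-1", routed_clusters={"C"})

print(first, second)
```

In this model the skip count grows by one for every object that originated on A, so its size is bounded only by the volume of A -> B -> C traffic between two deliveries in the reverse direction.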
We saw a customer with a skip count of just over 73 million objects. In our attempts to reproduce the issue, we were able to see skip counts rise into the single-digit millions; however, this was not enough to create the latency observed by the customer.
A restart cleared the skip count and the latency returned to normal in the customer cluster.
This issue has a counterpart in the private riak_repl repository. Here is the link.