A customer finds multiple fullsync coordinator workers running simultaneously on each of two clusters. This causes multiple fullsync schedules to run concurrently; the actual fullsync operations may or may not overlap, but each coordinator is active and has its own timer.
This state is reproducible as follows:
1. Set up two clusters, A and B.
2. Set up replication and connect the clusters (cluster manager on 0.0.0.0:9080).
3. Set `fullsync_on_connect` to `true` (unclear whether this step is required).
4. Push continuous load onto cluster A.
5. Start fullsync with A as the source and B as the sink.
6. While fullsync is running, join one or more new nodes to A.
7. On every node, `riak attach` and run `supervisor:count_children(whereis(riak_repl2_fscoordinator_sup)).`
8. Observe that the worker count is greater than 0 on more than one node (see the sketch after this list). In my test, workers were present on both the original coordinator node and the newly joined node.
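Since the stray coordinator can land on any member, it is convenient to run the check from step 7 across the whole cluster from a single `riak attach` session. The following is a minimal sketch using only standard OTP calls (`rpc:call/4`, `supervisor:count_children/1`) and the supervisor name from the steps above; the output is whatever `count_children/1` returns on each node.

```erlang
%% Run inside `riak attach` on any one node: repeats the
%% supervisor:count_children/1 check on every cluster member so stray
%% coordinator supervisors show up in a single listing.
Nodes = [node() | nodes()],
[{N, rpc:call(N, supervisor, count_children, [riak_repl2_fscoordinator_sup])}
 || N <- Nodes].
%% In a healthy cluster at most one node reports {workers, W} with W > 0.
%% This issue shows non-zero worker counts on two or more nodes; nodes where
%% the supervisor is not registered return {badrpc, {'EXIT', {noproc, _}}}.
```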
The workaround for this issue is to manually kill the `riak_repl2_fscoordinator_sup` process on every node, as follows:

1. Stop and disable fullsync.
2. Wait a few minutes.
3. On each node, `riak attach` and run `Pid = whereis(riak_repl2_fscoordinator_sup).` followed by `erlang:exit(Pid, kill).` (see the sketch after this list).
4. Wait a few minutes.
5. Re-enable and start fullsync.
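Step 3 can be made slightly more defensive so it does not crash on nodes where the supervisor is not registered. Below is a minimal sketch of that shell sequence, run from `riak attach` on each node and using nothing beyond the `whereis/1` and `erlang:exit/2` calls from the workaround above.

```erlang
%% Run from `riak attach` on every node after fullsync has been stopped and
%% disabled. exit(Pid, kill) cannot be trapped, so the supervisor and any
%% lingering coordinator children terminate immediately.
case whereis(riak_repl2_fscoordinator_sup) of
    undefined -> no_fscoordinator_sup_on_this_node;
    Pid       -> erlang:exit(Pid, kill)
end.
```

The parent replication supervisor should then restart a fresh, empty `riak_repl2_fscoordinator_sup`, which is why fullsync can be re-enabled and started afterwards.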
The symptoms of this issue are extremely slow fullsync operations, cluster overload and general slowness, and fullsync activity in the logs when no fullsync ought to be running.