chore: fix flaky test when leader peer terminates
This commit should fix a flaky test that exercises a leadership change
after the previous leader terminates.

The issue happens because the unit test stops the leader with
`GenServer.stop/1`, but the peer process runs under the test supervisor.
In certain circumstances, the supervisor is fast enough to restart the
peer, and the replacement process ends up as the new leader.
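
The restart behavior described above can be seen in isolation with a
plain `Supervisor`. This is a minimal sketch, not code from Oban:
`RestartDemo` and `RestartDemo.Worker` are hypothetical names, and the
worker is a stand-in for a peer process. It assumes the default
`restart: :permanent` child spec that `use GenServer` generates.

```elixir
defmodule RestartDemo.Worker do
  # Hypothetical stand-in for a peer process. `use GenServer` provides a
  # default child spec with `restart: :permanent`.
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  @impl true
  def init(opts), do: {:ok, opts}
end

defmodule RestartDemo do
  # Returns the pid the supervisor currently reports for the worker child,
  # or :undefined / :restarting when no process is running.
  def child_pid(sup) do
    [{RestartDemo.Worker, pid, :worker, _modules}] = Supervisor.which_children(sup)
    pid
  end

  # Polls until the supervisor reports a live pid different from `old_pid`.
  def await_replacement(sup, old_pid, retries \\ 100) when retries > 0 do
    case child_pid(sup) do
      pid when is_pid(pid) and pid != old_pid ->
        pid

      _other ->
        Process.sleep(10)
        await_replacement(sup, old_pid, retries - 1)
    end
  end
end

{:ok, sup} = Supervisor.start_link([{RestartDemo.Worker, []}], strategy: :one_for_one)
old_pid = RestartDemo.child_pid(sup)

# Stopping the process directly leaves its child spec active, so the
# supervisor starts a replacement with a new pid.
:ok = GenServer.stop(old_pid)
new_pid = RestartDemo.await_replacement(sup, old_pid)
IO.inspect(new_pid != old_pid, label: "restarted with a new pid")

# Asking the supervisor to terminate the child does not trigger a restart.
:ok = Supervisor.terminate_child(sup, RestartDemo.Worker)
IO.inspect(RestartDemo.child_pid(sup), label: "after terminate_child")
```

In the sketch, the process stopped with `GenServer.stop/1` is replaced,
while the child terminated through the supervisor stays down.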

One way to reproduce the flaky test locally:

```console
for i in (seq 1 10)
  mix test --seed 856757

  if test $status -eq 2
    break
  end
end
```

I'm using the fish shell here; the previous script might not work on
your favorite shell.
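
For a shell-independent variant, the same loop can be sketched in
Elixir. `FlakyRepro` is a hypothetical helper, not part of the
repository. It assumes `mix` is on the `PATH`, that the code runs from
the project root, and that failing tests make `mix test` exit with
status 2, as the fish script above also assumes.

```elixir
defmodule FlakyRepro do
  # Calls `run` up to `attempts` times and returns the number of the first
  # attempt whose exit status is 2, or nil if no attempt fails that way.
  def first_failure(attempts, run) do
    Enum.find(1..attempts, fn _attempt -> run.() == 2 end)
  end

  # Runs `mix test --seed SEED`, streams its output to the terminal, and
  # returns the exit status.
  def mix_test(seed) do
    {_stream, status} =
      System.cmd("mix", ["test", "--seed", Integer.to_string(seed)],
        into: IO.stream(:stdio, :line),
        stderr_to_stdout: true
      )

    status
  end
end
```

Calling `FlakyRepro.first_failure(10, fn -> FlakyRepro.mix_test(856757) end)`
mirrors the fish loop: it stops at the first failing run and returns
that attempt number, or `nil` if all ten runs pass.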

The previous command should produce output like:

```console
Excluding tags: [:skip]

...................................................................................................................................................................................................................................................................................................................
Finished in 4.2 seconds (1.4s async, 2.7s sync)
24 doctests, 7 properties, 278 tests, 0 failures, 2 excluded

Randomized with seed 856757
Excluding tags: [:skip]

................................................................................................................................................................................................................................................................................................

  1) test leadership changes when a peer terminates (Oban.Peers.GlobalTest)
     test/oban/peers/global_test.exs:24
     Expected truthy, got nil
     code: assert Enum.find([peer_1, peer_2] -- [leader], &Global.leader?/1)
     arguments:

         # 1
         [#PID<0.3478.0>]

         # 2
         &Oban.Peers.Global.leader?/1

     stacktrace:
       test/oban/peers/global_test.exs:44: anonymous fn/3 in Oban.Peers.GlobalTest."test leadership changes when a peer terminates"/1
       (oban 2.16.3) test/support/case.ex:73: Oban.Case.with_backoff/4
       test/oban/peers/global_test.exs:43: (test)

..................
Finished in 5.3 seconds (1.4s async, 3.8s sync)
24 doctests, 7 properties, 278 tests, 1 failure, 2 excluded

Randomized with seed 856757
```

Here, `Enum.find` returned `nil`: the new leader's PID is not in the
list of remaining peers. As mentioned before, one theory is that the
test supervisor spawned a new process, and that process ended up being
the leader.
milmazz committed Oct 31, 2023
1 parent 306c79d commit 9bdf8e2
Showing 1 changed file with 6 additions and 2 deletions.
8 changes: 6 additions & 2 deletions test/oban/peers/global_test.exs
@@ -35,8 +35,12 @@ defmodule Oban.Peers.GlobalTest do
     peer_1 = start_supervised!({Peer, name: A, conf: %{conf | name: Oban, node: "web.1"}})
     peer_2 = start_supervised!({Peer, name: B, conf: %{conf | name: Oban, node: "web.2"}})

-    assert leader = Enum.find([peer_1, peer_2], &Global.leader?/1)
-    assert :ok = GenServer.stop(leader)
+    assert {leader, name} =
+             Enum.find([{peer_1, A}, {peer_2, B}], fn {pid, _name} ->
+               Global.leader?(pid)
+             end)
+
+    stop_supervised!(name)

     assert_receive {:notification, :leader, %{"down" => _}}