-
Notifications
You must be signed in to change notification settings - Fork 234
Demo incorrect check_quorum behaviour #7375
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
backup_ack_timeout_count++; | ||
auto search = all_other_nodes.find(node.first); | ||
if ( | ||
search == all_other_nodes.end() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is always adding one to live nodes in config, is that right? In an atomic reconfiguration situation, the next configuration may not include the next primary, so including it in the count there does not seem obviously correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed: all_other_nodes contains every node in any configuration which is not the primary. So if there is a node in a configuration which is not in all_other_nodes then it is the primary, and hence is live.
I've added a comment to this effect.
swap_nodes,2,out,1,2 | ||
|
||
disconnect_node,0 | ||
|
||
periodic_one,0,100 | ||
periodic_one,1,100 | ||
dispatch_all |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add comments and asserts if possible. We retire nodes 1 and 2, but 0 fails to replicate this update to the other nodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated this, let me know if you think there are enough. More are easy enough to add :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a bug in the CheckQuorum implementation where the leader's quorum check condition was inconsistent with the commit condition. The change ensures that a leader only remains active if it has a committing quorum in every active configuration, not just any configuration. This prevents scenarios where a leader believes it has quorum but is actually unable to commit entries.
Key Changes:
- Modified CheckQuorum logic in
raft.h
to require quorum in all configurations instead of any configuration - Added test scenarios demonstrating the old incorrect behavior and validating the fix
- Introduced new test assertions (
assert_config
,assert_missing_config
) to verify configuration state in tests
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
src/consensus/aft/raft.h |
Changed CheckQuorum from OR logic (quorum in any config) to AND logic (quorum in every config) using std::all_of |
src/consensus/aft/test/driver.h |
Added assert_config and assert_missing_config helper methods for validating node configurations in tests |
src/consensus/aft/test/driver.cpp |
Registered the new assertion commands in the test driver command dispatcher |
tests/raft_scenarios/check_quorum_0_012 |
Test scenario demonstrating transition from single node to three-node cluster with CheckQuorum behavior |
tests/raft_scenarios/check_quorum_012_0 |
Test scenario demonstrating retirement of nodes while partitioned, verifying correct leader stepdown |
Co-authored-by: Copilot <[email protected]>
…e actual state of it during rollbacks
The reason getting trace validation to work required reducing the constraints on IsBecomeFollower is as follows: Actions from raft.h trace:
Required actions from Traceccfraft.tla pov:
So if IsBecomeFollower asserts that the spec state matches the raft.h state, and that it is unchanged, then IsReceiveAppendEntries's UpdateTerm will be in conflict with this. Hence to fix this specific issue we should not roll back the log or increment the term until the message is sent. |
We have CheckQuorum to ensure that a leader should step down if it is not a good leader, as this acts as a good liveness probe of the system at large.
Our CheckQuorum condition is that the leader has a committing quorum in any configuration.
Our commit condition is that the leader has a committing quorum in every configuration.
This PR includes a scenario (check_quorum_2) which demonstrates the difference between these two conditions.
Suppose we have a 3 node cluster with n0 the leader initially, and then retire both n1 and n2 but do not replicate this to them.
Then we partition n0.
Due to CheckQuorum n0 is still a good leader, however it is unable to commit.
While the remainder of the cluster elects a new replacement leader and continues to function.
The aim of the change to raft.h is to make clear what the condition is, as well as fixing this.
Note
The other direction of this bug would be more severe, (going from n0, to n0,n1,n2, check_quorum_3) with n0 being inaccessible but staying alive due to CheckQuorum.
But the backups are unable to elect themselves without a vote from n0, so this isn't a problem in practise.