[fix] [broker] fix incorrect delete sub when lastActive do not update. #21692
@thetumbled Could you add a test for this case? Thanks
Why do we update the cursor on every check?
If the consumer does not commit the offset for any reason, the ...
In this case the consumer is connected, so how could the code reach pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentTopic.java, lines 2982 to 2985 in 8beac8b?
If broker1 shuts down ungracefully, the topic will be loaded on broker2 without updating the ...
I pushed a new commit that adds the test code. PTAL, thanks.
PTAL, thanks. @dao-jun @lhotari @Technoboy- @poorbarcode @coderzc @codelipenghui
PersistentSubscription sub = persistentTopic.getSubscription("sub1");

// shutdown pulsar ungracefully
// disable the updateLastActive method to simulate the ungraceful shutdown
You want to mock an ungraceful shutdown, but the exact behavior you mocked is below:
- the result of cursor.updateLastActive() will not be persisted.
- the result of sub.cursor.updateLastActive() you added at this line will be persisted.
This mock is wrong.
We can't guarantee that the updated timestamp has been persisted at the time of a crash. This PR fixes most cases, because the timestamp is updated every time the broker runs the expiration check.
Strictly speaking, though, it can't fix all of them: if persisting the cursor info always fails, the incorrect subscription deletion still occurs.
Incorrectly deleting a subscription has serious consequences, such as message duplication or data loss. In my opinion, a missed deletion is better than an incorrect one, which is what PR #22794 tries to do.
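To make the persistence concern concrete, here is a minimal sketch of why an in-memory update does not survive a crash. CursorModel and all of its methods are hypothetical names invented for illustration; this is not Pulsar's ManagedCursor API, and the real persistence path is more involved.

```java
// Hypothetical model, not Pulsar code: the timestamp is updated in memory
// and only becomes durable after a successful persist step.
class CursorModel {
    private long inMemoryLastActive;
    private long persistedLastActive;

    CursorModel(long initialLastActive) {
        this.inMemoryLastActive = initialLastActive;
        this.persistedLastActive = initialLastActive;
    }

    // Updates the in-memory value only.
    void updateLastActive(long nowMillis) {
        inMemoryLastActive = nowMillis;
    }

    // Copies the in-memory value to the durable one; returns false when
    // storage is unavailable (assumed failure mode for this sketch).
    boolean persist(boolean storageAvailable) {
        if (!storageAvailable) {
            return false;
        }
        persistedLastActive = inMemoryLastActive;
        return true;
    }

    // Simulates a crash followed by a reload on another broker:
    // only the persisted value is visible afterwards.
    CursorModel recoverAfterCrash() {
        return new CursorModel(persistedLastActive);
    }

    long getLastActive() {
        return inMemoryLastActive;
    }
}
```

In this model, a cursor that is updated but never successfully persisted comes back after recovery with its old timestamp, which is the situation in which an expiration check could still delete the subscription.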

// wait for 1min, but consumer is still connected all the time.
// so subscription should not be deleted.
Thread.sleep(60000);
This sleep is too long.
It is, but the minimum expiration time is 1 min, so we have to wait for 1 min to trigger the bug.
Update last active when a consumer disconnects instead of on every check.
Closed, as #22794 has been merged.
Motivation
PR #17573 fixed the cases where the consumer commits offsets periodically: it updates the lastActive field of the cursor whenever the consumer commits an offset.
But there is still a corner case in which the consumer does not commit the offset for some reason. For example, when there is no content in the topic to read, the consumer reasonably does not commit any offset. In such a case we have no reason to delete the subscription.
In any case, we should be cautious with the delete operation, which could result in data duplication or loss.
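The corner case can be sketched with a simplified model. SubscriptionModel, InactiveSubscriptionChecker, and their fields are hypothetical names for illustration only, and the sketch assumes the check can see whether a consumer is connected; it is not the actual PersistentTopic code.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical model, not Pulsar's classes.
class SubscriptionModel {
    long lastActiveMillis;   // stands in for the cursor's lastActive field
    int connectedConsumers;  // consumers currently attached

    SubscriptionModel(long lastActiveMillis, int connectedConsumers) {
        this.lastActiveMillis = lastActiveMillis;
        this.connectedConsumers = connectedConsumers;
    }
}

class InactiveSubscriptionChecker {
    final Map<String, SubscriptionModel> subscriptions = new HashMap<>();

    // Timestamp-only check: a connected consumer that never commits an
    // offset leaves lastActive stale, so its subscription gets removed.
    int expireByTimestampOnly(long nowMillis, long expirationMillis) {
        int before = subscriptions.size();
        subscriptions.values()
                .removeIf(s -> nowMillis - s.lastActiveMillis > expirationMillis);
        return before - subscriptions.size();
    }

    // Check that refreshes lastActive for subscriptions with a connected
    // consumer before comparing, so only truly idle ones expire.
    int expireWithRefresh(long nowMillis, long expirationMillis) {
        int removed = 0;
        Iterator<SubscriptionModel> it = subscriptions.values().iterator();
        while (it.hasNext()) {
            SubscriptionModel s = it.next();
            if (s.connectedConsumers > 0) {
                s.lastActiveMillis = nowMillis;
                continue;
            }
            if (nowMillis - s.lastActiveMillis > expirationMillis) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }
}
```

With a subscription last active at time 0, one connected consumer, and a 60-second expiration, a check at 120 seconds removes it under the timestamp-only variant and keeps it under the refreshing variant.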
Modifications
Update the lastActive field when checking for inactive subscriptions.
Verifying this change
(Please pick either of the following options)
This change is a trivial rework / code cleanup without any test coverage.
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
doc
doc-required
doc-not-needed
doc-complete
Matching PR in forked repository
PR in forked repository: thetumbled#31