Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kafka(consumer): Improve AtLeastOnce handling #125

Merged
merged 8 commits into from
May 5, 2023

Conversation

marclop
Copy link
Contributor

@marclop marclop commented May 3, 2023

Improves how AtLeastOnceDelivery is handled by the kafka consumer. Previously, the consumer could discard up to cfg.MaxPollRecords, since it incorrectly assumed that unacknowledged records would be re-delivered by Kafka if not committed within a certain amount of time.

Now, it discards any records which can't be processed, but continues processing the rest of the records in the fetch. It can still result in some data loss, and is at the mercy of the failures returned by the processor, but it's much better than previously.

Additionally, adds a few tests which cover both AtLeastOnceDelivery and AtMostOnceDelivery, and a TestGracefulShutdown test.

Partially addresses #118
Part of #123

Improves how `AtLeastOnceDelivery` is handled by the kafka consumer.
Previously, the consumer could discard up to `cfg.MaxPollRecords`, since
it incorrectly assumed that unacknowledged records would be re-delivered
by Kafka if not committed within a certain amount of time.

Now, it discards any records which can't be processed, but continues
processing the rest of the records in the fetch. It can still result in
some data loss, and is at the mercy of the failures returned by the
processor, but it's much better than previously.

Additionally, adds a few tests which cover both `AtLeastOnceDelivery`
and `AtMostOnceDelivery`, and a `TestGracefulShutdown` test.

Partially addresses elastic#118
Part of elastic#123

Signed-off-by: Marc Lopez Rubio <[email protected]>
@marclop marclop requested a review from a team May 3, 2023 10:11
@marclop
Copy link
Contributor Author

marclop commented May 5, 2023

Ran the tests a lot of times locally and 10 times in the CI to verify that there's not any flaky tests left: https://github.com/elastic/apm-queue/actions/runs/4890005702.

@marclop marclop merged commit 5a3ed16 into elastic:main May 5, 2023
@marclop marclop deleted the f/handle-at-least-once-better branch May 5, 2023 04:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants