[fix][client] Prevent retry topic and dead letter topic producer leaks when sending of message fails #23824

Merged — 9 commits merged into apache:master on Jan 9, 2025

Conversation

@lhotari (Member) commented on Jan 8, 2025

Fixes #20635

Motivation

The client currently leaks producers: when sending a message fails, a new producer is created after the exception occurs, but the previous producer is never closed. In addition, such failures should trigger a retry backoff so that they don't cause excessive load on the broker side.

Modifications

  • properly close retry topic and dead letter topic producers when they are discarded
  • add a backoff to the creation of new producers when failures repeat, and reset the backoff once a message is sent successfully (a rough sketch of this approach follows below)
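For illustration, here is a minimal sketch of the intended behavior (not the code in this PR; the holder class, method names, and delay values below are made up for the example):

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;
import org.apache.pulsar.client.api.Producer;

// Hypothetical holder that closes a discarded dead letter producer and tracks a retry backoff.
class DeadLetterProducerHolder<T> {
    private CompletableFuture<Producer<T>> deadLetterProducer;
    private long backoffMillis = 100;                       // assumed initial delay
    private static final long MAX_BACKOFF_MILLIS = 60_000;  // assumed cap

    // Reuse the existing producer if present, otherwise create one via the supplied factory.
    synchronized CompletableFuture<Producer<T>> getOrCreate(
            Supplier<CompletableFuture<Producer<T>>> factory) {
        if (deadLetterProducer == null) {
            deadLetterProducer = factory.get();
        }
        return deadLetterProducer;
    }

    // On a send failure: close the old producer instead of just dropping the reference
    // (the leak), and return the delay to wait before the next creation attempt.
    synchronized long discardAfterFailure() {
        if (deadLetterProducer != null) {
            deadLetterProducer.thenAccept(Producer::closeAsync);
            deadLetterProducer = null;
        }
        long delay = backoffMillis;
        backoffMillis = Math.min(backoffMillis * 2, MAX_BACKOFF_MILLIS);
        return delay;
    }

    // On a successful send: reset the backoff so later failures start from the initial delay.
    synchronized void onSendSuccess() {
        backoffMillis = 100;
    }
}
```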

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

@lhotari requested a review from heesung-sn on January 8, 2025 17:30
@Technoboy- (Contributor) left a comment

LGTM

@shibd (Member) left a comment

Left a small comment.

I also have a question about retrying the creation of the deadLetterProducer or retryProducer.

  1. For the schema exceptions mentioned in the ticket, retries won't solve the issue, right? So what's the point?
  2. If there are network issues, the Producer has automatic retries internally.

I'm wondering why we need to implement a retry mechanism here, or why the original logic sets deadLetterProducer = null. If creation truly fails, shouldn't the user handle it?

@lhotari (Member, Author) commented on Jan 9, 2025

Left a small comment.

I also have a question about retrying the creation of the deadLetterProducer or retryProducer.

  1. For the schema exceptions mentioned in the ticket, retries won't solve the issue, right? So what's the point?

Thanks for the review feedback, @shibd.

Correct. This PR doesn't try to resolve that. The point is that it's a way to reproduce the issue reported in #20635 (comment). The test case includes that comment too, but I guess I should have been more verbose about it.

  2. If there are network issues, the Producer has automatic retries internally.

I'm wondering why we need to implement a retry mechanism here, or why the original logic sets deadLetterProducer = null. If creation truly fails, shouldn't the user handle it?

That's a valid point. The original intention of this PR was to address the producer leak without changing the existing logic. However, the original logic isn't great in the first place: a new producer is created whenever an exception occurs during message sending.
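To make the pattern concrete, here is a simplified sketch of the kind of logic being discussed (not the actual Pulsar client code; the class, field, and method names are made up for the example):

```java
import java.util.concurrent.CompletableFuture;
import org.apache.pulsar.client.api.Producer;

class LeakySendExample {
    private volatile CompletableFuture<Producer<byte[]>> deadLetterProducer;

    void sendToDeadLetterTopic(byte[] payload) {
        deadLetterProducer
                .thenCompose(producer -> producer.sendAsync(payload))
                .exceptionally(ex -> {
                    // Dropping the reference without closing the producer: the next delivery
                    // attempt creates a fresh producer while the old one is leaked.
                    deadLetterProducer = null;
                    return null;
                });
    }
}
```

Closing the old producer before clearing the reference (or not re-creating it at all) is what avoids the leak.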

@shibd (Member) commented on Jan 9, 2025

hi, @lhotari

Thanks for the discussion. So, can we directly remove the logic that sets deadLetterProducer = null when a creation exception occurs?

This would simplify things, as re-creating the producer doesn't seem very meaningful to me.

@lhotari (Member, Author) commented on Jan 9, 2025

hi, @lhotari

Thanks for the discussion. So, can we directly remove the logic that sets deadLetterProducer = null when a creation exception occurs?

This would simplify things, as re-creating the producer doesn't seem very meaningful to me.

@shibd Sure, that's a question that comes to mind. There seems to be a reason why it was reset to null in the first place; I'll check the commit history to find out.

@lhotari force-pushed the lh-dlq-and-retry-producer-leaks branch from 9500a90 to 678aadf on January 9, 2025 13:15
@lhotari requested a review from shibd on January 9, 2025 13:15
@shibd (Member) left a comment

/LGTM

@lhotari merged commit 04e89fe into apache:master on Jan 9, 2025
51 of 52 checks passed
lhotari added a commit that referenced this pull request Jan 9, 2025
…s when sending of message fails (#23824)

(cherry picked from commit 04e89fe)
lhotari added a commit that referenced this pull request Jan 9, 2025
…s when sending of message fails (#23824)

(cherry picked from commit 04e89fe)
lhotari added a commit that referenced this pull request Jan 9, 2025
…s when sending of message fails (#23824)

(cherry picked from commit 04e89fe)
nikhil-ctds pushed a commit to datastax/pulsar that referenced this pull request Jan 10, 2025
…s when sending of message fails (apache#23824)

(cherry picked from commit 04e89fe)
(cherry picked from commit b6beb90)
Development

Successfully merging this pull request may close these issues.

[Bug] pulsar keep creating dead letter queue producer and exceed the maximum limit
3 participants