-
Notifications
You must be signed in to change notification settings - Fork 232
Allow more BalancedConsumers than partitions on a topic #527
Comments
@jianbin-wei If you have a topic with a single partition, I'd suggest using a pair of |
From what I see the behavior of Kafka console consumer is good. The N+1th consumer is idle as its assigned partition is none. When one consumer dies, the N+1th consumer starts to consume (after rebalancing). |
thanks a lot. |
Actually this one is needed for #354 . In that case, for one consumer to consume from multiple topics with different number of partitions, it would be better to idle around and consume if needed. |
Given that this already works as desired for the edge case where consumers are added in quick succession (fixed in #392), and with a quick look at @jianbin-wei could you test if that's good enough, ie just removing pykafka/pykafka/balancedconsumer.py Lines 540 to 541 in 6035a91
|
@yungchin Yes in my simple test removing those two lines is enough. You would need to have regression test done though. |
+1 to the idea of idling extra consumers |
fyi: I've tested this one with 10k messages sent and received by 10 consumers; removing these two lines allows for more consumers than partitions and amount of data produced equals to amount of data consumed with one consumer thread getting exactly one message: balancedconsumer.py line 540 - below two lines need to be removed: Let's verify and merge this; it is a blocker for me at the moment. |
@vitalyli Thanks for that investigation. Mind opening a pull request that removes those lines and adds a test case that verifies that it works as expected? |
Please see PR for it: #554 |
In case of one topic with one partition, if two balanced consumers are created within the same group for the topic, the later one would raise exception. When the first balanced consumer is down, messages are not consumed anymore
However, in Java kafka client, the later kafka consumer would continue to consume the topic's messages. The behavior is more resilient.
PyKafka version: 2.3.1
Kafka version: 0.8.2.1
The text was updated successfully, but these errors were encountered: