Each new consumer creation takes more time than the previous one #817
Comments
@emmett9001 Quick profiling showed that pykafka gets stuck in |
Thanks @vikt0rs. I'll have to look into this more deeply, but it looks like something we should fix.
This may be helpful: the issue does not occur if the user creates a consumer with |
@vikt0rs Though pykafka should support arbitrary numbers of consumers per thread, I can't think of a situation in which consuming the same topic with more than one consumer in a single thread would be desirable over simply consuming the topic once and distributing the results to multiple downstream consumers of the messages. Is there a reason you're making so many consumers in the same thread?
Well, the reason is quite simple: to simplify the code and get rid of multiple downstream consumers. Sure, this example is synthetic, but it illustrates the issue. In my case, there is a Tornado-based web application that sends messages from Kafka to the user via WebSocket, so a new consumer is created for each new user connection. If this approach is wrong and the current behavior is not a bug, please suggest the proper pattern for this case. Thanks!
@vikt0rs Like I said above, each successive consumer instantiation becoming slower is definitely a bug. That said, for your use case I'd try to read from Kafka with a single thread not directly tied to any particular user and write those messages to shared memory. I'd then have the user-specific logic read from that shared memory instead of directly from Kafka.
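For illustration, the single-reader pattern suggested above might look roughly like the sketch below. This is not code from the thread: the `Broadcaster` class and its method names are hypothetical, and the message source is assumed to be any iterable (in a real application it would be a single pykafka consumer; a plain list stands in for it here so the sketch runs without a broker).

```python
import queue
import threading


class Broadcaster:
    """Hypothetical helper: one reader thread fans messages out to many subscribers."""

    def __init__(self, source, maxsize=1000):
        self._source = source      # any iterable of messages (assumption: e.g. one Kafka consumer)
        self._maxsize = maxsize    # per-subscriber buffer size
        self._subscribers = set()
        self._lock = threading.Lock()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def join(self, timeout=None):
        self._thread.join(timeout)

    def subscribe(self):
        """Register a new per-user queue and return it."""
        q = queue.Queue(maxsize=self._maxsize)
        with self._lock:
            self._subscribers.add(q)
        return q

    def unsubscribe(self, q):
        """Stop delivering messages to the given queue."""
        with self._lock:
            self._subscribers.discard(q)

    def _run(self):
        for message in self._source:
            if message is None:    # skip empty polls, if the source yields them
                continue
            with self._lock:
                subscribers = list(self._subscribers)
            for q in subscribers:
                try:
                    q.put_nowait(message)
                except queue.Full:
                    pass           # slow subscriber: drop rather than block the reader


if __name__ == "__main__":
    # A list stands in for the Kafka consumer in this sketch.
    broadcaster = Broadcaster([b"first", b"second"])
    alice = broadcaster.subscribe()
    bob = broadcaster.subscribe()
    broadcaster.start()
    broadcaster.join()
    print(alice.get_nowait(), bob.get_nowait())
```

With this shape, each new user connection only calls `subscribe()`, which is cheap, and the number of Kafka consumers stays at one regardless of how many users connect. Dropping messages for a full queue is one possible policy; blocking or disconnecting the slow subscriber are alternatives.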
I attempted to replicate this within Parse.ly's internal network and was unable to do so. Consumer instantiation times remain constant up to 100+ consumers in the same process. @vikt0rs Do you have more information available from the profiling test you ran?
Thanks for working on this. Did you use the same snippet for your tests, or did you modify it?
My test snippet is identical to the one posted above with the exception of |
I'm trying to create several consumers with the same consumer_id to allow several clients to read from the same Kafka topic. But for some reason, establishing each new connection takes more and more time.
This happens with pykafka 2.7.0.
See the code snippet that demonstrates the issue and the console output.
Please advise.
Thanks!