Kafka consumer stops without errors #1433

khouloudsa · 2021-11-09T18:21:42Z

Versions used

Scala: 2.12
Akka version: 2.5.31
Alpakka-kafka version: 2.0.7
Consumers are deployed with Kubernetes
ConnectionChecker is activated
Bug observed on Prod

Expected Behavior

The consumer either continues consuming or dies.

Actual Behavior

The consumer stays alive with CURRENT-OFFSET = 386844 but stops consuming.
It continues sending heartbeats, and the consumer metrics show no lag.

Relevant logs

2021-11-08 09:24:09,324 TRACE o.a.k.c.consumer.internals.Fetcher - [Consumer clientId=consumer-user-event-ingestion-2, groupId=user-event-ingestion] Returning fetched records at offset FetchPosition{offset=386844, offsetEpoch=Optional[30]...

2021-11-08 09:24:09,324 DEBUG o.a.k.c.consumer.internals.Fetcher - [Consumer clientId=consumer-user-event-ingestion-2, groupId=user-event-ingestion] Added READ_UNCOMMITTED fetch request for partition user-interaction-0 at position FetchPosition{offset=387110, offsetEpoch=Optional[30],...

2021-11-08 09:24:09,324 DEBUG o.a.k.clients.FetchSessionHandler - [Consumer clientId=consumer-user-event-ingestion-2, groupId=user-event-ingestion] Built incremental fetch (sessionId=1079420081, epoch=1) for node 3. Added (), altered (user-interaction-0), removed () out of (user-interaction-0)

2021-11-08 09:24:09,324 DEBUG o.a.k.c.consumer.internals.Fetcher - [Consumer clientId=consumer-user-event-ingestion-2, groupId=user-event-ingestion] Sending READ_UNCOMMITTED IncrementalFetchRequest(toSend=(user-interaction-0), toForget=(), implied=()) to broker ...

2021-11-08 09:24:09,374 DEBUG o.a.k.clients.consumer.KafkaConsumer - [Consumer clientId=consumer-user-event-ingestion-2, groupId=user-event-ingestion] Pausing partitions [user-interaction-0]

--

GROUP                 TOPIC             PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG
user-event-ingestion  user-interaction  0          386344          387574          1230

--
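For readers comparing the numbers, here is a minimal Scala sketch of the arithmetic behind the listing above. It is not part of the reporter's code: `PartitionOffsets` and `LagCheck` are hypothetical names introduced for illustration. The fetch position 387110 is taken from the DEBUG log line, and it is an assumption that the log lines and the group listing describe roughly the same moment.

```scala
// Hypothetical helper, for illustration only: offsets for one partition.
final case class PartitionOffsets(currentOffset: Long, logEndOffset: Long) {
  // Lag as shown in the group listing: log end offset minus committed offset.
  def lag: Long = logEndOffset - currentOffset

  // Records the client has already fetched past the committed offset.
  def fetchedButUncommitted(fetchPosition: Long): Long =
    fetchPosition - currentOffset
}

object LagCheck {
  // Values copied from the consumer-group listing in this issue.
  val reported: PartitionOffsets =
    PartitionOffsets(currentOffset = 386344L, logEndOffset = 387574L)

  // Next fetch position from the "Added READ_UNCOMMITTED fetch request" log line.
  val fetchPosition: Long = 387110L

  def main(args: Array[String]): Unit = {
    println(s"group lag = ${reported.lag}")
    println(s"fetched but uncommitted = ${reported.fetchedButUncommitted(fetchPosition)}")
  }
}
```

Under that assumption, the client had fetched well past the committed offset before the partition was paused, so some of the lag in the listing may be records that were fetched but never processed or committed.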
The same consumer code is used by other services, where the problem has not been observed.
Event processing in this service takes longer than in the others, so I suspect the consumer may be unable to resume the partition after a long pause.
We also observed event loss after this stop.
I added application logs to this consumer and a supervision strategy that logs exceptions, but nothing shows up.
Any hints, please?

Reproducible Test Case

I have not yet found a scenario that reproduces this.

khouloudsa changed the title ~~Kakfa consumer stops without errors~~ Kafka consumer stops without errors Nov 9, 2021
