[Bug] The Kafka consumer plugin may mix up some messages and attach them to another trace #12482
darkness-2nd asked this question in Q&A
I don't know which side is wrong, but as in many async scenarios, if the poll operation does not keep the thread dedicated to a single process and the thread switches to another poll, the tracing context held in the thread context can certainly get mixed up. You need to check carefully, because if the messages are consumed in bulk mode (N messages per process), then I would say the agent and UI are correct. The consuming span should not be attached to each matching request; instead, the requests are merged as the multiple parents of the consuming span.
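The bulk-mode case described above can be illustrated with a small, self-contained model. This is only a sketch of the idea, not the agent's actual code: `BatchConsumeModel`, `Message`, `ConsumeSpan` and `spanForPoll` are hypothetical names, and the rule that the span is listed under the trace of the first record in the batch is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BatchConsumeModel {

    /** Hypothetical stand-in for a consumed record that carries its producer's trace ID. */
    public static final class Message {
        public final String traceId;
        public final String payload;

        public Message(String traceId, String payload) {
            this.traceId = traceId;
            this.payload = payload;
        }
    }

    /** Hypothetical stand-in for the single consume span created for one poll() call. */
    public static final class ConsumeSpan {
        public final String ownerTraceId;         // trace the span is listed under
        public final List<String> parentTraceIds; // every trace referenced by the batch

        ConsumeSpan(String ownerTraceId, List<String> parentTraceIds) {
            this.ownerTraceId = ownerTraceId;
            this.parentTraceIds = parentTraceIds;
        }
    }

    /** One poll, one span: every record in the batch becomes a parent reference. */
    public static ConsumeSpan spanForPoll(List<Message> batch) {
        if (batch.isEmpty()) {
            throw new IllegalArgumentException("batch must not be empty");
        }
        List<String> parents = new ArrayList<>();
        for (Message m : batch) {
            parents.add(m.traceId);
        }
        // Assumption: the span is shown under the trace of the first record only.
        return new ConsumeSpan(parents.get(0), Collections.unmodifiableList(parents));
    }

    public static void main(String[] args) {
        // Three requests whose messages happen to be returned by the same poll.
        List<Message> batch = Arrays.asList(
                new Message("trace-B", "msg"),
                new Message("trace-A", "msg"),
                new Message("trace-C", "msg"));
        ConsumeSpan span = spanForPoll(batch);
        System.out.println("listed under: " + span.ownerTraceId);
        System.out.println("parents:      " + span.parentTraceIds);
    }
}
```

In this model, trace-A and trace-C show no consume span of their own, while the span under trace-B lists all three trace IDs as parents.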
Apache SkyWalking Component
Java Agent (apache/skywalking-java)
What happened
I built a demo consumer using the Spring annotation KafkaListener, plus a REST API that sends a Kafka message through KafkaTemplate. I set a breakpoint at org.apache.kafka.clients.consumer.KafkaConsumer#poll(long, boolean), before the call to the method "pollForFetches", and then made 5 HTTP requests in the browser. After 1 minute I resumed from the breakpoint. The Kafka agent may then mix up some messages and attach those mixed-up messages to another trace.
The test code is as follows:
@GetMapping("/spring/kafka")
public String testSpringKafka() {
    String traceId = TraceContext.traceId();
    kafkaTemplate.send("TOPIC_DEDUCT_MONEY", "key", "msg");
    return "OK";
}

@KafkaListener(
    topics = {"TOPIC_DEDUCT_MONEY"},
    groupId = "study-account"
)
public void deductStorage(ConsumerRecord<String, String> consumerRecord) {
    String value = consumerRecord.value();
    System.out.printf("deduct money,msg:%s%n", value);
}
I expected each of the 5 messages to be a child span of its own HTTP request, but 2 messages were attached to a request that did not send them.
The traces "3480b2ffc8594b1ab8749fc2af82ef74.82.17219730707690001" and "3480b2ffc8594b1ab8749fc2af82ef74.84.17219730707740001" have no consume span, while the other 3 HTTP requests each have a child consume span.
On the dashboard there are only 3 consume segments. Even more surprising, the consume span "3480b2ffc8594b1ab8749fc2af82ef74.83.17219730707690001" has 2 trace IDs in its detail info, and those trace IDs belong to "3480b2ffc8594b1ab8749fc2af82ef74.82.17219730707690001" and "3480b2ffc8594b1ab8749fc2af82ef74.84.17219730707740001", the traces whose consume spans are missing from the dashboard.
What you expected to happen
I hope these two missing consume spans can be displayed on the dashboard, each under the parent HTTP request span to which it belongs.
How to reproduce
Use the same test code as shown under "What happened" above (the REST endpoint that sends through KafkaTemplate and the KafkaListener consumer).
In Kafka 2.0.x ~ 3.6.x, set a breakpoint at org.apache.kafka.clients.consumer.KafkaConsumer#poll(long, boolean).
In Kafka 3.7.x, set breakpoints at org.apache.kafka.clients.consumer.KafkaConsumer#poll(long) and org.apache.kafka.clients.consumer.KafkaConsumer#poll(java.time.Duration).
Then make more than 5 HTTP requests, wait about 1 minute, resume from the breakpoint, and open the dashboard.
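Why pausing the consumer at poll matters can be sketched with a toy model that uses no Kafka classes. `PausedPollModel`, `send`, `poll` and `run` are hypothetical names, the in-memory queue stands in for one topic partition, and the limit of 500 mirrors the default of the consumer setting `max.poll.records`. A real consumer may split the backlog across several polls depending on fetch timing, so the exact batch sizes here are an assumption.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class PausedPollModel {

    // In-memory stand-in for one partition; each entry is the sender's trace ID.
    private final Deque<String> partition = new ArrayDeque<>();

    public void send(String traceId) {
        partition.addLast(traceId);
    }

    /** Returns up to maxRecords pending entries, like one poll() call. */
    public List<String> poll(int maxRecords) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < maxRecords && !partition.isEmpty()) {
            batch.add(partition.pollFirst());
        }
        return batch;
    }

    /**
     * Simulates `requests` HTTP requests, each sending one message.
     * If the consumer is paused (breakpoint held), nothing is polled until the end.
     */
    public static List<List<String>> run(boolean consumerPaused, int requests) {
        PausedPollModel model = new PausedPollModel();
        List<List<String>> polls = new ArrayList<>();
        for (int i = 1; i <= requests; i++) {
            model.send("trace-" + i);
            if (!consumerPaused) {
                polls.add(model.poll(500)); // consumer keeps up: one record per poll
            }
        }
        if (consumerPaused) {
            polls.add(model.poll(500)); // breakpoint resumed: backlog arrives together
        }
        return polls;
    }

    public static void main(String[] args) {
        System.out.println("running consumer: " + run(false, 5));
        System.out.println("paused consumer:  " + run(true, 5));
    }
}
```

With a running consumer each poll carries a single trace ID, so each consume span has one parent. With the paused consumer the backlog is returned together, which is the situation where one consume span ends up referencing several traces.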