-
Thanks for submitting this @mbihoop, I like the idea of suppression, which can be super useful in cutting down costs. Perhaps we can make it based off message size too, which might be easier to calculate at high throughput. One note, though: a really common scenario I see in K8s is outputting to stdout, re-ingesting the logs, and then re-outputting to stdout; the comparison wouldn't have much effect in that scenario.
-
Hi, is there anything new on this topic? Is it possible to use a Lua script to suppress duplicate log messages on the fluent-bit side? And if so, did anyone try it? Thanks!
-
Last week, our team demonstrated log forwarding from Kubernetes workloads to Fluent Bit to Splunk. Unfortunately, we set one of the containers to a very verbose mode during the demo and created a deluge of log messages.
Fluent Bit handled the messages like a champ. Splunk, however, was running within a memory-restricted VM on the developer laptop and didn't do so well, as it is a bit of a memory hog. Its memory usage climbed, and within a short time the system locked up and we needed to end the demo.
In the situation I've outlined above, the log messages were all identical, the only difference being the timestamp.
It got me thinking: "Is it possible for Fluent Bit to filter out duplicate messages?"
I did a quick Google search and found the following Ruby language plugins, which were written quite some time ago for fluentd, as opposed to Fluent Bit.
What I can't immediately find in the documentation, despite finding other load-regulation filters such as leaky-bucket, is an equivalent mechanism or extension point for Fluent Bit that would suppress duplicate log events in the following scenario:
"When presented with 100,000 identical, sequential log messages sent over 5 minutes, Fluent Bit will only forward the first message in the sequence to Splunk, then suppress any following messages determined to be identical by some field comparison for a period of, say, 5 minutes following the first message."
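To make the requested behaviour concrete, here is a minimal sketch of the suppression logic from the scenario above, written in plain Python rather than as a Fluent Bit plugin. Everything named here is hypothetical and only for illustration: the `DuplicateSuppressor` class, the `should_forward` method, and the assumption that the message text lives in a `log` field of each record. It forwards the first record for a given key and drops records with the same key until the window (300 seconds by default) measured from that first record has elapsed.

```python
import time
from typing import Any, Callable, Dict, Hashable, Mapping, Optional


class DuplicateSuppressor:
    """Forward the first record per key; drop repeats inside a time window.

    Hypothetical illustration only, not part of Fluent Bit or fluentd.
    """

    def __init__(
        self,
        window_seconds: float = 300.0,
        key_fn: Optional[Callable[[Mapping[str, Any]], Hashable]] = None,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.window_seconds = window_seconds
        # Assumption: records carry the message text in a "log" field.
        # Pass a different key_fn to compare on other fields.
        self.key_fn = key_fn or (lambda record: record.get("log"))
        self.clock = clock
        # Maps each key to the time its current window started.
        self._window_start: Dict[Hashable, float] = {}

    def should_forward(self, record: Mapping[str, Any]) -> bool:
        """Return True if the record should be sent on, False to suppress it."""
        now = self.clock()
        key = self.key_fn(record)
        start = self._window_start.get(key)
        if start is not None and now - start < self.window_seconds:
            # Same key seen inside the window: suppress.
            return False
        # First occurrence, or the previous window has expired.
        self._window_start[key] = now
        self._evict_expired(now)
        return True

    def _evict_expired(self, now: float) -> None:
        """Drop keys whose window has passed so memory stays bounded."""
        expired = [
            key
            for key, start in self._window_start.items()
            if now - start >= self.window_seconds
        ]
        for key in expired:
            del self._window_start[key]
```

The window is anchored at the first message rather than reset by each duplicate, so a continuous flood still produces one forwarded message per window instead of being silenced forever. The `key_fn` hook is where a cheaper comparison, such as the message-size idea mentioned in the replies, could be plugged in. Note that the timestamp must be excluded from the key, since in the scenario it is the only thing that differs between messages.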