Enhancement Request - automatic time-bucket conflict resolution #57

Open
PauloAugusto-Asos opened this issue Apr 18, 2017 · 2 comments

PauloAugusto-Asos commented Apr 18, 2017

Enhancement Request

Requesting that the plugin automatically resolve time-bucket conflicts.

If we send two or more data points to the same "series" with the same timestamp, e.g.:

  • same Host "tag",
  • same HTTP Method "tag",
  • same HTTP Status response "tag",
  • same URL Path "tag",
  • for the exact same time,
  • duration / time-taken as value/data point,

InfluxDB will simply overwrite those data points, keeping only the last one received. This is quite likely to happen on high-traffic websites, where the same server responds to more than one identical request within a second while request/response times are stored with only one-second granularity.
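The overwrite behaviour can be illustrated with a small simulation. This is only a sketch, not InfluxDB code: it assumes a point is identified by its measurement, tag set, and timestamp, and models storage as a dictionary keyed on that identity. The measurement name `access_log`, the tag names and values, and the helper `write_point` are all hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical in-memory stand-in for a series store. The key models the
# assumed point identity: (measurement, sorted tag set, timestamp in seconds).
PointKey = Tuple[str, Tuple[Tuple[str, str], ...], int]


def write_point(store: Dict[PointKey, float], measurement: str,
                tags: Dict[str, str], timestamp_s: int,
                duration_ms: float) -> None:
    """Store a point; a later write with the same identity replaces the earlier one."""
    key = (measurement, tuple(sorted(tags.items())), timestamp_s)
    store[key] = duration_ms


store: Dict[PointKey, float] = {}
tags = {"host": "web01", "method": "GET", "status": "200", "path": "/home"}

# Three requests handled within the same second (second-level timestamps).
for duration in (12.0, 48.0, 7.0):
    write_point(store, "access_log", tags, 1492518896, duration)

print(len(store))            # 1: only one point survives
print(list(store.values()))  # [7.0]: the last write wins
```

Two of the three durations are gone, which is the loss described above.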

Proposal:

output {
    influxdb {
        # Option 1: changes the timestamps 12:34:56, 12:34:56, 12:34:56
        # to 12:34:56.001, 12:34:56.002, 12:34:56.003.
        time_conflict_resolver => "AddMillisecond"

        # Option 2: same, but at the microsecond level
        # (potentially also the same at the nanosecond level).
        # time_conflict_resolver => "AddMicrosecond"

        # Option 3: adds the following InfluxDB "tags" to each conflicting
        # data point, and only to those: qwerty=1, qwerty=2, qwerty=3
        # time_conflict_resolver => "AddNewTag"
        # time_conflict_resolving_tag => "qwerty"
    }
}

The "AddNewTag" option creates new series, but it's my favorite, as it doesn't change the timestamp.
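The proposed resolvers could behave roughly as in the sketch below. It is not plugin code: the function `resolve_conflicts`, the point layout (`tags`, `time_ns`, `value`), and the `AddNanosecond` name are assumptions made for illustration. Only points that actually conflict are modified, and the n-th conflicting point gets either an offset of n units or the tag value n, matching the examples in the proposal.

```python
from collections import Counter
from typing import Any, Dict, List, Tuple

Point = Dict[str, Any]  # hypothetical layout: {"tags": dict, "time_ns": int, "value": float}

# Offset per resolver, in nanoseconds. "AddNanosecond" is an assumed name.
OFFSETS_NS = {"AddMillisecond": 1_000_000, "AddMicrosecond": 1_000, "AddNanosecond": 1}


def _identity(point: Point) -> Tuple[Tuple[Tuple[str, str], ...], int]:
    """Points with the same tag set and timestamp are treated as conflicting."""
    return (tuple(sorted(point["tags"].items())), point["time_ns"])


def resolve_conflicts(points: List[Point], resolver: str,
                      resolving_tag: str = "qwerty") -> List[Point]:
    """Return copies of the points with conflicts resolved by the chosen strategy."""
    group_sizes = Counter(_identity(p) for p in points)
    position: Counter = Counter()
    resolved = []
    for point in points:
        key = _identity(point)
        new_point = {"tags": dict(point["tags"]),
                     "time_ns": point["time_ns"],
                     "value": point["value"]}
        if group_sizes[key] > 1:  # leave non-conflicting points untouched
            position[key] += 1
            n = position[key]
            if resolver == "AddNewTag":
                new_point["tags"][resolving_tag] = str(n)
            elif resolver in OFFSETS_NS:
                new_point["time_ns"] += n * OFFSETS_NS[resolver]
            else:
                raise ValueError(f"unknown resolver: {resolver}")
        resolved.append(new_point)
    return resolved


base_ns = 1492518896 * 1_000_000_000
tags = {"host": "web01", "method": "GET", "status": "200", "path": "/home"}
batch = [{"tags": tags, "time_ns": base_ns, "value": v} for v in (12.0, 48.0, 7.0)]

shifted = resolve_conflicts(batch, "AddMillisecond")
print([(p["time_ns"] - base_ns) // 1_000_000 for p in shifted])  # [1, 2, 3]

tagged = resolve_conflicts(batch, "AddNewTag")
print([p["tags"]["qwerty"] for p in tagged])  # ['1', '2', '3']
```

This sketch resolves conflicts within a single batch only. A real implementation would also have to decide how to handle conflicts across batches, and what to do when a shifted timestamp lands on one that is already in use.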

PauloAugusto-Asos (Author) commented

And just to confirm: I can indeed see strong signs that web access log entries are being lost. For each server I'm consistently getting 4 entries every second:

  • same URL;
  • one for each of the HTTP methods GET, POST, DELETE, and PUT.

Meanwhile, at least one server peaked at ~150 similar requests within the same second according to its access logs. A huge number of log entries are getting lost due to the time-bucket conflict.
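One way to gauge the loss is to count how many access-log entries share the same identity within the same second. The sketch below assumes the entries are already parsed into (host, method, status, path, epoch second) tuples. The helper `count_overwritten` is hypothetical, and the sample data is made up to mirror the shape of the report (about 150 requests in one second, four methods).

```python
from collections import Counter
from typing import Iterable, Tuple

Entry = Tuple[str, str, str, str, int]  # host, method, status, path, epoch second


def count_overwritten(entries: Iterable[Entry]) -> Tuple[int, int]:
    """Return (surviving, overwritten), assuming last-write-wins per identical entry key."""
    counts = Counter(entries)
    total = sum(counts.values())
    surviving = len(counts)
    return surviving, total - surviving


# Made-up sample: 150 requests in one second, spread over four methods.
methods = ["GET", "POST", "DELETE", "PUT"]
sample = [("web01", methods[i % 4], "200", "/api/item", 1492518896)
          for i in range(150)]

print(count_overwritten(sample))  # (4, 146)
```

Under these assumptions only 4 of the 150 entries would be stored, which is consistent with seeing 4 entries per second per server.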

PauloAugusto-Asos commented Apr 19, 2017

I was losing plenty of requests to time-bucket conflicts. Although the current behaviour is still suitable for many use cases, it didn't suit mine, so I had to go back to outputting to Elasticsearch instead :(.

I really hope you guys are able to improve this so I can get back to outputting to InfluxDB. Since I changed the output from InfluxDB to Elasticsearch, it has been chewing through the server's disk space like crazy... :(
