Downsampling Task that works on all data types without hardcoded filters #296

Open
R-Studio opened this issue Feb 1, 2022 · 1 comment
Labels
Template Request: Request a new Template to help you solve a specific need or use-case

Comments

@R-Studio

R-Studio commented Feb 1, 2022

I know the downsampling task examples, but I don't want to change the downsampling tasks and their hardcoded filters every time we add new data to our InfluxDB. More information below:

What I want
I want to downsample all of my data (with mean) from my raw bucket "telegraf" (frequency = 30s, retention 30 days) like this:

  • telegraf -> telegraf_90d (frequency = 1h, retention 90 days)
  • telegraf_90d -> telegraf_365d (frequency = 12h, retention 365 days)
    ..

What I get / Error
Unfortunately, I get the following error:
could not execute task run: unsupported input type for mean aggregate: string

What is the cause?
Some of my collectors, such as NetApp Harvest, sometimes write strings or booleans to "_value". The mean aggregate only accepts numeric input, which is why I get the error above.

What I want to prevent / What is my goal

  • I want a downsampling task that works with all supported data types without hardcoding them.
  • I don't want to exclude every _measurement or _field that has strings in "_value".

Acceptable Workaround (when we don't find any solution)
Exclude from downsampling all data whose "_value" is not numeric.
As a quick fix, I tried to exclude all non-numeric data with a regex, but this doesn't work, and I haven't received any help so far (influxdata/flux#3804).

-> But I would be very happy if we found a real solution to my issue and not just a workaround.

My downsampling task (one of them)

option task = {name: "task_telegraf_90d", every: 1h}

data = from(bucket: "telegraf")
	|> range(start: -duration(v: int(v: task.every) * 2))
	|> filter(fn: (r) =>
		(r._measurement =~ /.*/))

data
	|> aggregateWindow(fn: mean, every: 1h)
	|> filter(fn: (r) =>
		(exists r._value))
	|> to(bucket: "telegraf_90d", org: "MYORG")
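
One possible direction, as a minimal sketch only: split the stream by value type and aggregate each part with a function that supports it. This assumes a Flux version that includes the `types` package and its `types.isType()` function, which may be newer than the Flux version bundled with InfluxDB 2.0.5. Using `last` for strings and booleans is an assumption made for illustration; it is not something the task examples prescribe.

```flux
import "types"

option task = {name: "task_telegraf_90d", every: 1h}

data = from(bucket: "telegraf")
    |> range(start: -duration(v: int(v: task.every) * 2))

// Numeric fields: downsample with mean.
// Assumption: types.isType() is available in the installed Flux version.
data
    |> filter(fn: (r) =>
        types.isType(v: r._value, type: "float")
            or types.isType(v: r._value, type: "int")
            or types.isType(v: r._value, type: "uint"))
    |> aggregateWindow(fn: mean, every: task.every, createEmpty: false)
    |> to(bucket: "telegraf_90d", org: "MYORG")

// Strings and booleans: mean is undefined for these types,
// so keep the last value of each window instead (illustrative choice).
data
    |> filter(fn: (r) =>
        types.isType(v: r._value, type: "string")
            or types.isType(v: r._value, type: "bool"))
    |> aggregateWindow(fn: last, every: task.every, createEmpty: false)
    |> to(bucket: "telegraf_90d", org: "MYORG")
```

With `createEmpty: false`, empty windows are not emitted, so the `exists r._value` filter is no longer needed. Dropping the second pipeline gives the "exclude all non-numeric data" workaround without naming any measurement or field.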

Additional Information
InfluxDB: Version 2.0.5
VM: 8 vCores & 128GB Memory
I also posted the same question in the InfluxData Community.

This is extremely important for us! I would appreciate any help.

R-Studio added the Template Request label Feb 1, 2022
@johntdyer

yea, Influx's story around rolling up data has always left a lot to be desired IMHO
