-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rule for validating dashboards #6
Conversation
var basesch, distsch schema.CueSchema | ||
|
||
func getBaseSchema() (schema.CueSchema, error) { | ||
var err error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bleh, this means only the first call will actually error correctly, defeating the point of a Once.
If we actually want to merge this, i'll fix the problem.
|
||
"cuelang.org/go/cue/errors" | ||
"github.com/grafana/grafana/pkg/schema" | ||
"github.com/grafana/grafana/pkg/schema/load" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these packages apache licensed? This tool can't import AGPL code (its imported by mixtool, which is apache licensed).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 this concern. Seems like the schema bits of grafana could/should/will-eventually be under a more forgiving license?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, looks like we need to update the list. All the CUE-related schema stuff is supposed to be Apache 2 (which is why https://github.com/grafana/grafana/blob/main/LICENSING.md excludes cue
), but looks like i didn't add a line for this subtree in particular when this package was added. Will make a PR, for this and a couple others.
(pkg/schema will become grafana/scuemata, which is A2)
lint/rule_validate.go
Outdated
basesch, err = load.BaseDashboardFamily(load.GetDefaultLoadPaths()) | ||
if err != nil { | ||
panic(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we're panicing and returning an error here; can we do one or the other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sloppy lack of removal of my original approach to this, lol
Thanks Sam! Have you run this against the integrations for cloud? I wonder how many of the dashboards fail... |
This is great! Have been noodling on how to include validation into this. A couple of things stand out as possible improvements to the tool which might make rules like this more palatable.
|
Haven't, just banged this out and put up a PR after trying on some basic JSON
Frankly, right now, the output can be just vomit - see the link in the OP. It seems likely we're going to need our whole own framework for sane error management; not sure how much or when we can reasonably expect upstream CUE to do. (Apart from cue-lang/cue#602, which i linked to on the other issue). This is one of many areas i would love to be able to invest in :) So, multiple error results? Yes, that would be helpful, as once it's not vomit, there will still plausibly be multiple. But...
That's one way to do it. Honestly, validation is a different class of check than linting, fully prerequisite to it. That is, it's usually not worth linting invalid stuff. That won't be true here until the schema are fully developed, of course, but it does suggest that it might be worth making validation into a special, non-extensible operation that - if turned on - is run prior to all other rules. Also, just fyi - this exact same logic is available in |
The direction i imagine this could head, btw, is something like what i described here: https://www.youtube.com/watch?v=PpoS_ThntEM&t=1709s So, you start with |
In mixtool we used an error channel for the lint errors, allowing rules to "return" many: https://github.com/monitoring-mixins/mixtool/blob/master/pkg/mixer/lint.go#L67 Not 100% sure I want to recommend this approach, this is more just an FYI.
Personally I think its fine for this tool to be super opinionated. I considered even arguing that lint rules shouldn't be optional ala golint https://github.com/golang/lint#purpose. Using this linter is completely optional, and part of the reason for is existence IMO is consistency: any dashboard that passes this linter should be constructed and act the same. WDYT? |
Licensing fix: grafana/grafana#42234 |
I agree that this tool should be super opinionated. My concern is in defending the rules and their outcome. For instance, when offering a PR to an upstream mixin or dashboard, and citing that it doesn't pass a given linting rule, I don't want to leave margin for a maintainer to look at required, opinionated rules in the linter and find one that we can't unapologetically defend. At the same time, I think it's useful for there to be rules which we expect may not pass in all cases, or which we're experimenting with that may not be mature. For this functionality in particular, I agree with @sdboyer that we ought not merge validation as a rule. Instead it should be a pre-req that would be run beforehand in a CI toolchain. I expect at some point we'll encounter another type of rule that might fit in this "gray area" tho. 🤔 |
One major thought, here: opinionated isn't a problem, but exclusionary is. From my quick tests, this seems to just assume everything is a Prom datasource. That makes it impossible to use this for anything other than Prom, and by extension, to recommend it to users who are using anything other than Prom. IMO, that goes beyond opinionated. Now, that still wouldn't be a problem, except for the namespace this lives in - |
|
i'm gonna close this one...but FWIW, trying to do the integration here led to many, many improvements. I'll be coming with another PR based on grok sooner rather than later :) |
FYI, this is finally getting addressed in the next couple weeks, and is captured in an issue here #61 I'll likely add you as a reviewer when I start making these changes. |
This adds rules for validating the dashboards using the official upstream schema.
I've marked this as a draft because, after having seen how the flow of this tool actually works, it seems like a sloppy way to integrate this kind of logic - the possibility of false positives, and the messiness of errors, make it maaaaaaybe a good addition (it's good enough that we no longer allow Grafana dashboards to enter
grafana/grafana/devenv
without passing validation), but still not completely canonical and reliable. OTOH, that isn't worse than e.g. false positives arising from expecting only Prom queries.Either way, i figured it was worth putting the PR up at minimum to show how this kind of addition could be made. My guess is that a larger refactor of the application to treat validation as a prerequisite to linting would be a better architecture, and would also probably set us up for when we have Go structs that represent dashboard structures.