-
-
Notifications
You must be signed in to change notification settings - Fork 7.2k
fix: Explicitly pushing status=down now bypasses retry logic and goes directly to DOWN
#6595
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
@CommanderStorm would you review this PR as well? for two issues fix |
CommanderStorm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please split this into separate PRs? We try to keep it to one PR = one logical change.
The second change looks promising (though likely breaking), but the first isn’t quite ready yet. Keeping them separate will make review and decision-making much easier.
4f78377 to
4467102
Compare
|
@CommanderStorm this is splitted into one PR |
DOWN
CommanderStorm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but won't be merged into v2 due to likely being a breaking change
Co-authored-by: Frank Elsinga <[email protected]>
Co-authored-by: Frank Elsinga <[email protected]>
|
Hi @CommanderStorm when would you launch the v3.0? 😉 |
|
This will take a while. If you want, you can revise it onto the v3.0 branch and we can merge it there |
|
Is there a way to opt back in to explicitly push a pending status? Something like a We currently use the DOWN status with retries to differ between "command did not run" and "command ran but was not successful" while keeping the time metric of those failed runs. One such use case is a command where I monitor disk space capacity of a server to warn on usage above e.g. 80% where I misuse the time metric as the disk usage percentage. Graph shows 0-100ms but means 0-100% disk used. |
Yes, if you push PS: I know this is a breaking change for you, I still want to give the ability to fail the check instead of having basically 2x pending (one named "pending", one named "down") for the monitor. |
Awesome, that is all I need.
No worries. As long as it is clearly communicated, I see no issues with that. A fix on my end is easy and can be applied even before upgrading to the v3.0. I do prefer the difference between
I also saw this issue while searching for mine: #4785.
Alternatively the |
Breaking changes, no matter how nice, need to be done in a semver major. |
Introducing a fith status (no matter if it is partial, retrying or whatever) would involve a fairly large amount of work, as most of our internal code and notification providers aren’t currently designed with this in mind. Supporting it properly would require changes across multiple areas. Given my current review capacity, I’m unfortunately not able to take on or oversee additional work of this scope right now. For some additional background and context, please see: There are also some larger refactorings (for example starting to increase our test coverage) needed in how our internal components interact before we could realistically consider something like this. |
ℹ️ To keep reviews fast and effective, please make sure you’ve read our pull request guidelines
📝 Summary of changes done and why they are done
status=downnow bypasses retry logic and goes directly to DOWNProblem
When a user explicitly pushes
status=downto a push monitor with retries configured, the monitor incorrectly goes to PENDING instead of DOWN. Users expect that explicitly pushing down should immediately mark the monitor as DOWN.Reproduction:
Retries = 3status=downexplicitly:curl "http://localhost:3001/api/push/TOKEN?status=down"Root Cause
The
determineStatus()function applies retry logic to all DOWN statuses, regardless of whether the status was explicitly provided by the user or resulted from a timeout/missed push.Solution
Added
isExplicitDownflag that bypasses retry logic whenstatus=downis explicitly pushed:Behavior Changes
status=downwith retries📋 Related issues
📄 Checklist
Please follow this checklist to avoid unnecessary back and forth (click to expand)
I understand that I am responsible for and able to explain every line of code I submit.
📷 Screenshots or Visual Changes
Contribution by Gittensor, see my contribution statistics at https://gittensor.io/miners/details?githubId=42954461