-
Notifications
You must be signed in to change notification settings - Fork 440
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUG] Error getting metrics profile: period between metrics collection is too small #2863
Comments
github-actions
bot
added
the
needs-triage
New issues that have not yet been triaged
label
Sep 13, 2024
nsrip-dd
added
profiler
and removed
needs-triage
New issues that have not yet been triaged
labels
Sep 13, 2024
Thanks for the report. I'll see if we can handle this better. |
nsrip-dd
added a commit
that referenced
this issue
Sep 13, 2024
The metrics profiler wants to have at least one second between collections to get meaningful numbers. It will return an error if it ran less than a second from the last time. However, it's normal to stop it early when shutting down the profiler. The error in this case isn't useful. Don't even bother enabling the metrics profiler if the period is shorter than one second. And if the profiler is stopped, don't log that error. Fixes #2863
nsrip-dd
added a commit
that referenced
this issue
Sep 19, 2024
The metrics profiler insisted on at least one second between collections for two reasons: - To avoid a division by zero because it was doing integer division to convert a time.Duration to seconds, which will truncate to 0, as a ratio in a subsequent computation - In case "a system clock issue causes time to run backwards" The profiler would report an error if less than one second elapsed between collections. In practice, this resulted in misleading error logs because it's entirely likely for profiling to be stopped less than a second after the last profile collection. The restriction was not really even needed. For one, we can just do floating-point division rather than integer division to avoid the truncation problem. Also, Go has had monotonic time support by default since 2017, added in Go 1.9, and time comparison operations including time.Time.Sub, work with respect to monotonic time. We shouldn't have any issues with negative periods. We can ensure the period is positive just as a defensive measure, and fail if it's negative since this may indicate a bug in the Go runtime if it's violating the monotonicity guarantees. Fixes #2863
MNThomson
pushed a commit
to agilebits/dd-trace-go
that referenced
this issue
Oct 21, 2024
…taDog#2865) The metrics profiler insisted on at least one second between collections for two reasons: - To avoid a division by zero because it was doing integer division to convert a time.Duration to seconds, which will truncate to 0, as a ratio in a subsequent computation - In case "a system clock issue causes time to run backwards" The profiler would report an error if less than one second elapsed between collections. In practice, this resulted in misleading error logs because it's entirely likely for profiling to be stopped less than a second after the last profile collection. The restriction was not really even needed. For one, we can just do floating-point division rather than integer division to avoid the truncation problem. Also, Go has had monotonic time support by default since 2017, added in Go 1.9, and time comparison operations including time.Time.Sub, work with respect to monotonic time. We shouldn't have any issues with negative periods. We can ensure the period is positive just as a defensive measure, and fail if it's negative since this may indicate a bug in the Go runtime if it's violating the monotonicity guarantees. Fixes DataDog#2863
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
When our service is terminating, the corresponding dd profiler is terminated as well via
profiler.Stop()
This results in an interupt of the sleep period of the profiler in (which may on some occasions be smaller then 1 sec)
Version of dd-trace-go
v1.67.1
Describe what happened:
Datadog Tracer v1.67.1 ERROR: Error getting metrics profile: period between metrics collection is too small min=1s observed=573.458102ms; skipping. (occurred: 13 Sep 24 09:49 UTC)
Describe what you expected:
No error when the service is shutting down and the profiler is terminating (even if it just reported its metrics)
Steps to reproduce the issue:
Additional environment details (Version of Go, Operating System, etc.):
go 1.23.1
The text was updated successfully, but these errors were encountered: