-
Notifications
You must be signed in to change notification settings - Fork 528
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move source mapping to model processing #5631
Conversation
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
Trends 🧪 |
4c88168
to
a73185e
Compare
Source mapping transformation is rewritten as a model.BatchProcessor, moved out of the model package and into the sourcemap package. One side-effect of the move to a process, worth debating, is that source mapping now occurs in the HTTP handler goroutine rather than in the publisher, which could increase request latency slightly. On the other hand this means that source mapping for more clients than there are CPUs can now happen concurrently (the publisher is limited to the number of CPUs in this regard); and the handlers will block for up to one second anyway if the publisher is busy/queue is full. If a stack frame cannot be mapped, we no longer set `sourcemap.error` or log anything. Just because RUM and source mapping is enabled, does not mean that all stacks _must_ be source mapped; therefore these "errors" are mostly just noise. Likewise we now only set `sourcemap.updated` when it is true.
/test |
The failures highlight an obvious issue, which is that if Elasticsearch is unavailable it should not prevent ingestion. When we have ditched the old direct-to-Elasticsearch source map implementation this won't be an isssue, but for now we'll need to continue logging and recording these errors in |
If there is an error fetching a source map, we once again set `sourcemap.error` on the frame and log the error at debug level. The logging is now using zap's rate limited logging, rather than storing in a map.
I reinstated the logging/recording of source map fetching errors. It is possible for the processor to take a significant amount of time due to Elasticsearch search retries, e.g. when the server is unavailable. In a realistic setup I can't see moving this from the publisher to the handler making a practical difference. If the delay occurs in the publisher, then its queue would back up and cause the handler to timeout. I have added a configurable timeout for the sourcemap processor, defaulting to 5s. This is currently ineffective due to elastic/go-elasticsearch#300. In the future when we're fully on Fleet, the catalogue of source maps will be pushed to APM Server. I think it would then be reasonable for APM Server to eagerly fetch and cache these, rather than fetching them on demand. Then the fetching will not be in the critical path of the handler, and it can just use them if they're in memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One small change and a question; but LGTM!
@@ -46,8 +47,14 @@ var ( | |||
validateError = monitoring.NewInt(registry, "validation.errors") | |||
) | |||
|
|||
// AddedNotifier is an interface for notifying of sourcemap additions. | |||
// This is implemented by sourcemap.Store. | |||
type AddedNotifier interface { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation for exporting this interface? It doesn't cost anything to export it, just curious since the interface isn't being referenced as an argument or struct member anywhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's used in the Handler function below. Am I misunderstanding your question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going to merge as-is, happy to update if you think there's an improvement to be had.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's an improvement, just wondering your personal opinion about exporting or not exporting an interface. In this case, AddedNotifier
is only referenced once in the project, as an argument in the below function in the same package. A non-exported addedNotifier
would also have worked, so I was curious if there was a specific motivation behind the choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. My preference is to always have the interface exported if it is referenced by some exported type/function. IMO you should be able to understand how to use the interface just by clicking through the godoc UI, without delving into source.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/test |
* Move source mapping to model processing Source mapping transformation is rewritten as a model.BatchProcessor, moved out of the model package and into the sourcemap package. One side-effect of the move to a process, worth debating, is that source mapping now occurs in the HTTP handler goroutine rather than in the publisher, which could increase request latency slightly. On the other hand this means that source mapping for more clients than there are CPUs can now happen concurrently (the publisher is limited to the number of CPUs in this regard); and the handlers will block for up to one second anyway if the publisher is busy/queue is full. If a stack frame cannot be mapped, we no longer set `sourcemap.error` or log anything. Just because RUM and source mapping is enabled, does not mean that all stacks _must_ be source mapped; therefore these "errors" are mostly just noise. Likewise we now only set `sourcemap.updated` when it is true. * Reintroduce `sourcemap.error` If there is an error fetching a source map, we once again set `sourcemap.error` on the frame and log the error at debug level. The logging is now using zap's rate limited logging, rather than storing in a map. * Introduce sourcemap timeout config * Remove unnecessary fleetCfg param (cherry picked from commit ac3dc27) # Conflicts: # changelogs/head.asciidoc
* Move source mapping to model processing (#5631) * Move source mapping to model processing Source mapping transformation is rewritten as a model.BatchProcessor, moved out of the model package and into the sourcemap package. One side-effect of the move to a process, worth debating, is that source mapping now occurs in the HTTP handler goroutine rather than in the publisher, which could increase request latency slightly. On the other hand this means that source mapping for more clients than there are CPUs can now happen concurrently (the publisher is limited to the number of CPUs in this regard); and the handlers will block for up to one second anyway if the publisher is busy/queue is full. If a stack frame cannot be mapped, we no longer set `sourcemap.error` or log anything. Just because RUM and source mapping is enabled, does not mean that all stacks _must_ be source mapped; therefore these "errors" are mostly just noise. Likewise we now only set `sourcemap.updated` when it is true. * Reintroduce `sourcemap.error` If there is an error fetching a source map, we once again set `sourcemap.error` on the frame and log the error at debug level. The logging is now using zap's rate limited logging, rather than storing in a map. * Introduce sourcemap timeout config * Remove unnecessary fleetCfg param (cherry picked from commit ac3dc27) # Conflicts: # changelogs/head.asciidoc * Delete head.asciidoc Co-authored-by: Andrew Wilkins <[email protected]>
Confirmed with BC2 |
Motivation/summary
Source mapping transformation is rewritten as a
model.BatchProcessor
, moved out of the model package and into the sourcemap package. Moving such logic out of the model package takes us another step towards generating the model types andbeat.Event
transformation.One side-effect of the move to a process, worth debating, is that source mapping now occurs in the HTTP handler goroutine rather than in the publisher, which could increase request latency slightly. On the other hand this means that source mapping for more clients than there are CPUs can now happen concurrently (the publisher is limited to the number of CPUs in this regard); and the handlers will block for up to one second anyway if the publisher is busy/queue is full.
If a stack frame cannot be mapped, we no longer set
sourcemap.error
or log anything. Just because RUM and source mapping is enabled, does not mean that all stacks must be source mapped; therefore these "errors" are mostly just noise. Likewise we now only setsourcemap.updated
when it is true.I removed the
TestSourcemapCacheExpiration
system test as I believe it is redundant. In the test immediately above that one we prove that caching is effective, and cache expiration is unit tested.Checklist
- [ ] Documentation has been updatedHow to test these changes
sourcemap.error
orsourcemap.updated
setRelated issues
Prerequisite for #4120 and #3565
Closes #4958