-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[elastic-agent] initial instrumentation #29031
[elastic-agent] initial instrumentation #29031
Conversation
This pull request does not have a backport label. Could you fix it @stuartnelson3? 🙏
NOTE: |
This pull request does not have a backport label. Could you fix it @stuartnelson3? 🙏
NOTE: |
💚 Build Succeeded
Expand to view the summary
Build stats
Test stats 🧪
💚 Flaky test reportTests succeeded. 🤖 GitHub commentsTo re-run your PR in the CI, just comment with:
|
Haven't reviewed anything yet, but this PR is exciting! It also covers some diagnostics work that's on my backlog. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good so far!
Not too sure about the span names. Typically I'd name after higher level operations, rather than method names. Maybe that's just my bias though.
x-pack/elastic-agent/pkg/agent/application/upgrade/step_download.go
Outdated
Show resolved
Hide resolved
x-pack/elastic-agent/pkg/agent/application/upgrade/step_download.go
Outdated
Show resolved
Hide resolved
That would be much appreciated! I'm sort of flying blind at the moment as I'm not super familiar with the codebase, so any help would be great. |
This is really exciting! Thanks for taking this on. One thing I'm curious in the context of diagnostics: Assuming someone is not shipping the apm data to ES, would it still be possible to fetch some of it as part of the diagnostics call? Does that even make sense? @axw @stuartnelson3 |
@ruflin, I'm not entirely sure how the apm client works, I think that integration with the diagnostics command should be done as a separate PR. |
@ruflin not out of the box, but it would be possible to do something like this by implementing a custom transport. e.g. we have a transport which decodes events and stores them in memory for testing purposes: https://github.com/elastic/apm-agent-go/blob/master/transport/transporttest/recorder.go |
Very interesting @axw . Is this something that could be enabled "on-demand"? What I have in mind is something like:
@michel-laterman Agree in any case to separate it out in future PR's. |
@ruflin would you mind creating a dedicated issue for this kind of conversation. I believe we can isolate it and don't think we should block making progress with initial instrumentation by this. |
@ruflin please ping me on said issue and I'll reply there so the answers don't get lost. |
@ph i've addressed some comments and written some follow-up questions, can you take a look when you have a chance? also, I'm not sure what the e2e test failure is, it's apparently something with kubernetes ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good to me, @stuartnelson3 @blakerouse can you have an eye on it?
any idea what's up with the packaging failures?
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall the change looks very mechanical, which is good to see. Made it easier to review such a large branch.
I reviewed the initTracer
as that is probably the most important part. Looks good and provides unit test coverage for it.
I think this might have been broken by an unrelated change on master, see #30007. I think there was a follow up fix if it's the same issue I was seeing on other PRs. |
/test |
3 similar comments
/test |
/test |
/test |
/test |
e2e are failing but not caused by this PR. We @elastic/observablt-robots are working on the permanent fix (very probably here elastic/e2e-testing#2064), so I'd say this PR is fine 👍 |
This revert the code of the APM Instrumentation of the Elastic Agent. To unblock the build of and the CI for other team. This would require more investigation to really understand the problem. Fixes elastic/fleet-server#1129
* Revert #29031 This revert the code of the APM Instrumentation of the Elastic Agent. To unblock the build of and the CI for other team. This would require more investigation to really understand the problem. Fixes elastic/fleet-server#1129 * fix make update * fix linter
* Revert #29031 This revert the code of the APM Instrumentation of the Elastic Agent. To unblock the build of and the CI for other team. This would require more investigation to really understand the problem. Fixes elastic/fleet-server#1129 * fix make update * fix linter (cherry picked from commit 718c923)
Co-authored-by: Pier-Hugues Pellerin <[email protected]>
This reverts commit 718c923.
This reverts commit 718c923.
This reverts commit 718c923.
This revert the code of the APM Instrumentation of the Elastic Agent. To unblock the build of and the CI for other team. This would require more investigation to really understand the problem. Fixes elastic/fleet-server#1129
What does this PR do?
Add some initial tracing to elastic-agent. I basically just looked through the code and was looking at lowhanging fruit for wrapping http handlers and clients. I'd love to have some of the EA dev's point out places I've missed that I could add more tracing, and ways to trigger these traces to view them in the UI.
todo:
Why is it important?
This will help in troubleshooting issues as more users start running EA.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.How to test this PR locally
build elastic-agent off this branch, start it under fleet, add the APM integration. the agent should appear as an apm service.
Related issues
closes #28895
Use cases
troubleshooting mis-behaving elastic-agents
Screenshots
(traces are coming from a metricbeat grabbing proxied data from apm-server)