Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure the span run-context for outgoing http requests does not spill into user code #2470

Merged
merged 8 commits into from
Dec 1, 2021

Conversation

trentm
Copy link
Member

@trentm trentm commented Nov 25, 2021

This fixes the 'http' and 'https' instrumentation for outgoing requests
to not have the 'http' span context be active in user code. This ensures
that user code cannot create a child span of the http span, which would
(a) be misleading and (b) cause problems for coming exit span and
compressed span work.

Refs: #2430

Checklist

  • Implement code
  • Add tests
  • Add CHANGELOG.asciidoc entry

…ill into user code

This fixes the 'http' and 'https' instrumentation for outgoing requests
to not have the 'http' span context be active in user code. This ensures
that user code cannot create a child span of the http span, which would
(a) be misleading and (b) cause problems for coming exit span and
compressed span work.

Refs: #2430
@trentm trentm self-assigned this Nov 25, 2021
@github-actions github-actions bot added the agent-nodejs Make available for APM Agents project planning. label Nov 25, 2021
@apmmachine
Copy link
Contributor

apmmachine commented Nov 25, 2021

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2021-11-30T23:01:16.106+0000

  • Duration: 20 min 40 sec

  • Commit: 9ca8dbb

Test stats 🧪

Test Results
Failed 0
Passed 22
Skipped 0
Total 22

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run module tests for <modules> : Run TAV tests for one or more modules, where <modules> can be either a comma separated list of modules (e.g. memcached,redis) or the string literal ALL to test all modules

  • run benchmark tests : Run the benchmark test only.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@trentm
Copy link
Member Author

trentm commented Nov 25, 2021

The tests were failing on this (with some added console.log('XXX ...') to try to show what is going on):

% node  test/central-config-enabled.test.js
TAP version 13
# remote config enabled
XXX making proto.request in http-request
XXX making proto.request in http-request
XXX making proto.request in http-request
ok 1 mock apm-server got central config request
ok 2 agent conf for key transactionSampleRate was updated to expected value
ok 3 agent conf for key transactionMaxSpans was updated to expected value
ok 4 agent conf for key captureBody was updated to expected value
ok 5 agent conf for key transactionIgnoreUrls was updated to expected value
ok 6 agent conf for key logLevel was updated to expected value
# remote config enabled: receives comma delimited
XXX making proto.request in http-request
XXX instrumentation not yet started
/Users/trentm/el/apm-agent-nodejs6/lib/instrumentation/http-shared.js:172
      const spanRunContext = parentRunContext.enterSpan(span)
                                              ^

TypeError: Cannot read property 'enterSpan' of null
    at Object.request (/Users/trentm/el/apm-agent-nodejs6/lib/instrumentation/http-shared.js:172:47)
    at httpRequest (/Users/trentm/el/apm-agent-nodejs6/lib/http-request.js:117:21)
    at getMetadataGcp (/Users/trentm/el/apm-agent-nodejs6/lib/cloud-metadata/gcp.js:28:15)
    at /Users/trentm/el/apm-agent-nodejs6/lib/cloud-metadata/index.js:73:9
    at processTicksAndRejections (internal/process/task_queues.js:79:11)

The issue is that we are accidentally instrumenting the outgoing http request attempts being done for cloud metadata fetching... and doing so with the agent in a state here:

  • the require-in-the-middle hook is active, but
  • the Instrumentation has not started.
    This state is only possible when the Agent class is being re-used in the same process, which some test files like "test/central-config-enabled.test.js" are doing.

a fix

A possible fix is to tweak "lib/http-request.js" (as we already did for the elastic-apm-http-client in elastic/apm-nodejs-http-client#161) to get early references to http.request and https.request so that the agent doesn't instrument itself. This kills two birds with one stone: we will also no longer get the following occasional abortive attempts to instrument the cloud-metadata fetching:

[2021-11-25T00:29:20.480Z] DEBUG (elastic-apm-node): intercepted call to http.request {"id":null}
[2021-11-25T00:29:20.481Z] DEBUG (elastic-apm-node): no active transaction found - cannot build new span

There was a bug in the https instrumentation in older versions of node
(version <9.0.0) where the instrumentation of `https.request` relied
on intercepting `http.request` (that Node's `https.request()` would
call). However, the agent didn't guarantee that the 'http' module was
instrumented. A user program that used `https.request` without
indirectly `require('http')`ing would not get an HTTP span.

Also change some test fixtures to use {http,https}.request() usage
to a form supported in versions of node before v10.9.0 to be able
to test with node v8.
@trentm trentm marked this pull request as ready for review November 25, 2021 21:01
@trentm trentm requested a review from astorm November 25, 2021 21:01
@trentm trentm changed the title fix: ensure the span run-context for output http requests does not spill into user code fix: ensure the span run-context for outgoing http requests does not spill into user code Nov 25, 2021
Copy link
Contributor

@astorm astorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Approving.

I see here that we've taken steps to ensure the request methods used by the agent are uninstrumented functions.

I also see that we're binding a request's callback to the run context that was active when the request was made.

I see we're invoking the original request using a new run context created via the parent run context's enterSpan method.

Finally, in the response event handler, I see we're ensuring the response is bound to the parent's run context.

@trentm trentm merged commit 751b9b2 into master Dec 1, 2021
@trentm trentm deleted the trentm/run-context-http branch December 1, 2021 21:28
trentm added a commit that referenced this pull request Dec 1, 2021
…spill into user code (#2470)

This fixes the 'http' and 'https' instrumentation for outgoing requests
to not have the 'http' span context be active in user code. This ensures
that user code cannot create a child span of the http span, which would
(a) be misleading and (b) cause problems for coming exit span and
compressed span work.

Also, fix a bug in the https instrumentation in older versions of node
(version <9.0.0) where the instrumentation of `https.request` relied on
intercepting `http.request` (that Node's `https.request()` would call). 
The agent didn't guarantee that the 'http' module was instrumented. 
A user program that used `https.request` without indirectly
`require('http')`ing would not get an HTTP span.

Refs: #2430
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent-nodejs Make available for APM Agents project planning.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants