Skip to content

Commit

Permalink
feat: mark Elasticsearch spans as exit spans; HTTP child spans are no…
Browse files Browse the repository at this point in the history
…w elided (#2557)

feat: mark Elasticsearch spans as exit spans; HTTP child spans are now elided

Elasticsearch (ES) client spans are no marked as exit spans, so their HTTP
child spans are now no longer generated per spec:
https://github.com/elastic/apm/blob/main/specs/agents/tracing-spans.md#exit-spans

Some HTTP context has been added to ES spans:
- span.context.http.status_code
- span.context.http.response.encoded_body_size

Closes: #2000
Closes: #2484
  • Loading branch information
trentm authored Feb 3, 2022
1 parent cff1465 commit 10f533e
Show file tree
Hide file tree
Showing 7 changed files with 228 additions and 98 deletions.
14 changes: 14 additions & 0 deletions CHANGELOG.asciidoc
Original file line number Diff line number Diff line change
Expand Up @@ -41,6 +41,20 @@ Notes:
run-context of callbacks. See {issues}2498[#2498] and the
<<release-notes-3.26.0,v3.26.0 release notes>> which has a similar change.
* Elasticsearch spans (from `elasticsearch`, `@elastic/elasticsearch`, and
`@elastic/elasticsearch-canary` instrumentation) will no longer have an HTTP
child span(s) for the underlying HTTP request. This is listed in this section
to provide awareness in case some users have custom analysis of APM trace
data that expects those HTTP spans.
Per https://github.com/elastic/apm/blob/main/specs/agents/tracing-spans.md#exit-spans[the APM Agent spec for exit spans],
Elasticsearch spans are now marked as exit spans and as a result, HTTP child
spans are suppressed. ({issues}2000[#2000])
As part of this change, some HTTP context has been added to Elasticsearch
spans, when available: the HTTP response `status_code`, and the size of the
response body (`encoded_body_size`). ({issues}2484[#2484])
[float]
===== Features
Expand Down
111 changes: 88 additions & 23 deletions lib/instrumentation/modules/@elastic/elasticsearch.js
Original file line number Diff line number Diff line change
Expand Up @@ -7,23 +7,25 @@
// to hook into all ES server interactions.
//
// Limitations:
// - In @elastic/elasticsearch >=7.14 <8, the HTTP span for ES spans made before
// the product-check is finished will have an incorrect parent.
// - In @elastic/elasticsearch >=7.14 <8, the diagnostic events sent for ES
// spans started before the product-check is finished will have an incorrect
// `currentSpan`.
//
// An Elasticsearch (ES) request typically results in a single HTTP request to
// the server. The APM agent creates an HTTP span that is a child of the ES
// span. For some of the later 7.x versions of @elastic/elasticsearch there is
// a product-check "GET /" that blocks the *first* request to the server. This
// results in:
// ES span
// |- ES span (the product check)
// | `- HTTP span (GET /)
// `- HTTP span (GET /...)
// This is fine so far. However, if any ES requests are made before that
// product-check completes, then their HTTP requests are effectively queued.
// When they *do* run, the async context of each HTTP request is that of the
// initial "HTTP span (GET /...)". Currently the APM agent is not patching
// for this.
// the server. For some of the later 7.x versions of @elastic/elasticsearch
// there is a product-check "GET /" that blocks the *first* request to the
// server. The handling of ES requests are effectively queued until that
// product-check is complete. When they *do* run, the async context is that
// of the initial ES span. This means that `apm.currentSpan` inside an ES
// client diagnostic event for these queued ES requests will be wrong.
// Currently the APM agent is not patching for this.
//
// - When using the non-default `asyncHooks=false` APM Agent option with
// @elastic/elasticsearch >=8 instrumentation, the diagnostic events do not
// have the async run context of the current span. There are two impacts:
// 1. Elasticsearch tracing spans will not have additional "http" context
// about the underlying HTTP request.
// 2. Users cannot access `apm.currentSpan` inside a diagnostic event handler.

const semver = require('semver')

Expand All @@ -44,21 +46,39 @@ module.exports = function (elasticsearch, agent, { version, enabled }) {
// would re-call `this.request(...)` inside a Promise.
const doubleCallsRequestIfNoCb = semver.lt(version, '7.7.0')
const ins = agent._instrumentation
const isGteV8 = semver.satisfies(version, '>=8', { includePrerelease: true })

agent.logger.debug('shimming elasticsearch.Transport.prototype.{request,getConnection}')
shimmer.wrap(elasticsearch.Transport && elasticsearch.Transport.prototype, 'request', wrapRequest)
shimmer.wrap(elasticsearch.Transport && elasticsearch.Transport.prototype, 'getConnection', wrapGetConnection)
shimmer.wrap(elasticsearch, 'Client', wrapClient)

// Mapping an ES client Connection object to its active span.
// - Use WeakMap to avoid a leak from possible spans that don't end.
// - WeakMap allows us to key off the ES client `request` object itself,
// which means we don't need to rely on `request.id`, which might be
// unreliable because it is user-settable (see `generateRequestId` at
// https://www.elastic.co/guide/en/elasticsearch/client/javascript-api/current/observability.html)
// Tracking the ES client Connection object and DiagnosticResult for each
// active span. Use WeakMap to avoid a leak from possible spans that don't
// end.
const connFromSpan = new WeakMap()
const diagResultFromSpan = new WeakMap()

return elasticsearch

function wrapClient (OrigClient) {
class ClientTraced extends OrigClient {
constructor (...args) {
super(...args)
const diagnostic = isGteV8 ? this.diagnostic : this
diagnostic.on('response', (_err, result) => {
if (result) {
const currSpan = ins.currSpan()
if (currSpan) {
diagResultFromSpan[currSpan] = result
}
}
})
}
}
return ClientTraced
}

// Transport#request() calls Transport#getConnection() when it is ready to
// make the HTTP request. This returns the actual connection to be used for
// the request. This is limited, however:
Expand Down Expand Up @@ -92,14 +112,14 @@ module.exports = function (elasticsearch, agent, { version, enabled }) {
const method = (params && params.method) || '<UnknownMethod>'
const path = (params && params.path) || '<UnknownPath>'
agent.logger.debug({ method, path }, 'intercepted call to @elastic/elasticsearch.Transport.prototype.request')
const span = ins.createSpan(`Elasticsearch: ${method} ${path}`, 'db', 'elasticsearch', 'request')
const span = ins.createSpan(`Elasticsearch: ${method} ${path}`, 'db', 'elasticsearch', 'request', { exitSpan: true })
if (!span) {
return origRequest.apply(this, arguments)
}

const parentRunContext = ins.currRunContext()
const spanRunContext = parentRunContext.enterSpan(span)
const finish = ins.bindFunctionToRunContext(spanRunContext, (err, _result) => {
const finish = ins.bindFunctionToRunContext(spanRunContext, (err, result) => {
// Set DB context.
// In @elastic/elasticsearch@7, `Transport#request` encodes
// `params.{querystring,body}` in-place; use it. In >=8 this encoding is
Expand All @@ -126,6 +146,51 @@ module.exports = function (elasticsearch, agent, { version, enabled }) {
span.setDestinationContext(getDBDestination(span,
connUrl && connUrl.hostname, connUrl && connUrl.port))

// Gather some HTTP context from the "DiagnosticResult" object.
// We are *not* including the response headers b/c they are boring:
//
// X-elastic-product: Elasticsearch
// content-type: application/json
// content-length: ...
//
// Getting the ES client request "DiagnosticResult" object has some edge cases:
// - In v7 using a callback, we always get `result`.
// - In v7 using a Promise, if the promise is rejected, then `result` is
// not passed.
// - In v8, `result` only includes HTTP response info if `options.meta`
// is true. We use the diagnostic 'response' event instead.
// - In v7, see the limitation note above for the rare start case where
// the diagnostic 'response' event may have the wrong currentSpan.
// The result is that with Promise usage of v7, ES client requests that
// are queued behind the "product-check" and that reject, won't have a
// `diagResult`.
let diagResult = isGteV8 ? null : result
if (!diagResult) {
diagResult = diagResultFromSpan[span]
if (diagResult) {
diagResultFromSpan.delete(span)
}
}
if (diagResult) {
const httpContext = {}
let haveHttpContext = false
if (diagResult.statusCode) {
haveHttpContext = true
httpContext.status_code = diagResult.statusCode
}
// *Not* currently adding headers because
if (diagResult.headers && 'content-length' in diagResult.headers) {
const contentLength = Number(diagResult.headers['content-length'])
if (!isNaN(contentLength)) {
haveHttpContext = true
httpContext.response = { encoded_body_size: contentLength }
}
}
if (haveHttpContext) {
span.setHttpContext(httpContext)
}
}

if (err) {
// Error properties are specified here:
// https://github.com/elastic/elasticsearch-js/blob/master/lib/errors.d.ts
Expand Down
2 changes: 1 addition & 1 deletion lib/instrumentation/modules/elasticsearch.js
Original file line number Diff line number Diff line change
Expand Up @@ -65,7 +65,7 @@ module.exports = function (elasticsearch, agent, { enabled }) {

function wrapRequest (original) {
return function wrappedRequest (params, cb) {
var span = ins.createSpan(null, 'db', 'elasticsearch', 'request')
var span = ins.createSpan(null, 'db', 'elasticsearch', 'request', { exitSpan: true })
var id = span && span.transaction.id
var method = params && params.method
var path = params && params.path
Expand Down
8 changes: 4 additions & 4 deletions lib/instrumentation/transaction.js
Original file line number Diff line number Diff line change
Expand Up @@ -142,16 +142,16 @@ Transaction.prototype.createSpan = function (...args) {
return null
}

// Exit spans must not have child spans. The spec allows child spans "that
// have the same type and subtype", but this agent currently has no use for
// this case.
// Exit spans must not have child spans (unless of the same type and subtype).
// https://github.com/elastic/apm/blob/master/specs/agents/tracing-spans.md#child-spans-of-exit-spans
const opts = typeof args[args.length - 1] === 'object'
? (args.pop() || {})
: {}
const [_name, type, subtype] = args // eslint-disable-line no-unused-vars
opts.childOf = opts.childOf || this._agent._instrumentation.currSpan() || this
const childOf = opts.childOf
if (childOf instanceof Span && childOf._exitSpan) {
if (childOf instanceof Span && childOf._exitSpan &&
!(childOf.type === type && childOf.subtype === subtype)) {
this._agent.logger.trace({ exitSpanId: childOf.id, newSpanArgs: args },
'createSpan: drop child span of exit span')
return null
Expand Down
7 changes: 7 additions & 0 deletions test/instrumentation/modules/@elastic/_mock_es.js
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,7 @@
// })
// server.start(function (serverUrl) {
// // - Test code using `serverUrl`.
// // - Use `server.requests` to see the requests the ES server received
// // - Call `server.close()` when done.
// })

Expand All @@ -26,13 +27,19 @@ class MockES {
this._responses = opts.responses
this._reqCount = 0
this.serverUrl = null // set in .start()
this.requests = []
this._http = http.createServer(this._onRequest.bind(this))
}

_onRequest (req, res) {
const response = this._responses[this._reqCount % this._responses.length]
this._reqCount++
req.on('end', () => {
this.requests.push({
method: req.method,
url: req.url,
headers: req.headers
})
res.writeHead(response.statusCode, response.headers || {})
res.end(response.body)
})
Expand Down
Loading

0 comments on commit 10f533e

Please sign in to comment.