From 3f02b1dcac9a9c39130567d8159ff164113f5c36 Mon Sep 17 00:00:00 2001
From: Felix Barnsteiner
Date: Tue, 27 Apr 2021 15:24:42 +0200
Subject: [PATCH] Add destination spec (#435)

---
 specs/agents/README.md                       |   1 +
 specs/agents/tracing-instrumentation-http.md |  10 +-
 specs/agents/tracing-spans-destination.md    | 105 +++++++++++++++++++
 3 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 specs/agents/tracing-spans-destination.md

diff --git a/specs/agents/README.md b/specs/agents/README.md
index 71f0d58a..23262440 100644
--- a/specs/agents/README.md
+++ b/specs/agents/README.md
@@ -39,6 +39,7 @@ You can find details about each of these in the [APM Data Model](https://www.ela
 - Tracing
   - [Transactions](tracing-transactions.md)
   - [Spans](tracing-spans.md)
+  - [Span destination](tracing-spans-destination.md)
   - [Sampling](tracing-sampling.md)
   - [Distributed tracing](tracing-distributed-tracing.md)
   - [Tracer API](tracing-api.md)
diff --git a/specs/agents/tracing-instrumentation-http.md b/specs/agents/tracing-instrumentation-http.md
index 8f598fe9..3c9df7af 100644
--- a/specs/agents/tracing-instrumentation-http.md
+++ b/specs/agents/tracing-instrumentation-http.md
@@ -94,4 +94,12 @@ lower than 400 and to `"failure"` otherwise.
 For both transactions and spans, if there is no HTTP status we set `outcome` from the reported error:
 
 - `failure` if an error is reported
-- `success` otherwise
\ No newline at end of file
+- `success` otherwise
+
+## Destination
+
+- `context.destination.address`: `url.host`
+- `context.destination.port`: `url.port`
+- `context.destination.service.name`: `url.port.isDefault() ? "${url.scheme}://${url.host}" : "${url.scheme}://${url.host}:${url.port}"`
+- `context.destination.service.type`: `"external"`
+- `context.destination.service.resource`: `"${url.host}:${url.port}"`
\ No newline at end of file
diff --git a/specs/agents/tracing-spans-destination.md b/specs/agents/tracing-spans-destination.md
new file mode 100644
index 00000000..06c34b14
--- /dev/null
+++ b/specs/agents/tracing-spans-destination.md
@@ -0,0 +1,105 @@
+## Span destination
+
+The span destination information is relevant for exit spans and helps to identify the downstream service.
+This information is used for the [service map](https://www.elastic.co/guide/en/kibana/current/service-maps.html),
+the [dependencies table](https://www.elastic.co/guide/en/kibana/current/service-overview.html#service-span-duration) in the service overview,
+and the [APM SIEM integration](https://www.elastic.co/blog/elastic-apm-7-6-0-released).
+
+### Destination service fields
+
+Spans representing an external call MUST have `context.destination.service` information.
+If the span represents a call to an in-memory database, the information SHOULD still be set.
+
+Agents SHOULD have a generic component used in all tests that validates that the destination information is present for exit spans.
+Rather than opting into the validation, the testing should provide an opt-out if,
+for whatever reason, the destination information can't or shouldn't be collected for a particular exit span.
+
+#### `context.destination.service.name`
+
+ES field: `span.destination.service.name`
+
+The identifier for the destination service.
+
+**Usage**
+
+Currently, this field is not used anywhere within the product.
+The original intent was to use it as a display name of a service in the service map.
+
+**Value**
+
+For HTTP, use scheme, host, and non-default port (e.g. `http://elastic.co`, `http://apm.example.com:8200`).
+For anything else, use `span.subtype` (e.g. `postgresql`, `elasticsearch`).
+However, individual sub-resources of a service, such as the name of a message queue, should not be added.
+
+#### `context.destination.service.resource`
+
+ES field: `span.destination.service.resource`
+
+Identifies unique destinations for each service.
+
+**Usage**
+
+Each unique resource will result in a node on the [service map](https://www.elastic.co/guide/en/kibana/current/service-maps.html).
+Also, APM Server will roll up metrics based on the resource.
+These metrics are currently used for the [dependencies table](https://www.elastic.co/guide/en/kibana/current/service-overview.html#service-span-duration)
+on the service overview page.
+There are plans to use the service destination metrics in the service map, too.
+
+The metrics are calculated based on the (head-based) sampled span documents that are sent to APM Server.
+That's why agents have to send the [`sample_rate`](tracing-sampling.md#effect-on-metrics)
+attribute for transactions and spans:
+APM Server uses it to extrapolate the service destination metrics from the (head-based) sampled spans.
+
+**Cardinality**
+
+To avoid a huge impact on storage requirements for metrics,
+and to avoid "spamming" the service map with lots of fine-grained nodes,
+the cardinality has to be kept low.
+However, the cardinality should not be too low, either,
+so that different clusters, instances, and queues can be displayed separately in the service map.
+
+The cardinality should be the same as or higher than that of `span.destination.service.name`.
+Higher, if there are individual sub-resources for a service, such as individual queues for a message broker.
+Same cardinality otherwise.
+
+**Value**
+
+Usually, the value is just the `span.subtype`.
+For HTTP, this is the host and port (see the [HTTP spec](tracing-instrumentation-http.md#destination) for more details).
+The specs for the specific technologies will have more information on how to construct the value for `context.destination.service.resource`.
+
+#### `context.destination.service.type`
+
+ES field: `span.destination.service.type`
+
+Type of the destination service, e.g. `db`, `elasticsearch`.
+Should typically be the same as `span.type`.
+
+**Usage**
+
+Currently, this field is not used anywhere within the product.
+It was originally intended to be used to display different icons on the service map.
+
+### Destination fields
+
+These fields are used within the APM/SIEM integration.
+They don't play a role for service maps.
+
+Spans representing an external call SHOULD have `context.destination` information if it is easy to gather.
+
+Examples of when the effort of capturing the address and port is not justified:
+* When the underlying protocol-layer code is not readily available in the instrumented code.
+* When the instrumentation captures the exit event,
+  but the actual client is not bound to a specific connection (e.g. a client that does load balancing).
+
+#### `context.destination.address`
+
+ES field: [`destination.address`](https://www.elastic.co/guide/en/ecs/current/ecs-destination.html#_destination_field_details)
+
+Address is the destination network address: hostname (e.g. `localhost`), FQDN (e.g. `elastic.co`), IPv4 (e.g. `127.0.0.1`), or IPv6 (e.g. `::1`).
+
+#### `context.destination.port`
+
+ES field: [`destination.port`](https://www.elastic.co/guide/en/ecs/current/ecs-destination.html#_destination_field_details)
+
+Port is the destination network port (e.g. `443`).
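
The `## Destination` list added to `tracing-instrumentation-http.md` in this patch expresses the HTTP mapping as pseudo-expressions. The following is a minimal Python sketch of that mapping, not code from any agent. The names `http_destination` and `DEFAULT_PORTS` are hypothetical, only `http` and `https` are handled, and the sketch assumes that a URL without an explicit port should report the scheme's default port. Bracketing of IPv6 literals is left out because the spec text does not cover it.

```python
from urllib.parse import urlsplit

# Assumption: only these two schemes are relevant for HTTP exit spans.
DEFAULT_PORTS = {"http": 80, "https": 443}


def http_destination(url: str) -> dict:
    """Build the `context.destination` fields for an HTTP exit span."""
    parts = urlsplit(url)
    scheme = parts.scheme
    host = parts.hostname
    if scheme not in DEFAULT_PORTS or host is None:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")

    default_port = DEFAULT_PORTS[scheme]
    # Fall back to the scheme default when the URL has no explicit port.
    port = parts.port if parts.port is not None else default_port

    # service.name: scheme and host, plus the port only when it is non-default.
    if port == default_port:
        name = f"{scheme}://{host}"
    else:
        name = f"{scheme}://{host}:{port}"

    return {
        "address": host,
        "port": port,
        "service": {
            "name": name,
            "type": "external",
            # service.resource always carries host and port.
            "resource": f"{host}:{port}",
        },
    }
```

For `http://apm.example.com:8200` this yields the name `http://apm.example.com:8200` and the resource `apm.example.com:8200`. For `https://elastic.co/guide` the name drops the default port while the resource keeps it (`elastic.co:443`).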