diff --git a/.github/workflows/pr-code-checks.yml b/.github/workflows/pr-code-checks.yml
index 6a8437bc5a..b00c02f681 100644
--- a/.github/workflows/pr-code-checks.yml
+++ b/.github/workflows/pr-code-checks.yml
@@ -17,9 +17,6 @@ on:
- 'external-images.yaml'
workflow_dispatch:
-env:
- GITLEAKS_VERSION: 8.28.0
-
jobs:
unit-tests:
runs-on: ubuntu-latest
diff --git a/README.md b/README.md
index 5b4822822e..413d45d2c6 100644
--- a/README.md
+++ b/README.md
@@ -8,25 +8,25 @@
## Overview
-[Telemetry Manager](docs/user/01-manager.md) is a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that fulfils the [Kyma module interface](https://kyma-project.io/#/06-modules/README.md). It provides APIs for a managed agent/gateway setup for log, trace, and metric ingestion and dispatching into 3rd-party backend systems, in order to reduce the pain of orchestrating such setup on your own. Read more on the [manager](./docs/user/01-manager.md) itself or the general [usage](docs/user/README.md) of the module.
+[Telemetry Manager](docs/user/01-manager.md) is a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that fulfils the [Kyma module interface](https://kyma-project.io/#/06-modules/README.md). It provides APIs for a managed agent/gateway setup for log, trace, and metric ingestion and dispatching into 3rd-party backend systems, so that you don't have to orchestrate such a setup on your own. Read more about the [Architecture](./docs/user/architecture/README.md) or the general [usage](docs/user/README.md) of the module.
### Logs
-The logging controllers generate a Fluent Bit DaemonSet and configuration from one or more LogPipeline and LogParser custom resources. The controllers ensure that all Fluent Bit Pods run the current configuration by restarting Pods after the configuration has changed. See all [CRD attributes](apis/telemetry/v1alpha1/logpipeline_types.go) and some [examples](samples).
+The logging controllers generate an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) DaemonSet and Deployment, along with their configuration, from one or more `LogPipeline` custom resources (CRs). The controllers ensure that all OTel Collector Pods run the current configuration by restarting Pods after the configuration has changed. See all [CRD attributes](apis/telemetry/v1alpha1/logpipeline_types.go) and some [examples](./samples).
-For more information, see [Logs](./docs/user/02-logs.md).
+For more information, see [Logs](./docs/user/collecting-logs/README.md).
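+
+For example, a minimal OTel-based `LogPipeline` might look like the following sketch. It assumes the `otlp` output of the OTel-based LogPipeline API (see [Collecting Logs](./docs/user/collecting-logs/README.md) for the authoritative attributes); `backend.example.com` is a placeholder for your OTLP backend:
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: LogPipeline
+metadata:
+  name: backend
+spec:
+  output:
+    otlp:
+      endpoint:
+        value: https://backend.example.com:4317
+```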
### Traces
-The trace controller creates an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) deployment and related Kubernetes objects from a `TracePipeline` custom resource. The collector is configured to receive traces using the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/specs/otel/protocol/), and forwards the received traces to a configurable OTLP backend.
+The trace controller creates an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) Deployment and related Kubernetes objects from a `TracePipeline` CR. The collector is configured to receive traces using the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/specs/otel/protocol/), and forwards the received traces to a configurable OTLP backend.
-For more information, see [Traces](./docs/user/03-traces.md).
+For more information, see [Traces](./docs/user/collecting-traces/README.md).
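+
+For example, the following sketch of a minimal `TracePipeline` ships traces using the HTTP flavor of OTLP; `backend.example.com` is a placeholder for your backend:
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: TracePipeline
+metadata:
+  name: backend
+spec:
+  output:
+    otlp:
+      protocol: http
+      endpoint:
+        value: https://backend.example.com:4318
+```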
### Metrics
-The metric controller creates an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) and related Kubernetes objects from a `MetricPipeline` custom resource. The collector is deployed as a [Gateway](https://opentelemetry.io/docs/collector/deployment/#gateway). The controller is configured to receive metrics in the OTLP protocol and forward them to a configurable OTLP backend.
+The metric controller creates an [OpenTelemetry Collector](https://opentelemetry.io/docs/collector/) DaemonSet and Deployment, and related Kubernetes objects, from a `MetricPipeline` CR. The Deployment acts as a [Gateway](https://opentelemetry.io/docs/collector/deployment/#gateway) that receives metrics in the OTLP protocol and forwards them to a configurable OTLP backend; the DaemonSet acts as an agent that collects pull-based metrics from workloads.
-For more information, see [Metrics](./docs/user/04-metrics.md).
+For more information, see [Metrics](./docs/user/collecting-metrics/README.md).
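+
+For example, the following sketch of a `MetricPipeline` ships OTLP metrics and additionally enables the Prometheus-based input; `backend.example.com` is a placeholder for your OTLP backend:
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: MetricPipeline
+metadata:
+  name: backend
+spec:
+  input:
+    prometheus:
+      enabled: true
+  output:
+    otlp:
+      endpoint:
+        value: https://backend.example.com:4317
+```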
## Installation
diff --git a/docs/user/01-manager.md b/docs/user/01-manager.md
index b9bb5ad81f..b6b0b4f756 100644
--- a/docs/user/01-manager.md
+++ b/docs/user/01-manager.md
@@ -1,25 +1,3 @@
# Telemetry Manager
-As the core element of the Telemetry module, Telemetry Manager manages the lifecycle of other Telemetry module components by watching user-created resources.
-
-## Module Lifecycle
-
-The Telemetry module includes Telemetry Manager, a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that's described by a custom resource of type Telemetry. Telemetry Manager has the following tasks:
-
-1. Watch the module configuration for changes and sync the module status to it.
-2. Watch for the user-created Kubernetes resources LogPipeline, TracePipeline, and MetricPipeline. In these resources, you specify what data of a signal type to collect and where to ship it.
-3. Manage the lifecycle of the self monitor and the user-configured agents and gateways.
- For example, only if you defined a LogPipeline resource, the Fluent Bit DaemonSet is deployed as log agent.
-
-
-
-### Self Monitor
-
-The Telemetry module contains a self monitor, based on [Prometheus](https://prometheus.io/), to collect and evaluate metrics from the managed gateways and agents. Telemetry Manager retrieves the current pipeline health from the self monitor and adjusts the status of the pipeline resources and the module status.
-Additionally, you can monitor the health of your pipelines in an integrated backend like [SAP Cloud Logging](./integration/sap-cloud-logging/README.md#use-sap-cloud-logging-alerts): To set up alerts and reports in the backend, use the [pipeline health metrics](./04-metrics.md#5-monitor-pipeline-health) emitted by your MetricPipeline.
-
-
-
-## Module Configuration and Status
-
-For configuration options and the overall status of the module, see the specification of the related [Telemetry resource](./resources/01-telemetry.md).
+This content moved to [Architecture](./architecture/README.md).
diff --git a/docs/user/02-logs.md b/docs/user/02-logs.md
index 919b3ea653..a5f9d1107e 100644
--- a/docs/user/02-logs.md
+++ b/docs/user/02-logs.md
@@ -1,12 +1,15 @@
# Application Logs (Fluent Bit)
+> [!NOTE]
+> The following API covers the LogPipeline outputs `http` and `custom`, which are based on a Fluent Bit agent. If you're a new user, start with the OpenTelemetry-based approach instead: [Collecting Logs](./collecting-logs/README.md).
+
With application logs, you can debug an application and derive the internal state of an application. When logs are emitted with the correct severity level and context, they're essential for observing an application.
## Overview
-The Telemetry module provides the [Fluent Bit](https://fluentbit.io/) log agent for the collection and shipment of application logs of any container running in the Kyma runtime.
+The Telemetry module provides the [Fluent Bit](https://fluentbit.io/) log agent for the collection and shipment of application logs of any container running in Kyma runtime.
-You can configure the log agent with external systems using runtime configuration with a dedicated Kubernetes API ([CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions)) named `LogPipeline`. With the LogPipeline's HTTP output, you can natively integrate with vendors that support this output, or with any vendor using a [Fluentd integration](https://medium.com/hepsiburadatech/fluent-logging-architecture-fluent-bit-fluentd-elasticsearch-ca4a898e28aa).
+You can configure the log agent with external systems using runtime configuration with a dedicated Kubernetes API ([CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions)) named `LogPipeline`. With the LogPipeline's HTTP output, you can natively integrate with vendors that support this output, or with any vendor using a Fluentd integration.
The feature is optional. If you don't want to use the Logs feature, simply don't set up a LogPipeline.
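+
+For example, a `LogPipeline` with an HTTP output might look like the following sketch. The field names are assumptions based on the LogPipeline `http` output, and the host and credentials are placeholders; see [LogPipeline](./resources/02-logpipeline.md) for the authoritative attributes:
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: LogPipeline
+metadata:
+  name: http-backend
+spec:
+  output:
+    http:
+      host:
+        value: backend.example.com
+      user:
+        value: myUser
+      password:
+        value: myPwd
+```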
@@ -478,7 +481,7 @@ The enriched timestamp attributes have the following meaning:
The Telemetry module ensures that the log agent instances are operational and healthy at any time, for example, with buffering and retries. However, there may be situations when the instances drop logs, or cannot handle the log load.
-To detect and fix such situations, check the [pipeline status](./resources/02-logpipeline.md#logpipeline-status) and check out [Troubleshooting](#troubleshooting). If you have set up [pipeline health monitoring](./04-metrics.md#5-monitor-pipeline-health), check the alerts and reports in an integrated backend like [SAP Cloud Logging](./integration/sap-cloud-logging/README.md#use-sap-cloud-logging-alerts).
+To detect and fix such situations, check the [pipeline status](./resources/02-logpipeline.md#logpipeline-status) and check out [Troubleshooting](#troubleshooting). If you have set up [pipeline health monitoring](./monitor-pipeline-health.md), check the alerts and reports in an integrated backend like [SAP Cloud Logging](./integration/sap-cloud-logging/README.md#use-sap-cloud-logging-alerts).
> [!WARNING]
> It's not recommended to access the metrics endpoint of the used Fluent Bit instances directly, because the exposed metrics are not an official API of the Kyma Telemetry module. Breaking changes can happen if the underlying Fluent Bit version introduces them.
diff --git a/docs/user/03-traces.md b/docs/user/03-traces.md
index 8c6bf70636..53e9e53d12 100644
--- a/docs/user/03-traces.md
+++ b/docs/user/03-traces.md
@@ -1,524 +1,3 @@
# Traces
-The Telemetry module supports you in collecting all relevant trace data in a Kyma cluster, enriches them and ships them to a backend for further analysis. Kyma modules like Istio or Serverless contribute traces transparently. You can choose among multiple vendors for [OTLP-based backends](https://opentelemetry.io/ecosystem/vendors/).
-
-## Overview
-
-Observability tools aim to show the big picture, no matter if you're monitoring just a few or many components. In a cloud-native microservice architecture, a user request often flows through dozens of different microservices. Logging and monitoring tools help to track the request's path. However, they treat each component or microservice in isolation. This individual treatment results in operational issues.
-
-[Distributed tracing](https://opentelemetry.io/docs/concepts/observability-primer/#understanding-distributed-tracing) charts out the transactions in cloud-native systems, helping you to understand the application behavior and relations between the frontend actions and backend implementation.
-
-The following diagram shows how distributed tracing helps to track the request path:
-
-
-
-The Telemetry module provides a trace gateway for the shipment of traces of any container running in the Kyma runtime.
-
-You can configure the trace gateway with external systems using runtime configuration with a dedicated Kubernetes API ([CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions)) named TracePipeline.
-The Trace feature is optional. If you don't want to use it, simply don't set up a TracePipeline.
-
-## Prerequisites
-
-For the recording of a distributed trace, every involved component must propagate at least the trace context. For details, see [Trace Context](https://www.w3.org/TR/trace-context/#problem-statement).
-
-- In Kyma, all modules involved in users’ requests support the [W3C Trace Context](https://www.w3.org/TR/trace-context) protocol. The involved Kyma modules are, for example, Istio, Serverless, and Eventing.
-- Your application also must propagate the W3C Trace Context for any user-related activity. This can be achieved easily using the [Open Telemetry SDKs](https://opentelemetry.io/docs/instrumentation/) available for all common programming languages. If your application follows that guidance and is part of the Istio Service Mesh, it’s already outlined with dedicated span data in the trace data collected by the Kyma telemetry setup.
-- Furthermore, your application must enrich a trace with additional span data and send these data to the cluster-central telemetry services. You can achieve this with [Open Telemetry SDKs](https://opentelemetry.io/docs/instrumentation/).
-
-## Architecture
-
-In the Kyma cluster, the Telemetry module provides a central deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acting as a gateway. The gateway exposes endpoints to which all Kyma modules and users’ applications should send the trace data.
-
-
-
-1. An end-to-end request is triggered and populated across the distributed application. Every involved component propagates the trace context using the [W3C Trace Context](https://www.w3.org/TR/trace-context/) protocol.
-2. After contributing a new span to the trace, the involved components send the related span data to the trace gateway using the `telemetry-otlp-traces` service. The communication happens based on the [OpenTelemetry Protocol (OTLP)](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md) either using GRPC or HTTP.
-3. Istio sends the related span data to the trace gateway as well.
-4. The trace gateway discovers metadata that's typical for sources running on Kubernetes, like Pod identifiers, and then enriches the span data with that metadata.
-5. Telemetry Manager configures the gateway according to the `TracePipeline` resource, including the target backend for the trace gateway. Also, it observes the trace flow to the backend and reports problems in the `TracePipeline` status.
-6. The trace gateway sends the data to the observability system that's specified in your `TracePipeline` resource - either within the Kyma cluster, or, if authentication is set up, to an external observability backend.
-7. You can analyze the trace data with your preferred backend system.
-
-### Telemetry Manager
-
-The TracePipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.
-
-
-
-1. Telemetry Manager watches all TracePipeline resources and related Secrets.
-2. Furthermore, Telemetry Manager takes care of the full lifecycle of the OTel Collector Deployment itself. Only if you defined a TracePipeline, the collector is deployed.
-3. Whenever the configuration changes, it validates the configuration and generates a new configuration for OTel Collector, where a ConfigMap for the configuration is generated.
-4. Referenced Secrets are copied into one Secret that is mounted to the OTel Collector as well.
-
-### Trace Gateway
-
-In a Kyma cluster, the trace gateway is the central component to which all components can send their individual spans. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Telemetry Gateways](./gateways.md).
-
-## Setting up a TracePipeline
-
-In the following steps, you can see how to construct and deploy a typical TracePipeline. Learn more about the available [parameters and attributes](resources/04-tracepipeline.md).
-
-### 1. Create a TracePipeline
-
-To ship traces to a new OTLP output, create a resource of the kind `TracePipeline` and save the file (named, for example, `tracepipeline.yaml`).
-
-This configures the underlying OTel Collector with a pipeline for traces and opens a push endpoint that is accessible with the `telemetry-otlp-traces` service. For details, see [Gateway Usage](./gateways.md#usage). The following push URLs are set up:
-
-- GRPC: 'http://telemetry-otlp-traces.kyma-system:4317'
-- HTTP: 'http://telemetry-otlp-traces.kyma-system:4318'
-
-The default protocol for shipping the data to a backend is GRPC, but you can choose HTTP instead. Depending on the configured protocol, an `otlp` or an `otlphttp` exporter is used. Ensure that the correct port is configured as part of the endpoint.
-
-- For GRPC, use:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: TracePipeline
- metadata:
- name: backend
- spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- ```
-
-- For HTTP, use the `protocol` attribute:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: TracePipeline
- metadata:
- name: backend
- spec:
- output:
- otlp:
- protocol: http
- endpoint:
- value: https://backend.example.com:4318
- ```
-
-### 2. Enable Istio Tracing
-
-By default, the tracing feature of the Istio module is disabled to avoid increased network utilization if there is no TracePipeline.
-
-To activate the Istio tracing feature with a sampling rate of 5% (for recommendations, see [Istio](#istio)), use a resource similar to the following example:
-
-```yaml
-apiVersion: telemetry.istio.io/v1
-kind: Telemetry
-metadata:
- name: tracing-default
- namespace: istio-system
-spec:
- tracing:
- - providers:
- - name: "kyma-traces"
- randomSamplingPercentage: 5.00
-```
-
-### 3a. Add Authentication Details From Plain Text
-
-To integrate with external systems, you must configure authentication details. You can use mutual TLS (mTLS), Basic Authentication, or custom headers:
-
-
-
-#### **mTLS**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- tls:
- cert:
- value: |
- -----BEGIN CERTIFICATE-----
- ...
- key:
- value: |
- -----BEGIN RSA PRIVATE KEY-----
- ...
-```
-
-#### **Basic Authentication**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- authentication:
- basic:
- user:
- value: myUser
- password:
- value: myPwd
-```
-
-#### **Token-based authentication with custom headers**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- headers:
- - name: Authorization
- prefix: Bearer
- value: "myToken"
-```
-
-
-
-### 3b. Add Authentication Details From Secrets
-
-Integrations into external systems usually need authentication details dealing with sensitive data. To handle that data properly in Secrets, TracePipeline supports the reference of Secrets.
-
-Using the **valueFrom** attribute, you can map Secret keys for mutual TLS (mTLS), Basic Authentication, or with custom headers.
-
-You can store the value of the token in the referenced Secret without any prefix or scheme, and you can configure it in the `headers` section of the TracePipeline. In the following example, the token has the prefix "Bearer".
-
-
-
-#### **mTLS**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- tls:
- cert:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: cert
- key:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: key
-```
-
-#### **Basic Authentication**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: endpoint
- authentication:
- basic:
- user:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: user
- password:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: password
-```
-
-#### **Token-based authentication with custom headers**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: TracePipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- headers:
- - name: Authorization
- prefix: Bearer
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: token
-```
-
-
-
-The related Secret must have the referenced name, be located in the referenced namespace, and contain the mapped key. See the following example:
-
-```yaml
-kind: Secret
-apiVersion: v1
-metadata:
- name: backend
- namespace: default
-stringData:
- endpoint: https://backend.example.com:4317
- user: myUser
- password: XXX
- token: YYY
-```
-
-### 4. Rotate the Secret
-
-Telemetry Manager continuously watches the Secret referenced with the **secretKeyRef** construct. You can update the Secret’s values, and Telemetry Manager detects the changes and applies the new Secret to the setup.
-
-> [!TIP]
-> If you use a Secret owned by the [SAP BTP Service Operator](https://github.com/SAP/sap-btp-service-operator), you can configure an automated rotation using a `credentialsRotationPolicy` with a specific `rotationFrequency` and don’t have to intervene manually.
-
-### 5. Deploy the Pipeline
-
-To activate the TracePipeline, apply the `tracepipeline.yaml` resource file in your cluster:
-
-```bash
-kubectl apply -f tracepipeline.yaml
-```
-
-### Result
-
-You activated a TracePipeline and traces start streaming to your backend.
-
-To check that the pipeline is running, wait until the status conditions of the TracePipeline in your cluster have status `True`:
-
-```bash
-kubectl get tracepipeline
-NAME CONFIGURATION GENERATED GATEWAY HEALTHY FLOW HEALTHY
-backend True True True
-```
-
-## Kyma Modules With Tracing Capabilities
-
-Kyma bundles several modules that can be involved in user flows. Applications involved in a distributed trace must propagate the trace context to keep the trace complete. Optionally, they can enrich the trace with custom spans, which requires reporting them to the backend.
-
-### Istio
-
-The Istio module is crucial in distributed tracing because it provides the [Ingress Gateway](https://istio.io/latest/docs/tasks/traffic-management/ingress/ingress-control/). Typically, this is where external requests enter the cluster scope and are enriched with trace context if it hasn’t happened earlier. Furthermore, every component that’s part of the Istio Service Mesh runs an Istio proxy, which propagates the context properly but also creates span data. If Istio tracing is activated and taking care of trace propagation in your application, you get a complete picture of a trace, because every component automatically contributes span data. Also, Istio tracing is pre-configured to be based on the vendor-neutral [W3C Trace Context](https://www.w3.org/TR/trace-context/) protocol.
-
-The Istio module is configured with an [extension provider](https://istio.io/latest/docs/tasks/observability/telemetry/) called `kyma-traces`. To activate the provider on the global mesh level using the Istio [Telemetry API](https://istio.io/latest/docs/reference/config/telemetry/#Tracing), place a resource to the `istio-system` namespace. The following code samples help setting up the Istio tracing feature:
-
-
-
-#### **Extension Provider**
-
-The following example configures all Istio proxies with the `kyma-traces` extension provider, which, by default, reports span data to the trace gateway of the Telemetry module.
-
-```yaml
-apiVersion: telemetry.istio.io/v1
-kind: Telemetry
-metadata:
- name: mesh-default
- namespace: istio-system
-spec:
- tracing:
- - providers:
- - name: "kyma-traces"
-```
-
-#### **Sampling Rate**
-
-By default, the sampling rate is configured to 1%. That means that only 1 trace out of 100 traces is reported to the trace gateway, and all others are dropped. The sampling decision itself is propagated as part of the [trace context](https://www.w3.org/TR/trace-context/#sampled-flag) so that either all involved components are reporting the span data of a trace, or none.
-
-> [!TIP]
-> If you increase the sampling rate, you send more data your tracing backend and cause much higher network utilization in the cluster.
-> To reduce costs and performance impacts in a production setup, a very low percentage of around 5% is recommended.
-
-To configure an "always-on" sampling, set the sampling rate to 100%:
-
-```yaml
-apiVersion: telemetry.istio.io/v1
-kind: Telemetry
-metadata:
- name: mesh-default
- namespace: istio-system
-spec:
- tracing:
- - providers:
- - name: "kyma-traces"
- randomSamplingPercentage: 100.00
-```
-
-#### **Namespaces or Workloads**
-
-If you need specific settings for individual namespaces or workloads, place additional Telemetry resources. If you don't want to report spans at all for a specific workload, activate the `disableSpanReporting` flag with the selector expression.
-
-```yaml
-apiVersion: telemetry.istio.io/v1
-kind: Telemetry
-metadata:
- name: tracing
- namespace: my-namespace
-spec:
- selector:
- matchLabels:
- app.kubernetes.io/name: "my-app"
- tracing:
- - providers:
- - name: "kyma-traces"
- randomSamplingPercentage: 100.00
-```
-
-#### **Trace Context Without Spans**
-
-To enable the propagation of the [W3C Trace Context](https://www.w3.org/TR/trace-context/) only, without reporting any spans (so the actual tracing feature is disabled), you must enable the `kyma-traces` provider with a sampling rate of 0. With this configuration, you get the relevant trace context into the [access logs](https://kyma-project.io/#/istio/user/tutorials/01-45-enable-istio-access-logs) without any active trace reporting.
-
- ```yaml
- apiVersion: telemetry.istio.io/v1
- kind: Telemetry
- metadata:
- name: mesh-default
- namespace: istio-system
- spec:
- tracing:
- - providers:
- - name: "kyma-traces"
- randomSamplingPercentage: 0
- ```
-
-
-
-### Eventing
-
-The [Eventing](https://kyma-project.io/#/eventing-manager/user/README) module uses the [CloudEvents](https://cloudevents.io/) protocol (which natively supports the [W3C Trace Context](https://www.w3.org/TR/trace-context) propagation). Because of that, it propagates trace context properly. However, it doesn't enrich a trace with more advanced span data.
-
-### Serverless
-
-By default, all engines for the [Serverless](https://kyma-project.io/#/serverless-manager/user/README) module integrate the [Open Telemetry SDK](https://opentelemetry.io/docs/reference/specification/metrics/sdk/). Thus, the used middlewares are configured to automatically propagate the trace context for chained calls.
-
-Because the Telemetry endpoints are configured by default, Serverless also reports custom spans for incoming and outgoing requests. You can [customize Function traces](https://kyma-project.io/#/serverless-manager/user/tutorials/01-100-customize-function-traces) to add more spans as part of your Serverless source code.
-
-## Operations
-
-A TracePipeline runs several OTel Collector instances in your cluster. This Deployment serves OTLP endpoints and ships received data to the configured backend.
-
-The Telemetry module ensures that the OTel Collector instances are operational and healthy at any time, for example, with buffering and retries. However, there may be situations when the instances drop traces, or cannot handle the trace load.
-
-To detect and fix such situations, check the [pipeline status](./resources/04-tracepipeline.md#tracepipeline-status) and check out [Troubleshooting](#troubleshooting). If you have set up [pipeline health monitoring](./04-metrics.md#5-monitor-pipeline-health), check the alerts and reports in an integrated backend like [SAP Cloud Logging](./integration/sap-cloud-logging/README.md#use-sap-cloud-logging-alerts).
-
-> [! WARNING]
-> It's not recommended to access the metrics endpoint of the used OTel Collector instances directly, because the exposed metrics are no official API of the Kyma Telemetry module. Breaking changes can happen if the underlying OTel Collector version introduces such.
-> Instead, use the [pipeline status](./resources/04-tracepipeline.md#tracepipeline-status).
-
-## Limitations
-
-- **Throughput**: Assuming an average span with 40 attributes with 64 characters, the maximum throughput is 4200 span/sec ~= 15.000.000 spans/hour. If this limit is exceeded, spans are refused. To increase the maximum throughput, manually scale out the gateway by increasing the number of replicas for the trace gateway. See [Module Configuration and Status](https://kyma-project.io/#/telemetry-manager/user/01-manager?id=module-configuration).
-- **Unavailability of Output**: For up to 5 minutes, a retry for data is attempted when the destination is unavailable. After that, data is dropped.
-- **No Guaranteed Delivery**: The used buffers are volatile. If the OTel Collector instance crashes, trace data can be lost.
-- **Multiple TracePipeline Support**: The maximum amount of TracePipeline resources is 5.
-- **System Span Filtering**: System-related spans reported by Istio are filtered out without the opt-out option, for example:
- - Any communication of applications to the Telemetry gateways
- - Any communication from the gateways to backends
-
-## Troubleshooting
-
-### No Spans Arrive at the Backend
-
-**Symptom**: In the TracePipeline status, the `TelemetryFlowHealthy` condition has status **GatewayAllTelemetryDataDropped**.
-
-**Cause**: Incorrect backend endpoint configuration (such as using the wrong authentication credentials), or the backend is unreachable.
-
-**Solution**:
-
-1. Check the `telemetry-trace-gateway` Pods for error logs by calling `kubectl logs -n kyma-system {POD_NAME}`.
-2. Check if the backend is up and reachable.
-3. Fix the errors.
-
-### Not All Spans Arrive at the Backend
-
-**Symptom**:
-
-- The backend is reachable and the connection is properly configured, but some spans are refused.
-- In the TracePipeline status, the `TelemetryFlowHealthy` condition has status **GatewaySomeTelemetryDataDropped**.
-
-**Cause**: It can happen due to a variety of reasons - for example, the backend is limiting the ingestion rate.
-
-**Solution**:
-
-1. Check the `telemetry-trace-gateway` Pods for error logs by calling `kubectl logs -n kyma-system {POD_NAME}`. Also, check your observability backend to investigate potential causes.
-2. If the backend is limiting the rate by refusing spans, try the following options:
- - Option 1: Increase maximum backend ingestion rate. For example, by scaling out the SAP Cloud Logging instances.
- - Option 2: Reduce the emitted spans in your applications.
-3. Otherwise, take the actions appropriate to the cause indicated in the logs.
-
-### Custom Spans Don’t Arrive at the Backend, but Istio Spans Do
-
-**Cause**: Your SDK version is incompatible with the OTel Collector version.
-
-**Solution**:
-
-1. Check which SDK version you are using for instrumentation.
-2. Investigate whether it is compatible with the OTel Collector version.
-3. If required, upgrade to a supported SDK version.
-
-### Trace Backend Shows Fewer Traces than Expected
-
-**Cause**: By [default](#istio), only 1% of the requests are sent to the trace backend for trace recording.
-
-**Solution**:
-
-To see more traces in the trace backend, increase the percentage of requests by changing the default settings.
-If you just want to see traces for one particular request, you can manually force sampling:
-
-1. Create a `values.yaml` file.
- The following example sets the value to `60`, which means 60% of the requests are sent to the tracing backend.
-
-```yaml
- apiVersion: telemetry.istio.io/v1
- kind: Telemetry
- metadata:
- name: kyma-traces
- namespace: istio-system
- spec:
- tracing:
- - providers:
- - name: "kyma-traces"
- randomSamplingPercentage: 60
-```
-
-2. To override the default percentage, change the value for the **randomSamplingPercentage** attribute.
-3. Deploy the `values.yaml` to your existing Kyma installation.
-
-### Gateway Throttling
-
-**Symptom**:
-
-- In the TracePipeline status, the `TelemetryFlowHealthy` condition has status **GatewayThrottling**.
-- Also, your application might have error logs indicating a refusal for sending traces to the gateway.
-
-**Cause**: Gateway cannot receive spans at the given rate.
-
-**Solution**: Manually scale out the gateway by increasing the number of replicas for the trace gateway. See [Module Configuration and Status](https://kyma-project.io/#/telemetry-manager/user/01-manager?id=module-configuration).
+This content moved to [Collecting Traces](./collecting-traces/README.md).
diff --git a/docs/user/04-metrics.md b/docs/user/04-metrics.md
index 6b4746c62a..9b71100096 100644
--- a/docs/user/04-metrics.md
+++ b/docs/user/04-metrics.md
@@ -1,879 +1,3 @@
# Metrics
-The goal of the Telemetry module is to support you in collecting all relevant metrics of a workload in a Kyma cluster and ship them to a backend for further analysis. Kyma modules like [Istio](https://kyma-project.io/#/istio/user/README) or [Serverless](https://kyma-project.io/#/serverless-manager/user/README) contribute metrics instantly, and the Telemetry module enriches the data. You can choose among multiple [vendors for OTLP-based backends](https://opentelemetry.io/ecosystem/vendors/).
-
-## Overview
-
-Observability is all about exposing the internals of the components belonging to a distributed application and making that data analysable at a central place.
-While application logs and traces usually provide request-oriented data, metrics are aggregated statistics exposed by a component to reflect the internal state. Typical statistics like the amount of processed requests, or the amount of registered users, can be very useful to monitor the current state and also the health of a component. Also, you can define proactive and reactive alerts if metrics are about to reach thresholds, or if they already passed thresholds.
-
-The Telemetry module provides a metric gateway and, optionally, an agent for the collection and shipment of metrics of any container running in the Kyma runtime.
-
-You can configure the metric gateway with external systems using runtime configuration with a dedicated Kubernetes API ([CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/#customresourcedefinitions)) named MetricPipeline.
-The Metric feature is optional. If you don't want to use it, simply don't set up a MetricPipeline.
-
-## Prerequisites
-
-- Before you can collect metrics data from a component, it must expose (or instrument) the metrics. Typically, it instruments specific metrics for the used language runtime (like Node.js) and custom metrics specific to the business logic. Also, the exposure can be in different formats, like the pull-based Prometheus format or the [push-based OTLP format](https://opentelemetry.io/docs/specs/otlp/).
-
-- If you want to use Prometheus-based metrics, you must have instrumented your application using a library like the [Prometheus client library](https://prometheus.io/docs/instrumenting/clientlibs/), with a port in your workload exposed serving as a Prometheus metrics endpoint.
-
-- For the instrumentation, you typically use an SDK, namely the [Prometheus client libraries](https://prometheus.io/docs/instrumenting/clientlibs/) or the [Open Telemetry SDKs](https://opentelemetry.io/docs/instrumentation/). Both libraries provide extensions to activate language-specific auto-instrumentation like for Node.js, and an API to implement custom instrumentation.
-
-## Architecture
-
-In the Telemetry module, a central in-cluster Deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acts as a gateway. The gateway exposes endpoints for the [OpenTelemetry Protocol (OTLP)](https://opentelemetry.io/docs/specs/otlp/) for GRPC and HTTP-based communication using the dedicated `telemetry-otlp-metrics` service, to which all Kyma modules and users’ applications send the metrics data.
-
-Optionally, the Telemetry module provides a DaemonSet of an OTel Collector acting as an agent. This agent can pull metrics of a workload and the Istio sidecar in the [Prometheus pull-based format](https://prometheus.io/docs/instrumenting/exposition_formats) and can provide runtime-specific metrics for the workload.
-
-
-
-1. An application (exposing metrics in OTLP) sends metrics to the central metric gateway service.
-2. An application (exposing metrics in Prometheus protocol) activates the agent to scrape the metrics with an annotation-based configuration.
-3. Additionally, you can activate the agent to pull metrics of each Istio sidecar.
-4. The agent supports collecting metrics from the Kubelet and Kubernetes APIServer.
-5. The agent converts and sends all collected metric data to the gateway in OTLP.
-6. The gateway discovers the metadata and enriches all received data with typical metadata of the source by communicating with the Kubernetes APIServer. Furthermore, it filters data according to the pipeline configuration.
-7. Telemetry Manager configures the agent and gateway according to the `MetricPipeline` resource specification, including the target backend for the metric gateway. Also, it observes the metrics flow to the backend and reports problems in the MetricPipeline status.
-8. The metric gateway sends the data to the observability system that's specified in your `MetricPipeline` resource - either within the Kyma cluster, or, if authentication is set up, to an external observability backend.
-9. You can analyze the metric data with your preferred backend system.
-
-### Telemetry Manager
-
-The MetricPipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.
-
-
-
-1. Telemetry Manager watches all MetricPipeline resources and related Secrets.
-2. Furthermore, Telemetry Manager takes care of the full lifecycle of the gateway Deployment and the agent DaemonSet. Only if you defined a MetricPipeline, the gateway and agent are deployed.
-3. Whenever the user configuration changes, Telemetry Manager validates it and generates a single configuration for the gateway and agent.
-4. Referenced Secrets are copied into one Secret that is mounted to the gateway as well.
-
-### Metric Gateway
-
-In a Kyma cluster, the metric gateway is the central component to which all components can send their individual metrics. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Telemetry Gateways](./gateways.md).
-
-### Metric Agent
-
-If a MetricPipeline configures a feature in the `input` section, an additional DaemonSet is deployed acting as an agent. The agent is also based on an [OTel Collector](https://opentelemetry.io/docs/collector/) and encompasses the collection and conversion of Prometheus-based metrics. Hereby, the workload puts a `prometheus.io/scrape` annotation on the specification of the Pod or service, and the agent collects it. The agent sends all data in OTLP to the central gateway.
-
-## Setting up a MetricPipeline
-
-In the following steps, you can see how to construct and deploy a typical MetricPipeline. Learn more about the available [parameters and attributes](resources/05-metricpipeline.md).
-
-### 1. Create a MetricPipeline
-
-To ship metrics to a new OTLP output, create a resource of the kind `MetricPipeline` and save the file (named, for example, `metricpipeline.yaml`).
-
-This configures the underlying OTel Collector with a pipeline for metrics and opens a push endpoint that is accessible with the `telemetry-otlp-metrics` service. For details, see [Gateway Usage](./gateways.md#usage).
-
-The following push URLs are set up:
-
-- GRPC: `http://telemetry-otlp-metrics.kyma-system:4317`
-- HTTP: `http://telemetry-otlp-metrics.kyma-system:4318`
-
-The default protocol for shipping the data to a backend is GRPC, but you can choose HTTP instead. Depending on the configured protocol, an `otlp` or an `otlphttp` exporter is used. Ensure that the correct port is configured as part of the endpoint.
-
-
-
-#### **GRPC**
-
-For GRPC, use:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-#### **HTTP**
-
-For HTTP, use the `protocol` attribute:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- protocol: http
- endpoint:
- value: https://backend.example.com:4318
-```
-
-
-
-### 2a. Add Authentication Details From Plain Text
-
-To integrate with external systems, you must configure authentication details. You can use mutual TLS (mTLS), Basic Authentication, or custom headers:
-
-
-
-#### **mTLS**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- tls:
- cert:
- value: |
- -----BEGIN CERTIFICATE-----
- ...
- key:
- value: |
- -----BEGIN RSA PRIVATE KEY-----
- ...
-```
-
-#### **Basic Authentication**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- authentication:
- basic:
- user:
- value: myUser
- password:
- value: myPwd
-```
-
-#### **Token-based authentication with custom headers**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- headers:
- - name: Authorization
- prefix: Bearer
- value: "myToken"
-```
-
-
-### 2b. Add Authentication Details From Secrets
-
-Integrations into external systems usually need authentication details dealing with sensitive data. To handle that data properly in Secrets, MetricsPipeline supports the reference of Secrets.
-
-Using the **valueFrom** attribute, you can map Secret keys for mutual TLS (mTLS), Basic Authentication, or with custom headers.
-
-You can store the value of the token in the referenced Secret without any prefix or scheme, and you can configure it in the headers section of the MetricPipeline. In this example, the token has the prefix “Bearer”.
-
-
-
-#### **mTLS**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com/otlp:4317
- tls:
- cert:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: cert
- key:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: key
-```
-
-#### **Basic Authentication**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: endpoint
- authentication:
- basic:
- user:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: user
- password:
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: password
-```
-
-#### **Token-based authentication with custom headers**
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- headers:
- - name: Authorization
- prefix: Bearer
- valueFrom:
- secretKeyRef:
- name: backend
- namespace: default
- key: token
-```
-
-
-
-The related Secret must have the referenced name, be located in the referenced namespace, and contain the mapped key. See the following example:
-
-```yaml
-kind: Secret
-apiVersion: v1
-metadata:
- name: backend
- namespace: default
-stringData:
- endpoint: https://backend.example.com:4317
- user: myUser
- password: XXX
- token: YYY
-```
-
-### 3. Rotate the Secret
-
-Telemetry Manager continuously watches the Secret referenced with the **secretKeyRef** construct. You can update the Secret’s values, and Telemetry Manager detects the changes and applies the new Secret to the setup.
-
-> [!TIP]
-> If you use a Secret owned by the [SAP BTP Service Operator](https://github.com/SAP/sap-btp-service-operator), you can configure an automated rotation using a `credentialsRotationPolicy` with a specific `rotationFrequency` and don’t have to intervene manually.
-
-### 4. Activate Prometheus-Based Metrics
-
-> [!NOTE]
-> For the following approach, you must have instrumented your application using a library like the [Prometheus client library](https://prometheus.io/docs/instrumenting/clientlibs/), with a port in your workload exposed serving as a Prometheus metrics endpoint.
-
-To enable collection of Prometheus-based metrics, define a MetricPipeline that has the `prometheus` section enabled as input:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- prometheus:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-The Metric agent is configured with a generic scrape configuration, which uses annotations to specify the endpoints to scrape in the cluster.
-
-For metrics ingestion to start automatically, use the annotations of the following table.
-If an Istio sidecar is present, apply them to a Service that resolves your metrics port.
-By annotating the Service, all endpoints targeted by the Service are resolved and scraped by the Metric agent bypassing the Service itself.
-Only if Istio sidecar is not present, you can alternatively apply the annotations directly to the Pod.
-
-| Annotation Key | Example Values | Default Value | Description |
-|------------------------------------------------------------------|-------------------|-------------- |---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| `prometheus.io/scrape` (mandatory) | `true`, `false` | none | Controls whether Prometheus Receiver automatically scrapes metrics from this target. |
-| `prometheus.io/port` (mandatory) | `8080`, `9100` | none | Specifies the port where the metrics are exposed. |
-| `prometheus.io/path` | `/metrics`, `/custom_metrics` | `/metrics` | Defines the HTTP path where Prometheus Receiver can find metrics data. |
-| `prometheus.io/scheme` (only relevant when annotating a Service) | `http`, `https` | If Istio is active, `https` is supported; otherwise, only `http` is available. The default scheme is `http` unless an Istio sidecar is present, denoted by the label `security.istio.io/tlsMode=istio`, in which case `https` becomes the default. | Determines the protocol used for scraping metrics — either HTTPS with mTLS or plain HTTP. |
-| `prometheus.io/param_: ` | `prometheus.io/param_format: prometheus` | none | Instructs Prometheus Receiver to pass name-value pairs as URL parameters when calling the metrics endpoint. |
-
-If you're running the Pod targeted by a Service with Istio, Istio must be able to derive the [appProtocol](https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol) from the Service port definition; otherwise the communication for scraping the metric endpoint cannot be established. You must either prefix the port name with the protocol like in `http-metrics`, or explicitly define the `appProtocol` attribute.
-
-For example, see the following `Service` configuration:
-
-```yaml
-apiVersion: v1
-kind: Service
-metadata:
- annotations:
- prometheus.io/port: "8080"
- prometheus.io/scrape: "true"
- name: sample
-spec:
- ports:
- - name: http-metrics
- appProtocol: http
- port: 8080
- protocol: TCP
- targetPort: 8080
- selector:
- app: sample
- type: ClusterIP
-```
-
-> [!NOTE]
-> The Metric agent can scrape endpoints even if the workload is a part of the Istio service mesh and accepts mTLS communication. However, there's a constraint: For scraping through HTTPS, Istio must configure the workload using 'STRICT' mTLS mode. Without 'STRICT' mTLS mode, you can set up scraping through HTTP by applying the annotation `prometheus.io/scheme=http`. For related troubleshooting, see [Log Entry: Failed to Scrape Prometheus Endpoint](#log-entry-failed-to-scrape-prometheus-endpoint).
-
-### 5. Monitor Pipeline Health
-
-By default, a MetricPipeline emits metrics about the health of all pipelines managed by the Telemetry module. Based on these metrics, you can track the status of every individual pipeline and set up alerting for it.
-
-Metrics for Pipelines and the Telemetry Module:
-
-| Metric | Description | Availability |
-|---------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------|
-| kyma.resource.status.conditions | Value represents status of different conditions reported by the resource. Possible values are 1 ("True"), 0 ("False"), and -1 (other status values) | Available for both, the pipelines and the Telemetry resource |
-| kyma.resource.status.state | Value represents the state of the resource (if present) | Available for the Telemetry resource |
-
-Metric Attributes for Monitoring:
-
-| Name | Description |
-|--------------------------|----------------------------------------------------------------------------------------------|
-| metric.attributes.Type | Type of the condition |
-| metric.attributes.status | Status of the condition |
-| metric.attributes.reason | Contains a programmatic identifier indicating the reason for the condition's last transition |
-
-To set up alerting, use an alert rule. In the following example, the alert is triggered if metrics are not delivered to the backend:
-
-```text
- min by (k8s_resource_name) ((kyma_resource_status_conditions{type="TelemetryFlowHealthy",k8s_resource_kind="metricpipelines"})) == 0
-```
-
-### 6. Activate Runtime Metrics
-
-To enable collection of runtime metrics, define a MetricPipeline that has the `runtime` section enabled as input:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- runtime:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-By default, metrics for all resources (Pod, container, Node, Volume, DaemonSet, Deployment, StatefulSet, and Job) are collected.
-To enable or disable the collection of metrics for a specific resource, use the `resources` section in the `runtime` input.
-
-The following example collects only DaemonSet, Deployment, StatefulSet, and Job metrics:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: MetricPipeline
- metadata:
- name: backend
- spec:
- input:
- runtime:
- enabled: true
- resources:
- pod:
- enabled: false
- container:
- enabled: false
- node:
- enabled: false
- volume:
- enabled: false
- daemonset:
- enabled: true
- deployment:
- enabled: true
- statefulset:
- enabled: true
- job:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- ```
-
-If Pod metrics are enabled, the following metrics are collected:
-
-- From the [kubletstatsreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver):
- - `k8s.pod.cpu.capacity`
- - `k8s.pod.cpu.usage`
- - `k8s.pod.filesystem.available`
- - `k8s.pod.filesystem.capacity`
- - `k8s.pod.filesystem.usage`
- - `k8s.pod.memory.available`
- - `k8s.pod.memory.major_page_faults`
- - `k8s.pod.memory.page_faults`
- - `k8s.pod.memory.rss`
- - `k8s.pod.memory.usage`
- - `k8s.pod.memory.working_set`
- - `k8s.pod.network.errors`
- - `k8s.pod.network.io`
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.pod.phase`
-
-If container metrics are enabled, the following metrics are collected:
-
-- From the [kubletstatsreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver):
- - `container.cpu.time`
- - `container.cpu.usage`
- - `container.filesystem.available`
- - `container.filesystem.capacity`
- - `container.filesystem.usage`
- - `container.memory.available`
- - `container.memory.major_page_faults`
- - `container.memory.page_faults`
- - `container.memory.rss`
- - `container.memory.usage`
- - `container.memory.working_set`
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.container.cpu_request`
- - `k8s.container.cpu_limit`
- - `k8s.container.memory_request`
- - `k8s.container.memory_limit`
- - `k8s.container.restarts`
-
-If Node metrics are enabled, the following metrics are collected:
-
-- From the [kubletstatsreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver):
- - `k8s.node.cpu.usage`
- - `k8s.node.filesystem.available`
- - `k8s.node.filesystem.capacity`
- - `k8s.node.filesystem.usage`
- - `k8s.node.memory.available`
- - `k8s.node.memory.usage`
- - `k8s.node.network.errors`,
- - `k8s.node.network.io`,
- - `k8s.node.memory.rss`
- - `k8s.node.memory.working_set`
-
-If Volume metrics are enabled, the following metrics are collected:
-
-- From the [kubletstatsreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/kubeletstatsreceiver):
- - `k8s.volume.available`
- - `k8s.volume.capacity`
- - `k8s.volume.inodes`
- - `k8s.volume.inodes.free`
- - `k8s.volume.inodes.used`
-
-If Deployment metrics are enabled, the following metrics are collected:
-
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.deployment.available`
- - `k8s.deployment.desired`
-
-If DaemonSet metrics are enabled, the following metrics are collected:
-
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.daemonset.current_scheduled_nodes`
- - `k8s.daemonset.desired_scheduled_nodes`
- - `k8s.daemonset.misscheduled_nodes`
- - `k8s.daemonset.ready_nodes`
-
-If StatefulSet metrics are enabled, the following metrics are collected:
-
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.statefulset.current_pods`
- - `k8s.statefulset.desired_pods`
- - `k8s.statefulset.ready_pods`
- - `k8s.statefulset.updated_pods`
-
-If Job metrics are enabled, the following metrics are collected:
-
-- From the [k8sclusterreceiver](https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/receiver/k8sclusterreceiver):
- - `k8s.job.active_pods`
- - `k8s.job.desired_successful_pods`
- - `k8s.job.failed_pods`
- - `k8s.job.max_parallel_pods`
- - `k8s.job.successful_pods`
-
-### 7. Activate Istio Metrics
-
-To enable collection of Istio metrics, define a MetricPipeline that has the `istio` section enabled as input:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- istio:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-With this, the agent starts collecting all Istio metrics from Istio sidecars.
-
-If you are using the `istio` input, you can also collect Envoy metrics. Envoy metrics provide insights into the performance and behavior of the Envoy proxy, such as request rates, latencies, and error counts. These metrics are useful for observability and troubleshooting service mesh traffic.
-
-For details, see the list of available [Envoy metrics](https://www.envoyproxy.io/docs/envoy/latest/configuration/upstream/cluster_manager/cluster_stats) and [server metrics](https://www.envoyproxy.io/docs/envoy/latest/configuration/observability/statistics).
-
-> [!NOTE]
-> Envoy metrics are only available for the `istio` input. Ensure that Istio sidecars are correctly injected into your workloads for Envoy metrics to be available.
-
-By default, Envoy metrics collection is disabled.
-
-To activate Envoy metrics, enable the `envoyMetrics` section in the MetricPipeline specification under the `istio` input:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: MetricPipeline
- metadata:
- name: envoy-metrics
- spec:
- input:
- istio:
- enabled: true
- envoyMetrics:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- ```
-
-### 8. Deactivate OTLP Metrics
-
-By default, `otlp` input is enabled.
-
-To drop the push-based OTLP metrics that are received by the metric gateway, define a MetricPipeline that has the `otlp` section disabled as an input:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- istio:
- enabled: true
- otlp:
- disabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-With this, the agent starts collecting all Istio metrics from Istio sidecars, and the push-based OTLP metrics are dropped.
-
-### 9. Add Filters
-
-To filter metrics by namespaces, define a MetricPipeline that has the `namespaces` section defined in one of the inputs. For example, you can specify the namespaces from which metrics are collected or the namespaces from which metrics are dropped. Learn more about the available [parameters and attributes](resources/05-metricpipeline.md).
-
-The following example collects runtime metrics **only** from the `foo` and `bar` namespaces:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- runtime:
- enabled: true
- namespaces:
- include:
- - foo
- - bar
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-
-The following example collects runtime metrics from all namespaces **except** the `foo` and `bar` namespaces:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- runtime:
- enabled: true
- namespaces:
- exclude:
- - foo
- - bar
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-The following example collects metrics from all namespaces including system namespaces:
-
-```yaml
-apiVersion: telemetry.kyma-project.io/v1alpha1
-kind: MetricPipeline
-metadata:
- name: backend
-spec:
- input:
- istio:
- enabled: true
- namespaces: {}
- prometheus:
- enabled: true
- namespaces: {}
- runtime:
- enabled: true
- namespaces: {}
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
-```
-To collect metrics with specific system namespaces, add them to the `include` section of namespaces.
-
-> [!NOTE]
-> The default settings depend on the input:
->
-> If no namespace selector is defined for the `prometheus` or `runtime` input, then metrics from system namespaces are excluded by default.
->
-> However, if the namespace selector is not defined for the `istio` and `otlp` input, then metrics from system namespaces are included by default.
-
-### 10. Activate Diagnostic Metrics
-
-If you use the `prometheus` or `istio` input, for every metric source typical scrape metrics are produced, such as `up`, `scrape_duration_seconds`, `scrape_samples_scraped`, `scrape_samples_post_metric_relabeling`, and `scrape_series_added`.
-
-By default, they are disabled.
-
-If you want to use them for debugging and diagnostic purposes, you can activate them. To activate diagnostic metrics, define a MetricPipeline that has the `diagnosticMetrics` section defined.
-
-- The following example collects diagnostic metrics **only** for input `istio`:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: MetricPipeline
- metadata:
- name: backend
- spec:
- input:
- istio:
- enabled: true
- diagnosticMetrics:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- ```
-
-- The following example collects diagnostic metrics **only** for input `prometheus`:
-
- ```yaml
- apiVersion: telemetry.kyma-project.io/v1alpha1
- kind: MetricPipeline
- metadata:
- name: backend
- spec:
- input:
- prometheus:
- enabled: true
- diagnosticMetrics:
- enabled: true
- output:
- otlp:
- endpoint:
- value: https://backend.example.com:4317
- ```
-
-> [!NOTE]
-> Diagnostic metrics are only available for inputs `prometheus` and `istio`. Learn more about the available [parameters and attributes](resources/05-metricpipeline.md).
-
-### 11. Deploy the Pipeline
-
-To activate the MetricPipeline, apply the `metricpipeline.yaml` resource file in your cluster:
-
-```bash
-kubectl apply -f metricpipeline.yaml
-```
-
-### Result
-
-You activated a MetricPipeline and metrics start streaming to your backend.
-
-To check that the pipeline is running, wait until the status conditions of the MetricPipeline in your cluster have status `True`:
-
-```bash
-kubectl get metricpipeline
-NAME CONFIGURATION GENERATED GATEWAY HEALTHY AGENT HEALTHY FLOW HEALTHY
-backend True True True True
-```
-
-## Operations
-
-A MetricPipeline runs several OTel Collector instances in your cluster. This Deployment serves OTLP endpoints and ships received data to the configured backend.
-
-The Telemetry module ensures that the OTel Collector instances are operational and healthy at any time, for example, with buffering and retries. However, there may be situations when the instances drop metrics, or cannot handle the metric load.
-
-To detect and fix such situations, check the [pipeline status](./resources/05-metricpipeline.md#metricpipeline-status) and check out [Troubleshooting](#troubleshooting). If you have set up [pipeline health monitoring](./04-metrics.md#5-monitor-pipeline-health), check the alerts and reports in an integrated backend like [SAP Cloud Logging](./integration/sap-cloud-logging/README.md#use-sap-cloud-logging-alerts).
-
-> [! WARNING]
-> It's not recommended to access the metrics endpoint of the used OTel Collector instances directly, because the exposed metrics are no official API of the Kyma Telemetry module. Breaking changes can happen if the underlying OTel Collector version introduces such.
-> Instead, use the [pipeline status](./resources/05-metricpipeline.md#metricpipeline-status).
-
-## Limitations
-
-- **Throughput**: Assuming an average metric with 20 metric data points and 10 labels, the default metric **gateway** setup has a maximum throughput of 34K metric data points/sec. If more data is sent to the gateway, it is refused. To increase the maximum throughput, manually scale out the gateway by increasing the number of replicas for the metric gateway. See [Module Configuration and Status](https://kyma-project.io/#/telemetry-manager/user/01-manager?id=module-configuration).
- The metric **agent** setup has a maximum throughput of 14K metric data points/sec per instance. If more data must be ingested, it is refused. If a metric data endpoint emits more than 50.000 metric data points per scrape loop, the metric agent refuses all the data.
-- **Load Balancing With Istio**: To ensure availability, the metric gateway runs with multiple instances. If you want to increase the maximum throughput, use manual scaling and enter a higher number of instances.
- By design, the connections to the gateway are long-living connections (because OTLP is based on gRPC and HTTP/2). For optimal scaling of the gateway, the clients or applications must balance the connections across the available instances, which is automatically achieved if you use an Istio sidecar. If your application has no Istio sidecar, the data is always sent to one instance of the gateway.
-- **Unavailability of Output**: For up to 5 minutes, a retry for data is attempted when the destination is unavailable. After that, data is dropped.
-- **No Guaranteed Delivery**: The used buffers are volatile. If the gateway or agent instances crash, metric data can be lost.
-- **Multiple MetricPipeline Support**: The maximum amount of MetricPipeline resources is 5.
-
-## Troubleshooting
-
-### No Metrics Arrive at the Backend
-
-**Symptom**:
-
-- No metrics arrive at the backend.
-- In the MetricPipeline status, the `TelemetryFlowHealthy` condition has status **GatewayAllTelemetryDataDropped** or **AgentAllTelemetryDataDropped**.
-
-**Cause**: Incorrect backend endpoint configuration (such as using the wrong authentication credentials) or the backend is unreachable.
-
-**Solution**:
-
-1. Check the error logs for the affected Pod by calling `kubectl logs -n kyma-system {POD_NAME}`
- - For **GatewayAllTelemetryDataDropped**, check pods `telemetry-metric-gateway`.
- - For **AgentAllTelemetryDataDropped**, check pods `telemetry-metric-agent`.
-2. Check if the backend is up and reachable.
-3. Fix the errors.
-
-### Not All Metrics Arrive at the Backend
-
-**Symptom**:
-
-- The backend is reachable and the connection is properly configured, but some metrics are refused.
-- In the MetricPipeline status, the `TelemetryFlowHealthy` condition has status **GatewaySomeTelemetryDataDropped** or **AgentSomeTelemetryDataDropped**.
-
-**Cause**: It can happen due to a variety of reasons - for example, the backend is limiting the ingestion rate.
-
-**Solution**:
-
-1. Check the `telemetry-metric-gateway` Pods for error logs by calling `kubectl logs -n kyma-system {POD_NAME}`. Also, check your observability backend to investigate potential causes.
-2. If the backend is limiting the rate by refusing metrics, try the following options:
- - Option 1: Increase maximum backend ingestion rate. For example, by scaling out the SAP Cloud Logging instances.
- - Option 2: Reduce emitted metrics by re-configuring the MetricPipeline. For example, by disabling certain inputs or applying namespace filters.
- - Option 3: Reduce emitted metrics in your applications.
-3. Otherwise, take the actions appropriate to the cause indicated in the logs.
-
-### Only Istio Metrics Arrive at the Backend
-
-**Symptom**: Custom metrics don't arrive at the backend, but Istio metrics do.
-
-**Cause**: Your SDK version is incompatible with the OTel Collector version.
-
-**Solution**:
-
-1. Check which SDK version you are using for instrumentation.
-2. Investigate whether it is compatible with the OTel Collector version.
-3. If required, upgrade to a supported SDK version.
-
-### Log Entry: Failed to Scrape Prometheus Endpoint
-
-**Symptom**: Custom metrics don't arrive at the destination. The OTel Collector produces log entries saying "Failed to scrape Prometheus endpoint", such as the following example:
-
-```bash
-2023-08-29T09:53:07.123Z warn internal/transaction.go:111 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "prometheus/app-pods", "data_type": "metrics", "scrape_timestamp": 1693302787120, "target_labels": "{__name__=\"up\", instance=\"10.42.0.18:8080\", job=\"app-pods\"}"}
-```
-
-**Cause 1**: The workload is not configured to use 'STRICT' mTLS mode. For details, see [Activate Prometheus-Based Metrics](#_4-activate-prometheus-based-metrics).
-
-**Solution 1**: You can either set up 'STRICT' mTLS mode or HTTP scraping:
-
-- Configure the workload using “STRICT” mTLS mode (for example, by applying a corresponding PeerAuthentication).
-- Set up scraping through HTTP by applying the `prometheus.io/scheme=http` annotation.
-
-**Cause 2**: The Service definition enabling the scrape with Prometheus annotations does not reveal the application protocol to use in the port definition. For details, see [Activate Prometheus-Based Metrics](#_4-activate-prometheus-based-metrics).
-
-**Solution 2**: Define the application protocol in the Service port definition by either prefixing the port name with the protocol, like in `http-metrics` or define the `appProtocol` attribute.
-
-**Cause 3**: A deny-all `NetworkPolicy` was created in the workload namespace, which prevents that the agent can scrape metrics from annotated workloads.
-
-**Solution 3**: Create a separate `NetworkPolicy` to explicitly let the agent scrape your workload using the `telemetry.kyma-project.io/metric-scrape` label.
-
-For example, see the following `NetworkPolicy` configuration:
-
-```yaml
-apiVersion: networking.k8s.io/v1
-kind: NetworkPolicy
-metadata:
- name: allow-traffic-from-agent
-spec:
- podSelector:
- matchLabels:
- app.kubernetes.io/name: "annotated-workload" #
- ingress:
- - from:
- - namespaceSelector:
- matchLabels:
- kubernetes.io/metadata.name: kyma-system
- podSelector:
- matchLabels:
- telemetry.kyma-project.io/metric-scrape: "true"
- policyTypes:
- - Ingress
-```
-
-### Gateway Throttling
-
-**Symptom**: In the MetricPipeline status, the `TelemetryFlowHealthy` condition has status **GatewayThrottling**.
-
-**Cause**: Gateway cannot receive metrics at the given rate.
-
-**Solution**: Manually scale out the gateway by increasing the number of replicas for the metric gateway. See [Module Configuration and Status](https://kyma-project.io/#/telemetry-manager/user/01-manager?id=module-configuration).
+This content moved to [Collecting Metrics](./collecting-metrics/README.md).
diff --git a/docs/user/README.md b/docs/user/README.md
index 4178e51e8c..a21262ed6c 100644
--- a/docs/user/README.md
+++ b/docs/user/README.md
@@ -1,97 +1,82 @@
# Telemetry Module
-Learn more about the Telemetry Module. Use it to enable observability for your application.
+Use the Telemetry module to collect telemetry signals (logs, traces, and metrics) from your applications and send them to your preferred observability backend.
## What Is Telemetry?
-Fundamentally, "Observability" is a measure of how well the application's external outputs can reflect the internal states of single components. The insights that an application and the surrounding infrastructure expose are displayed in the form of metrics, traces, and logs - collectively, that's called "telemetry" or ["signals"](https://opentelemetry.io/docs/concepts/signals/). These can be exposed by employing modern instrumentation.
+With telemetry signals, you can understand the behavior and health of your applications and infrastructure. The Telemetry module provides a standardized way to collect these signals and send them to your observability backend, where you can analyze them and troubleshoot issues.
+
+The Telemetry module processes three types of signals:
+
+- Logs: Time-stamped records of events that happen over time.
+- Traces: The path of a request as it travels through your application's components.
+- Metrics: Aggregated numerical data about the performance or state of a component over time.

-1. In order to implement Day-2 operations for a distributed application running in a container runtime, the single components of an application must expose these signals by employing modern instrumentation.
-2. Furthermore, the signals must be collected and enriched with the infrastructural metadata in order to ship them to a target system.
-3. Instead of providing a one-size-for-all backend solution, the Telemetry module supports you with instrumenting and shipping your telemetry data in a vendor-neutral way.
-4. This way, you can conveniently enable observability for your application by integrating it into your existing or desired backends. Pick your favorite among many observability backends (available either as a service or as a self-manageable solution) that focus on different aspects and scenarios.
+Telemetry signals flow through the following stages:
+
+1. You instrument your application so that its components expose telemetry signals.
+2. The signals are collected and enriched with infrastructural metadata.
+3. You send the enriched signals to your preferred observability backend.
+4. The backend stores your data so that you can analyze and visualize it.
-The Telemetry module focuses exactly on the aspects of instrumentation, collection, and shipment that happen in the runtime and explicitly defocuses on backends.
+The Telemetry module focuses on the collection, processing, and shipment stages of the observability workflow. It offers a vendor-neutral approach based on [OpenTelemetry](https://opentelemetry.io/) and doesn't force you into a specific backend. This means you can integrate with your existing observability platforms or choose from a wide range of available backends that best suit your operational needs.
> [!TIP]
-> An enterprise-grade setup demands a central solution outside the cluster, so we recommend in-cluster solutions only for testing purposes. If you want to install lightweight in-cluster backends for demo or development purposes, see [Integration Guides](#integration-guides).
+> Build your first telemetry pipeline with the hands-on lesson [Collecting Application Logs and Shipping them to SAP Cloud Logging](https://learning.sap.com/learning-journeys/developing-applications-in-sap-btp-kyma-runtime/collecting-application-logs-and-shipping-to-sap-cloud-logging).
## Features
To support telemetry for your applications, the Telemetry module provides the following features:
-- Tooling for collection, filtering, and shipment: Based on the [Open Telemetry Collector](https://opentelemetry.io/docs/collector/), you can configure basic pipelines to filter and ship telemetry data.
-- Integration in a vendor-neutral way to a vendor-specific observability system: Based on the [OpenTelemetry protocol (OTLP)](https://opentelemetry.io/docs/reference/specification/protocol/), you can integrate backend systems.
-- Guidance for the instrumentation: Based on [Open Telemetry](https://opentelemetry.io/), you get community samples on how to instrument your code using the [Open Telemetry SDKs](https://opentelemetry.io/docs/instrumentation/) in nearly every programming language.
-- Enriching telemetry data by automatically adding common attributes. This is done in compliance with established semantic conventions, ensuring that the enriched data adheres to industry best practices and is more meaningful for analysis. For details, see [Data Enrichment](gateways.md#data-enrichment).
-- Opt-out of features for advanced scenarios: At any time, you can opt out for each data type, and use custom tooling to collect and ship the telemetry data.
-- SAP BTP as first-class integration: Integration into SAP BTP Observability services, such as SAP Cloud Logging, is prioritized. For more information, see [Integrate with SAP Cloud Logging](integration/sap-cloud-logging/README.md).
-
-## Scope
-
-The Telemetry module focuses only on the signals of application logs, distributed traces, and metrics. Other kinds of signals are not considered. Also, audit logs are not in scope.
+- **Consistent Telemetry Pipeline API**: Use a streamlined set of APIs based on the [OTel Collector](https://opentelemetry.io/docs/collector/) to collect, filter, and ship your logs, metrics, and traces (see [Telemetry Pipeline API](pipelines.md)). You define a pipeline for each signal type to control how the data is processed and where it's sent. For details, see [Collecting Logs](./collecting-logs/README.md), [Collecting Traces](./collecting-traces/README.md), and [Collecting Metrics](./collecting-metrics/README.md).
-Supported integration scenarios are neutral to the vendor of the target system.
-
-## Architecture
+- **Flexible Backend Integration**: The Telemetry module is optimized for integration with SAP BTP observability services, such as SAP Cloud Logging. You can also send data to any backend that supports the [OpenTelemetry protocol (OTLP)](https://opentelemetry.io/docs/specs/otel/protocol/), giving you the freedom to choose your preferred solution (see [Integrate With Your OTLP Backend](./integrate-otlp-backend/)).
-
+ > [!TIP]
+ > For production deployments, we recommend using a central telemetry solution located outside your cluster. For an example with SAP Cloud Logging, see [Integrate With SAP Cloud Logging](./integration/sap-cloud-logging/README.md).
+ >
+ > For testing or development, in-cluster solutions may be suitable. For examples such as Dynatrace (or to learn how to collect data from applications based on the OpenTelemetry Demo App), see [Integration Guides](https://kyma-project.io/#/telemetry-manager/user/integration/README).
-### Telemetry Manager
+- **Seamless Istio Integration**: The Telemetry module seamlessly integrates with the Istio module when both are present in your cluster. For details, see [Istio Integration](./architecture/istio-integration.md).
-The Telemetry module ships Telemetry Manager as its core component. Telemetry Manager is a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that implements the Kubernetes controller pattern and manages the whole lifecycle of all other components covered in the Telemetry module. Telemetry Manager watches for the user-created Kubernetes resources: LogPipeline, TracePipeline, and MetricPipeline. In these resources, you specify what data of a signal type to collect and where to ship it.
-If Telemetry Manager detects a configuration, it deploys the related gateway and agent components accordingly and keeps them in sync with the requested pipeline definition.
+- **Automatic Data Enrichment**: The Telemetry module adds resource attributes as metadata, following OTel semantic conventions. This makes your data more consistent, meaningful, and ready for analysis in your observability backend. For details, see [Automatic Data Enrichment](./filter-and-process/automatic-data-enrichment.md).
-For more information, see [Telemetry Manager](01-manager.md).
+- **Instrumentation Guidance**: To generate telemetry data, you must instrument your code. Based on [OpenTelemetry](https://opentelemetry.io/) (OTel), you get community samples on how to do that with the [OpenTelemetry SDKs](https://opentelemetry.io/docs/languages/) in most programming languages.
-### Gateways
+- **Custom Tooling Support**: For advanced scenarios, you can opt out of the module's default collection and shipment mechanisms for individual data types. This enables you to use custom tooling to collect and ship the telemetry data.
-The log, trace, and metrics features provide gateways based on an [OTel Collector](https://opentelemetry.io/docs/collector/) [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/). The gateways act as central endpoints in the cluster to which your applications push data in the [OTLP](https://opentelemetry.io/docs/reference/specification/protocol/) format. From here, the data is enriched and filtered, and then dispatched as configured in your pipeline resources.
-
-For more information, see [Telemetry Gateways](gateways.md).
-
-### Log Gateway and Agent
-
-In addition to the log gateway, you can also use the log agent based on a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/), which collects logs of any container printing logs to `stdout/stderr`. For more information, see [Application Logs (OTLP)](logs.md).
-
-As an alternative to the OTLP-based log feature, you can choose using a log agent based on a [Fluent Bit](https://fluentbit.io/) installation running as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/). It reads all containers' logs in the runtime and ships them according to your LogPipeline configuration. For more information, see [Application Logs (Fluent Bit)](02-logs.md).
-
-### Trace Gateway
+## Scope
-The trace gateway provides an [OTLP-based](https://opentelemetry.io/docs/reference/specification/protocol/) endpoint to which applications can push the trace signals. Kyma modules like Istio or Serverless contribute traces transparently. For more information, see [Traces](03-traces.md).
+The Telemetry module focuses only on the signals of application logs, distributed traces, and metrics. Other kinds of signals are not considered. Also, audit logs are not in scope.
-### Metric Gateway and Agent
+Supported integration scenarios are neutral to the vendor of the target system.
-In addition to the metric gateway, you can also use the metric agent based on a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/), which scrapes annotated Prometheus-based workloads. For more information, see [Metrics](04-metrics.md).
+## Architecture
-## Integration Guides
+The Telemetry module is built around a central controller, Telemetry Manager, which dynamically configures and deploys data collection components based on your pipeline resources.
-To learn about integration with SAP Cloud Logging, read [Integrate with SAP Cloud Logging](./integration/sap-cloud-logging/README.md).
+To understand how the core components interact, see [Architecture](architecture/README.md).
-For integration with other backends, such as Dynatrace, see:
+To learn how this model applies to each signal type, see:
-- [Dynatrace](./integration/dynatrace/README.md)
-- [Prometheus](./integration/prometheus/README.md)
-- [Loki](./integration/loki/README.md)
-- [Jaeger](./integration/jaeger/README.md)
-- [Amazon CloudWatch](./integration/aws-cloudwatch/README.md)
+- [Logs Architecture](./architecture/logs-architecture.md)
+- [Traces Architecture](./architecture/traces-architecture.md)
+- [Metrics Architecture](./architecture/metrics-architecture.md)
-To learn how to collect data from applications based on the OpenTelemetry SDK, see:
+## API/Custom Resource Definitions
-- [OpenTelemetry Demo App](./integration/opentelemetry-demo/README.md)
-- [Sample App](./integration/sample-app/)
+You configure the Telemetry module and its pipelines by creating and applying custom resources. Their APIs are defined as Kubernetes CustomResourceDefinitions (CRDs), which extend the Kubernetes API with custom additions.
-## API / Custom Resource Definitions
+To understand and configure the module's global settings, refer to the [Telemetry CRD](./resources/01-telemetry.md).
-The API of the Telemetry module is based on Kubernetes Custom Resource Definitions (CRD), which extend the Kubernetes API with custom additions. To inspect the specification of the Telemetry module API, see:
+To define how to collect, process, and ship a specific signal, use the pipeline CRDs:
-- [Telemetry CRD](./resources/01-telemetry.md)
- [LogPipeline CRD](./resources/02-logpipeline.md)
- [TracePipeline CRD](./resources/04-tracepipeline.md)
- [MetricPipeline CRD](./resources/05-metricpipeline.md)
-## Resource Usage
+## Resource Consumption
To learn more about the resources used by the Telemetry module, see [Kyma Modules' Sizing](https://help.sap.com/docs/btp/sap-business-technology-platform/kyma-modules-sizing#telemetry).
diff --git a/docs/user/_sidebar.md b/docs/user/_sidebar.md
index b6e625e299..5ea3423c7f 100644
--- a/docs/user/_sidebar.md
+++ b/docs/user/_sidebar.md
@@ -1,12 +1,32 @@
- [Back to Kyma Home](/)
- [Kyma Telemetry Module](/telemetry-manager/user/README.md)
-- [Telemetry Manager](/telemetry-manager/user/01-manager.md)
-- [Gateways](/telemetry-manager/user/gateways.md)
-- [Application Logs (Fluent Bit)](/telemetry-manager/user/02-logs.md)
-- [Application Logs (OTLP)](/telemetry-manager/user/logs.md)
-- [Traces](/telemetry-manager/user/03-traces.md)
-- [Metrics](/telemetry-manager/user/04-metrics.md)
+- [Telemetry Pipeline API](/telemetry-manager/user/pipelines.md)
+- [Set Up the OTLP Input](/telemetry-manager/user/otlp-input.md)
+- [Collecting Logs](/telemetry-manager/user/collecting-logs/README.md)
+ - [Configure Application Logs](/telemetry-manager/user/collecting-logs/application-input.md)
+ - [Configure Istio Access Logs](/telemetry-manager/user/collecting-logs/istio-support.md)
+- [Collecting Traces](/telemetry-manager/user/collecting-traces/README.md)
+ - [Configure Istio Tracing](/telemetry-manager/user/collecting-traces/istio-support.md)
+- [Collecting Metrics](/telemetry-manager/user/collecting-metrics/README.md)
+ - [Collect Prometheus Metrics](/telemetry-manager/user/collecting-metrics/prometheus-input.md)
+ - [Collect Istio Metrics](/telemetry-manager/user/collecting-metrics/istio-input.md)
+ - [Collect Runtime Metrics](/telemetry-manager/user/collecting-metrics/runtime-input.md)
+- [Filtering and Processing Data](/telemetry-manager/user/filter-and-process/README.md)
+ - [Filter Logs](/telemetry-manager/user/filter-and-process/filter-logs.md)
+ - [Filter Traces](/telemetry-manager/user/filter-and-process/filter-traces.md)
+ - [Filter Metrics](/telemetry-manager/user/filter-and-process/filter-metrics.md)
+ - [Transformation to OTLP Logs](/telemetry-manager/user/filter-and-process/transformation-to-otlp-logs.md)
+ - [Automatic Data Enrichment](/telemetry-manager/user/filter-and-process/automatic-data-enrichment.md)
+- [Integrate with your OTLP Backend](/telemetry-manager/user/integrate-otlp-backend/README.md)
+ - [Migrate Your LogPipeline from HTTP to OTLP Logs](/telemetry-manager/user/integrate-otlp-backend/migration-to-otlp-logs.md)
+- [Monitor Pipeline Health](/telemetry-manager/user/monitor-pipeline-health.md)
+- [Troubleshooting the Telemetry Module](/telemetry-manager/user/troubleshooting.md)
+- [Architecture](/telemetry-manager/user/architecture/README.md)
+ - [Logs Architecture](/telemetry-manager/user/architecture/logs-architecture.md)
+ - [Traces Architecture](/telemetry-manager/user/architecture/traces-architecture.md)
+ - [Metrics Architecture](/telemetry-manager/user/architecture/metrics-architecture.md)
+ - [Istio Integration](/telemetry-manager/user/architecture/istio-integration.md)
- [Integration Guides](/telemetry-manager/user/integration/README.md)
- [SAP Cloud Logging](/telemetry-manager/user/integration/sap-cloud-logging/README.md)
- [Dynatrace](/telemetry-manager/user/integration/dynatrace/README.md)
@@ -21,4 +41,5 @@
- [LogPipeline](/telemetry-manager/user/resources/02-logpipeline.md)
- [TracePipeline](/telemetry-manager/user/resources/04-tracepipeline.md)
- [MetricPipeline](/telemetry-manager/user/resources/05-metricpipeline.md)
+- [Logs (Fluent Bit)](/telemetry-manager/user/02-logs.md)
diff --git a/docs/user/_sidebar.ts b/docs/user/_sidebar.ts
index 7eddf96739..79146ae36d 100644
--- a/docs/user/_sidebar.ts
+++ b/docs/user/_sidebar.ts
@@ -1,24 +1,76 @@
export default [
- { text: 'Telemetry Manager', link: './01-manager' },
- { text: 'Gateways', link: './gateways' },
- { text: 'Application Logs (Fluent Bit)', link: './02-logs' },
- { text: 'Application Logs (OTLP)', link: './logs' },
- { text: 'Traces', link: './03-traces' },
- { text: 'Metrics', link: './04-metrics' },
- { text: 'Integration Guides', link: './integration/README', collapsed: true, items: [
- { text: 'SAP Cloud Logging', link: './integration/sap-cloud-logging/README' },
- { text: 'Dynatrace', link: './integration/dynatrace/README' },
- { text: 'Prometheus', link: './integration/prometheus/README' },
- { text: 'Loki', link: './integration/loki/README' },
- { text: 'Jaeger', link: './integration/jaeger/README' },
- { text: 'Amazon CloudWatch', link: './integration/aws-cloudwatch/README' },
- { text: 'OpenTelemetry Demo App', link: './integration/opentelemetry-demo/README' },
- { text: 'Sample App', link: './integration/sample-app/README' }
- ]},
- { text: 'Resources', link: './resources/README', collapsed: true, items: [
- { text: 'Telemetry', link: './resources/01-telemetry' },
- { text: 'LogPipeline', link: './resources/02-logpipeline' },
- { text: 'TracePipeline', link: './resources/04-tracepipeline' },
- { text: 'MetricPipeline', link: './resources/05-metricpipeline' }
- ]}
+ { text: 'Telemetry Pipeline API', link: '/telemetry-manager/user/pipelines.md' },
+ { text: 'Set Up the OTLP Input', link: '/telemetry-manager/user/otlp-input.md' },
+ {
+ text: 'Collecting Logs', link: '/telemetry-manager/user/collecting-logs/README.md', collapsed: true, items: [
+ { text: 'Configure Application Logs', link: '/telemetry-manager/user/collecting-logs/application-input.md' },
+ { text: 'Configure Istio Access Logs', link: '/telemetry-manager/user/collecting-logs/istio-support.md' },
+ ]
+ },
+ {
+ text: 'Collecting Traces', link: '/telemetry-manager/user/collecting-traces/README.md', collapsed: true, items: [
+ { text: 'Configure Istio Tracing', link: '/telemetry-manager/user/collecting-traces/istio-support.md' },
+ ]
+ },
+ {
+ text: 'Collecting Metrics', link: '/telemetry-manager/user/collecting-metrics/README.md', collapsed: true, items: [
+ { text: 'Collect Prometheus Metrics', link: '/telemetry-manager/user/collecting-metrics/prometheus-input.md' },
+ { text: 'Collect Istio Metrics', link: '/telemetry-manager/user/collecting-metrics/istio-input.md' },
+ { text: 'Collect Runtime Metrics', link: '/telemetry-manager/user/collecting-metrics/runtime-input.md' },
+ ]
+ },
+ {
+ text: 'Filtering and Processing Data', link: '/telemetry-manager/user/filter-and-process/README.md', collapsed: true, items: [
+ { text: 'Filter Logs', link: '/telemetry-manager/user/filter-and-process/filter-logs.md' },
+ { text: 'Filter Traces', link: '/telemetry-manager/user/filter-and-process/filter-traces.md' },
+ { text: 'Filter Metrics', link: '/telemetry-manager/user/filter-and-process/filter-metrics.md' },
+ { text: 'Transformation to OTLP Logs', link: '/telemetry-manager/user/filter-and-process/transformation-to-otlp-logs.md' },
+ { text: 'Automatic Data Enrichment', link: '/telemetry-manager/user/filter-and-process/automatic-data-enrichment.md' },
+ ]
+ },
+ {
+ text: 'Integrate with your OTLP Backend', link: '/telemetry-manager/user/integrate-otlp-backend/README.md', collapsed: true, items: [
+ { text: 'Migrate Your LogPipeline from HTTP to OTLP Logs', link: '/telemetry-manager/user/integrate-otlp-backend/migration-to-otlp-logs.md' },
+ ]
+ },
+ { text: 'Monitor Pipeline Health', link: '/telemetry-manager/user/monitor-pipeline-health.md' },
+ { text: 'Troubleshooting the Telemetry Module', link: '/telemetry-manager/user/troubleshooting.md' },
+ {
+ text: 'Architecture',
+ link: '/telemetry-manager/user/architecture/README.md',
+ collapsed: true,
+ items: [
+ { text: 'Logs Architecture', link: '/telemetry-manager/user/architecture/logs-architecture.md' },
+ { text: 'Traces Architecture', link: '/telemetry-manager/user/architecture/traces-architecture.md' },
+ { text: 'Metrics Architecture', link: '/telemetry-manager/user/architecture/metrics-architecture.md' },
+ { text: 'Istio Integration', link: '/telemetry-manager/user/architecture/istio-integration.md' },
+ ]
+ },
+ {
+ text: 'Integration Guides',
+ link: '/telemetry-manager/user/integration/README.md',
+ collapsed: true,
+ items: [
+ { text: 'SAP Cloud Logging', link: '/telemetry-manager/user/integration/sap-cloud-logging/README.md' },
+ { text: 'Dynatrace', link: '/telemetry-manager/user/integration/dynatrace/README.md' },
+ { text: 'Prometheus', link: '/telemetry-manager/user/integration/prometheus/README.md' },
+ { text: 'Loki', link: '/telemetry-manager/user/integration/loki/README.md' },
+ { text: 'Jaeger', link: '/telemetry-manager/user/integration/jaeger/README.md' },
+ { text: 'Amazon CloudWatch', link: '/telemetry-manager/user/integration/aws-cloudwatch/README.md' },
+ { text: 'OpenTelemetry Demo App', link: '/telemetry-manager/user/integration/opentelemetry-demo/README.md' },
+ { text: 'Sample App', link: '/telemetry-manager/user/integration/sample-app/README.md' },
+ ]
+ },
+ {
+ text: 'Resources',
+ link: '/telemetry-manager/user/resources/README.md',
+ collapsed: true,
+ items: [
+ { text: 'Telemetry', link: '/telemetry-manager/user/resources/01-telemetry.md' },
+ { text: 'LogPipeline', link: '/telemetry-manager/user/resources/02-logpipeline.md' },
+ { text: 'TracePipeline', link: '/telemetry-manager/user/resources/04-tracepipeline.md' },
+ { text: 'MetricPipeline', link: '/telemetry-manager/user/resources/05-metricpipeline.md' },
+ ]
+ },
+ { text: 'Logs (Fluent Bit)', link: '/telemetry-manager/user/02-logs.md' }
];
diff --git a/docs/user/architecture/README.md b/docs/user/architecture/README.md
new file mode 100644
index 0000000000..cd63bed54d
--- /dev/null
+++ b/docs/user/architecture/README.md
@@ -0,0 +1,55 @@
+# Architecture
+
+The Telemetry module consists of a manager component that continuously watches the user-provided pipeline resources and deploys the respective OTel Collectors. Learn more about the architecture and how the components interact.
+
+## Overview
+
+The Telemetry API provides a robust, pre-configured OpenTelemetry (OTel) Collector setup that abstracts its underlying complexities. This approach delivers several key benefits:
+
+- Compatibility: Maintains stability and functionality even as underlying OTel Collector features evolve, reducing the need for constant updates on your end.
+- Migratability: Facilitates smooth transitions when you switch underlying technologies or architectures.
+- Native Kubernetes Support: Offers seamless integration with Secrets (for example, served by the SAP BTP Service Operator), and Telemetry Manager automatically handles the full lifecycle of all components.
+- Focus: Reduces the need to understand intricate underlying OTel Collector concepts, allowing you to focus on your application development.
+
+
+
+## Telemetry Manager
+
+Telemetry Manager, the core component of the module, is a Kubernetes [operator](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/) that implements the Kubernetes controller pattern and manages the whole lifecycle of all other Telemetry components. It performs the following tasks:
+
+1. Watch the module configuration for changes and sync the module status to it.
+2. Watch the user-created Kubernetes resources LogPipeline, TracePipeline, and MetricPipeline. In these resources, you specify what data of a signal type to collect and where to ship it.
+3. Manage the lifecycle of the self monitor and the user-configured agents and gateways.
+ For example, the log gateway is deployed only if you define a LogPipeline resource.
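+
+For instance, applying a MetricPipeline like the following one causes Telemetry Manager to deploy the metric gateway and, because the `runtime` input is enabled, the metric agent. The sketch reuses the `telemetry.kyma-project.io/v1alpha1` API group and the placeholder endpoint `backend.example.com` used in the pipeline examples of this documentation:
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: MetricPipeline
+metadata:
+  name: backend
+spec:
+  input:
+    runtime:
+      enabled: true
+  output:
+    otlp:
+      endpoint:
+        value: https://backend.example.com:4317
+```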
+
+
+
+## Gateways and Agents
+
+Gateways and agents handle the incoming telemetry data. The Telemetry Manager deploys them based on your pipeline configuration.
+
+The gateways are based on an [OTel Collector](https://opentelemetry.io/docs/collector/) [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) and act as central endpoints in the cluster to which your applications push data in the OTLP format. From here, the data is enriched and filtered, and then dispatched as configured in your pipeline resources.
+
+Agents run as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) and pull data from the respective Node.
+
+- Log Gateway and Agent
+
+ The log gateway provides a central OTLP endpoint for logs. You can also enable the log agent, which collects logs from the stdout/stderr output of all containers on a Node. For details, see [Logs Architecture](./logs-architecture.md).
+
+ As an alternative to the OTLP-based log feature, you can choose to use a log agent based on a [Fluent Bit](https://fluentbit.io/) installation running as a DaemonSet. It reads all containers’ logs in the runtime and ships them according to your LogPipeline configuration. For details, see [Application Logs (Fluent Bit)](./../02-logs.md).
+
+- Trace Gateway
+
+ The trace gateway provides a central [OTLP](https://opentelemetry.io/docs/specs/otel/protocol/) endpoint to which your applications can push the trace signals. Kyma modules like Istio or Serverless contribute traces transparently. For details, see [Traces Architecture](./traces-architecture.md).
+
+- Metric Gateway and Agent
+
+ The metric gateway provides a central OTLP endpoint for metrics. You can also enable the metric agent, which scrapes Prometheus-annotated workloads on each Node. For details, see [Metrics Architecture](./metrics-architecture.md).
+
+## Self Monitor
+
+The Telemetry module includes a [Prometheus](https://prometheus.io/)-based self-monitor that collects and evaluates health metrics from the gateways and agents. Telemetry Manager uses this data to report the current health status in your pipeline resources.
+
+You can also use these health metrics in your own observability backend to set up alerts and dashboards for your telemetry pipelines. For details, see [Monitor Pipeline Health](./../monitor-pipeline-health.md).
+
+
diff --git a/docs/user/architecture/istio-integration.md b/docs/user/architecture/istio-integration.md
new file mode 100644
index 0000000000..2e473eeff0
--- /dev/null
+++ b/docs/user/architecture/istio-integration.md
@@ -0,0 +1,29 @@
+# Istio Integration
+
+When you have the Istio module in your cluster, the Telemetry module automatically integrates with it. It detects the Istio installation and injects sidecars into the Telemetry components, adding them to the service mesh. This enables secure mTLS communication for your Telemetry pipelines by default.
+
+## Receiving Data from Your Applications
+
+The Telemetry gateways are automatically configured to accept OTLP data from applications both inside and outside the Istio service mesh. To achieve this, the ingestion endpoints of gateways are set to Istio's permissive mode, so they accept mTLS-based communication as well as plain text.
+
+- Applications within the mesh automatically send data to the gateways using mTLS for a secure, encrypted connection.
+- Applications outside the mesh can send data to the gateway using a standard plain text connection.
+
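+For illustration, the following sketch shows how an OTel SDK-instrumented workload could point its exporter at the trace gateway. The Deployment name and image are placeholders, and the OTLP gRPC port `4317` on the `telemetry-otlp-traces` service in the `kyma-system` namespace is an assumption; adjust the endpoint if your SDK uses OTLP/HTTP or if you target a different signal gateway.
+
+```yaml
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: sample-app                   # placeholder workload
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: sample-app
+  template:
+    metadata:
+      labels:
+        app: sample-app
+    spec:
+      containers:
+        - name: sample-app
+          image: sample-app:latest   # placeholder image of an OTel-instrumented app
+          env:
+            # Standard OTel SDK environment variable; inside the mesh, the Istio
+            # sidecar upgrades this plain-text connection to mTLS automatically.
+            - name: OTEL_EXPORTER_OTLP_ENDPOINT
+              value: http://telemetry-otlp-traces.kyma-system:4317
+```
+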
+> [!TIP]
+> Learn more about Istio-specific input configuration for logs, traces, and metrics:
+>
+> - [Configure Istio Access Logs](./../collecting-logs/istio-support.md)
+> - [Configure Istio Tracing](./../collecting-traces/istio-support.md)
+> - [Collect Istio Metrics](./../collecting-metrics/istio-input.md)
+
+
+
+## Sending Data to In-Cluster Backends
+
+Telemetry gateways automatically secure the connection when sending data to your observability backends.
+
+If you're using an in-cluster backend that is part of the Istio mesh, the Telemetry gateways automatically use mTLS to send data to the backend securely. You don't need any special configuration for this.
+
+For sending data to backends outside the cluster, see [Integrate With Your OTLP Backend](./../integrate-otlp-backend/README.md).
+
+
diff --git a/docs/user/architecture/logs-architecture.md b/docs/user/architecture/logs-architecture.md
new file mode 100644
index 0000000000..1169a5f2ce
--- /dev/null
+++ b/docs/user/architecture/logs-architecture.md
@@ -0,0 +1,32 @@
+# Logs Architecture
+
+The Telemetry module provides a central Deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acting as a gateway, and an optional DaemonSet acting as an agent. The gateway exposes endpoints that receive OTLP logs from your applications, while the agent collects container logs from each node. To control their behavior and data destination, you define a LogPipeline.
+
+
+
+1. Application containers print JSON logs to the `stdout/stderr` channel; the Kubernetes container runtime stores them under the `/var/log` directory and its subdirectories on the related Node. Istio is configured to write access logs to `stdout` as well.
+2. If you choose to use the agent, an OTel Collector runs as a [DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/) (one instance per Node), detects any new log files in the folder, and tails and parses them.
+3. An application (exposing logs in OTLP) sends logs to the central log gateway using the `telemetry-otlp-logs` service. Istio is configured to push access logs with OTLP as well.
+4. The gateway and agent discover the metadata and enrich all received data with metadata of the source by communicating with the Kubernetes APIServer. Furthermore, they filter data according to the pipeline configuration.
+5. Telemetry Manager configures the agent and gateway according to the LogPipeline resource specification, including the target backend. Also, it observes the logs flow to the backend and reports problems in the LogPipeline status.
+6. The log agent and gateway send the data to the observability backend that's specified in your LogPipeline resource - either within your cluster, or, if authentication is set up, to an external observability backend.
+7. You can analyze the log data with your preferred backend.
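+
+For illustration, a minimal LogPipeline might look like the following sketch. It assumes that the LogPipeline supports the same `telemetry.kyma-project.io/v1alpha1` API group and `otlp` output block as the MetricPipeline examples in this documentation; `backend.example.com` is a placeholder endpoint.
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: LogPipeline
+metadata:
+  name: backend
+spec:
+  output:
+    otlp:
+      endpoint:
+        value: https://backend.example.com:4317
+```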
+
+## Telemetry Manager
+
+The LogPipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.
+
+
+
+1. Telemetry Manager watches all LogPipeline resources and related Secrets.
+2. Furthermore, Telemetry Manager takes care of the full lifecycle of the gateway Deployment and the agent DaemonSet. The gateway and agent are deployed only if you define a LogPipeline.
+3. Whenever the user configuration changes, Telemetry Manager validates it and generates a single configuration for the gateway and agent.
+4. Referenced Secrets are copied into one Secret that is mounted to the gateway as well.
+
+## Log Gateway
+
+In your cluster, the log gateway is the central component to which all components can send their individual logs. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Set Up the OTLP Input](./../otlp-input.md).
+
+## Log Agent
+
+If you configure a feature in the `input` section of your LogPipeline, an additional DaemonSet is deployed acting as an agent. The agent is based on an [OTel Collector](https://opentelemetry.io/docs/collector/) and encompasses the collection and conversion of logs from the container runtime. The workload container simply prints the structured log to the `stdout/stderr` channel; the agent picks it up, parses and enriches it, and sends all data in OTLP format to the configured backend.
diff --git a/docs/user/architecture/metrics-architecture.md b/docs/user/architecture/metrics-architecture.md
new file mode 100644
index 0000000000..6f716d7d5c
--- /dev/null
+++ b/docs/user/architecture/metrics-architecture.md
@@ -0,0 +1,33 @@
+# Metrics Architecture
+
+The Telemetry module provides a central Deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acting as a gateway, and an optional DaemonSet acting as an agent. The gateway exposes endpoints that receive OTLP metrics from your applications, while the agent pulls metrics from Prometheus-annotated endpoints. To control their behavior and data destination, you define a MetricPipeline.
+
+
+
+1. An application (exposing metrics in [OTLP](https://opentelemetry.io/docs/specs/otlp/)) sends metrics to the central metric gateway using the `telemetry-otlp-metrics` service.
+2. An application (exposing metrics in [Prometheus](https://prometheus.io/docs/instrumenting/exposition_formats) protocol) activates the agent to scrape the metrics with an annotation-based configuration.
+3. Additionally, you can activate the agent to pull metrics of each Istio sidecar.
+4. The agent supports collecting metrics from the Kubelet and Kubernetes APIServer.
+5. The gateway and the agent discover the metadata and enrich all received data with typical metadata of the source by communicating with the Kubernetes APIServer. Furthermore, they filter data according to the pipeline configuration.
+6. Telemetry Manager configures the agent and gateway according to the MetricPipeline resource specification, including the target backend for the metric gateway. Also, it observes the metrics flow to the backend and reports problems in the MetricPipeline status.
+7. The gateway and the agent send the data to the observability backend that's specified in your MetricPipeline resource - either within your cluster, or, if authentication is set up, to an external observability backend.
+8. You can analyze the metric data with your preferred observability backend.
+
+## Telemetry Manager
+
+The MetricPipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.
+
+
+
+1. Telemetry Manager watches all MetricPipeline resources and related Secrets.
+2. Furthermore, Telemetry Manager takes care of the full lifecycle of the gateway Deployment and the agent DaemonSet. The gateway and agent are deployed only if you define a MetricPipeline.
+3. Whenever the user configuration changes, Telemetry Manager validates it and generates a single configuration for the gateway and agent.
+4. Referenced Secrets are copied into one Secret that is mounted to the gateway as well.
+
+## Metric Gateway
+
+In your cluster, the metric gateway is the central component to which all applications can send their individual metrics. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Set Up the OTLP Input](./../otlp-input.md).
+
+## Metric Agent
+
+If a MetricPipeline configures a feature in the `input` section, an additional DaemonSet is deployed acting as an agent. The agent is also based on an [OTel Collector](https://opentelemetry.io/docs/collector/) and encompasses the collection and conversion of Prometheus-based metrics. To enable scraping, you add a `prometheus.io/scrape` annotation to the Pod or Service specification, and the agent collects the exposed metrics.
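+
+For example, a Service exposing a metrics port could be annotated as in the following sketch. The `prometheus.io/port` annotation, the resource names, and the port values are assumptions following common Prometheus conventions; the `http-metrics` port name (or an `appProtocol` attribute) reveals the application protocol to the agent.
+
+```yaml
+apiVersion: v1
+kind: Service
+metadata:
+  name: sample-app-metrics           # placeholder name
+  annotations:
+    prometheus.io/scrape: "true"     # lets the metric agent discover this Service
+    prometheus.io/port: "8080"       # assumed annotation pointing at the metrics port
+spec:
+  selector:
+    app: sample-app
+  ports:
+    - name: http-metrics             # protocol prefix so the agent knows how to scrape
+      port: 8080
+      targetPort: 8080
+```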
diff --git a/docs/user/architecture/traces-architecture.md b/docs/user/architecture/traces-architecture.md
new file mode 100644
index 0000000000..d0062c106f
--- /dev/null
+++ b/docs/user/architecture/traces-architecture.md
@@ -0,0 +1,28 @@
+# Traces Architecture
+
+The Telemetry module provides a central Deployment of an [OTel Collector](https://opentelemetry.io/docs/collector/) acting as a gateway in the cluster. The gateway exposes endpoints that receive trace data from your applications and the service mesh. To control the gateway's behavior and data destination, you define a TracePipeline.
+
+
+
+1. An end-to-end request is triggered and populated across the distributed application. Every involved component propagates the trace context using the [W3C Trace Context](https://www.w3.org/TR/trace-context/) protocol.
+2. After contributing a new span to the trace, the involved components send the related span data ([OTLP](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md)) to the central trace gateway using the `telemetry-otlp-traces` service.
+3. Istio sends the related span data to the trace gateway as well.
+4. The trace gateway discovers metadata that's typical for sources running on Kubernetes, like Pod identifiers, and then enriches the span data with that metadata.
+5. Telemetry Manager configures the gateway according to the TracePipeline resource, including the target backend for the trace gateway. Also, it observes the trace flow to the backend and reports problems in the TracePipeline status.
+6. The trace gateway sends the data to the observability backend that's specified in your TracePipeline resource - either within your cluster, or, if authentication is set up, to an external observability backend.
+7. You can analyze the trace data with your preferred observability backend.
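+
+To define the target backend for this flow, you apply a TracePipeline. The following minimal sketch assumes the same `telemetry.kyma-project.io/v1alpha1` API group and `otlp` output block as the MetricPipeline examples in this documentation; `backend.example.com` is a placeholder endpoint.
+
+```yaml
+apiVersion: telemetry.kyma-project.io/v1alpha1
+kind: TracePipeline
+metadata:
+  name: backend
+spec:
+  output:
+    otlp:
+      endpoint:
+        value: https://backend.example.com:4317
+```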
+
+## Telemetry Manager
+
+The TracePipeline resource is watched by Telemetry Manager, which is responsible for generating the custom parts of the OTel Collector configuration.
+
+
+
+1. Telemetry Manager watches all TracePipeline resources and related Secrets.
+2. Furthermore, Telemetry Manager takes care of the full lifecycle of the OTel Collector Deployment itself. The collector is deployed only if you define a TracePipeline.
+3. Whenever the configuration changes, Telemetry Manager validates it and generates a new configuration for the OTel Collector, which is stored in a ConfigMap.
+4. Referenced Secrets are copied into one Secret that is mounted to the OTel Collector as well.
+
+## Trace Gateway
+
+In your cluster, the trace gateway is the central component to which all components can send their individual spans. The gateway collects, enriches, and dispatches the data to the configured backend. For more information, see [Set Up the OTLP Input](./../otlp-input.md).
diff --git a/docs/user/assets/istio-input.drawio.svg b/docs/user/assets/istio-input.drawio.svg
new file mode 100644
index 0000000000..f138b94fda
--- /dev/null
+++ b/docs/user/assets/istio-input.drawio.svg
@@ -0,0 +1,325 @@
+
\ No newline at end of file
diff --git a/docs/user/assets/gateways-plain.drawio.svg b/docs/user/assets/istio-output.drawio.svg
similarity index 60%
rename from docs/user/assets/gateways-plain.drawio.svg
rename to docs/user/assets/istio-output.drawio.svg
index 9144727f3f..ca50e2e92a 100644
--- a/docs/user/assets/gateways-plain.drawio.svg
+++ b/docs/user/assets/istio-output.drawio.svg
@@ -1,17 +1,17 @@
-