Revise Metrics (#460)
Co-authored-by: dtrai2 <[email protected]>

* reimplement metrics
* delete metric_exposer.py and corresponding tests
* reimplement Amides Metrics
* refactor metric tests to component base
* add logprep dashboard
* add offset metrics
* add librdkafka metrics
* implement output metrics
* delete and refactor time_measurement
* refactor exceptions test module
* remove normalizer acceptance test
dtrai2 authored Nov 17, 2023
1 parent 183c271 commit c828dfc
Showing 154 changed files with 8,819 additions and 2,941 deletions.
1 change: 0 additions & 1 deletion .vscode/settings.json
@@ -4,7 +4,6 @@
"python.testing.pytestArgs": [
"tests",
"-v",
"--log-cli-level=DEBUG"
],
"python.analysis.importFormat": "absolute",
"editor.formatOnSave": true,
16 changes: 16 additions & 0 deletions CHANGELOG.md
@@ -1,12 +1,28 @@
## Upcoming Changes

## next release
### Breaking

* reimplemented metrics; the former metrics configuration no longer works
* metric content changed, so existing grafana dashboards will break
* the new rule `id` can break configurations if the same rule is used in both rule trees
  - can be fixed by adding a unique `id` to each rule or by deleting the redundant rule

### Features

* add the possibility to convert hex to int in the `calculator` processor with the newly added function `from_hex`
* add metrics on rule level
* add grafana example dashboards under `quickstart/exampledata/config/grafana/dashboards`
* add new configuration field `id` for all rules to identify rules in metrics and logs
  - if no `id` is given, the `id` will be generated in a stable way
  - add verification of rule `id` uniqueness on processor level over both rule trees to ensure metrics are counted correctly on rule level

### Improvements

* reimplemented prometheus metrics exporter to provide gauges, histograms and counter metrics
* removed shared counter, because it is redundant to the metrics
* get exception stack trace by setting environment variable `DEBUG`

### Bugfix

## v7.0.0
2 changes: 1 addition & 1 deletion doc/source/development/connector_how_to.rst
@@ -60,7 +60,7 @@ An exception should be thrown if an error occurs on calling this function.
These exceptions must inherit from the exception classes in :py:class:`~logprep.output.output.Output`.
They should return a helpful message when calling `str(exception)`.
Analogous to the input, exceptions that require a restart of Logprep should inherit from `FatalOutputError`.
Exceptions that inherit from `WarningOutputError` will be logged, but they do not require any error handling.
Exceptions that inherit from `OutputWarning` will be logged, but they do not require any error handling.

:py:meth:`~logprep.output.output.Output.store_failed`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
34 changes: 12 additions & 22 deletions doc/source/development/processor_how_to.rst
@@ -122,19 +122,9 @@ However, the log message will be separately stored as failed (see :ref:`connecto
Metrics
^^^^^^^

By default a processor exposes metrics like the number of processed events or the mean processing
time.
If it is required to expose new, processor specific, metrics it is possible to extend the default
metrics.
To achieve this you have to implement a sub class inside the processor class which inherits from
:code:`Processor.ProcessorMetrics`.
The attributes or properties included in that class will be automatically exposed if the general
metrics configuration is enabled.
Further more the newly defined metric object has to be defined inside the :code:`__init__` method.
It is also possible to define metrics that are private and which won't be exposed.
These metrics have to start with an underscore.
The purpose of this functionality is to allow the calculation of metrics which are based on
intermediate values which aren't directly interesting to log and expose.
To implement new processor-specific metrics, you have to define an embedded class
:code:`Metrics` inside the processor class which inherits from :code:`Component.Metrics`.
For further information about metrics, see the reference implementation in the :code:`Amides` processor.

The following code example highlights an implementation of processor specific metrics, aligned with
the general implementation of a new processor seen in :ref:`implementing_a_new_processor`.
@@ -144,6 +134,7 @@ the general implementation of a new processor seen in :ref:`implementing_a_new_p
"""Processor Documentation"""
from logprep.abc.processor import Processor
from logprep.metrics.metrics import CounterMetric
from logprep.abc.component import Component
from attrs import define, field
class NewProcessor(Processor):
@@ -155,18 +146,17 @@ the general implementation of a new processor seen in :ref:`implementing_a_new_p
...
@define(kw_only=True)
class NewProcessorMetrics(Processor.ProcessorMetrics):
class Metrics(Component.Metrics):
"""Tracks statistics about the NewProcessor"""
new_metric: int = 0
"""Short description of this metric"""
_private_new_metric: int = 0
new_metric: CounterMetric = field(
factory=lambda: CounterMetric(
description="Short description of this metric",
name="new_metric",
)
)
"""Short description of this metric"""
@property
def calculated_metric(self):
"""Calculates something"""
return self.new_metric + self._private_new_metric
__slots__ = ["processor_attribute"]
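The embedded `Metrics` pattern from the documentation hunk above can be tried out without logprep installed. The following is only a rough sketch under stated assumptions: it uses the standard library's `dataclasses` in place of `attrs`, and `CounterMetric`, `ComponentMetrics` and `increment` are hypothetical stand-ins written for this example, not logprep's actual classes or API. It shows the two ideas the diff relies on: each metric is declared as a field with a factory, and the base class passes its labels to every metric field after construction.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CounterMetric:
    """Hypothetical stand-in for logprep's CounterMetric, not the real class."""

    name: str
    description: str
    labels: dict = field(default_factory=dict)
    value: int = 0

    def increment(self, amount: int = 1) -> None:
        """Assumed helper that exists only in this sketch."""
        self.value += amount


@dataclass
class ComponentMetrics:
    """Stand-in for Component.Metrics: hands its labels to every metric field."""

    labels: dict

    def __post_init__(self):
        # Walk all declared fields and label those that are metrics.
        for attribute in fields(self):
            value = getattr(self, attribute.name)
            if isinstance(value, CounterMetric):
                value.labels = self.labels


@dataclass
class NewProcessorMetrics(ComponentMetrics):
    """Processor-specific metrics, declared as fields with a factory."""

    new_metric: CounterMetric = field(
        default_factory=lambda: CounterMetric(
            name="new_metric",
            description="Short description of this metric",
        )
    )


metrics = NewProcessorMetrics(labels={"component": "new_processor", "name": "demo"})
metrics.new_metric.increment()
metrics.new_metric.increment(2)

# A second instance gets its own counter because the field uses a factory.
other = NewProcessorMetrics(labels={"component": "new_processor", "name": "other"})
```

Using a factory rather than a shared default instance keeps the counters of different components independent, which is presumably why the documented example wraps `CounterMetric` in `field(factory=...)`.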
10 changes: 0 additions & 10 deletions doc/source/user_manual/configuration/logprep.rst
@@ -28,16 +28,6 @@ However, Logprep reacts quickly for small values (< 1.0), but this requires more
This can be useful for testing and debugging.
Larger values (like 5.0) slow the reaction time down, but this requires less processing power, which makes it preferable for continuous operation.

print_processed_period
======================

Integer, value > 0

Logprep does periodically write the amount of processed log messages per time period into the journal.
This value defines this time period in seconds.
It is an optional value and is set to 5 minutes by default.


logger
======

149 changes: 1 addition & 148 deletions doc/source/user_manual/configuration/metrics.rst
@@ -2,151 +2,4 @@
Metrics
=======

Configuration
=============

metrics
-------

This configuration allows the export of certain metrics like counts of processed events, errors or
warnings, for all processors and multiprocessing pipelines.
The metrics can be exported as a rotating log file with JSON lines and/or via a Prometheus Exporter.
Both can be activated at the same time.
By default only the file target is activated.
To activate the prometheus exporter the required target has to be configured.
Furthermore the utilized `prometheus python client <https://github.com/prometheus/client_python>`_
requires the configuration of the environment variable :code:`PROMETHEUS_MULTIPROC_DIR`, a
directory to save temporary files needed for in-between process communication.
If this environment variable is not set it defaults to the local systems temporary directory path,
e.g. :code:`{TEMP_DIR}/logprep/prometheus_multiproc_dir`

.. WARNING::
The configured directory :code:`PROMETHEUS_MULTIPROC_DIR` will be cleared on every startup.
Make sure it does not contain any other files as they will be lost afterwards.

It is also possible to export the metrics as an aggregation of all child processes or independently,
without any aggregation.

.. hint::
To achieve the best results for the prometheus exporter it is suggested to deactivate
`cumulative` metrics as well as the process aggregation `aggregate_processes`.
This ensures that each process is exported as it's own metrics giving full transparency.
And deactivating `cumulative` will result in exporting only the statistics of the past period
instead of counting endlessly.

This status_logger is configured with the following sub parameters:

period
^^^^^^

Integer, value > 0

Defines after how many seconds the metric should be written or updated.

enabled
^^^^^^^

true/false

Defines if the status logger should be activated.
It is enabled by default.

cumulative
^^^^^^^^^^

true/false

Defines if the metrics should count continuously (true) or if they should be reset after every period (false).
It is enabled by default.

aggregate_processes
^^^^^^^^^^^^^^^^^^^

true/false

Defines if the metrics of each child process should be aggregated to single values or if the metrics
of the processes should be exported directly per process.

Time Measurement
^^^^^^^^^^^^^^^^

It is possible to export metrics that indicate the processing times of one or more events.
This can be configured in the main configuration file via :code:`metrics.measure_time`.
If this config field is available and the subfield :code:`enabled` is set to :code:`true` then the
processing times are measured and exported.
Through the metric tracking it is possible to export those metrics as file or through the prometheus
exporter.
There the processing times represent the average of all events during the configured
period.
It is also possible to export the processing time for each event independently by appending the
information to the event.
The processing times of all processors/modules can be then found in the field
:code:`processing_time` of each processed event.
Additionally, the hostname of the machine on which Logprep runs is listed,
which allows to read how much time has passed between the generation of the event and the
beginning of the processing.

Time Measurement is deactivated by default.

If only the general metrics are activated then the metric for the time measurement will be 0.

port
^^^^^^^

Port which should be used to start the default prometheus exporter webservers. (default: 8000)

Example
-------

.. code-block:: yaml
:linenos:
metrics:
enabled: true
period: 10
cumulative: false
aggregate_processes: false
measure_time:
enabled: true
append_to_event: false
port: 8000
Metrics Overview
================

.. autoclass:: logprep.framework.rule_tree.rule_tree.RuleTree.RuleTreeMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:

.. autoclass:: logprep.abc.processor.Processor.ProcessorMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:

.. autoclass:: logprep.processor.amides.processor.Amides.ProcessorMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:

.. autoclass:: logprep.processor.domain_resolver.processor.DomainResolver.DomainResolverMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:

.. autoclass:: logprep.processor.pseudonymizer.processor.Pseudonymizer.PseudonymizerMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:

.. autoclass:: logprep.framework.pipeline.Pipeline.PipelineMetrics
:members:
:undoc-members:
:private-members:
:inherited-members:
.. automodule:: logprep.metrics.metrics
2 changes: 1 addition & 1 deletion doc/source/user_manual/configuration/pipeline.rst
@@ -15,5 +15,5 @@ Example

.. literalinclude:: /../../quickstart/exampledata/config/pipeline.yml
:language: yaml
:start-after: level: DEBUG
:start-after: port: 8000
:end-before: input:
34 changes: 29 additions & 5 deletions logprep/abc/component.py
@@ -1,12 +1,15 @@
""" abstract module for components"""
from abc import ABC
from functools import cached_property
from logging import Logger
from typing import Callable

import msgspec
from attr import define, field, validators
from attrs import asdict
from schedule import Scheduler

from logprep.metrics.metrics import Metric
from logprep.util.helper import camel_to_snake


@@ -20,6 +23,19 @@ class Config:
type: str = field(validator=validators.instance_of(str))
"""Type of the component"""

@define(kw_only=True)
class Metrics:
"""Base Metric class to track and expose statistics about logprep"""

_labels: dict

def __attrs_post_init__(self):
for attribute in asdict(self):
attribute = getattr(self, attribute)
if isinstance(attribute, Metric):
attribute.labels = self._labels
attribute.init_tracker()

# __dict__ is added to support functools.cached_property
__slots__ = ["name", "_logger", "_config", "__dict__"]

@@ -31,11 +47,21 @@ class Config:
_decoder: msgspec.json.Decoder = msgspec.json.Decoder()
_encoder: msgspec.json.Encoder = msgspec.json.Encoder()

@property
def metric_labels(self) -> dict:
"""Labels for the metrics"""
return {"component": self._config.type, "name": self.name, "description": "", "type": ""}

def __init__(self, name: str, configuration: "Component.Config", logger: Logger):
self._logger = logger
self._config = configuration
self.name = name

@cached_property
def metrics(self):
"""create and return metrics object"""
return self.Metrics(labels=self.metric_labels)

def __repr__(self):
return camel_to_snake(self.__class__.__name__)

@@ -53,11 +79,9 @@ def describe(self) -> str:
return f"{self.__class__.__name__} ({self.name})"

def setup(self):
"""Set the component up.
This is optional.
"""
"""Set the component up."""
# initialize metrics
_ = self.metrics

def shut_down(self):
"""Stop processing of this component.
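The comment in `logprep/abc/component.py` says that `__dict__` is added to `__slots__` to support `functools.cached_property`, and `setup()` touches `self.metrics` once to create the metrics object. A minimal sketch of why that is needed follows. The class names `SlottedComponent` and `SlotsOnly` and the dictionary returned by `metrics` are hypothetical and exist only for illustration; they are not logprep code.

```python
from functools import cached_property


class SlottedComponent:
    # "__dict__" in __slots__ gives cached_property a place to store its value.
    __slots__ = ["name", "__dict__"]

    def __init__(self, name: str):
        self.name = name

    @cached_property
    def metrics(self) -> dict:
        """Built on first access, then reused."""
        return {"labels": {"name": self.name}}

    def setup(self) -> None:
        # Accessing the property once forces the metrics object to be created.
        _ = self.metrics


class SlotsOnly:
    # No "__dict__" slot, so cached_property has nowhere to cache its result.
    __slots__ = ["name"]

    def __init__(self, name: str):
        self.name = name

    @cached_property
    def metrics(self) -> dict:
        return {}


component = SlottedComponent("demo")
component.setup()
same_object = component.metrics is component.metrics

try:
    SlotsOnly("broken").metrics
    slots_only_error = None
except TypeError as error:
    slots_only_error = error
```

Here `same_object` is true because the second access reads the cached value from the instance `__dict__`, while the slots-only variant raises a `TypeError` on first access.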