Releases: fkie-cad/Logprep

v6.6.0

03 Jul 12:51
e5e25e8

Improvements

  • Replace rule_filter with lucene_filter in predetector output. The old internal Logprep rule
    representation is no longer present in the predetector output; the lucene representation is
    emitted under the existing name rule_filter rather than under the name lucene_filter.
  • 'amides' processor now stores confidence values of processed events in the amides.confidence field.
    In case of positive detection results, rule attributions are now inserted in the amides.attributions field.

Bugfix

  • Fix lucene rule filter representation such that it is aligned with opensearch lucene query syntax
  • Fix grok pattern UNIXPATH by internally converting [[:alnum:]] to \w
  • Fix overwriting of temporary tld-list with empty content
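
The UNIXPATH fix above can be pictured as a plain text substitution applied before a pattern is compiled. The following is a minimal sketch under assumptions: the helper name convert_posix_alnum and the simplified path pattern are hypothetical and are not Logprep's actual code or the real UNIXPATH definition. Note that \w also matches the underscore, so the two classes are not strictly equivalent.

```python
import re


def convert_posix_alnum(pattern: str) -> str:
    """Replace the POSIX class [[:alnum:]] with the word-character shorthand.

    Hypothetical helper for illustration only.
    """
    return pattern.replace("[[:alnum:]]", r"\w")


# Simplified stand-in for a path pattern; NOT the real UNIXPATH grok pattern.
posix_style = r"(/[[:alnum:]]+)+"
converted = convert_posix_alnum(posix_style)
matcher = re.compile(converted)

print(converted)
print(bool(matcher.fullmatch("/var/log/syslog")))
```

Python's re module does not interpret POSIX bracket classes, which is why a substitution of this kind is needed before such a pattern can match as intended.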

Details

  • replace rule filter with lucene filter in predetector output by @dtrai2 in #403
  • fix lucene filter representation by @dtrai2 in #409
  • fix dry_runner to support selective_extractor extra outputs by @dtrai2 in #407
  • Overhaul AMIDES output by @clumsy9 in #411
  • Extend predetector documentation by @dtrai2 in #412
  • fix UNIXPATH grok pattern by @dtrai2 in #415
  • prevent overwriting existing file by @dtrai2 in #414
  • Prepare Release v6.6.0 by @dtrai2 in #418

Full Changelog: v6.5.1...v6.6.0

v6.5.1

14 Jun 13:58
1462fc2

Bugfix

  • Fix creation of logprep temp dir

Full Changelog: v6.5.0...v6.5.1

v6.5.0

13 Jun 14:03
fea6543

Improvements

  • Make the PROMETHEUS_MULTIPROC_DIR environment variable optional; it defaults to
    /tmp/PROMETHEUS_MULTIPROC_DIR if not set
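
A minimal sketch of the fallback described above, assuming a simple lookup with a default value. The function name resolve_multiproc_dir is hypothetical and does not come from Logprep; only the variable name and the default path are taken from the release note.

```python
import os

# Default path as stated in the release note.
DEFAULT_MULTIPROC_DIR = "/tmp/PROMETHEUS_MULTIPROC_DIR"


def resolve_multiproc_dir(environ=None) -> str:
    """Return the configured directory, or the default if the variable is unset.

    Hypothetical helper; `environ` can be injected to make the lookup testable.
    """
    env = os.environ if environ is None else environ
    return env.get("PROMETHEUS_MULTIPROC_DIR", DEFAULT_MULTIPROC_DIR)


print(resolve_multiproc_dir({}))
print(resolve_multiproc_dir({"PROMETHEUS_MULTIPROC_DIR": "/var/run/metrics"}))
```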

Bugfix

  • All temp files will now be stored inside the system's default temp directory

Full Changelog: v6.3.0...v6.5.0

v6.4.0

01 Jun 10:39

Improvements

  • Bump requests to >=2.31.0 to circumvent CVE-2023-32681
  • Include a lucene representation of the rule filter in the predetector results. The
    representation is not completely lucene compatible because regex functionality is missing.

Bugfix

  • Fix error handling of FieldManager if no mapped source field exists in the event.
  • Fix Grokker such that only the first matching grok pattern is applied instead of all matching
    patterns
  • Fix Grokker such that nested parentheses in oniguruma patterns work (3 levels are supported
    now)
  • Fix Grokker such that two or more oniguruma patterns can point to the same target. This ensures
    grok-pattern compatibility with the normalizer and other grok tools
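
The first-match behaviour described above can be sketched as follows. This is an illustration under assumptions: it uses plain Python regular expressions rather than grok patterns, and the helper first_match and the two patterns are hypothetical.

```python
import re
from typing import Dict, Sequence


def first_match(value: str, patterns: Sequence[str]) -> Dict[str, str]:
    """Apply patterns in order and return the fields of the first one that matches."""
    for pattern in patterns:
        match = re.search(pattern, value)
        if match:
            # Stop here: later patterns are ignored even if they would match too.
            return match.groupdict()
    return {}


# Hypothetical pattern list; both patterns match the first input below.
patterns = [r"(?P<ip>\d+\.\d+\.\d+\.\d+)", r"(?P<word>[a-z]+)"]

print(first_match("10.0.0.1 login", patterns))
print(first_match("login", patterns))
```

For the first input both patterns would match, but only the fields of the first pattern are returned.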

Details

  • bump requests to >=2.31.0 by @dtrai2 in #397
  • Fix confluent kafka 2 store offset by @dtrai2 in #386
  • fix error handling of fieldmanager if no mapped source field exists by @dtrai2 in #395
  • add lucene filter to predetector detections by @dtrai2 in #396
  • grokker fixes by @dtrai2 in #398

Full Changelog: v6.3.0...v6.4.0

v6.3.0

22 May 10:46
46ea995

Features

  • Extend dissector such that it can trim characters around a dissected field with the
    %{field-( )} notation.
  • Extend timestamper such that it can take multiple source_formats. The first format that
    matches is used; all following formats are ignored.
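
The first-match semantics of multiple source_formats can be sketched with the standard library. This is a minimal illustration, assuming strptime-style format strings; the function parse_with_first_match and the example formats are hypothetical and not the timestamper's implementation.

```python
from datetime import datetime
from typing import Optional, Sequence


def parse_with_first_match(value: str, source_formats: Sequence[str]) -> Optional[datetime]:
    """Try each format in order and return the result of the first one that parses."""
    for fmt in source_formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format does not fit; try the next one.
            continue
    return None


# Hypothetical list of formats, ordered by priority.
formats = ["%Y-%m-%dT%H:%M:%S", "%d/%m/%Y %H:%M", "%Y-%m-%d"]

print(parse_with_first_match("22/05/2023 10:46", formats))
print(parse_with_first_match("not a timestamp", formats))
```

Ordering matters in such a scheme: a permissive format placed early in the list would shadow more specific formats that follow it.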

Improvements

  • Extend the FieldManager such that it can move/copy multiple source fields into multiple targets
    inside one rule.

Bugfix

  • Fix error handling of missing source fields in grokker
  • Fix using same output fields in list of grok pattern in grokker

Details

  • Grokker: fix handling of missing source fields by @dtrai2 in #387
  • Allow same target names in grokker mapping list by @dtrai2 in #389
  • Dissector: add feature to strip chars after dissecting by @dtrai2 in #390
  • Extend fieldmanager rule to support writing into multiple targets in one rule by @dtrai2 in #388
  • allow using list of source_formats in timestamper by @dtrai2 in #392
  • merge processors and rule docs to reduce duplicate documentation by @dtrai2 in #391
  • prepare release 6.3.0 by @dtrai2 in #394

Full Changelog: v6.2.0...v6.3.0

v6.2.0

10 May 09:20
7df6dbb

Features

  • add timestamper processor to extract timestamp functionality from normalizer

Improvements

  • removed arrow dependency and the features depending on it for performance reasons
    • switched to datetime.strftime syntax in timestamp_differ, s3_output, elasticsearch_output and opensearch_output
  • encapsulate time related functionality in logprep.util.time.TimeParser
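
For readers migrating date patterns, the following sketch shows what datetime.strftime directives look like in practice. The index name pattern is a hypothetical example and not a Logprep default.

```python
from datetime import datetime, timezone

# Fixed moment so the result is reproducible.
moment = datetime(2023, 5, 10, 9, 20, tzinfo=timezone.utc)

# strftime uses %-directives (%Y, %m, %d) rather than token strings such as YYYY or MM.
index_name = moment.strftime("logprep-%Y.%m.%d")

print(index_name)
```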

Bugfix

  • Fix missing default grok patterns in packaged logprep version

Full Changelog: v6.1.0...v6.2.0

v6.1.0

24 Apr 10:08
a2151c6

Features

  • Add amides processor to extend conventional rule matching by applying machine learning components
  • Add grokker processor to extract grok functionality from normalizer
  • Normalizer writes failure tags if normalization fails
  • Add flush_timeout to opensearch and elasticsearch outputs to ensure message delivery within a configurable period
  • add kafka_config option to confluent_kafka_input and confluent_kafka_output connectors to provide additional config options to librdkafka

Improvements

  • Harmonize error messages and handling for processors and connectors
  • Add ability to schedule periodic tasks to all components
  • Improve performance of pipeline processing by switching from the built-in json module to msgspec in pipeline and kafka connectors

Bugfix

  • Fix resetting processor caches in the auto_rule_corpus_tester by initializing all processors
    between test cases.
  • Fix processing of generic rules after there was an error inside the specific rules.
  • Remove coordinate fields from results of the geoip enricher if one of them has None values

Details

  • replace json.loads with msgspec to improve input connector performance by @ekneg54 in #351
  • Add processor and rules for the Adaptive Misuse Detection System (AMIDES) by @clumsy9 in #347
  • Make multiple applications of rules by the same processor optional by @ppcad in #355
  • add scheduler to all components and schedule search flush after period of times by @ekneg54 in #344
  • add kafka advanced config option by @ekneg54 in #357
  • avoid string splitting during processing by @ekneg54 in #362
  • Speedup startup by @ppcad in #314
  • Improve error message by @dtrai2 in #366
  • Fix rule corpus tester processor caches by @dtrai2 in #367
  • Fix processing of generic rules after an error in the specific rules by @dtrai2 in #368
  • S3 output by @ppcad in #364
  • improve pipeline performance by @ekneg54 in #369
  • Remove all geoip enricher fields if they are None by @ppcad in #371
  • Add date replacement to prefixes for s3 output by @ppcad in #375
  • Fix catched exception in tree parser that led to ignoring of rules by @ppcad in #343
  • add grokker processor by @ekneg54 in #363
  • improve connector performance by @ekneg54 in #370
  • prepare release 6.1.0 by @ekneg54 in #377

Full Changelog: v6.0.0...v6.1.0

v6.0.0

23 Mar 12:51
21e317f

Breaking

  • Remove rules deprecations introduced in v4.0.0
  • Change the rule language of selective_extractor, pseudonymizer and pre_detector to support multiple outputs

Features

  • Add string_splitter processor to split strings of variable length into lists
  • Add ip_informer processor to enrich events with ip information
  • Allow running the Pipeline in python without input/output connectors
  • Add auto_rule_corpus_tester to test a whole rule corpus against defined expected outputs.
  • Add shorthand for converting datatypes to dissector dissect pattern language
  • Add support for multiple output connectors
  • Apply processors multiple times until no new rule matches anymore. This enables applying rules on
    results of previous rules.
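
The repeated application described in the last item can be sketched as a loop that runs until no further rule matches. This is a simplified model under assumptions: rules are represented as hypothetical (matches, action) pairs, and each rule is applied at most once per event so that the loop always terminates. It is not Logprep's actual processing code.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
# Hypothetical rule model: a match predicate plus an action that mutates the event.
Rule = Tuple[Callable[[Event], bool], Callable[[Event], None]]


def apply_until_stable(event: Event, rules: List[Rule]) -> int:
    """Re-run the rule list until a full pass applies no new rule."""
    applied = set()
    progress = True
    while progress:
        progress = False
        for index, (matches, action) in enumerate(rules):
            if index not in applied and matches(event):
                action(event)
                applied.add(index)
                progress = True
    return len(applied)


rules = [
    # Depends on the "user" field, which only exists after the second rule ran.
    (lambda e: "user" in e, lambda e: e.update(user_upper=str(e["user"]).upper())),
    (lambda e: "raw" in e, lambda e: e.update(user=str(e["raw"]).split("=")[1])),
]

event = {"raw": "user=alice"}
count = apply_until_stable(event, rules)
print(count, event)
```

In a single pass the first rule would never fire, because the field it depends on is only created by the second rule.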

Improvements

  • Bump attrs to >=22.2.0 and delete redundant min_len_validator
  • Specify the metric labels for connectors (add name, type and direction as labels)
  • Rename metric names to clarify their meanings (logprep_pipeline_number_of_warnings to
    logprep_pipeline_sum_of_processor_warnings and logprep_pipeline_number_of_errors to
    logprep_pipeline_sum_of_processor_errors)

Bugfix

  • Fix a bug that broke templating of config and rule files with environment variables if one or more variables are not set in the environment
  • Fix opensearch_output and elasticsearch_output not handling authentication issues
  • Fix metric logprep_pipeline_number_of_processed_events to actually count the processed events per pipeline
  • Fix a bug for enrichment with environment variables. Variables must have one of the following prefixes now: LOGPREP_, CI_, GITHUB_ or PYTEST_
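
The prefix restriction in the last item can be sketched as follows. This is an illustration under assumptions: the ${NAME} placeholder syntax, the helper expand_prefixed and the choice to leave unset or non-prefixed variables untouched are hypothetical; only the four prefixes are taken from the release note.

```python
import re

# Prefixes listed in the release note.
ALLOWED_PREFIXES = ("LOGPREP_", "CI_", "GITHUB_", "PYTEST_")

# Hypothetical placeholder syntax: ${NAME}
_VARIABLE = re.compile(r"\$\{(\w+)\}")


def expand_prefixed(text: str, environ: dict) -> str:
    """Expand only variables that carry an allowed prefix and are set."""

    def substitute(match):
        name = match.group(1)
        if name.startswith(ALLOWED_PREFIXES) and name in environ:
            return environ[name]
        # Assumption: anything else is left as written.
        return match.group(0)

    return _VARIABLE.sub(substitute, text)


env = {"LOGPREP_TOPIC": "events", "HOME": "/root"}
print(expand_prefixed("topic: ${LOGPREP_TOPIC}, home: ${HOME}", env))
```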

Improvements

  • Reimplement the selective_extractor

Details

  • add simple string splitter processor by @ekneg54 in #304
  • fix type hint for target_field by @ekneg54 in #317
  • Fix auto rule tester to call all processor setup methods by @dtrai2 in #312
  • add processor to enrich IP information by @ekneg54 in #303
  • Add pseudonymization of list elements and allow regex of list elements in auto-tests by @ppcad in #284
  • Fix graceful shutdown of multiprocessing pipeline by @dtrai2 in #313
  • bump attrs to 22.2 by @ekneg54 in #319
  • Enable pipeline run without connectors by @dtrai2 in #308
  • Add auto rule corpus tester by @dtrai2 in #185
  • Fix automodule documentation paths by @dtrai2 in #324
  • Add shorthand to dissect pattern for converting datatypes by @ekneg54 in #328
  • update changelog bump attrs requirement by @ekneg54 in #331
  • add documentation for environment variables in config by @ekneg54 in #326
  • make properties configurable in ip_informer rule by @ekneg54 in #325
  • fix handle template errors by @ekneg54 in #333
  • fix opensearch-output-connector-raises-to-late-if-unauthenticated by @ekneg54 in #334
  • add support for multiple output connectors by @ekneg54 in #306
  • Fix metrics for multiple outputs by @dtrai2 in #336
  • Fixed broken link in README.md by @0xr2po in #339
  • remove verification of configuration in pipeline manager and run_logprep by @ekneg54 in #337
  • add connectionerror to connection test try block by @ekneg54 in #341
  • make broader exception for search connection test by @ekneg54 in #342
  • Apply processors multiple times by @dtrai2 in #318
  • remove all deprecated code and resolve all warnings by @ekneg54 in #345
  • only expand valid prefixed variables by @ekneg54 in #349
  • fix non-deterministic behavior and add test by @dtrai2 in #346
  • bump confluent-kafka client to >2.0.0 by @ekneg54 in #353
  • Fix dry_runner to deal with multiple outputs by @dtrai2 in #352
  • prepare release v6.0.0 by @ekneg54 in #354

Full Changelog: v5.0.1...v6.0.0

v5.0.1

06 Feb 16:18
e52804a

Breaking

  • drop support for python 3.6, 3.7, 3.8
  • change default prefix behavior on appending to strings of dissector

Features

  • Add an http input connector that spawns a uvicorn server which parses request content to events.
  • Add a file input connector that reads generic logfiles.
  • Provide the possibility to consume lists, rules and configuration from files and http endpoints
  • Add requester processor that enriches by making http requests with field values
  • Add calculator processor to calculate with or without field values
  • Make output subfields of the geoip_enricher configurable by introducing the rule config
    customize_target_subfields
  • Add a timestamp_differ processor that can parse two timestamps and calculate their respective time delta.
  • Add config_refresh_interval configuration option to refresh the configuration on a given timedelta
  • Add option to dissector to use a prefix pattern in dissect language for appending to strings and add the default behavior to append to strings without any prefixed separator

Improvements

  • Add support for python 3.10 and 3.11
  • Add option to submit a template with list_search_base_path config parameter in list_comparison processor
  • Add functionality to geoip_enricher to download the geoip-database
  • Add ability to use environment variables in rules and config
  • Add list access including slicing to dotted field notation for getting values
  • Add processor boilerplate generator to help adding new processors

Bugfixes

  • Fix count of number_of_processed_events metric in input connector. Will now only count actual
    events.

Full Changelog: v4.0.0...v5.0.0

v4.0.0

21 Nov 10:33
2dd4cc5

Breaking

  • Splitting the general connector config into input and output to compose connector config independently
  • Removal of Deprecated Feature: HMAC-Options in the connector consumer options have to be
    under the subkey preprocessing of the input processor
  • Removal of Deprecated Feature: delete processor was renamed to deleter
  • Rename writing_output connector to jsonl_output

Features

  • Add an opensearch output connector that can be used to write directly into opensearch.
  • Add an elasticsearch output connector that can be used to write directly into elasticsearch.
  • Split connector config into separate config keys input and output
  • Add preprocessing capabilities to all input connectors
  • Add preprocessor for log_arrival_time
  • Add preprocessor for log_arrival_timedelta
  • Add metrics to connectors
  • Add concatenator processor that can combine multiple source fields
  • Add dissector processor that tokenizes messages into new or existing fields
  • Add key_checker processor that checks if all dotted fields from a list are present in the event
  • Add field_manager processor that copies or moves fields and merges lists
  • Add ability to delete source fields to concatenator, datetime_extractor, dissector, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
  • Add ability to overwrite target field to datetime_extractor, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
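
The idea behind the key_checker, checking that a list of dotted fields is present in an event, can be sketched as follows. This is a minimal illustration, assuming nested dictionaries addressed by dot-separated keys; the helpers get_dotted and missing_fields are hypothetical and not Logprep's get_dotted_field_value.

```python
from typing import Any, Dict, Iterable, List

# Sentinel so that fields holding None still count as present.
_MISSING = object()


def get_dotted(event: Dict[str, Any], dotted_field: str) -> Any:
    """Walk a nested dict along a dot-separated path; return _MISSING if absent."""
    current: Any = event
    for key in dotted_field.split("."):
        if not isinstance(current, dict) or key not in current:
            return _MISSING
        current = current[key]
    return current


def missing_fields(event: Dict[str, Any], dotted_fields: Iterable[str]) -> List[str]:
    """Return the sorted list of dotted fields that are not present in the event."""
    return sorted(f for f in dotted_fields if get_dotted(event, f) is _MISSING)


event = {"process": {"name": "sshd", "pid": 42}, "host": {"name": "web01"}}
print(missing_fields(event, ["process.name", "host.ip", "user.id"]))
```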

Improvements

  • Validate connector config on class level via attrs classes
  • Implement a common interface to all connectors
  • Refactor connector code
  • Revise the documentation
  • Add sphinxcontrib.datatemplates and testcase-renderer to docs
  • Reimplement get_dotted_field_value helper method which should lead to increased performance
  • Reimplement dropper processor code to improve performance

Deprecations

Rule Language

  • datetime_extractor.datetime_field is deprecated. Use datetime_extractor.source_fields as list instead.
  • datetime_extractor.destination_field is deprecated. Use datetime_extractor.target_field instead.
  • delete is deprecated. Use deleter.delete instead.
  • domain_label_extractor.target_field is deprecated. Use domain_label_extractor.source_fields as list instead.
  • domain_label_extractor.output_field is deprecated. Use domain_label_extractor.target_field instead.
  • domain_resolver.source_url_or_domain is deprecated. Use domain_resolver.source_fields as list instead.
  • domain_resolver.output_field is deprecated. Use domain_resolver.target_field instead.
  • drop is deprecated. Use dropper.drop instead.
  • drop_full is deprecated. Use dropper.drop_full instead.
  • geoip_enricher.source_ip is deprecated. Use geoip_enricher.source_fields as list instead.
  • geoip_enricher.output_field is deprecated. Use geoip_enricher.target_field instead.
  • label is deprecated. Use labeler.label instead.
  • list_comparison.check_field is deprecated. Use list_comparison.source_fields as list instead.
  • list_comparison.output_field is deprecated. Use list_comparison.target_field instead.
  • pseudonymize is deprecated. Use pseudonymizer.pseudonyms instead.
  • url_fields is deprecated. Use pseudonymizer.url_fields instead.

Details

  • Add es output connector and split kafka connector by @ppcad in #83
  • Increase Test Coverage for Connectors by @ekneg54 in #169
  • Remove unused code by @ekneg54 in #170
  • refactor connectors part 1 by @ekneg54 in #173
  • Fix duplication in rule tree and adapt rule equalities by @ppcad in #168
  • Remove accidental print statements from normalizer by @ppcad in #176
  • Add OpenSearch output connector by @ppcad in #174
  • Refactor delete processor -> deleter by @saegel in #167
  • Split Connectors into separate input and output by @ekneg54 in #175
  • Add concatenator processor by @dtrai2 in #182
  • Fix picklingerror in debug by @ppcad in #188
  • Add Dissecter processor by @ekneg54 in #181
  • Add Dockerfile, .dockerignore and update README by @ekneg54 in #187
  • Fix time measurement for get_next input connector by @dtrai2 in #191
  • fix keyerror for message with setted _op_type field by @ekneg54 in #194
  • 192 docs build is failing by @ekneg54 in #197
  • 192 docs build is failing by @ekneg54 in #199
  • Fix typo dissecter -> dissector by @dtrai2 in #200
  • Fix typo seperator -> separator by @dtrai2 in #198
  • change documentation and config key for opensearch and elasticsearch by @ekneg54 in #195
  • Remove deprecation warnings and tests for next major release by @dtrai2 in #201
  • Fix error message for missing input/output config keys by @dtrai2 in #202
  • Fix non-deterministic behavior by @dtrai2 in #203
  • refactor rules by @ekneg54 in #190
  • dev-add-keychecker-processor by @DogchampDiego in #204
  • Implement delete_source_field and overwrite_target to all possible processors by @ekneg54 in #205
  • reimplement get_dotted_field_value by @ekneg54 in #207
  • Fix add version info preprocessor and revice tests by @dtrai2 in #208
  • Adapt rule configuration validation by @ppcad in #209
  • add field_manager processor by @ekneg54 in #206
  • refactor add_field_to by @ekneg54 in #210
  • Fix typo in elasticsearch output connector by @ppcad in #214
  • Implement HMAC fallback for input connectors without raw message by @dtrai2 in #213
  • Fix reset of metric by @dtrai2 in #212
  • Fix error message for processing error in field_manager by @ekneg54 in #215
  • fix bug for non existing source_field in domain_label_extractor by @ekneg54 in #217
  • reimplement dropper by @ekneg54 in #216
  • Fix prometheus_exporter reset on failed pipelines by @dtrai2 in #218
  • add new representation for rules by @ekneg54 in #219
  • fix output of warnings by @ekneg54 in #220
  • Logprep 4.0.0 release by @ekneg54 in #224

Full Changelog: v3.3.0...v4.0.0