Releases · fkie-cad/Logprep

03 Jul 12:51

dtrai2

v6.6.0

e5e25e8

v6.6.0

Improvements

Replace rule_filter with lucene_filter in the predetector output. The old internal Logprep rule
representation is no longer present in the predetector output; the name rule_filter is kept
in place of the name lucene_filter.
'amides' processor now stores confidence values of processed events in the amides.confidence field.
In case of positive detection results, rule attributions are now inserted in the amides.attributions field.

Bugfix

Fix lucene rule filter representation such that it is aligned with opensearch lucene query syntax
Fix grok pattern UNIXPATH by internally converting [[:alnum:]] to \w
Fix overwriting of temporary tld-list with empty content
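
The UNIXPATH fix above can be illustrated with a minimal sketch. Python's re module has no POSIX bracket classes such as [[:alnum:]], so a pattern using them has to be rewritten before compilation. The helper name and the pattern fragment below are hypothetical and are not taken from Logprep; note also that \w matches the underscore, which [[:alnum:]] does not.

```python
import re


def convert_posix_alnum(pattern: str) -> str:
    """Replace the POSIX class [[:alnum:]] with \\w for Python's re module.

    Hypothetical helper for illustration, not Logprep's implementation.
    """
    return pattern.replace("[[:alnum:]]", r"\w")


# Illustrative fragment only, not the real UNIXPATH grok definition.
posix_fragment = r"(/[[:alnum:]]+)+"
converted = convert_posix_alnum(posix_fragment)

# The converted pattern can be compiled and used by re.
match = re.fullmatch(converted, "/var/log/syslog")
```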

Details

replace rule filter with lucene filter in predetector output by @dtrai2 in #403
fix lucene filter representation by @dtrai2 in #409
fix dry_runner to support selective_extractor extra outputs by @dtrai2 in #407
Overhaul AMIDES output by @clumsy9 in #411
Extend predetector documentation by @dtrai2 in #412
fix UNIXPATH grok pattern by @dtrai2 in #415
prevent overwriting existing file by @dtrai2 in #414
Prepare Release v6.6.0 by @dtrai2 in #418

Full Changelog: v6.5.1...v6.6.0

Contributors

clumsy9 and dtrai2

14 Jun 13:58

dtrai2

v6.5.1

1462fc2

v6.5.1

Bugfix

Fix creation of logprep temp dir

Details

fix creation of logprep temp dir and tests by @dtrai2 in #406
prepare release v6.5.1 by @dtrai2 in #408

Full Changelog: v6.5.0...v6.5.1

Contributors

dtrai2

13 Jun 14:03

dtrai2

v6.5.0

fea6543

v6.5.0

Improvements

Make the PROMETHEUS_MULTIPROC_DIR environment variable optional; it defaults to
/tmp/PROMETHEUS_MULTIPROC_DIR if not set
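
A minimal sketch of the fallback described above, assuming a plain environment lookup. The function name is hypothetical; only the variable name and the default path come from the release note.

```python
import os

DEFAULT_MULTIPROC_DIR = "/tmp/PROMETHEUS_MULTIPROC_DIR"


def resolve_multiproc_dir(environ=os.environ) -> str:
    """Return the configured directory, or the default if the variable is unset.

    Hypothetical helper for illustration, not Logprep's implementation.
    """
    return environ.get("PROMETHEUS_MULTIPROC_DIR", DEFAULT_MULTIPROC_DIR)
```

Passing the environment as a parameter keeps the lookup easy to test without touching the real process environment.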

Bugfix

All temp files are now stored inside the system's default temp directory

Details

bump requests to >=2.31.0 by @dtrai2 in #397
Fix confluent kafka 2 store offset by @dtrai2 in #386
fix error handling of fieldmanager if no mapped source field exists by @dtrai2 in #395
add lucene filter to predetector detections by @dtrai2 in #396
grokker fixes by @dtrai2 in #398
prepare release v6.4.0 by @dtrai2 in #399
remove direct python-dateutil dependency by @dtrai2 in #393
rework logprep temp files by @dtrai2 in #402
update CHANGELOG.md by @dtrai2 in #405

Full Changelog: v6.3.0...v6.5.0

Contributors

dtrai2

01 Jun 10:39

dtrai2

v6.4.0

ed9d617

v6.4.0

Improvements

Bump requests to >=2.31.0 to circumvent CVE-2023-32681
Include a lucene representation of the rule filter in the predetector results. The
representation is not completely lucene compatible due to non-existing regex functionality.

Bugfix

Fix error handling of FieldManager if no mapped source field exists in the event.
Fix Grokker such that only the first matching grok pattern is applied instead of all matching
patterns
Fix Grokker such that nested parentheses in oniguruma patterns work (3 levels are supported
now)
Fix Grokker such that two or more oniguruma patterns can point to the same target. This ensures
grok-pattern compatibility with the normalizer and other grok tools
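
The "first match wins" behavior and the shared target can be sketched as follows. This is an illustration with plain Python regular expressions and named groups, not grok or oniguruma syntax, and the function and patterns are hypothetical.

```python
import re


def apply_first_match(patterns, message):
    """Return the captured fields of the first pattern that matches.

    Later patterns are ignored once one matches. Hypothetical helper.
    """
    for pattern in patterns:
        match = re.search(pattern, message)
        if match:
            return match.groupdict()
    return {}


# Both patterns write to the same target name "ip".
patterns = [
    r"(?P<ip>\d+\.\d+\.\d+\.\d+) (?P<action>\w+)",
    r"(?P<ip>\d+\.\d+\.\d+\.\d+)",
]
result = apply_first_match(patterns, "10.0.0.1 login")
```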

Details

bump requests to >=2.31.0 by @dtrai2 in #397
Fix confluent kafka 2 store offset by @dtrai2 in #386
fix error handling of fieldmanager if no mapped source field exists by @dtrai2 in #395
add lucene filter to predetector detections by @dtrai2 in #396
grokker fixes by @dtrai2 in #398

Full Changelog: v6.3.0...v6.4.0

Contributors

dtrai2

22 May 10:46

dtrai2

v6.3.0

46ea995

v6.3.0

Features

Extend dissector such that it can trim characters around a dissected field with the %{field-( )}
notation.
Extend timestamper such that it can take multiple source_formats. The first format that matches
is used; all following formats are ignored
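
The timestamper's handling of multiple source_formats can be sketched as a loop that stops at the first format that parses. This sketch assumes strptime-style format strings, which the release note does not specify, and the function name is hypothetical.

```python
from datetime import datetime


def parse_with_formats(value, source_formats):
    """Parse value with the first format that matches; ignore the rest.

    Hypothetical helper assuming datetime.strptime format codes.
    """
    for fmt in source_formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue  # this format does not fit, try the next one
    raise ValueError(f"no source format matched {value!r}")


formats = ["%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S"]
parsed = parse_with_formats("2023-05-22 10:46:00", formats)
```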

Improvements

Extend the FieldManager such that it can move/copy multiple source fields into multiple targets
inside one rule.

Bugfix

Fix error handling of missing source fields in grokker
Fix using the same output fields in a list of grok patterns in grokker

Details

Grokker: fix handling of missing source fields by @dtrai2 in #387
Allow same target names in grokker mapping list by @dtrai2 in #389
Dissector: add feature to strip chars after dissecting by @dtrai2 in #390
Extend fieldmanager rule to support writing into multiple targets in one rule by @dtrai2 in #388
allow using list of source_formats in timestamper by @dtrai2 in #392
merge processors and rule docs to reduce duplicate documentation by @dtrai2 in #391
prepare release 6.3.0 by @dtrai2 in #394

Full Changelog: v6.2.0...v6.3.0

Contributors

dtrai2

10 May 09:20

dtrai2

v6.2.0

7df6dbb

v6.2.0

Features

add timestamper processor to extract timestamp functionality from normalizer

Improvements

removed arrow dependency and depending features for performance reasons
- switched to datetime.strftime syntax in timestamp_differ, s3_output, elasticsearch_output and opensearch_output
encapsulate time related functionality in logprep.util.time.TimeParser

Bugfix

Fix missing default grok patterns in packaged logprep version

Details

simplify quickstart by @ekneg54 in #373
mark not used fields with init=False, repr=False, eq=False by @ekneg54 in #379
add timestamper processor by @ekneg54 in #374
Fix packaging of logprep to include grok default pattern by @dtrai2 in #382
Fix util time.py for non-utc systems by @dtrai2 in #384
update CHANGELOG.md by @dtrai2 in #385

Full Changelog: v6.1.0...v6.2.0

Contributors

dtrai2 and ekneg54

24 Apr 10:08

ekneg54

v6.1.0

a2151c6

v6.1.0

Features

Add amides processor to extend conventional rule matching by applying machine learning components
Add grokker processor to extract grok functionality from normalizer
Normalizer writes failure tags if normalization fails
Add flush_timeout to opensearch and elasticsearch outputs to ensure message delivery within a configurable period
add kafka_config option to confluent_kafka_input and confluent_kafka_output connectors to provide additional config options to librdkafka

Improvements

Harmonize error messages and handling for processors and connectors
Add ability to schedule periodic tasks to all components
Improve performance of pipeline processing by switching from builtin json to msgspec in pipeline and kafka connectors

Bugfix

Fix resetting processor caches in the auto_rule_corpus_tester by initializing all processors
between test cases.
Fix processing of generic rules after there was an error inside the specific rules.
Remove coordinate fields from results of the geoip enricher if one of them has None values

Details

replace json.loads with msgspec to improve input connector performance by @ekneg54 in #351
Add processor and rules for the Adaptive Misuse Detection System (AMIDES) by @clumsy9 in #347
Make multiple applications of rules by the same processor optional by @ppcad in #355
add scheduler to all components and schedule search flush after period of times by @ekneg54 in #344
add kafka advanced config option by @ekneg54 in #357
avoid string splitting during processing by @ekneg54 in #362
Speedup startup by @ppcad in #314
Improve error message by @dtrai2 in #366
Fix rule corpus tester processor caches by @dtrai2 in #367
Fix processing of generic rules after an error in the specific rules by @dtrai2 in #368
S3 output by @ppcad in #364
improve pipeline performance by @ekneg54 in #369
Remove all geoip enricher fields if they are None by @ppcad in #371
Add date replacement to prefixes for s3 output by @ppcad in #375
Fix catched exception in tree parser that led to ignoring of rules by @ppcad in #343
add grokker processor by @ekneg54 in #363
improve connector performance by @ekneg54 in #370
prepare release 6.1.0 by @ekneg54 in #377

New Contributors

@clumsy9 made their first contribution in #347

Full Changelog: v6.0.0...v6.1.0

Contributors

clumsy9, ppcad, and 2 other contributors

23 Mar 12:51

ekneg54

v6.0.0

21e317f

v6.0.0

Breaking

Remove rules deprecations introduced in v4.0.0
Change the rule language of selective_extractor, pseudonymizer and pre_detector to support multiple outputs

Features

Add string_splitter processor to split strings of variable length into lists
Add ip_informer processor to enrich events with ip information
Allow running the Pipeline in python without input/output connectors
Add auto_rule_corpus_tester to test a whole rule corpus against defined expected outputs.
Add shorthand for converting datatypes to dissector dissect pattern language
Add support for multiple output connectors
Apply processors multiple times until no new rule matches anymore. This enables applying rules on
results of previous rules.
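
Repeated application until no new rule matches can be sketched as a fixed-point loop. The rule representation, the "each rule at most once" bookkeeping, and the max_rounds guard are assumptions made for this illustration and do not describe Logprep's internals.

```python
def apply_until_stable(event, rules, max_rounds=10):
    """Apply rules repeatedly until a full round produces no new match.

    rules maps a name to a (matches, apply) pair of callables.
    Hypothetical sketch; each rule is applied at most once.
    """
    applied = set()
    for _ in range(max_rounds):
        new_match = False
        for name, (matches, apply) in rules.items():
            if name not in applied and matches(event):
                apply(event)
                applied.add(name)
                new_match = True
        if not new_match:
            break
    return event


# "enrich" depends on the output of "parse" but is listed first,
# so it can only match in the second round.
rules = {
    "enrich": (lambda e: "parsed" in e, lambda e: e.update(enriched=True)),
    "parse": (lambda e: "raw" in e, lambda e: e.update(parsed=True)),
}
result = apply_until_stable({"raw": "some message"}, rules)
```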

Improvements

Bump attrs to >=22.2.0 and delete redundant min_len_validator
Specify the metric labels for connectors (add name, type and direction as labels)
Rename metric names to clarify their meanings (logprep_pipeline_number_of_warnings to
logprep_pipeline_sum_of_processor_warnings and logprep_pipeline_number_of_errors to
logprep_pipeline_sum_of_processor_errors)

Bugfix

Fixes a bug that breaks templating config and rule files with environment variables if one or more variables are not set in environment
Fixes a bug for opensearch_output and elasticsearch_output not handling authentication issues
Fix metric logprep_pipeline_number_of_processed_events to actually count the processed events per pipeline
Fix a bug for enrichment with environment variables. Variables must have one of the following prefixes now: LOGPREP_, CI_, GITHUB_ or PYTEST_
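
The prefix restriction can be sketched as follows. The ${NAME} placeholder syntax and the choice to leave unset or disallowed variables untouched are assumptions for this illustration; only the four prefixes come from the release note.

```python
import re

ALLOWED_PREFIXES = ("LOGPREP_", "CI_", "GITHUB_", "PYTEST_")
_VARIABLE = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_allowed(text, environ):
    """Expand ${NAME} only for allowed prefixes that are set in environ.

    Hypothetical helper; everything else is left as written.
    """
    def replace(match):
        name = match.group(1)
        if name.startswith(ALLOWED_PREFIXES) and name in environ:
            return environ[name]
        return match.group(0)

    return _VARIABLE.sub(replace, text)


environ = {"LOGPREP_HOST": "kafka", "HOME": "/root"}
expanded = expand_allowed("host: ${LOGPREP_HOST}, home: ${HOME}", environ)
```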

Improvements

reimplements the selective_extractor

Details

add simple string splitter processor by @ekneg54 in #304
fix type hint for target_field by @ekneg54 in #317
Fix auto rule tester to call all processor setup methods by @dtrai2 in #312
add processor to enrich IP information by @ekneg54 in #303
Add pseudonymization of list elements and allow regex of list elements in auto-tests by @ppcad in #284
Fix graceful shutdown of multiprocessing pipeline by @dtrai2 in #313
bump attrs to 22.2 by @ekneg54 in #319
Enable pipeline run without connectors by @dtrai2 in #308
Add auto rule corpus tester by @dtrai2 in #185
Fix automodule documentation paths by @dtrai2 in #324
Add shorthand to dissect pattern for converting datatypes by @ekneg54 in #328
update changelog bump attrs requirement by @ekneg54 in #331
add documentation for environment variables in config by @ekneg54 in #326
make properties configurable in ip_informer rule by @ekneg54 in #325
fix handle template errors by @ekneg54 in #333
fix opensearch-output-connector-raises-to-late-if-unauthenticated by @ekneg54 in #334
add support for multiple output connectors by @ekneg54 in #306
Fix metrics for multiple outputs by @dtrai2 in #336
Fixed broken link in README.md by @0xr2po in #339
remove verification of configuration in pipeline manager and run_logprep by @ekneg54 in #337
add connectionerror to connection test try block by @ekneg54 in #341
make broader exception for search connection test by @ekneg54 in #342
Apply processors multiple times by @dtrai2 in #318
remove all deprecated code and resolve all warnings by @ekneg54 in #345
only expand valid prefixed variables by @ekneg54 in #349
fix non-deterministic behavior and add test by @dtrai2 in #346
bump confluent-kafka client to >2.0.0 by @ekneg54 in #353
Fix dry_runner to deal with multiple outputs by @dtrai2 in #352
prepare release v6.0.0 by @ekneg54 in #354

New Contributors

@0xr2po made their first contribution in #339

Full Changelog: v5.0.1...v6.0.0

Contributors

ppcad, dtrai2, and 2 other contributors

06 Feb 16:18

ekneg54

v5.0.1

e52804a

v5.0.1

Breaking

drop support for python 3.6, 3.7, 3.8
change default prefix behavior on appending to strings of dissector

Features

Add an http input connector that spawns a uvicorn server which parses request content to events.
Add a file input connector that reads generic logfiles.
Provide the possibility to consume lists, rules and configuration from files and http endpoints
Add requester processor that enriches by making http requests with field values
Add calculator processor to calculate with or without field values
Make output subfields of the geoip_enricher configurable by introducing the rule config
customize_target_subfields
Add a timestamp_differ processor that can parse two timestamps and calculate their respective time delta.
Add config_refresh_interval configuration option to refresh the configuration on a given timedelta
Add option to dissector to use a prefix pattern in dissect language for appending to strings and add the default behavior to append to strings without any prefixed separator

Improvements

Add support for python 3.10 and 3.11
Add option to submit a template with list_search_base_path config parameter in list_comparison processor
Add functionality to geoip_enricher to download the geoip-database
Add ability to use environment variables in rules and config
Add list access including slicing to dotted field notation for getting values
Add processor boilerplate generator to help adding new processors
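
List access with slicing in dotted field notation can be sketched like this. The notation used here (a numeric segment such as tags.1 for an index, and tags.1:3 for a slice) is an assumption for illustration, and the function is a hypothetical stand-in, not Logprep's get_dotted_field_value.

```python
def get_dotted_value_sketch(event, dotted_field):
    """Resolve a dotted path through dicts and lists; return None if missing.

    Hypothetical helper. List segments are an index ("1") or a slice ("1:3").
    """
    current = event
    for key in dotted_field.split("."):
        if isinstance(current, dict):
            if key not in current:
                return None
            current = current[key]
        elif isinstance(current, list):
            try:
                if ":" in key:
                    start, stop = (int(p) if p else None for p in key.split(":", 1))
                    current = current[start:stop]
                else:
                    current = current[int(key)]
            except (ValueError, IndexError):
                return None
        else:
            return None
    return current


event = {"tags": ["a", "b", "c", "d"], "net": {"ips": ["1.1.1.1"]}}
```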

Bugfixes

Fix count of number_of_processed_events metric in input connector. Will now only count actual
events.

Details

Reimplement flipping aggregation test by @ekneg54 in #222
consume lists, configuration and rules via api or file by @ekneg54 in #221
Add HTTP Connector by @ekneg54 in #75
fix dissector not able to start with seperator by @ekneg54 in #227
drop support for python 3.6, 3.7, 3.8 and bump dependencies to support 3.11 by @ekneg54 in #228
fix main pipeline by @ekneg54 in #229
fix documentation build by @ekneg54 in #230
Add calculator processor by @ekneg54 in #223
fix Dockerfile by @ekneg54 in #232
Geoip Enricher: Add config to customize target subfields by @dtrai2 in #231
fix regex normalization by @ekneg54 in #234
Fix normalizer documentation by @dtrai2 in #235
Fix count of number_of_processed_events in input connector by @dtrai2 in #236
add notebooks by @ekneg54 in #237
add jsonl parser to getter by @ekneg54 in #240
Dev improve documentation by @dtrai2 in #244
241 feature load files from http is broken because of file validator by @ekneg54 in #243
add http config option to documentation by @ekneg54 in #238
fix mermaid documentation output by @ekneg54 in #248
Fix deleter unbound local error by @ekneg54 in #247
Upgrade protobuff to non CVE version by @ekneg54 in #239
log python and logprep version on startup by @ekneg54 in #245
upgrade certifi by @ekneg54 in #249
Add timestamp_differ processor by @dtrai2 in #242
fix http config load - remove hard filesystem validations by @ekneg54 in #250
Add requester processor by @ekneg54 in #226
Relax the timestamp_differ timestamp format by @dtrai2 in #252
refactor pipeline by @ekneg54 in #251
Fix docu + notebook by @dtrai2 in #253
Hide units by default by @dtrai2 in #255
add requester example by @ekneg54 in #256
add build image by @ekneg54 in #254
add variable expansion to getter by @ekneg54 in #257
fix version output by @ekneg54 in #258
add option to submit a template through list_search_base_path by @ekneg54 in #259
add session handling to http getter by @ekneg54 in #260
add get_raw to getter and make it the abstract method by @ekneg54 in #261
geoip enricher downloads geoip database if not exists by @ekneg54 in #262
fix artifacts load during process time by @ekneg54 in #263
Resolve deprecation warnings in tests for deleter and datetime_extractor by @ekneg54 in #264
resolve deprecation warnings in test for domain_label_extractor and domain_resolver by @ekneg54 in #265
resolve deprecation warnings in test for dropper by @ekneg54 in #266
resolve deprecation warnings in test for geoip_enricher by @ekneg54 in #267
resolve deprecation warnings in test for labeler by @ekneg54 in #268
resolve deprecation warnings in test for list_comparison by @ekneg54 in #269
fix tldlist error on startup by @ekneg54 in #271
fix minor issues by @ekneg54 in #272
refactor filter expression by @ekneg54 in #274
Fix deprecation warnings pseudonymizer by @ekneg54 in #270
refactor pipeline.py by @ekneg54 in #273
fix readme badge and revise readme by @ekneg54 in #276
Add config refresh interval by @ekneg54 in #275
make http getter more robust by @ekneg54 in #277
Fix config refresh after failure by @ekneg54 in #281
Fix scale pipeline with config refresh by @ekneg54 in #280
Dev substitute from environment in config and rules by @ekneg54 in #283
Fix typo leading to crash of pipeline by @ppcad in #285
Update rules.rst by @Vrdlbrmft in #287
Fix error with regex normalization by @ppcad in #286
bump hyperscan to 0.4.0 by @ekneg54 in #291
add append without separator to dissector by @ekneg54 in #290
fix quickstart docker-compose setup by @ekneg54 in #292
Revert "fix quickstart docker-compose setup" by @ekneg54 in #293
refactor pipeline.py by @ekneg54 in #295
Dev ducplication warnings and generic adder overwrite by @ppcad in #282
Dev add generic file input connector by @herrfeder in #288
Fix quickstart docker config by @ekneg54 in #294
fix type hints for field manager by @ekneg54 in #300
add list access to get_dotted_field_value by @ekneg54 in #299
Expand configuration verification by @ppcad in #301
add processor boilerplate generator helper by @ekneg54 in #302
fix dissector regex by @ekneg54 in #307
Set deprecation filter to always by @dtrai2 in #309

New Contributors

@Vrdlbrmft made their first contribution in #287
@herrfeder made their first contribution in #288

Full Changelog: v4.0.0...v5.0.0

Contributors

herrfeder, ppcad, and 3 other contributors

21 Nov 10:33

ekneg54

v4.0.0

2dd4cc5

v4.0.0

Breaking

Splitting the general connector config into input and output to compose connector config independently
Removal of Deprecated Feature: HMAC-Options in the connector consumer options have to be
under the subkey preprocessing of the input processor
Removal of Deprecated Feature: delete processor was renamed to deleter
Rename writing_output connector to jsonl_output

Features

Add an opensearch output connector that can be used to write directly into opensearch.
Add an elasticsearch output connector that can be used to write directly into elasticsearch.
Split connector config into separate config keys input and output
Add preprocessing capabilities to all input connectors
Add preprocessor for log_arrival_time
Add preprocessor for log_arrival_timedelta
Add metrics to connectors
Add concatenator processor that can combine multiple source fields
Add dissector processor that tokenizes messages into new or existing fields
Add key_checker processor that checks if all dotted fields from a list are present in the event
Add field_manager processor that copies or moves fields and merges lists
Add ability to delete source fields to concatenator, datetime_extractor, dissector, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
Add ability to overwrite target field to datetime_extractor, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
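
The check performed by the key_checker, whether all dotted fields from a list are present in the event, can be sketched as below. The function name and return value are hypothetical; what the processor does with missing fields is not described here.

```python
def missing_fields(event, dotted_fields):
    """Return the dotted fields that cannot be resolved in the event.

    Hypothetical helper. A key with the value None still counts as present.
    """
    missing = []
    for dotted in dotted_fields:
        current = event
        for key in dotted.split("."):
            if isinstance(current, dict) and key in current:
                current = current[key]
            else:
                missing.append(dotted)
                break
    return missing


event = {"a": {"b": 1}, "c": None}
result = missing_fields(event, ["a.b", "a.x", "c", "d"])
```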

Improvements

Validate connector config on class level via attrs classes
Implement a common interface to all connectors
Refactor connector code
Revise the documentation
Add sphinxcontrib.datatemplates and testcase-renderer to docs
Reimplement get_dotted_field_value helper method which should lead to increased performance
Reimplement dropper processor code to improve performance

Deprecations

Rule Language

datetime_extractor.datetime_field is deprecated. Use datetime_extractor.source_fields as list instead.
datetime_extractor.destination_field is deprecated. Use datetime_extractor.target_field instead.
delete is deprecated. Use deleter.delete instead.
domain_label_extractor.target_field is deprecated. Use domain_label_extractor.source_fields as list instead.
domain_label_extractor.output_field is deprecated. Use domain_label_extractor.target_field instead.
domain_resolver.source_url_or_domain is deprecated. Use domain_resolver.source_fields as list instead.
domain_resolver.output_field is deprecated. Use domain_resolver.target_field instead.
drop is deprecated. Use dropper.drop instead.
drop_full is deprecated. Use dropper.drop_full instead.
geoip_enricher.source_ip is deprecated. Use geoip_enricher.source_fields as list instead.
geoip_enricher.output_field is deprecated. Use geoip_enricher.target_field instead.
label is deprecated. Use labeler.label instead.
list_comparison.check_field is deprecated. Use list_comparison.source_fields as list instead.
list_comparison.output_field is deprecated. Use list_comparison.target_field instead.
pseudonymize is deprecated. Use pseudonymizer.pseudonyms instead.
url_fields is deprecated. Use pseudonymizer.url_fields instead.
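
As an illustration of the renames listed above, the following sketch migrates the two deprecated datetime_extractor keys. The surrounding rule layout (a dict with a filter key) and the example values are assumptions; only the old and new key names come from the list.

```python
def migrate_datetime_extractor(rule):
    """Return a copy of the rule with deprecated datetime_extractor keys renamed.

    Hypothetical helper, not part of Logprep.
    """
    config = dict(rule["datetime_extractor"])
    if "datetime_field" in config:
        # The new key takes a list of source fields.
        config["source_fields"] = [config.pop("datetime_field")]
    if "destination_field" in config:
        config["target_field"] = config.pop("destination_field")
    return {**rule, "datetime_extractor": config}


old_rule = {
    "filter": "@timestamp",
    "datetime_extractor": {
        "datetime_field": "@timestamp",
        "destination_field": "split_timestamp",
    },
}
new_rule = migrate_datetime_extractor(old_rule)
```

The other deprecations in the list follow the same two shapes: a single source key that becomes a source_fields list, or a key that is renamed or moved under the processor's name.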

Details

Add es output connector and split kafka connector by @ppcad in #83
Increase Test Coverage for Connectors by @ekneg54 in #169
Remove unused code by @ekneg54 in #170
refactor connectors part 1 by @ekneg54 in #173
Fix duplication in rule tree and adapt rule equalities by @ppcad in #168
Remove accidental print statements from normalizer by @ppcad in #176
Add OpenSearch output connector by @ppcad in #174
Refactor delete processor -> deleter by @saegel in #167
Split Connectors into separate input and output by @ekneg54 in #175
Add concatenator processor by @dtrai2 in #182
Fix picklingerror in debug by @ppcad in #188
Add Dissecter processor by @ekneg54 in #181
Add Dockerfile, .dockerignore and update README by @ekneg54 in #187
Fix time measurement for get_next input connector by @dtrai2 in #191
fix keyerror for message with setted _op_type field by @ekneg54 in #194
192 docs build is failing by @ekneg54 in #197
192 docs build is failing by @ekneg54 in #199
Fix typo dissecter -> dissector by @dtrai2 in #200
Fix typo seperator -> separator by @dtrai2 in #198
change documentation and config key for opensearch and elasticsearch by @ekneg54 in #195
Remove deprecation warnings and tests for next major release by @dtrai2 in #201
Fix error message for missing input/output config keys by @dtrai2 in #202
Fix non-deterministic behavior by @dtrai2 in #203
refactor rules by @ekneg54 in #190
dev-add-keychecker-processor by @DogchampDiego in #204
Implement delete_source_field and overwrite_target to all possible processors by @ekneg54 in #205
reimplement get_dotted_field_value by @ekneg54 in #207
Fix add version info preprocessor and revice tests by @dtrai2 in #208
Adapt rule configuration validation by @ppcad in #209
add field_manager processor by @ekneg54 in #206
refactor add_field_to by @ekneg54 in #210
Fix typo in elasticsearch output connector by @ppcad in #214
Implement HMAC fallback for input connectors without raw message by @dtrai2 in #213
Fix reset of metric by @dtrai2 in #212
Fix error message for processing error in field_manager by @ekneg54 in #215
fix bug for non existing source_field in domain_label_extractor by @ekneg54 in #217
reimplement dropper by @ekneg54 in #216
Fix prometheus_exporter reset on failed pipelines by @dtrai2 in #218
add new representation for rules by @ekneg54 in #219
fix output of warnings by @ekneg54 in #220
Logprep 4.0.0 release by @ekneg54 in #224

New Contributors

@DogchampDiego made their first contribution in #204

Full Changelog: v3.3.0...v4.0.0

Contributors

saegel, ppcad, and 3 other contributors
