Releases: fkie-cad/Logprep
v6.6.0
Improvements
- Replace rule_filter with lucene_filter in predetector output. The old internal logprep rule representation is not present anymore in the predetector output, the name rule_filter will stay in place of the lucene_filter name.
- 'amides' processor now stores confidence values of processed events in the amides.confidence field. In case of positive detection results, rule attributions are now inserted in the amides.attributions field.
Bugfix
- Fix lucene rule filter representation such that it is aligned with opensearch lucene query syntax
- Fix grok pattern UNIXPATH by internally converting [[:alnum:]] to \w
- Fix overwriting of temporary tld-list with empty content
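The UNIXPATH fix can be illustrated with a minimal sketch. Python's re module does not support POSIX bracket classes such as [[:alnum:]], so a pattern containing them has to be rewritten before it is compiled. The helper name convert_posix_alnum and the sample pattern are hypothetical and only illustrate the idea; they are not Logprep's actual implementation.

```python
import re


def convert_posix_alnum(pattern: str) -> str:
    """Replace the POSIX class [[:alnum:]] with \\w (hypothetical helper)."""
    # Note: \w also matches the underscore, which [[:alnum:]] does not.
    return pattern.replace("[[:alnum:]]", r"\w")


# Simplified path-like pattern for illustration, not the real UNIXPATH definition.
posix_pattern = r"(/[[:alnum:]]+)+"
converted = convert_posix_alnum(posix_pattern)
path_matches = re.fullmatch(converted, "/var/log/syslog") is not None
```

After the conversion the pattern compiles with the standard re module and matches an ordinary unix path.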
Details
- replace rule filter with lucene filter in predetector output by @dtrai2 in #403
- fix lucene filter representation by @dtrai2 in #409
- fix dry_runner to support selective_extractor extra outputs by @dtrai2 in #407
- Overhaul AMIDES output by @clumsy9 in #411
- Extend predetector documentation by @dtrai2 in #412
- fix UNIXPATH grok pattern by @dtrai2 in #415
- prevent overwriting existing file by @dtrai2 in #414
- Prepare Release v6.6.0 by @dtrai2 in #418
Full Changelog: v6.5.1...v6.6.0
v6.5.1
v6.5.0
Improvements
- Make the PROMETHEUS_MULTIPROC_DIR environment variable optional, will default to /tmp/PROMETHEUS_MULTIPROC_DIR if not given
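A minimal sketch of the fallback described above, assuming the directory is resolved from the process environment. The function name resolve_multiproc_dir and the optional environ parameter are hypothetical and only serve the illustration.

```python
import os
from typing import Mapping, Optional

DEFAULT_MULTIPROC_DIR = "/tmp/PROMETHEUS_MULTIPROC_DIR"


def resolve_multiproc_dir(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured directory, or the default if the variable is unset."""
    env = os.environ if environ is None else environ
    return env.get("PROMETHEUS_MULTIPROC_DIR", DEFAULT_MULTIPROC_DIR)


unset_result = resolve_multiproc_dir({})
custom_result = resolve_multiproc_dir({"PROMETHEUS_MULTIPROC_DIR": "/var/run/metrics"})
```

Passing a mapping instead of reading os.environ directly keeps the helper easy to test.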
Bugfix
- All temp files will now be stored inside the system's default temp directory
Details
- bump requests to >=2.31.0 by @dtrai2 in #397
- Fix confluent kafka 2 store offset by @dtrai2 in #386
- fix error handling of fieldmanager if no mapped source field exists by @dtrai2 in #395
- add lucene filter to predetector detections by @dtrai2 in #396
- grokker fixes by @dtrai2 in #398
- prepare release v6.4.0 by @dtrai2 in #399
- remove direct python-dateutil dependency by @dtrai2 in #393
- rework logprep temp files by @dtrai2 in #402
- update CHANGELOG.md by @dtrai2 in #405
Full Changelog: v6.3.0...v6.5.0
v6.4.0
Improvements
- Bump requests to >=2.31.0 to circumvent CVE-2023-32681
- Include a lucene representation of the rule filter into the predetector results. The representation is not completely lucene compatible due to non-existing regex functionality.
Bugfix
- Fix error handling of FieldManager if no mapped source field exists in the event.
- Fix Grokker such that only the first grok pattern match is applied instead of all matching patterns
- Fix Grokker such that nested parentheses in oniguruma patterns are working (3 levels are supported now)
- Fix Grokker such that two or more oniguruma patterns can point to the same target. This ensures grok-pattern compatibility with the normalizer and other grok tools
Details
- bump requests to >=2.31.0 by @dtrai2 in #397
- Fix confluent kafka 2 store offset by @dtrai2 in #386
- fix error handling of fieldmanager if no mapped source field exists by @dtrai2 in #395
- add lucene filter to predetector detections by @dtrai2 in #396
- grokker fixes by @dtrai2 in #398
Full Changelog: v6.3.0...v6.4.0
v6.3.0
Features
- Extend dissector such that it can trim characters around dissected field with %{field-( )} notation.
- Extend timestamper such that it can take multiple source_formats. First format that matches will be used, all following formats will be ignored
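The "first matching format wins" behavior of multiple source_formats can be sketched as follows. This is an illustration only: it assumes plain datetime.strptime format strings, and the function name parse_with_first_matching_format is hypothetical rather than part of the timestamper.

```python
from datetime import datetime
from typing import Iterable, Optional


def parse_with_first_matching_format(
    value: str, source_formats: Iterable[str]
) -> Optional[datetime]:
    """Try the formats in order and return the result of the first that parses."""
    for fmt in source_formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            # This format does not fit the value, try the next one.
            continue
    return None


formats = ["%Y-%m-%dT%H:%M:%S", "%Y-%m-%d %H:%M:%S"]
parsed = parse_with_first_matching_format("2023-06-01 12:00:00", formats)
unparsed = parse_with_first_matching_format("not a timestamp", formats)
```

Formats listed after the first successful one are never evaluated, so the order of the list expresses priority.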
Improvements
- Extend the FieldManager such that it can move/copy multiple source fields into multiple targets inside one rule.
Bugfix
- Fix error handling of missing source fields in grokker
- Fix using same output fields in list of grok pattern in grokker
Details
- Grokker: fix handling of missing source fields by @dtrai2 in #387
- Allow same target names in grokker mapping list by @dtrai2 in #389
- Dissector: add feature to strip chars after dissecting by @dtrai2 in #390
- Extend fieldmanager rule to support writing into multiple targets in one rule by @dtrai2 in #388
- allow using list of source_formats in timestamper by @dtrai2 in #392
- merge processors and rule docs to reduce duplicate documentation by @dtrai2 in #391
- prepare release 6.3.0 by @dtrai2 in #394
Full Changelog: v6.2.0...v6.3.0
v6.2.0
Features
- add timestamper processor to extract timestamp functionality from normalizer
Improvements
- removed arrow dependency and depending features for performance reasons
  - switched to datetime.strftime syntax in timestamp_differ, s3_output, elasticsearch_output and opensearch_output
- encapsulate time related functionality in logprep.util.time.TimeParser
Bugfix
- Fix missing default grok patterns in packaged logprep version
Details
- simplify quickstart by @ekneg54 in #373
- mark not used fields with init=False, repr=False, eq=False by @ekneg54 in #379
- add timestamper processor by @ekneg54 in #374
- Fix packaging of logprep to include grok default pattern by @dtrai2 in #382
- Fix util time.py for non-utc systems by @dtrai2 in #384
- update CHANGELOG.md by @dtrai2 in #385
Full Changelog: v6.1.0...v6.2.0
v6.1.0
Features
- Add amides processor to extend conventional rule matching by applying machine learning components
- Add grokker processor to extract grok functionality from normalizer
- Normalizer writes failure tags if normalization fails
- Add flush_timeout to opensearch and elasticsearch outputs to ensure message delivery within a configurable period
- add kafka_config option to confluent_kafka_input and confluent_kafka_output connectors to provide additional config options to librdkafka
Improvements
- Harmonize error messages and handling for processors and connectors
- Add ability to schedule periodic tasks to all components
- Improve performance of pipeline processing by switching from builtin json to msgspec in pipeline and kafka connectors
Bugfix
- Fix resetting processor caches in the auto_rule_corpus_tester by initializing all processors between test cases.
- Fix processing of generic rules after there was an error inside the specific rules.
- Remove coordinate fields from results of the geoip enricher if one of them has None values
Details
- replace json.loads with msgspec to improve input connector performance by @ekneg54 in #351
- Add processor and rules for the Adaptive Misuse Detection System (AMIDES) by @clumsy9 in #347
- Make multiple applications of rules by the same processor optional by @ppcad in #355
- add scheduler to all components and schedule search flush after period of times by @ekneg54 in #344
- add kafka advanced config option by @ekneg54 in #357
- avoid string splitting during processing by @ekneg54 in #362
- Speedup startup by @ppcad in #314
- Improve error message by @dtrai2 in #366
- Fix rule corpus tester processor caches by @dtrai2 in #367
- Fix processing of generic rules after an error in the specific rules by @dtrai2 in #368
- S3 output by @ppcad in #364
- improve pipeline performance by @ekneg54 in #369
- Remove all geoip enricher fields if they are None by @ppcad in #371
- Add date replacement to prefixes for s3 output by @ppcad in #375
- Fix catched exception in tree parser that led to ignoring of rules by @ppcad in #343
- add grokker processor by @ekneg54 in #363
- improve connector performance by @ekneg54 in #370
- prepare release 6.1.0 by @ekneg54 in #377
Full Changelog: v6.0.0...v6.1.0
v6.0.0
Breaking
- Remove rules deprecations introduced in v4.0.0
- Changes rule language of selective_extractor, pseudonymizer, pre_detector to support multiple outputs
Features
- Add string_splitter processor to split strings of variable length into lists
- Add ip_informer processor to enrich events with ip information
- Allow running the Pipeline in python without input/output connectors
- Add auto_rule_corpus_tester to test a whole rule corpus against defined expected outputs.
- Add shorthand for converting datatypes to dissector dissect pattern language
- Add support for multiple output connectors
- Apply processors multiple times until no new rule matches anymore. This enables applying rules on results of previous rules.
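Applying processors until no new rule matches is a fixed-point loop. The sketch below shows the idea under simplifying assumptions: a rule is modeled as a hypothetical (matches, apply) pair of callables and each rule is applied at most once. None of these names come from Logprep.

```python
from typing import Callable, Dict, List, Tuple

Event = Dict[str, object]
# A rule is a (matches, apply) pair; both are hypothetical stand-ins for real rules.
Rule = Tuple[Callable[[Event], bool], Callable[[Event], None]]


def process_until_stable(event: Event, rules: List[Rule]) -> int:
    """Apply each rule at most once, repeating passes until no new rule matches."""
    applied = set()
    passes = 0
    while True:
        new_matches = [
            index
            for index, (matches, _) in enumerate(rules)
            if index not in applied and matches(event)
        ]
        if not new_matches:
            return passes
        for index in new_matches:
            rules[index][1](event)
            applied.add(index)
        passes += 1


rules: List[Rule] = [
    # Only matches once the second rule has written the "user" field.
    (lambda e: "user" in e, lambda e: e.update(user_upper=str(e["user"]).upper())),
    (lambda e: "message" in e, lambda e: e.update(user=str(e["message"]).split()[0])),
]
event: Event = {"message": "alice logged in"}
passes = process_until_stable(event, rules)
```

The first rule depends on a field written by the second one, so it only matches in the second pass. This is the case the feature enables.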
Improvements
- Bump attrs to >=22.2.0 and delete redundant min_len_validator
- Specify the metric labels for connectors (add name, type and direction as labels)
- Rename metric names to clarify their meanings (logprep_pipeline_number_of_warnings to logprep_pipeline_sum_of_processor_warnings and logprep_pipeline_number_of_errors to logprep_pipeline_sum_of_processor_errors)
Bugfix
- Fixes a bug that breaks templating config and rule files with environment variables if one or more variables are not set in environment
- Fixes a bug for opensearch_output and elasticsearch_output not handling authentication issues
- Fix metric logprep_pipeline_number_of_processed_events to actually count the processed events per pipeline
- Fix a bug for enrichment with environment variables. Variables must have one of the following prefixes now: LOGPREP_, CI_, GITHUB_ or PYTEST_
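The prefix restriction for environment variables can be sketched like this. The $VAR and ${VAR} placeholder syntax, the function name expand_prefixed_variables, and the choice to leave unset or disallowed variables untouched are assumptions made for the illustration, not a description of Logprep's code.

```python
import re
from typing import Mapping

ALLOWED_PREFIXES = ("LOGPREP_", "CI_", "GITHUB_", "PYTEST_")
# Matches ${NAME} or $NAME; the placeholder syntax is an assumption.
_VARIABLE = re.compile(r"\$\{(\w+)\}|\$(\w+)")


def expand_prefixed_variables(text: str, environ: Mapping[str, str]) -> str:
    """Expand only variables whose names start with an allowed prefix."""

    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1) or match.group(2)
        if name.startswith(ALLOWED_PREFIXES) and name in environ:
            return environ[name]
        # Disallowed or unset variables are left exactly as written.
        return match.group(0)

    return _VARIABLE.sub(substitute, text)


environment = {"LOGPREP_HOST": "localhost", "HOME": "/root"}
expanded = expand_prefixed_variables("host: ${LOGPREP_HOST}, home: $HOME", environment)
```

In this example HOME is set in the mapping but does not carry an allowed prefix, so it is not expanded.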
Improvements
- reimplements the selective_extractor
Details
- add simple string splitter processor by @ekneg54 in #304
- fix type hint for target_field by @ekneg54 in #317
- Fix auto rule tester to call all processor setup methods by @dtrai2 in #312
- add processor to enrich IP information by @ekneg54 in #303
- Add pseudonymization of list elements and allow regex of list elements in auto-tests by @ppcad in #284
- Fix graceful shutdown of multiprocessing pipeline by @dtrai2 in #313
- bump attrs to 22.2 by @ekneg54 in #319
- Enable pipeline run without connectors by @dtrai2 in #308
- Add auto rule corpus tester by @dtrai2 in #185
- Fix automodule documentation paths by @dtrai2 in #324
- Add shorthand to dissect pattern for converting datatypes by @ekneg54 in #328
- update changelog bump attrs requirement by @ekneg54 in #331
- add documentation for environment variables in config by @ekneg54 in #326
- make properties configurable in ip_informer rule by @ekneg54 in #325
- fix handle template errors by @ekneg54 in #333
- fix opensearch-output-connector-raises-to-late-if-unauthenticated by @ekneg54 in #334
- add support for multiple output connectors by @ekneg54 in #306
- Fix metrics for multiple outputs by @dtrai2 in #336
- Fixed broken link in README.md by @0xr2po in #339
- remove verification of configuration in pipeline manager and run_logprep by @ekneg54 in #337
- add connectionerror to connection test try block by @ekneg54 in #341
- make broader exception for search connection test by @ekneg54 in #342
- Apply processors multiple times by @dtrai2 in #318
- remove all deprecated code and resolve all warnings by @ekneg54 in #345
- only expand valid prefixed variables by @ekneg54 in #349
- fix non-deterministic behavior and add test by @dtrai2 in #346
- bump confluent-kafka client to >2.0.0 by @ekneg54 in #353
- Fix dry_runner to deal with multiple outputs by @dtrai2 in #352
- prepare release v6.0.0 by @ekneg54 in #354
Full Changelog: v5.0.1...v6.0.0
v5.0.1
Breaking
- drop support for python 3.6, 3.7, 3.8
- change default prefix behavior on appending to strings of dissector
Features
- Add an http input connector that spawns a uvicorn server which parses requests content to events.
- Add a file input connector that reads generic logfiles.
- Provide the possibility to consume lists, rules and configuration from files and http endpoints
- Add requester processor that enriches by making http requests with field values
- Add calculator processor to calculate with or without field values
- Make output subfields of the geoip_enricher configurable by introducing the rule config customize_target_subfields
- Add a timestamp_differ processor that can parse two timestamps and calculate their respective time delta.
- Add config_refresh_interval configuration option to refresh the configuration on a given timedelta
- Add option to dissector to use a prefix pattern in dissect language for appending to strings and add the default behavior to append to strings without any prefixed separator
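What the timestamp_differ entry above describes, parsing two timestamps and computing their delta, can be sketched with the standard library. The function name timestamp_delta and the use of strptime format strings are assumptions for this illustration; the processor's actual rule syntax is not shown here.

```python
from datetime import datetime, timedelta


def timestamp_delta(
    first: str, first_format: str, second: str, second_format: str
) -> timedelta:
    """Parse both timestamps with their own format and return first minus second."""
    return datetime.strptime(first, first_format) - datetime.strptime(
        second, second_format
    )


delta = timestamp_delta(
    "2023-01-01 12:00:30", "%Y-%m-%d %H:%M:%S",
    "01.01.2023 12:00:00", "%d.%m.%Y %H:%M:%S",
)
delta_seconds = delta.total_seconds()
```

Each timestamp gets its own format, so fields written by different systems can be compared directly.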
Improvements
- Add support for python 3.10 and 3.11
- Add option to submit a template with list_search_base_path config parameter in list_comparison processor
- Add functionality to geoip_enricher to download the geoip-database
- Add ability to use environment variables in rules and config
- Add list access including slicing to dotted field notation for getting values
- Add processor boilerplate generator to help adding new processors
Bugfixes
- Fix count of number_of_processed_events metric in input connector. Will now only count actual events.
Details
- Reimplement flipping aggregation test by @ekneg54 in #222
- consume lists, configuration and rules via api or file by @ekneg54 in #221
- Add HTTP Connector by @ekneg54 in #75
- fix dissector not able to start with seperator by @ekneg54 in #227
- drop support for python 3.6, 3.7, 3.8 and bump dependencies to support 3.11 by @ekneg54 in #228
- fix main pipeline by @ekneg54 in #229
- fix documentation build by @ekneg54 in #230
- Add calculator processor by @ekneg54 in #223
- fix Dockerfile by @ekneg54 in #232
- Geoip Enricher: Add config to customize target subfields by @dtrai2 in #231
- fix regex normalization by @ekneg54 in #234
- Fix normalizer documentation by @dtrai2 in #235
- Fix count of number_of_processed_events in input connector by @dtrai2 in #236
- add notebooks by @ekneg54 in #237
- add jsonl parser to getter by @ekneg54 in #240
- Dev improve documentation by @dtrai2 in #244
- 241 feature load files from http is broken because of file validator by @ekneg54 in #243
- add http config option to documentation by @ekneg54 in #238
- fix mermaid documentation output by @ekneg54 in #248
- Fix deleter unbound local error by @ekneg54 in #247
- Upgrade protobuff to non CVE version by @ekneg54 in #239
- log python and logprep version on startup by @ekneg54 in #245
- upgrade certifi by @ekneg54 in #249
- Add timestamp_differ processor by @dtrai2 in #242
- fix http config load - remove hard filesystem validations by @ekneg54 in #250
- Add requester processor by @ekneg54 in #226
- Relax the timestamp_differ timestamp format by @dtrai2 in #252
- refactor pipeline by @ekneg54 in #251
- Fix docu + notebook by @dtrai2 in #253
- Hide units by default by @dtrai2 in #255
- add requester example by @ekneg54 in #256
- add build image by @ekneg54 in #254
- add variable expansion to getter by @ekneg54 in #257
- fix version output by @ekneg54 in #258
- add option to submit a template through list_search_base_path by @ekneg54 in #259
- add session handling to http getter by @ekneg54 in #260
- add get_raw to getter and make it the abstract method by @ekneg54 in #261
- geoip enricher downloads geoip database if not exists by @ekneg54 in #262
- fix artifacts load during process time by @ekneg54 in #263
- Resolve deprecation warnings in tests for deleter and datetime_extractor by @ekneg54 in #264
- resolve deprecation warnings in test for domain_label_extractor and domain_resolver by @ekneg54 in #265
- resolve deprecation warnings in test for dropper by @ekneg54 in #266
- resolve deprecation warnings in test for geoip_enricher by @ekneg54 in #267
- resolve deprecation warnings in test for labeler by @ekneg54 in #268
- resolve deprecation warnings in test for list_comparison by @ekneg54 in #269
- fix tldlist error on startup by @ekneg54 in #271
- fix minor issues by @ekneg54 in #272
- refactor filter expression by @ekneg54 in #274
- Fix deprecation warnings pseudonymizer by @ekneg54 in #270
- refactor pipeline.py by @ekneg54 in #273
- fix readme badge and revise readme by @ekneg54 in #276
- Add config refresh interval by @ekneg54 in #275
- make http getter more robust by @ekneg54 in #277
- Fix config refresh after failure by @ekneg54 in #281
- Fix scale pipeline with config refresh by @ekneg54 in #280
- Dev substitute from environment in config and rules by @ekneg54 in #283
- Fix typo leading to crash of pipeline by @ppcad in #285
- Update rules.rst by @Vrdlbrmft in #287
- Fix error with regex normalization by @ppcad in #286
- bump hyperscan to 0.4.0 by @ekneg54 in #291
- add append without separator to dissector by @ekneg54 in #290
- fix quickstart docker-compose setup by @ekneg54 in #292
- Revert "fix quickstart docker-compose setup" by @ekneg54 in #293
- refactor pipeline.py by @ekneg54 in #295
- Dev ducplication warnings and generic adder overwrite by @ppcad in #282
- Dev add generic file input connector by @herrfeder in #288
- Fix quickstart docker config by @ekneg54 in #294
- fix type hints for field manager by @ekneg54 in #300
- add list access to get_dotted_field_value by @ekneg54 in #299
- Expand configuration verification by @ppcad in #301
- add processor boilerplate generator helper by @ekneg54 in #302
- fix dissector regex by @ekneg54 in #307
- Set deprecation filter to always by @dtrai2 in #309
New Contributors
- @Vrdlbrmft made their first contribution in #287
- @herrfeder made their first contribution in #288
Full Changelog: v4.0.0...v5.0.0
v4.0.0
Breaking
- Splitting the general connector config into input and output to compose connector config independently
- Removal of Deprecated Feature: HMAC-Options in the connector consumer options have to be under the subkey preprocessing of the input processor
- Removal of Deprecated Feature: delete processor was renamed to deleter
- Rename writing_output connector to jsonl_output
Features
- Add an opensearch output connector that can be used to write directly into opensearch.
- Add an elasticsearch output connector that can be used to write directly into elasticsearch.
- Split connector config into separate config keys input and output
- Add preprocessing capabilities to all input connectors
- Add preprocessor for log_arrival_time
- Add preprocessor for log_arrival_timedelta
- Add metrics to connectors
- Add concatenator processor that can combine multiple source fields
- Add dissector processor that tokenizes messages into new or existing fields
- Add key_checker processor that checks if all dotted fields from a list are present in the event
- Add field_manager processor that copies or moves fields and merges lists
- Add ability to delete source fields to concatenator, datetime_extractor, dissector, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
- Add ability to overwrite target field to datetime_extractor, domain_label_extractor, domain_resolver, geoip_enricher and list_comparison
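The check performed by a key checker, whether every dotted field from a list is present in the event, can be sketched as below. The function name missing_dotted_fields and the returned list of missing fields are hypothetical choices for the illustration, not the key_checker's actual interface.

```python
from typing import Iterable, List


def missing_dotted_fields(event: dict, dotted_fields: Iterable[str]) -> List[str]:
    """Return the dotted fields that cannot be resolved in the nested event."""
    missing = []
    for dotted in dotted_fields:
        current = event
        for key in dotted.split("."):
            if isinstance(current, dict) and key in current:
                current = current[key]
            else:
                # The path breaks here, so the whole dotted field is missing.
                missing.append(dotted)
                break
    return missing


event = {"source": {"ip": "192.0.2.1"}, "message": "login"}
missing = missing_dotted_fields(event, ["source.ip", "source.port", "message"])
```

Each dotted name is resolved one key at a time, so a missing intermediate key is reported the same way as a missing leaf.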
Improvements
- Validate connector config on class level via attrs classes
- Implement a common interface to all connectors
- Refactor connector code
- Revise the documentation
- Add sphinxcontrib.datatemplates and testcase-renderer to docs
- Reimplement get_dotted_field_value helper method which should lead to increased performance
- Reimplement dropper processor code to improve performance
Deprecations
Rule Language
- datetime_extractor.datetime_field is deprecated. Use datetime_extractor.source_fields as list instead.
- datetime_extractor.destination_field is deprecated. Use datetime_extractor.target_field instead.
- delete is deprecated. Use deleter.delete instead.
- domain_label_extractor.target_field is deprecated. Use domain_label_extractor.source_fields as list instead.
- domain_label_extractor.output_field is deprecated. Use domain_label_extractor.target_field instead.
- domain_resolver.source_url_or_domain is deprecated. Use domain_resolver.source_fields as list instead.
- domain_resolver.output_field is deprecated. Use domain_resolver.target_field instead.
- drop is deprecated. Use dropper.drop instead.
- drop_full is deprecated. Use dropper.drop_full instead.
- geoip_enricher.source_ip is deprecated. Use geoip_enricher.source_fields as list instead.
- geoip_enricher.output_field is deprecated. Use geoip_enricher.target_field instead.
- label is deprecated. Use labeler.label instead.
- list_comparison.check_field is deprecated. Use list_comparison.source_fields as list instead.
- list_comparison.output_field is deprecated. Use list_comparison.target_field instead.
- pseudonymize is deprecated. Use pseudonymizer.pseudonyms instead.
- url_fields is deprecated. Use pseudonymizer.url_fields instead.
Details
- Add es output connector and split kafka connector by @ppcad in #83
- Increase Test Coverage for Connectors by @ekneg54 in #169
- Remove unused code by @ekneg54 in #170
- refactor connectors part 1 by @ekneg54 in #173
- Fix duplication in rule tree and adapt rule equalities by @ppcad in #168
- Remove accidental print statements from normalizer by @ppcad in #176
- Add OpenSearch output connector by @ppcad in #174
- Refactor delete processor -> deleter by @saegel in #167
- Split Connectors into separate input and output by @ekneg54 in #175
- Add concatenator processor by @dtrai2 in #182
- Fix picklingerror in debug by @ppcad in #188
- Add Dissecter processor by @ekneg54 in #181
- Add Dockerfile, .dockerignore and update README by @ekneg54 in #187
- Fix time measurement for get_next input connector by @dtrai2 in #191
- fix keyerror for message with setted _op_type field by @ekneg54 in #194
- 192 docs build is failing by @ekneg54 in #197
- 192 docs build is failing by @ekneg54 in #199
- Fix typo dissecter -> dissector by @dtrai2 in #200
- Fix typo seperator -> separator by @dtrai2 in #198
- change documentation and config key for opensearch and elasticsearch by @ekneg54 in #195
- Remove deprecation warnings and tests for next major release by @dtrai2 in #201
- Fix error message for missing input/output config keys by @dtrai2 in #202
- Fix non-deterministic behavior by @dtrai2 in #203
- refactor rules by @ekneg54 in #190
- dev-add-keychecker-processor by @DogchampDiego in #204
- Implement delete_source_field and overwrite_target to all possible processors by @ekneg54 in #205
- reimplement get_dotted_field_value by @ekneg54 in #207
- Fix add version info preprocessor and revice tests by @dtrai2 in #208
- Adapt rule configuration validation by @ppcad in #209
- add field_manager processor by @ekneg54 in #206
- refactor add_field_to by @ekneg54 in #210
- Fix typo in elasticsearch output connector by @ppcad in #214
- Implement HMAC fallback for input connectors without raw message by @dtrai2 in #213
- Fix reset of metric by @dtrai2 in #212
- Fix error message for processing error in field_manager by @ekneg54 in #215
- fix bug for non existing source_field in domain_label_extractor by @ekneg54 in #217
- reimplement dropper by @ekneg54 in #216
- Fix prometheus_exporter reset on failed pipelines by @dtrai2 in #218
- add new representation for rules by @ekneg54 in #219
- fix output of warnings by @ekneg54 in #220
- Logprep 4.0.0 release by @ekneg54 in #224
New Contributors
- @DogchampDiego made their first contribution in #204
Full Changelog: v3.3.0...v4.0.0