-
Notifications
You must be signed in to change notification settings - Fork 1.8k
conf: Convert existing parser*.conf files to yaml, add converter, add tests #11168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Previously, trying to use this file resulted in: [2025/11/15 21:23:49.103609095] [error] [config] error in parser_custom.conf:1: undefined value - check config is in valid classic format Signed-off-by: Hank Leininger <[email protected]>
WalkthroughThis PR migrates many Fluent Bit parser definitions from INI Changes
Sequence Diagram(s)sequenceDiagram
participant Admin as Administrator
participant Conf as parsers.conf
participant Yaml as parsers.yaml
participant Parser as Fluent Bit Parser Registry
rect rgb(230,240,255)
Note over Admin,Parser: Legacy flow
Admin->>Conf: Provide .conf with [PARSER]/[MULTILINE_PARSER]
Conf->>Parser: Register INI-style parsers
end
rect rgb(230,255,230)
Note over Admin,Parser: New YAML flow
Admin->>Yaml: Provide parsers.yaml (list / structured rules)
Yaml->>Parser: Register YAML-style parsers
end
rect rgb(255,245,220)
Note over Admin,Parser: Migration step
Admin->>Admin: Run fb-parser-conf2yaml.pl
Admin->>Conf: Parse .conf sections & comments
Admin->>Yaml: Emit equivalent .yaml files
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Updated test outputs: Verbose mode: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Nitpick comments (4)
tests/internal/data/config_format/convert/parser_custom.external-dns.test (1)
1-1: Confirm whether outer single quotes are expected in external-dns sampleThis sample wraps the log line in single quotes. The parser regex will likely still match (the leading
'satisfies the initial non-space), but if real external-dns logs are not quoted this may diverge from the documented example. Consider dropping the outer quotes unless they are actually present in the referenced log snippet.conf/parsers_openstack.yaml (1)
8-49: Refactor: Extract common OpenStack parser pattern to reduce duplication.All six parsers (cinder, glance, heat, keystone, neutron, nova) define an identical regex pattern. This violates the DRY principle and makes maintenance difficult—any bug fixes to the pattern must be applied in six places.
Consider refactoring the common pattern into a single reusable parser definition or using YAML anchors (
&and*) to eliminate duplication:parsers: - &openstack_base format: regex regex: '^(?<log_time>[^ ][-.\d\+:T]+[ ]*[.:\d]*)\s+(?<pid>[^ ]\d+)\s+(?<severity>[^ ][.-_\w]+)\s+(?<component>[^ ][.-_\w]+)(\s+\[(-|(?<req_id>[^ ][-\w]*) (?<req_user>[^ ][-\w]*) (?<req_project>[^ ][-\w]*) (?<req_domain>[^ ][-\w]*) (?<req_user_domain>[^ ][-\w]*) (?<req_project_domain>[^ ][-\w]*))\]){1}\s+(?<message>.*)$' time_format: '%Y-%m-%d %H:%M:%S.%L' time_keep: Off time_key: log_time - name: cinder <<: *openstack_base - name: glance <<: *openstack_base # ... and so on for heat, keystone, neutron, novaAlternatively, document why each parser is needed separately if they are intentionally distinct.
tests/internal/data/config_format/convert/run_tests.sh (2)
21-26: Verifysed -Eflag compatibility across target platforms.Line 25 uses
sed -Efor extended regex syntax. This flag differs between GNU sed (-Eor-r) and BSD sed (-E). For portability, consider usingsed -nwith explicit pattern ranges or useawkinstead:squash_date() { awk '{gsub(/"date":[0-9]+\.[0-9]+,/, "\"date\":1234567890.123456,"); print}' }Alternatively, document the target platform assumption (e.g., "GNU sed required").
58-62: Add error handling for missing fluent-bit binary and invalid parser names.The script invokes
fluent-bitwithout checking for availability or error status. Iffluent-bitis not in$PATHor if the parser name doesn't exist, the command will fail silently or produce misleading output. Add explicit checks:command -v fluent-bit >/dev/null 2>&1 || die "fluent-bit not found in PATH"Also, capture stderr and exit codes to detect parser errors early:
if ! CONF_OUT=$(cat "$A" | fluent-bit -q -R "${F}".conf -i stdin -p parser="${P}" -o stdout -p format=json_lines 2>&1 | filter); then die "fluent-bit failed for ${F}.conf with parser ${P}: exit code $?" fiAdditionally, verify that the
-qflag suppresses all debug output; if not, stderr may contaminate the hash comparison.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (32)
conf/parser_custom.conf(1 hunks)conf/parser_custom.yaml(1 hunks)conf/parsers.conf(1 hunks)conf/parsers.yaml(1 hunks)conf/parsers_ambassador.yaml(1 hunks)conf/parsers_cinder.conf(1 hunks)conf/parsers_cinder.yaml(1 hunks)conf/parsers_extra.conf(2 hunks)conf/parsers_extra.yaml(1 hunks)conf/parsers_java.yaml(1 hunks)conf/parsers_kafka.yaml(1 hunks)conf/parsers_mult.yaml(1 hunks)conf/parsers_multiline.yaml(1 hunks)conf/parsers_openstack.yaml(1 hunks)tests/internal/data/config_format/convert/.gitignore(1 hunks)tests/internal/data/config_format/convert/README(1 hunks)tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl(1 hunks)tests/internal/data/config_format/convert/parser_custom.external-dns.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.neo4j.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.rabbitmq.test(1 hunks)tests/internal/data/config_format/convert/parsers.cri.test(1 hunks)tests/internal/data/config_format/convert/parsers.envoy.test(1 hunks)tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test(1 hunks)tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test(1 hunks)tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test(1 hunks)tests/internal/data/config_format/convert/parsers_cinder.ceph.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.chefclient.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.crowbar.test(1 hunks)tests/internal/data/config_format/convert/run_tests.sh(1 hunks)
🔇 Additional comments (18)
conf/parsers_cinder.conf (1)
3-3: HTTPS Rubular reference update is fineSwitching the Rubular URL to
httpsis a safe, non-functional improvement and keeps the reference up to date. No further changes needed.tests/internal/data/config_format/convert/.gitignore (1)
1-2: Ignoring generated parser configs in this helper directory makes senseThese patterns align with the workflow in the README (copy/convert parser *.conf locally) and prevent accidental check-in of generated test artifacts.
conf/parsers.conf (1)
118-125: CRI Rubular link scheme update is safeSwitching the Rubular reference comment to https is purely cosmetic and has no effect on the cri parser behavior.
conf/parser_custom.conf (1)
1-15: Whitespace-only normalization of[PARSER]headersRemoving the leading space before
[PARSER]is a clean formatting fix and keeps these sections consistent with other parser definitions.tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1)
1-2: Good coverage for kmsg-netfilter-log parserThese TCP/UDP FIREWALL samples look realistic and should exercise the kmsg-netfilter-log parser well for the conversion tests.
tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1)
1-20: Crowbar test data set looks representativeThe variety of DEBUG/WARN/INFO lines (backups, SQL, HTTP, switch warnings) should give good coverage for the crowbar parser across different message shapes.
tests/internal/data/config_format/convert/README (1)
1-19: README clearly documents the conversion/test workflowThe steps to copy parser *.conf files, run fb-parser-conf2yaml.pl, and execute
run_tests.share concise and match the surrounding tooling and .gitignore setup.conf/parsers_extra.conf (1)
4-16: Rubular links for crowbar/chefclient updated safelyUpdating these reference comments to https has no effect on the parsers and keeps the documentation links modern.
tests/internal/data/config_format/convert/parsers.envoy.test (1)
1-1: LGTM!The Envoy access log test entry is well-formed and contains all the expected fields for testing the parser.
tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)
1-1: LGTM!The RabbitMQ log test entry is well-formed with proper timestamp, log level, process ID, and message structure.
tests/internal/data/config_format/convert/parser_custom.neo4j.test (1)
1-1: LGTM!The Neo4j log test entry is well-formed with proper timestamp, log level, and message structure. Note: there's a trailing space at the end of the line, which may be intentional for testing whitespace handling.
tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1)
1-1: LGTM!The Kubernetes NGINX Ingress log test entry is comprehensive and well-formed, containing all expected fields including client IP, timestamp, request details, response metrics, and upstream information.
tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1)
1-11: LGTM!The Couchbase Analytics Java multiline test data is well-structured for testing multiline parsing behavior. It includes:
- Proper log entry boundaries (timestamps + log levels)
- Complete exception stack trace with correct indentation
- Multiple log levels (INFO, DEBU)
- Version information and source references
This comprehensively tests the parser's ability to recognize log entry boundaries and aggregate multiline stack traces.
tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1)
1-1: LGTM!The Istio Envoy proxy log test entry is comprehensive and well-formed, including all expected fields such as timestamp, HTTP request details, timing metrics, and Istio-specific routing context (inbound/outbound cluster information).
tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1)
6-17: Verify and clarify intent of IP address and path variations in test data.The inconsistencies flagged in the review are confirmed to exist in the test file. The file contains example Chef client log output used as a test fixture, but the codebase contains no documentation explaining why five different IP addresses and paths appear across lines describing what should be a single mount operation:
- Line 6:
192.168.131.243:/nova_nfs__vol_1/nova_1- Line 8:
192.1.1.243:/nova_nfs_rot_2vol_1/nova_1- Line 13:
192.18.11.243:/nova_nfs_vol_1/nova_1- Line 16:
192.168.1.243:/nova_nfs_vol_1/nova_1- Line 17:
192.16.1.243:/nova_nfs_rot_2_1_vsos211_vol_1/nova_1Clarify whether these variations are intentional (testing parser robustness, realistic obfuscation patterns) or the result of incomplete redaction/copy-paste errors during test fixture creation.
conf/parsers_multiline.yaml (1)
1-12: Verify Fluent Bit regex dialect supports the state machine configuration.The multiline parser uses forward-slash-delimited regex patterns (
/(Dec \d+ \d+\:\d+\:\d+)(.*)/) and a state machine with transitions. Confirm that Fluent Bit's regex parser supports this syntax and the state machine logic works as documented.conf/parsers_java.yaml (1)
1-6: Verify PCRE-style named capture group support in Fluent Bit regex parser.The Java parser uses PCRE-style named capture groups (
(?<name>...)). Ensure Fluent Bit's regex implementation (Oniguruma) supports this syntax. Additionally, confirm that the time_format'%Y-%m-%d %H:%M:%S'correctly aligns with the capturedtimegroup pattern (\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}).conf/parser_custom.yaml (1)
1-16: Verify regex patterns work correctly with corresponding test data.The three custom parsers include references to regex testing at rubular.com. Ensure each pattern is tested end-to-end against the corresponding parser test files (e.g.,
parser_custom.rabbitmq.test,parser_custom.neo4j.test,parser_custom.external-dns.test) using the new test harness to confirm expected behavior.Additionally, clarify the external-dns pattern structure at line 15: the leading
([^ ])\"appears to capture a non-space character followed by a quote. Verify this is the intended behavior for the target log format.
Signed-off-by: Hank Leininger <[email protected]>
Signed-off-by: Hank Leininger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
♻️ Duplicate comments (5)
conf/parsers_mult.yaml (2)
8-10: Critical: Literal slash delimiters break the regex pattern.The regex pattern is wrapped in literal forward slashes (
/...$/), which will cause Fluent Bit to treat the slashes as part of the pattern. This was already identified in a previous review.Apply this diff:
- regex: /Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)$/ + regex: 'Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)$'
16-18: Critical: Literal slash delimiters break the regex pattern.Same issue as mult_1 - the regex is wrapped in literal forward slashes.
Apply this diff:
- regex: / Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)/ + regex: ' Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)'conf/parsers.yaml (1)
75-80: Critical: Literal slash delimiters break the syslog-rfc3164 parser.The regex is wrapped in literal forward slashes (
'/^...$/'), making Fluent Bit look for lines that start and end with a slash character. This was already identified in a previous review.Apply this diff:
- regex: '/^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/' + regex: '^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$'tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)
182-232: Address the regex delimiter normalization issue flagged in previous review.A previous review identified that the converter copies
Regexvalues verbatim, including optional/pattern/delimiters. When the original .conf uses this style, Fluent Bit will treat the slashes as literal characters, breaking the parser.As suggested in the previous review, normalize regex values before storing them:
my ($key, $value) = ($1, $2); $blank_before_comments = 0; # Convert key to lowercase with underscores my $yaml_key = lc($key); + # Normalize regex delimiters + if ($yaml_key eq 'regex' && $value =~ m{\A/(.*)/\z} && $value !~ m{\A//}) { + $value = $1; # drop enclosing delimiters + } + # Special handling for multiline parser 'rule' fieldsAfter applying this fix, regenerate all YAML files to restore correct parser behavior.
tests/internal/data/config_format/convert/run_tests.sh (1)
37-46: Address the multi-line string formatting issue flagged in previous review.The previous review correctly identified that the
diefunction receives a multi-line string literal that may not preserve newlines reliably across different POSIX shells. Use a here-document as suggested:- die "Usage: $0 [-v] [file1.test [file2.test ...]] - - Process parser*.*.test files, either each one named on the command line, - or every file in pwd matching that pattern. - - Makes sure the output produced by the .conf file and .yaml configs match. - - Needs a corresponding .conf and .yaml for each. -" + die <<EOF +Usage: $0 [-v] [file1.test [file2.test ...]] + +Process parser*.*.test files, either each one named on the command line, +or every file in pwd matching that pattern. + +Makes sure the output produced by the .conf file and .yaml configs match. + +Needs a corresponding .conf and .yaml for each. +EOF
🧹 Nitpick comments (2)
conf/parsers_openstack.yaml (1)
9-49: Consider whether these six parsers could share a common definition.All six OpenStack parsers (cinder, glance, heat, keystone, neutron, nova) use identical regex patterns and time handling configuration. While explicit definitions provide clarity, this duplication could make maintenance harder if the pattern needs to change.
However, this may be intentional to allow independent evolution of each service's parser.
tests/internal/data/config_format/convert/run_tests.sh (1)
69-73: Consider adding error message context to fluent-bit check.The self-test at line 71 validates basic fluent-bit functionality, but if it fails, the generic error message doesn't indicate whether fluent-bit is found but malfunctioning, or if there's a PATH issue. Consider adding diagnostic output:
command -v fluent-bit >/dev/null 2>&1 || die "fluent-bit not found in PATH" -if ! echo '{"a":"b"}' | fluent-bit -q -i stdin -o stdout -p format=json_lines | grep -E -q '{"date":[0-9]+\.[0-9]+,"a":"b"}' ; then - die "fluent-bit self-test failed, cannot continue." +if ! echo '{"a":"b"}' | fluent-bit -q -i stdin -o stdout -p format=json_lines 2>&1 | grep -E -q '{"date":[0-9]+\.[0-9]+,"a":"b"}' ; then + die "fluent-bit self-test failed (command succeeded but output unexpected), cannot continue." fi
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (32)
conf/parser_custom.yaml(1 hunks)conf/parsers.conf(2 hunks)conf/parsers.yaml(1 hunks)conf/parsers_ambassador.yaml(1 hunks)conf/parsers_cinder.conf(1 hunks)conf/parsers_cinder.yaml(1 hunks)conf/parsers_extra.conf(2 hunks)conf/parsers_extra.yaml(1 hunks)conf/parsers_java.yaml(1 hunks)conf/parsers_kafka.yaml(1 hunks)conf/parsers_mult.yaml(1 hunks)conf/parsers_multiline.yaml(1 hunks)conf/parsers_openstack.yaml(1 hunks)tests/internal/data/config_format/convert/.gitignore(1 hunks)tests/internal/data/config_format/convert/README(1 hunks)tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl(1 hunks)tests/internal/data/config_format/convert/parser_custom.external-dns.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.neo4j.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.rabbitmq.test(1 hunks)tests/internal/data/config_format/convert/parsers.cri.test(1 hunks)tests/internal/data/config_format/convert/parsers.envoy.test(1 hunks)tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test(1 hunks)tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test(1 hunks)tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test(1 hunks)tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test(1 hunks)tests/internal/data/config_format/convert/parsers_cinder.ceph.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.chefclient.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.crowbar.test(1 hunks)tests/internal/data/config_format/convert/run_tests.sh(1 hunks)
✅ Files skipped from review due to trivial changes (3)
- tests/internal/data/config_format/convert/parsers_extra.crowbar.test
- tests/internal/data/config_format/convert/README
- conf/parsers_extra.conf
🚧 Files skipped from review as they are similar to previous changes (17)
- conf/parsers.conf
- tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
- tests/internal/data/config_format/convert/parser_custom.external-dns.test
- conf/parsers_kafka.yaml
- tests/internal/data/config_format/convert/parsers.envoy.test
- tests/internal/data/config_format/convert/.gitignore
- tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test
- tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test
- tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test
- conf/parsers_cinder.conf
- conf/parsers_cinder.yaml
- conf/parsers_multiline.yaml
- tests/internal/data/config_format/convert/parsers.cri.test
- tests/internal/data/config_format/convert/parsers_cinder.ceph.test
- tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
- tests/internal/data/config_format/convert/parsers_extra.chefclient.test
- tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test
🔇 Additional comments (10)
tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test (1)
1-1: LGTM! Well-formed RFC3164 syslog test data.The test data correctly represents a complete RFC3164 syslog message with PRI (<34>), timestamp, hostname (mymachine), program (su), and message content.
conf/parser_custom.yaml (3)
2-5: LGTM! Clean regex pattern for RabbitMQ logs.The regex correctly extracts date, time, log_level, PID, and message fields using named capture groups.
7-10: LGTM! Clean regex pattern for Neo4j logs.The regex correctly extracts date, time, log_level, and message fields.
12-15: The YAML regex matches the original.conffile exactly, so no conversion error exists.The leading unnamed capture group
([^ ])is present in the original.conffile, not introduced by this YAML version. Since the conversion is faithful to the source, verify with domain knowledge whether the original regex pattern's unnamed capture group is intentional for external-dns log parsing.conf/parsers_ambassador.yaml (1)
5-7: LGTM! Comprehensive regex for Ambassador/Envoy logs.The regex correctly extracts all relevant fields from Ambassador access logs including request details, response metrics, and proxy information.
conf/parsers_extra.yaml (1)
1-159: LGTM! Comprehensive parser definitions.This file provides a wide variety of parsers (crowbar, chefclient, mysql, pacemaker, rabbitmq, couchbase variants, etc.) with properly formatted regex patterns and appropriate time handling. No literal slash delimiters detected.
tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)
60-67: Prevent out-of-bounds shift in argument parsing.Line 66 calls
shiftunconditionally within the loop, but if$argis the last argument and matches a valid pattern,shiftwill fail silently or cause unexpected behavior.Move the
shiftcall outside thecaselogic or validate that arguments remain:for arg in "$@"; do case "$arg" in -v) filter() { squash_date; }; ;; -*) usage ;; *) TEST_FILES+=("$arg") ;; esac - shift doneWait, this is shell script code, not Perl. Let me reconsider - this comment should be for run_tests.sh, not this Perl file. Let me skip this for now and address it in the correct file.
tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)
1-1: LGTM!Test data file contains a representative RabbitMQ log entry for parser validation.
tests/internal/data/config_format/convert/parser_custom.neo4j.test (1)
1-1: LGTM!Test data file contains a representative Neo4j log entry for parser validation.
tests/internal/data/config_format/convert/run_tests.sh (1)
88-94: LGTM!The core test logic correctly:
- Executes fluent-bit with both .conf and .yaml parser configurations
- Applies consistent filtering/normalization to both outputs
- Validates outputs are non-empty before comparison
The approach of using checksums (or verbose output with
-v) provides a reliable equivalence check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (2)
tests/internal/data/config_format/convert/run_tests.sh (2)
48-56: Simplify redundant echo statement.Line 55's
echo -e "\n"is redundant sinceechoalready appends a newline by default, anddiewill output its message. This just adds visual noise.Consider simplifying:
[[ "$check" = "" ]] || \ [[ "$check" = "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b" ]] || \ [[ "$check" = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" ]] && \ - echo -e "\n" && die "### FAIL '$TEST_FILE' output was empty" + die "### FAIL '$TEST_FILE' output was empty"
77-100: Add pipeline failure detection for reliable error handling.The script runs complex pipelines (lines 87, 91) but doesn't use
set -o pipefail, which means intermediate command failures might go undetected. Whilecheck_outputdoes validate that output isn't empty, explicit pipeline failure detection would make error handling more robust.Add at the top of the script (after the shebang):
#!/bin/bash +set -euo pipefail # Test a batch of parsers that have been converted from .conf to .yaml,This ensures:
set -e: Exit on any command failureset -u: Exit on undefined variable referenceset -o pipefail: Pipelines fail if any command fails, not just the last one
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (29)
conf/parser_custom.yaml(1 hunks)conf/parsers.yaml(1 hunks)conf/parsers_ambassador.yaml(1 hunks)conf/parsers_cinder.yaml(1 hunks)conf/parsers_extra.yaml(1 hunks)conf/parsers_java.yaml(1 hunks)conf/parsers_kafka.yaml(1 hunks)conf/parsers_mult.yaml(1 hunks)conf/parsers_multiline.yaml(1 hunks)conf/parsers_openstack.yaml(1 hunks)tests/internal/data/config_format/convert/.gitignore(1 hunks)tests/internal/data/config_format/convert/README(1 hunks)tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl(1 hunks)tests/internal/data/config_format/convert/parser_custom.external-dns.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.neo4j.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.rabbitmq.test(1 hunks)tests/internal/data/config_format/convert/parsers.cri.test(1 hunks)tests/internal/data/config_format/convert/parsers.envoy.test(1 hunks)tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test(1 hunks)tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test(1 hunks)tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test(1 hunks)tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test(1 hunks)tests/internal/data/config_format/convert/parsers_cinder.ceph.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.chefclient.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.crowbar.test(1 hunks)tests/internal/data/config_format/convert/run_tests.sh(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- conf/parsers_multiline.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
- tests/internal/data/config_format/convert/parsers_extra.chefclient.test
- conf/parsers_openstack.yaml
- conf/parsers_kafka.yaml
- conf/parser_custom.yaml
- conf/parsers_extra.yaml
- tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
- tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test
- tests/internal/data/config_format/convert/parsers.cri.test
- tests/internal/data/config_format/convert/README
- conf/parsers_mult.yaml
- tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test
- conf/parsers_java.yaml
- tests/internal/data/config_format/convert/parsers_extra.crowbar.test
- tests/internal/data/config_format/convert/parsers.envoy.test
- tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
- tests/internal/data/config_format/convert/parser_custom.neo4j.test
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-16T22:16:26.015Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.015Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
Applied to files:
conf/parsers.yamltests/internal/data/config_format/convert/fb-parser-conf2yaml.plconf/parsers_cinder.yaml
📚 Learning: 2025-09-22T15:59:55.794Z
Learnt from: nicknezis
Repo: fluent/fluent-bit PR: 10882
File: plugins/out_http/http.c:112-116
Timestamp: 2025-09-22T15:59:55.794Z
Learning: When users consider bug fixes out of scope for their focused PRs, it's appropriate to create separate GitHub issues to track those concerns rather than expanding the current PR scope.
Applied to files:
conf/parsers_cinder.yaml
🔇 Additional comments (17)
tests/internal/data/config_format/convert/.gitignore (1)
1-2: Verify that test fixtures are tracked in git if needed by the test harness.This
.gitignorepattern ignores all.confand.yamlfiles in the test directory. Given that the PR adds a test harness for the converter, clarify whether these patterns are intended to exclude:
- Generated test output files (temporary/derived) — in which case ignoring is appropriate
- Test fixture files (reference data needed by tests) — in which case ignoring may prevent test execution
Please confirm:
- Are
.confand.yamlfiles in this directory test fixtures that need to be committed to git?- Or are they generated output files that should be excluded?
- Are test fixtures tracked elsewhere, or will they be generated at test time?
If test fixtures should be committed, consider scoping this
.gitignoreto exclude only generated files, e.g.:*.conf.generated *.yaml.generatedOr use a separate directory structure for generated vs. fixture files.
tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1)
1-3: LGTM!The test fixture appropriately contains mixed Couchbase log formats (slash-separated and ISO 8601 date formats) with realistic log content. This test data effectively validates that the parser can handle different timestamp formats from Couchbase logs.
conf/parsers_cinder.yaml (1)
1-8: LGTM! Ceph parser converted correctly.The YAML format correctly preserves the existing ceph parser configuration from the .conf version, and test results confirm functional parity.
tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1)
1-10: Test data looks good.The Ceph log samples provide good coverage with both Hangup signals and connection state messages, sufficient for validating the parser conversion.
conf/parsers.yaml (1)
1-122: Excellent parser registry conversion.The YAML format correctly preserves all existing parser configurations, and test results confirm functional parity between .conf and .yaml formats across all parsers. The comprehensive registry supports diverse log formats with consistent time extraction patterns.
conf/parsers_ambassador.yaml (1)
1-7: Ambassador parser looks good.The configuration correctly defines an Envoy-based parser for Ambassador logs with comprehensive field extraction. The regex pattern aligns with the Envoy log format used by Ambassador.
tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)
1-489: Well-structured converter script.The Perl converter is cleanly organized with clear separation between parsing, formatting, and output generation. It handles the conversion requirements effectively:
- Preserves comments (leading, inline, trailing) with proper positioning
- Transforms multiline parser rules into structured YAML
- Handles UTF-8 encoding properly
- Provides appropriate error handling with warnings
The test results confirm the converter produces functionally equivalent YAML configurations.
tests/internal/data/config_format/convert/parser_custom.external-dns.test (1)
1-1: Test data validates the external-dns parser.The sample log line provides appropriate coverage for testing the external-dns parser configuration.
tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1)
1-2: Netfilter test data provides good protocol coverage.The test samples include both TCP and UDP variants, sufficient for validating the kmsg-netfilter-log parser's comprehensive field extraction.
tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)
1-1: RabbitMQ test data validates the parser.The log sample appropriately tests the rabbitmq parser's field extraction capabilities.
tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1)
1-1: LGTM! Test data is appropriate.The log line contains all expected fields for a k8s nginx ingress access log, including client IP, timestamp, HTTP request details, response metrics, and upstream information. This provides comprehensive test coverage for the parser.
tests/internal/data/config_format/convert/run_tests.sh (6)
1-16: LGTM! Utility functions are well-structured.The shebang,
warn, anddiefunctions follow bash best practices for error reporting.
18-27: LGTM! Date normalization is correctly implemented.The regex pattern properly matches JSON date fields, and the comment clearly explains why this normalization is necessary. The
sed -Eusage is appropriate for this pattern.
29-33: LGTM! Filter function provides appropriate output transformation.The pipeline correctly normalizes dates and generates a checksum for comparison. The function override mechanism for verbose mode is a clean approach.
60-66: LGTM! Argument parsing is correct.The loop properly handles verbose mode, validates flags, and collects test file arguments. The previous issue with unnecessary
shifthas been addressed.
68-72: LGTM! Fluent-bit validation provides good early failure detection.The PATH check and self-test ensure the script fails fast if fluent-bit is unavailable or malfunctioning, which improves debuggability.
74-75: LGTM! Default file selection provides good usability.Defaulting to all matching test files when none are specified is intuitive and follows common Unix tool patterns.
Signed-off-by: Hank Leininger <[email protected]>
These have been run through the test suite to make sure fluent-bit produces identical output using them. Signed-off-by: Hank Leininger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
conf/parsers.yaml (1)
75-80: Consider removing regex delimiters for consistency.While this regex pattern works correctly (Fluent Bit strips the leading and trailing
/characters automatically), it's inconsistent with other parsers in this file. For example, the similarsyslog-rfc3164-localparser at line 70 and all other regex parsers in this file don't include the/delimiters.Consider applying this change for consistency:
- name: syslog-rfc3164 format: regex - regex: '/^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/' + regex: '^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$' time_key: time time_format: '%b %d %H:%M:%S' time_keep: OnBased on learnings.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (29)
conf/parser_custom.yaml(1 hunks)conf/parsers.yaml(1 hunks)conf/parsers_ambassador.yaml(1 hunks)conf/parsers_cinder.yaml(1 hunks)conf/parsers_extra.yaml(1 hunks)conf/parsers_java.yaml(1 hunks)conf/parsers_kafka.yaml(1 hunks)conf/parsers_mult.yaml(1 hunks)conf/parsers_multiline.yaml(1 hunks)conf/parsers_openstack.yaml(1 hunks)tests/internal/data/config_format/convert/.gitignore(1 hunks)tests/internal/data/config_format/convert/README(1 hunks)tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl(1 hunks)tests/internal/data/config_format/convert/parser_custom.external-dns.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.neo4j.test(1 hunks)tests/internal/data/config_format/convert/parser_custom.rabbitmq.test(1 hunks)tests/internal/data/config_format/convert/parsers.cri.test(1 hunks)tests/internal/data/config_format/convert/parsers.envoy.test(1 hunks)tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test(1 hunks)tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test(1 hunks)tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test(1 hunks)tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test(1 hunks)tests/internal/data/config_format/convert/parsers_cinder.ceph.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.chefclient.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test(1 hunks)tests/internal/data/config_format/convert/parsers_extra.crowbar.test(1 hunks)tests/internal/data/config_format/convert/run_tests.sh(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- tests/internal/data/config_format/convert/parser_custom.neo4j.test
- tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test
🚧 Files skipped from review as they are similar to previous changes (17)
- tests/internal/data/config_format/convert/parser_custom.external-dns.test
- tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
- tests/internal/data/config_format/convert/parsers.cri.test
- tests/internal/data/config_format/convert/parsers_extra.chefclient.test
- tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test
- conf/parsers_cinder.yaml
- conf/parsers_openstack.yaml
- conf/parsers_java.yaml
- conf/parsers_ambassador.yaml
- conf/parser_custom.yaml
- tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test
- tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test
- conf/parsers_kafka.yaml
- conf/parsers_mult.yaml
- tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
- tests/internal/data/config_format/convert/run_tests.sh
- tests/internal/data/config_format/convert/parsers.envoy.test
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-16T22:16:26.015Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.015Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
Applied to files:
conf/parsers.yamlconf/parsers_extra.yamltests/internal/data/config_format/convert/fb-parser-conf2yaml.pl
🔇 Additional comments (15)
tests/internal/data/config_format/convert/.gitignore (1)
1-2: LGTM!The ignore patterns are appropriate for a conversion test directory where
.confand.yamlfiles are generated or copied during testing.tests/internal/data/config_format/convert/README (1)
1-20: LGTM!The documentation clearly explains the conversion workflow with practical usage examples.
tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (7)
1-28: LGTM!The header properly enables strict mode, warnings, and UTF-8 support. The documented limitations are reasonable given the scope of conversion needed, and the keyword pattern construction with
quotemetais a good security practice.
48-60: LGTM!The keyword transformation logic correctly identifies matching patterns and delegates value formatting to the appropriate function.
62-252: LGTM!The parsing logic correctly handles sections, comments, blank lines, and key-value pairs. The state management for comment positioning is well thought out, and appropriate warnings are issued for unsupported sections or parsing failures.
254-271: LGTM!The YAML value formatting correctly identifies values requiring quoting and properly escapes single quotes using YAML's
''escape sequence.
273-416: LGTM!The output functions properly format YAML with correct indentation, list markers, and blank line spacing. The special handling for rules arrays and comment positioning is well-implemented.
418-438: LGTM!The conversion function correctly orchestrates the output of all components in the proper order and returns well-formed YAML.
440-492: LGTM!The main execution has robust file handling with proper existence checks, UTF-8 encoding, and helpful error messages. The safety check for existing non-empty output files (line 463) prevents accidental overwrites.
tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)
1-1: LGTM! Test data looks appropriate.The RabbitMQ log entry is well-formed and provides appropriate test coverage for the RabbitMQ parser. The PR objectives confirm that tests pass with matching checksums between .conf and .yaml formats.
tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1)
1-10: LGTM! Comprehensive Ceph test data.The test data covers multiple Ceph log message types (Hangup signals and connection state messages), providing good parser validation coverage. The PR objectives confirm that tests pass with matching checksums.
conf/parsers_multiline.yaml (1)
1-12: LGTM! Multiline parser configuration is correct.The multiline parser for exception handling is properly configured. The regex patterns include
/delimiters, which work correctly because Fluent Bit's regex engine automatically strips leading and trailing/characters.Based on learnings.
tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1)
1-20: LGTM! Comprehensive Crowbar test coverage.The test data provides excellent coverage of various Crowbar log formats including DEBUG, WARN, and INFO levels, command errors, and HTTP request/response logs. The PR objectives confirm that tests pass with matching checksums.
tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1)
1-1: LGTM! Appropriate Istio Envoy test data.The test data provides comprehensive coverage of Istio Envoy proxy log fields including request details, timing, upstream information, and routing context. The PR objectives confirm that tests pass with matching checksums.
conf/parsers_extra.yaml (1)
1-159: LGTM! Comprehensive and consistent parser definitions.This file provides an extensive collection of well-structured parser definitions for various log sources. All regex patterns are consistently formatted without leading/trailing
/delimiters, and time handling configurations are appropriate for each parser type. The Couchbase parser collection is particularly thorough, covering multiple log formats and edge cases.
My intention is to submit regex-cleanup commits separately from these changes, once these commits merge. |
|
Note, a branch full of fixes is waiting behind this PR; once it lands I'll retarget hlein#2 |
| @@ -0,0 +1,492 @@ | |||
| #!/usr/bin/env perl | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just my opinion but it would be nice to be written down in Python instead of Perl...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just my opinion but it would be nice to be written down in Python instead of Perl...
I'm sure it's not just your opinion, probably shared by others :)
But I'm about 10x more proficient in perl than python. This tool probably won't be needed longterm, once the current parser files are translated (but I added it to show my work).
I'm sure it could be translated/rewritten into python mostly automatically, but if so I wouldn't be qualified to debug it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
CMakeLists.txt(1 hunks)conf/fluent-bit-macos.conf(1 hunks)conf/fluent-bit-win32.conf(1 hunks)conf/fluent-bit.conf(1 hunks)conf/kube.conf(1 hunks)conf/kube_elasticsearch.conf(1 hunks)conf/parser_custom.conf(0 hunks)conf/parsers_ambassador.conf(0 hunks)conf/parsers_cinder.conf(0 hunks)conf/parsers_extra.conf(0 hunks)conf/parsers_java.conf(0 hunks)conf/parsers_kafka.conf(0 hunks)conf/parsers_mult.conf(0 hunks)conf/parsers_openstack.conf(0 hunks)conf/rate_limit.conf(2 hunks)cpack/debian/conffiles(1 hunks)dockerfiles/Dockerfile(1 hunks)dockerfiles/Dockerfile.windows(1 hunks)documentation/examples/multiline/filter_multiline/fluent-bit.conf(1 hunks)documentation/examples/multiline/regex-001/fluent-bit.conf(1 hunks)examples/wasi_serde_json/README.md(2 hunks)src/CMakeLists.txt(1 hunks)tests/internal/data/stream_processor/gen_msgpack.sh(2 hunks)
💤 Files with no reviewable changes (8)
- conf/parsers_openstack.conf
- conf/parsers_mult.conf
- conf/parser_custom.conf
- conf/parsers_cinder.conf
- conf/parsers_java.conf
- conf/parsers_kafka.conf
- conf/parsers_ambassador.conf
- conf/parsers_extra.conf
✅ Files skipped from review due to trivial changes (2)
- CMakeLists.txt
- examples/wasi_serde_json/README.md
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
Applied to files:
src/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.
Applied to files:
src/CMakeLists.txt
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.
Applied to files:
dockerfiles/Dockerfiledockerfiles/Dockerfile.windowscpack/debian/conffiles
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
Applied to files:
conf/fluent-bit-win32.confconf/fluent-bit.confconf/fluent-bit-macos.confdocumentation/examples/multiline/regex-001/fluent-bit.confdocumentation/examples/multiline/filter_multiline/fluent-bit.confcpack/debian/conffiles
🔇 Additional comments (12)
conf/fluent-bit-macos.conf (1)
28-28: LGTM! Parser file reference updated correctly.The update from
parsers.conftoparsers.yamlaligns with the PR's migration strategy. Testing confirms YAML parsers behave identically to the original .conf versions.conf/rate_limit.conf (1)
5-5: LGTM! Configuration updated correctly.The parser file reference has been updated to
parsers.yaml, consistent with the migration across all configuration files in this PR.Also applies to: 25-25
conf/fluent-bit-win32.conf (1)
28-28: LGTM! Windows configuration updated consistently.The parser file reference update matches the changes across all platform-specific configurations (macOS, Linux, Windows).
conf/kube_elasticsearch.conf (1)
5-5: LGTM! Kubernetes configuration updated correctly.The parser file reference has been updated to
parsers.yaml, maintaining consistency with the project-wide migration.documentation/examples/multiline/regex-001/fluent-bit.conf (1)
4-4: LGTM! Documentation example updated correctly.The multiline parser reference has been updated from
.confto.yamlformat, ensuring documentation examples remain accurate and aligned with the migration.conf/fluent-bit.conf (1)
28-28: LGTM! Primary configuration updated correctly.The main Fluent Bit configuration file now references
parsers.yaml, completing the migration for the default configuration.conf/kube.conf (1)
5-5: LGTM! Kubernetes configuration updated correctly.The parser file reference update completes the consistent migration across all configuration files in this review set.
cpack/debian/conffiles (1)
1-1: Verification confirmed: parsers.yaml installation properly configured.The build system correctly installs
parsers.yamlto/etc/fluent-bit/parsers.yamlas a config file (markednoreplacefor package preservation). The conffiles change is correct and aligns with the CMake configuration.tests/internal/data/stream_processor/gen_msgpack.sh (1)
10-10: LGTM! Consistent update to YAML parser configuration.Both fluent-bit invocations have been updated to use
parsers.yamlinstead ofparsers.conf, maintaining consistency across the test generation script.Also applies to: 34-34
src/CMakeLists.txt (1)
660-670: LGTM! Well-designed backward compatibility.The conditional installation logic properly handles both YAML-enabled and non-YAML builds by installing
parsers.yamlwhenFLB_HAVE_LIBYAMLis enabled and falling back toparsers.confotherwise. This maintains backward compatibility while enabling the migration to YAML-based configurations.dockerfiles/Dockerfile.windows (1)
190-195: LGTM! Consistent migration to YAML parser configurations.All parser configuration files have been updated to use YAML variants. This is well-aligned with the Windows build configuration, which includes libyaml installation (line 153) and proper CMake configuration (line 174).
dockerfiles/Dockerfile (1)
103-108: LGTM! Consistent migration to YAML parser configurations.All parser configuration files have been updated to use YAML variants, consistent with the Windows Dockerfile changes. The Linux build properly includes YAML support via the
libyaml-devpackage (line 57).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (23)
CMakeLists.txt(1 hunks)conf/fluent-bit-macos.conf(1 hunks)conf/fluent-bit-win32.conf(1 hunks)conf/fluent-bit.conf(1 hunks)conf/kube.conf(1 hunks)conf/kube_elasticsearch.conf(1 hunks)conf/parser_custom.conf(0 hunks)conf/parsers_ambassador.conf(0 hunks)conf/parsers_cinder.conf(0 hunks)conf/parsers_extra.conf(0 hunks)conf/parsers_java.conf(0 hunks)conf/parsers_kafka.conf(0 hunks)conf/parsers_mult.conf(0 hunks)conf/parsers_openstack.conf(0 hunks)conf/rate_limit.conf(2 hunks)cpack/debian/conffiles(1 hunks)dockerfiles/Dockerfile(1 hunks)dockerfiles/Dockerfile.windows(1 hunks)documentation/examples/multiline/filter_multiline/fluent-bit.conf(1 hunks)documentation/examples/multiline/regex-001/fluent-bit.conf(1 hunks)examples/wasi_serde_json/README.md(2 hunks)src/CMakeLists.txt(1 hunks)tests/internal/data/stream_processor/gen_msgpack.sh(2 hunks)
💤 Files with no reviewable changes (8)
- conf/parsers_openstack.conf
- conf/parsers_mult.conf
- conf/parsers_ambassador.conf
- conf/parsers_kafka.conf
- conf/parser_custom.conf
- conf/parsers_cinder.conf
- conf/parsers_java.conf
- conf/parsers_extra.conf
✅ Files skipped from review due to trivial changes (1)
- conf/fluent-bit-win32.conf
🚧 Files skipped from review as they are similar to previous changes (10)
- tests/internal/data/stream_processor/gen_msgpack.sh
- conf/kube.conf
- documentation/examples/multiline/regex-001/fluent-bit.conf
- examples/wasi_serde_json/README.md
- documentation/examples/multiline/filter_multiline/fluent-bit.conf
- conf/fluent-bit.conf
- conf/kube_elasticsearch.conf
- dockerfiles/Dockerfile.windows
- conf/rate_limit.conf
- src/CMakeLists.txt
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.
Applied to files:
cpack/debian/conffilesdockerfiles/Dockerfile
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
Applied to files:
cpack/debian/conffilesconf/fluent-bit-macos.conf
🔇 Additional comments (2)
dockerfiles/Dockerfile (1)
102-110: Now I need to check what version of fluent-bit is being used and look at the complete fluent-bit.conf file to understand how parsers are loaded:Let me check the Dockerfile more thoroughly and the fluent-bit.conf file:Based on all my web searches, I now have enough context. When converting the existing conf to a YAML format, you'll change the file extension to be .YAML which means you must also modify the referencing parsers_file reference in the server section of the main configuration file. The critical issue is that fluent-bit.conf showed NO @include statements. Without checking the full fluent-bit.conf content to see if it has proper Parsers_File directives referencing the YAML files, I cannot definitively verify the configuration is correct. However, based on the evidence:
- ✓ YAML parser files exist in the build context
- ✓ Fluent-bit version is 4.1.2+ which supports YAML parsers
- ✗ fluent-bit.conf has NO @include statements (but could have parsers_file directive)
- Critical gap: No evidence that fluent-bit.conf has been updated to reference the new YAML parser files
This is a potential breaking change that needs verification.
Verify that fluent-bit.conf properly references the YAML parser files and that all parser configurations are correctly loaded in the container.
The Dockerfile change copied YAML parser files instead of .conf files, but without verifying that:
fluent-bit.confcontains updatedparsers_filedirectives pointing to the YAML files (not the old .conf files)- All parser references in the main configuration are correctly updated to use YAML format
The PR claims "kept in place" and "FLB_CONFIG_YAML remains optional," but the Dockerfile makes an exclusive switch from .conf to .yaml files without evidence of corresponding config updates. If
fluent-bit.confstill references.confparser files that are no longer copied, the container will fail to load parsers at runtime.conf/fluent-bit-macos.conf (1)
28-28: No issues found. Backward compatibility maintained.Both
conf/parsers.confandconf/parsers.yamlcoexist in the repository, ensuring users with existing installations can revert to the old format if needed by editing their configuration. The migration from.confto.yamlis consistent across all platform configurations (macOS, Windows, Linux, Kubernetes), and existing test data validates both formats work correctly. The change is a default configuration update, not a breaking removal.
|
what happens when a user upgrades from an old version a future one that contains this change ? We can merge this for v5.0 (released on March), so we have time to test |
I think if they have their own There was some feedback (don't remember if it was from a human or a bot) about making sure packages considered the old Or do you mean, users may want to have a conversion tool they can use to convert their own local/custom parser |
|
I think Eduardo means that folks will be using the current provided conf files so removing them will break their configuration, this is particularly true of parser definitions. I would say it's too risky to merge new definitions and remove the old ones all in one go. We should merge the new ones to allow them to be tested and adopted. We can deprecate the old ones and eventually remove them but dropping them straight away will break things which is not acceptable or expected by users. Splitting the removal into a separate PR also makes it easier to revert or backport each set of changes. |
c69c95c to
b5b520e
Compare
|
Fair enough, OK I've dropped the last commit which did the removals, and modified the second-to-last to just add commented-out references to The exception being examples/documentation - those are now changed to demonstrate/include |
Signed-off-by: Hank Leininger <[email protected]>
Install parser*.yaml files alongside .conf and add commented-out references to them in other configs. Signed-off-by: Hank Leininger <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
conf/kube_elasticsearch.conf (1)
6-6: LGTM! Clean documentation of the YAML alternative.The commented reference to
parsers.yamleffectively documents the new YAML parser format option while maintaining backward compatibility with the existingparsers.confconfiguration.Optional enhancement: Consider adding a brief inline comment explaining the YAML option, such as:
Parsers_File parsers.conf - #Parsers_File parsers.yaml + # Alternative: Use parsers.yaml for YAML-format parser definitions + #Parsers_File parsers.yamlThis could help users understand the purpose of the alternative, though the current approach is clear enough for most users.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
CMakeLists.txt(1 hunks)conf/fluent-bit-macos.conf(1 hunks)conf/fluent-bit-win32.conf(1 hunks)conf/fluent-bit.conf(1 hunks)conf/kube.conf(1 hunks)conf/kube_elasticsearch.conf(1 hunks)conf/rate_limit.conf(2 hunks)cpack/debian/conffiles(1 hunks)dockerfiles/Dockerfile(1 hunks)dockerfiles/Dockerfile.windows(1 hunks)documentation/examples/multiline/filter_multiline/fluent-bit.conf(1 hunks)documentation/examples/multiline/filter_multiline/parsers_multiline.yaml(2 hunks)documentation/examples/multiline/regex-001/fluent-bit.conf(1 hunks)documentation/examples/multiline/regex-001/parsers_multiline.yaml(2 hunks)examples/wasi_serde_json/README.md(2 hunks)src/CMakeLists.txt(1 hunks)
✅ Files skipped from review due to trivial changes (1)
- conf/kube.conf
🚧 Files skipped from review as they are similar to previous changes (9)
- src/CMakeLists.txt
- conf/fluent-bit-win32.conf
- documentation/examples/multiline/regex-001/parsers_multiline.yaml
- documentation/examples/multiline/filter_multiline/parsers_multiline.yaml
- examples/wasi_serde_json/README.md
- documentation/examples/multiline/filter_multiline/fluent-bit.conf
- conf/fluent-bit-macos.conf
- conf/rate_limit.conf
- dockerfiles/Dockerfile.windows
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
Applied to files:
documentation/examples/multiline/regex-001/fluent-bit.confconf/fluent-bit.confcpack/debian/conffiles
📚 Learning: 2025-11-21T06:23:29.750Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.750Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.
Applied to files:
conf/fluent-bit.conf
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.
Applied to files:
dockerfiles/Dockerfilecpack/debian/conffiles
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.
Applied to files:
CMakeLists.txtcpack/debian/conffiles
🔇 Additional comments (5)
conf/fluent-bit.conf (1)
29-29: LGTM! Helpful documentation for the YAML alternative.The commented line provides clear guidance for users who want to try the new YAML parser format while keeping the default
.confformat active. This approach aligns well with the PR's goal of introducing YAML support without breaking existing configurations.CMakeLists.txt (1)
1526-1526: LGTM! RPM configuration file handling is correct.The addition of
parsers.yamlto the RPM filelist with%config(noreplace)is appropriate. Bothparsers.conf(line 1525) andparsers.yamlare now properly marked as non-replaceable configuration files, ensuring user customizations are preserved during package upgrades.cpack/debian/conffiles (1)
2-2: LGTM! Debian configuration file handling is correct.The addition of
/etc/fluent-bit/parsers.yamlto the conffiles list is appropriate. Bothparsers.conf(line 1) andparsers.yamlare now listed, ensuring dpkg preserves user modifications during package upgrades. This aligns with the backward-compatible migration strategy.dockerfiles/Dockerfile (1)
109-114: LGTM! YAML parser files properly added to container image.The additions mirror the existing .conf parser files (lines 103-108), providing YAML versions alongside the original configuration format. This maintains backward compatibility during the migration. The COPY command syntax is correct and the destination is appropriate.
documentation/examples/multiline/regex-001/fluent-bit.conf (1)
4-5: LGTM! Example updated to demonstrate YAML parser format.The change appropriately updates this example to reference the new YAML parser file while keeping the old
.confreference as a commented alternative. This aligns with the PR's goal of demonstrating YAML parsers in examples while retaining backwards compatibility.Since the PR objectives mention that
FLB_CONFIG_YAML remains optional, please verify that YAML parser files (.yaml) can be loaded viaparsers_fileregardless of theFLB_CONFIG_YAMLbuild flag, or clarify any requirements for users who want to use this example. Run the following to check if there are any conditional compilation guards around YAML parser file loading:
Rewrote all
conf/parser*.conffiles to.yaml.Added a utility to do that conversion, in case it is handy for others. Currently only handles parser configs, but could be extended to handle others if desired.
Made a small test harness, test cases for every parser that had reference log examples, and confirmed the converted version behaves the same.
Initially I wanted to go ahead and clean up the old
.conffiles to avoid divergence, but it's too soon, there's lots of documentation and other configs that still reference them. And maybe never, becauseFLB_CONFIG_YAMLis optional.Addresses #11161 - no change in actual behavior yet (on purpose) but this opens the door for modernization and easier maintenance.
Testing
Before we can approve your change; please submit the following in a comment:
Documentation
Backporting
Should work for any release with
YAMLparsers that behave the same as current, but probably does not need a backport.Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.
Summary by CodeRabbit
New Features
Configuration Updates
Testing
✏️ Tip: You can customize this high-level summary in your review settings.