Skip to content

Conversation

@hlein
Copy link

@hlein hlein commented Nov 16, 2025

Rewrote all conf/parser*.conf files to .yaml.

Added a utility to do that conversion, in case it is handy for others. Currently only handles parser configs, but could be extended to handle others if desired.

Made a small test harness, test cases for every parser that had reference log examples, and confirmed the converted version behaves the same.

Initially I wanted to go ahead and clean up the old .conf files to avoid divergence, but it's too soon, there's lots of documentation and other configs that still reference them. And maybe never, because FLB_CONFIG_YAML is optional.

Addresses #11161 - no change in actual behavior yet (on purpose) but this opens the door for modernization and easier maintenance.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change <- included in the changes
  • Debug log output from testing the change

Documentation

  • Documentation required for this feature <- included in the changes

Backporting

Should work for any release with YAML parsers that behave the same as current, but probably does not need a backport.


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • New Features

    • Added comprehensive YAML-based parser registry with new parsers (Kafka, Java, OpenStack, Ambassador, Ceph, RabbitMQ, Neo4j, External-DNS, multiline and many extras).
  • Configuration Updates

    • Examples, installers and defaults updated to reference/parity with YAML parser files while keeping legacy format supported.
  • Testing

    • Added conversion tool, test data and scripts to validate .conf → .yaml parser migrations.

✏️ Tip: You can customize this high-level summary in your review settings.

Previously, trying to use this file resulted in:

[2025/11/15 21:23:49.103609095] [error] [config] error in parser_custom.conf:1: undefined value - check config is in valid classic format

Signed-off-by: Hank Leininger <[email protected]>
@coderabbitai
Copy link

coderabbitai bot commented Nov 16, 2025

Walkthrough

This PR migrates many Fluent Bit parser definitions from INI .conf blocks to new YAML .yaml files, adds a Perl tool to convert .conf→.yaml with tests and fixtures, updates examples and build/install scripts to include YAML parsers, and removes corresponding parser blocks from several .conf files.

Changes

Cohort / File(s) Summary
New YAML parser files
conf/parsers.yaml, conf/parsers_extra.yaml, conf/parsers_ambassador.yaml, conf/parsers_cinder.yaml, conf/parsers_java.yaml, conf/parsers_kafka.yaml, conf/parsers_mult.yaml, conf/parsers_multiline.yaml, conf/parsers_openstack.yaml, conf/parser_custom.yaml
Add comprehensive YAML-format parser definitions (regex/json/logfmt) for many sources (apache/nginx/k8s/docker/syslog/mongodb/envoy/istio/cri/openstack/cinder/rabbitmq/neo4j/external-dns/etc.), with time_key/time_format/time_keep and structured/multiline rules.
Removed/modified .conf parser blocks
conf/parsers_extra.conf, conf/parser_custom.conf, conf/parsers_cinder.conf, conf/parsers.conf
Remove numerous INI [PARSER] and [MULTILINE_PARSER] blocks (crowbar, chefclient, mysql variants, pacemaker, rabbitmq, couchbase variants, iptables, universal, uuid, etc.) and a minor doc URL update (http→https).
Conversion tool & scripts
tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl, tests/internal/data/config_format/convert/run_tests.sh, tests/internal/data/config_format/convert/README, tests/internal/data/config_format/convert/.gitignore
Add Perl converter that parses .conf parser sections and emits YAML (preserving comments and multiline rules), plus a test harness, README, and .gitignore.
Test fixtures
tests/internal/data/config_format/convert/parser_*.test, tests/internal/data/config_format/convert/parsers_*.test, tests/internal/data/config_format/convert/parsers_cinder.ceph.test, ...
Add many test log samples and fixtures for parser validation (rabbitmq, neo4j, external-dns, envoy, istio, k8s-nginx-ingress, kmsg-netfilter, cri, cinder/ceph, various couchbase and extra parser tests).
Examples & docs updated
documentation/examples/multiline/*/fluent-bit.conf, documentation/examples/multiline/*/parsers_multiline.yaml, examples/wasi_serde_json/README.md
Update example configs to reference YAML parser files and convert inline parser examples to YAML-structured format.
Build/install & packaging
CMakeLists.txt, src/CMakeLists.txt, cpack/debian/conffiles, dockerfiles/Dockerfile, dockerfiles/Dockerfile.windows
Add parsers.yaml to install lists, RPM filelist, Debian conffiles, and Docker build COPY/Copy-Item steps so YAML parser files are packaged/copied.
Config templates updated
conf/fluent-bit.conf, conf/fluent-bit-macos.conf, conf/fluent-bit-win32.conf, conf/kube.conf, conf/kube_elasticsearch.conf, conf/rate_limit.conf
Add commented example lines referencing parsers.yaml alongside existing parsers.conf entries (no active behavior change).

Sequence Diagram(s)

sequenceDiagram
    participant Admin as Administrator
    participant Conf as parsers.conf
    participant Yaml as parsers.yaml
    participant Parser as Fluent Bit Parser Registry

    rect rgb(230,240,255)
    Note over Admin,Parser: Legacy flow
    Admin->>Conf: Provide .conf with [PARSER]/[MULTILINE_PARSER]
    Conf->>Parser: Register INI-style parsers
    end

    rect rgb(230,255,230)
    Note over Admin,Parser: New YAML flow
    Admin->>Yaml: Provide parsers.yaml (list / structured rules)
    Yaml->>Parser: Register YAML-style parsers
    end

    rect rgb(255,245,220)
    Note over Admin,Parser: Migration step
    Admin->>Admin: Run fb-parser-conf2yaml.pl
    Admin->>Conf: Parse .conf sections & comments
    Admin->>Yaml: Emit equivalent .yaml files
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • Areas needing extra attention:
    • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl — complex parsing, comment preservation, multiline parser rule transformation and regex correctness.
    • Ensure parity between removed .conf parser blocks and new .yaml definitions (field names, time formats, time_keep semantics).
    • Packaging/install changes (CMakeLists, cpack) — verify file lists and install paths.
    • Test harness run_tests.sh — correctness of output comparison, date normalization, and assumptions about fluent-bit availability.
    • Large number of added test fixtures — verify coverage vs. parser definitions.

Suggested reviewers

  • edsiper
  • patrick-stephens
  • fujimotos
  • niedbalski

Poem

🐰 I hopped through configs, tidy and keen,

From bracketed .conf to YAML serene.
Parsers reborn in a cleaner display,
Tests and a converter to guide the way.
A carrot for build scripts — hooray! 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 57.14% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'conf: Convert existing parser*.conf files to yaml, add converter, add tests' directly summarizes the main changeset, which involves converting parser configuration files from .conf to YAML format, adding a conversion utility, and including tests.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@hlein
Copy link
Author

hlein commented Nov 16, 2025

Updated test outputs:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh 
### 'parser_custom.external-dns.test'
CONF: 9c1c14bf43ebe2250c702967ba9268660c07ba1700256f26adc156c221a821e2
YAML: 9c1c14bf43ebe2250c702967ba9268660c07ba1700256f26adc156c221a821e2
### OK 'parser_custom.external-dns.test'

### 'parser_custom.neo4j.test'
CONF: 991c1e984b888ba6690e6bc2234216e32ae996119f50972160080645c2e5633a
YAML: 991c1e984b888ba6690e6bc2234216e32ae996119f50972160080645c2e5633a
### OK 'parser_custom.neo4j.test'

### 'parser_custom.rabbitmq.test'
CONF: 3c59afd684202eb8c30a32a45cd32996c2d9a7cff9191614c0e8e9c509336ce6
YAML: 3c59afd684202eb8c30a32a45cd32996c2d9a7cff9191614c0e8e9c509336ce6
### OK 'parser_custom.rabbitmq.test'

### 'parsers.cri.test'
CONF: 108cbff810bde6f6f85afaafb67ba4ac73bc24f2877ba2a3b71d29d833eac5f7
YAML: 108cbff810bde6f6f85afaafb67ba4ac73bc24f2877ba2a3b71d29d833eac5f7
### OK 'parsers.cri.test'

### 'parsers.envoy.test'
CONF: 3d61586f0a14fa2fbc90e1d30fa6b4fb85135fe6a53588b6cacae2f44b5fbc91
YAML: 3d61586f0a14fa2fbc90e1d30fa6b4fb85135fe6a53588b6cacae2f44b5fbc91
### OK 'parsers.envoy.test'

### 'parsers.istio-envoy-proxy.test'
CONF: 6714474977505410166a38c60ba32563feb02d7b8f333150fe724c35a944e90e
YAML: 6714474977505410166a38c60ba32563feb02d7b8f333150fe724c35a944e90e
### OK 'parsers.istio-envoy-proxy.test'

### 'parsers.k8s-nginx-ingress.test'
CONF: 242430740c5c796cd51daa780ab49a6a8e5ffbbc2044a327bd52e7b280c9726f
YAML: 242430740c5c796cd51daa780ab49a6a8e5ffbbc2044a327bd52e7b280c9726f
### OK 'parsers.k8s-nginx-ingress.test'

### 'parsers.kmsg-netfilter-log.test'
CONF: d140d21dd8329350ad958dc627ed5ec1be3c79ae32cc70501d9113fc29dc0c57
YAML: d140d21dd8329350ad958dc627ed5ec1be3c79ae32cc70501d9113fc29dc0c57
### OK 'parsers.kmsg-netfilter-log.test'

### 'parsers.syslog-rfc3164.test'
CONF: de8ce106ed6ef161bdc2e045ccc8d51c360ffef97699b6a81a00596d93535ca2
YAML: de8ce106ed6ef161bdc2e045ccc8d51c360ffef97699b6a81a00596d93535ca2
### OK 'parsers.syslog-rfc3164.test'

### 'parsers_cinder.ceph.test'
CONF: 52cfaca9f6284ebfb14ffd3caa3e37aa76c739d1175614fd51de2151dfced80f
YAML: 52cfaca9f6284ebfb14ffd3caa3e37aa76c739d1175614fd51de2151dfced80f
### OK 'parsers_cinder.ceph.test'

### 'parsers_extra.chefclient.test'
CONF: 206f0c9fad0a13b452411105fbacd3197a84b84b3d1aa2c70c2addb7e8133e15
YAML: 206f0c9fad0a13b452411105fbacd3197a84b84b3d1aa2c70c2addb7e8133e15
### OK 'parsers_extra.chefclient.test'

### 'parsers_extra.couchbase_java_multiline.test'
CONF: 1cde4c097f3100d11d3a8355850ea2f67bd00790bd5045b969cf54e722945237
YAML: 1cde4c097f3100d11d3a8355850ea2f67bd00790bd5045b969cf54e722945237
### OK 'parsers_extra.couchbase_java_multiline.test'

### 'parsers_extra.couchbase_simple_log_mixed.test'
CONF: 9be9c6b365ac4974c3203f74d0f757b4a87a56cadfb888b3ace889d9476ba0ea
YAML: 9be9c6b365ac4974c3203f74d0f757b4a87a56cadfb888b3ace889d9476ba0ea
### OK 'parsers_extra.couchbase_simple_log_mixed.test'

### 'parsers_extra.couchbase_simple_log_utc.test'
CONF: b6ef09bc7ad597b6ec8fcff723d076e0cb6bd3d2846968800484bf59e918dd7a
YAML: b6ef09bc7ad597b6ec8fcff723d076e0cb6bd3d2846968800484bf59e918dd7a
### OK 'parsers_extra.couchbase_simple_log_utc.test'

### 'parsers_extra.crowbar.test'
CONF: eaf1b453f93eac36ef21607380dab2141703ce692af55ca54a2937e36b5251f0
YAML: eaf1b453f93eac36ef21607380dab2141703ce692af55ca54a2937e36b5251f0
### OK 'parsers_extra.crowbar.test'

Verbose mode:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh -v
### 'parser_custom.external-dns.test'
CONF: {"date":1234567890.123456,"time":"2021-10-28T05:29:08Z","log_level":"info","msg":"All records are already up to date"}
YAML: {"date":1234567890.123456,"time":"2021-10-28T05:29:08Z","log_level":"info","msg":"All records are already up to date"}
### OK 'parser_custom.external-dns.test'

### 'parser_custom.neo4j.test'
CONF: {"date":"2021-10-01","time":"23:14:38.390+0000","log_level":"INFO","msg":" Bound database 'core' to raft with id '3fa5a63d-cbf3-4306-afa5-2aad821887be', found on disk. \n"}
YAML: {"date":"2021-10-01","time":"23:14:38.390+0000","log_level":"INFO","msg":" Bound database 'core' to raft with id '3fa5a63d-cbf3-4306-afa5-2aad821887be', found on disk. \n"}
### OK 'parser_custom.neo4j.test'

### 'parser_custom.rabbitmq.test'
CONF: {"date":"2021-11-15","time":"07:10:27.454","log_level":"info","PID":"<0.27489.3>","msg":"Closing all channels from connection '10.171.85.133:59510 -> 10.171.91.132:5672' because it has been closed\n"}
YAML: {"date":"2021-11-15","time":"07:10:27.454","log_level":"info","PID":"<0.27489.3>","msg":"Closing all channels from connection '10.171.85.133:59510 -> 10.171.91.132:5672' because it has been closed\n"}
### OK 'parser_custom.rabbitmq.test'

### 'parsers.cri.test'
CONF: {"date":1234567890.123456,"time":"2018-02-13T07:59:37.505201169-05:00","stream":"stdout","logtag":"F","message":"line 1 user=A id=foo"}
{"date":1234567890.123456,"time":"2018-02-13T08:00:17.666563371-05:00","stream":"stdout","logtag":"F","message":"line 2 user= stdout F id=bar"}
YAML: {"date":1234567890.123456,"time":"2018-02-13T07:59:37.505201169-05:00","stream":"stdout","logtag":"F","message":"line 1 user=A id=foo"}
{"date":1234567890.123456,"time":"2018-02-13T08:00:17.666563371-05:00","stream":"stdout","logtag":"F","message":"line 2 user= stdout F id=bar"}
### OK 'parsers.cri.test'

### 'parsers.envoy.test'
CONF: {"date":1234567890.123456,"start_time":"2016-04-15T20:17:00.310Z","method":"POST","path":"/api/v1/locations","protocol":"HTTP/2","code":"204","response_flags":"-","bytes_received":"154","bytes_sent":"0","duration":"226","x_envoy_upstream_service_time":"100","x_forwarded_for":"10.0.35.28","user_agent":"nsq2http","request_id":"cc21d9b0-cf5c-432b-8c7e-98aeb7988cd2","authority":"locations","upstream_host":"tcp://10.0.2.1:80"}
YAML: {"date":1234567890.123456,"start_time":"2016-04-15T20:17:00.310Z","method":"POST","path":"/api/v1/locations","protocol":"HTTP/2","code":"204","response_flags":"-","bytes_received":"154","bytes_sent":"0","duration":"226","x_envoy_upstream_service_time":"100","x_forwarded_for":"10.0.35.28","user_agent":"nsq2http","request_id":"cc21d9b0-cf5c-432b-8c7e-98aeb7988cd2","authority":"locations","upstream_host":"tcp://10.0.2.1:80"}
### OK 'parsers.envoy.test'

### 'parsers.istio-envoy-proxy.test'
CONF: {"date":1234567890.123456,"start_time":"2020-11-25T21:26:18.409Z","method":"GET","path":"/status/418","protocol":"HTTP/1.1","response_code":"418","response_flags":"-","response_code_details":"via_upstream","connection_termination_details":"-","upstream_transport_failure_reason":"-","bytes_received":"0","bytes_sent":"135","duration":"3","x_envoy_upstream_service_time":"1","x_forwarded_for":"-","user_agent":"curl/7.73.0-DEV","x_request_id":"84961386-6d84-929d-98bd-c5aee93b5c88","authority":"httpbin:8000","upstream_host":"127.0.0.1:80","upstream_cluster":"inbound|8000||","upstream_local_address":"127.0.0.1:41854","downstream_local_address":"10.44.1.27:80","downstream_remote_address":"10.44.1.23:37652","requested_server_name":"outbound_.8000_._.httpbin.foo.svc.cluster.local","route_name":"default\n"}
YAML: {"date":1234567890.123456,"start_time":"2020-11-25T21:26:18.409Z","method":"GET","path":"/status/418","protocol":"HTTP/1.1","response_code":"418","response_flags":"-","response_code_details":"via_upstream","connection_termination_details":"-","upstream_transport_failure_reason":"-","bytes_received":"0","bytes_sent":"135","duration":"3","x_envoy_upstream_service_time":"1","x_forwarded_for":"-","user_agent":"curl/7.73.0-DEV","x_request_id":"84961386-6d84-929d-98bd-c5aee93b5c88","authority":"httpbin:8000","upstream_host":"127.0.0.1:80","upstream_cluster":"inbound|8000||","upstream_local_address":"127.0.0.1:41854","downstream_local_address":"10.44.1.27:80","downstream_remote_address":"10.44.1.23:37652","requested_server_name":"outbound_.8000_._.httpbin.foo.svc.cluster.local","route_name":"default\n"}
### OK 'parsers.istio-envoy-proxy.test'

### 'parsers.k8s-nginx-ingress.test'
CONF: {"date":1234567890.123456,"host":"100.100.93.64","user":"redacted-user","method":"POST","path":"/api/open/v1/redact/redacted","code":"200","size":"34152","referer":"-","agent":"Go-http-client/1.1","request_length":"308","request_time":"7.234","proxy_upstream_name":"redacted-http","upstream_addr":"101.86.32.129:8080","upstream_response_length":"34152","upstream_response_time":"7.232","upstream_status":"200","reg_id":"342434ed44ad21c7622b3c4590ec280e\n"}
YAML: {"date":1234567890.123456,"host":"100.100.93.64","user":"redacted-user","method":"POST","path":"/api/open/v1/redact/redacted","code":"200","size":"34152","referer":"-","agent":"Go-http-client/1.1","request_length":"308","request_time":"7.234","proxy_upstream_name":"redacted-http","upstream_addr":"101.86.32.129:8080","upstream_response_length":"34152","upstream_response_time":"7.232","upstream_status":"200","reg_id":"342434ed44ad21c7622b3c4590ec280e\n"}
### OK 'parsers.k8s-nginx-ingress.test'

### 'parsers.kmsg-netfilter-log.test'
CONF: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macsrc":"94:2b:f3:a0:10:af","macdst":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macsrc":"94:2b:f3:a0:10:af","macdst":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
YAML: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macsrc":"94:2b:f3:a0:10:af","macdst":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macsrc":"94:2b:f3:a0:10:af","macdst":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
### OK 'parsers.kmsg-netfilter-log.test'

### 'parsers.syslog-rfc3164.test'
CONF: {"date":1234567890.123456,"pri":"34","time":"Oct 11 22:14:15","host":"mymachine","ident":"su","message":"'su root' failed for lonvick on /dev/pts/8"}
YAML: {"date":1234567890.123456,"pri":"34","time":"Oct 11 22:14:15","host":"mymachine","ident":"su","message":"'su root' failed for lonvick on /dev/pts/8"}
### OK 'parsers.syslog-rfc3164.test'

### 'parsers_cinder.ceph.test'
CONF: {"date":1234567890.123456,"message":"7fa664005700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f213804f700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7fa4cec7b700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f1b01352700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f1967203700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.52:6824/7318 conn(0x55aa03987000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.31:6802/6015 conn(0x55a9fe229800 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 19 vs existing csq=19 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.31:6802/6015 conn(0x55a9fe229800 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 20 vs existing csq=19 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b0f5b700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.48:6832/8761 conn(0x55aa02e39000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 33 vs existing csq=33 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b0f5b700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.48:6832/8761 conn(0x55aa02e39000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 34 vs existing csq=33 existing_state=STATE_STANDBY"}
YAML: {"date":1234567890.123456,"message":"7fa664005700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f213804f700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7fa4cec7b700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f1b01352700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7f1967203700 -1 received  signal: Hangup from  PID: 3101709 task name: killall -q -1 ceph-mon ceph-mgr ceph-mds ceph-osd ceph-fuse radosgw  UID: 0"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.52:6824/7318 conn(0x55aa03987000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 16 vs existing csq=15 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.31:6802/6015 conn(0x55a9fe229800 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 19 vs existing csq=19 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b075a700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.31:6802/6015 conn(0x55a9fe229800 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 20 vs existing csq=19 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b0f5b700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.48:6832/8761 conn(0x55aa02e39000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 33 vs existing csq=33 existing_state=STATE_STANDBY"}
{"date":1234567890.123456,"message":"7ff2b0f5b700  0 -- 192.168.132.30:6818/6907 >> 192.168.132.48:6832/8761 conn(0x55aa02e39000 :6818 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 34 vs existing csq=33 existing_state=STATE_STANDBY"}
### OK 'parsers_cinder.ceph.test'

### 'parsers_extra.chefclient.test'
CONF: {"date":1234567890.123456,"severity":"INFO","message":"Processing directory[/var/lib/nova/instances] action create (nfs-client::default line 147)"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] created directory /var/lib/nova/instances"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] owner changed to 0"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] group changed to 0"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] mode changed to 755"}
{"date":1234567890.123456,"severity":"INFO","message":"Processing mount[Configuring mount: 192.168.131.243:/nova_nfs__vol_1/nova_1 -> /var/lib/nova/instances] action mount (nfs-client::default line 168)"}
YAML: {"date":1234567890.123456,"severity":"INFO","message":"Processing directory[/var/lib/nova/instances] action create (nfs-client::default line 147)"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] created directory /var/lib/nova/instances"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] owner changed to 0"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] group changed to 0"}
{"date":1234567890.123456,"severity":"INFO","message":"directory[/var/lib/nova/instances] mode changed to 755"}
{"date":1234567890.123456,"severity":"INFO","message":"Processing mount[Configuring mount: 192.168.131.243:/nova_nfs__vol_1/nova_1 -> /var/lib/nova/instances] action mount (nfs-client::default line 168)"}
### OK 'parsers_extra.chefclient.test'

### 'parsers_extra.couchbase_java_multiline.test'
CONF: {"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.339+00:00","level":"INFO","class":"CBAS.bootstrap.AnalyticsNCApplication","thread":"main","message":"Starting Couchbase Analytics driver, version: {\"build.date\":\"Thu Aug 06 20:37:33 UTC 2020\",\"build.number\":\"7909\",\"build.version\":\"6.6.0-7909\"}"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.520+00:00","level":"DEBU","class":"CBAS.util.MXHelper","thread":"main","message":"ignoring exception calling RuntimeMXBean.getBootClassPath; returning null"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.595+00:00","level":"DEBU","class":"CBAS.work.WorkQueue","thread":"Worker:ClusterController","message":"Executing: RegisterNode"}
YAML: {"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.339+00:00","level":"INFO","class":"CBAS.bootstrap.AnalyticsNCApplication","thread":"main","message":"Starting Couchbase Analytics driver, version: {\"build.date\":\"Thu Aug 06 20:37:33 UTC 2020\",\"build.number\":\"7909\",\"build.version\":\"6.6.0-7909\"}"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.520+00:00","level":"DEBU","class":"CBAS.util.MXHelper","thread":"main","message":"ignoring exception calling RuntimeMXBean.getBootClassPath; returning null"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:25.595+00:00","level":"DEBU","class":"CBAS.work.WorkQueue","thread":"Worker:ClusterController","message":"Executing: RegisterNode"}
### OK 'parsers_extra.couchbase_java_multiline.test'

### 'parsers_extra.couchbase_simple_log_mixed.test'
CONF: {"date":1234567890.123456,"timestamp":"2021/03/09 17:32:15","level":"cbauth","message":" key: fts/remoteClients registered for tls config updates"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:15.303+00:00","level":"INFO","message":" main: /opt/couchbase/bin/cbft started (v0.6.0/5.5.0)"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:15.304+00:00","level":"INFO","message":" main: file descriptor limit current: 1048576 max: 1048576"}
YAML: {"date":1234567890.123456,"timestamp":"2021/03/09 17:32:15","level":"cbauth","message":" key: fts/remoteClients registered for tls config updates"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:15.303+00:00","level":"INFO","message":" main: /opt/couchbase/bin/cbft started (v0.6.0/5.5.0)"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:15.304+00:00","level":"INFO","message":" main: file descriptor limit current: 1048576 max: 1048576"}
### OK 'parsers_extra.couchbase_simple_log_mixed.test'

### 'parsers_extra.couchbase_simple_log_utc.test'
CONF: {"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.TpThrottler:  update routine starting"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.GlobalSettingChangeListener: Started MetakvChangeListener GlobalSettingChangeListener"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.ResourceMgr: collectCpuUsage starting ...."}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.ResourceMgr: manageResources starting ...."}
YAML: {"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.TpThrottler:  update routine starting"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.GlobalSettingChangeListener: Started MetakvChangeListener GlobalSettingChangeListener"}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.ResourceMgr: collectCpuUsage starting ...."}
{"date":1234567890.123456,"timestamp":"2021-03-09T17:32:02.137Z","level":"INFO","message":" GOXDCR.ResourceMgr: manageResources starting ...."}
### OK 'parsers_extra.couchbase_simple_log_utc.test'

### 'parsers_extra.crowbar.test'
CONF: {"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sle-module-adv-systems-mgmt-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sles12-sp2-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sle12-sp2-ha-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up database"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up Crowbar files"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up batch export"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Proposal Load (0.5ms)  SELECT \"proposals\".* FROM \"proposals\" WHERE \"proposals\".\"barclamp\" = $1  [[\"barclamp\", \"crowbar\"]]"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-5d-f8.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-ba-f8.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-f2-d0.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: de0-07-1b-67-14-08.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"INFO","message":"Completed 200 OK in 6024ms (Views: 5.8ms | ActiveRecord: 0.0ms)"}
{"date":1234567890.123456,"severity":"INFO","message":"Started GET \"/crowbar/machines/1.0/show?name=d9c-dc-71-4c-f2-d0.200.xxx.corp\" for 127.0.0.1 at 2017-11-23 08:50:12 +0000"}
{"date":1234567890.123456,"severity":"INFO","message":"Processing by MachinesController#show as JSON"}
{"date":1234567890.123456,"severity":"INFO","message":"Parameters: {\"name\"=>\"d9c-dc-71-4c-f2-d0.200.xxx.corp\", \"id\"=>\"show\"}"}
{"date":1234567890.123456,"severity":"INFO","message":"Completed 200 OK in 64ms (Views: 12.2ms | ActiveRecord: 0.0ms)"}
{"date":1234567890.123456,"severity":"INFO","message":"Started GET \"/crowbar/machines/1.0/show?name=d9c-dc-71-4c-ba-f8.200.xxx.corp\" for 127.0.0.1 at 2017-11-23 08:50:13 +0000"}
YAML: {"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sle-module-adv-systems-mgmt-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sles12-sp2-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up databag repos-suse-12_2-x86_64/sle12-sp2-ha-pool"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up database"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up Crowbar files"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Backing up batch export"}
{"date":1234567890.123456,"severity":"DEBUG","message":"Proposal Load (0.5ms)  SELECT \"proposals\".* FROM \"proposals\" WHERE \"proposals\".\"barclamp\" = $1  [[\"barclamp\", \"crowbar\"]]"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-5d-f8.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-ba-f8.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: d9c-dc-71-4c-f2-d0.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"WARN","message":"Switch name Error: de0-07-1b-67-14-08.200.xxx.corp: Switch config not detected during discovery"}
{"date":1234567890.123456,"severity":"INFO","message":"Completed 200 OK in 6024ms (Views: 5.8ms | ActiveRecord: 0.0ms)"}
{"date":1234567890.123456,"severity":"INFO","message":"Started GET \"/crowbar/machines/1.0/show?name=d9c-dc-71-4c-f2-d0.200.xxx.corp\" for 127.0.0.1 at 2017-11-23 08:50:12 +0000"}
{"date":1234567890.123456,"severity":"INFO","message":"Processing by MachinesController#show as JSON"}
{"date":1234567890.123456,"severity":"INFO","message":"Parameters: {\"name\"=>\"d9c-dc-71-4c-f2-d0.200.xxx.corp\", \"id\"=>\"show\"}"}
{"date":1234567890.123456,"severity":"INFO","message":"Completed 200 OK in 64ms (Views: 12.2ms | ActiveRecord: 0.0ms)"}
{"date":1234567890.123456,"severity":"INFO","message":"Started GET \"/crowbar/machines/1.0/show?name=d9c-dc-71-4c-ba-f8.200.xxx.corp\" for 127.0.0.1 at 2017-11-23 08:50:13 +0000"}
### OK 'parsers_extra.crowbar.test'

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (4)
tests/internal/data/config_format/convert/parser_custom.external-dns.test (1)

1-1: Confirm whether outer single quotes are expected in external-dns sample

This sample wraps the log line in single quotes. The parser regex will likely still match (the leading ' satisfies the initial non-space), but if real external-dns logs are not quoted this may diverge from the documented example. Consider dropping the outer quotes unless they are actually present in the referenced log snippet.

conf/parsers_openstack.yaml (1)

8-49: Refactor: Extract common OpenStack parser pattern to reduce duplication.

All six parsers (cinder, glance, heat, keystone, neutron, nova) define an identical regex pattern. This violates the DRY principle and makes maintenance difficult—any bug fixes to the pattern must be applied in six places.

Consider refactoring the common pattern into a single reusable parser definition or using YAML anchors (& and *) to eliminate duplication:

parsers:
  - &openstack_base
    format: regex
    regex: '^(?<log_time>[^ ][-.\d\+:T]+[ ]*[.:\d]*)\s+(?<pid>[^ ]\d+)\s+(?<severity>[^ ][.-_\w]+)\s+(?<component>[^ ][.-_\w]+)(\s+\[(-|(?<req_id>[^ ][-\w]*) (?<req_user>[^ ][-\w]*) (?<req_project>[^ ][-\w]*) (?<req_domain>[^ ][-\w]*) (?<req_user_domain>[^ ][-\w]*) (?<req_project_domain>[^ ][-\w]*))\]){1}\s+(?<message>.*)$'
    time_format: '%Y-%m-%d %H:%M:%S.%L'
    time_keep: Off
    time_key: log_time

  - name: cinder
    <<: *openstack_base

  - name: glance
    <<: *openstack_base

  # ... and so on for heat, keystone, neutron, nova

Alternatively, document why each parser is needed separately if they are intentionally distinct.

tests/internal/data/config_format/convert/run_tests.sh (2)

21-26: Verify sed -E flag compatibility across target platforms.

Line 25 uses sed -E for extended regex syntax. This flag differs between GNU sed (-E or -r) and BSD sed (-E). For portability, consider using sed -n with explicit pattern ranges or use awk instead:

squash_date()
{
  awk '{gsub(/"date":[0-9]+\.[0-9]+,/, "\"date\":1234567890.123456,"); print}'
}

Alternatively, document the target platform assumption (e.g., "GNU sed required").


58-62: Add error handling for missing fluent-bit binary and invalid parser names.

The script invokes fluent-bit without checking for availability or error status. If fluent-bit is not in $PATH or if the parser name doesn't exist, the command will fail silently or produce misleading output. Add explicit checks:

command -v fluent-bit >/dev/null 2>&1 || die "fluent-bit not found in PATH"

Also, capture stderr and exit codes to detect parser errors early:

if ! CONF_OUT=$(cat "$A" | fluent-bit -q -R "${F}".conf -i stdin -p parser="${P}" -o stdout -p format=json_lines 2>&1 | filter); then
  die "fluent-bit failed for ${F}.conf with parser ${P}: exit code $?"
fi

Additionally, verify that the -q flag suppresses all debug output; if not, stderr may contaminate the hash comparison.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f1ba23a and e3430d0.

📒 Files selected for processing (32)
  • conf/parser_custom.conf (1 hunks)
  • conf/parser_custom.yaml (1 hunks)
  • conf/parsers.conf (1 hunks)
  • conf/parsers.yaml (1 hunks)
  • conf/parsers_ambassador.yaml (1 hunks)
  • conf/parsers_cinder.conf (1 hunks)
  • conf/parsers_cinder.yaml (1 hunks)
  • conf/parsers_extra.conf (2 hunks)
  • conf/parsers_extra.yaml (1 hunks)
  • conf/parsers_java.yaml (1 hunks)
  • conf/parsers_kafka.yaml (1 hunks)
  • conf/parsers_mult.yaml (1 hunks)
  • conf/parsers_multiline.yaml (1 hunks)
  • conf/parsers_openstack.yaml (1 hunks)
  • tests/internal/data/config_format/convert/.gitignore (1 hunks)
  • tests/internal/data/config_format/convert/README (1 hunks)
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.cri.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.envoy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1 hunks)
  • tests/internal/data/config_format/convert/run_tests.sh (1 hunks)
🔇 Additional comments (18)
conf/parsers_cinder.conf (1)

3-3: HTTPS Rubular reference update is fine

Switching the Rubular URL to https is a safe, non-functional improvement and keeps the reference up to date. No further changes needed.

tests/internal/data/config_format/convert/.gitignore (1)

1-2: Ignoring generated parser configs in this helper directory makes sense

These patterns align with the workflow in the README (copy/convert parser *.conf locally) and prevent accidental check-in of generated test artifacts.

conf/parsers.conf (1)

118-125: CRI Rubular link scheme update is safe

Switching the Rubular reference comment to https is purely cosmetic and has no effect on the cri parser behavior.

conf/parser_custom.conf (1)

1-15: Whitespace-only normalization of [PARSER] headers

Removing the leading space before [PARSER] is a clean formatting fix and keeps these sections consistent with other parser definitions.

tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1)

1-2: Good coverage for kmsg-netfilter-log parser

These TCP/UDP FIREWALL samples look realistic and should exercise the kmsg-netfilter-log parser well for the conversion tests.

tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1)

1-20: Crowbar test data set looks representative

The variety of DEBUG/WARN/INFO lines (backups, SQL, HTTP, switch warnings) should give good coverage for the crowbar parser across different message shapes.

tests/internal/data/config_format/convert/README (1)

1-19: README clearly documents the conversion/test workflow

The steps to copy parser *.conf files, run fb-parser-conf2yaml.pl, and execute run_tests.sh are concise and match the surrounding tooling and .gitignore setup.

conf/parsers_extra.conf (1)

4-16: Rubular links for crowbar/chefclient updated safely

Updating these reference comments to https has no effect on the parsers and keeps the documentation links modern.

tests/internal/data/config_format/convert/parsers.envoy.test (1)

1-1: LGTM!

The Envoy access log test entry is well-formed and contains all the expected fields for testing the parser.

tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)

1-1: LGTM!

The RabbitMQ log test entry is well-formed with proper timestamp, log level, process ID, and message structure.

tests/internal/data/config_format/convert/parser_custom.neo4j.test (1)

1-1: LGTM!

The Neo4j log test entry is well-formed with proper timestamp, log level, and message structure. Note: there's a trailing space at the end of the line, which may be intentional for testing whitespace handling.

tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1)

1-1: LGTM!

The Kubernetes NGINX Ingress log test entry is comprehensive and well-formed, containing all expected fields including client IP, timestamp, request details, response metrics, and upstream information.

tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1)

1-11: LGTM!

The Couchbase Analytics Java multiline test data is well-structured for testing multiline parsing behavior. It includes:

  • Proper log entry boundaries (timestamps + log levels)
  • Complete exception stack trace with correct indentation
  • Multiple log levels (INFO, DEBU)
  • Version information and source references

This comprehensively tests the parser's ability to recognize log entry boundaries and aggregate multiline stack traces.

tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1)

1-1: LGTM!

The Istio Envoy proxy log test entry is comprehensive and well-formed, including all expected fields such as timestamp, HTTP request details, timing metrics, and Istio-specific routing context (inbound/outbound cluster information).

tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1)

6-17: Verify and clarify intent of IP address and path variations in test data.

The inconsistencies flagged in the review are confirmed to exist in the test file. The file contains example Chef client log output used as a test fixture, but the codebase contains no documentation explaining why five different IP addresses and paths appear across lines describing what should be a single mount operation:

  • Line 6: 192.168.131.243:/nova_nfs__vol_1/nova_1
  • Line 8: 192.1.1.243:/nova_nfs_rot_2vol_1/nova_1
  • Line 13: 192.18.11.243:/nova_nfs_vol_1/nova_1
  • Line 16: 192.168.1.243:/nova_nfs_vol_1/nova_1
  • Line 17: 192.16.1.243:/nova_nfs_rot_2_1_vsos211_vol_1/nova_1

Clarify whether these variations are intentional (testing parser robustness, realistic obfuscation patterns) or the result of incomplete redaction/copy-paste errors during test fixture creation.

conf/parsers_multiline.yaml (1)

1-12: Verify Fluent Bit regex dialect supports the state machine configuration.

The multiline parser uses forward-slash-delimited regex patterns (/(Dec \d+ \d+\:\d+\:\d+)(.*)/) and a state machine with transitions. Confirm that Fluent Bit's regex parser supports this syntax and the state machine logic works as documented.

conf/parsers_java.yaml (1)

1-6: Verify PCRE-style named capture group support in Fluent Bit regex parser.

The Java parser uses PCRE-style named capture groups ((?<name>...)). Ensure Fluent Bit's regex implementation (Oniguruma) supports this syntax. Additionally, confirm that the time_format '%Y-%m-%d %H:%M:%S' correctly aligns with the captured time group pattern (\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}).

conf/parser_custom.yaml (1)

1-16: Verify regex patterns work correctly with corresponding test data.

The three custom parsers include references to regex testing at rubular.com. Ensure each pattern is tested end-to-end against the corresponding parser test files (e.g., parser_custom.rabbitmq.test, parser_custom.neo4j.test, parser_custom.external-dns.test) using the new test harness to confirm expected behavior.

Additionally, clarify the external-dns pattern structure at line 15: the leading ([^ ])\" appears to capture a non-space character followed by a quote. Verify this is the intended behavior for the target log format.

@hlein hlein changed the title Parsers yaml Convert existing parser*.conf files to yaml format, add a converter, add tests Nov 16, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (5)
conf/parsers_mult.yaml (2)

8-10: Critical: Literal slash delimiters break the regex pattern.

The regex pattern is wrapped in literal forward slashes (/...$/), which will cause Fluent Bit to treat the slashes as part of the pattern. This was already identified in a previous review.

Apply this diff:

-    regex: /Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)$/
+    regex: 'Processing by (?<controller>[^\u0023]+)\u0023(?<controller_method>[^ ]+) as (?<format>[^ ]+?)$'

16-18: Critical: Literal slash delimiters break the regex pattern.

Same issue as mult_1 - the regex is wrapped in literal forward slashes.

Apply this diff:

-    regex: /  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)/
+    regex: '  Rendered (?<template>[^ ]+) within (?<layout>.+) \([\d\.]+ms\)'
conf/parsers.yaml (1)

75-80: Critical: Literal slash delimiters break the syslog-rfc3164 parser.

The regex is wrapped in literal forward slashes ('/^...$/'), making Fluent Bit look for lines that start and end with a slash character. This was already identified in a previous review.

Apply this diff:

-    regex: '/^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/'
+    regex: '^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$'
tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)

182-232: Address the regex delimiter normalization issue flagged in previous review.

A previous review identified that the converter copies Regex values verbatim, including optional /pattern/ delimiters. When the original .conf uses this style, Fluent Bit will treat the slashes as literal characters, breaking the parser.

As suggested in the previous review, normalize regex values before storing them:

      my ($key, $value) = ($1, $2);
      $blank_before_comments = 0;
      
      # Convert key to lowercase with underscores
      my $yaml_key = lc($key);
      
+     # Normalize regex delimiters
+     if ($yaml_key eq 'regex' && $value =~ m{\A/(.*)/\z} && $value !~ m{\A//}) {
+       $value = $1;  # drop enclosing delimiters
+     }
+     
      # Special handling for multiline parser 'rule' fields

After applying this fix, regenerate all YAML files to restore correct parser behavior.

tests/internal/data/config_format/convert/run_tests.sh (1)

37-46: Address the multi-line string formatting issue flagged in previous review.

The previous review correctly identified that the die function receives a multi-line string literal that may not preserve newlines reliably across different POSIX shells. Use a here-document as suggested:

-  die "Usage: $0 [-v] [file1.test [file2.test ...]]
-
-  Process parser*.*.test files, either each one named on the command line,
-  or every file in pwd matching that pattern.
-
-  Makes sure the output produced by the .conf file and .yaml configs match.
-
-  Needs a corresponding .conf and .yaml for each.
-"
+  die <<EOF
+Usage: $0 [-v] [file1.test [file2.test ...]]
+
+Process parser*.*.test files, either each one named on the command line,
+or every file in pwd matching that pattern.
+
+Makes sure the output produced by the .conf file and .yaml configs match.
+
+Needs a corresponding .conf and .yaml for each.
+EOF
🧹 Nitpick comments (2)
conf/parsers_openstack.yaml (1)

9-49: Consider whether these six parsers could share a common definition.

All six OpenStack parsers (cinder, glance, heat, keystone, neutron, nova) use identical regex patterns and time handling configuration. While explicit definitions provide clarity, this duplication could make maintenance harder if the pattern needs to change.

However, this may be intentional to allow independent evolution of each service's parser.

tests/internal/data/config_format/convert/run_tests.sh (1)

69-73: Consider adding error message context to fluent-bit check.

The self-test at line 71 validates basic fluent-bit functionality, but if it fails, the generic error message doesn't indicate whether fluent-bit is found but malfunctioning, or if there's a PATH issue. Consider adding diagnostic output:

 command -v fluent-bit >/dev/null 2>&1 || die "fluent-bit not found in PATH"
-if ! echo '{"a":"b"}' | fluent-bit -q -i stdin -o stdout -p format=json_lines | grep -E -q '{"date":[0-9]+\.[0-9]+,"a":"b"}' ; then
-  die "fluent-bit self-test failed, cannot continue."
+if ! echo '{"a":"b"}' | fluent-bit -q -i stdin -o stdout -p format=json_lines 2>&1 | grep -E -q '{"date":[0-9]+\.[0-9]+,"a":"b"}' ; then
+  die "fluent-bit self-test failed (command succeeded but output unexpected), cannot continue."
 fi
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e3430d0 and 3b7e218.

📒 Files selected for processing (32)
  • conf/parser_custom.yaml (1 hunks)
  • conf/parsers.conf (2 hunks)
  • conf/parsers.yaml (1 hunks)
  • conf/parsers_ambassador.yaml (1 hunks)
  • conf/parsers_cinder.conf (1 hunks)
  • conf/parsers_cinder.yaml (1 hunks)
  • conf/parsers_extra.conf (2 hunks)
  • conf/parsers_extra.yaml (1 hunks)
  • conf/parsers_java.yaml (1 hunks)
  • conf/parsers_kafka.yaml (1 hunks)
  • conf/parsers_mult.yaml (1 hunks)
  • conf/parsers_multiline.yaml (1 hunks)
  • conf/parsers_openstack.yaml (1 hunks)
  • tests/internal/data/config_format/convert/.gitignore (1 hunks)
  • tests/internal/data/config_format/convert/README (1 hunks)
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.cri.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.envoy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1 hunks)
  • tests/internal/data/config_format/convert/run_tests.sh (1 hunks)
✅ Files skipped from review due to trivial changes (3)
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test
  • tests/internal/data/config_format/convert/README
  • conf/parsers_extra.conf
🚧 Files skipped from review as they are similar to previous changes (17)
  • conf/parsers.conf
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test
  • conf/parsers_kafka.yaml
  • tests/internal/data/config_format/convert/parsers.envoy.test
  • tests/internal/data/config_format/convert/.gitignore
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test
  • conf/parsers_cinder.conf
  • conf/parsers_cinder.yaml
  • conf/parsers_multiline.yaml
  • tests/internal/data/config_format/convert/parsers.cri.test
  • tests/internal/data/config_format/convert/parsers_cinder.ceph.test
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test
🔇 Additional comments (10)
tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test (1)

1-1: LGTM! Well-formed RFC3164 syslog test data.

The test data correctly represents a complete RFC3164 syslog message with PRI (<34>), timestamp, hostname (mymachine), program (su), and message content.

conf/parser_custom.yaml (3)

2-5: LGTM! Clean regex pattern for RabbitMQ logs.

The regex correctly extracts date, time, log_level, PID, and message fields using named capture groups.


7-10: LGTM! Clean regex pattern for Neo4j logs.

The regex correctly extracts date, time, log_level, and message fields.


12-15: The YAML regex matches the original .conf file exactly, so no conversion error exists.

The leading unnamed capture group ([^ ]) is present in the original .conf file, not introduced by this YAML version. Since the conversion is faithful to the source, verify with domain knowledge whether the original regex pattern's unnamed capture group is intentional for external-dns log parsing.

conf/parsers_ambassador.yaml (1)

5-7: LGTM! Comprehensive regex for Ambassador/Envoy logs.

The regex correctly extracts all relevant fields from Ambassador access logs including request details, response metrics, and proxy information.

conf/parsers_extra.yaml (1)

1-159: LGTM! Comprehensive parser definitions.

This file provides a wide variety of parsers (crowbar, chefclient, mysql, pacemaker, rabbitmq, couchbase variants, etc.) with properly formatted regex patterns and appropriate time handling. No literal slash delimiters detected.

tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)

60-67: Prevent out-of-bounds shift in argument parsing.

Line 66 calls shift unconditionally within the loop, but if $arg is the last argument and matches a valid pattern, shift will fail silently or cause unexpected behavior.

Move the shift call outside the case logic or validate that arguments remain:

 for arg in "$@"; do
   case "$arg" in
     -v) filter() { squash_date; }; ;;
     -*) usage ;;
     *)  TEST_FILES+=("$arg") ;;
   esac
-  shift
 done

Wait, this is shell script code, not Perl. Let me reconsider - this comment should be for run_tests.sh, not this Perl file. Let me skip this for now and address it in the correct file.

tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)

1-1: LGTM!

Test data file contains a representative RabbitMQ log entry for parser validation.

tests/internal/data/config_format/convert/parser_custom.neo4j.test (1)

1-1: LGTM!

Test data file contains a representative Neo4j log entry for parser validation.

tests/internal/data/config_format/convert/run_tests.sh (1)

88-94: LGTM!

The core test logic correctly:

  • Executes fluent-bit with both .conf and .yaml parser configurations
  • Applies consistent filtering/normalization to both outputs
  • Validates outputs are non-empty before comparison

The approach of using checksums (or verbose output with -v) provides a reliable equivalence check.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/internal/data/config_format/convert/run_tests.sh (2)

48-56: Simplify redundant echo statement.

Line 55's echo -e "\n" is redundant since echo already appends a newline by default, and die will output its message. This just adds visual noise.

Consider simplifying:

   [[ "$check" = "" ]] || \
   [[ "$check" = "01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b" ]] || \
   [[ "$check" = "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855" ]] && \
-	echo -e "\n" && die "### FAIL '$TEST_FILE' output was empty"
+	die "### FAIL '$TEST_FILE' output was empty"

77-100: Add pipeline failure detection for reliable error handling.

The script runs complex pipelines (lines 87, 91) but doesn't use set -o pipefail, which means intermediate command failures might go undetected. While check_output does validate that output isn't empty, explicit pipeline failure detection would make error handling more robust.

Add at the top of the script (after the shebang):

 #!/bin/bash
+set -euo pipefail
 
 # Test a batch of parsers that have been converted from .conf to .yaml,

This ensures:

  • set -e: Exit on any command failure
  • set -u: Exit on undefined variable reference
  • set -o pipefail: Pipelines fail if any command fails, not just the last one
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b7e218 and 52c8a83.

📒 Files selected for processing (29)
  • conf/parser_custom.yaml (1 hunks)
  • conf/parsers.yaml (1 hunks)
  • conf/parsers_ambassador.yaml (1 hunks)
  • conf/parsers_cinder.yaml (1 hunks)
  • conf/parsers_extra.yaml (1 hunks)
  • conf/parsers_java.yaml (1 hunks)
  • conf/parsers_kafka.yaml (1 hunks)
  • conf/parsers_mult.yaml (1 hunks)
  • conf/parsers_multiline.yaml (1 hunks)
  • conf/parsers_openstack.yaml (1 hunks)
  • tests/internal/data/config_format/convert/.gitignore (1 hunks)
  • tests/internal/data/config_format/convert/README (1 hunks)
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.cri.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.envoy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1 hunks)
  • tests/internal/data/config_format/convert/run_tests.sh (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • conf/parsers_multiline.yaml
🚧 Files skipped from review as they are similar to previous changes (16)
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test
  • conf/parsers_openstack.yaml
  • conf/parsers_kafka.yaml
  • conf/parser_custom.yaml
  • conf/parsers_extra.yaml
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
  • tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test
  • tests/internal/data/config_format/convert/parsers.cri.test
  • tests/internal/data/config_format/convert/README
  • conf/parsers_mult.yaml
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test
  • conf/parsers_java.yaml
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test
  • tests/internal/data/config_format/convert/parsers.envoy.test
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-11-16T22:16:26.015Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.015Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.

Applied to files:

  • conf/parsers.yaml
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl
  • conf/parsers_cinder.yaml
📚 Learning: 2025-09-22T15:59:55.794Z
Learnt from: nicknezis
Repo: fluent/fluent-bit PR: 10882
File: plugins/out_http/http.c:112-116
Timestamp: 2025-09-22T15:59:55.794Z
Learning: When users consider bug fixes out of scope for their focused PRs, it's appropriate to create separate GitHub issues to track those concerns rather than expanding the current PR scope.

Applied to files:

  • conf/parsers_cinder.yaml
🔇 Additional comments (17)
tests/internal/data/config_format/convert/.gitignore (1)

1-2: Verify that test fixtures are tracked in git if needed by the test harness.

This .gitignore pattern ignores all .conf and .yaml files in the test directory. Given that the PR adds a test harness for the converter, clarify whether these patterns are intended to exclude:

  1. Generated test output files (temporary/derived) — in which case ignoring is appropriate
  2. Test fixture files (reference data needed by tests) — in which case ignoring may prevent test execution

Please confirm:

  • Are .conf and .yaml files in this directory test fixtures that need to be committed to git?
  • Or are they generated output files that should be excluded?
  • Are test fixtures tracked elsewhere, or will they be generated at test time?

If test fixtures should be committed, consider scoping this .gitignore to exclude only generated files, e.g.:

*.conf.generated
*.yaml.generated

Or use a separate directory structure for generated vs. fixture files.

tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1)

1-3: LGTM!

The test fixture appropriately contains mixed Couchbase log formats (slash-separated and ISO 8601 date formats) with realistic log content. This test data effectively validates that the parser can handle different timestamp formats from Couchbase logs.

conf/parsers_cinder.yaml (1)

1-8: LGTM! Ceph parser converted correctly.

The YAML format correctly preserves the existing ceph parser configuration from the .conf version, and test results confirm functional parity.

tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1)

1-10: Test data looks good.

The Ceph log samples provide good coverage with both Hangup signals and connection state messages, sufficient for validating the parser conversion.

conf/parsers.yaml (1)

1-122: Excellent parser registry conversion.

The YAML format correctly preserves all existing parser configurations, and test results confirm functional parity between .conf and .yaml formats across all parsers. The comprehensive registry supports diverse log formats with consistent time extraction patterns.

conf/parsers_ambassador.yaml (1)

1-7: Ambassador parser looks good.

The configuration correctly defines an Envoy-based parser for Ambassador logs with comprehensive field extraction. The regex pattern aligns with the Envoy log format used by Ambassador.

tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1)

1-489: Well-structured converter script.

The Perl converter is cleanly organized with clear separation between parsing, formatting, and output generation. It handles the conversion requirements effectively:

  • Preserves comments (leading, inline, trailing) with proper positioning
  • Transforms multiline parser rules into structured YAML
  • Handles UTF-8 encoding properly
  • Provides appropriate error handling with warnings

The test results confirm the converter produces functionally equivalent YAML configurations.

tests/internal/data/config_format/convert/parser_custom.external-dns.test (1)

1-1: Test data validates the external-dns parser.

The sample log line provides appropriate coverage for testing the external-dns parser configuration.

tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1)

1-2: Netfilter test data provides good protocol coverage.

The test samples include both TCP and UDP variants, sufficient for validating the kmsg-netfilter-log parser's comprehensive field extraction.

tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)

1-1: RabbitMQ test data validates the parser.

The log sample appropriately tests the rabbitmq parser's field extraction capabilities.

tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1)

1-1: LGTM! Test data is appropriate.

The log line contains all expected fields for a k8s nginx ingress access log, including client IP, timestamp, HTTP request details, response metrics, and upstream information. This provides comprehensive test coverage for the parser.

tests/internal/data/config_format/convert/run_tests.sh (6)

1-16: LGTM! Utility functions are well-structured.

The shebang, warn, and die functions follow bash best practices for error reporting.


18-27: LGTM! Date normalization is correctly implemented.

The regex pattern properly matches JSON date fields, and the comment clearly explains why this normalization is necessary. The sed -E usage is appropriate for this pattern.


29-33: LGTM! Filter function provides appropriate output transformation.

The pipeline correctly normalizes dates and generates a checksum for comparison. The function override mechanism for verbose mode is a clean approach.


60-66: LGTM! Argument parsing is correct.

The loop properly handles verbose mode, validates flags, and collects test file arguments. The previous issue with unnecessary shift has been addressed.


68-72: LGTM! Fluent-bit validation provides good early failure detection.

The PATH check and self-test ensure the script fails fast if fluent-bit is unavailable or malfunctioning, which improves debuggability.


74-75: LGTM! Default file selection provides good usability.

Defaulting to all matching test files when none are specified is intuitive and follows common Unix tool patterns.

These have been run through the test suite to make sure fluent-bit
produces identical output using them.

Signed-off-by: Hank Leininger <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
conf/parsers.yaml (1)

75-80: Consider removing regex delimiters for consistency.

While this regex pattern works correctly (Fluent Bit strips the leading and trailing / characters automatically), it's inconsistent with other parsers in this file. For example, the similar syslog-rfc3164-local parser at line 70 and all other regex parsers in this file don't include the / delimiters.

Consider applying this change for consistency:

   - name: syslog-rfc3164
     format: regex
-    regex: '/^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/'
+    regex: '^\<(?<pri>[0-9]+)\>(?<time>[^ ]* {1,2}[^ ]* [^ ]*) (?<host>[^ ]*) (?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$'
     time_key: time
     time_format: '%b %d %H:%M:%S'
     time_keep: On

Based on learnings.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 52c8a83 and 85e1ccc.

📒 Files selected for processing (29)
  • conf/parser_custom.yaml (1 hunks)
  • conf/parsers.yaml (1 hunks)
  • conf/parsers_ambassador.yaml (1 hunks)
  • conf/parsers_cinder.yaml (1 hunks)
  • conf/parsers_extra.yaml (1 hunks)
  • conf/parsers_java.yaml (1 hunks)
  • conf/parsers_kafka.yaml (1 hunks)
  • conf/parsers_mult.yaml (1 hunks)
  • conf/parsers_multiline.yaml (1 hunks)
  • conf/parsers_openstack.yaml (1 hunks)
  • tests/internal/data/config_format/convert/.gitignore (1 hunks)
  • tests/internal/data/config_format/convert/README (1 hunks)
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test (1 hunks)
  • tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.cri.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.envoy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test (1 hunks)
  • tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1 hunks)
  • tests/internal/data/config_format/convert/run_tests.sh (1 hunks)
✅ Files skipped from review due to trivial changes (2)
  • tests/internal/data/config_format/convert/parser_custom.neo4j.test
  • tests/internal/data/config_format/convert/parsers.k8s-nginx-ingress.test
🚧 Files skipped from review as they are similar to previous changes (17)
  • tests/internal/data/config_format/convert/parser_custom.external-dns.test
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_java_multiline.test
  • tests/internal/data/config_format/convert/parsers.cri.test
  • tests/internal/data/config_format/convert/parsers_extra.chefclient.test
  • tests/internal/data/config_format/convert/parsers.syslog-rfc3164.test
  • conf/parsers_cinder.yaml
  • conf/parsers_openstack.yaml
  • conf/parsers_java.yaml
  • conf/parsers_ambassador.yaml
  • conf/parser_custom.yaml
  • tests/internal/data/config_format/convert/parsers.kmsg-netfilter-log.test
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_mixed.test
  • conf/parsers_kafka.yaml
  • conf/parsers_mult.yaml
  • tests/internal/data/config_format/convert/parsers_extra.couchbase_simple_log_utc.test
  • tests/internal/data/config_format/convert/run_tests.sh
  • tests/internal/data/config_format/convert/parsers.envoy.test
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-11-16T22:16:26.015Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.015Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.

Applied to files:

  • conf/parsers.yaml
  • conf/parsers_extra.yaml
  • tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl
🔇 Additional comments (15)
tests/internal/data/config_format/convert/.gitignore (1)

1-2: LGTM!

The ignore patterns are appropriate for a conversion test directory where .conf and .yaml files are generated or copied during testing.

tests/internal/data/config_format/convert/README (1)

1-20: LGTM!

The documentation clearly explains the conversion workflow with practical usage examples.

tests/internal/data/config_format/convert/fb-parser-conf2yaml.pl (7)

1-28: LGTM!

The header properly enables strict mode, warnings, and UTF-8 support. The documented limitations are reasonable given the scope of conversion needed, and the keyword pattern construction with quotemeta is a good security practice.


48-60: LGTM!

The keyword transformation logic correctly identifies matching patterns and delegates value formatting to the appropriate function.


62-252: LGTM!

The parsing logic correctly handles sections, comments, blank lines, and key-value pairs. The state management for comment positioning is well thought out, and appropriate warnings are issued for unsupported sections or parsing failures.


254-271: LGTM!

The YAML value formatting correctly identifies values requiring quoting and properly escapes single quotes using YAML's '' escape sequence.


273-416: LGTM!

The output functions properly format YAML with correct indentation, list markers, and blank line spacing. The special handling for rules arrays and comment positioning is well-implemented.


418-438: LGTM!

The conversion function correctly orchestrates the output of all components in the proper order and returns well-formed YAML.


440-492: LGTM!

The main execution has robust file handling with proper existence checks, UTF-8 encoding, and helpful error messages. The safety check for existing non-empty output files (line 463) prevents accidental overwrites.

tests/internal/data/config_format/convert/parser_custom.rabbitmq.test (1)

1-1: LGTM! Test data looks appropriate.

The RabbitMQ log entry is well-formed and provides appropriate test coverage for the RabbitMQ parser. The PR objectives confirm that tests pass with matching checksums between .conf and .yaml formats.

tests/internal/data/config_format/convert/parsers_cinder.ceph.test (1)

1-10: LGTM! Comprehensive Ceph test data.

The test data covers multiple Ceph log message types (Hangup signals and connection state messages), providing good parser validation coverage. The PR objectives confirm that tests pass with matching checksums.

conf/parsers_multiline.yaml (1)

1-12: LGTM! Multiline parser configuration is correct.

The multiline parser for exception handling is properly configured. The regex patterns include / delimiters, which work correctly because Fluent Bit's regex engine automatically strips leading and trailing / characters.

Based on learnings.

tests/internal/data/config_format/convert/parsers_extra.crowbar.test (1)

1-20: LGTM! Comprehensive Crowbar test coverage.

The test data provides excellent coverage of various Crowbar log formats including DEBUG, WARN, and INFO levels, command errors, and HTTP request/response logs. The PR objectives confirm that tests pass with matching checksums.

tests/internal/data/config_format/convert/parsers.istio-envoy-proxy.test (1)

1-1: LGTM! Appropriate Istio Envoy test data.

The test data provides comprehensive coverage of Istio Envoy proxy log fields including request details, timing, upstream information, and routing context. The PR objectives confirm that tests pass with matching checksums.

conf/parsers_extra.yaml (1)

1-159: LGTM! Comprehensive and consistent parser definitions.

This file provides an extensive collection of well-structured parser definitions for various log sources. All regex patterns are consistently formatted without leading/trailing / delimiters, and time handling configurations are appropriate for each parser type. The Couchbase parser collection is particularly thorough, covering multiple log formats and edge cases.

@hlein
Copy link
Author

hlein commented Nov 17, 2025

conf/parsers.yaml (1)> 75-80: Consider removing regex delimiters for consistency.

My intention is to submit regex-cleanup commits separately from these changes, once these commits merge.

@hlein
Copy link
Author

hlein commented Nov 18, 2025

Note, a branch full of fixes is waiting behind this PR; once it lands I'll retarget hlein#2

@@ -0,0 +1,492 @@
#!/usr/bin/env perl
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just my opinion but it would be nice to be written down in Python instead of Perl...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just my opinion but it would be nice to be written down in Python instead of Perl...

I'm sure it's not just your opinion, probably shared by others :)

But I'm about 10x more proficient in perl than python. This tool probably won't be needed longterm, once the current parser files are translated (but I added it to show my work).

I'm sure it could be translated/rewritten into python mostly automatically, but if so I wouldn't be qualified to debug it.

@hlein hlein changed the title Convert existing parser*.conf files to yaml format, add a converter, add tests Convert existing parser*.conf files to yaml, add converter, add tests, drop old Nov 18, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 85e1ccc and 34e35f0.

📒 Files selected for processing (23)
  • CMakeLists.txt (1 hunks)
  • conf/fluent-bit-macos.conf (1 hunks)
  • conf/fluent-bit-win32.conf (1 hunks)
  • conf/fluent-bit.conf (1 hunks)
  • conf/kube.conf (1 hunks)
  • conf/kube_elasticsearch.conf (1 hunks)
  • conf/parser_custom.conf (0 hunks)
  • conf/parsers_ambassador.conf (0 hunks)
  • conf/parsers_cinder.conf (0 hunks)
  • conf/parsers_extra.conf (0 hunks)
  • conf/parsers_java.conf (0 hunks)
  • conf/parsers_kafka.conf (0 hunks)
  • conf/parsers_mult.conf (0 hunks)
  • conf/parsers_openstack.conf (0 hunks)
  • conf/rate_limit.conf (2 hunks)
  • cpack/debian/conffiles (1 hunks)
  • dockerfiles/Dockerfile (1 hunks)
  • dockerfiles/Dockerfile.windows (1 hunks)
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf (1 hunks)
  • documentation/examples/multiline/regex-001/fluent-bit.conf (1 hunks)
  • examples/wasi_serde_json/README.md (2 hunks)
  • src/CMakeLists.txt (1 hunks)
  • tests/internal/data/stream_processor/gen_msgpack.sh (2 hunks)
💤 Files with no reviewable changes (8)
  • conf/parsers_openstack.conf
  • conf/parsers_mult.conf
  • conf/parser_custom.conf
  • conf/parsers_cinder.conf
  • conf/parsers_java.conf
  • conf/parsers_kafka.conf
  • conf/parsers_ambassador.conf
  • conf/parsers_extra.conf
✅ Files skipped from review due to trivial changes (2)
  • CMakeLists.txt
  • examples/wasi_serde_json/README.md
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • src/CMakeLists.txt
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit, the correct CMake flag for using system librdkafka is `FLB_PREFER_SYSTEM_LIB_KAFKA=ON`.

Applied to files:

  • src/CMakeLists.txt
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • dockerfiles/Dockerfile
  • dockerfiles/Dockerfile.windows
  • cpack/debian/conffiles
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.

Applied to files:

  • conf/fluent-bit-win32.conf
  • conf/fluent-bit.conf
  • conf/fluent-bit-macos.conf
  • documentation/examples/multiline/regex-001/fluent-bit.conf
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf
  • cpack/debian/conffiles
🔇 Additional comments (12)
conf/fluent-bit-macos.conf (1)

28-28: LGTM! Parser file reference updated correctly.

The update from parsers.conf to parsers.yaml aligns with the PR's migration strategy. Testing confirms YAML parsers behave identically to the original .conf versions.

conf/rate_limit.conf (1)

5-5: LGTM! Configuration updated correctly.

The parser file reference has been updated to parsers.yaml, consistent with the migration across all configuration files in this PR.

Also applies to: 25-25

conf/fluent-bit-win32.conf (1)

28-28: LGTM! Windows configuration updated consistently.

The parser file reference update matches the changes across all platform-specific configurations (macOS, Linux, Windows).

conf/kube_elasticsearch.conf (1)

5-5: LGTM! Kubernetes configuration updated correctly.

The parser file reference has been updated to parsers.yaml, maintaining consistency with the project-wide migration.

documentation/examples/multiline/regex-001/fluent-bit.conf (1)

4-4: LGTM! Documentation example updated correctly.

The multiline parser reference has been updated from .conf to .yaml format, ensuring documentation examples remain accurate and aligned with the migration.

conf/fluent-bit.conf (1)

28-28: LGTM! Primary configuration updated correctly.

The main Fluent Bit configuration file now references parsers.yaml, completing the migration for the default configuration.

conf/kube.conf (1)

5-5: LGTM! Kubernetes configuration updated correctly.

The parser file reference update completes the consistent migration across all configuration files in this review set.

cpack/debian/conffiles (1)

1-1: Verification confirmed: parsers.yaml installation properly configured.

The build system correctly installs parsers.yaml to /etc/fluent-bit/parsers.yaml as a config file (marked noreplace for package preservation). The conffiles change is correct and aligns with the CMake configuration.

tests/internal/data/stream_processor/gen_msgpack.sh (1)

10-10: LGTM! Consistent update to YAML parser configuration.

Both fluent-bit invocations have been updated to use parsers.yaml instead of parsers.conf, maintaining consistency across the test generation script.

Also applies to: 34-34

src/CMakeLists.txt (1)

660-670: LGTM! Well-designed backward compatibility.

The conditional installation logic properly handles both YAML-enabled and non-YAML builds by installing parsers.yaml when FLB_HAVE_LIBYAML is enabled and falling back to parsers.conf otherwise. This maintains backward compatibility while enabling the migration to YAML-based configurations.

dockerfiles/Dockerfile.windows (1)

190-195: LGTM! Consistent migration to YAML parser configurations.

All parser configuration files have been updated to use YAML variants. This is well-aligned with the Windows build configuration, which includes libyaml installation (line 153) and proper CMake configuration (line 174).

dockerfiles/Dockerfile (1)

103-108: LGTM! Consistent migration to YAML parser configurations.

All parser configuration files have been updated to use YAML variants, consistent with the Windows Dockerfile changes. The Linux build properly includes YAML support via the libyaml-dev package (line 57).

@cosmo0920 cosmo0920 changed the title Convert existing parser*.conf files to yaml, add converter, add tests, drop old conf: Convert existing parser*.conf files to yaml, add converter, add tests, drop old Nov 20, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d61797d and c3bf424.

📒 Files selected for processing (23)
  • CMakeLists.txt (1 hunks)
  • conf/fluent-bit-macos.conf (1 hunks)
  • conf/fluent-bit-win32.conf (1 hunks)
  • conf/fluent-bit.conf (1 hunks)
  • conf/kube.conf (1 hunks)
  • conf/kube_elasticsearch.conf (1 hunks)
  • conf/parser_custom.conf (0 hunks)
  • conf/parsers_ambassador.conf (0 hunks)
  • conf/parsers_cinder.conf (0 hunks)
  • conf/parsers_extra.conf (0 hunks)
  • conf/parsers_java.conf (0 hunks)
  • conf/parsers_kafka.conf (0 hunks)
  • conf/parsers_mult.conf (0 hunks)
  • conf/parsers_openstack.conf (0 hunks)
  • conf/rate_limit.conf (2 hunks)
  • cpack/debian/conffiles (1 hunks)
  • dockerfiles/Dockerfile (1 hunks)
  • dockerfiles/Dockerfile.windows (1 hunks)
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf (1 hunks)
  • documentation/examples/multiline/regex-001/fluent-bit.conf (1 hunks)
  • examples/wasi_serde_json/README.md (2 hunks)
  • src/CMakeLists.txt (1 hunks)
  • tests/internal/data/stream_processor/gen_msgpack.sh (2 hunks)
💤 Files with no reviewable changes (8)
  • conf/parsers_openstack.conf
  • conf/parsers_mult.conf
  • conf/parsers_ambassador.conf
  • conf/parsers_kafka.conf
  • conf/parser_custom.conf
  • conf/parsers_cinder.conf
  • conf/parsers_java.conf
  • conf/parsers_extra.conf
✅ Files skipped from review due to trivial changes (1)
  • conf/fluent-bit-win32.conf
🚧 Files skipped from review as they are similar to previous changes (10)
  • tests/internal/data/stream_processor/gen_msgpack.sh
  • conf/kube.conf
  • documentation/examples/multiline/regex-001/fluent-bit.conf
  • examples/wasi_serde_json/README.md
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf
  • conf/fluent-bit.conf
  • conf/kube_elasticsearch.conf
  • dockerfiles/Dockerfile.windows
  • conf/rate_limit.conf
  • src/CMakeLists.txt
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • cpack/debian/conffiles
  • dockerfiles/Dockerfile
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.

Applied to files:

  • cpack/debian/conffiles
  • conf/fluent-bit-macos.conf
🔇 Additional comments (2)
dockerfiles/Dockerfile (1)

102-110: Now I need to check what version of fluent-bit is being used and look at the complete fluent-bit.conf file to understand how parsers are loaded:

Let me check the Dockerfile more thoroughly and the fluent-bit.conf file:Based on all my web searches, I now have enough context. When converting the existing conf to a YAML format, you'll change the file extension to be .YAML which means you must also modify the referencing parsers_file reference in the server section of the main configuration file. The critical issue is that fluent-bit.conf showed NO @include statements. Without checking the full fluent-bit.conf content to see if it has proper Parsers_File directives referencing the YAML files, I cannot definitively verify the configuration is correct. However, based on the evidence:

  1. ✓ YAML parser files exist in the build context
  2. ✓ Fluent-bit version is 4.1.2+ which supports YAML parsers
  3. ✗ fluent-bit.conf has NO @include statements (but could have parsers_file directive)
  4. Critical gap: No evidence that fluent-bit.conf has been updated to reference the new YAML parser files

This is a potential breaking change that needs verification.

Verify that fluent-bit.conf properly references the YAML parser files and that all parser configurations are correctly loaded in the container.

The Dockerfile change copied YAML parser files instead of .conf files, but without verifying that:

  • fluent-bit.conf contains updated parsers_file directives pointing to the YAML files (not the old .conf files)
  • All parser references in the main configuration are correctly updated to use YAML format

The PR claims "kept in place" and "FLB_CONFIG_YAML remains optional," but the Dockerfile makes an exclusive switch from .conf to .yaml files without evidence of corresponding config updates. If fluent-bit.conf still references .conf parser files that are no longer copied, the container will fail to load parsers at runtime.

conf/fluent-bit-macos.conf (1)

28-28: No issues found. Backward compatibility maintained.

Both conf/parsers.conf and conf/parsers.yaml coexist in the repository, ensuring users with existing installations can revert to the old format if needed by editing their configuration. The migration from .conf to .yaml is consistent across all platform configurations (macOS, Windows, Linux, Kubernetes), and existing test data validates both formats work correctly. The change is a default configuration update, not a breaking removal.

@cosmo0920 cosmo0920 added this to the Fluent Bit Next milestone Nov 21, 2025
@edsiper
Copy link
Member

edsiper commented Nov 21, 2025

what happens when a user upgrades from an old version a future one that contains this change ?

We can merge this for v5.0 (released on March), so we have time to test

@hlein
Copy link
Author

hlein commented Nov 21, 2025

what happens when a user upgrades from an old version a future one that contains this change ?

We can merge this for v5.0 (released on March), so we have time to test

I think if they have their own .conf files and configs that refer to them, nothing should change for them?

There was some feedback (don't remember if it was from a human or a bot) about making sure packages considered the old .conf file names protected from removal even if they're missing in our future bundles - that sounds like a good idea to me although I don't yet know how that works.

Or do you mean, users may want to have a conversion tool they can use to convert their own local/custom parser .conf files to .yaml? They could use the conversion script and also test script, although that does maybe change my "this is only needed for now" assertion.

@patrick-stephens
Copy link
Collaborator

patrick-stephens commented Nov 22, 2025

I think Eduardo means that folks will be using the current provided conf files so removing them will break their configuration, this is particularly true of parser definitions.

I would say it's too risky to merge new definitions and remove the old ones all in one go. We should merge the new ones to allow them to be tested and adopted.

We can deprecate the old ones and eventually remove them but dropping them straight away will break things which is not acceptable or expected by users.

Splitting the removal into a separate PR also makes it easier to revert or backport each set of changes.

@hlein hlein force-pushed the parsers-yaml branch 2 times, most recently from c69c95c to b5b520e Compare November 22, 2025 20:10
@hlein hlein changed the title conf: Convert existing parser*.conf files to yaml, add converter, add tests, drop old conf: Convert existing parser*.conf files to yaml, add converter, add tests Nov 22, 2025
@hlein
Copy link
Author

hlein commented Nov 22, 2025

Fair enough, OK I've dropped the last commit which did the removals, and modified the second-to-last to just add commented-out references to .yaml versions of the parsers, not switch to them.

The exception being examples/documentation - those are now changed to demonstrate/include yaml parsers.

Install parser*.yaml files alongside .conf and add commented-out
references to them in other configs.

Signed-off-by: Hank Leininger <[email protected]>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
conf/kube_elasticsearch.conf (1)

6-6: LGTM! Clean documentation of the YAML alternative.

The commented reference to parsers.yaml effectively documents the new YAML parser format option while maintaining backward compatibility with the existing parsers.conf configuration.

Optional enhancement: Consider adding a brief inline comment explaining the YAML option, such as:

     Parsers_File parsers.conf
-    #Parsers_File parsers.yaml
+    # Alternative: Use parsers.yaml for YAML-format parser definitions
+    #Parsers_File parsers.yaml

This could help users understand the purpose of the alternative, though the current approach is clear enough for most users.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5b520e and ee5a9ae.

📒 Files selected for processing (16)
  • CMakeLists.txt (1 hunks)
  • conf/fluent-bit-macos.conf (1 hunks)
  • conf/fluent-bit-win32.conf (1 hunks)
  • conf/fluent-bit.conf (1 hunks)
  • conf/kube.conf (1 hunks)
  • conf/kube_elasticsearch.conf (1 hunks)
  • conf/rate_limit.conf (2 hunks)
  • cpack/debian/conffiles (1 hunks)
  • dockerfiles/Dockerfile (1 hunks)
  • dockerfiles/Dockerfile.windows (1 hunks)
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf (1 hunks)
  • documentation/examples/multiline/filter_multiline/parsers_multiline.yaml (2 hunks)
  • documentation/examples/multiline/regex-001/fluent-bit.conf (1 hunks)
  • documentation/examples/multiline/regex-001/parsers_multiline.yaml (2 hunks)
  • examples/wasi_serde_json/README.md (2 hunks)
  • src/CMakeLists.txt (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • conf/kube.conf
🚧 Files skipped from review as they are similar to previous changes (9)
  • src/CMakeLists.txt
  • conf/fluent-bit-win32.conf
  • documentation/examples/multiline/regex-001/parsers_multiline.yaml
  • documentation/examples/multiline/filter_multiline/parsers_multiline.yaml
  • examples/wasi_serde_json/README.md
  • documentation/examples/multiline/filter_multiline/fluent-bit.conf
  • conf/fluent-bit-macos.conf
  • conf/rate_limit.conf
  • dockerfiles/Dockerfile.windows
🧰 Additional context used
🧠 Learnings (5)
📓 Common learnings
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.
📚 Learning: 2025-11-16T22:16:26.032Z
Learnt from: hlein
Repo: fluent/fluent-bit PR: 11168
File: conf/parsers_mult.yaml:8-14
Timestamp: 2025-11-16T22:16:26.032Z
Learning: In Fluent Bit parser configurations (both .conf and .yaml formats), the regex engine automatically strips leading and trailing `/` characters from regex patterns, so patterns like `/Processing by .../` and `Processing by ...` are functionally equivalent and both work correctly.

Applied to files:

  • documentation/examples/multiline/regex-001/fluent-bit.conf
  • conf/fluent-bit.conf
  • cpack/debian/conffiles
📚 Learning: 2025-11-21T06:23:29.750Z
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11171
File: include/fluent-bit/flb_lib.h:52-53
Timestamp: 2025-11-21T06:23:29.750Z
Learning: In Fluent Bit core (fluent/fluent-bit repository), function descriptions/documentation are not required for newly added functions in header files.

Applied to files:

  • conf/fluent-bit.conf
📚 Learning: 2025-09-14T09:46:09.531Z
Learnt from: aminvakil
Repo: fluent/fluent-bit PR: 10844
File: conf/fluent-bit:13-15
Timestamp: 2025-09-14T09:46:09.531Z
Learning: For fluent-bit Debian packaging, /opt/fluent-bit/bin/ is the appropriate installation path since the package may be installed from non-official Debian sources, making /opt compliant with FHS for optional software packages.

Applied to files:

  • dockerfiles/Dockerfile
  • cpack/debian/conffiles
📚 Learning: 2025-08-31T12:46:11.940Z
Learnt from: ThomasDevoogdt
Repo: fluent/fluent-bit PR: 9277
File: .github/workflows/pr-compile-check.yaml:147-151
Timestamp: 2025-08-31T12:46:11.940Z
Learning: In fluent-bit CMakeLists.txt, the system library preference flags are defined as FLB_PREFER_SYSTEM_LIB_ZSTD and FLB_PREFER_SYSTEM_LIB_KAFKA with the FLB_ prefix.

Applied to files:

  • CMakeLists.txt
  • cpack/debian/conffiles
🔇 Additional comments (5)
conf/fluent-bit.conf (1)

29-29: LGTM! Helpful documentation for the YAML alternative.

The commented line provides clear guidance for users who want to try the new YAML parser format while keeping the default .conf format active. This approach aligns well with the PR's goal of introducing YAML support without breaking existing configurations.

CMakeLists.txt (1)

1526-1526: LGTM! RPM configuration file handling is correct.

The addition of parsers.yaml to the RPM filelist with %config(noreplace) is appropriate. Both parsers.conf (line 1525) and parsers.yaml are now properly marked as non-replaceable configuration files, ensuring user customizations are preserved during package upgrades.

cpack/debian/conffiles (1)

2-2: LGTM! Debian configuration file handling is correct.

The addition of /etc/fluent-bit/parsers.yaml to the conffiles list is appropriate. Both parsers.conf (line 1) and parsers.yaml are now listed, ensuring dpkg preserves user modifications during package upgrades. This aligns with the backward-compatible migration strategy.

dockerfiles/Dockerfile (1)

109-114: LGTM! YAML parser files properly added to container image.

The additions mirror the existing .conf parser files (lines 103-108), providing YAML versions alongside the original configuration format. This maintains backward compatibility during the migration. The COPY command syntax is correct and the destination is appropriate.

documentation/examples/multiline/regex-001/fluent-bit.conf (1)

4-5: LGTM! Example updated to demonstrate YAML parser format.

The change appropriately updates this example to reference the new YAML parser file while keeping the old .conf reference as a commented alternative. This aligns with the PR's goal of demonstrating YAML parsers in examples while retaining backwards compatibility.

Since the PR objectives mention that FLB_CONFIG_YAML remains optional, please verify that YAML parser files (.yaml) can be loaded via parsers_file regardless of the FLB_CONFIG_YAML build flag, or clarify any requirements for users who want to use this example. Run the following to check if there are any conditional compilation guards around YAML parser file loading:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants