Skip to content

Conversation

@ahayworth
Copy link

@ahayworth ahayworth commented Oct 31, 2025

There are certain kinds of invalid configurations that pass the
--dry-run check, but fail to start at run time. This commit addresses
the issue by re-using some of the hot reload validation logic.

For example, this config is trivially invalid, passes --dry-run, and
fails at runtime:

pipeline:
  inputs:
  - name: dummy
    tag: test
    invalid_property_that_does_not_exist: some_value
  outputs:
  - name: stdout
    match: '*'
zsh ❮ fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

configuration test is successful

zsh ❮ fluent-bit -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:05:55.268532000] [ info] [fluent bit] version=4.1.1, commit=, pid=21037
[2025/10/31 16:05:55.268808000] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/10/31 16:05:55.269140000] [ info] [simd    ] disabled
[2025/10/31 16:05:55.269147000] [ info] [cmetrics] version=1.0.5
[2025/10/31 16:05:55.269407000] [ info] [ctraces ] version=0.6.6
[2025/10/31 16:05:55.269476000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:05:55.269486000] [ help] try the command: fluent-bit -i dummy -h

[2025/10/31 16:05:55.269515000] [error] [engine] input initialization failed

With this commit, we can see that the additional validation from hot
reload catches the error right away:

zsh ❯ bin/fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:07:50.402568000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:07:50.402792000] [ help] try the command: bin/fluent-bit -i dummy -h

[2025/10/31 16:07:50.402800000] [error] [reload] check properties for input plugins is failed

(The logs of course now say [reload], which is a little misleading...
we can clean that up, if desired).


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change (included in the commit/PR description)
  • Debug log output from testing the change (included in the commit/PR description)
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • [N/A] Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Dry-run now performs an explicit configuration validation step, always runs cleanup, and exits with failure when unknown or invalid configuration properties are detected, ensuring clearer failure signals during config testing.
  • Tests

    • Added an automated shell test that verifies dry-run returns a non-zero exit, reports the unknown-property error, and logs a reload-validation failure when a config contains invalid properties.

@coderabbitai
Copy link

coderabbitai bot commented Oct 31, 2025

Walkthrough

In dry-run mode, fluent-bit now calls flb_reload_property_check_all(config), captures its return value, always performs cleanup (destroying config/context), and exits non-zero on validation failure or prints a success message and exits normally on success.

Changes

Cohort / File(s) Change Summary
Dry-run mode validation
src/fluent-bit.c
Added call to flb_reload_property_check_all(config) in the --dry-run path; return value stored, cleanup always executed, process exits non-zero on validation failure or prints success and exits normally on success.
Dry-run invalid-property test
tests/runtime_shell/dry_run_invalid_property.sh
New shell test that writes a temp YAML with an invalid property, runs fluent-bit with --dry-run, asserts a non-zero exit code, checks for an unknown property error and a reload-validation failure message, and cleans up temp files.
Test registration
tests/runtime_shell/CMakeLists.txt
Added dry_run_invalid_property.sh to the UNIT_TESTS_SH list so the new test is discovered and run by the test harness.

Sequence Diagram

sequenceDiagram
    participant User
    participant App as "fluent-bit (--dry-run)"
    participant Validator as "flb_reload_property_check_all"

    User->>App: Start with --dry-run
    Note over App: Parse config, init env/context
    App->>Validator: flb_reload_property_check_all(config)
    alt Validation fails (non-zero)
        Validator-->>App: non-zero (failure)
        Note over App: Perform cleanup (destroy config/context)
        App-->>User: Exit with failure (print validation failure)
    else Validation succeeds (zero)
        Validator-->>App: zero (success)
        Note over App: Perform cleanup (destroy config/context)
        App-->>User: Print "configuration test succeeded" and exit (success)
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Areas needing extra attention:
    • Return semantics and side effects of flb_reload_property_check_all().
    • Ensuring cleanup always runs and does not mask errors.
    • Test robustness around temp file handling and exact output matching.

Suggested reviewers

  • edsiper
  • koleini
  • fujimotos

Poem

🐰 I nibble through YAML, sniff each property bright,

I run a dry-hop check before dusk turns to night.
I tidy the burrow, then sweep with delight,
If something's awry I thump — otherwise, all's right. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main change: adding additional validation logic to the --dry-run mode by reusing hot-reload validation, which is the core objective of this PR.
✨ Finishing touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0e3f6de and 8180652.

📒 Files selected for processing (1)
  • tests/runtime_shell/CMakeLists.txt (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/runtime_shell/CMakeLists.txt

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 93d3fb3 and 57442a0.

📒 Files selected for processing (1)
  • src/fluent-bit.c (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/fluent-bit.c (1)
src/flb_reload.c (1)
  • flb_reload_property_check_all (208-245)

@edsiper
Copy link
Member

edsiper commented Nov 1, 2025

@codex review

@chatgpt-codex-connector
Copy link

Codex Review: Didn't find any major issues. Already looking forward to the next diff.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@cosmo0920
Copy link
Contributor

Hi, could you rebase off master?
After rebasing off the current master, I can confirm like as a below even if on macOS:

% bin/fluent-bit -c  invalid_config.yaml --dry-run
Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __  
|  ___| |                | |   | ___ (_) |           /   | /  | 
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| | 
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | | 
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/


[2025/11/07 15:17:51.675942000] [ info] Configuration:
[2025/11/07 15:17:51.675948000] [ info]  flush time     | 1.000000 seconds
[2025/11/07 15:17:51.675952000] [ info]  grace          | 5 seconds
[2025/11/07 15:17:51.675954000] [ info]  daemon         | 0
[2025/11/07 15:17:51.675955000] [ info] ___________
[2025/11/07 15:17:51.675957000] [ info]  inputs:
[2025/11/07 15:17:51.675959000] [ info]      dummy
[2025/11/07 15:17:51.675960000] [ info] ___________
[2025/11/07 15:17:51.675962000] [ info]  filters:
[2025/11/07 15:17:51.675963000] [ info] ___________
[2025/11/07 15:17:51.675965000] [ info]  outputs:
[2025/11/07 15:17:51.675966000] [ info]      stdout.0
[2025/11/07 15:17:51.675968000] [ info] ___________
[2025/11/07 15:17:51.675969000] [ info]  collectors:
[2025/11/07 15:17:51.675986000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/11/07 15:17:51.675991000] [ help] try the command: bin/fluent-bit -i dummy -h

[2025/11/07 15:17:51.675994000] [error] [reload] check properties for input plugins is failed

Copy link
Contributor

@cosmo0920 cosmo0920 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch is good but it needs to be rebased off master.

Copy link
Collaborator

@patrick-stephens patrick-stephens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect some tests as well to validate this to ensure we do not regress in the future as well.

There are certain kinds of invalid configurations that pass the
`--dry-run` check, but fail to start at run time. This commit addresses
the issue by re-using some of the hot reload validation logic.

For example, this config is trivially invalid, passes `--dry-run`, and
fails at runtime:

```
pipeline:
  inputs:
  - name: dummy
    tag: test
    invalid_property_that_does_not_exist: some_value
  outputs:
  - name: stdout
    match: '*'
```

```
zsh ❮ fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

configuration test is successful

zsh ❮ fluent-bit -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.1.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:05:55.268532000] [ info] [fluent bit] version=4.1.1, commit=, pid=21037
[2025/10/31 16:05:55.268808000] [ info] [storage] ver=1.5.3, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/10/31 16:05:55.269140000] [ info] [simd    ] disabled
[2025/10/31 16:05:55.269147000] [ info] [cmetrics] version=1.0.5
[2025/10/31 16:05:55.269407000] [ info] [ctraces ] version=0.6.6
[2025/10/31 16:05:55.269476000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:05:55.269486000] [ help] try the command: fluent-bit -i dummy -h

[2025/10/31 16:05:55.269515000] [error] [engine] input initialization failed
```

With this commit, we can see that the additional validation from hot
reload catches the error right away:

```
zsh ❯ bin/fluent-bit --dry-run -c ~/tmp/fbconfig-bad-property.yaml
Fluent Bit v4.2.0
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _             ___   __
|  ___| |                | |   | ___ (_) |           /   | /  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |  | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |__| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)___/

[2025/10/31 16:07:50.402568000] [error] [config] dummy: unknown configuration property 'invalid_property_that_does_not_exist'. The following properties are allowed: samples, dummy, metadata, rate, interval_sec, interval_nsec, copies, start_time_sec, start_time_nsec, fixed_timestamp, flush_on_startup, and test_hang_on_exit.
[2025/10/31 16:07:50.402792000] [ help] try the command: bin/fluent-bit -i dummy -h

[2025/10/31 16:07:50.402800000] [error] [reload] check properties for input plugins is failed
```

(The logs of course now say `[reload]`, which is a little misleading...
we can clean that up, if desired).

Signed-off-by: Andrew Hayworth <[email protected]>
We need to perform the same partial initialization and releasing of
resources whether or not the dry-run was successful, so this commit
ensures that we call those functions before exiting either way.

Signed-off-by: Andrew Hayworth <[email protected]>
@ahayworth ahayworth force-pushed the ahayworth/sc-149733/additional-dry-run-validation branch from 538df01 to 1a9755a Compare November 10, 2025 15:36
This adds a runtime_shell test that runs fluent-bit in dry-run mode and
ensures we are running the additional validation recently added.
Specifically, we test that configuration property validation is
happening, which is borrowed from the hot reload code and was not
previously used for dry-run validation.

Signed-off-by: Andrew Hayworth <[email protected]>
@ahayworth ahayworth force-pushed the ahayworth/sc-149733/additional-dry-run-validation branch from 1a9755a to d180b19 Compare November 10, 2025 15:36
@ahayworth
Copy link
Author

@patrick-stephens @cosmo0920 I've rebased and added a test - I'm not sure how extensive we'd like test coverage to be here, but if needed I can add additional cases.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
src/fluent-bit.c (1)

1438-1451: Cleanup logic now correctly handles both success and failure paths.

The updated implementation properly addresses the resource leak issue raised in the previous review. Resources (cf_opts and ctx) are now cleaned up before exiting in both the success and failure cases, which prevents resource leaks during dry-run validation.

The flow is clear:

  1. Validate configuration
  2. Always perform cleanup
  3. Exit based on validation result

Minor: Comment placement could be more precise.

The comment on line 1441 states "config test is done" but appears before the validation result is checked (lines 1446-1448). Consider moving this comment to line 1439 or rephrasing it to be more precise about the sequence of operations.

tests/runtime_shell/dry_run_invalid_property.sh (2)

11-24: Consider using unique temporary file names.

The test uses a hardcoded path /tmp/dry_run_invalid_property.yaml which could cause conflicts if multiple test instances run in parallel. Consider using mktemp to generate unique temporary file names:

-cat > /tmp/dry_run_invalid_property.yaml << EOL
+TEMP_CONFIG=$(mktemp /tmp/dry_run_invalid_property_XXXXXX.yaml)
+cat > $TEMP_CONFIG << EOL

Then reference $TEMP_CONFIG throughout the script and update the cleanup accordingly.


30-32: Same consideration applies to output file.

The output file also uses a hardcoded path. If you adopt mktemp for the config file (as suggested above), apply the same pattern here:

-OUTPUT_FILE="/tmp/dry_run_invalid_property_output.txt"
+OUTPUT_FILE=$(mktemp /tmp/dry_run_invalid_property_output_XXXXXX.txt)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 538df01 and d180b19.

📒 Files selected for processing (2)
  • src/fluent-bit.c (1 hunks)
  • tests/runtime_shell/dry_run_invalid_property.sh (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/fluent-bit.c (3)
src/flb_reload.c (1)
  • flb_reload_property_check_all (208-245)
src/flb_lib.c (2)
  • flb_init_env (128-141)
  • flb_destroy (228-263)
src/config_format/flb_config_format.c (1)
  • flb_cf_destroy (151-157)
🔇 Additional comments (2)
tests/runtime_shell/dry_run_invalid_property.sh (2)

1-10: LGTM! Environment setup follows good practices.

The environment variable handling with fallback defaults is robust and follows common patterns for shell test scripts. The echo statement on line 9 aids in debugging and test transparency.


42-64: LGTM! Comprehensive validation and proper cleanup.

The test correctly validates all three expected behaviors:

  1. Non-zero exit code from dry-run
  2. Error message about the unknown property
  3. Reload validation failure message

The use of || true on lines 43-44 prevents grep failures from prematurely terminating the script, and the cleanup (lines 47-49) ensures temporary files are removed before the final exit. The detailed diagnostic output on failure (lines 60-62) will aid in debugging test failures.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you just confirm this actually runs in CI @ahayworth ?

Copy link
Author

@ahayworth ahayworth Nov 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦 I did forget to add the test to the list to run, and added that here: 8180652

I'm not 100% sure which CI job would actually run those tests, though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approved to run CI tests. Waiting for the results.

This was an oversight on my part - we actually need to list this as a
test for it to be run.

Signed-off-by: Andrew Hayworth <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants