feat(cli): Handle reload based on referenced file change #22539

Merged: 18 commits into vectordotdev:master, Mar 17, 2025

Conversation

@gllb (Contributor) commented Feb 28, 2025

Summary

The TLS crt_file and key_file from http sinks are now part of the watcher list, so a change to either file triggers a Vector reload.

Ref #17283
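A conceptual sketch of the idea, using hypothetical names rather than Vector's actual API: the -w watcher is fed not only the config files themselves, but also any files the config references.

```rust
use std::path::PathBuf;

// Conceptual sketch only; `paths_to_watch` and its signature are
// hypothetical, not Vector's real watcher API.
fn paths_to_watch(config_paths: &[PathBuf], referenced_files: &[PathBuf]) -> Vec<PathBuf> {
    config_paths
        .iter()
        .chain(referenced_files) // e.g. crt_file and key_file of an http sink
        .cloned()
        .collect()
}
```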

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

How did you test this PR?

Adding this sink to config/vector.yaml:

```yaml
http_test:
  type: "http"
  inputs: ["parse_logs"]
  uri: "http://test"
  encoding:
    codec: "json"
    json:
      pretty: true
  tls:
    crt_file: /etc/ssl/tls-cert/tls.crt
    key_file: /etc/ssl/tls-cert/tls.key
```

Then execute:

```sh
vector -vvv -c config/vector.yaml -w
```

In parallel, run:

```sh
cd /etc/ssl/tls-cert/ && openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -sha256 -days 3650 -nodes -subj "/C=XX/ST=StateName/L=CityName/O=CompanyName/OU=CompanySectionName/CN=CommonNameOrHostname"
```
The Vector logs then confirm the reload (screenshot omitted).

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

  • Please read our Vector contributor resources.
    • make check-all is a good command to run locally. This check is
      defined here. Some of these
      checks might not be relevant to your PR. For Rust changes, at the very least you should run:
      • cargo fmt --all
      • cargo clippy --workspace --all-targets -- -D warnings
      • cargo nextest run --workspace (alternatively, you can run cargo test --all)
  • If this PR introduces changes to Vector dependencies (modifies Cargo.lock), please
    run dd-rust-license-tool write to regenerate the license inventory and commit the changes (if any). More details here.

References

Ref: #10877

@gllb gllb requested a review from a team as a code owner February 28, 2025 11:12
@github-actions bot added the domain: topology (Anything related to Vector's topology code) and domain: sinks (Anything related to the Vector's sinks) labels Feb 28, 2025
@pront (Member) left a comment

Left a few nits but overall this looks great. Thank you @gllb

@gllb (Contributor, Author) commented Mar 6, 2025

I'm not able to run cargo nextest run --workspace as it consumes too much memory and gets killed. But all clippy issues are fixed.

@gllb (Contributor, Author) commented Mar 10, 2025

@pront, is there anything I need to do to move this forward?

```diff
@@ -543,6 +547,11 @@ impl RunningTopology {
                .is_some()
            }))
            .collect::<Vec<_>>();

        if let Some(mut components) = components_to_reload {
            sinks_to_change.append(&mut components)
```
@pront (Member) commented:

I have a few more thoughts on this implementation. What you have now is good and works for the HTTP sink. However, we can go further:

  1. We should support reloading any component (sources, transforms, sinks).
  2. Looking at files_to_watch, it is specific to the http sink. Many components support the same TLS settings, so it should be easy to write a reusable util that exposes these paths for all components that use them. (We can start by duplicating code if needed and think about DRY later.) A sketch of the idea follows.
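A minimal sketch of such a reusable util, with simplified stand-in types; TlsConfig here is illustrative, not Vector's actual type:

```rust
use std::path::PathBuf;

// Simplified stand-in for the TLS settings shared by many components;
// field names mirror the config keys from the example above.
pub struct TlsConfig {
    pub crt_file: Option<PathBuf>,
    pub key_file: Option<PathBuf>,
    pub ca_file: Option<PathBuf>,
}

impl TlsConfig {
    /// Collect every file path referenced by these TLS settings so the
    /// config watcher can subscribe to changes on them.
    pub fn files_to_watch(&self) -> Vec<PathBuf> {
        [&self.crt_file, &self.key_file, &self.ca_file]
            .into_iter()
            .flatten()
            .cloned()
            .collect()
    }
}

/// Any component (source, transform, or sink) could opt in by exposing
/// the runtime file paths it depends on; the default is no paths.
pub trait WatchedPaths {
    fn files_to_watch(&self) -> Vec<PathBuf> {
        Vec::new()
    }
}
```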

@gllb (Contributor, Author) replied:

So you want this PR to also implement this for all components? Or just to make sure the code can be extended in a future change, i.e. that shutdown_diff can handle source and transform changes as well?

@pront (Member) replied:

Hi @gllb, I am looking at this ticket: #10877.

So we have a few choices here:

  1. accept the PR as is and follow up with multiple PRs (it doesn't have to be you, btw)
  2. enhance this PR to handle all components (sources, transforms, sinks)
  3. choice 2 plus handling all components that expose TLS settings

Perhaps choice (2) is good here.

@gllb (Contributor, Author) replied Mar 12, 2025:

OK for choice 2. That means, in fact: integrate with ConfigDiff instead of patching shutdown_diff, as stated in your other comment, right?

@pront (Member) replied:

Yes. In order to simplify further reviews, I went ahead and enabled merging for this PR. I would appreciate it if you follow up with a PR to handle all components and a smarter config diff, but it's understandable if you don't have time for that. I think this PR is a big step in the right direction.

```diff
@@ -336,9 +336,32 @@ async fn handle_signal(
    allow_empty_config: bool,
) -> Option<SignalTo> {
    match signal {
        Ok(SignalTo::ReloadComponents(component_keys)) => {
```
@pront (Member) commented Mar 12, 2025:

Some implementation details:

  1. We can add component_keys (or whatever struct we end up with) as a new RunningTopology field to avoid passing them around to all these functions.
  2. Then, we can use this new field in async fn shutdown_diff(...).
  3. We can probably add the notion of changed watched paths to src/config/diff.rs (see the sketch below).
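A rough sketch of idea (3), with simplified stand-in types; the real ConfigDiff in src/config/diff.rs differs in detail:

```rust
use std::collections::HashSet;

// Simplified stand-ins; Vector's ComponentKey and ConfigDiff are richer.
type ComponentKey = String;

#[derive(Default)]
struct Difference {
    to_change: HashSet<ComponentKey>,
}

#[derive(Default)]
struct ConfigDiff {
    sinks: Difference,
}

impl ConfigDiff {
    /// Mark components whose referenced files (e.g. TLS certs) changed on
    /// disk as "to change", even though their config text is identical.
    fn flag_watched_path_changes(&mut self, keys: &[ComponentKey]) {
        for key in keys {
            // This sketch assumes every key is a sink; a real version
            // would need to know each key's component type.
            self.sinks.to_change.insert(key.clone());
        }
    }
}
```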

@gllb (Contributor, Author) replied:

The component_keys here are passed when a change is detected by the watcher. If we make it a field of RunningTopology, it is supposed to be empty almost all the time, except when a change is detected, which will still imply passing it to reload_config_and_respawn anyway, right? Or maybe I don't get it?

For the notion of watched-path changes in ConfigDiff, I think I see what you mean.

@gllb (Contributor, Author) commented Mar 12, 2025:

@pront, so I did implement this logic in ConfigDiff, and it does what it should. Only this part, https://github.com/gllb/vector/blob/master/src/topology/running.rs#L490-L509, causes the sink to be reused and therefore not reloaded.
In the implementation where everything is done inside shutdown_diff, it's easy to work around: I can append to sinks_to_change afterwards, so the buffer-reuse code doesn't even know about it. But since it's in ConfigDiff now, I don't see how to work around it. Any idea?

@gllb (Contributor, Author) replied:

I could remove that part, but I don't know whether that's safe to do.

@pront (Member) replied:

Glad we agree on the src/config/diff.rs enhancement.

What is more important is how to keep track of the component type. I don't think we currently have an enum or similar to model this, but we probably need to associate a ComponentKey with a new ComponentType. Thinking about it, I realize this might be better done in a follow-up PR, since it might need some discussion.

> If we use it as a field of RunningTopology then it supposed to be empty almost all the time, except when a change is detected.

Correct. It will be empty until the watcher detects a file change, which pushes keys into it; processing a reload then clears the whole thing. I am a bit ambivalent on this, so feel free to TIOLI (take it or leave it).

> Which will still imply passing it in reload_config_and_respawn anyway, right?

To be clear, I was proposing adding it here:

pub struct RunningTopology {

reload_config_and_respawn will then have access to it via self.
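A minimal sketch of that proposal, with a hypothetical field and method names:

```rust
use std::collections::HashSet;

type ComponentKey = String; // stand-in for Vector's real ComponentKey

pub struct RunningTopology {
    /// Empty almost all of the time: the watcher pushes keys in here
    /// when a referenced file changes, and the next reload drains it.
    components_to_reload: HashSet<ComponentKey>,
}

impl RunningTopology {
    /// Called when the watcher detects a change to a file that `key` references.
    pub fn flag_for_reload(&mut self, key: ComponentKey) {
        self.components_to_reload.insert(key);
    }

    /// Called from reload_config_and_respawn (via self): take and clear the set.
    pub fn take_components_to_reload(&mut self) -> HashSet<ComponentKey> {
        std::mem::take(&mut self.components_to_reload)
    }
}
```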

@gllb (Contributor, Author) replied:

OK, I can create a follow-up PR for it, but I'm not sure I will have a lot of time; let's see. I still have no idea about the buffer-reuse struggle I pointed out in shutdown_diff, but I will open a PR with what I have so anybody can contribute as well.

@pront pront enabled auto-merge March 12, 2025 17:55
@pront (Member) commented Mar 12, 2025

A failed test prevented this from merging: https://github.com/vectordotdev/vector/actions/runs/13790530146/job/38656922831?pr=22539

auto-merge was automatically disabled March 13, 2025 09:30

Head branch was pushed to by a user without write access

@gllb (Contributor, Author) commented Mar 13, 2025

Hi @pront, hopefully the last fix I pushed solves it. I struggled a bit to run the tests locally as they are very demanding on memory.

@pront (Member) commented Mar 13, 2025

@gllb thank you for the fast iterations on this. If possible, avoid force pushing so I can do incremental reviews. I approved the workflows again.

@pront (Member) commented Mar 13, 2025

Test failures persist. You can iterate faster locally with cargo test.

@gllb (Contributor, Author) commented Mar 14, 2025

@pront, cargo test consumes too much memory locally and gets killed; reducing the number of jobs to 1 (-j 1) or scoping to a single test unfortunately leads to the same result. If you have any clue how to reduce this memory consumption, please let me know.

Edit: I managed to find options in nextest to reduce the number of build and test jobs, so the changed tests now pass successfully.

@gllb gllb requested a review from pront March 17, 2025 14:27
```diff
@@ -269,8 +268,23 @@ mod tests {
        {
            panic!("Test timed out");
        }
    }

    #[tokio::test]
```
@pront (Member) commented:

Thank you for adding a test and fixing the failure. I realize that these tests have several pre-existing problems (not introduced by this PR):

  • they are disabled on macOS
  • they use arbitrary delays, e.g. 3 seconds and then 3*5
  • they panic instead of returning an error

But again, this is out of scope for this PR.

@pront pront enabled auto-merge March 17, 2025 16:30
@pront pront added this pull request to the merge queue Mar 17, 2025
Merged via the queue into vectordotdev:master with commit 5e392ad Mar 17, 2025
85 checks passed