@titaneric titaneric commented Oct 15, 2025

Summary

As discussed in #23597 and #23982, this PR adds support for tailing pod logs through the Kubernetes logs API.

Features Implemented

  • Kubernetes API Logs Collection: Added log_collection_strategy configuration option to enable log collection via Kubernetes API instead of file-based tailing
  • Event-driven Reconciler: Implemented a reconciler that watches pod events and automatically starts/stops log tailers for running pods
  • Container Log Streaming: Added real-time log streaming from Kubernetes API with proper timestamp tracking and container identification
  • Pod Information Management: Created PodInfo struct for essential pod metadata extraction and container tracking
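
The event-driven reconciler described above can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in this PR: `ContainerKey`, `PodEvent`, `Tailer`, and `Reconciler` are hypothetical names, and a real implementation would derive events from a pod watcher and hold a task handle per stream.

```rust
use std::collections::HashMap;

/// Identifies one container's log stream (hypothetical key type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct ContainerKey {
    namespace: String,
    pod: String,
    container: String,
}

/// Simplified pod events; assumed to be derived from a pod watcher.
#[derive(Debug)]
enum PodEvent {
    /// The pod is running with the given containers.
    Running { namespace: String, pod: String, containers: Vec<String> },
    /// The pod was deleted or left the running phase.
    Gone { namespace: String, pod: String },
}

/// Stand-in for a running log tailer; a real one would own a task handle.
#[derive(Debug)]
struct Tailer {
    /// Sequence number of the event that started this tailer.
    started_by_event: usize,
}

#[derive(Default)]
struct Reconciler {
    tailers: HashMap<ContainerKey, Tailer>,
    events_seen: usize,
}

impl Reconciler {
    /// Applies one event: starts tailers for new containers and stops
    /// tailers that belong to pods that are gone.
    fn apply(&mut self, event: PodEvent) {
        self.events_seen += 1;
        let seq = self.events_seen;
        match event {
            PodEvent::Running { namespace, pod, containers } => {
                for container in containers {
                    let key = ContainerKey {
                        namespace: namespace.clone(),
                        pod: pod.clone(),
                        container,
                    };
                    // Idempotent: a repeated event does not restart a tailer.
                    self.tailers
                        .entry(key)
                        .or_insert(Tailer { started_by_event: seq });
                }
            }
            PodEvent::Gone { namespace, pod } => {
                // Dropping the Tailer stands in for cancelling its stream.
                self.tailers
                    .retain(|k, _| !(k.namespace == namespace && k.pod == pod));
            }
        }
    }

    /// Returns the active streams as sorted "namespace/pod/container" strings.
    fn active(&self) -> Vec<String> {
        let mut keys: Vec<String> = self
            .tailers
            .keys()
            .map(|k| format!("{}/{}/{}", k.namespace, k.pod, k.container))
            .collect();
        keys.sort();
        keys
    }
}
```

Keying tailers by container rather than by pod is one possible design choice; it lets each container of a multi-container pod keep its own stream and position.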

TODO Items

  • Batched Line Sending: Send lines to the event channel in batches instead of one at a time
  • Position Tracking: Implement timestamp-based position management for log continuity
  • Error Handling: Add comprehensive error handling and retry mechanisms
  • Metrics Integration: Add metrics for API log collection monitoring
  • Metadata Annotation: Complete integration with pod metadata annotation pipeline
  • Performance Optimization: Implement connection pooling and efficient streaming
  • Testing Suite: Add comprehensive unit and integration tests
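
The batching and position-tracking items could fit together along these lines. The Kubernetes pod log endpoint accepts `timestamps` and `sinceTime` parameters, so a stored timestamp can serve as a resume point. Everything else here is an assumption for illustration: `Line`, `BatchingTracker`, and the policy of advancing the checkpoint only when a batch is handed out are hypothetical and not taken from this PR.

```rust
use std::collections::HashMap;

/// One log line plus the timestamp the API reported for it.
#[derive(Debug, Clone, PartialEq)]
struct Line {
    /// Kept as an opaque string, e.g. "2025-10-15T10:00:03Z".
    timestamp: String,
    message: String,
}

/// Buffers lines per container and remembers the last flushed timestamp.
struct BatchingTracker {
    max_batch: usize,
    pending: HashMap<String, Vec<Line>>,
    checkpoints: HashMap<String, String>,
}

impl BatchingTracker {
    fn new(max_batch: usize) -> Self {
        Self {
            max_batch,
            pending: HashMap::new(),
            checkpoints: HashMap::new(),
        }
    }

    /// Buffers a line; returns a batch once `max_batch` lines are pending.
    fn push(&mut self, container: &str, line: Line) -> Option<Vec<Line>> {
        let buf = self.pending.entry(container.to_string()).or_default();
        buf.push(line);
        let full = buf.len() >= self.max_batch;
        if full {
            self.flush(container)
        } else {
            None
        }
    }

    /// Hands out the pending lines and advances the checkpoint to the
    /// timestamp of the last line in the batch.
    fn flush(&mut self, container: &str) -> Option<Vec<Line>> {
        let batch = self.pending.remove(container)?;
        let last = batch.last()?;
        self.checkpoints
            .insert(container.to_string(), last.timestamp.clone());
        Some(batch)
    }

    /// Timestamp to resume from (for example as `sinceTime`) after a restart.
    fn resume_from(&self, container: &str) -> Option<&str> {
        self.checkpoints.get(container).map(String::as_str)
    }
}
```

Resuming from the last flushed timestamp may replay lines that carry that same timestamp, so a real implementation would likely need some deduplication at the boundary.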

Vector configuration

api:
  enabled: true
  graphql: true
  playground: true
  address: "127.0.0.1:8686"
sources:
  my_source_id:
    type: kubernetes_logs
    data_dir: /tmp/vector-k8s-logs
    kube_config_file: ./kind-kubeconfig
    self_node_name: kind-worker
    api_log: true
sinks:
  my_sink_id:
    type: console
    inputs:
    - my_source_id
    encoding:
      codec: raw_message

How did you test this PR?

Given the following kind cluster config named kind-cluster.yaml

kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker

Create the kind cluster from the config

kind create cluster --config kind-cluster.yaml

Export the cluster's kubeconfig to kind-kubeconfig

kind export kubeconfig --kubeconfig kind-kubeconfig

Save the following manifest as pod.yaml and apply it to the kind cluster

kubectl apply -f pod.yaml --kubeconfig kind-kubeconfig

apiVersion: v1
kind: Pod
metadata:
  name: multi-container-test-pod
  labels:
    app: multi-container-test
    vector.dev/exclude: "false"
spec:
  containers:
  - name: logger-a
    image: busybox:1.35
    command: [ "/bin/sh" ]
    args:
    - -c
    - |
      echo "Container A starting at $(date)"
      counter=1
      while true; do
        echo "[A-${counter}] Container A log: $(date -Iseconds) - Testing multi-container timestamp tracking"
        counter=$((counter + 1))
        sleep 3
      done
    resources:
      requests:
        memory: "32Mi"
        cpu: "50m"
      limits:
        memory: "64Mi"
        cpu: "100m"
  - name: logger-b
    image: busybox:1.35
    command: [ "/bin/sh" ]
    args:
    - -c
    - |
      echo "Container B starting at $(date)"
      counter=1
      while true; do
        echo "[B-${counter}] Container B log: $(date -Iseconds) - Different container, same pod"
        counter=$((counter + 1))
        sleep 7
      done
    resources:
      requests:
        memory: "32Mi"
        cpu: "50m"
      limits:
        memory: "64Mi"
        cpu: "100m"
  restartPolicy: Always

Build Vector with the necessary features and run it with the config above, saved as vector.yaml.

cargo build --no-default-features --features sources-kubernetes_logs --features sinks-console --features api
./target/debug/vector --config vector.yaml -v

Change Type

  • Bug fix
  • New feature
  • Non-functional (chore, refactoring, docs)
  • Performance

Is this a breaking change?

  • Yes
  • No

Does this PR include user facing changes?

  • Yes. Please add a changelog fragment based on our guidelines.
  • No. A maintainer will apply the no-changelog label to this PR.

References

Notes

  • Please read our Vector contributor resources.
  • Do not hesitate to use @vectordotdev/vector to reach out to us regarding this PR.
  • Some CI checks run only after we manually approve them.
    • We recommend adding a pre-push hook, please see this template.
    • Alternatively, we recommend running the following locally before pushing to the remote branch:
      • make fmt
      • make check-clippy (if there are failures it's possible some of them can be fixed with make clippy-fix)
      • make test
  • After a review is requested, please avoid force pushes to help us review incrementally.
    • Feel free to push as many commits as you want. They will be squashed into one before merging.
    • For example, you can run git merge origin master and git push.
  • If this PR changes Vector's dependencies (modifies Cargo.lock), please
    run make build-licenses to regenerate the license inventory and commit the changes (if any). More details here.

@github-actions github-actions bot added the domain: sources label Oct 15, 2025
rotate_wait: Duration,

/// Whether use k8s logs API or not
api_log: bool,
Member

Nit: it would be better to name this log_collection_strategy and support disk and api as values

Contributor Author

Nice idea!
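
A rough sketch of what the suggested option could look like as a type, assuming the two values are spelled `disk` and `api` as in the review comment. The enum name `LogCollectionStrategy`, the default of `disk`, and parsing through `FromStr` are assumptions made to keep the example self-contained; Vector's actual configuration code derives its parsing differently.

```rust
use std::str::FromStr;

/// How the source collects container logs (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum LogCollectionStrategy {
    /// Tail log files on the node's disk. Assumed to be the default so
    /// that existing configurations keep their behavior.
    #[default]
    Disk,
    /// Stream logs through the Kubernetes logs API.
    Api,
}

impl FromStr for LogCollectionStrategy {
    type Err = String;

    /// Parses the configured value; anything else is rejected with a message.
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "disk" => Ok(Self::Disk),
            "api" => Ok(Self::Api),
            other => Err(format!(
                "unknown log_collection_strategy {other:?}, expected \"disk\" or \"api\""
            )),
        }
    }
}
```

Compared with a boolean such as api_log, an enum leaves room for further strategies without adding more flags.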

@titaneric titaneric marked this pull request as ready for review October 17, 2025 17:49
@titaneric titaneric requested a review from a team as a code owner October 17, 2025 17:49
@titaneric
Contributor Author

@pront, I have introduced the new config option log_collection_strategy and have done a lot of refactoring since.
I expect the code review to be time-consuming, so please take your time.

I hope the overall structure is good. We can then focus later on the temporary workaround for handling the Line and on the rest of the event processing pipeline for the api log collection strategy.

return;
// TODO: fix it until `FunctionTransform` supports Api logs
// emit!(ParserMatchError { value: &s[..] });
drop(log.insert(&message_path, Value::Bytes(s)));
Contributor Author

@titaneric titaneric Oct 17, 2025

I know this is an ugly workaround. To keep the PR at a reviewable size, I dropped the implementation of the FunctionTransform trait for Api here (it would be very similar to how Cri handles the logs).
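
To illustrate the similarity: a CRI log file line has the shape `<timestamp> <stream> <P|F> <message>`, while a line from the logs API requested with timestamps has the shape `<timestamp> <message>`. The sketch below parses both with the standard library only. The names `CriLine`, `ApiLine`, `parse_cri`, and `parse_api` are hypothetical, and whether this PR requests timestamps from the API is an assumption.

```rust
/// A CRI log file line: `<timestamp> <stream> <P|F> <message>`.
#[derive(Debug, PartialEq)]
struct CriLine<'a> {
    timestamp: &'a str,
    stream: &'a str,
    /// True for a partial line (tag `P`), false for a full line (tag `F`).
    is_partial: bool,
    message: &'a str,
}

/// A logs API line requested with timestamps: `<timestamp> <message>`.
#[derive(Debug, PartialEq)]
struct ApiLine<'a> {
    timestamp: &'a str,
    message: &'a str,
}

/// Splits a CRI line into its four fields; returns None if it is malformed.
fn parse_cri(line: &str) -> Option<CriLine<'_>> {
    let mut parts = line.splitn(4, ' ');
    let timestamp = parts.next()?;
    let stream = parts.next()?;
    let is_partial = match parts.next()? {
        "P" => true,
        "F" => false,
        _ => return None,
    };
    // An empty message is valid, so a missing fourth field is tolerated.
    let message = parts.next().unwrap_or("");
    Some(CriLine { timestamp, stream, is_partial, message })
}

/// Splits an API line at the first space; returns None if there is no
/// timestamp-like prefix.
fn parse_api(line: &str) -> Option<ApiLine<'_>> {
    let (timestamp, message) = line.split_once(' ')?;
    // Cheap sanity check only; this does not validate the full timestamp.
    if !timestamp.contains('T') {
        return None;
    }
    Some(ApiLine { timestamp, message })
}
```

The API shape carries no stream name or partial-line tag, so an Api transform would mainly need to split off the timestamp.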

@pront pront self-assigned this Oct 20, 2025