Setting threaded to true in prometheus_scrape input causes SIGSEGV #9525

Open
anderssynstad opened this issue Oct 25, 2024 · 3 comments

anderssynstad commented Oct 25, 2024

Bug Report

Describe the bug
Setting the threaded key to true for the prometheus_scrape input causes Fluent Bit to crash with SIGSEGV.

To Reproduce

---
service:
  storage.path: /var/spool/fluent-bit
pipeline:
  inputs:
    - name: prometheus_scrape
      host: 127.0.0.1
      port: 9100
      tag: metrics.node
      metrics_path: /metrics
      scrape_interval: 10s
      threaded: false
  outputs:
    - name: null
      match: '*'

# /opt/fluent-bit/bin/fluent-bit -c test.yaml -D
Fluent Bit v3.1.9
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __
|  ___| |                | |   | ___ (_) |         |____ |/  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

configuration test is successful
#

Then changing threaded from false to true:

---
service:
  storage.path: /var/spool/fluent-bit
pipeline:
  inputs:
    - name: prometheus_scrape
      host: 127.0.0.1
      port: 9100
      tag: metrics.node
      metrics_path: /metrics
      scrape_interval: 10s
      threaded: true
  outputs:
    - name: null
      match: '*'

# /opt/fluent-bit/bin/fluent-bit -c test.yaml -D
Fluent Bit v3.1.9
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __
|  ___| |                | |   | ___ (_) |         |____ |/  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

configuration test is successful
[2024/10/25 07:55:51] [engine] caught signal (SIGSEGV)
#0  0x5639131a8e7e      in  flb_input_exit_all() at src/flb_input.c:1341
#1  0x5639131c3138      in  flb_engine_shutdown() at src/flb_engine.c:1121
#2  0x56391319d264      in  flb_destroy() at src/flb_lib.c:240
#3  0x56391310dc1b      in  flb_main() at src/fluent-bit.c:1360
#4  0x7f707cc46249      in  ???() at ???:0
#5  0x7f707cc46304      in  ???() at ???:0
#6  0x56391310b800      in  ???() at ???:0
#7  0xffffffffffffffff  in  ???() at ???:0
Aborted
#
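
The two runs above differ only in the value of threaded, so the reproduction can be scripted. The following is a minimal sketch, not an official test: the helper names render_config and dry_run are hypothetical, the binary path is the one from this report, and the configuration is copied from above. It writes the configuration to a temporary .yaml file and runs the same -D invocation.

```python
import os
import subprocess
import tempfile

# Path taken from the report; adjust for your installation (assumption).
FLUENT_BIT = "/opt/fluent-bit/bin/fluent-bit"

# Configuration copied from the report; only `threaded` is parameterised.
CONFIG_TEMPLATE = """\
---
service:
  storage.path: /var/spool/fluent-bit
pipeline:
  inputs:
    - name: prometheus_scrape
      host: 127.0.0.1
      port: 9100
      tag: metrics.node
      metrics_path: /metrics
      scrape_interval: 10s
      threaded: {threaded}
  outputs:
    - name: null
      match: '*'
"""


def render_config(threaded: bool) -> str:
    """Return the report's configuration with only the `threaded` value changed."""
    return CONFIG_TEMPLATE.format(threaded="true" if threaded else "false")


def dry_run(threaded: bool, binary: str = FLUENT_BIT):
    """Run `<binary> -c <file> -D` and return its exit code.

    Returns None if the binary is not present. A negative exit code
    means the process was terminated by a signal.
    """
    if not os.path.exists(binary):
        return None
    with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as handle:
        handle.write(render_config(threaded))
        path = handle.name
    try:
        result = subprocess.run([binary, "-c", path, "-D"],
                                capture_output=True, text=True)
        return result.returncode
    finally:
        os.unlink(path)
```

On an affected host, dry_run(False) would be expected to return 0, while dry_run(True) would be expected to return a non-zero or negative code if the behaviour described here reproduces.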

Expected behavior
I expect a supported configuration key not to crash the program.

Your Environment

  • Version used: 3.1.9
  • Configuration: See above.
  • Environment name and version (e.g. Kubernetes? What version?):
  • Server type and version: Virtual
  • Operating System and version: Debian 12
  • Filters and plugins: None.

Additional context
This generates a significant amount of noise because the fluent-bit service keeps restarting.
On one random server, the fluent-bit service has restarted 184 times in the last 8 hours alone ...

edsiper (Member) commented Oct 25, 2024

@anderssynstad Is this running from a package or a custom build? Any more insight into the systems might help us.

anderssynstad (Author) commented Oct 26, 2024

Just verified it on a fully patched Debian 12 VPS on DigitalOcean:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 12 (bookworm)
Release:        12
Codename:       bookworm

$ cat /etc/apt/sources.list.d/fluentbit.list
deb [arch=amd64 signed-by=/usr/share/keyrings/fluentbit.asc] https://packages.fluentbit.io/debian/bookworm bookworm main

$ dpkg -l fluent-bit
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=================================
ii  fluent-bit     3.1.9        amd64        Fast data collector for Linux

$ uname -srv
Linux 6.1.0-26-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.112-1 (2024-09-30)

anderssynstad (Author) commented

Not sure if it is related, but the YAML example at https://docs.fluentbit.io/manual/pipeline/inputs/node-exporter-metrics produces the "same" result:

$ cat test.yaml
# Node Exporter Metrics + Prometheus Exporter
# -------------------------------------------
# The following example collect host metrics on Linux and expose
# them through a Prometheus HTTP end-point.
#
# After starting the service try it with:
#
# $ curl http://127.0.0.1:2021/metrics
#
service:
    flush: 1
    log_level: info
pipeline:
    inputs:
        - name: node_exporter_metrics
          tag:  node_metrics
          scrape_interval: 2
    outputs:
        - name: prometheus_exporter
          match: node_metrics
          host: 0.0.0.0
          port: 2021

$ /opt/fluent-bit/bin/fluent-bit -c test.yaml -D
Fluent Bit v3.1.9
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  __
|  ___| |                | |   | ___ (_) |         |____ |/  |
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`| |
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \ | |
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /_| |_
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)___/

configuration test is successful
[2024/10/26 18:50:56] [engine] caught signal (SIGSEGV)
#0  0x55ddedd2ee7e      in  flb_input_exit_all() at src/flb_input.c:1341
#1  0x55ddedd49138      in  flb_engine_shutdown() at src/flb_engine.c:1121
#2  0x55ddedd23264      in  flb_destroy() at src/flb_lib.c:240
#3  0x55ddedc93c1b      in  flb_main() at src/fluent-bit.c:1360
#4  0x7fdd0e246249      in  ???() at ???:0
#5  0x7fdd0e246304      in  ???() at ???:0
#6  0x55ddedc91800      in  ???() at ???:0
#7  0xffffffffffffffff  in  ???() at ???:0
Aborted
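
To check whether the two crashes are the "same", the backtraces can be compared with the load addresses stripped, since those differ from run to run. The sketch below is illustrative only: parse_frames and resolved are hypothetical helpers, and the trace strings are the first five frames copied from the two backtraces in this thread.

```python
import re

# Matches frames such as:
# #0  0x5639131a8e7e      in  flb_input_exit_all() at src/flb_input.c:1341
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-fA-F]+\s+in\s+(\S+)\s+at\s+(\S+)$")


def parse_frames(trace: str):
    """Return (frame_index, function, location) tuples, dropping the addresses."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            frames.append((int(match.group(1)), match.group(2), match.group(3)))
    return frames


def resolved(frames):
    """Keep only frames with symbol information (skip the `???` entries)."""
    return [frame for frame in frames if not frame[1].startswith("???")]


# Frames copied from the prometheus_scrape run (threaded: true).
PROMETHEUS_SCRAPE_TRACE = """\
#0  0x5639131a8e7e      in  flb_input_exit_all() at src/flb_input.c:1341
#1  0x5639131c3138      in  flb_engine_shutdown() at src/flb_engine.c:1121
#2  0x56391319d264      in  flb_destroy() at src/flb_lib.c:240
#3  0x56391310dc1b      in  flb_main() at src/fluent-bit.c:1360
#4  0x7f707cc46249      in  ???() at ???:0
"""

# Frames copied from the node_exporter_metrics run.
NODE_EXPORTER_TRACE = """\
#0  0x55ddedd2ee7e      in  flb_input_exit_all() at src/flb_input.c:1341
#1  0x55ddedd49138      in  flb_engine_shutdown() at src/flb_engine.c:1121
#2  0x55ddedd23264      in  flb_destroy() at src/flb_lib.c:240
#3  0x55ddedc93c1b      in  flb_main() at src/fluent-bit.c:1360
#4  0x7fdd0e246249      in  ???() at ???:0
"""

first = resolved(parse_frames(PROMETHEUS_SCRAPE_TRACE))
second = resolved(parse_frames(NODE_EXPORTER_TRACE))
print(first == second)  # prints True: same functions and source lines
```

Matching frames show that both runs crash at the same source location, src/flb_input.c:1341 in flb_input_exit_all(). That alone does not establish that the two configurations share a root cause.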
