Proposal: Introduce heartbeat metric document #414

Open
sorenlouv opened this issue Feb 4, 2021 · 2 comments
@sorenlouv
Member

sorenlouv commented Feb 4, 2021

We currently treat an instance as active if it is ingesting data: if it receives traffic, it transmits transactions, spans, and errors, and if metrics are enabled, it periodically transmits system metrics (CPU, memory).

However, it is theoretically possible for an instance to be active while receiving no traffic and with metrics disabled. In this case it is treated as not active (terminated).

This affects how throughput for the instance is calculated.

Question
Would it make sense to have a "heartbeat" document that is broadcast periodically (Addendum: and that cannot be disabled)?

cc @felixbarny @axw
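
To make the question concrete, here is a minimal sketch of what such a heartbeat could look like. Everything in it is an assumption for illustration: the `HeartbeatEmitter` class, the `"heartbeat"` document type, the field names, the 30-second interval, and the 90-second activity window are hypothetical and do not come from any existing agent or server API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HeartbeatEmitter:
    """Hypothetical emitter; not part of any real APM agent API."""

    instance_id: str
    interval_seconds: float = 30.0       # assumed interval
    last_sent: float = float("-inf")     # nothing sent yet

    def maybe_emit(self, now: float) -> Optional[dict]:
        """Return a heartbeat document if the interval has elapsed, else None."""
        if now - self.last_sent < self.interval_seconds:
            return None
        self.last_sent = now
        return {
            "type": "heartbeat",         # hypothetical document type
            "instance_id": self.instance_id,
            "timestamp": now,
        }


def is_active(last_document_at: float, now: float, window_seconds: float = 90.0) -> bool:
    """Treat an instance as active if any document arrived within the window."""
    return now - last_document_at <= window_seconds
```

With something like this, an idle instance with metrics disabled would still produce a document every interval, so a check such as `is_active` would keep reporting it as running rather than terminated.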

sorenlouv changed the title from "Consider introducing heatbeat metric document" to "Proposal: Introducing heatbeat metric document" on Feb 4, 2021
@axw
Member

axw commented Feb 5, 2021

How would this be different to having cpu/memory metrics enabled? Are you suggesting that these metrics could not be disabled? Then what is the difference to not allowing cpu/memory metrics to be disabled?

Related to this, I think the agents should periodically send statistics to the server for monitoring the agents themselves. e.g. how many transactions/spans/errors have been sent, how many requests were sent to the server, how many of those requests failed (e.g. due to 503). We could use those as a sort of heartbeat.
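
As a rough illustration of the statistics idea, the sketch below keeps per-agent counters and flushes them periodically as one document, which could then double as a heartbeat. The `AgentStats` class, its field names, and the rule that any status code of 400 or above counts as a failed request are assumptions made for this example, not an existing agent API.

```python
from dataclasses import asdict, dataclass


@dataclass
class AgentStats:
    """Hypothetical self-monitoring counters for one agent."""

    transactions_sent: int = 0
    spans_sent: int = 0
    errors_sent: int = 0
    requests_sent: int = 0
    requests_failed: int = 0

    def record_request(self, status_code: int) -> None:
        """Count one request to the server; 4xx/5xx (e.g. 503) counts as failed."""
        self.requests_sent += 1
        if status_code >= 400:
            self.requests_failed += 1

    def flush(self) -> dict:
        """Return the current counters as a document and reset them to zero."""
        snapshot = asdict(self)
        for name in snapshot:
            setattr(self, name, 0)
        return snapshot
```

Even when every counter is zero, the flushed document still arrives on schedule, which is what would let it serve as a sign of life for an idle instance.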

@sorenlouv
Member Author

> How would this be different to having cpu/memory metrics enabled? Are you suggesting that these metrics could not be disabled?

Yes, that's what I had in mind. If CPU/memory reporting has overhead, it makes sense to allow disabling it, whereas the overhead of a heartbeat should be smaller (in theory at least).

> I think the agents should periodically send statistics to the server for monitoring the agents themselves. e.g. how many transactions/spans/errors have been sent

Yes, this also came up at the UI weekly. @dgieselaar had some thoughts in the same vein.

mikker changed the title from "Proposal: Introducing heatbeat metric document" to "Proposal: Introducing heartbeat metric document" on Feb 5, 2021
sorenlouv changed the title from "Proposal: Introducing heartbeat metric document" to "Proposal: Introduce heartbeat metric document" on Mar 20, 2021