[APM] Throughput becomes inaccurate when instances become inactive #89497

Closed
sorenlouv opened this issue Jan 27, 2021 · 5 comments
Labels
Team:APM All issues that need APM UI Team support

Comments

@sorenlouv
Member

sorenlouv commented Jan 27, 2021

Currently, throughput is calculated as the total number of requests divided by the number of minutes in the entire selected time range. This means that an instance that has stopped will show a falling average throughput as the range grows, which makes it harder to compare the throughput of instances.

Questions:

  • For instances specifically, should throughput be calculated for the entire time range (like today) or only for the "active" period of an instance?
  • In general, should throughput only be calculated for the "active" periods, thus excluding periods with no data?
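
To make the two options in the questions above concrete, here is a minimal sketch of the arithmetic. The `InstanceActivity` shape, the function names, and the sample numbers are illustrative assumptions, not the APM UI's actual implementation.

```typescript
// Hypothetical shape: a request count plus the first and last time
// the instance was observed (epoch milliseconds).
interface InstanceActivity {
  requestCount: number;
  firstSeenMs: number;
  lastSeenMs: number;
}

const MS_PER_MINUTE = 60 * 1000;

// Behaviour described in this issue: divide by the whole selected range.
function throughputOverRange(
  activity: InstanceActivity,
  rangeStartMs: number,
  rangeEndMs: number
): number {
  const minutes = (rangeEndMs - rangeStartMs) / MS_PER_MINUTE;
  return minutes > 0 ? activity.requestCount / minutes : 0;
}

// Alternative under discussion: divide only by the part of the range
// in which the instance was observed to be active.
function throughputOverActivePeriod(
  activity: InstanceActivity,
  rangeStartMs: number,
  rangeEndMs: number
): number {
  const start = Math.max(activity.firstSeenMs, rangeStartMs);
  const end = Math.min(activity.lastSeenMs, rangeEndMs);
  const minutes = (end - start) / MS_PER_MINUTE;
  return minutes > 0 ? activity.requestCount / minutes : 0;
}

// Made-up example: an instance that served 100 requests/min for one day
// (day 5 to day 6) and then stopped.
const DAY_MS = 24 * 60 * 60 * 1000;
const stopped: InstanceActivity = {
  requestCount: 144000,
  firstSeenMs: 5 * DAY_MS,
  lastSeenMs: 6 * DAY_MS,
};

console.log(throughputOverRange(stopped, 4 * DAY_MS, 7 * DAY_MS).toFixed(1)); // 33.3 (3-day range)
console.log(throughputOverRange(stopped, 0, 7 * DAY_MS).toFixed(1)); // 14.3 (7-day range)
console.log(throughputOverActivePeriod(stopped, 0, 7 * DAY_MS).toFixed(1)); // 100.0
```

With the whole-range calculation, the same instance reports a different throughput depending on the selected range; with the active-period calculation, it stays at 100 requests per minute for both ranges.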

The following example shows how throughput for the same instances changes with the selected time range, even though it should (intuitively) stay the same (at least for instances 1 and 2).

3 days range

3days.png

7 days range

7days.png

@sorenlouv sorenlouv added the [zube]: Inbox, discuss, Team:APM, and v7.12.0 labels Jan 27, 2021
@elasticmachine
Contributor

Pinging @elastic/apm-ui (Team:apm)

@sorenlouv sorenlouv added v7.13.0 and removed v7.12.0 labels Jan 27, 2021
@sorenlouv sorenlouv changed the title [APM] Throughput is inaccurate for instances with downtime [APM] Throughput is inaccurate for inactive instances Jan 27, 2021
@sorenlouv sorenlouv changed the title [APM] Throughput is inaccurate for inactive instances [APM] Throughput becomes inaccurate when instances become inactive Jan 27, 2021
@alex-fedotyev

@felixbarny - I just realized that APM agents deliver metrics already, which might be useful for this.
It looks like every agent except PHP reports "system.cpu.total.norm.pct". Could you confirm?

The problem is that we are looking for a reliable way to calculate the lifetime of a service instance, so that throughput is calculated properly for a single instance of a service, and I guess the same applies to a set of instances.

@felixbarny
Member

Yes, according to their metrics documentation page, that's true for all agents except PHP.

However, metrics sending can be disabled, and older versions of an agent may not be sending these metrics.

To circumvent the first issue, we could send heartbeat metrics in the agents: elastic/apm#414
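
The idea discussed above, deriving an instance's lifetime from periodically reported metrics or heartbeats, could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `estimateActiveMs`, the `maxGapMs` threshold, and the input (a plain list of timestamps at which an instance reported a metric or heartbeat document) are all hypothetical and do not reflect any existing agent or Kibana API.

```typescript
// Estimate how long an instance was active from the timestamps (epoch ms)
// of its reported metric or heartbeat documents. Two consecutive reports
// count as continuous activity only if they are at most maxGapMs apart;
// a longer gap is treated as downtime and is not counted.
function estimateActiveMs(timestampsMs: number[], maxGapMs: number): number {
  // Copy before sorting so the caller's array is left untouched.
  const sorted = timestampsMs.slice().sort((a, b) => a - b);
  let activeMs = 0;
  let previous: number | undefined;
  for (const current of sorted) {
    if (previous !== undefined && current - previous <= maxGapMs) {
      activeMs += current - previous;
    }
    previous = current;
  }
  return activeMs;
}
```

For example, with reports at 0 s, 30 s, 60 s, 600 s, and 630 s and a `maxGapMs` of 60 seconds, the nine-minute gap is excluded and the estimate is 90 seconds of activity. The threshold would need to be chosen relative to the reporting interval, and the approach only works when metrics or heartbeats are actually being sent.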

@botelastic

botelastic bot commented Feb 8, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the stale label Feb 8, 2022
@sorenlouv
Member Author

@alex-fedotyev Is this something we are still considering or can we close this issue?

@botelastic botelastic bot removed the stale label Feb 9, 2022
@sorenlouv sorenlouv closed this as not planned Jul 11, 2023
4 participants