Weekly reports on performance #2

MickaelBergem · 2018-09-10T17:00:57Z

Need: to get a periodic overview of the general performance and/or major events on a given set of monitors.

This may include:

mean response time
number and nature of downtime
uptime % or downtime minutes
major events (response time dropped by 20% compared with last week)

Display channels:

in www.howfast.tech
by email
directly in Slack (my favorite so far)

Any comment from the community is welcome! 📣

MickaelBergem · 2020-07-09T12:51:07Z

Here is the draft visual of this weekly/monthly report. @jpcaruana @pascalandy would the information in this report be useful to you? (you had upvoted this feature)

Now is the best time to add / remove / improve the information that goes into this report :)

pascalandy · 2020-07-09T15:18:12Z

I would add titles like:

Weekly stats

bla bla ...

monthly stats

bla bla ...

pascalandy · 2020-07-09T15:18:35Z

Not sure the green and red little triangles add value.

jpcaruana · 2020-07-09T15:26:59Z

I like it (I like the green/red triangles: if everything is green, I don't have to read)

It would be great if you could also add a list of the worst routes (from the APM part), the top 5 by impact

jpcaruana · 2020-07-09T15:28:57Z

As a reference, I like the weekly email by Sentry:

It gives me a good sense of "everything is good/bad" at a glance, and I can delve into the details (not shown in my screenshot)

MickaelBergem · 2020-07-11T12:23:26Z

Thank you both for the quick answer!

I would add titles like:

Weekly stats

bla bla ...

monthly stats

bla bla ...

@pascalandy Just to make sure I understand what you mean, you would like to receive the weekly+monthly report every week? So that you can somehow have some longer-term stats on the performance?

@jpcaruana thanks! The Sentry report is indeed quite useful. Adding APM data in the performance report will come next, I can definitely see the value.

pascalandy · 2020-07-11T15:07:02Z

Correct, it's a two for one :-P

you would like to receive the weekly+monthly report every week

MickaelBergem · 2020-07-20T18:11:23Z

Update: I've been thinking a bit about how to best design this email (in terms of content more than in terms of UI), and here are my thoughts.

User experience

As a user of HowFast, I have limited time, so I will just trash the email if I don't care. To make it easier for the user to know if they should care or not, the email subject has cues such as:

Weekly performance report for GéoSchool: no incident this week 💪
Weekly performance report for HowFast: 7 monitors currently down 😮
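
The subject cues above can be sketched as a small function. This is only an illustration: `report_subject` and its parameters are hypothetical names, and the fallback wording for "all monitors up, but incidents happened" is an assumption, since that case is still undecided in this thread.

```python
def report_subject(team: str, monitors_down: int, incidents: int) -> str:
    """Build an email subject that tells the reader whether to open the report."""
    prefix = f"Weekly performance report for {team}: "
    # Monitors that are down right now are the most urgent cue.
    if monitors_down > 0:
        noun = "monitor" if monitors_down == 1 else "monitors"
        return f"{prefix}{monitors_down} {noun} currently down 😮"
    # Nothing happened: the reader can safely archive the email.
    if incidents == 0:
        return prefix + "no incident this week 💪"
    # Assumption: fall back to a plain incident count for the undecided case.
    noun = "incident" if incidents == 1 else "incidents"
    return f"{prefix}{incidents} {noun} this week"
```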

If all monitors are up but there were incidents in the past week, I'm still unsure what is most useful to show. We could go with the total number of minutes spent down (45 minutes of cumulative downtime this week), or the maximum (longest incident lasted 6h), or something else. The cumulative downtime becomes much less meaningful as soon as you have 3+ monitors: maybe all three went down at the same time, and you end up reporting 2h15 of downtime while the outage only lasted 45 minutes.
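
To make the overlap problem concrete, here is a minimal sketch comparing the summed downtime with the time during which at least one monitor was down (the union of the outage intervals). The function names and the `(start, end)` tuple representation are assumptions for illustration, not how HowFast stores incidents.

```python
from datetime import datetime, timedelta


def summed_downtime(intervals):
    """Add up every outage, counting overlapping outages several times."""
    return sum((end - start for start, end in intervals), timedelta())


def merged_downtime(intervals):
    """Total time during which at least one monitor was down."""
    total = timedelta()
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Gap before this outage: close the previous merged block.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping outage: extend the current merged block.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


# Three monitors that all went down during the same 45 minutes.
t0 = datetime(2020, 7, 20, 10, 0)
outages = [(t0, t0 + timedelta(minutes=45))] * 3
print(summed_downtime(outages))  # 2:15:00
print(merged_downtime(outages))  # 0:45:00
```

Reporting the merged value would avoid overstating a single outage that affected several monitors at once.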

Based on this information, the user can decide to archive/trash the email (especially if there were no incidents), or to read it.

If the user only has to spend 10 seconds scanning through the email, what are the most important metrics? See "head metrics" below. I'm not sure about this part for now.

The current design highlights the monitors that are currently down (the most important information) along with a short explanation of what is happening. If the user needs more information, the rows are clickable and open the monitor in HowFast.

Head metrics

This is the "Sentry-like" report. I'm in favor of adding those big numbers at the top of the email to provide a synthetic view of what happened, but I need to figure out the details. What makes sense in the context of HowFast?

For now I can only think of the number of monitors currently down, the number of incidents last week, and maybe the slowest average response time. What are the metrics you would be interested to see?

Weekly AND monthly metrics in a single mail

Given the number of monitors for some of the teams using HowFast, having two tables will make the email super long and harder to read, so I'm not convinced this will add value. I will study the possibility of adding an extra column in the report, "uptime over the last 30 days", while making it clear the other one is the "uptime over the last 7 days". Those numbers might also be easier to show directly inside HowFast instead of in an email.

Next up

Currently the implementation is almost ready, and will be rolled out in the next few days. If you are interested, you can opt in and start receiving the reports in your mailbox, so that you can see what it will look like with your numbers. I would love to hear your feedback!

MickaelBergem · 2020-07-30T13:37:26Z

The first batch of weekly reports was sent this Monday, with some very good results:

several teams removed old monitors that had been down for months
a few teams tweaked the monitoring configuration
a few teams added new monitors

Overall, several teams were able to get more value out of HowFast thanks to this report.

The next batch will get extra data included, related to certificates expiring soon (in less than two weeks). This will help make sure that even if no notification is configured for the affected monitors, the team can still learn about it.
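
As a rough sketch of the "expiring soon" selection, assuming the certificate expiry dates are already known per monitor: `expiring_soon` and the dictionary shape are hypothetical, and excluding already-expired certificates is an assumption (they could just as well be listed separately).

```python
from datetime import datetime, timedelta, timezone


def expiring_soon(cert_expiry, now, window=timedelta(days=14)):
    """Return (monitor, expiry) pairs expiring within the window, soonest first."""
    deadline = now + window
    soon = [
        (name, expiry)
        for name, expiry in cert_expiry.items()
        # Assumption: certificates that already expired are handled elsewhere.
        if now <= expiry < deadline
    ]
    return sorted(soon, key=lambda item: item[1])


now = datetime(2020, 7, 30, tzinfo=timezone.utc)
certs = {
    "api": now + timedelta(days=3),
    "blog": now + timedelta(days=40),
    "shop": now + timedelta(days=13),
}
for name, expiry in expiring_soon(certs, now):
    print(name, expiry.date())
```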

Feel free to share your feedback :)

jpcaruana · 2020-07-30T13:58:07Z

Hi,

I like these emails and I'm looking forward to seeing them get better and better.

Would it be possible to choose the order of the monitors in the email? I have a lot of monitors, and production monitors are my main focus (the rest is more informational for me) for this kind of weekly digest.

Thanks!

MickaelBergem · 2020-07-30T14:03:01Z

Thank you for your feedback @jpcaruana! Currently, the monitors are ordered by:

status (monitors that are down first)
increasing uptime (so that you can see the problematic monitors first)
decreasing response time (if all your monitors have 100%, you probably want to focus on the slower ones first)
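
The three ordering rules above map naturally onto a single sort key. The following is a sketch under assumptions: the `Monitor` class and its field names are made up for illustration and are not HowFast's data model.

```python
from dataclasses import dataclass


@dataclass
class Monitor:
    name: str
    is_down: bool
    uptime_pct: float
    response_ms: float


def report_order(monitors):
    """Down monitors first, then lowest uptime, then slowest response time."""
    return sorted(
        monitors,
        # False sorts before True, so `not is_down` puts down monitors first;
        # negating the response time sorts it in decreasing order.
        key=lambda m: (not m.is_down, m.uptime_pct, -m.response_ms),
    )


monitors = [
    Monitor("api", False, 100.0, 120.0),
    Monitor("blog", False, 99.2, 300.0),
    Monitor("shop", True, 97.0, 0.0),
    Monitor("docs", False, 100.0, 450.0),
]
print([m.name for m in report_order(monitors)])
```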

I'm trying to think about a way to make it work for you in this context. I assume we could somehow add a flag for "production" monitors, and display those first, do you see another way to make it work in your case? I will try to think about it.

jpcaruana · 2020-07-30T14:10:00Z

I assume we could somehow add a flag for "production" monitors, and display those first, do you see another way to make it work in your case?

This seems like a perfect solution for my use case. You could also use this flag in the web UI, I guess.

MickaelBergem · 2020-08-03T18:01:20Z

@jpcaruana you mentioned having the most impactful endpoints listed in the email. Would that work if it's based on all the APMs in your team, including the non-prod ones, or would that make the result much less useful? I started working on this and might very well be able to send you the results for your team so that you can double-check, but maybe you already know.

MickaelBergem · 2020-08-05T17:21:03Z

Here is a draft of the APM summary:

v0.1

v0.2

I think the impact measured in ms per minute makes sense (= the milliseconds a worker spends working on this endpoint during an average minute) and adds value.
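
One way to read this metric, sketched below, is the total time spent serving an endpoint divided by the number of minutes in the reporting window. The function names and the input shape (a list of request durations per endpoint) are assumptions for illustration, not the actual HowFast implementation.

```python
def impact_ms_per_minute(durations_ms, window_minutes):
    """Milliseconds spent on one endpoint during an average minute of the window."""
    if window_minutes <= 0:
        raise ValueError("window_minutes must be positive")
    return sum(durations_ms) / window_minutes


def top_impact(durations_by_endpoint, window_minutes, n=5):
    """Return the n endpoints with the highest impact, highest first."""
    scored = {
        endpoint: impact_ms_per_minute(durations, window_minutes)
        for endpoint, durations in durations_by_endpoint.items()
    }
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)[:n]


# 1200 requests of 50 ms each over one hour: 60000 ms / 60 min.
print(impact_ms_per_minute([50.0] * 1200, 60))  # 1000.0
```

With this definition, a fast endpoint that is called very often can rank above a slow endpoint that is rarely called.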

jpcaruana · 2021-03-11T11:31:58Z

The current weekly report works great: I think you can close this issue @MickaelBergem :)

MickaelBergem added the feature (New feature request) and need-more-details (Extra specification is needed) labels Sep 10, 2018

MickaelBergem self-assigned this Sep 10, 2018
