MNTOR-3096 - add monitor glean backend #4544

rhelmer · 2024-05-16T14:03:45Z

References:

Jira: MNTOR-3096

Description

Screenshot (if applicable)

Not applicable.

How to test

Checklist (Definition of Done)

Localization strings (if needed) have been added.
Commits in this PR are minimal and have descriptive commit messages.
I've added or updated the relevant sections in readme and/or code comments.
I've added a unit test to test for potential regressions of this bug.
Product Owner accepted the User Story (demo of functionality completed) or waived the privilege.
All acceptance criteria are met.
Jira ticket has been updated (if needed) to match changes made during the development process.
Jira ticket has been updated (if needed) with suggestions for QA when this PR is deployed to stage.

github-actions · 2024-05-21T23:37:53Z

Preview URL 🚀 : https://blurts-server-pr-4544-mgjlpikfea-uk.a.run.app

sean-rose

r+wc

In bug 1896992 I also asked @akkomar to take a look at this, since he has more expertise on Glean backend telemetry.

src/telemetry/backend-metrics.yaml

akkomar

Metrics definitions look good to me, apart from missing send_in_pings declaration.

We'll now want to integrate glean_parser in the build and generate logging code.
We have some docs now at https://mozilla.github.io/glean/book/user/adding-glean-to-your-project/server.html#how-to-add-glean-server-side-collection-to-your-service and you can also compare how FxA did this.

src/telemetry/backend-metrics.yaml

sean-rose

r+wc

The updated subscription.cancel event description looks good. Just the one remaining send_in_pings issue that should be fixed.

src/telemetry/backend-metrics.yaml

rhelmer · 2024-06-10T23:15:26Z

> Metrics definitions look good to me, apart from missing send_in_pings declaration.
>
> We'll now want to integrate glean_parser in the build and generate logging code. We have some docs now at https://mozilla.github.io/glean/book/user/adding-glean-to-your-project/server.html#how-to-add-glean-server-side-collection-to-your-service and you can also compare how FxA did this.

Thanks! I will review the docs and example. Just FYI, we split this part of the work into a separate ticket to fit better into our sprints.

rhelmer · 2024-06-14T23:18:52Z

> After taking another look at this I would strongly recommend moving user_id, session_id, monitor_user_id, and plan_tier out of extras and making them string metrics.
> This way it would be easier to work with them - they would have specific columns in BigQuery and would be easier to discover in Glean Dictionary. We don't have any written guidelines for this, but it seems like measures that are sent with all the events and generally stay unchanged across a session (@rhelmer this seems to be the case here?) should be defined as ping level metrics. cc @quiiver with whom I talked about this yesterday.
> For comparison, here's how FxA defined user_id metric.

OK I went ahead and made this change - it used extra_keys because that's the way the front-end metrics work, which was at the request of DS. As long as we won't have a problem piecing together these things in analysis I think it's fine.

> @rhelmer, after discussing this with @akkomar some more, we'd recommend going back to the previous extra_keys approach to keep things consistent with the frontend events. Sorry for the runaround!

No problem, thanks! I'll revert that commit.

This reverts commit 194207a.

sean-rose · 2024-06-17T00:34:08Z

> @rhelmer, after discussing this with @akkomar some more, we'd recommend going back to the previous extra_keys approach to keep things consistent with the frontend events. Sorry for the runaround!
>
> No problem, thanks! I'll revert that commit.

I hate to flip-flop on this again, but I just realized there's another factor that probably tips the scales back toward having those IDs as string metrics: data access/deletion requests. It'd be much more straightforward to handle those if the IDs are simple columns rather than being in the array structure the extra keys values end up in.
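As a rough illustration of the access/deletion point (not code from this PR), here is a minimal Python sketch. The row shapes are simplified assumptions: extras are modeled as an array of key/value pairs per event and string metrics as plain fields. The helper names are hypothetical and not part of any Glean or BigQuery API.

```python
# Hypothetical, simplified rows; real tables carry many more fields.
row_with_extras = {
    "events": [
        {
            "category": "subscription",
            "name": "cancel",
            "extra": [
                {"key": "user_id", "value": "abc123"},
                {"key": "plan_tier", "value": "free"},
            ],
        },
    ],
}
row_with_metric = {
    "metrics": {"string": {"monitor_user_id": "abc123"}},
    "events": [{"category": "subscription", "name": "cancel", "extra": []}],
}


def matches_via_extras(row, user_id):
    """Scan every event's extras array for a matching user_id entry."""
    return any(
        pair["key"] == "user_id" and pair["value"] == user_id
        for event in row.get("events", [])
        for pair in event.get("extra", [])
    )


def matches_via_metric(row, user_id):
    """Compare one flat field, analogous to filtering on a single column."""
    strings = row.get("metrics", {}).get("string", {})
    return strings.get("monitor_user_id") == user_id


print(matches_via_extras(row_with_extras, "abc123"))
print(matches_via_metric(row_with_metric, "abc123"))
```

The extras lookup has to walk two nested arrays, while the metric lookup is a single field comparison, which is the difference the comment is pointing at.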

So now I'm thinking we should use string metrics for the user/session-level properties in both the backend and the frontend (which would involve changing frontend telemetry, with LookML overrides to fall back to the previous location for those values in earlier frontend events), but with the following naming/hierarchy so the final metric names/columns make sense and aren't repeated with different categories/prefixes:

* `mozilla`
  * `account_id` (metric column name `metrics.string.mozilla_account_id`)
* `monitor`
  * `user_id` (metric column name `metrics.string.monitor_user_id`)
  * `session_id` (metric column name `metrics.string.monitor_session_id`)
  * `plan_tier` (metric column name `metrics.string.monitor_plan_tier`)

@akkomar what do you think?
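The naming hierarchy above implies a simple pattern for the resulting columns: the category and metric name joined with an underscore under `metrics.string`. A small Python sketch of that pattern follows, assuming it holds generally as the listed column names suggest; `string_metric_column` is a hypothetical helper, not a Glean function.

```python
def string_metric_column(category: str, name: str) -> str:
    """Join category and metric name the way the listed column names are formed."""
    return f"metrics.string.{category}_{name}"


# Categories and metric names as proposed in the comment above.
proposed = {
    "mozilla": ["account_id"],
    "monitor": ["user_id", "session_id", "plan_tier"],
}

columns = [
    string_metric_column(category, name)
    for category, names in proposed.items()
    for name in names
]
for column in columns:
    print(column)
```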

akkomar · 2024-06-17T14:34:09Z

> I hate to flip-flop on this again, but I just realized there's another factor that probably tips the scales back toward having those IDs as string metrics: data access/deletion requests. It'd be much more straightforward to handle those if the IDs are simple columns rather than being in the array structure the extra keys values end up in.

@sean-rose Good point, these requests (DSRs) would be easier to handle with ping-level metrics.

> So now I'm thinking we should use string metrics for the user/session-level properties in both the backend and the frontend (which would involve changing frontend telemetry, with LookML overrides to fall back to the previous location for those values in earlier frontend events) (...)

Naming/hierarchy looks good to me. Ideally, to make this even easier for DSAR we could have some standardized naming conventions for these identifiers. But I wouldn't split hairs over this as we'll already need to support customizations because of https://bugzilla.mozilla.org/show_bug.cgi?id=1889123.

As for switching the frontend part too - do you know how much effort this requires? It would be great to have these fields consistently in string metrics, but I'm not sure a migration in frontend is worth it. Also it seems to me that unless we rewrite the old data, we'll need to support identifiers in extras for DSRs anyway until this data expires.

FYI @ksiegler1

sean-rose · 2024-06-18T04:14:46Z

> As for switching the frontend part too - do you know how much effort this requires? It would be great to have these fields consistently in string metrics, but I'm not sure a migration in frontend is worth it.

I'd hope most of the additional complexity could be contained in centralized code similar to how FxA is doing it, in which case it probably wouldn't be a ton of effort, but that's optimistic speculation on my part. @rhelmer what do you think?

> Also it seems to me that unless we rewrite the old data, we'll need to support identifiers in extras for DSRs anyway until this data expires.

It should be pretty easy to populate the new string metric columns for historical records using a simple UPDATE query.

IMO it'd be worth some reasonable effort now to put things in a more supportable state going forward.
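The backfill idea mentioned in this comment can be sketched in Python over in-memory rows. This is only an analogue of the logic, not the actual UPDATE query, and the row shape (`events`, `extra`, `metrics.string`) is a simplified assumption; both helper names are hypothetical.

```python
def extra_value(row, key):
    """Return the first value stored under `key` in any event's extras, or None."""
    for event in row.get("events", []):
        for pair in event.get("extra", []):
            if pair["key"] == key:
                return pair["value"]
    return None


def backfill_string_metric(row, extra_key, column):
    """Copy an extras value into the flat string metric field if it is not set yet."""
    strings = row.setdefault("metrics", {}).setdefault("string", {})
    if strings.get(column) is None:
        value = extra_value(row, extra_key)
        if value is not None:
            strings[column] = value
    return row


# Hypothetical historical record that only carries the ID in event extras.
historical = {
    "events": [{"extra": [{"key": "session_id", "value": "s-1"}]}],
}
backfill_string_metric(historical, "session_id", "monitor_session_id")
print(historical["metrics"]["string"])
```

Rows that already have the string metric populated are left untouched, so the same step could be rerun safely.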

rhelmer · 2024-06-25T22:59:42Z

> As for switching the frontend part too - do you know how much effort this requires? It would be great to have these fields consistently in string metrics, but I'm not sure a migration in frontend is worth it.
>
> I'd hope most of the additional complexity could be contained in centralized code similar to how FxA is doing it, in which case it probably wouldn't be a ton of effort, but that's optimistic speculation on my part. @rhelmer what do you think?
>
> Also it seems to me that unless we rewrite the old data, we'll need to support identifiers in extras for DSRs anyway until this data expires.
>
> It should be pretty easy to populate the new string metric columns for historical records using a simple UPDATE query.
>
> IMO it'd be worth some reasonable effort now to put things in a more supportable state going forward.

I think migrating the front-end is doable, I'd need to do it in a separate ticket though. I can convert the metrics.yaml back again.

This reverts commit 6d919f3.

rhelmer · 2024-06-25T23:07:37Z

> So now I'm thinking we should use string metrics for the user/session-level properties in both the backend and the frontend (which would involve changing frontend telemetry, with LookML overrides to fall back to the previous location for those values in earlier frontend events), but with the following naming/hierarchy so the final metric names/columns make sense and aren't repeated with different categories/prefixes:
>
> * `mozilla`
>   * `account_id` (metric column name `metrics.string.mozilla_account_id`)
> * `monitor`
>   * `user_id` (metric column name `metrics.string.monitor_user_id`)
>   * `session_id` (metric column name `metrics.string.monitor_session_id`)
>   * `plan_tier` (metric column name `metrics.string.monitor_plan_tier`)

@sean-rose I reverted back to the previous version of the backend-metrics.yaml. Do I need to modify it to account for this too or is that something you're proposing handling on the Glean side?

sean-rose · 2024-06-25T23:38:06Z

> So now I'm thinking we should use string metrics for the user/session-level properties in both the backend and the frontend (which would involve changing frontend telemetry, with LookML overrides to fall back to the previous location for those values in earlier frontend events), but with the following naming/hierarchy so the final metric names/columns make sense and aren't repeated with different categories/prefixes:
>
> * `mozilla`
>   * `account_id` (metric column name `metrics.string.mozilla_account_id`)
> * `monitor`
>   * `user_id` (metric column name `metrics.string.monitor_user_id`)
>   * `session_id` (metric column name `metrics.string.monitor_session_id`)
>   * `plan_tier` (metric column name `metrics.string.monitor_plan_tier`)
>
> @sean-rose I reverted back to the previous version of the backend-metrics.yaml. Do I need to modify it to account for this too or is that something you're proposing handling on the Glean side?

Yes, you'll need to modify backend-metrics.yaml to do that. With the current backend-metrics.yaml the resulting Monitor backend Glean events table in BigQuery would have the following set of duplicative metric columns:

metrics.string.account_monitor_user_id
metrics.string.account_plan_tier
metrics.string.account_session_id
metrics.string.account_user_id
metrics.string.page_monitor_user_id
metrics.string.page_plan_tier
metrics.string.page_session_id
metrics.string.page_user_id
metrics.string.subscription_monitor_user_id
metrics.string.subscription_plan_tier
metrics.string.subscription_session_id
metrics.string.subscription_user_id
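To make the duplication concrete, here is a short Python sketch that regenerates the column list above from the three event categories and four repeated metric names, and compares it with defining each metric once. The category and metric names come from this thread; the assumption is that columns follow the `metrics.string.<category>_<name>` pattern seen in the list.

```python
from itertools import product

# Categories and metric names taken from the column list above.
event_categories = ["account", "page", "subscription"]
repeated_metrics = ["monitor_user_id", "plan_tier", "session_id", "user_id"]

# Repeating the same string metrics under every event category
# yields one column per (category, metric) pair.
duplicated = [
    f"metrics.string.{category}_{name}"
    for category, name in product(event_categories, repeated_metrics)
]

# Defining each metric once under a top-level category,
# as proposed earlier in the thread.
defined_once = [
    "metrics.string.mozilla_account_id",
    "metrics.string.monitor_user_id",
    "metrics.string.monitor_session_id",
    "metrics.string.monitor_plan_tier",
]

print(len(duplicated), "columns when repeated per category")
print(len(defined_once), "columns when defined once")
```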

rhelmer · 2024-06-26T02:39:25Z

OK thanks @sean-rose lmk if this is what you were expecting.

sean-rose · 2024-06-26T21:28:28Z

> OK thanks @sean-rose lmk if this is what you were expecting.

Not exactly:

* The string metrics should be specified once directly under top-level mozilla and monitor category keys.
* The event metrics should be specified directly under the appropriate top-level category keys (multiple levels of nesting are not supported).

I tried pushing a commit to correct those issues myself, but it turns out I don't have access (which makes sense), so I'll add a separate comment with the changes I'd recommend as one big code suggestion.
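The two structural points in this comment can be sketched as plain Python dictionaries standing in for the YAML layout. Everything here is illustrative: descriptions and other required Glean fields are omitted, the nested example is a hypothetical shape rather than the file that was actually pushed, and `misplaced_metrics` is not a Glean tool.

```python
# Hypothetical layout following the two points above: category -> metric -> definition.
flat_layout = {
    "mozilla": {"account_id": {"type": "string"}},
    "monitor": {
        "user_id": {"type": "string"},
        "session_id": {"type": "string"},
        "plan_tier": {"type": "string"},
    },
    "subscription": {"cancel": {"type": "event"}},
}

# Hypothetical example of an extra nesting level, which is not supported.
nested_layout = {
    "monitor": {"subscription": {"cancel": {"type": "event"}}},
}


def misplaced_metrics(layout):
    """Return 'category.name' entries that are not metric definitions with a type."""
    return [
        f"{category}.{name}"
        for category, metrics in layout.items()
        for name, definition in metrics.items()
        if "type" not in definition
    ]


print(misplaced_metrics(flat_layout))
print(misplaced_metrics(nested_layout))
```

In the flat layout every entry one level below a category is a metric definition, so nothing is reported; in the nested layout `monitor.subscription` is flagged because it holds another level of keys instead of a definition.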

src/telemetry/backend-metrics.yaml

Co-authored-by: Sean Rose <[email protected]>

github-actions · 2024-06-27T17:39:11Z

Cleanup completed - database 'blurts-server-pr-4544' destroyed, cloud run service 'blurts-server-pr-4544' destroyed

rhelmer added 7 commits May 3, 2024 07:09

WIP monitor-backend setup

89c9b5c

Merge remote-tracking branch 'origin/main' into monitor-backend-glean

e42012d

correct file pathnames

c189a64

add correct bug review number

984fd2a

correction

476500d

Merge remote-tracking branch 'origin/main' into MNTOR-3096/add-monito…

339a851

…r-glean-backend

revert

124eb1f

rhelmer self-assigned this May 16, 2024

rhelmer marked this pull request as draft May 16, 2024 14:03

remove empty lines for lint

2b73361

sean-rose reviewed May 21, 2024

src/telemetry/backend-metrics.yaml (outdated, resolved)

rhelmer added 2 commits May 21, 2024 16:21

Merge remote-tracking branch 'origin/main' into MNTOR-3096/add-monito…

884cf8a

…r-glean-backend

start with plan_tier not plan_type

55d368b

rhelmer requested a review from sean-rose May 21, 2024 23:22

sean-rose approved these changes May 22, 2024

src/telemetry/backend-metrics.yaml (outdated, resolved)

correct and clarify the meaning of cancel

192fa10

rhelmer requested a review from sean-rose May 22, 2024 00:23

rhelmer marked this pull request as ready for review May 22, 2024 00:23

Merge branch 'main' into MNTOR-3096/add-monitor-glean-backend

2d568dd

akkomar reviewed May 22, 2024

src/telemetry/backend-metrics.yaml (resolved)

Merge branch 'main' into MNTOR-3096/add-monitor-glean-backend

8eb622c

sean-rose mentioned this pull request May 28, 2024

Bug 1896992: Add Glean app monitor.backend mozilla/probe-scraper#757

Merged

sean-rose approved these changes May 28, 2024

src/telemetry/backend-metrics.yaml (resolved)

rhelmer marked this pull request as draft June 3, 2024 16:08

rhelmer added 3 commits June 10, 2024 16:09

Merge remote-tracking branch 'origin/main' into MNTOR-3096/add-monito…

75f71f0

…r-glean-backend

define pings and use send_in_pings

d6142dd

Merge remote-tracking branch 'origin/MNTOR-3096/add-monitor-glean-bac…

a43a7f0

…kend' into MNTOR-3096/add-monitor-glean-backend

rhelmer marked this pull request as ready for review June 10, 2024 23:16

rhelmer added 2 commits June 14, 2024 16:19

Revert "use string metrics instead of extra keys"

6d919f3

This reverts commit 194207a.

Merge remote-tracking branch 'origin/MNTOR-3096/add-monitor-glean-bac…

a3da496

…kend' into MNTOR-3096/add-monitor-glean-backend

rhelmer requested review from akkomar and sean-rose June 14, 2024 23:20

Merge branch 'main' into MNTOR-3096/add-monitor-glean-backend

67e5f17

rhelmer added 3 commits June 25, 2024 16:04

Merge remote-tracking branch 'origin/main' into MNTOR-3096/add-monito…

8b68691

…r-glean-backend

Revert "Revert "use string metrics instead of extra keys""

e963a1b

This reverts commit 6d919f3.

Merge remote-tracking branch 'origin/MNTOR-3096/add-monitor-glean-bac…

8a9ec19

…kend' into MNTOR-3096/add-monitor-glean-backend

namespace as suggested in PR

8172cf2

sean-rose reviewed Jun 26, 2024

src/telemetry/backend-metrics.yaml (outdated, resolved)

rhelmer and others added 2 commits June 26, 2024 14:55

Update src/telemetry/backend-metrics.yaml

74fe1e1

Co-authored-by: Sean Rose <[email protected]>

generate frontend and backend metrics docs

4d6fb88

sean-rose approved these changes Jun 26, 2024

Merge branch 'main' into MNTOR-3096/add-monitor-glean-backend

6e41db9

rhelmer requested review from akkomar and removed request for akkomar June 26, 2024 22:39

akkomar approved these changes Jun 27, 2024

Merge branch 'main' into MNTOR-3096/add-monitor-glean-backend

5016c53

rhelmer merged commit 589bf8c into main Jun 27, 2024
16 checks passed

rhelmer deleted the MNTOR-3096/add-monitor-glean-backend branch June 27, 2024 17:37
