feat: backfill licensed users #8791

Merged on Nov 20, 2024 (4 commits)

sjaanus (Contributor) commented on Nov 19, 2024:

This migration introduces a query that calculates the licensed user counts and inserts them into the licensed_users table.

The logic ensures that:

  1. All users created up to a specific date are included as active users until they are explicitly deleted.
  2. Deleted users are excluded after their deletion date, except when the deletion falls within the 30 days before the date being counted, or when the deletion date is before their creation date.
  3. The migration avoids duplicating data by ensuring records are only inserted if they don’t already exist in the licensed_users table.

Logic Breakdown:

  1. Identify user events (user_events): Extracts email addresses from the user-related events (user-created and user-deleted), along with each event's type and timestamp. This makes it possible to tell creations and deletions apart.
  2. Generate a date range (dates): Creates a continuous range of dates from the earliest recorded event up to the current date, so that every date is analyzed, even those without events.
  3. Determine active users (active_emails): Joins dates with user events to determine, for each email address on a given day:

     • The user's creation date.
     • The user's deletion date (if applicable).

  4. Calculate daily active user counts (result): For each date, counts the distinct email addresses for which at least one of these conditions holds:

     • The user has no deletion date.
     • The user's deletion date is within the 30 days before the date being counted.
     • The user's deletion date is before their creation date (for example, the address was created again after being deleted).

@@ -0,0 +1,85 @@
exports.up = (db, cb) => {
db.runSql(`
WITH user_events AS (

sjaanus (Contributor Author) commented:

This CTE just selects the relevant events.

WHERE
type IN ('user-created', 'user-deleted')
),
dates AS (

sjaanus (Contributor Author) commented:

Generates every date needed, from the first event up to today.

FROM
generated_dates
),
active_emails AS (

sjaanus (Contributor Author) commented on Nov 19, 2024:

For each date and each email, find the most recent created and deleted events before the date.

d.date,
ue.email
),
result AS (

sjaanus (Contributor Author) commented on Nov 19, 2024:

Count the emails for each date:

  1. Add 1 if the deletion date is empty, the deletion happened within the last 30 days, or the deletion happened before the creation.
  2. Otherwise add 0, because the email was not active on that day.

d.date
ORDER BY
d.date
) INSERT INTO licensed_users (date, count)

sjaanus (Contributor Author) commented:

Insert the results for every date into the database without overwriting existing rows.
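
The "insert without overwriting" behaviour can be illustrated in memory. This is a sketch of the semantics only; `insertMissing` and the `{ date, count }` row shape are hypothetical, and the migration does this in SQL against the licensed_users table rather than in JavaScript.

```javascript
// Hypothetical helper: merge backfilled rows into existing rows,
// keeping the existing row whenever a date is already present.
function insertMissing(existing, backfilled) {
    const byDate = new Map(existing.map((row) => [row.date, row]));
    for (const row of backfilled) {
        if (!byDate.has(row.date)) {
            byDate.set(row.date, row); // only dates that are not stored yet
        }
    }
    // ISO dates sort correctly as plain strings.
    return [...byDate.values()].sort((a, b) => a.date.localeCompare(b.date));
}

const merged = insertMissing(
    [{ date: '2024-11-02', count: 7 }],
    [
        { date: '2024-11-01', count: 3 },
        { date: '2024-11-02', count: 4 },
    ],
);
console.log(merged);
// [ { date: '2024-11-01', count: 3 }, { date: '2024-11-02', count: 7 } ]
```

The existing count for 2024-11-02 survives, which is the property the comment above describes.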

Comment on lines +96 to +97
expect(rows.find((row) => row.date === '2024-11-04').count).toBe(1);
expect(rows.find((row) => row.date === '2024-11-05').count).toBe(0); // 30 days has passed

A contributor commented:

This might be the same thing I pointed out in a separate PR, so feel free to ignore it if you don't think it's relevant, but October is a 31-day month. So if you create the user on the 1st and delete them on the 2nd, the 31st will be the 30th day after deletion. Extrapolating, that means that November 3rd is the 30-day mark. Should we switch this to check November 4th? Or should it be the 3rd? Again, because in reality this will come down to milliseconds, it probably doesn't matter much. I'm sure you've got it covered.

If someone is deleted on the 5th, does the 5th count as the first day of 30 or is the 6th the first day?
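
The two readings of that question can be made concrete with a small date calculation. This is a sketch under stated assumptions: `lastCountedDay` is a hypothetical helper, dates are treated as whole UTC days (the real comparison may involve timestamps), and the deletion dates below are picked for illustration.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Add n whole days to a 'YYYY-MM-DD' date, working in UTC to avoid DST shifts.
const addDays = (iso, n) =>
    new Date(Date.parse(`${iso}T00:00:00Z`) + n * DAY_MS).toISOString().slice(0, 10);

// Last date on which a deleted user would still be counted, under two conventions.
function lastCountedDay(deletedOn, retentionDays = 30) {
    return {
        // The deletion day itself is day 1 of the window.
        inclusive: addDays(deletedOn, retentionDays - 1),
        // The day after the deletion is day 1 of the window.
        exclusive: addDays(deletedOn, retentionDays),
    };
}

console.log(lastCountedDay('2024-10-02'));
// { inclusive: '2024-10-31', exclusive: '2024-11-01' }
console.log(lastCountedDay('2024-10-05'));
// { inclusive: '2024-11-03', exclusive: '2024-11-04' }
```

If the deletion in the test were on 2024-10-05, a count of 1 on 2024-11-04 and 0 on 2024-11-05 would correspond to the exclusive convention, where the day after the deletion is the first of the 30.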

sjaanus merged commit 4234020 into main on Nov 20, 2024 (12 checks passed).
sjaanus deleted the backfill-licensed-users branch on November 20, 2024 at 07:10.