select *
from snapshots.orders_snapshot
where order_guid in (
'914b8929-e04a-40f8-86ee-357f2be3a2a2',
'05202733-0e17-4726-97c2-0520c024ab85',
'939767ac-357a-4bec-91f8-a7b25edd46c9'
)| order_guid | shipping_service | estimated_delivery_at_utc | status |
|---|---|---|---|
| 05202733-0e17-4726-97c2-0520c024ab85 | preparing | ||
| 05202733-0e17-4726-97c2-0520c024ab85 | ups | 2021-02-19 10:15:26 | shipped |
| 914b8929-e04a-40f8-86ee-357f2be3a2a2 | preparing | ||
| 914b8929-e04a-40f8-86ee-357f2be3a2a2 | ups | 2021-02-19 10:15:26 | shipped |
| 939767ac-357a-4bec-91f8-a7b25edd46c9 | preparing | ||
| 939767ac-357a-4bec-91f8-a7b25edd46c9 | ups | 2021-02-19 10:15:26 | shipped |
Our funnel statistics can be derived from the product_funnel view and product_funnel_percents view.
The funnel levels were defined as:
session -> item added to cart -> checkout
Specifically,
- the overall conversion rate is 62.46%.
- the dropoff between sessions and cart adds is 19.20%.
- the dropoff between cart adds and checkouts is 22.70%, and represents the biggest loss in our funnel.
A.
My dbt pitch:
dbt is the tool that allows for structured growth to your company's data. Without dbt it is difficult to grow your data beyond the number of tables one or two data engineers can keep track of in their heads (~50). dbt gives you the tools to properly structure your data from ingestion to business intelligence in a clean, version controlled system.
Recommendations:
Start with some data / models that can be easily moved to dbt and that will quickly yield an MVP business intelligence insight. Show the data engineering team how easy that was to set up and how manageable it is between the model layers and the DAG. Show the C-level the resulting BI exposures and how they can be confident in the data based on the fact that the data pipeline is fully defined and auditable.
Skills picked up: I had no dbt skills at all prior to this course. Through this course, I've built a layer of Analytics Engineering that will compliment my existing Data Science skills. This gives me the confidence to help plan out and build a BI pipeline for our organization.
B.
No need to re-invent the wheel here. Co-worker, and fellow dbt classmate @onegules has outlined an answer to this question that is a good place to start our discussion at Preteckt here.
Session conversion rate:
select sum(checkout)::float / count(*) as conversion_rate
from dbt_chris_r_reporting.dim_sessions0.6245674740484429
Product conversion rate:
select name, round(conversion_percent::decimal,1)
from dbt_chris_r.product_view_facts as conversion_percent
order by name| Name | Conversion (%) |
|---|---|
| Alocasia Polly | 38.9 |
| Aloe Vera | 49.2 |
| Angel Wings Begonia | 38.7 |
| Arrow Head | 54.7 |
| Bamboo | 52.2 |
| Bird of Paradise | 45.0 |
| Birds Nest Fern | 41.3 |
| Boston Fern | 41.3 |
| Cactus | 54.5 |
| Calathea Makoyana | 50.9 |
| Devil's Ivy | 48.9 |
| Dragon Tree | 46.8 |
| Ficus | 42.6 |
| Fiddle Leaf Fig | 47.5 |
| Jade Plant | 47.8 |
| Majesty Palm | 47.8 |
| Money Tree | 46.4 |
| Monstera | 51.0 |
| Orchid | 45.3 |
| Peace Lily | 40.3 |
| Philodendron | 47.6 |
| Pilea Peperomioides | 45.2 |
| Pink Anthurium | 41.9 |
| Ponytail Palm | 39.4 |
| Pothos | 32.8 |
| Rubber Plant | 50.0 |
| Snake Plant | 39.7 |
| Spider Plant | 47.5 |
| String of pearls | 60.0 |
| ZZ Plant | 52.3 |
I added a few test macros, and a macro to convert values to a percent.
Done, but not fully understood.
Done, and made use of the dbt_utils pivot macro.
Check out marts/core/intermediate/dim_sessions.sql for the pivot macro.
greenery/models/marts/product/product_view_facts.sql for my hand-crafted macro.
Any of the staged tables for index addition.
Repeat user rate:
select percent_with_n_or_more_orders
from dbt_chris_r.user_order_count_facts
where order_count=2Answer: 79.84%
Who is unlikely to order again:
- Users with only one order
- Users with a long time since their last purchase. This is tracked in the users fact Mart.
Added models:
core/dim_hourly_ordersto track how many orders in each hour and what ratio have been fulfilled.core/dim_hourly_sessionsto track how many sessions per hour to assess load on infrasturcurecore/dim_userstracking last order, total revenue, promo use.marketing/order_factsto answer marketing questions and merge in how many items ordered.marketing/user_order_count_factsto answer the specific marketing assignment question.product/low_inventory_factsto see what inventory needs to be re-ordered.product/product_view_factsto see what products are viewed often and the time since the last view.
Tests were added to certify uniqueness of columns and that data wasn't missing from critical columns. I don't know how to ensure that the tests are passing regularly. If you know, please comment in the review.
Reflection questions in Slack submission post.
- How many users do we have:
select count(distinct user_guid) from dbt_chris_r.stg_users
- 130
- On average, how many orders do we receive per hour?
select avg(order_count) from dbt_chris_r.hourly_orders
- 7.5208333333333333
- On average, how long does an order take from being placed to being delivered?
select avg(delivered_at_utc - created_at_utc) as avg_delivery_interval from dbt_chris_r.stg_orders
- 3 days 21:24:11.803279
- How many users have only made one purchase? Two purchases? Three+ purchases?
select count(*) from dbt_chris_r.user_purchase_count group by purchase_count having purchase_count=1
- 25
select count(*) from dbt_chris_r.user_purchase_count group by purchase_count having purchase_count=2
- 28
with cte as ( select purchase_count, count(*) as user_count from dbt_chris_r.user_purchase_count group by purchase_count ) select sum(user_count) from cte where purchase_count >= 3
- 71
- On average, how many unique sessions do we have per hour?
with cte as ( select date_trunc, count(*) as session_count from dbt_chris_r.hourly_sessions group by date_trunc ) select avg(session_count) from cte
- 11.7959183673469388
