Moving Average and Weighted Moving Average
Showing 2 changed files with 45 additions and 0 deletions.
16 changes: 16 additions & 0 deletions
Advanced SQL for Data Science - Time Series/03.Time Series Analysis/04.Moving Average.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
/********** Moving Average *************/ | ||
|
||
/* | ||
We want to know the hourly CPU utilization. | ||
So we can create a sliding window to get the average CPU utilization over 1 hour. | ||
In our case, a log row is inserted every 5 minutes, so 1 hour covers 12 logs (transactions). | ||
NOTE: | ||
The OVER clause says: order by event time, then take the current row plus the 11 rows before it (12 rows in total). | ||
The average function is then applied to that set of data. | ||
*/ | ||
|
||
SELECT | ||
event_time, server_id, | ||
AVG(cpu_utilization) OVER (ORDER BY event_time ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS hourly_cpu_util | ||
FROM time_series.utilization; |
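A frame written as `ROWS BETWEEN n PRECEDING AND CURRENT ROW` contains up to n + 1 rows, so a 12-row (one hour) window needs `11 PRECEDING`. The query above also orders by `event_time` only, so if the table holds several servers their rows share one window. The sketch below illustrates both points on generated data. It assumes PostgreSQL 11 or later (for `RANGE` with an interval offset); the CTE `sample_utilization` and its values are hypothetical stand-ins for `time_series.utilization`, and partitioning by `server_id` is an assumption about the intent, not something the commit states.

```sql
-- Hypothetical sample data: 2 servers, one reading every 5 minutes for 2 hours.
WITH sample_utilization AS (
    SELECT
        ts       AS event_time,
        s        AS server_id,
        0.25 * s AS cpu_utilization
    FROM generate_series(
             TIMESTAMP '2019-03-05 00:00',
             TIMESTAMP '2019-03-05 02:00',
             INTERVAL '5 minutes'
         ) AS ts
    CROSS JOIN generate_series(1, 2) AS s
)
SELECT
    event_time,
    server_id,
    -- Frame size check: n PRECEDING plus the current row is n + 1 rows.
    COUNT(*) OVER (w ROWS BETWEEN 12 PRECEDING AND CURRENT ROW) AS rows_in_12_preceding,  -- grows to 13
    COUNT(*) OVER (w ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS rows_in_11_preceding,  -- grows to 12
    -- Row-based one-hour average, computed per server.
    AVG(cpu_utilization) OVER (w ROWS BETWEEN 11 PRECEDING AND CURRENT ROW) AS hourly_cpu_util,
    -- Time-based alternative: the current reading plus everything in the 55 minutes before it.
    AVG(cpu_utilization) OVER (w RANGE BETWEEN INTERVAL '55 minutes' PRECEDING AND CURRENT ROW) AS hourly_cpu_util_by_time
FROM sample_utilization
WINDOW w AS (PARTITION BY server_id ORDER BY event_time)
ORDER BY server_id, event_time;
```

The row-based and time-based frames agree only while readings arrive exactly every 5 minutes. If log rows can be missing, the `RANGE` frame still covers one hour of wall-clock time, while the `ROWS` frame reaches further back.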
29 changes: 29 additions & 0 deletions
...SQL for Data Science - Time Series/03.Time Series Analysis/05.Weighted Moving Average.sql
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
/********* Weighted Moving Average ************/ | ||
|
||
/* | ||
The idea is to give MORE weight to MORE RECENT events than to older ones. | ||
We want a weighted average of the daily average temperature over the 3 days before each date, | ||
so we sum those 3 daily averages, each multiplied by its weight (0.5, 0.333, 0.167; the weights sum to 1). | ||
*/ | ||
WITH daily_avg_temp AS ( | ||
SELECT | ||
DATE_TRUNC('day', event_time) as event_date, | ||
ROUND(AVG(temp_celcius),2) AS avg_temp | ||
FROM time_series.location_temp | ||
GROUP BY DATE_TRUNC('day', event_time) | ||
) | ||
SELECT | ||
event_date, avg_temp, | ||
(SELECT ROUND(avg_temp,2) * 0.5 | ||
FROM daily_avg_temp d2 | ||
WHERE d2.event_date = d1.event_date - INTERVAL '1' day | ||
) + | ||
(SELECT ROUND(avg_temp,2) * 0.333 | ||
FROM daily_avg_temp d3 | ||
WHERE d3.event_date = d1.event_date - INTERVAL '2' day | ||
) + | ||
(SELECT ROUND(avg_temp,2) * 0.167 | ||
FROM daily_avg_temp d4 | ||
WHERE d4.event_date = d1.event_date - INTERVAL '3' day | ||
) AS three_days_weighted_avg_temp | ||
FROM daily_avg_temp d1; |
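The three correlated subqueries above can also be written with the `LAG` window function, which reads the table once. The sketch below is one possible rewrite, not the author's method. It replaces the `daily_avg_temp` CTE with hypothetical inline values so it runs without `time_series.location_temp`, and it assumes there is exactly one row per calendar day with no gaps.

```sql
-- Hypothetical daily averages standing in for the daily_avg_temp CTE.
WITH daily_avg_temp (event_date, avg_temp) AS (
    VALUES
        (DATE '2019-03-01', 10.00),
        (DATE '2019-03-02', 20.00),
        (DATE '2019-03-03', 30.00),
        (DATE '2019-03-04', 40.00)
)
SELECT
    event_date,
    avg_temp,
    -- Same weights as the subquery version: most recent prior day weighs most.
      0.5   * LAG(avg_temp, 1) OVER w   -- 1 day back
    + 0.333 * LAG(avg_temp, 2) OVER w   -- 2 days back
    + 0.167 * LAG(avg_temp, 3) OVER w   -- 3 days back
      AS three_days_weighted_avg_temp
FROM daily_avg_temp
WINDOW w AS (ORDER BY event_date)
ORDER BY event_date;
-- For 2019-03-04: 0.5*30 + 0.333*20 + 0.167*10 = 23.33
-- The first three dates yield NULL because fewer than 3 prior days exist.
```

`LAG` counts rows rather than calendar days, so it matches the date-arithmetic version only when no day is missing. With gaps in the data, the subquery form returns NULL for the affected dates, while `LAG` would silently reach back to older days.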