Your managers found the data analysis you performed previously very helpful in understanding their customers, products, and overall business. They have now started asking questions about not just the current state of the business but also future trends for use in long range planning. ACME Company has tasked you with several goals related to prediction and forecasting, specifically predicting their overall sales for the next few years and best selling products. The output of this analysis will be an Excel file called 304_lab8_analysis.xlsx
.
To read data into Excel from SQL Server, follow these steps:
- Open Excel. Under the
Data
Menu, selectGet Data
then,From Database
, thenFrom SQL Server Database
.
- Enter
cosc304.ok.ubc.ca
for the server andanalysis
for the database. SelectAdvanced options
and enter the SQL query that you want to perform. ClickOK
.
- For authentication, select
Database
authentication. The user id and password is provided on Canvas. Note that you must be connected to UBC VPN in order to access the database.
- Click the
Load
button to load the data into a worksheet.
Performing predictions is easy in Excel, and there are several methods to do it.
Let's go through the steps to predict the company sales for the next two years (2024, 2025). The initial query to retrieve the data is:
SELECT YEAR(orderDate) as year, SUM(TotalAmount) as totalSales
FROM ordersummary
GROUP BY YEAR(orderDate)
ORDER BY YEAR(orderDate)
Connect to the database using Excel. Execute the query and load the data into Excel. Store data in sheet called salesdata
.
There are several ways to perform a prediction. Excel has a forecast function that will predict an output value based on an input value and the prior data series points.
Another way is to create a graph and add a linear trend line.
It is also possible to use Forecast Sheet. Once you have a data set loaded and selected, on the Data tab, in the Forecast group, click Forecast Sheet. An example walkthrough written by Microsoft is available.
Create a forecast sheet and rename it as initialpredict
similar to shown below.
The results of these forecast methods produce different results. Completing this part is worth 5 marks.
Year | Linear Regression | Linear Treandline | Forecase Sheet |
---|---|---|---|
2024 | $837,071.64 | $837,072.00 | $762,127.98 |
2025 | $938,035.54 | $938,036.00 | $851,584.09 |
Unfortunately, this prediction is not very accurate for several reasons. First, 2019 was only a partial year for sales starting in October. Similarly, the 2023 selling season is not complete with data only up to October. The art of making predictions comes from deciding what to do with incomplete, missing, or inaccurate data. Perform an updated prediction and save in the sheet salesprediction
. Here are the assumptions:
- 2019 data will not be used as it is incomplete.
- 2023 data will be increased by 20% as there are still 2 months of sales. (Note: This estimate could be better as many retailers do a higher percentage of sales around Christmas.)
- Generate an updated sales table and sales prediction for 2024 and 2025.
This second prediction is worth 5 marks.
ACME Company wants to understand its sales by province (state). Generate a query that produces data with the overall sales by province for the years 2020 to 2022. Use this data to predict the province with the highest sales in 2023 and 2024.
Excel has a nice map chart visualization for displaying data on a map. Create a map chart showing the total sales by province for 2022 similar to below.
The goal is to predict the top 3 selling products in 2024 based on historical data from 2020 to 2023. The data table and visualization should be in a sheet called productsales
. This is an open-ended question, and there are multiple ways to approach this. A challenge is that there are 200 products to consider. Although you may use Python or Java to perform code analysis, that is not required. The question is marked based on your approach and discussion more than your final result. Note: Use of Generative AI (including Copilot and ChatGPT) is allowed for this part. Document any sources used to build your analysis.
Marking:
- Write a short paragraph describing your analysis approach. (2 marks)
- Implement your analysis approach in Excel, SQL, and maybe Java/Python code. (5 marks)
- List any sources (website, AI conversations) used to design and implement your approach. (2 marks)
- List your predicted top 3 selling products for 2024 and their predicted sales volume (quantity*price). (1 mark)
Submit your Excel file on Canvas. Bonus marks are available for early review by the TA in a virtual office hours.