Data mining project
Goal:
Apply the CRISP-DM process and the data mining algorithms and techniques covered in class to a real-world dataset. You will conduct your study following the CRISP-DM process in a Jupyter notebook. Make use of the resources provided in class, and install additional libraries for visualization as needed. Students must perform all steps of the CRISP-DM process and report on the results. Specifically, students will apply regression, classification, clustering, and association rule analysis as they develop a prediction model for the Rossmann sales data. All work for the project must be turned in as a Jupyter notebook.
The Project:
Analysis of Rossmann Store Data. Rossmann operates over 3,000 drug stores in 7 European countries. Currently, Rossmann store managers are tasked with predicting their daily sales for up to six weeks in advance. Store sales are influenced by many factors, including promotions, competition, school and state holidays, seasonality, and locality. With thousands of individual managers predicting sales based on their unique circumstances, the accuracy of the results can vary widely.
Rossmann is challenging you to predict 6 weeks of daily sales for 1,115 stores located across Germany. Reliable sales forecasts enable store managers to create effective staff schedules that increase productivity and motivation. By helping Rossmann create a robust prediction model, you will help store managers stay focused on what’s most important to them: their customers and their teams!
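Because the task is to forecast the next six weeks, one common way to check a model is to hold out the final six weeks of the available history for validation and train only on earlier dates. The sketch below illustrates that idea on made-up data. The function name `split_last_weeks` and the sample values are hypothetical, and this is one possible evaluation setup, not a requirement of the assignment.

```python
from datetime import date, timedelta


def split_last_weeks(records, weeks=6):
    """Split (date, value) records into train and validation sets by date.

    The validation set holds the final `weeks` weeks of the data;
    everything on or before the cutoff date goes to training.
    """
    last_day = max(d for d, _ in records)
    cutoff = last_day - timedelta(weeks=weeks)
    train = [(d, v) for d, v in records if d <= cutoff]
    valid = [(d, v) for d, v in records if d > cutoff]
    return train, valid


# Made-up daily sales for one hypothetical store: 100 consecutive days.
start = date(2015, 1, 1)
records = [(start + timedelta(days=i), 5000 + 10 * i) for i in range(100)]

train, valid = split_last_weeks(records)
print(len(train), len(valid))  # 58 42
```

Splitting by date rather than at random keeps future days out of the training set, which mirrors how the forecast would actually be used. The same comparison against a cutoff date works on a DataFrame date column.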
Note: Students may use outside resources to help them with the project. This was a popular Kaggle competition. To get started, look at: https://serhanaya.github.io/kaggle-rossmann-sales-prediction/
This example gives some insight into combining the train and test data with the store data. It also addresses the time series component of the Rossmann sales prediction data. Please remember that while you can use this example, you need to add original insight and modeling results to your project.
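As a starting point for the merge step, the minimal sketch below joins daily sales records with per-store attributes on a shared `Store` key and derives a few calendar features from the `Date` column. The two small DataFrames are made-up stand-ins for the Kaggle `train.csv` and `store.csv` files. The column names are assumed to match those files and should be checked against the actual data, and the helper `merge_with_store` is hypothetical.

```python
import pandas as pd

# Made-up stand-in for train.csv (daily sales per store); values are illustrative only.
train = pd.DataFrame({
    "Store": [1, 1, 2, 2],
    "Date": ["2015-07-30", "2015-07-31", "2015-07-30", "2015-07-31"],
    "Sales": [5020, 5263, 5567, 6064],
    "Promo": [1, 1, 1, 1],
})

# Made-up stand-in for store.csv (one row per store).
store = pd.DataFrame({
    "Store": [1, 2],
    "StoreType": ["c", "a"],
    "CompetitionDistance": [1270.0, 570.0],
})


def merge_with_store(sales: pd.DataFrame, store: pd.DataFrame) -> pd.DataFrame:
    """Attach store attributes to each daily record and add calendar features."""
    # A left join keeps every sales row; validate guards against duplicate store rows.
    merged = sales.merge(store, on="Store", how="left", validate="many_to_one")
    merged["Date"] = pd.to_datetime(merged["Date"])
    merged["Year"] = merged["Date"].dt.year
    merged["Month"] = merged["Date"].dt.month
    merged["DayOfWeek"] = merged["Date"].dt.dayofweek + 1  # 1 = Monday ... 7 = Sunday
    return merged


merged = merge_with_store(train, store)
print(merged)
```

The same helper can be applied to the test data so that both sets carry identical columns before modeling.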