Sparkify Project

Capstone Project, Udacity Data Science Nanodegree

GitHub repo: https://github.com/cdumen/DataScientist_Capstone

Project Description

Sparkify is a music streaming service similar to Spotify and Pandora. The data provided is the service's user log, which contains demographic information, user activities, timestamps, and so on. We analyze the log and build a model to identify customers who are highly likely to quit the service, so that marketing offers can be sent to them to prevent churn. We use the F1 score to measure model performance because it balances precision and recall: we don't want to miss too many customers who are likely to churn (recall), and we don't want to waste too much on offers to customers who are not likely to churn (precision). The model we built has an F1 score of nearly 0.800.
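As a rough illustration of why F1 suits this problem, the sketch below computes precision, recall, and F1 from binary churn labels in plain Python. The function name `f1_from_labels` and the sample labels are hypothetical and are not taken from the notebook, which uses PySpark; the numbers are made up for the example and are not the project's results.

```python
def f1_from_labels(y_true, y_pred):
    """Return (precision, recall, f1) for binary labels, where 1 means churn."""
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    # Precision: share of flagged users who actually churned.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of churned users that were flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: 10 churned users and 10 retained users.
y_true = [1] * 10 + [0] * 10
# The model catches 8 of the churners and wrongly flags 2 retained users.
y_pred = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8

precision, recall, f1 = f1_from_labels(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it drops sharply when either precision or recall is low, so a model cannot score well by flagging everyone or by flagging almost no one.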

File Description

Sparkify.ipynb is the main file of the project. It demonstrates the process of using PySpark to explore the data and build the model.

Methodology

ETL
Define customer churn and EDA
Feature engineering
Modeling
Evaluation
Feature importance analysis
