Skip to content

Latest commit

 

History

History
74 lines (56 loc) · 3.01 KB

File metadata and controls

74 lines (56 loc) · 3.01 KB

Customer Churn Prediction

Overview

This project aims to predict customer churn in a telecommunications company using machine learning. Customer churn, also known as customer attrition, occurs when customers stop doing business with a company. Predicting churn is crucial for businesses to understand and retain their customer base.

Table of Contents

Dataset

The dataset used in this project is located in the data/raw directory and is named customer_churn_dataset.csv. It contains the following columns:

  • CustomerID
  • Name
  • Age
  • Gender
  • Location
  • Subscription_Length_Months
  • Monthly_Bill
  • Total_Usage_GB
  • Churn

Project Structure

This project is organized into the following directories:

  • data/: Contains the dataset in CSV format.
  • scripts/: Includes code scripts for data cleaning, model development, and evaluation.
  • models/: Stores trained machine learning models.
  • notebooks/: Jupyter notebooks detailing the data analysis, model development, and evaluation process.
  • docs/: Documentation files for the project.

Data Preprocessing

In the initial data exploration, we perform the following tasks:

  • Load and explore the dataset.
  • Handle missing data and outliers.
  • Encode categorical variables.
  • Split the data into training and testing sets.

Feature Engineering

To improve the model's prediction accuracy, we generate relevant features from the dataset. We also apply feature scaling or normalization if necessary.

Model Building

We select appropriate machine learning algorithms (e.g., logistic regression, random forest) and train the model on the training dataset. Model performance is evaluated using metrics such as accuracy, precision, recall, and F1-score.

Model Optimization

To enhance model performance, we fine-tune model parameters, explore techniques like cross-validation, and perform hyperparameter tuning.

Model Deployment

Once satisfied with the model's performance, we deploy it in a production-like environment, allowing it to take new customer data as input and provide churn predictions.

Performance Metrics and Visualizations

We use Jupyter notebooks and python scripts to perform model performance metrics and visualizations. The results are included in the notebooks directory.

Requirements

To run this project, you'll need to have Python and the following libraries installed. You can install the required libraries by running the following command:

pip install -r requirements.txt

Contributing

If you'd like to contribute to this project, please follow the Contributing Guidelines.

License

This project is licensed under the MIT License.