This repository contains a project that uses a machine learning pipeline to analyze telecom customer data and predict the likelihood of churn, using a Logistic Regression model and advanced preprocessing techniques.

RebeccaMorolong/TelecomPrediction

📞 Telecom Churn Prediction

Predicting customer churn is critical for telecom companies aiming to retain valuable customers and reduce revenue loss. This project builds a machine learning pipeline to analyze telecom customer data and predict the likelihood of churn, using a Logistic Regression model and advanced preprocessing techniques.


🚀 Project Overview

  • Goal: Predict which customers are likely to churn based on their usage and account features.
  • Techniques:
    • Data preprocessing and feature selection
    • Handling class imbalance with SMOTE
    • Logistic Regression modeling
    • Feature importance analysis
    • Model evaluation (accuracy, confusion matrix, classification report)

📂 Dataset

Source:

  • Rows: 3,333
  • Features: 11 (including Churn, AccountWeeks, ContractRenewal, DataPlan, DataUsage, CustServCalls, DayMins, DayCalls, MonthlyCharge, OverageFee, RoamMins)

Project Notebook

telecom_churn.csv
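A quick schema check after loading can confirm that the file matches the description above. The sketch below assumes the CSV is available locally as `telecom_churn.csv`. The helper names `load_churn_data` and `missing_columns` are hypothetical and not part of the notebook.

```python
import pandas as pd

# Column names as listed in the dataset description above.
EXPECTED_COLUMNS = [
    "Churn", "AccountWeeks", "ContractRenewal", "DataPlan", "DataUsage",
    "CustServCalls", "DayMins", "DayCalls", "MonthlyCharge", "OverageFee",
    "RoamMins",
]


def missing_columns(df: pd.DataFrame) -> list:
    """Return the expected columns that are absent from df (hypothetical helper)."""
    return [col for col in EXPECTED_COLUMNS if col not in df.columns]


def load_churn_data(path: str = "telecom_churn.csv") -> pd.DataFrame:
    """Load the CSV and fail early if the schema differs (path is an assumption)."""
    df = pd.read_csv(path)
    missing = missing_columns(df)
    if missing:
        raise ValueError(f"Missing expected columns: {missing}")
    return df
```

Calling `load_churn_data()` and then inspecting `df.shape` and `df["Churn"].value_counts(normalize=True)` shows the row count and the class balance before any modeling.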

🛠️ Project Structure


📊 Key Steps

  1. Data Preprocessing

    • Handle missing values
    • Encode categorical variables
    • Scale features
  2. Class Imbalance Handling

    • Stratified train-test split
    • Synthetic Minority Oversampling Technique (SMOTE)
  3. Modeling

    • Logistic Regression (with class_weight and/or SMOTE)
    • Feature importance analysis
  4. Evaluation

    • Accuracy, Confusion Matrix, Classification Report
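The steps above can be sketched end to end with scikit-learn alone. This is a minimal sketch under stated assumptions, not the notebook's implementation. It assumes all feature columns are numeric, fills missing values with the median, and handles imbalance with `class_weight="balanced"`, the alternative to SMOTE named in the modeling step. The function name `train_and_evaluate` and the synthetic demo frame are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_and_evaluate(df: pd.DataFrame, target: str = "Churn"):
    """Fit a scaled Logistic Regression and return it with test-set metrics."""
    X = df.drop(columns=[target])
    y = df[target]

    # Stratify so both splits keep the original churn rate.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=42
    )

    # Imputer and scaler are fitted on the training split only, inside the pipeline.
    model = make_pipeline(
        SimpleImputer(strategy="median"),
        StandardScaler(),
        LogisticRegression(class_weight="balanced", max_iter=1000),
    )
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)

    metrics = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred, zero_division=0),
    }
    return model, metrics


# Synthetic stand-in data so the sketch runs without the CSV (not real results).
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "CustServCalls": rng.integers(0, 10, size=500),
    "DayMins": rng.normal(180, 50, size=500),
})
demo["Churn"] = (demo["CustServCalls"] + rng.normal(0, 2, size=500) > 7).astype(int)

model, metrics = train_and_evaluate(demo)
print(metrics["confusion_matrix"])
print(metrics["report"])
```

On an imbalanced target, accuracy alone can look good while most churners are missed, so the per-class recall in the classification report is usually the more informative number.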

🖼️ Example Visualizations

Feature importance output

  • Feature importance using the Linear Regression model (output1)
  • Feature importance using the XGBoost model
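One common way to read feature importance from a Logistic Regression model is to compare coefficient magnitudes after standardizing the features. The sketch below illustrates that idea on synthetic data. It is an assumption about the approach, not necessarily how the plots above were produced, and `coefficient_importance` is a hypothetical helper.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler


def coefficient_importance(X: pd.DataFrame, y) -> pd.Series:
    """Rank features by absolute standardized Logistic Regression coefficient."""
    X_scaled = StandardScaler().fit_transform(X)
    clf = LogisticRegression(max_iter=1000).fit(X_scaled, y)
    coefs = pd.Series(clf.coef_[0], index=X.columns)
    # Order by magnitude, but keep the sign so the direction stays visible.
    order = coefs.abs().sort_values(ascending=False).index
    return coefs.reindex(order)


# Synthetic stand-in data: only "signal" drives the label.
rng = np.random.default_rng(0)
X_demo = pd.DataFrame({
    "signal": rng.normal(size=400),
    "noise": rng.normal(size=400),
})
y_demo = (X_demo["signal"] + 0.3 * rng.normal(size=400) > 0).astype(int)

ranking = coefficient_importance(X_demo, y_demo)
print(ranking)
```

A positive coefficient means the feature pushes the prediction toward the positive class (churn, in this project), and a negative one pushes it away. Calling `ranking.plot.barh()` with matplotlib installed gives a bar chart similar in spirit to the figures listed above.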

⚙️ Requirements

  • Python 3.8+
  • pandas
  • numpy
  • scikit-learn
  • imbalanced-learn
  • matplotlib
  • seaborn

🤝 Contributing

Contributions are welcome! Please fork the repository and submit a pull request.


📄 License

This project is licensed under the MIT License.


📬 Contact

For questions or feedback, please open an issue or contact [[email protected]].


Empowering telecoms with data-driven retention strategies!


🏁 Getting Started

1. Clone the repository
