This project develops a predictive model to identify customers at risk of churn within the Syriatel network. Using machine learning techniques, the goal is to predict churn so that the company can implement customer retention actions.
The dataset used for this project contains various features related to customer behavior, usage patterns, demographics, and customer service interactions.
- Exploratory Data Analysis (EDA) was conducted to understand the distribution of features and identify potential patterns.
- Data cleaning techniques were applied to handle missing values and outliers.
- Feature engineering was performed to create new features and extract useful information from existing ones.
- Data encoding was applied to convert categorical variables into numerical format for modeling.
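The cleaning and encoding steps above can be sketched roughly as follows. This is a minimal illustration using pandas, not the project's actual pipeline: the toy rows and the column names (`international_plan`, `state`, `total_day_minutes`, `churn`) are assumptions made for the example, as is the choice of median imputation.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative assumptions,
# not the real dataset schema.
df = pd.DataFrame({
    "international_plan": ["yes", "no", "no", "yes"],
    "state": ["KS", "OH", "KS", "NJ"],
    "total_day_minutes": [265.1, 161.6, None, 299.4],
    "churn": [False, False, True, True],
})

# Data cleaning: fill the missing numeric value with the column median
# (one possible strategy; the project may have used another).
median_minutes = df["total_day_minutes"].median()
df["total_day_minutes"] = df["total_day_minutes"].fillna(median_minutes)

# Encoding: map a binary yes/no column to 1/0.
df["international_plan"] = df["international_plan"].map({"yes": 1, "no": 0})

# Encoding: one-hot encode a multi-category column.
df = pd.get_dummies(df, columns=["state"], dtype=int)

# Convert the boolean target to 0/1 for modeling.
df["churn"] = df["churn"].astype(int)

print(df.head())
```

One-hot encoding avoids implying an order between categories, which matters for distance-based and linear models such as KNN and logistic regression.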
Several machine learning models were trained and evaluated for predicting customer churn, including:
- Logistic Regression
- K-Nearest Neighbors (KNN)
- Decision Tree
- Random Forest
- XGBoost
Each model was trained on the preprocessed dataset and evaluated using performance metrics such as accuracy, precision, recall, and the area under the ROC curve (AUC).
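As a rough sketch of this train-and-evaluate loop, the example below fits several of the listed models with scikit-learn and reports the four metrics. It runs on synthetic data from `make_classification` as a stand-in for the preprocessed churn dataset, and the hyperparameters and split ratio are assumptions, not the project's settings. XGBoost is left out to keep the example dependency-free; `xgboost.XGBClassifier` follows the same `fit` / `predict_proba` interface and could be added to the dictionary.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the preprocessed churn data (assumption).
X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.85, 0.15], random_state=42)

# Stratify so the minority (churn) class is represented in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42)

# Illustrative default-ish hyperparameters, not the project's tuned values.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard class labels
    y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred, zero_division=0),
        "recall": recall_score(y_test, y_pred, zero_division=0),
        "auc": roc_auc_score(y_test, y_score),
    }

for name, metrics in results.items():
    print(name, {k: round(v, 3) for k, v in metrics.items()})
```

Because churners are typically a minority class, accuracy alone can look good for a model that rarely predicts churn, so recall and AUC are worth reading alongside it.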
The best-performing model was selected by comparing these evaluation metrics across all candidates. The selected model will be deployed in production to predict customer churn in real time.
By accurately identifying customers at risk of churn, Syriatel can implement targeted retention strategies to improve customer satisfaction and reduce churn rates, ultimately leading to increased business revenue and profitability.