- No-Churn Telecom is an established telecom operator in Europe with more than a decade in business. With new players entering the market, the telecom industry has become very competitive, and retaining customers has become a challenge.
- Despite No-Churn's initiatives to reduce tariffs and promote more offers, the churn rate (the percentage of customers migrating to competitors) is well above 10%.
- No-Churn wants to explore whether Machine Learning can help with the following use cases to retain its competitive edge in the industry.
3. Introduce a new predicted variable “CHURN-FLAG” with values YES (1) or NO (0), so that email campaigns with lucrative offers can be targeted at Churn YES customers.
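As a minimal sketch of how the CHURN-FLAG could be derived, assume a trained classifier already yields a churn probability per customer. The 0.5 threshold, the customer ids, and the probabilities below are hypothetical and not taken from the project.

```python
from typing import Dict, List

# Hypothetical threshold: probabilities at or above it are flagged as churners.
CHURN_THRESHOLD = 0.5


def churn_flag(probability: float, threshold: float = CHURN_THRESHOLD) -> int:
    """Return 1 (YES) if the churn probability reaches the threshold, else 0 (NO)."""
    return 1 if probability >= threshold else 0


def campaign_targets(scores: Dict[str, float],
                     threshold: float = CHURN_THRESHOLD) -> List[str]:
    """Return the ids of customers whose CHURN-FLAG is 1, highest risk first."""
    flagged = [cid for cid, p in scores.items() if churn_flag(p, threshold) == 1]
    return sorted(flagged, key=lambda cid: scores[cid], reverse=True)


# Made-up probabilities standing in for a trained classifier's output.
example_scores = {"cust-001": 0.91, "cust-002": 0.12, "cust-003": 0.64}
targets = campaign_targets(example_scores)
print(targets)
```

Lowering the threshold would flag more customers for the campaign at the cost of more offers sent to customers who were not going to leave.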
1. Fetching data from the database.
2. Domain Analysis.
3. EDA: [Univariate, Bivariate & analysis condition]
4. Data Preprocessing/Feature Engineering.
5. Feature Selection.
6. Model Creation.
7. Model Evaluation.
8. Model Comparison.
9. Conclusion.
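Step 1 of the workflow above might look roughly like the sketch below. The project's database engine, credentials, and table name are not stated here, so an in-memory SQLite database with a hypothetical `telecom_churn` table and made-up rows stands in for the real source.

```python
import sqlite3

# In-memory SQLite is a stand-in: the real engine and schema are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE telecom_churn ("
    "state TEXT, account_length INTEGER, international_plan TEXT, "
    "custserv_calls INTEGER, churn TEXT)"
)
# Made-up rows, only here so the query returns something.
conn.executemany(
    "INSERT INTO telecom_churn VALUES (?, ?, ?, ?, ?)",
    [
        ("KS", 128, "no", 1, "no"),
        ("OH", 107, "no", 1, "no"),
        ("NJ", 137, "yes", 4, "yes"),
    ],
)
conn.commit()


def fetch_customers(connection: sqlite3.Connection) -> list:
    """Fetch every row of the table as a dict keyed by column name."""
    connection.row_factory = sqlite3.Row
    cursor = connection.execute("SELECT * FROM telecom_churn")
    return [dict(row) for row in cursor.fetchall()]


rows = fetch_customers(conn)
print(len(rows), "rows fetched; first row:", rows[0])
```

With a different engine only the connection call should need to change, since the query itself is plain SQL.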
- State: 2-letter code of the US state of customer residence.
- Account Length: Number of months the customer has been with the current telco provider.
- Area Code: 3-digit area code.
- Phone: Phone number of the customer.
- International Plan: Whether the customer has an international plan.
- VMail Plan: Whether the customer has a voice mail plan.
- VMail Message: Number of voice-mail messages.
- Day Mins: Total minutes of day calls.
- Day Calls: Total number of day calls.
- Day Charge: Total charge of day calls.
- Eve Mins: Total minutes of evening calls.
- Eve Calls: Total number of evening calls.
- Eve Charge: Total charge of evening calls.
- Night Mins: Total minutes of night calls.
- Night Calls: Total number of night calls.
- Night Charge: Total charge of night calls.
- International Mins: Total minutes of international calls.
- International Calls: Total number of international calls.
- International Charge: Total charge of international calls.
- CustServ Calls: Number of calls to customer service.
- Churn: Whether the customer churned (target variable).
- HANDLING NULL VALUES
- HANDLING CATEGORICAL DATA
- FEATURE SCALING using MinMaxScaler
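The three preprocessing steps above can be illustrated with a small pure-Python sketch. Median imputation and the yes/no mapping are assumptions, since the exact strategies used in the project are not stated. The scaling function applies (x - min) / (max - min), which is the formula scikit-learn's MinMaxScaler uses with its default (0, 1) range. The sample values are made up.

```python
from statistics import median
from typing import List, Optional


def fill_nulls_with_median(values: List[Optional[float]]) -> List[float]:
    """Replace None entries with the median of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def encode_yes_no(values: List[str]) -> List[int]:
    """Map a yes/no column such as International Plan to 1/0."""
    mapping = {"yes": 1, "no": 0}
    return [mapping[v.strip().lower()] for v in values]


def min_max_scale(values: List[float]) -> List[float]:
    """Rescale a column to [0, 1] via (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: nothing to spread out
    return [(v - lo) / (hi - lo) for v in values]


# Made-up sample columns.
day_mins = fill_nulls_with_median([100.0, None, 300.0, 200.0])
intl_plan = encode_yes_no(["yes", "no", "No", " yes"])
day_mins_scaled = min_max_scale(day_mins)
print(day_mins, intl_plan, day_mins_scaled)
```

In a real pipeline the minimum and maximum would be learned from the training split only and reused on the test split, so that no test information leaks into the scaling.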
- Logistic Regression: 71.21%
- Cross validation on Logistic Regression: 86.26%
- Logistic Regression with best hyperparameter: 86.47%
- Support Vector Machine: 85.49%
- Cross validation on SVM: 89.06%
- K-Nearest Neighbor: 87.66%
- Cross validation on KNN: 87.35%
- K-Nearest Neighbor with best hyperparameter: 88.52%
- Decision Tree Classifier: 96.53%
- Cross validation on Decision Tree Classifier: 86.00%
- Decision Tree with best hyperparameter: 85.49%
- Random Forest Classifier: 97.51%
- Cross validation on Random Forest Classifier: 92.18%
- Random Forest with best hyperparameter: 97.83%
- Gradient Boosting: 91.23%
- Cross validation on Gradient Boosting: 92.31%
- Gradient Boosting with best hyperparameter: 98.05%
- XGBoost: 97.94%
- Cross validation on XGBoost: 94.63%
- XGBoost with best hyperparameter: 92.42%
- Artificial Neural Network: 13.52%
- Cross validation on ANN: 92.31%
- ANN with best hyperparameter: 97.94%
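The comparison step can be reproduced from the scores listed above with a few lines of Python. The numbers are copied from that list; the base-minus-cross-validation gap is an extra diagnostic added here for illustration, and a large gap may only hint that a single score was optimistic, since how each base score was measured is not stated.

```python
# Accuracies (%) copied from the list above: (base, cross-validation, tuned).
# None marks a score that the list does not report.
results = {
    "Logistic Regression": (71.21, 86.26, 86.47),
    "Support Vector Machine": (85.49, 89.06, None),
    "K-Nearest Neighbor": (87.66, 87.35, 88.52),
    "Decision Tree": (96.53, 86.00, 85.49),
    "Random Forest": (97.51, 92.18, 97.83),
    "Gradient Boosting": (91.23, 92.31, 98.05),
    "XGBoost": (97.94, 94.63, 92.42),
    "Artificial Neural Network": (13.52, 92.31, 97.94),
}

# Rank by tuned accuracy, skipping models without a tuned score.
tuned = {name: s[2] for name, s in results.items() if s[2] is not None}
best_tuned = max(tuned, key=tuned.get)

# Best model by cross-validated accuracy.
best_cv = max(results, key=lambda name: results[name][1])

# Gap between the base score and the cross-validated score, per model.
gaps = {name: round(s[0] - s[1], 2) for name, s in results.items()}
largest_gap = max(gaps, key=gaps.get)

for name in sorted(tuned, key=tuned.get, reverse=True):
    print(f"{name:<26} tuned={tuned[name]:.2f}%  cv={results[name][1]:.2f}%")
print("Best tuned model:", best_tuned)
print("Best cross-validated model:", best_cv)
print("Largest base-vs-CV gap:", largest_gap, gaps[largest_gap])
```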
In the No-Churn Telecom dataset of 4617 entries, the Gradient Boosting model tuned with the best hyperparameters reached the highest accuracy of all models tested, 98.05%.
This suggests that Gradient Boosting predicts churn well on this dataset and is a reasonable candidate for deployment.
Before the model is applied in real-world scenarios, further analysis should confirm its reliability and generalizability.
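One possible form of that further analysis is to look beyond accuracy at precision and recall for the churn class, since churners are a minority of customers. The sketch below is an assumption about what such a check could look like, not part of the project's reported evaluation, and the labels are made up.

```python
from typing import Dict, List


def confusion_counts(y_true: List[int], y_pred: List[int]) -> Dict[str, int]:
    """Count true/false positives and negatives, with 1 meaning churn."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for actual, predicted in zip(y_true, y_pred):
        if predicted == 1:
            counts["tp" if actual == 1 else "fp"] += 1
        else:
            counts["fn" if actual == 1 else "tn"] += 1
    return counts


def churn_metrics(y_true: List[int], y_pred: List[int]) -> Dict[str, float]:
    """Return accuracy plus precision and recall for the churn class."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    predicted_churn = c["tp"] + c["fp"]
    actual_churn = c["tp"] + c["fn"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total,
        "precision": c["tp"] / predicted_churn if predicted_churn else 0.0,
        "recall": c["tp"] / actual_churn if actual_churn else 0.0,
    }


# Made-up labels: 3 churners among 10 customers.
y_true = [1, 0, 0, 0, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 0, 0, 1, 0, 1]
metrics = churn_metrics(y_true, y_pred)
print(metrics)
```

Recall matters for the email campaign because every missed churner is a customer who never receives a retention offer.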