Project Overview

This project focuses on predicting customer happiness based on survey responses from a select customer cohort in the logistics and delivery domain. The main objective is to analyze the provided dataset, preprocess the data, and build classification models to predict customer happiness.

Approach

  1. Exploratory Data Analysis and Preprocessing

    • Explore the dataset to gain an initial understanding of the survey responses.
    • Handle missing values, if any, with appropriate strategies such as imputation or removal.
    • Detect and handle outliers with the Isolation Forest algorithm, using a 10% contamination rate as the baseline (see the sketches after this list).
    • Analyze feature correlations with a correlation matrix to understand the relationships between variables.
  2. Classification Models (80-20 Split)

    • Shortlist candidate classifiers with the LazyPredict library, which quickly benchmarks many models with default settings.
    • Fix a random state for the train/test split to ensure reproducibility (see the sketches after this list).
    • Train and evaluate classifiers such as XGBoost, ExtraTreesClassifier, DecisionTreeClassifier, and RandomForestClassifier.
    • Tune the XGBoost hyperparameters with cross-validated grid search.
    • Run a SHAP (SHapley Additive exPlanations) analysis on the tuned XGBoost model to interpret feature importances.
    • Tune the ExtraTreesClassifier hyperparameters with cross-validated grid search.
    • Run a SHAP analysis on the tuned ExtraTreesClassifier model to interpret feature importances.
  3. Data Augmentation

    • Apply data augmentation, starting with the Synthetic Minority Over-sampling Technique (SMOTE), to address any class imbalance (see the sketches after this list).
    • Consider downsampling the data and correcting skewed features to improve the performance of the classification models.
    • Re-evaluate the XGBoost and ExtraTreesClassifier models on the augmented data.
  4. Feature Engineering

    • Identify less significant features from the model analysis and domain knowledge.
    • Remove or transform these features (see the sketches after this list).
    • Re-evaluate the models after feature engineering to assess the impact on predictive accuracy.
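
The sketches below illustrate each step with hypothetical code. The file name survey.csv, the label column Y, and all parameter values are assumptions for illustration, not the actual project configuration. This first sketch covers the outlier handling and correlation analysis from step 1, with the 10% contamination rate as the Isolation Forest baseline.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Load the survey responses (file name and label column are placeholders).
df = pd.read_csv("survey.csv")
X = df.drop(columns=["Y"])   # "Y" assumed to be the happiness label (0/1)
y = df["Y"]

# Flag roughly 10% of rows as outliers, matching the 10% baseline above.
iso = IsolationForest(contamination=0.10, random_state=42)
inlier_mask = iso.fit_predict(X) == 1   # fit_predict returns 1 for inliers, -1 for outliers
X_clean, y_clean = X[inlier_mask], y[inlier_mask]

# Pairwise feature correlations for the EDA step.
print(X_clean.corr().round(2))
```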
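A sketch of step 2, continuing from the variables above: the 80-20 split, the LazyPredict screening, a cross-validated grid search over XGBoost, and the SHAP analysis. The parameter grid is illustrative only.

```python
from sklearn.model_selection import train_test_split, GridSearchCV
from lazypredict.Supervised import LazyClassifier
from xgboost import XGBClassifier
import shap

# 80-20 split with a fixed random state for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X_clean, y_clean, test_size=0.2, random_state=42, stratify=y_clean
)

# Quick screening of many classifiers with LazyPredict.
models, _ = LazyClassifier().fit(X_train, X_test, y_train, y_test)
print(models.head())

# Cross-validated grid search over a small, illustrative XGBoost grid.
param_grid = {"n_estimators": [100, 300], "max_depth": [3, 5], "learning_rate": [0.05, 0.1]}
grid = GridSearchCV(XGBClassifier(eval_metric="logloss"), param_grid, cv=5, scoring="accuracy")
grid.fit(X_train, y_train)
print("test accuracy:", grid.score(X_test, y_test))

# SHAP analysis of the tuned model to interpret feature importances.
explainer = shap.TreeExplainer(grid.best_estimator_)
shap.summary_plot(explainer.shap_values(X_test), X_test)
```

The same grid-search and SHAP pattern applies to ExtraTreesClassifier by swapping the estimator and its parameter grid.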
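A sketch of the SMOTE step from step 3, applied to the training split only so the test set stays untouched.

```python
from collections import Counter
from imblearn.over_sampling import SMOTE

# Oversample the minority class in the training data.
X_res, y_res = SMOTE(random_state=42).fit_resample(X_train, y_train)
print("class counts before:", Counter(y_train), "after:", Counter(y_res))

# Re-fit the tuned XGBoost model on the augmented data and re-evaluate.
model = grid.best_estimator_.fit(X_res, y_res)
print("test accuracy after SMOTE:", model.score(X_test, y_test))
```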
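A sketch of the feature-engineering step from step 4; dropping the two weakest features is an illustrative threshold, not the project's actual choice.

```python
import pandas as pd
from xgboost import XGBClassifier

# Rank features by the tuned model's importances and drop the weakest ones.
importances = pd.Series(
    grid.best_estimator_.feature_importances_, index=X_train.columns
).sort_values()
to_drop = importances.index[:2].tolist()
print("dropping:", to_drop)

# Re-train and re-evaluate on the reduced feature set.
model_fe = XGBClassifier(eval_metric="logloss").fit(X_train.drop(columns=to_drop), y_train)
print("test accuracy after feature removal:",
      model_fe.score(X_test.drop(columns=to_drop), y_test))
```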

The project aims to provide insights into customer happiness prediction, compare the performance of different classification models, apply data augmentation techniques, and refine the feature set to improve predictive accuracy. Throughout the process, findings and observations are documented and the results communicated clearly.
