-
Data Cleaning
-
Classification using Regression
-
Dimensionality Reduction using PCA
-
Over-Sampling using SMOTE
-
Trained Classifers: SVM, KNN, Logistic Regression, Decision Tree as training classifiers
-
Used Grid Search for each classifiers
-
Used Cross Validation for Grid Search
-
Used Bar Plot to show each classifiers Accuracy, Precision, F1 Score, and roc_auc Score
-
Useed Confusion Matrix plot for clssification results
-
Clustering using K-means and DBSCAN to identify groups of customers with similar characteristics
-
Used Silhouette Score to measure clustering Cohesion
- Preprocessed Pandas DataFrame
- Visualized Data Distribution using Histogram
- Visualized Data Correlation using Pair Plot and Heatmap
- Trained Models using Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Elastic Net Regression, and XGBoost Regression
- Evaluate Model Prediction using Mean Squared Error(MSE), and R2 Score
- Libraries: NumPy, Pandas, Matplotlib, Seaborn, and scikit-learn
- Used MLXtend library
- Applied TransactionEncoder
- Generated Frequent Itemsets using Apriori algoirthm
- Generated Association Rules
- Libraries: NumPy, Pandas, Matplotlib, Seaborn, and scikit-learn
- Preprocessed Dataset
- Visualized Dataset using Scatter Plot, Histogram, Box Plot, and Pair Plot
- Encoded and Normalized Dataset
- Implemented Principal Component Analysis(PCA) from scratch
- Visualized PCA-reduced Data