GitHub - SahandNoey/Data-Mining-Course-Projects

Data Mining Course Projects

Data Cleaning
Classification using Regression
Dimensionality Reduction using PCA
Over-Sampling using SMOTE
Trained Classifiers: SVM, KNN, Logistic Regression, and Decision Tree
Used Grid Search for each classifier
Used Cross Validation for Grid Search
Used Bar Plot to show each classifier's Accuracy, Precision, F1 Score, and ROC AUC Score
Used Confusion Matrix plot for classification results
Clustering using K-means and DBSCAN to identify groups of customers with similar characteristics
Used Silhouette Score to measure cluster cohesion and separation
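
As a rough illustration of the tuning and evaluation steps listed above, the sketch below runs a cross-validated grid search over one classifier and then scores K-means clusterings with the silhouette coefficient. It is a minimal sketch, not the notebooks' actual code: the data is synthetic (`make_classification`, `make_blobs`), the parameter grid and the choice of KNN are assumptions, and SMOTE, PCA, and DBSCAN are left out to keep it short.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification
from sklearn.metrics import f1_score, silhouette_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption); the repository's dataset is not used here.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scaling sits inside the pipeline so every CV fold is scaled
# using only its own training portion.
pipe = Pipeline([("scale", StandardScaler()), ("knn", KNeighborsClassifier())])
param_grid = {"knn__n_neighbors": [3, 5, 7, 9]}  # hypothetical grid

# 5-fold cross validation for each candidate in the grid.
search = GridSearchCV(pipe, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

best_params = search.best_params_
test_f1 = f1_score(y_test, search.predict(X_test))
print("best params:", best_params, "test F1:", round(test_f1, 3))

# Clustering: compare several values of k by silhouette score
# (values near 1 indicate tight, well-separated clusters).
X_blobs, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)
silhouettes = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_blobs)
    silhouettes[k] = silhouette_score(X_blobs, labels)

best_k = max(silhouettes, key=silhouettes.get)
print("silhouette by k:", silhouettes, "best k:", best_k)
```

The same `GridSearchCV` pattern would apply to SVM, Logistic Regression, or Decision Tree by swapping the final pipeline step and its grid.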

Preprocessed Pandas DataFrame
Visualized Data Distribution using Histogram
Visualized Data Correlation using Pair Plot and Heatmap
Trained Models using Linear Regression, Polynomial Regression, Ridge Regression, Lasso Regression, Elastic Net Regression, and XGBoost Regression
Evaluated Model Predictions using Mean Squared Error (MSE) and R2 Score
Libraries: NumPy, Pandas, Matplotlib, Seaborn, and scikit-learn
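
The model comparison described above could look roughly like the following. This is a sketch under stated assumptions: the data comes from `make_regression` rather than the project's DataFrame, the `alpha`, `l1_ratio`, and polynomial degree values are arbitrary, and XGBoost is omitted so that only scikit-learn is required.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in data (assumption); not the project's dataset.
X, y = make_regression(n_samples=400, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameter values below are illustrative, not tuned.
models = {
    "linear": LinearRegression(),
    "polynomial (degree 2)": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
    "ridge": Ridge(alpha=1.0),
    "lasso": Lasso(alpha=0.1),
    "elastic net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    results[name] = {
        "mse": mean_squared_error(y_test, pred),
        "r2": r2_score(y_test, pred),
    }
    print(f"{name}: MSE={results[name]['mse']:.2f}, R2={results[name]['r2']:.3f}")
```

Lower MSE and an R2 closer to 1 indicate a better fit on the held-out split.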

Libraries: NumPy, Pandas, Matplotlib, Seaborn, and scikit-learn
Preprocessed Dataset
Visualized Dataset using Scatter Plot, Histogram, Box Plot, and Pair Plot
Encoded and Normalized Dataset
Implemented Principal Component Analysis (PCA) from scratch
Visualized PCA-reduced Data
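
One common way to implement PCA from scratch with NumPy is via an eigendecomposition of the covariance matrix, sketched below. Whether the notebook uses this route or SVD is not stated here, so treat the function name `pca`, its return values, and the random demo data as assumptions made for illustration.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples, n_features) onto its top principal components."""
    X = np.asarray(X, dtype=float)
    # Center each feature so the covariance matrix describes spread around the mean.
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # eigh suits symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(cov)
    # Keep the indices of the largest eigenvalues, largest first.
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]
    explained_ratio = eigvals[order] / eigvals.sum()
    return X_centered @ components, components, explained_ratio


# Demo on random data with deliberately unequal feature scales (assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
X_reduced, components, explained_ratio = pca(X_demo, n_components=2)
print(X_reduced.shape, explained_ratio)
```

The two columns of `X_reduced` can be passed directly to a scatter plot to visualize the reduced data.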

Repository contents:
.ipynb_checkpoints
P1
P2
P3
README.md