This repository contains code and information for a machine learning model that predicts whether an individual is a smoker or drinker based on various features. The model uses several different classifiers, including Random Forest, Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Logistic Regression, Gaussian Naive Bayes (GaussianNB), k-Nearest Neighbors (KNN), and Extreme Gradient Boosting (XGBoost).
Machine learning models can be valuable tools for predicting whether an individual is a smoker or drinker based on various factors such as age, gender, socioeconomic status, and more. This project explores different classifiers to build a predictive model and compare their performance.
- Clone this repository to your local machine:

  ```shell
  git clone https://github.com/Pushkarm029/Smoking_ML_Model
  ```

- Navigate to the project directory (the directory name matches the repository name, which matters on case-sensitive filesystems):

  ```shell
  cd Smoking_ML_Model
  ```
This project employs the following classifiers to predict whether an individual is a smoker or drinker:
- Random Forest: A popular ensemble learning method that combines multiple decision trees for classification.
- Support Vector Machine (SVM): A classifier that finds the maximum-margin hyperplane separating the classes, and can handle non-linear boundaries via kernel functions.
- Multi-Layer Perceptron (MLP): A neural network model with multiple layers that can capture complex patterns in the data.
- Logistic Regression: A simple yet effective linear classifier that models the probability of a binary outcome.
- Gaussian Naive Bayes (GaussianNB): A probabilistic classifier based on Bayes' theorem and the assumption of Gaussian-distributed features.
- k-Nearest Neighbors (KNN): A non-parametric algorithm that classifies data points based on the majority class among their k-nearest neighbors.
- Extreme Gradient Boosting (XGBoost): An optimized gradient boosting implementation that sequentially combines weak learners (usually decision trees) into a strong model.
Each classifier is evaluated on accuracy, precision, recall, F1-score, and ROC-AUC.
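A minimal sketch of that evaluation for a single model (the synthetic dataset stands in for the repository's actual feature table, and the metric names mirror the list above):

```python
# Fit one classifier on a synthetic binary dataset and compute the
# five metrics used to compare models in this project.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Synthetic stand-in for the smoker/drinker feature table.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_prob = clf.predict_proba(X_test)[:, 1]  # positive-class probabilities

metrics = {
    "accuracy":  accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall":    recall_score(y_test, y_pred),
    "f1":        f1_score(y_test, y_pred),
    # ROC-AUC is computed from probabilities, not hard labels.
    "roc_auc":   roc_auc_score(y_test, y_prob),
}
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same loop applies unchanged to any of the other classifiers, since they all expose `fit`, `predict`, and `predict_proba`.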