# COMPAS Recidivism Analysis

This repository contains code for analyzing the COMPAS recidivism dataset using various machine learning models. The analysis includes decision trees, XGBoost, and neural networks, along with model interpretability using LIME and SHAP.
## Installation

- Clone the repository:

```bash
git clone https://github.com/yourusername/compas-analysis.git
cd compas-analysis
```

- Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
```

- Install dependencies:

```bash
pip install -r requirements.txt
```

- Initialize DVC:

```bash
dvc init
dvc add data/  # If you have data files to track
```

## Project Structure

- `OAIP_Skeleton.ipynb`: Main notebook containing model training and evaluation
- `LIME_and_SHAP.ipynb`: Model interpretability analysis using LIME and SHAP
- `models/`: Directory containing saved models (tracked by DVC)
- `requirements.txt`: Project dependencies
- `.dvc/`: DVC configuration and cache
## Usage

- Run the main analysis notebook:

```bash
jupyter notebook OAIP_Skeleton.ipynb
```

- Run the interpretability analysis:

```bash
jupyter notebook LIME_and_SHAP.ipynb
```

The models will be automatically saved to your local `models/` directory and tracked by DVC.
## Models

Models are saved locally in the `models/` directory and tracked using DVC. Each model is saved with its metadata in JSON format. The following models are generated:

- Decision Tree: `models/decision_tree_model.joblib`
- XGBoost: `models/xgboost_model.json`
- Neural Network: `models/neural_network_model.keras`
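As an illustration, saving a model alongside its JSON metadata might look like the following. This is a minimal sketch using scikit-learn, joblib, and toy data; the metadata fields shown here are assumptions for illustration, since the actual schema is defined in the notebook.

```python
import json
from pathlib import Path

import joblib
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Toy stand-in for the COMPAS features (the notebook uses the real dataset).
X, y = make_classification(n_samples=200, n_features=5, random_state=42)

model = DecisionTreeClassifier(max_depth=4, random_state=42).fit(X, y)

models_dir = Path("models")
models_dir.mkdir(exist_ok=True)

# Save the fitted model under the filename from the list above.
joblib.dump(model, models_dir / "decision_tree_model.joblib")

# Hypothetical metadata schema -- the actual fields come from the notebook.
metadata = {
    "model_type": "DecisionTreeClassifier",
    "max_depth": 4,
    "train_accuracy": round(model.score(X, y), 4),
}
(models_dir / "decision_tree_metadata.json").write_text(json.dumps(metadata, indent=2))
```

Reloading later is symmetric: `joblib.load("models/decision_tree_model.joblib")` restores the fitted estimator, and the JSON file records how it was trained.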
## Model Interpretability

The `LIME_and_SHAP.ipynb` notebook provides detailed analysis of model predictions using:

- LIME (Local Interpretable Model-agnostic Explanations)
- SHAP (SHapley Additive exPlanations)
## Contributing

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
## License

This project is licensed under the MIT License - see the LICENSE file for details.