This project applies Explainable AI (XAI) methods to the Pima Indian Diabetes dataset, using several machine learning models to predict diabetes: a Convolutional Neural Network (CNN), a Multi-Layer Perceptron (MLP), Random Forest Regression, and a Recurrent Neural Network (RNN). Each model's predictions are interpreted with SHAP, LIME, and ALE, three of the most prominent XAI techniques, to provide insight into each algorithm's decision-making process.
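For illustration, here is a minimal sketch of how one of these models, the MLP, might be defined for this tabular task; the framework (Keras) and the layer sizes are assumptions rather than the project's actual configuration.

```python
import tensorflow as tf

def build_mlp(n_features: int = 8) -> tf.keras.Model:
    """Minimal MLP sketch for the 8 Pima features (hypothetical sizes)."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_features,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(Outcome = 1)
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```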
- SHAP (SHapley Additive exPlanations): A game-theoretic approach to explaining the output of any machine learning model. SHAP values quantify the impact of a feature taking its observed value, compared with the prediction we would make if that feature took some baseline value (see the sketch after this list).
- LIME (Local Interpretable Model-agnostic Explanations): LIME explains individual predictions of any classifier in an interpretable and faithful manner by approximating the model locally with an interpretable surrogate (see the sketch after this list).
- ALE (Accumulated Local Effects): ALE plots explain feature contributions by accumulating local effects, showing each feature's main effect on the prediction; they are a faster alternative to partial dependence plots (PDPs) (see the sketch after this list).
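To make the three techniques concrete, the sketches below show one way each could be applied to this dataset. They are illustrative only: the diabetes.csv path, the scikit-learn random forest used as a stand-in model, and the train/test split are assumptions, not the project's actual pipeline. First, SHAP:

```python
import pandas as pd
import shap
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("diabetes.csv")  # assumed local copy of the dataset
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# TreeExplainer computes Shapley values efficiently for tree ensembles;
# shap.Explainer would pick a suitable algorithm for other model types.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Global summary: per-sample feature impact, colored by feature value.
# Older SHAP versions return one array per class (indexed as below);
# newer versions return a single stacked 3-D array instead.
shap.summary_plot(shap_values[1], X_test)
```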
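A corresponding LIME sketch, explaining a single test instance with a local linear surrogate (same assumed setup as above):

```python
import pandas as pd
from lime.lime_tabular import LimeTabularExplainer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("diabetes.csv")  # assumed local copy of the dataset
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# LIME perturbs one instance and fits an interpretable linear model
# to the black-box predictions in that neighbourhood.
explainer = LimeTabularExplainer(
    X_train.values,
    feature_names=X.columns.tolist(),
    class_names=["No Diabetes", "Diabetes"],
    mode="classification",
)
exp = explainer.explain_instance(
    X_test.values[0], model.predict_proba, num_features=8
)
print(exp.as_list())  # (feature condition, local weight) pairs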
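And an ALE sketch. Several Python packages implement ALE; the alibi library is assumed here, and the project may well use a different one:

```python
import pandas as pd
from alibi.explainers import ALE, plot_ale
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("diabetes.csv")  # assumed local copy of the dataset
X, y = df.drop(columns="Outcome"), df["Outcome"]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)
model = RandomForestClassifier(random_state=42).fit(X_train, y_train)

# ALE accumulates local differences in the prediction within narrow
# feature intervals, avoiding the unrealistic extrapolation that PDPs
# perform when features are correlated.
ale = ALE(
    model.predict_proba,
    feature_names=X.columns.tolist(),
    target_names=["No Diabetes", "Diabetes"],
)
exp = ale.explain(X_test.to_numpy())
plot_ale(exp, features=[1, 5, 7])  # Glucose, BMI, Age
```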
The performance achieved by our models on the Pima Indian Diabetes dataset is as follows (accuracy unless noted otherwise):
- MLP: 75%
- CNN: 70%
- RNN: 80%
- Random Forest Regression: F-score of 30%
The dataset utilized is the Pima Indian Diabetes dataset, which consists of several medical predictor variables and one target variable, Outcome. Predictor variables include the number of pregnancies the patient has had, their BMI, insulin level, age, and so on.
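For reference, the full column list can be inspected directly (again assuming a local diabetes.csv copy):

```python
import pandas as pd

df = pd.read_csv("diabetes.csv")  # assumed local copy of the dataset
print(df.shape)  # (768, 9): 768 patients, 8 predictors plus Outcome
print(df.columns.tolist())
# ['Pregnancies', 'Glucose', 'BloodPressure', 'SkinThickness', 'Insulin',
#  'BMI', 'DiabetesPedigreeFunction', 'Age', 'Outcome']
```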
For each ML model, the plots produced by each XAI technique were compared side by side; the full set of figures can be found in the Comparative Study PDF.
The primary comparisons were:
SUMMARY - Comparison of feature-importance plots:
- Common Important Features: Across all models, Glucose, BMI, and Age consistently emerge as key predictors.
- Variability Across Models: The RNN model emphasizes Age and BMI more than the others, while the CNN and Regression models strongly highlight Glucose. The MLP model shows a broader distribution of feature impacts, reflecting the non-linear interactions it captures.
- Model-Specific Insights: Each model's unique architecture influences how it prioritizes different features, providing complementary perspectives on feature importance.
SUMMARY - Comparison of LIME plots: The LIME plots compare four models: CNN, MLP, RNN, and the regression model. While the classification models (CNN, MLP, RNN) share similar feature-importance rankings, their predictions vary significantly: the CNN predicts class 1 (67% probability), while the MLP and RNN predict class 0 (84% and 86%, respectively). The regression model uses a different scale and feature values, predicting 0.50 on a 0-1 scale. The MLP and RNN plots are very similar, while the CNN plot shows more distinct probabilities; the regression plot stands apart due to its different format and scale. These differences highlight how different model architectures interpret the same features differently, resulting in distinct predictions and explanations.
- Credit to the creators of the Pima Indian Diabetes dataset.
- Thanks to the developers and contributors of SHAP, LIME, and ALE for their accessible and powerful XAI frameworks.