This project leverages deep learning to predict Covid-19 patient mortality. The model is trained on a dataset generously provided by the Mexican government. It places a strong emphasis on crucial stages, including comprehensive data analysis and rigorous model training, with the ultimate goal of delivering a highly accurate deep learning model.

virchan/covid19_deep_learning

Predicting Covid-19 Patient Mortality with Deep Learning

This repository presents a predictive model for Covid-19 patient mortality. The model is trained on a dataset provided by the Mexican government (link), which is also available on Kaggle (link).

After obtaining the dataset, we performed exploratory data analysis (EDA) and discovered features associated with high mortality.

More precisely, the following classes of Covid-19 patients:

  • patients on ventilators
  • ICU patients
  • patients with air sac inflammation (pneumonia)

are negatively correlated with survival. This pattern holds across most age groups and matches a conclusion reached by Mahendra-Nuchin-Kumar-Shreedhar-Mahesh.

During EDA, we discovered that the dataset encodes missing data with one of the values 97, 98 and 99. For some features, such as "asthma", the amount of missing data is insignificant.
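As a rough illustration of this encoding, the sketch below converts the sentinel values to NaN and measures how much of each feature is missing. The column names and toy values are hypothetical and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Sentinel codes that mark missing data, as described above.
MISSING_CODES = [97, 98, 99]

# Toy frame with hypothetical column names and values.
df = pd.DataFrame({
    "asthma": [1, 2, 2, 98, 2],
    "icu": [97, 97, 2, 1, 99],
})

# Replace the sentinel codes with NaN so pandas treats them as missing.
cleaned = df.replace(MISSING_CODES, np.nan)

# Fraction of missing entries per feature.
missing_fraction = cleaned.isna().mean()
print(missing_fraction)
```

A numeric column such as age could legitimately contain 97, so in practice a replacement like this would likely be restricted to the categorical columns.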

On the other hand, some features have so many missing values that the missing entries cannot simply be dropped. Other features indicate a patient's medical conditions (such as asthma or diabetes), which prevents us from replacing their values. Hence, we cleaned the data, which increased the dense part of the dataset (i.e., the subset of samples with no missing values at all) from 7.33% to 97.24%. Our deep learning model is trained on the dense part.
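The size of the dense part can be measured as in the minimal sketch below, assuming missing entries are already represented as NaN. The helper `dense_fraction` and the toy columns are hypothetical, and the cleaning steps themselves are documented in the notebook rather than reproduced here.

```python
import numpy as np
import pandas as pd

def dense_fraction(frame: pd.DataFrame) -> float:
    """Return the share of rows that contain no missing values."""
    return float(frame.notna().all(axis=1).mean())

# Toy frame with hypothetical columns; NaN marks a missing entry.
toy = pd.DataFrame({
    "asthma": [1.0, 2.0, np.nan, 2.0],
    "diabetes": [2.0, 2.0, 1.0, np.nan],
})

print(dense_fraction(toy))

# The dense part itself: only the rows without any missing value.
dense_part = toy.dropna()
```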

Our deep learning model is a convolutional neural network built with TensorFlow.

We split the dense part 60-20-20 into training, validation and test sets. The model is then trained, validated and tested.
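One way to produce such a split is sketched below with NumPy only. The function name `split_60_20_20` and the seed are assumptions made for illustration; the notebook may use a different method, such as a library helper.

```python
import numpy as np

def split_60_20_20(n_samples: int, seed: int = 0):
    """Return shuffled index arrays for the train, validation and test sets."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    # Integer arithmetic avoids floating-point rounding in the set sizes.
    n_train = n_samples * 6 // 10
    n_val = n_samples * 2 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

train_idx, val_idx, test_idx = split_60_20_20(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Shuffling before slicing keeps the three sets disjoint while preventing any ordering in the original file from leaking into a particular set.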

We evaluate the model's performance using a confusion matrix and a classification report.
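For readers unfamiliar with these tools, the sketch below builds a binary confusion matrix and derives precision and recall from it, which are among the figures a classification report lists. The labels are made-up toy values, not the model's results, and the helper names are hypothetical. Libraries such as scikit-learn provide `confusion_matrix` and `classification_report` for the same purpose.

```python
import numpy as np

def binary_confusion_matrix(y_true, y_pred):
    """Rows are true classes, columns are predicted classes (0, then 1)."""
    matrix = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

def precision_recall(matrix, positive=1):
    """Precision and recall for the class treated as positive."""
    tp = matrix[positive, positive]
    predicted_pos = matrix[:, positive].sum()
    actual_pos = matrix[positive, :].sum()
    precision = tp / predicted_pos if predicted_pos else 0.0
    recall = tp / actual_pos if actual_pos else 0.0
    return float(precision), float(recall)

# Toy labels for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

cm = binary_confusion_matrix(y_true, y_pred)
precision, recall = precision_recall(cm)
print(cm)
print(precision, recall)
```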

Our work is documented in the Jupyter notebook meirnizri_covid19_dataset.ipynb (ipynb, html). A copy of our trained model, covid19.h5, is also provided here.

Finally, this is one of the author's portfolio projects in data science and machine learning. No medical advice is given here.
