BSM-MIMICIII

Code for the paper A Binary Soft Mask Approach for ICD Coding from Clinical Text

Model

Dependencies

Python 3.7
PyTorch 1.9.0
tqdm 4.41.1
scikit-learn 0.22.2
numpy 1.19.5
scipy 1.4.1
pandas 1.1.5
gensim 3.6.0
nltk 3.2.5

All work was done on Google Colab with a GPU, and the versions above are the ones Colab provided at the time. Other versions may also work.
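To see how a local environment differs from the versions listed above, a small check along these lines may help. This is a minimal sketch, not part of the repository: the helper names `installed_versions` and `version_mismatches` are hypothetical, the keys are the PyPI distribution names (PyTorch is published as `torch`), and the comparison is a loose prefix match so that builds such as `1.9.0+cu102` count as matching.

```python
from importlib import metadata  # Python 3.8+; on 3.7 the importlib_metadata backport is needed

# Versions listed in this README, keyed by PyPI distribution name.
PINNED = {
    "torch": "1.9.0",
    "tqdm": "4.41.1",
    "scikit-learn": "0.22.2",
    "numpy": "1.19.5",
    "scipy": "1.4.1",
    "pandas": "1.1.5",
    "gensim": "3.6.0",
    "nltk": "3.2.5",
}


def installed_versions(names):
    """Return {name: installed version string, or None if not installed}."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def version_mismatches(pinned, found):
    """Return {name: (listed, installed)} for packages that do not match.

    A package matches when its installed version starts with the listed one.
    """
    mismatches = {}
    for name, want in pinned.items():
        have = found.get(name)
        if have is None or not have.startswith(want):
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    report = version_mismatches(PINNED, installed_versions(PINNED))
    for name, (want, have) in report.items():
        print(f"{name}: README lists {want}, found {have or 'not installed'}")
```

A mismatch is not necessarily a problem, since other versions may also work; the report only shows where an environment departs from the one used here.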

Data Processing

The data processing procedure is adapted from the paper Explainable Prediction of Medical Codes from Clinical Text and its accompanying repo, caml-mimic.

First, edit the data and model directories in constants.py, then place the data in mimicdata as follows:

mimicdata
|   D_ICD_DIAGNOSES.csv
|   D_ICD_PROCEDURES.csv
|   ICD9_descriptions (already in repo)
└───mimic3/
|   |   NOTEEVENTS.csv
|   |   DIAGNOSES_ICD.csv
|   |   PROCEDURES_ICD.csv
|   |   *_hadm_ids.csv (already in repo)
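Before running the notebook, it can be useful to confirm that the layout above is in place. The following is a minimal sketch under stated assumptions: `missing_inputs` is a hypothetical helper, not a function from this repository, and the data directory is passed in explicitly because the variable names used in constants.py are not shown here.

```python
from pathlib import Path

# Fixed file names from the tree above, relative to the mimicdata directory.
EXPECTED_FILES = [
    "D_ICD_DIAGNOSES.csv",
    "D_ICD_PROCEDURES.csv",
    "ICD9_descriptions",
    "mimic3/NOTEEVENTS.csv",
    "mimic3/DIAGNOSES_ICD.csv",
    "mimic3/PROCEDURES_ICD.csv",
]


def missing_inputs(data_dir):
    """Return the expected entries that are absent under data_dir."""
    root = Path(data_dir)
    missing = [rel for rel in EXPECTED_FILES if not (root / rel).exists()]

    # The tree lists the split files only as a pattern, so check for any match.
    mimic3 = root / "mimic3"
    if not (mimic3.is_dir() and any(mimic3.glob("*_hadm_ids.csv"))):
        missing.append("mimic3/*_hadm_ids.csv")
    return missing


if __name__ == "__main__":
    # "mimicdata" is the directory name from the tree; adjust to your path.
    for rel in missing_inputs("mimicdata"):
        print(f"missing: {rel}")
```

An empty result means every file named in the tree was found; it does not validate the contents of the files.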

Run all cells in notebooks/dataproc_mimic_III.ipynb. This takes more than 10 minutes.

Train a new model

To train a new BSM model, first modify constants.py, then run train_bsm.sh.

Test a model

We provide a trained BSM model in the saved_models folder; run test_bsm.sh to check its metrics. The folder also contains a CAML model published by Mullenbach (https://github.com/jamesmullenbach/caml-mimic) with the paper Explainable Prediction of Medical Codes from Clinical Text.
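For readers unfamiliar with how multi-label code predictions are usually scored, the sketch below computes two metrics that are common in ICD coding work, micro-averaged F1 and precision at k. It is an illustration only: which metrics test_bsm.sh actually reports is an assumption, and `micro_f1` and `precision_at_k` are hypothetical helpers, not the functions in evaluation.py.

```python
import numpy as np


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over all (document, label) pairs of binary matrices."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0


def precision_at_k(y_true, scores, k):
    """Mean fraction of the k highest-scored labels per document that are correct."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores)
    top_k = np.argsort(-scores, axis=1)[:, :k]         # indices of the k best labels
    hits = np.take_along_axis(y_true, top_k, axis=1)   # 1 where a top-k label is true
    return float(hits.sum(axis=1).mean() / k)


if __name__ == "__main__":
    # Toy example: 2 documents, 3 labels.
    y_true = [[1, 0, 1], [0, 1, 0]]
    y_pred = [[1, 0, 0], [0, 1, 1]]
    scores = [[0.9, 0.1, 0.2], [0.2, 0.8, 0.7]]
    print(micro_f1(y_true, y_pred))
    print(precision_at_k(y_true, scores, k=2))
```

Micro-averaging pools every (document, label) decision, so frequent codes dominate the score; precision at k looks only at the ranking of the top k predicted codes per document.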

Results on MIMIC-III top 50 labels test set

Full Results

The results/omission and results/selection folders contain the analyses we ran. Running the shell commands there produces the files used for further analysis, as well as the examples in the appendix. Our own results are in results/saved_results.

To compare the explainability of CAML and BSM, run all cells in notebooks/comparison.ipynb to generate the figures.
