Skip to content
This repository has been archived by the owner on Jul 2, 2022. It is now read-only.

zy618/MedTS

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

29 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

MedTS

This repository contains codes and models for the paper:

MedTS: A BERT-based Generation Model to Transform Medical Texts to SQL Queries for Electronic Medical Records

Requirements

Environments

pytorch >= 1.4.0
transformers == 3.0.2
nltk, numpy, tqdm, matplotlib, idna, tushare, sqlalchemy, pandas, 
boto3, requests, regex, more_itertools, interval, translate, num2words

Data preparation

We provide the processed dataset in ./data, including train, validation and test sets.

The original dataset can be found from TREQS.

Training

  • run ./train.sh to train the model.
python -u ./src/train.py \
  --dataset $DATA_DIR \
  --train_data $TRAIN_DATA_PATH \
  --epoch $EPOCH_NUM \
  --save $SAVED_MODEL_DIR \
  --cuda \
  --cuda_device_num $DEVICE_NUM\

Predicting

  • run ./predict.sh to get the prediction on the validation/test set.
python -u ./src/predict.py \
  --dataset $DATA_DIR \
  --eval_data $PREDICT_DATA_PATH \
  --model_dir $SAVED_MODEL_DIR \
  --model $SAVED_MODEL_NAME \
  --output_dir $OUTPUT_DIR \
  --cuda \
  --cuda_device_num $DEVICE_NUM \

Evaluation

The details of evaluation can be found in TREQS_evaluation, which is based on the publicly available real-world de-identified Medical Information Mart for Intensive Care III (MIMIC III) dataset.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published