Supervised modelling on genotype, tabular, sequence, image and binary data.

WARNING: This project is in alpha phase. Expect backwards-incompatible changes and API changes.

Install

pip install eir-dl

Usage

Please refer to the Documentation for examples and information.

Use Cases

EIR allows for training and evaluating various deep-learning models directly from the command line. This can be useful for:

  • Quick prototyping and iteration when doing supervised modelling on new datasets.
  • Establishing baselines to compare against other methods.
  • Fitting models to data sources such as large-scale genomics, where deep learning implementations are not commonly available.

If you are an ML/DL researcher developing new models, EIR might not fit your use case. However, it can provide a quick baseline for comparison with the cool stuff you are developing.

Features

  • Train models directly from the command line through .yaml configuration files (see the sketch after this list).
  • Training on genotype, tabular, sequence, image and binary input data, with various modality-specific settings available.
  • Seamless multi-modal training (e.g. combining text + image, or any combination of the modalities above).
  • Train multiple feature extractors on the same data source, e.g. combining a vanilla transformer, Longformer and a pre-trained BERT variant for text classification.
  • Supports continuous (i.e., regression) and categorical (i.e., classification) targets.
  • Multi-task / multi-label prediction supported out-of-the-box.
  • Model explainability for genotype, tabular, sequence and image data built in.
  • Computes and graphs various evaluation metrics during training (e.g., RMSE, PCC and R² for regression tasks; accuracy, ROC-AUC, etc. for classification tasks).
  • Many more settings and configurations (e.g., augmentation, regularization, optimizers) available.
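
To give a concrete feel for the configuration-driven workflow, below is a minimal, hypothetical sketch of a tabular-only training run. The eirtrain entry point, the flag names and all configuration keys shown here are illustrative assumptions and may not match the current release; please refer to the Documentation for the exact schema.

# globals.yaml -- hypothetical global settings
output_folder: runs/my_experiment     # where checkpoints and metrics would be written
n_epochs: 20
batch_size: 64

# inputs_tabular.yaml -- hypothetical tabular input configuration
input_info:
  input_source: data/train_inputs.csv # path to the input data
  input_name: my_tabular_input
  input_type: tabular

# targets.yaml -- hypothetical target configuration
label_file: data/train_labels.csv
target_cat_columns:
  - Disease_Status                    # categorical (classification) target

# Launch training from the command line (flag names are assumptions)
eirtrain \
  --global_configs globals.yaml \
  --input_configs inputs_tabular.yaml \
  --target_configs targets.yaml

Under the same assumptions, the pattern would extend to multi-modal and multi-task training: additional input configurations (e.g. one for genotype and one for image data) would be passed alongside the tabular one, and multiple target columns would be listed for multi-task prediction.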

Citation

If you use EIR in a scientific publication, we would appreciate it if you could use the following citation:

@article{sigurdsson2021deep,
  title={Deep integrative models for large-scale human genomics},
  author={Sigurdsson, Arnor Ingi and Westergaard, David and Winther, Ole and Lund, Ole and Brunak, S{\o}ren and Vilhjalmsson, Bjarni J and Rasmussen, Simon},
  journal={bioRxiv},
  year={2021},
  publisher={Cold Spring Harbor Laboratory}
}

Acknowledgements

Massive thanks to everyone publishing and developing the packages this project directly and indirectly depends on.
