Arabidopsis kinase genetic interaction prediction project
-
All the raw datasets used for this project are in the folder /home/seguraab/ara-kinase-prediction/data/
-
Paper: https://doi.org/10.1093/molbev/msab111 (Cusack et al., 2021 MBE)
-
Data retrieval sources:
-
GitHub: https://github.com/ShiuLab/Manuscript_Code/tree/master/2021_Arabidopsis_redundancy_modeling
-
Zenodo: https://zenodo.org/records/3987384
- Dataset_2.txt: training set feature matrix (inclusive genetic redundancy gene pairs [RD9]) in /home/seguraab/ara-kinase-prediction/data/2021_cusack_data/
- Dataset_4.txt: training set feature matrix (extreme genetic redundancy gene pairs [RD4]) and testing set (kinases) in /home/seguraab/ara-kinase-prediction/data/2021_cusack_data/
-
From the MSU HPCC: /mnt/research/ShiuLab/21_arabidopsis_redundancy
- This folder contains the raw and pre-processed features from Cusack et al., 2021
- Saved to /home/seguraab/ara-kinase-prediction/data/2021_cusack_data/21_arabidopsis_redundancy
-
-
Feature name list:
- Supplemental_tables_revision.xlsx: Supplemental table 1 (4,116 features) in /home/seguraab/ara-kinase-prediction/data/2021_cusack_data/
-
Instances:
- Training data:
- interactions_fitness.txt: Melissa's duplicate gene pair instances, each with a binary genetic interaction label (0 for not interacting, 1 for interacting)
- Test data:
- FILE TBD: the n kinase genes from Dataset_4.txt that do not overlap with the instances in interactions_fitness.txt
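
The overlap filter for the test set could be sketched roughly as below. This is only an illustration, not the project's actual script: it assumes both files are tab-delimited and identify each instance by two gene ID columns, and the column names `gene1` and `gene2`, the helper names, and the toy gene IDs are all hypothetical. Pairs are compared without regard to order, so (A, B) and (B, A) count as the same instance.

```python
import csv
import io


def pair_key(gene_a, gene_b):
    """Return an order-insensitive key so (A, B) and (B, A) match."""
    return tuple(sorted((gene_a.strip().upper(), gene_b.strip().upper())))


def read_pairs(handle, col_a, col_b, delimiter="\t"):
    """Read gene pairs from a delimited file handle with a header row.

    col_a and col_b are assumed column names; adjust to the real headers.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    return [pair_key(row[col_a], row[col_b]) for row in reader]


def non_overlapping(candidates, training):
    """Keep candidate pairs absent from the training pairs, without duplicates."""
    seen = set(training)
    kept = []
    for pair in candidates:
        if pair not in seen:
            kept.append(pair)
            seen.add(pair)  # also drops repeated candidates
    return kept


# Toy stand-ins for Dataset_4.txt and interactions_fitness.txt (made-up rows).
dataset_4 = io.StringIO(
    "gene1\tgene2\n"
    "AT1G01010\tAT1G01020\n"
    "AT2G00001\tAT2G00002\n"
)
interactions = io.StringIO(
    "gene1\tgene2\tinteraction\n"
    "AT1G01020\tAT1G01010\t1\n"
)

test_pairs = non_overlapping(
    read_pairs(dataset_4, "gene1", "gene2"),
    read_pairs(interactions, "gene1", "gene2"),
)
print(test_pairs)
```

With the real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles, and the column names by whatever the file headers actually use.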
- Training data: