GENTEL-lab/GerNA-Bind

GerNA-Bind: Geometric-informed RNA-ligand Binding Specificity Prediction with Deep Learning

✨ Welcome to the official repository for "GerNA-Bind: Geometric-informed RNA-ligand Binding Specificity Prediction with Deep Learning". This work is a collaboration between Yunpeng Xia, Jiayi Li, Chu Yi-Ting, Jiahua Rao, Chen Jing, Will Hua, Dong-Jun Yu, Xiucai Chen, and Shuangjia Zheng from Shanghai Jiao Tong University.

Overview

🚀 We introduce GerNA-Bind, a geometric deep learning framework that excels in predicting RNA-ligand binding specificity by integrating multi-modal RNA-ligand representations. GerNA-Bind achieves state-of-the-art performance, successfully identifying 19 compounds binding to oncogenic MALAT1 RNA through high-throughput screening. Wet-lab validation confirmed three compounds with submicromolar affinities, showcasing its potential for advancing RNA-targeted drug discovery.

[Figure: GerNA-Bind overview]

Installation

If you prefer a faster setup, you can use the provided gernabind.yaml file:

conda env create -f gernabind.yaml -y
conda activate gernabind

Alternatively, you can set up the environment by following the step-by-step instructions below.

# Create a conda environment
conda create -y -n gernabind python=3.8
conda activate gernabind

#conda install pytorch==2.0.1 torchvision==0.15.2 pytorch-cuda=12.2 -c pytorch -c nvidia
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0

# Install other dependencies 
pip install rna-fm==0.2.2
pip install ml_collections==0.1.1
pip install simtk==0.1.0
pip install openmm==8.1.1
pip install torchdrug==0.2.1
pip install torch_geometric==2.4.0
pip install equiformer-pytorch
pip install edl_pytorch==0.0.2
pip install rdkit==2023.9.5
pip install biopython==1.79
pip install pandas==1.5.3
pip install scikit-learn==1.2.2
pip install prody==2.4.1
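
After either installation route, a quick check that the main dependencies can be located may save debugging time later. The sketch below is not part of the repository; the import names are assumptions, since pip package names and import names can differ (for example, rna-fm is assumed to import as `fm` and biopython as `Bio`). It only looks the modules up and does not import them.

```python
import importlib.util

# Assumed top-level import names for the packages installed above.
# Adjust the list if a package exposes a different module name.
EXPECTED_MODULES = [
    "torch", "torch_geometric", "torchdrug", "fm", "ml_collections",
    "openmm", "equiformer_pytorch", "edl_pytorch", "rdkit", "Bio",
    "pandas", "sklearn", "prody",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(EXPECTED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it inside the activated gernabind environment; any name it reports is either not installed or imports under a different name than assumed here.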

Data Preparation

Dataset Description

Refer to the following guides for setting up datasets:

Generate RNA Structure

We use RhoFold+ to generate RNA 3D structures and RNAfold (version 2.5.1) to generate RNA 2D (secondary) structures.
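
For a single sequence, the 2D step can be sketched as a small wrapper around the RNAfold command-line tool. This is an illustration only, not the repository's own pipeline: the function names `fold_sequence` and `parse_rnafold_output` are hypothetical, and the code assumes the `RNAfold` executable from ViennaRNA is on your PATH and prints the sequence followed by a dot-bracket line with the energy in parentheses.

```python
import re
import subprocess

# Matches an RNAfold structure line such as "(((...))) ( -1.20)".
_STRUCT_LINE = re.compile(r"^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)\s*$")


def parse_rnafold_output(text):
    """Extract (sequence, dot_bracket, energy) from RNAfold's stdout."""
    sequence = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and FASTA headers
        match = _STRUCT_LINE.match(line)
        if match:
            if sequence is None:
                raise ValueError("structure line appeared before a sequence")
            return sequence, match.group(1), float(match.group(2))
        sequence = line  # remember the most recent non-structure line
    raise ValueError("no structure line found in RNAfold output")


def fold_sequence(sequence, executable="RNAfold"):
    """Run RNAfold on one sequence; --noPS suppresses the PostScript plot."""
    result = subprocess.run(
        [executable, "--noPS"],
        input=sequence + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_rnafold_output(result.stdout)
```

Calling `fold_sequence("GGGAAAUCC")` would return the sequence, its predicted dot-bracket structure, and the minimum free energy reported by RNAfold.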

Data Processing

You can process your own data with the following command:

python data_utils/process_data.py --fasta example/a.fasta --smile example/mol.txt --RhoFold_path your_RhoFold_project_path --RhoFold_weight RhoFold_model_weight_path

The processed data will be saved in the ./data folder as "new_data.pkl".
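
To confirm that processing produced something sensible, the pickle file can be inspected before it is passed to the model. The internal layout of new_data.pkl is not documented here, so the sketch below makes no assumption about it and only reports the top-level type and size. The helper name `summarize_pickle` is hypothetical.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and describe its top-level object.

    Only unpickle files you produced yourself or otherwise trust:
    loading a pickle can execute arbitrary code.
    """
    with open(path, "rb") as handle:
        obj = pickle.load(handle)
    summary = type(obj).__name__
    if hasattr(obj, "__len__"):
        summary += f" with {len(obj)} entries"
    if isinstance(obj, dict):
        # Show a few keys to hint at how the data is organised.
        summary += f"; first keys: {list(obj)[:5]}"
    return summary


# Example (path taken from the text above):
# print(summarize_pickle("data/new_data.pkl"))
```

If the file stores objects from packages such as PyTorch or RDKit, run this inside the gernabind environment so that those classes can be restored.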

Dataset Download

We provide processed data for the Robin and Biosensor datasets. You can download it from Zenodo.

Using GerNA-Bind

Model Training

We provide a training script so that you can train the model yourself.

python train_model.py --dataset Robin --split_method random --model_output_path Model/

Checkpoints

Download the model weights and place them in the "Model" folder, which holds the model checkpoints. You can run the script in the ./Model folder directly to get the model weights.

bash Model/get_weights.sh

RNA Small Molecule Screening

You can use our model to screen for small molecules that bind a target RNA.

python inference_affinity.py

RNA Target Binding Site Prediction

You can also use our model to predict binding sites on a target RNA. Running the script below produces the RNA_binding.csv file for the RNA.

python inference_binding_site.py
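
Once RNA_binding.csv exists, a short preview shows which columns it contains. The column layout is not documented here, so the sketch below only assumes that the file is comma-separated with a header in its first row. The helper name `preview_csv` is hypothetical.

```python
import csv


def preview_csv(path, max_rows=5):
    """Return (header, first rows) of a CSV file without loading it all."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # assumed to be the column names
        rows = [row for _, row in zip(range(max_rows), reader)]
    return header, rows


# Example (file name taken from the text above):
# header, rows = preview_csv("RNA_binding.csv")
# print(header)
# for row in rows:
#     print(row)
```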

License

Commercial use of the model or the generated data is not permitted. See license.md for details.

Acknowledgements

Our work builds upon EquiFormer, Evidential Deep Learning, MONN, RNA-FM, RhoFold, and TankBind. Thanks for their excellent work and open-source contributions.
