Most of the code and work is inspired by https://projects.volkamerlab.org/teachopencadd/talktorials and https://github.com/Ash100.
Tut001_Compound_Data_Acquisition_(ChEMBL).ipynb uses V5TFZ2 (UniProt ID; https://www.uniprot.org/uniprotkb/V5TFZ2/entry), which is in fact the Dengue Type-2 NS5 protein.
Tut002_Dataset_Filteration_and_Analysis.ipynb uses the file "NS5_compounds.csv" produced in Tut001. A copy of this file is also included in this folder.
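Before running Tut002, it can help to confirm that "NS5_compounds.csv" is present and to see which columns it contains. The snippet below is a minimal sketch using only the Python standard library; the helper name `summarize_csv` is hypothetical and is not taken from the notebooks, and nothing is assumed about the column names beyond the file having a header row.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (column_names, row_count) for a CSV file with a header row.

    Hypothetical helper for a quick sanity check; not part of the notebooks.
    """
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)                 # reads every data row
        columns = reader.fieldnames or []   # header names, [] if file is empty
    return columns, len(rows)


if __name__ == "__main__":
    csv_path = Path("NS5_compounds.csv")
    if csv_path.exists():
        columns, n_rows = summarize_csv(csv_path)
        print(f"{n_rows} rows; columns: {columns}")
    else:
        print(f"{csv_path} not found; run Tut001 first or copy the file here.")
```

Run it from the folder that contains the CSV; if the file is missing, the script only prints a reminder and does not raise.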
• Tut001–Tut003: Build a clean training dataset from ChEMBL (known Dengue NS3 inhibitors).
• Tut004–Tut006: Feature engineering + scaffold analysis for both datasets.
• Tut007: Entry point for unlabeled ZINC ligands → predict activity using ML.
• Tut008–Tut012: Dock both the known actives and the predicted ZINC candidates into Dengue NS3 for structure-based refinement.
• Tut013: Output = ranked hit list of reference actives (ChEMBL) and novel hits (ZINC).
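The final step in the list above, a single ranked list that mixes ChEMBL reference actives with ZINC candidates, could be sketched as follows. This is only an illustration and not the code used in Tut013: the `Hit` class, the `rank_hits` function, the use of a docking score as the ranking key, the convention that a more negative score is better, and the compound IDs and scores in the example are all made-up placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One docked compound (hypothetical record layout)."""
    compound_id: str
    source: str            # "ChEMBL" (reference active) or "ZINC" (novel hit)
    docking_score: float   # assumed: more negative means a better pose


def rank_hits(reference_actives, novel_hits):
    """Merge both groups and sort from best (most negative) to worst score."""
    combined = list(reference_actives) + list(novel_hits)
    return sorted(combined, key=lambda hit: hit.docking_score)


if __name__ == "__main__":
    # Placeholder IDs and scores, for illustration only.
    chembl = [Hit("CHEMBL_A", "ChEMBL", -7.5), Hit("CHEMBL_B", "ChEMBL", -6.0)]
    zinc = [Hit("ZINC_X", "ZINC", -8.0), Hit("ZINC_Y", "ZINC", -5.0)]
    for rank, hit in enumerate(rank_hits(chembl, zinc), start=1):
        print(rank, hit.compound_id, hit.source, hit.docking_score)
```

Keeping the `source` field on every record lets the combined list still show which entries are reference actives and which are novel candidates.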