We present PerturBase, the first and most comprehensive database designed for the analysis and visualization of single-cell perturbation (scPerturbation) data (http://www.perturbase.cn/). PerturBase consolidates 122 datasets from 46 publicly accessible research studies, comprising 115 single-modal and 7 multi-modal datasets that include 24,254 genetic and 230 chemical perturbations across about 6 million cells.
The DataProcessing_RNAseq folder contains the code for preprocessing and denoising the raw RNA-seq data.
The DataProcessing_ATACseq folder contains the code for preprocessing and denoising the raw ATAC-seq data, including the fragment TSV file and the peak count matrix h5ad file.
The E-distance folder contains the code for calculating the distance between each perturbed group and the control (non-perturbed) group.
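As a rough illustration of the idea behind the E-distance folder, the sketch below computes an energy distance between a perturbed group and a control group of cells. It is a minimal sketch under stated assumptions, not the code in the folder: the function names are hypothetical, the inputs are assumed to be cells-by-features NumPy arrays (for example a low-dimensional embedding), and plain Euclidean distance is assumed as the metric. The actual implementation may use a different metric, embedding, or normalization.

```python
import numpy as np


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all pairs (row of a, row of b)."""
    diff = a[:, None, :] - b[None, :, :]          # shape (n, m, d)
    return np.sqrt((diff ** 2).sum(axis=-1)).mean()


def energy_distance(perturbed, control):
    """Energy distance: 2 * between-group mean - both within-group means.

    Within-group means include self-pairs (distance 0), so the result
    is 0 when both groups are identical.
    """
    delta_xy = mean_pairwise_distance(perturbed, control)
    sigma_x = mean_pairwise_distance(perturbed, perturbed)
    sigma_y = mean_pairwise_distance(control, control)
    return 2 * delta_xy - sigma_x - sigma_y


if __name__ == "__main__":
    # Hypothetical toy data: 50 control cells and 40 shifted "perturbed" cells.
    rng = np.random.default_rng(0)
    control_cells = rng.normal(0.0, 1.0, size=(50, 10))
    perturbed_cells = rng.normal(1.0, 1.0, size=(40, 10))
    print(energy_distance(perturbed_cells, control_cells))
```

A larger value suggests that the perturbed cells are further from the control cells relative to the spread within each group. The pairwise computation holds an n x m x d array in memory, so very large groups would need subsampling or chunking.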
The Enrichment folder contains the code for GO and KEGG enrichment analysis and visualization.
The Plot folder contains the basic plotting functions.
The script reProduce_the_result_of_Mixscape.py reproduces Extended Data Fig. 9a of "Characterizing the molecular regulation of inhibitory immune checkpoints with multimodal single-cell screens". The script evaluate_the_impact_of_hvg.py evaluates how the selection of highly variable genes (HVGs) during data preprocessing affects the perturbation-specific differentially expressed genes.
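One simple way to quantify how much an HVG setting changes the differentially expressed genes is to compare the resulting gene sets with a Jaccard index. The sketch below assumes this metric purely for illustration; evaluate_the_impact_of_hvg.py may measure the impact differently, and the function name and gene lists are hypothetical.

```python
def jaccard_index(genes_a, genes_b):
    """Jaccard index of two gene collections: |A and B| / |A or B|."""
    set_a, set_b = set(genes_a), set(genes_b)
    union = set_a | set_b
    if not union:
        # Two empty gene sets are treated as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


if __name__ == "__main__":
    # Hypothetical DEG lists for one perturbation under two HVG settings.
    degs_with_2000_hvgs = ["GENE_A", "GENE_B", "GENE_C"]
    degs_with_4000_hvgs = ["GENE_B", "GENE_C", "GENE_D"]
    print(jaccard_index(degs_with_2000_hvgs, degs_with_4000_hvgs))
```

A value near 1 would indicate that the HVG setting has little effect on the differentially expressed genes for that perturbation, and a value near 0 would indicate that it changes them substantially.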
The calDEG folder contains the code for calculating differentially expressed genes with five software tools.
Zhiting Wei, Duanmiao Si, Qi Liu et al. PerturBase: a comprehensive database for single-cell perturbation data analysis and visualization, 2024.