Home
This package clusters Gene-Sets produced by pathway tools such as Ingenuity Pathway Analysis (IPA), GREAT, GSEA and others. These tools often report many significant Gene-Sets, each carrying a label that describes the biological function of its genes.
However, it can become difficult to interpret the output of these tools when the data of several experiments are compared. Moreover, the output data has several limitations:
- Low ratio: where there are only a few genes enriched.
- High similarity: where different Gene-Sets have the same genes enriched despite their different labels.
- Low overlap: where the same gene set labels appear in different experimental settings but different genes are enriched.
So it is better to review these sets of genes together instead of investigating the many different Gene-Sets individually. This package does this by taking the genes of every Gene-Set and calculating a distance score between each pair of Gene-Sets. The greater the overlap between two Gene-Sets, the higher the score. This distance score is then used to cluster the Gene-Sets together. By default the package uses the relative risk as a distance.
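To make the idea of an overlap-based score concrete, here is a minimal Python sketch of a pairwise relative risk between Gene-Sets. It is an illustration only, not the package's code or API: the function name `relative_risk`, the toy sets `set_1` to `set_3` and the genes `g1` to `g8` are hypothetical, and the sketch assumes the textbook definition of relative risk (the chance that a gene in set A is also in set B, divided by the chance that a gene outside set A is in set B), with the union of all supplied genes as the background. How the package builds its own contingency table may differ.

```python
from itertools import combinations
import math


def relative_risk(set_a, set_b, universe):
    """Textbook relative risk of a gene being in set_b, given that it is
    in set_a versus not in set_a. Hypothetical helper, not package code."""
    outside_a = universe - set_a
    if not set_a or not outside_a:
        return math.nan  # undefined when one of the two groups is empty
    risk_inside = len(set_a & set_b) / len(set_a)
    risk_outside = len(set_b & outside_a) / len(outside_a)
    if risk_outside == 0:
        return math.inf if risk_inside > 0 else math.nan
    return risk_inside / risk_outside


# Toy Gene-Sets (hypothetical labels and gene names).
gene_sets = {
    "set_1": {"g1", "g2", "g3", "g4"},
    "set_2": {"g2", "g3", "g4", "g5"},
    "set_3": {"g1", "g6", "g7", "g8"},
}

# Assumption: the background is simply every gene seen in any Gene-Set.
universe = set().union(*gene_sets.values())

# Score every unordered pair of Gene-Sets.
scores = {
    (a, b): relative_risk(gene_sets[a], gene_sets[b], universe)
    for a, b in combinations(gene_sets, 2)
}

for pair, score in scores.items():
    print(pair, round(score, 3))
```

In this toy data `set_1` and `set_2` share three of their four genes and score 3.0, while `set_1` and `set_3` share one gene and score about 0.333, so the first pair would be grouped together before the second. Note that relative risk defined this way is not symmetric in general, so a real implementation has to decide how to treat the (A, B) and (B, A) directions.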
This tool has been used in several scientific publications: Citation list
Example data is taken from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE111385. The data is DESeq2-analyzed transcriptome sequencing of WT and conditional-Tgfbr2 knockout microglia and CNS-repopulating monocyte-derived macrophages from C57BL/6 mice, in triplicate. The genes were picked from a comparison of microglia (uG) vs macrophages (Mac) in both wild type (WT) and Tgfbr2 knockout (KO), with a p-value cutoff of 1E-06.
Significant genes were analysed in both IPA and GREAT. For IPA, canonical pathways and functional annotations were exported into an Excel file with default settings. For GREAT, the data were run with and without the sequencing samples as background and exported in TSV format.
For a basic script to analyse the data use: Example Script
For a video user guide, check this link: Youtube
Step 1A Loading the data
Step 1B Creating an Object
Step 2 Combine and Cluster
Step 2B User supplied distance function
Step 2C Highlighting-Genes
Step 3 Exporting Data
Step 4 Functional Investigation
e-mail: [email protected]
Ewing, E., Planell-Picola, N., Jagodic, M. et al. GeneSetCluster: a tool for summarizing and integrating gene-set analysis results. BMC Bioinformatics 21, 443 (2020). https://doi.org/10.1186/s12859-020-03784-z