Workflows: panoply_association_workflow
This workflow performs association analysis to identify differential markers for classes of interest, and creates an interactive report. The workflow executes the following modules:
| module | description |
|---|---|
| panoply_association | performs association analysis to identify marker genes associated with provided annotations |
| panoply_accumulate | assembles results from association analysis to run panoply_ssgsea |
| panoply_ssgsea | performs ssGSEA on association contrast values |
| panoply_association_report | creates an interactive R Markdown report of the association results |
Module(s): panoply_association
This module performs association analysis to identify differential markers for classes of interest using a moderated t-test (for binary classes) or F-test (for categorical multi-level classes). The significant markers are then ranked using a combination of p-values and variable importance in accurate classifiers.
Classes used for the analysis are derived from the annotations provided in the `groups` file, as long as each level in a class has at least 3 samples.
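The sample-count rule above can be illustrated with a minimal Python sketch. It is not PANOPLY's own code: the function name `usable_classes`, the in-memory dictionary layout, and the treatment of empty or `NA` labels as missing are all assumptions made for illustration. The sketch also assumes that a class is dropped entirely when any of its levels has fewer than 3 samples, which is one possible reading of the rule.

```python
from collections import Counter

MIN_SAMPLES_PER_LEVEL = 3  # threshold stated in the module description


def usable_classes(annotations, min_samples=MIN_SAMPLES_PER_LEVEL):
    """Return {class_name: test_kind} for classes that pass the sample-count rule.

    `annotations` is a hypothetical layout: class name -> list of per-sample labels.
    """
    usable = {}
    for class_name, labels in annotations.items():
        # Assumption: empty strings, None and "NA" mark missing annotations.
        counts = Counter(label for label in labels if label not in (None, "", "NA"))
        enough_levels = len(counts) >= 2
        enough_samples = all(n >= min_samples for n in counts.values())
        if enough_levels and enough_samples:
            # Binary classes map to the moderated t-test, multi-level ones to the F-test.
            usable[class_name] = "binary" if len(counts) == 2 else "multi-level"
    return usable
```

For example, a class with levels `T` and `N` of three samples each would be kept as binary, while a class with a level that has only one sample would be skipped under this reading.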
Module(s): panoply_accumulate, panoply_ssgsea
ssGSEA is performed on the contrast values (i.e. coefficients of a limma linear model) for all marker features. This analysis is scattered across the results from every class of interest.
Module(s): panoply_association_report
This module creates an interactive R Markdown report of the GSEA results of the panoply_association module. For each class, for each contrast, an interactive volcano plot shows the Normalized Enrichment Score (NES) vs -log10 of the FDR value for each database pathway analyzed.
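As a rough illustration of the volcano plot axes described above, the following Python sketch converts pathway results into plot coordinates. The function name `volcano_points`, the tuple layout of the input, the small floor applied to FDR values of zero, and the use of 0.01 as the significance cutoff are assumptions for this example and are not taken from the report module itself.

```python
import math


def volcano_points(pathways, fdr_cutoff=0.01, fdr_floor=1e-300):
    """Convert (pathway, NES, FDR) tuples into volcano-plot coordinates.

    x axis: NES; y axis: -log10(FDR). `fdr_floor` is an assumed guard that
    avoids taking log10 of zero.
    """
    points = []
    for name, nes, fdr in pathways:
        points.append({
            "pathway": name,
            "nes": nes,
            "neg_log10_fdr": -math.log10(max(fdr, fdr_floor)),
            "significant": fdr < fdr_cutoff,  # assumed strict comparison
        })
    return points
```

Each returned dictionary corresponds to one point in the plot for a given class and contrast.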
- `inputData`: (`.tar` file) tarball from `panoply_parse_sm_table` or other PANOPLY module; (`.gct` file) normalized/filtered input if `standalone` is `TRUE`
- `association_groups`: (`.csv` file) subset of sample annotations, providing classes for association analysis
- `job_identifier`: (String) label for job
- `type`: (String) (proteome) data type
- `standalone`: (String) set to `TRUE` to run as a self-contained module; if `TRUE` the `analysisDir` and `groupsFile` inputs are required
- `yaml`: (`.yaml` file) parameters in `yaml` format
- `geneset_db`: (`.gmt` file) gene set database
- `sample_na_max`: (Float, default = 0.8) maximum allowed fraction of NA values per sample/column; error if violated.
- `nmiss_factor`: (Float, default = 0.5) features (genes, proteins, PTM sites) with more than `nmiss_factor` fraction of NA values will be removed from the analysis
- `duplicate_gene_policy`: (String, default = 'maxvar') method used to combine duplicate genes (when mapping protein accession or PTM site to gene symbols) for running GSEA; possible options are:
  - maxvar: select row with largest variance
  - union: union of binary (0/1) values in all rows (e.g. for mutation status)
  - median: median of values in all rows (for each column/sample)
  - mean: mean of values in all rows (for each column/sample)
  - min: minimum of values in all rows (for each column/sample)
- `gene_id_col`: (String, default = 'geneSymbol') name of sample annotation column containing gene ids.
- `fdr_value`: (Float, default = 0.01) FDR value cutoff to be considered significant.
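To make the meaning of `sample_na_max`, `nmiss_factor`, and `duplicate_gene_policy` concrete, here is a minimal Python sketch of how such filters could behave. It is not PANOPLY's implementation: the function names, the list-of-rows matrix layout (one row per feature, one value per sample), the use of population variance for `maxvar`, and the choice to ignore NA values when combining duplicates are all assumptions made for illustration.

```python
import math
from statistics import mean, median, pvariance

NA = float("nan")


def is_na(value):
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_sample_na(matrix, sample_na_max=0.8):
    """Raise an error if any sample (column) exceeds the allowed NA fraction."""
    n_features = len(matrix)
    for col in range(len(matrix[0])):
        na_fraction = sum(is_na(row[col]) for row in matrix) / n_features
        if na_fraction > sample_na_max:
            raise ValueError(
                f"sample column {col} has NA fraction {na_fraction:.2f} > {sample_na_max}"
            )


def filter_features(matrix, nmiss_factor=0.5):
    """Keep features (rows) whose NA fraction does not exceed nmiss_factor."""
    return [row for row in matrix
            if sum(is_na(v) for v in row) / len(row) <= nmiss_factor]


def _row_variance(row):
    observed = [v for v in row if not is_na(v)]
    return pvariance(observed) if observed else float("-inf")


def combine_duplicates(rows, policy="maxvar"):
    """Collapse several rows that map to the same gene into a single row."""
    if policy == "maxvar":
        return list(max(rows, key=_row_variance))
    reducers = {
        "union": lambda vals: 1 if any(v == 1 for v in vals) else 0,
        "median": median,
        "mean": mean,
        "min": min,
    }
    reduce_column = reducers[policy]
    combined = []
    for column in zip(*rows):
        # Assumption: NA values are ignored; a column with no data stays NA.
        observed = [v for v in column if not is_na(v)]
        combined.append(reduce_column(observed) if observed else NA)
    return combined
```

With two duplicate rows `[1.0, 2.0, 3.0]` and `[1.0, 5.0, 9.0]`, the `maxvar` policy in this sketch keeps the second row, while `mean` produces a new row of per-sample averages.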
- `outputs`: Tarball of files containing the following in the `association` subdirectory, for each class vector considered for association analysis:
  - List of significant differential markers derived using LIMMA (`*-markers-all-fdr*.csv`) and p-values for all input features (`*-markers-all*.csv`)
  - Marker importance for significant markers, along with final rank (`*-markerimp-fdr*.csv`)
  - Heatmap of significant differential markers (`*-markers-heatmap.pdf`)
  - Classifier performance contingency tables (`*-analysis-model-results.txt`)
  - Table of prediction results for training data (`*-train-results-*.csv`) and testing data (`*-test-results.csv`) using all classifiers
  - GSEA outputs, along with `.gct` and `.cls` input files, for binary classes (in `*-gsea-analysis/` subdirectory).
- `contrasts`: Tarfile of `.gct` files containing the contrast values, for each class of interest.
- `ssgsea_assoc_tars`: Array of tarfiles containing the results of ssGSEA analysis on each of the contrast tarfiles.
- `report`: Report summarizing the ssGSEA analysis on association results.