Run prettier
LaurenceKuhl committed Sep 5, 2024
1 parent 90ffcc0 commit 699c49c
Showing 5 changed files with 82 additions and 74 deletions.
7 changes: 5 additions & 2 deletions conf/test_screening.config
@@ -25,9 +25,12 @@ params {
crisprcleanr = "Brunello_Library"
library = params.pipelines_testdata_base_path + "crisprseq/testdata/brunello_target_sequence.txt"
contrasts = params.pipelines_testdata_base_path + "crisprseq/testdata/rra_contrasts.txt"
drugz = params.pipelines_testdata_base_path + "crisprseq/testdata/rra_contrasts.txt"
hit_selection_iteration_nb = 150
drugz = true
hit_selection_iteration_nb = 50
hitselection = true
bagel2 = true
rra = true
mle = true
}

process {
27 changes: 19 additions & 8 deletions docs/usage/screening.md
@@ -72,9 +72,17 @@ Otherwise, if you wish to provide your own file, please provide it in CSV format
| CTCTACGAGAAGCTCTACAC | NM_021446.2 | 0610007P14Rik | ex2 | 12 | + | 85822108 |
| GACTCTATCACATCACACTG | NM_021446.2 | 0610007P14Rik | ex4 | 12 | + | 85816419 |

### Running MAGeCK MLE and BAGEL2 with a contrast file
### Running gene essentiality scoring

To run both MAGeCK MLE and BAGEL2, you can provide a contrast file with the flag `--contrasts` with the mandatory headers "treatment" and "reference". These two columns should be separated with a dot comma (;) and contain the `csv` extension. You can also integrate several samples/conditions by comma separating them in each column. Please find an example here below :
nf-core/crisprseq supports four gene essentiality analysis modules: MAGeCK RRA, MAGeCK MLE,
BAGEL2 and DrugZ. You can run any of these modules by providing a contrast file with `--contrasts` together with the flag of the tool you wish to use:

- `--rra` for MAGeCK RRA
- `--mle` for MAGeCK MLE
- `--drugz` for DrugZ
- `--bagel2` for BAGEL2

The contrast file must contain the headers "treatment" and "reference". The two columns should be separated with a semicolon (;) and the file should have the `csv` extension. You can also combine several samples/conditions by separating them with commas within each column. An example is shown below:

| reference | treatment |
| ----------------- | --------------------- |
@@ -87,14 +95,13 @@ A full example can be found [here](https://raw.githubusercontent.com/nf-core/tes

Running MAGeCK MLE and BAGEL2 with a contrast file will also output a Venn diagram showing common genes having an FDR < 0.1.
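
As a rough illustration of the contrast file format described above, the following Python sketch parses a semicolon-separated file with the mandatory "reference" and "treatment" headers and splits the comma-separated sample names in each column. It is not part of the pipeline; the sample names and the `read_contrasts` helper are hypothetical.

```python
import csv
import io

# Hypothetical contrast file content; the sample names are invented for illustration.
CONTRAST_TEXT = """reference;treatment
control1,control2;treatment1,treatment2
control1;treatment1
"""


def read_contrasts(handle):
    """Parse a semicolon-separated contrast file into lists of sample names."""
    reader = csv.DictReader(handle, delimiter=";")
    # Both headers are mandatory according to the usage documentation.
    missing = {"reference", "treatment"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing mandatory header(s): {sorted(missing)}")
    contrasts = []
    for row in reader:
        # Several samples/conditions may be given per column, separated by commas.
        reference = [s.strip() for s in row["reference"].split(",") if s.strip()]
        treatment = [s.strip() for s in row["treatment"].split(",") if s.strip()]
        contrasts.append({"reference": reference, "treatment": treatment})
    return contrasts


contrasts = read_contrasts(io.StringIO(CONTRAST_TEXT))
for contrast in contrasts:
    print(contrast["reference"], "vs", contrast["treatment"])
```

Each row yields one comparison, so a file with two data rows produces two contrasts.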

### Running MAGeCK RRA only
### MAGeCK RRA

MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK RRA, you can define the contrasts as previously stated in the last section with --contrasts your_file.txt(with a `.txt` extension) and also specify `--rra`.
MAGeCK RRA performs robust ranking aggregation to identify genes that are consistently ranked highly across multiple replicate screens. To run MAGeCK RRA, you can define the contrasts as previously stated in the last section with `--contrasts your_file.txt` (with a `.txt` extension) and also specify `--rra`.

### Running MAGeCK MLE only

#### With design matrices
#### With your own design matrices

If you wish to run MAGeCK MLE only, you can specify several design matrices (where you state which comparisons you wish to run) with the flag `--mle_design_matrix`.
MAGeCK MLE uses a maximum likelihood estimation approach to estimate the effects of gene knockout on cell fitness. It models the read count data of guide RNAs targeting each gene and estimates the dropout probability for each gene.
@@ -106,7 +113,11 @@ If there are several designs to be run, you can input a folder containing all th

This label is not mandatory as in case you are running time series. If you wish to run MAGeCK MLE with the day0 label you can do so by specifying `--day0_label` and the sample names that should be used as day0. The contrast will then be automatically adjusted for the other days.

### MAGECKFlute
#### With the contrast file

To run MAGeCK MLE, you can define the contrasts as described above with `--contrasts your_file.txt` and also specify `--mle`.

### MAGeCKFlute

The downstream analysis involves distinguishing essential, non-essential, and target-associated genes. Additionally, it encompasses conducting biological functional category analysis and pathway enrichment analysis for these genes. Furthermore, it provides visualization of genes within pathways, enhancing user exploration of screening data. MAGeCKFlute is run automatically after MAGeCK MLE, once for each MLE design matrix. If you have used `--day0_label`, MAGeCKFlute will be run on all the other conditions. Please note that the DepMap data is used for these plots.

@@ -117,11 +128,11 @@ You can add the parameter `--mle_control_sgrna` followed by your file (one non t
### Running BAGEL2

BAGEL2 (Bayesian Analysis of Gene Essentiality, version 2) is a computational tool developed by the Hart Lab at the MD Anderson Cancer Center. It is designed for analyzing large-scale genetic screens, particularly CRISPR-Cas9 screens, to identify genes that are essential for the survival or growth of cells under different conditions. BAGEL2 compares guide fold changes against reference sets of essential and non-essential genes to compute a Bayes factor for each gene.
BAGEL2 uses the same contrasts from `--contrasts`.
BAGEL2 uses the same contrasts from `--contrasts` and is run with the extra parameter `--bagel2`.

### Running drugZ

[DrugZ](https://github.com/hart-lab/drugz) detects synergistic and suppressor drug-gene interactions in CRISPR screens. DrugZ is an open-source Python software for the analysis of genome-scale drug modifier screens. The software accurately identifies genetic perturbations that enhance or suppress drug activity. To run drugZ, you can specify `--drugz` followed a contrast file with the mandatory headers "treatment" and "reference". These two columns should be separated with a dot comma (;) and contain the `csv` extension. You can also integrate several samples/conditions by comma separating them in each column.
[DrugZ](https://github.com/hart-lab/drugz) detects synergistic and suppressor drug-gene interactions in CRISPR screens. DrugZ is an open-source Python software for the analysis of genome-scale drug modifier screens. The software accurately identifies genetic perturbations that enhance or suppress drug activity. To run drugZ, specify `--drugz` together with a contrast file passed via `--contrasts`. The contrast file must contain the headers "treatment" and "reference"; the two columns should be separated with a semicolon (;) and the file should have the `csv` extension. You can also combine several samples/conditions by separating them with commas within each column.

| reference | treatment |
| ----------------- | --------------------- |
4 changes: 3 additions & 1 deletion nextflow.config
@@ -27,6 +27,9 @@ params {
min_reads = 30
min_targeted_genes = 3
rra = false
mle = false
drugz = false
bagel2 = false
bagel_reference_essentials = 'https://raw.githubusercontent.com/hart-lab/bagel/master/CEGv2.txt'
bagel_reference_nonessentials = 'https://raw.githubusercontent.com/hart-lab/bagel/master/NEGv1.txt'
drugz = null
@@ -232,7 +235,6 @@ singularity.registry = 'quay.io'
// Nextflow plugins
plugins {
id '[email protected]' // Validation of pipeline parameters and creation of an input channel from a sample sheet
id '[email protected]'
}

// Load igenomes.config if required
18 changes: 13 additions & 5 deletions nextflow_schema.json
@@ -202,10 +202,23 @@
"description": "Comma-separated file with the conditions to be compared. The first one will be the reference (control)",
"fa_icon": "fas fa-adjust"
},
"mle": {
"type": "boolean",
"description": "Parameter specifying MAGeCK MLE should be run"
},
"rra": {
"type": "boolean",
"description": "Parameter indicating if MAGeCK RRA should be run"
},
"bagel2": {
"type": "boolean",
"description": "Parameter indicating if BAGEL2 should be run"
},
"drugz": {
"type": "boolean",
"description": "Parameter indicating if DrugZ should be run"
},
"count_table": {
"type": "string",
"format": "file-path",
@@ -238,11 +251,6 @@
"description": "Non essential gene set for BAGEL2",
"default": "https://raw.githubusercontent.com/hart-lab/bagel/master/NEGv1.txt"
},
"drugz": {
"type": "string",
"format": "file-path",
"description": "Specifies drugz to be run and your contrast file on which comparisons should be done"
},
"drugz_remove_genes": {
"type": "string",
"description": "Essential genes to remove from the drugZ modules",
100 changes: 42 additions & 58 deletions workflows/crisprseq_screening.nf
@@ -37,7 +37,6 @@ include { BOWTIE2_ALIGN } from '../modules/nf-cor
include { INITIALISATION_CHANNEL_CREATION_SCREENING } from '../subworkflows/local/utils_nfcore_crisprseq_pipeline'
// Functions
include { paramsSummaryMap } from 'plugin/nf-validation'
include { gptPromptForText } from 'plugin/nf-gpt'
include { paramsSummaryMultiqc } from '../subworkflows/nf-core/utils_nfcore_pipeline'
include { softwareVersionsToYAML } from '../subworkflows/nf-core/utils_nfcore_pipeline'
include { methodsDescriptionText } from '../subworkflows/local/utils_nfcore_crisprseq_pipeline'
@@ -245,53 +244,56 @@ workflow CRISPRSEQ_SCREENING {
counts = ch_contrasts.combine(ch_counts)


if(params.bagel2) {
//Define non essential and essential genes channels for bagel2
ch_bagel_reference_essentials= Channel.fromPath(params.bagel_reference_essentials).first()
ch_bagel_reference_nonessentials= Channel.fromPath(params.bagel_reference_nonessentials).first()
ch_bagel_reference_essentials= Channel.fromPath(params.bagel_reference_essentials).first()
ch_bagel_reference_nonessentials= Channel.fromPath(params.bagel_reference_nonessentials).first()

BAGEL2_FC (
counts
)
ch_versions = ch_versions.mix(BAGEL2_FC.out.versions)
BAGEL2_FC (
counts
)
ch_versions = ch_versions.mix(BAGEL2_FC.out.versions)

BAGEL2_BF (
BAGEL2_FC.out.foldchange,
ch_bagel_reference_essentials,
ch_bagel_reference_nonessentials
)
BAGEL2_BF (
BAGEL2_FC.out.foldchange,
ch_bagel_reference_essentials,
ch_bagel_reference_nonessentials
)

ch_versions = ch_versions.mix(BAGEL2_BF.out.versions)
ch_versions = ch_versions.mix(BAGEL2_BF.out.versions)


ch_bagel_pr = BAGEL2_BF.out.bf.combine(ch_bagel_reference_essentials)
ch_bagel_pr = BAGEL2_BF.out.bf.combine(ch_bagel_reference_essentials)
.combine(ch_bagel_reference_nonessentials)

BAGEL2_PR (
ch_bagel_pr
)
ch_versions = ch_versions.mix(BAGEL2_PR.out.versions)
BAGEL2_PR (
ch_bagel_pr
)
ch_versions = ch_versions.mix(BAGEL2_PR.out.versions)

BAGEL2_GRAPH (
BAGEL2_PR.out.pr
)
BAGEL2_GRAPH (
BAGEL2_PR.out.pr
)

ch_versions = ch_versions.mix(BAGEL2_GRAPH.out.versions)
ch_versions = ch_versions.mix(BAGEL2_GRAPH.out.versions)
// Run hit selection on BAGEL2
if(params.hitselection) {

// Run hit selection on BAGEL2
if(params.hitselection) {
HITSELECTION_BAGEL2 (
BAGEL2_PR.out.pr,
INITIALISATION_CHANNEL_CREATION_SCREENING.out.biogrid,
INITIALISATION_CHANNEL_CREATION_SCREENING.out.hgnc,
params.hit_selection_iteration_nb
)
ch_versions = ch_versions.mix(HITSELECTION_BAGEL2.out.versions)
}

HITSELECTION_BAGEL2 (
BAGEL2_PR.out.pr,
INITIALISATION_CHANNEL_CREATION_SCREENING.out.biogrid,
INITIALISATION_CHANNEL_CREATION_SCREENING.out.hgnc,
params.hit_selection_iteration_nb
)
ch_versions = ch_versions.mix(HITSELECTION_BAGEL2.out.versions)
}
}

}

if((params.mle_design_matrix) || (params.contrasts && !params.rra) || (params.day0_label)) {
// Run MLE
if((params.mle_design_matrix) || (params.contrasts && params.mle) || (params.day0_label)) {
//if the user only wants to run mle through their own design matrices
if(params.mle_design_matrix) {
INITIALISATION_CHANNEL_CREATION_SCREENING.out.design.map {
@@ -306,7 +308,7 @@ workflow CRISPRSEQ_SCREENING {
}

//if the user specified a contrast file
if(params.contrasts) {
if(params.contrasts && params.mle) {
MATRICESCREATION(ch_contrasts)
ch_mle = MATRICESCREATION.out.design_matrix.combine(ch_counts)
MAGECK_MLE (ch_mle, INITIALISATION_CHANNEL_CREATION_SCREENING.out.mle_control_sgrna)
@@ -318,15 +320,11 @@ workflow CRISPRSEQ_SCREENING {
INITIALISATION_CHANNEL_CREATION_SCREENING.out.hgnc,
params.hit_selection_iteration_nb)

ch_versions = ch_versions.mix(HITSELECTION_BAGEL2.out.versions)
ch_versions = ch_versions.mix(HITSELECTION_MLE.out.versions)
}

MAGECK_FLUTEMLE_CONTRASTS(MAGECK_MLE.out.gene_summary)
ch_versions = ch_versions.mix(MAGECK_FLUTEMLE_CONTRASTS.out.versions)
ch_venndiagram = BAGEL2_PR.out.pr.join(MAGECK_MLE.out.gene_summary)
VENNDIAGRAM(ch_venndiagram)
ch_versions = ch_versions.mix(VENNDIAGRAM.out.versions)

}
if(params.day0_label) {
ch_mle = Channel.of([id: "day0"]).merge(Channel.of([[]])).merge(ch_counts)
@@ -339,7 +337,7 @@ workflow CRISPRSEQ_SCREENING {

// Launch module drugZ
if(params.drugz) {
Channel.fromPath(params.drugz)
Channel.fromPath(params.contrasts)
.splitCsv(header:true, sep:';' )
.set { ch_drugz }

@@ -355,30 +353,16 @@ workflow CRISPRSEQ_SCREENING {
INITIALISATION_CHANNEL_CREATION_SCREENING.out.hgnc,
params.hit_selection_iteration_nb)

ch_versions = ch_versions.mix(HITSELECTION_BAGEL2.out.versions)
ch_versions = ch_versions.mix(HITSELECTION.out.versions)
}

}

//
// Parse genes from drugZ to Open AI api
//
gene_source = DRUGZ.out.per_gene_results.map { meta, genes -> genes}
def question = "Which of the following genes enhance or supress drug activity. Only write the gene names with yes or no respectively."
PREPARE_GPT_INPUT(
gene_source,
question
)

PREPARE_GPT_INPUT.out.query.map {
it -> it.text
if(params.mle && params.bagel2) {
ch_venndiagram = BAGEL2_PR.out.pr.join(MAGECK_MLE.out.gene_summary)
VENNDIAGRAM(ch_venndiagram)
}
.collect()
.flatMap { it -> gptPromptForText(it[0]) }
.set { gpt_genes_output }

gpt_genes_output
.collectFile( name: 'gpt_important_genes.txt', newLine: true, sort: false )

//
// Collate and save software versions
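
The gating that this commit introduces in `workflows/crisprseq_screening.nf` can be summarised with a small Python sketch. It mirrors only the conditions visible in this diff (BAGEL2, MAGeCK MLE, DrugZ and the Venn diagram); `steps_to_run` and the step labels are hypothetical names, and the function is an illustration rather than pipeline code.

```python
def steps_to_run(params):
    """Return the analysis steps enabled by a dict of pipeline parameters.

    Assumption: `params` mimics the Nextflow `params` map, with missing keys
    treated as unset (falsy).
    """
    steps = []
    # BAGEL2 now only runs when explicitly requested.
    if params.get("bagel2"):
        steps.append("BAGEL2")
    # MLE runs with a design matrix, a day0 label, or contrasts plus --mle.
    if (
        params.get("mle_design_matrix")
        or (params.get("contrasts") and params.get("mle"))
        or params.get("day0_label")
    ):
        steps.append("MAGECK_MLE")
    # DrugZ is a boolean flag; its contrasts come from --contrasts.
    if params.get("drugz"):
        steps.append("DRUGZ")
    # The Venn diagram needs results from both MLE and BAGEL2.
    if params.get("mle") and params.get("bagel2"):
        steps.append("VENNDIAGRAM")
    return steps


both = steps_to_run({"contrasts": "contrasts.csv", "mle": True, "bagel2": True})
contrasts_only = steps_to_run({"contrasts": "contrasts.csv"})
print(both)
print(contrasts_only)
```

Under these assumptions, a contrast file on its own no longer triggers any module; each tool has to be enabled with its own flag.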