This repository was archived by the owner on Oct 2, 2020. It is now read-only.
16 changes: 16 additions & 0 deletions test/cnvkit-batch-job.json
@@ -0,0 +1,16 @@
{
  "bam_files": [
    "*Tumor.bam"
  ],
  "normal": [
    "*Normal.bam"
  ],
  "targets": "my_baits.bed",
  "split": true,
  "annotate": "refFlat.txt",
  "fasta": "hg19.fasta",
  "access": "data/access-5kb-mappable.hg19.bed",
  "output_dir": "results/",
  "diagram": true,
  "scatter": true
}
18 changes: 18 additions & 0 deletions test/cnvkit-batch-test.yaml
@@ -0,0 +1,18 @@
- args: [
    "cnvkit.py",
    "batch",
    "--access", "data/access-5kb-mappable.hg19.bed",
    "--annotate", "refFlat.txt",
    "--diagram",
    "--fasta", "hg19.fasta",
    "--normal", "*Normal.bam",
    "--output-dir", "results/",
    "--processes", "1",
    "--scatter",
    "--split",
    "--targets", "my_baits.bed",
    "*Tumor.bam",
    ]
  job: cnvkit-batch-job.json
  tool: ../tools/cnvkit-batch.cwl
  doc: General test of command line generation
13 changes: 13 additions & 0 deletions test/cnvkit-scatter-job.json
@@ -0,0 +1,13 @@
{
  "segment": "segment.cns",
  "chromosome": "chr1",
  "split": true,
  "gene": "gen1, gen2",
  "range_list": "chr -start-end",
  "sample_id": "data/access-5kb-mappable.hg19.bed",
  "vcf": "data.vcf",
  "y_min": 3.04,
  "y_max": 4.04,
  "trend": true,
  "output": "result.txt"
}
19 changes: 19 additions & 0 deletions test/cnvkit-scatter-test.yaml
@@ -0,0 +1,19 @@
- args: [
    "cnvkit.py",
    "scatter",
    "--chromosome", "chr1",
    "--gene", "gen1, gen2",
    "--min-variant-depth", "20",
    "--output", "result.txt",
    "--range-list", "chr -start-end",
    "--sample-id", "data/access-5kb-mappable.hg19.bed",
    "--segment", "segment.cns",
    "--trend",
    "--vcf", "data.vcf",
    "--width", "1000000.0",
    "--y-max", "4.04",
    "--y-min", "3.04",
    ]
  job: cnvkit-scatter-job.json
  tool: ../tools/cnvkit-scatter.cwl
  doc: General test of command line generation
9 changes: 9 additions & 0 deletions test/cnvkit-segmetrics-job.json
@@ -0,0 +1,9 @@
{
  "cnarray": "*Tumor.bam",
  "segments": "*Normal.cns",
  "drop_low_coverage": true,
  "output": "results/result.txt",
  "stdev": true,
  "mad": true,
  "pi": true
}
16 changes: 16 additions & 0 deletions test/cnvkit-segmetrics-test.yaml
@@ -0,0 +1,16 @@
- args: [
    "cnvkit.py",
    "segmetrics",
    "--alpha", "0.05",
    "--bootstrap", "100",
    "--drop-low-coverage",
    "--mad",
    "--output", "results/result.txt",
    "--pi",
    "--segments", "*Normal.cns",
    "--stdev",
    "*Tumor.bam"
    ]
  job: cnvkit-segmetrics-job.json
  tool: ../tools/cnvkit-segmetrics.cwl
  doc: General test of command line generation
8 changes: 8 additions & 0 deletions test/cnvkit-target-job.json
@@ -0,0 +1,8 @@
{
  "interval": "*Tumor.bam",
  "annotate": "refFlat.txt",
  "avg_size": 33,
  "output": "results.json",
  "short_names": true,
  "split": true
}
14 changes: 14 additions & 0 deletions test/cnvkit-target-test.yaml
@@ -0,0 +1,14 @@
- args: [
    "cnvkit.py",
    "target",
    "--annotate", "refFlat.txt",
    "--avg-size", "33",
    "--output", "results.json",
    "--short-names",
    "--split",
    "*Tumor.bam"
    ]
  job: cnvkit-target-job.json
  tool: ../tools/cnvkit-target.cwl
  doc: General test of command line generation
8 changes: 8 additions & 0 deletions test/test-files/cnvkit-batch/draft.txt
@@ -0,0 +1,8 @@
command from cnvkit batch tutorial (https://cnvkit.readthedocs.io/en/v0.7.11/pipeline.html#batch) I'm trying to run


cnvkit.py batch *Tumor.bam --normal *Normal.bam \
--targets my_baits.bed --split --annotate refFlat.txt \
--fasta hg19.fasta --access data/access-5kb-mappable.hg19.bed \
--output-reference my_reference.cnn --output-dir results/ \
--diagram --scatter
58,939 changes: 58,939 additions & 0 deletions test/test-files/cnvkit-batch/refFlat.txt

Large diffs are not rendered by default.

173 changes: 173 additions & 0 deletions tools/cnvkit-batch.cwl
@@ -0,0 +1,173 @@
#!/usr/bin/env cwl-runner

cwlVersion: "cwl:draft-3"

class: CommandLineTool
baseCommand: ['cnvkit.py', 'batch']

requirements:
  - class: InlineJavascriptRequirement

description: |
  Run the complete CNVkit pipeline on one or more BAM files.

inputs:


- id: bam_files
  type:
  - "null"
  - type: array
    items: string

  description: Mapped sequence reads (.bam)
  inputBinding:
    position: 1

- id: male_reference
  type: ["null", boolean]
  default: null
  description: Use or assume a male reference (i.e. female samples will have +1
    log-CNR of chrX; otherwise male samples would have -1 chrX).
  inputBinding:
    prefix: --male-reference

Review comment (Member):

CWL tip: with argparse, only the position-dependent arguments need their position specified. Arguments that have a prefix, like --male-reference, can occur in any order, so it would be nice if cwlargparse didn't specify the unneeded positions in these cases.
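
To illustrate why the prefixed options need no explicit position, here is a minimal Python sketch of the ordering rule as I understand it from the CWL specification: bindings are sorted by position (0 when unspecified) and then by input id. The names `build_args`, `BATCH_BINDINGS` and `BATCH_JOB` are hypothetical, the bindings are only a subset of tools/cnvkit-batch.cwl, and this is a simplified model rather than how cwltool is actually implemented.

```python
# Simplified, hypothetical model of CWL command-line ordering. It is not
# cwltool's implementation and it ignores most of the CWL binding rules.

def build_args(base_command, bindings, job):
    """Return the argument list for `job`.

    `bindings` maps an input id to a dict with the optional keys
    "prefix", "position" (0 when absent) and "default".
    """
    parts = []
    for input_id, binding in bindings.items():
        value = job.get(input_id, binding.get("default"))
        if value is None or value is False:
            continue  # null and false inputs contribute nothing
        prefix = binding.get("prefix")
        if value is True:
            # Boolean flag: emit the prefix only.
            tokens = [prefix] if prefix else []
        else:
            values = value if isinstance(value, list) else [value]
            tokens = ([prefix] if prefix else []) + [str(v) for v in values]
        # Sort key: position first (0 when unspecified), then the input id.
        parts.append(((binding.get("position", 0), input_id), tokens))

    args = list(base_command)
    for _key, tokens in sorted(parts, key=lambda part: part[0]):
        args.extend(tokens)
    return args


# Subset of the inputs declared in tools/cnvkit-batch.cwl.
BATCH_BINDINGS = {
    "bam_files": {"position": 1},
    "male_reference": {"prefix": "--male-reference"},
    "processes": {"prefix": "--processes", "default": 1},
    "normal": {"prefix": "--normal"},
    "fasta": {"prefix": "--fasta"},
    "targets": {"prefix": "--targets"},
    "annotate": {"prefix": "--annotate"},
    "split": {"prefix": "--split"},
    "access": {"prefix": "--access"},
    "output_dir": {"prefix": "--output-dir", "default": "."},
    "scatter": {"prefix": "--scatter"},
    "diagram": {"prefix": "--diagram"},
}

# Same values as test/cnvkit-batch-job.json.
BATCH_JOB = {
    "bam_files": ["*Tumor.bam"],
    "normal": ["*Normal.bam"],
    "targets": "my_baits.bed",
    "split": True,
    "annotate": "refFlat.txt",
    "fasta": "hg19.fasta",
    "access": "data/access-5kb-mappable.hg19.bed",
    "output_dir": "results/",
    "diagram": True,
    "scatter": True,
}

print(build_args(["cnvkit.py", "batch"], BATCH_BINDINGS, BATCH_JOB))
```

Under these assumptions only bam_files carries a position, so it sorts after every prefixed option, and `--processes 1` appears because of the input's default even though the job file does not set it. That matches the argument order expected in test/cnvkit-batch-test.yaml.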


- id: count_reads
  type: ["null", boolean]
  default: null
  description: Get read depths by counting read midpoints within each bin.
    (An alternative algorithm).
  inputBinding:
    prefix: --count-reads

- id: processes
  type: ["null", int]
  default: 1
  description: Number of subprocesses used to run each of the BAM files in
    parallel. Give 0 or a negative value to use the maximum number
    of available CPUs. [Default - process each BAM in serial]
  inputBinding:
    prefix: --processes

- id: rlibpath
  type: ["null", string]
  description: Path to an alternative site-library to use for R packages.
  inputBinding:
    prefix: --rlibpath

- id: normal
  type:
  - "null"
  - type: array
    items: string

  description: Normal samples (.bam) to construct the pooled reference.
    If this option is used but no files are given, a "flat"
    reference will be built.
  inputBinding:
    prefix: --normal

- id: fasta
  type: ["null", string]
  description: Reference genome, FASTA format (e.g. UCSC hg19.fa)
  inputBinding:
    prefix: --fasta

- id: targets
  type: ["null", string]
  description: Target intervals (.bed or .list)
  inputBinding:
    prefix: --targets

- id: antitargets
  type: ["null", string]
  description: Antitarget intervals (.bed or .list)
  inputBinding:
    prefix: --antitargets

- id: annotate
  type: ["null", string]
  description: UCSC refFlat.txt or ensFlat.txt file for the reference genome.
    Pull gene names from this file and assign them to the target
    regions.
  inputBinding:
    prefix: --annotate

- id: short_names
  type: ["null", boolean]
  default: null
  description: Reduce multi-accession bait labels to be short and consistent.
  inputBinding:
    prefix: --short-names

- id: split
  type: ["null", boolean]
  default: null
  description: Split large tiled intervals into smaller, consecutive targets.
  inputBinding:
    prefix: --split

- id: target_avg_size
  type: ["null", int]
  description: Average size of split target bins (results are approximate).
  inputBinding:
    prefix: --target-avg-size

- id: access
  type: ["null", string]
  description: Regions of accessible sequence on chromosomes (.bed), as
    output by the 'access' command.
  inputBinding:
    prefix: --access

- id: antitarget_avg_size
  type: ["null", int]
  description: Average size of antitarget bins (results are approximate).
  inputBinding:
    prefix: --antitarget-avg-size

- id: antitarget_min_size
  type: ["null", int]
  description: Minimum size of antitarget bins (smaller regions are dropped).
  inputBinding:
    prefix: --antitarget-min-size

- id: output_reference
  type: ["null", string]
  description: Output filename/path for the new reference file being created.
    (If given, ignores the -o/--output-dir option and will write the
    file to the given path. Otherwise, "reference.cnn" will be
    created in the current directory or specified output directory.)

  inputBinding:
    prefix: --output-reference

- id: reference
  type: ["null", string]
  description: Copy number reference file (.cnn).
  inputBinding:
    prefix: --reference

- id: output_dir
  type: ["null", string]
  default: .
  description: Output directory.
  inputBinding:
    prefix: --output-dir

- id: scatter
  type: ["null", boolean]
  default: null
  description: Create a whole-genome copy ratio profile as a PDF scatter plot.
  inputBinding:
    prefix: --scatter

- id: diagram
  type: ["null", boolean]
  default: null
  description: Create a diagram of copy ratios on chromosomes as a PDF.
  inputBinding:
    prefix: --diagram

outputs:
  []

Review comment:

There are several outputs from this command and they vary based on the input BAM filenames and the options given.

  • For each tumor/test-sample BAM named e.g. Sample.bam, the outputs are: "Sample.targetcoverage.cnn", "Sample.antitargetcoverage.cnn", "Sample.cnr", "Sample.cns"
  • If the --scatter option is given, then for each tumor/test sample, "Sample-scatter.pdf" is created
  • Similarly, the --diagram option creates "Sample-diagram.pdf"
  • For all of the above, if -d/--output-dir is specified, the created file names are relative to (i.e. in) that specified directory
  • If the -r/--reference option is not given, then a .cnn file is created either with the filename given by --output-reference (regardless of the -d/--output-dir path) or by default "cnv_reference.cnn"
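
The naming rules listed above could be sketched as a small Python function, which may help when deciding what the `outputs` section should eventually glob for. Everything here is illustrative: `expected_outputs` is a hypothetical helper, not part of CNVkit or CWL. The default reference name follows this comment ("cnv_reference.cnn"), whereas the `output_reference` description in the tool says "reference.cnn", so it is left as a parameter. Placing the default reference inside the output directory is an assumption taken from that same description.

```python
import posixpath


def expected_outputs(tumor_bams, output_dir=".", scatter=False, diagram=False,
                     reference=None, output_reference=None,
                     default_reference_name="cnv_reference.cnn"):
    """List the files a `cnvkit.py batch` run is expected to create.

    Hypothetical helper that encodes the rules from the review comment.
    """
    outputs = []
    for bam in tumor_bams:
        # "Sample.bam" -> "Sample"
        sample = posixpath.splitext(posixpath.basename(bam))[0]
        names = [sample + suffix for suffix in (
            ".targetcoverage.cnn", ".antitargetcoverage.cnn", ".cnr", ".cns")]
        if scatter:
            names.append(sample + "-scatter.pdf")
        if diagram:
            names.append(sample + "-diagram.pdf")
        # Per-sample files are written inside the output directory.
        outputs.extend(posixpath.join(output_dir, name) for name in names)

    if reference is None:
        if output_reference:
            # --output-reference is used as given, regardless of --output-dir.
            outputs.append(output_reference)
        else:
            # Assumption: the default reference lands in the output directory.
            outputs.append(posixpath.join(output_dir, default_reference_name))
    return outputs


for path in expected_outputs(["Sample.bam"], output_dir="results/",
                             scatter=True, diagram=True,
                             output_reference="my_reference.cnn"):
    print(path)
```

Because the file names depend on the input BAM names, a CWL `outputs` section would probably need glob patterns or expressions rather than fixed names.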

24 changes: 24 additions & 0 deletions tools/cnvkit-docker.cwl
@@ -0,0 +1,24 @@
class: DockerRequirement
dockerPull:
dockerFile: |
  #################################################################
  # Dockerfile
  #
  # Software: cnvkit
  # Software Version: 0.7.11
  # Description: cnvkit docker image
  # Website: https://github.com/etal/cnvkit
  # Provides:
  # Base Image:
  # Build Cmd:
  # Pull Cmd:
  # Run Cmd:
  #################################################################

  FROM python:2.7
  MAINTAINER Anton Khodak <[email protected]>

Review comment (Collaborator):

This is great! Thanks for wrapping this in a Docker container 👍


  # Install cnvkit from pip
  RUN pip install cnvkit

  # Default command to execute at startup of the container