Options

Help
- -h, -help
Input Arguments
Genotype Arguments
- -g, -genome
- -pcrfree
- -M
- -merge
- -min-ovr
- -pre
- -feats
Classifier Arguments
- -load-clf
- -clf
Config Arguments
- -hg19
- -hg38
- -mm10
Optional Arguments

Help

-h | --help         show this help message and exit

Display help message

Input Arguments

Sample Information

Required

-i | -in    FILE        Sample information [ID, BAM-PATH, VCF-PATH, GENDER]

The sample information file is required by SV².

The sample information file must be either tab or space delimited and must contain four columns. Each line contains one sample to be genotyped by SV².
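The four-column layout described above can be checked before launching a run. The following is a minimal sketch, not part of SV²: `read_sample_info` is a hypothetical helper, the sample ID and paths are placeholders, and the gender code `F` is an assumption about the accepted coding.

```python
def read_sample_info(lines):
    """Parse sample information lines into dicts; raise ValueError on bad rows."""
    columns = ("id", "bam", "vcf", "gender")
    samples = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skipping blank lines is an assumption, not documented behavior
        fields = line.split()  # splits on tabs or spaces
        if len(fields) != len(columns):
            raise ValueError(
                f"line {number}: expected 4 columns, found {len(fields)}"
            )
        samples.append(dict(zip(columns, fields)))
    return samples


# Placeholder sample: ID, BAM path, VCF path, gender (coding assumed).
example = ["NA12878\t/data/NA12878.bam\t/data/NA12878.vcf.gz\tF\n"]
print(read_sample_info(example))
```

Because the file may be space delimited, this sketch cannot handle paths that contain spaces.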

Multiple samples can be run in parallel with the -c | -cpu argument.

SV files

SV² can take multiple files containing SV predictions as input. BED and VCF files are supported.

BED

-b | -bed    ...        BED file(s) of SVs

One or more BED files can be passed to SV², separated by spaces.

$ sv2 -i in.txt -b del.bed dup.bed

BED files are either space or tab delimited, formatted as CHROM START END SVTYPE.

Details on the required BED format
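As an illustration of the CHROM START END SVTYPE layout, here is a small sketch of reading one BED record. `parse_bed_line` is a hypothetical helper, not SV² code; ignoring extra columns and rejecting records whose END is not greater than START are assumptions.

```python
def parse_bed_line(line):
    """Split one BED record into (chrom, start, end, svtype)."""
    fields = line.split()  # accepts tab or space delimiters
    if len(fields) < 4:
        raise ValueError(f"expected CHROM START END SVTYPE, got: {line!r}")
    chrom, start, end, svtype = fields[:4]  # extra columns ignored (assumption)
    start, end = int(start), int(end)
    if end <= start:
        raise ValueError(f"END must be greater than START: {line!r}")
    return chrom, start, end, svtype


print(parse_bed_line("chr1\t10000\t25000\tDEL"))
```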

VCF

-v | -vcf    ...        VCF file(s) of SVs

One or more VCF files can be passed to SV², separated by spaces.

VCF files are tab delimited; END= and SVTYPE= are required in the INFO column.

Details on the required VCF format
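The INFO requirement can be verified with a few lines of code. This sketch assumes a standard VCF record with INFO as the eighth tab-delimited column; `parse_info` and `sv_from_vcf_line` are hypothetical helpers, not SV² functions, and header lines starting with `#` are expected to be skipped by the caller.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; key-only flags map to True."""
    entries = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        entries[key] = value if sep else True
    return entries


def sv_from_vcf_line(line):
    """Return (chrom, pos, end, svtype) or raise ValueError if INFO is incomplete."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        raise ValueError("a VCF record needs at least 8 tab-delimited columns")
    info = parse_info(fields[7])
    missing = [key for key in ("END", "SVTYPE") if key not in info]
    if missing:
        raise ValueError("INFO is missing: " + ", ".join(missing))
    return fields[0], int(fields[1]), int(info["END"]), info["SVTYPE"]


record = "chr1\t1000\t.\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=5000;IMPRECISE"
print(sv_from_vcf_line(record))
```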

Genotype Arguments

Parallelization

-c | -cpu    INT        Parallelize sample-wise: 1 per CPU [1]

Given more than one sample in the sample information input, SV² can perform preprocessing and feature extraction in parallel. Each subprocess operates on one sample, so the number of samples processed at once is limited by the number of available CPU cores.

By default SV² will run each sample serially. Note that SV² does not parallelize across chromosomes.
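Since parallelism is per sample, requesting more CPUs than there are samples or cores brings no benefit. The sketch below shows one way to reason about a sensible value for -c; `worker_count` is a hypothetical helper and does not describe how SV² itself chooses its worker pool.

```python
import os


def worker_count(requested_cpus, n_samples):
    """Number of per-sample subprocesses that can usefully run at once."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested_cpus, n_samples, available))


# With three samples, asking for 16 CPUs still yields at most three workers.
print(worker_count(16, 3))
```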

Reference Genome

-g | -genome    STR        Reference genome build [hg19, hg38]. Default: hg19

Accepted reference genome builds for SV² are hg19 (GRCh37) or hg38 (GRCh38). Accepted command line argument strings are either hg19 or hg38.

PCRfree Libraries

-pcrfree        GC content normalization for PCRfree libraries

SV² performs a GC content normalization for coverage estimates adapted from CNVnator. Supply this flag if the samples in the sample information list were sequenced with PCR-free chemistries.

By default this flag is off and SV² assumes samples were sequenced with PCR protocols.

bwa mem -M compatibility

-M        bwa mem -M compatibility. Split-reads flagged as secondary instead of supplementary

SV² can accommodate legacy alignments with chimeric reads flagged as secondary. Pass the -M flag if samples in the sample information file were aligned with bwa mem -M.

By default SV² assumes chimeric reads are flagged as supplementary (-M is off).
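The difference between the two conventions comes down to one bit of the SAM FLAG field: 0x800 marks a supplementary alignment and 0x100 marks a secondary alignment. The following sketch only shows which bit would be inspected in each mode; `is_split_read` is a hypothetical helper, and how SV² identifies split-reads internally is not described here.

```python
SECONDARY = 0x100      # SAM FLAG bit: secondary alignment
SUPPLEMENTARY = 0x800  # SAM FLAG bit: supplementary alignment


def is_split_read(flag, bwa_mem_M=False):
    """Check the FLAG bit that marks chimeric alignments under each convention."""
    bit = SECONDARY if bwa_mem_M else SUPPLEMENTARY
    return bool(flag & bit)


print(is_split_read(2048))                   # supplementary bit, default mode
print(is_split_read(256, bwa_mem_M=True))    # secondary bit, -M mode
```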

Random Seed

-s | -seed    INT        Random seed for genome shuffling in preprocessing [42]

During preprocessing, SV² randomly selects reads from each chromosome to generate basic alignment statistics. The seed defaults to 42; passing the same seed makes this selection reproducible between runs.
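The effect of a fixed seed can be illustrated with Python's standard `random` module. This is a generic sketch of seeded sampling, not the sampling code used by SV²; `sample_reads` and the integer read IDs are hypothetical.

```python
import random


def sample_reads(read_ids, k, seed=42):
    """Draw k read IDs with a private, seeded generator."""
    rng = random.Random(seed)  # leaves the global random state untouched
    return rng.sample(read_ids, k)


reads = list(range(1000))
# The same seed always selects the same reads.
print(sample_reads(reads, 5) == sample_reads(reads, 5))
```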

Output

-o | -out    STR        Output name

Prefix for the output files in sv2_genotypes/

Merging Divergent Breakpoints

-merge        Merge SV after genotyping

SV² can merge SVs whose breakpoints reciprocally overlap by at least 80% (the default threshold). This step is repeated until no more SVs can be merged. The SV position with the maximum ALT genotype likelihood is retained.

By default SV² does not merge breakpoints.
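Reciprocal overlap means the shared segment must cover the required fraction of both SVs, which is equivalent to dividing the shared length by the longer of the two. The sketch below illustrates that calculation and a simple repeated merge. It is not the SV² implementation: the dictionary keys, including `alt_lik` for the ALT genotype likelihood, are hypothetical, and all calls are assumed to share the same chromosome and SV type.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Shared length divided by the longer interval (0.0 if disjoint)."""
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    return shared / max(a_end - a_start, b_end - b_start)


def merge_calls(calls, min_ovr=0.8):
    """Collapse overlapping pairs until none remain, keeping the higher alt_lik."""
    calls = list(calls)
    merged = True
    while merged:
        merged = False
        for i in range(len(calls)):
            for j in range(i + 1, len(calls)):
                a, b = calls[i], calls[j]
                ovr = reciprocal_overlap(a["start"], a["end"], b["start"], b["end"])
                if ovr >= min_ovr:
                    # Drop the call with the lower ALT likelihood, then rescan.
                    del calls[j if a["alt_lik"] >= b["alt_lik"] else i]
                    merged = True
                    break
            if merged:
                break
    return calls


calls = [
    {"start": 100, "end": 200, "alt_lik": 0.90},
    {"start": 105, "end": 205, "alt_lik": 0.95},
    {"start": 1000, "end": 2000, "alt_lik": 0.50},
]
print(merge_calls(calls))
```

In this example the first two calls share 95 of 100 bases, so they merge and the call starting at 105 is kept; the third call is untouched.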

Minimum Reciprocal Overlap for Merging

-min-ovr    FLOAT        Minimum reciprocal overlap for merging SVs [0.8]

Users can define the minimum reciprocal overlap required for merging SVs after genotyping. The -merge flag is not required if the -min-ovr option is passed.

Skip Preprocessing

-pre    PATH        Preprocessing output directory. Skips preprocessing

Users can skip preprocessing by passing the path of the sv2_preprocessing/ directory to the -pre argument. This instructs SV² to load the values in sv2_preprocessing/ instead of running preprocessing again, which is useful for genotyping a different set of variants in previously processed samples.

Skip Feature Extraction

-feats    PATH        Feature output directory. Skips feature extraction

Passing the path of the sv2_features/ directory to the -feats argument skips feature extraction. This is useful for users who wish to generate a genotype matrix containing multiple samples. An example of skipping feature extraction.

Classifier Arguments

Load a New Classifier

-load-clf    PATH        Add custom classifiers. `-load-clf <clf.JSON>`

SV² can incorporate new classifiers for genotyping. Packaged with SV² is a guide on training new classifiers. The output of this guide is a JSON file containing paths to the new classifier.

Pass the JSON file to the -load-clf argument to add more classifiers to SV². More details are located in the Training section of the User Guide.

Genotype with a New Classifier

-clf    STR        Specify classifiers for genotyping [default]

After loading a new classifier, specify the name of the classifier in the -clf argument to genotype variants with that classifier. The original classifier from SV² is named default, and this is the default classifier.

Config Arguments

Before genotyping, users have to supply the full path to FASTA files for SV². At least one FASTA file is required for SV² to run. Configuration needs only to be executed once or updated if the FASTA paths change.

hg19 FASTA

-hg19    PATH        hg19 FASTA file

-hg19 takes the full path to a faidx indexed FASTA file for the hg19 (GRCh37) reference build.

hg38 FASTA

-hg38    PATH        hg38 FASTA file

-hg38 takes the full path to a faidx indexed FASTA file for the hg38 (GRCh38) reference build.
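A faidx-indexed FASTA has a companion index file with the same name plus a `.fai` suffix, as produced by `samtools faidx`. The sketch below checks for both files before configuration; `fasta_is_indexed` is a hypothetical helper, the path is a placeholder, and SV² may perform its own checks.

```python
import os


def fasta_is_indexed(fasta_path):
    """True when both the FASTA and its <fasta>.fai index exist."""
    return os.path.isfile(fasta_path) and os.path.isfile(fasta_path + ".fai")


# Placeholder path; replace with the full path to your reference FASTA.
print(fasta_is_indexed("/refs/hg38.fa"))
```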
