Name		Name	Last commit message	Last commit date
parent directory ..
sve_calls		sve_calls
10_call_calculate_stats.sh		10_call_calculate_stats.sh
1_prepare_python_sve_calls.sh		1_prepare_python_sve_calls.sh
2_sort_beds.sh		2_sort_beds.sh
3_merge_beds.sh		3_merge_beds.sh
4_call_bedtools.sh		4_call_bedtools.sh
5_concatenate_bedtools.sh		5_concatenate_bedtools.sh
6_summarize_true_positives.sh		6_summarize_true_positives.sh
7_label_false_positives.sh		7_label_false_positives.sh
8_summarize_false_positives.sh		8_summarize_false_positives.sh
9_summarize_false_negatives.sh		9_summarize_false_negatives.sh
README.md		README.md
calculate_performance.py		calculate_performance.py
get_bed_bins.py		get_bed_bins.py
get_sv_size_bins.py		get_sv_size_bins.py

README.md

Requirements

Python 3 (tested with 3.10.4) NumPy (tested with 1.23.1) bedtools (tested with 2.30.0)

The vcf files from the entire dataset are located in sve_calls/

The following scripts can be run (in numerical order) to quantify the performance of BreakDancer, cnMOPS, CNVnator, Delly, Hydra, and Lumpy.

1_prepare_python_sve_calls.sh
- Calls get_sv_size_bins.py to convert vcf files to separate BED files for each variant type and size range (based on FusorSV size bins)
2_sort_beds.sh, 3_merge_beds.sh
- Combine overlapping SV predictions contained in the BED files for each caller
4_call_bedtools.sh
- Calls bedtools to identify true positives, false positives, and false negatives
5_concatenate_bedtools.sh
- Concatenate the bedtools results from all samples into single files for each SV type
6_summarize_true_positives.sh
- Breakdown TPs by size and type
7_label_false_positives.sh
- Add a name for false postive variants in bedfile. Makes it easier to keep track of false positives during analysis.
8_summarize_false_positives.sh
- Breakdown FPs by size and type
9_summarize_false_negatives.sh
- Breakdown FNs by size and type
10_call_calculate_stats.sh
- Calls calculate_performance.py to summarize performance metrics for each caller