BactCore

Extract flexible bacterial core genome alignments, in which each site is retained if it is present in ≥95% of genomes, from reference-anchored bacterial whole-genome alignments such as those produced by read mapping.

The output retains invariant sites, so it can be used to detect recombinant sites with ClonalFrameML.
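As an illustration of the per-site threshold described above, here is a minimal Python sketch. It is not BactCore's implementation. The function names are hypothetical, and it assumes that both `-` and `N` count as missing at a site. The toy example uses a 75% threshold so that the effect is visible with four sequences.

```python
# Assumption: gaps and N both count as "missing" at a site.
MISSING = frozenset("-N")

def core_site_indices(seqs, min_fraction=0.95):
    """Return column indices where at least min_fraction of sequences have a called base."""
    n = len(seqs)
    keep = []
    for i in range(len(seqs[0])):
        called = sum(1 for s in seqs if s[i].upper() not in MISSING)
        if called / n >= min_fraction:
            keep.append(i)
    return keep

def extract_sites(seqs, indices):
    """Build the reduced alignment from the kept columns."""
    return ["".join(s[i] for i in indices) for s in seqs]

# Toy alignment: column 1 is missing in one genome, column 4 in two.
alignment = ["ACGT-A", "ACGTNA", "ACCTTA", "A-GTTA"]
keep = core_site_indices(alignment, min_fraction=0.75)
core = extract_sites(alignment, keep)
print(keep)
print(core)
```

Invariant columns (for example the first one, all `A`) are kept, which matches the note above about retaining invariant sites.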

Installation

Compile with:

git clone https://github.com/moorembioinfo/BactCore.git
cd BactCore/
make

Usage

Run BactCore:

BactCore <input.fasta> <output.fasta>

And for strict cores (sites with no gaps):

BactCore --strict <input.fasta> <output.fasta>

Or to output only the SNP sites within the strict core:

BactCore --strict --snps <input.fasta> <output.fasta>
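The difference between the strict core and its SNP-only subset can be sketched in Python as follows. This is an illustration only, with hypothetical function names. It assumes a strict site is one where every genome has A, C, G or T (so `N` is treated like a gap), and that a SNP site is a strict site with more than one distinct base.

```python
def strict_core_indices(seqs):
    """Columns where every sequence has a called base (assumption: N counts as missing)."""
    return [i for i in range(len(seqs[0]))
            if all(s[i].upper() in "ACGT" for s in seqs)]

def snp_indices(seqs, candidate_indices):
    """Subset of candidate columns that carry more than one distinct base."""
    return [i for i in candidate_indices
            if len({s[i].upper() for s in seqs}) > 1]

toy = ["ACGT-A", "ACGTNA", "ACCTTA", "A-GTTA"]
strict = strict_core_indices(toy)
snps = snp_indices(toy, strict)
snp_alignment = ["".join(s[i] for i in snps) for s in toy]
print(strict, snps, snp_alignment)
```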

Benchmarks

Maximum memory usage is approximately that of a single sequence, independent of the number of sequences in the alignment. In testing on Salmonella enterica, Mycobacterium tuberculosis, Streptococcus pyogenes and Escherichia coli, maximum memory usage never exceeded 0.25 GB.
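One way to get memory use that does not grow with the number of sequences is to stream the alignment twice: a first pass keeps only a per-site counter, and a second pass writes the retained sites one record at a time. The Python sketch below shows that idea. It is an assumption about how such behaviour could be achieved, not a description of BactCore's source, and all function names are hypothetical.

```python
import io
from array import array

def read_fasta(handle):
    """Yield (header, sequence) one record at a time."""
    header, chunks = None, []
    for line in handle:
        line = line.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def count_called(handle):
    """Pass 1: count called bases per column, holding one sequence in memory."""
    counts, n = None, 0
    for _, seq in read_fasta(handle):
        if counts is None:
            counts = array("I", [0]) * len(seq)
        for i, base in enumerate(seq):
            if base not in "-Nn":  # assumption: gaps and N are missing
                counts[i] += 1
        n += 1
    return counts, n

def write_core(handle_in, handle_out, counts, n, min_fraction=0.95):
    """Pass 2: re-read the alignment and write only the retained columns."""
    keep = [i for i, c in enumerate(counts) if c / n >= min_fraction]
    for header, seq in read_fasta(handle_in):
        core = "".join(seq[i] for i in keep)
        handle_out.write(f">{header}\n{core}\n")

fasta = ">a\nACGT-A\n>b\nACGTNA\n>c\nACCTTA\n>d\nA-GTTA\n"
counts, n = count_called(io.StringIO(fasta))
out = io.StringIO()
write_core(io.StringIO(fasta), out, counts, n, min_fraction=0.75)
print(out.getvalue())
```

The only structures that scale with the input are the counter array and the current record, both proportional to the alignment length rather than the number of genomes.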

Runtime for 5,000, 10,000, 20,000 and 40,000 genomes was 6.4, 14, 26 and 52 minutes, respectively.
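A quick way to read these timings is to normalise them per 1,000 genomes. The short Python calculation below uses only the figures quoted above. The roughly constant result suggests, but does not prove, that runtime scales approximately linearly with the number of genomes.

```python
genomes = [5_000, 10_000, 20_000, 40_000]
minutes = [6.4, 14, 26, 52]

# Minutes needed per 1,000 genomes at each benchmark size.
per_thousand = [round(m / g * 1000, 2) for g, m in zip(genomes, minutes)]
print(per_thousand)
```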

Input

Multi-FASTA whole-genome alignment derived from mapping to a reference and variant calling, such as the output of snippy, snippy-core and snippy-clean_full_aln:

snippy-core --ref ref.fa snippyoutfiles 
snippy-clean_full_aln core.full.aln > clean.full.aln

Use utils/clean-alignment.py if the alignment contains characters other than {ATCGN-}.
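For readers who want to see what such a clean-up step might look like, here is a small Python sketch. It is not the code in utils/clean-alignment.py, whose exact behaviour is not described here. It assumes that sequences are upper-cased and that any character outside {ATCGN-}, such as an IUPAC ambiguity code, is replaced with `N`. The function name is hypothetical.

```python
import re

# Anything that is not A, C, G, T, N or a gap.
_NOT_ALLOWED = re.compile(r"[^ACGTN-]")

def clean_sequence(seq):
    """Upper-case the sequence and mask unexpected characters as N (assumed policy)."""
    return _NOT_ALLOWED.sub("N", seq.upper())

cleaned = clean_sequence("acgtRYK-n.")
print(cleaned)
```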

Rationale

Strict core extraction is extremely sensitive to a minority of positions with gaps. For example, for a set of Salmonella Typhimurium genomes (n = 5,379), the strict approach removed 3,089,082 sites from the original 4,685,848 bp alignment, so that only 34.08% of sites remained (1,596,766). Conversely, with relaxed site exclusion, 97.22% of sites were retained (4,555,408).

To avoid a minority of gap-rich genomes biasing the comparison against strict extraction, the figure below shows rarefaction sampling of 300 to 3,100 randomly sampled genomes, in steps of 100, with sampling repeated 50 times per sample size.
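The rarefaction procedure can be sketched in Python as below. This is an illustrative reconstruction under stated assumptions, not the script used for the analysis: the function names are hypothetical, a strict site is taken to be one where every sampled genome has A, C, G or T, and the toy run uses a six-genome alignment with small sample sizes. For the design described above, the call would use `range(300, 3101, 100)` and `replicates=50`.

```python
import random

def strict_core_size(seqs):
    """Number of columns where every sequence has a called base."""
    return sum(1 for column in zip(*seqs)
               if all(b in "ACGT" for b in column))

def rarefaction(seqs, sample_sizes, replicates, seed=0):
    """For each sample size, draw random subsets and record the strict core size."""
    rng = random.Random(seed)
    results = {}
    for size in sample_sizes:
        results[size] = [strict_core_size(rng.sample(seqs, size))
                         for _ in range(replicates)]
    return results

toy_genomes = ["ACGTAC", "AC-TAC", "ACGTNC", "ACGTAC", "A-GTAC", "ACGTAC"]
curves = rarefaction(toy_genomes, sample_sizes=range(2, 7, 2), replicates=5)
for size, sizes in curves.items():
    print(size, sum(sizes) / len(sizes))
```

Because adding genomes can only remove sites from a strict core, every subsample has a strict core at least as large as that of the full set.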
