Evolution of the GGI

Scripts and data associated with Figures 5-8 of "The Gonococcal Genetic Island defines distinct sub-populations of Neisseria gonorrhoeae"

Figure 5

core_gene_alignment[...]: N. gonorrhoeae core genome alignment and related files including VCF from SnpSites and VCF converted for use with Vcflib by vcflibConversion.sh
ggi_scoary_traits.csv: GGI presence/absence for each strain, in Scoary traits format
scoaryToVcflibFst.py: runs the vcflib Fst outlier analysis and calculates a null distribution from the phenotype file and VCF
vcflibConversion.sh: converts SnpSites VCF for use with Vcflib
fst.sh: steps for running the Fst outlier analysis
GCFstManhattan.R: script for making Figure 5 using Fst outlier and homoplasy results
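As a rough illustration of how a presence/absence trait file can be turned into the two sample groups an Fst comparison needs, the sketch below splits VCF samples into GGI-positive and GGI-negative index lists. It is not taken from scoaryToVcflibFst.py: the function name, the column name `GGI`, and the demo strain names are all hypothetical, and it assumes a Scoary-style CSV with sample names in the first column and 0/1 values.

```python
import csv
import io


def split_samples_by_trait(traits_csv, vcf_samples, trait):
    """Return (target, background) lists of 0-based VCF sample indices.

    Assumes the first CSV column holds sample names and the trait
    column holds "1" (present) or "0" (absent).
    """
    reader = csv.DictReader(traits_csv)
    name_col = reader.fieldnames[0]
    status = {row[name_col]: row[trait] for row in reader}

    target, background = [], []
    for idx, sample in enumerate(vcf_samples):
        value = status.get(sample)
        if value == "1":
            target.append(idx)
        elif value == "0":
            background.append(idx)
        # Samples with missing or other values are left out of both groups.
    return target, background


# Hypothetical demo data, not taken from ggi_scoary_traits.csv.
demo_csv = io.StringIO("Name,GGI\nstrainA,1\nstrainB,0\nstrainC,1\n")
target, background = split_samples_by_trait(
    demo_csv, ["strainA", "strainB", "strainC"], "GGI"
)
print("target:", ",".join(map(str, target)))
print("background:", ",".join(map(str, background)))
```

The comma-separated index strings are the kind of input that sample-index based tools expect; check fst.sh for the options actually used in this analysis.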

homoplasy/

GCCore_RAxML.newick: core genome phylogeny produced using RAxML
gappless.vcf: VCF with gaps removed using a script from https://github.com/tatumdmortimer/formatConverters.git
NCCP11945.fasta: reference strain genome
treetime.sh: runs TreeTime to calculate homoplasies and tree branch lengths

Figure 6

phage_seq.tar.gz: all phage sequences as annotated by ProphET
mash.sh: calculates Mash distances
mash_results.txt: mash distances for all phage sequences
mge_annot.tar.gz: MGE as annotated by MobileElementFinder
mge_summary.txt: summary of number of IS elements per strain
GC_phageMDS_MGE.R: R script for plotting Figure 6 using phage MDS and MGE results
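Before pairwise distances can be used for MDS, they have to be arranged as a square matrix. The sketch below shows one way to do that in Python, assuming mash_results.txt follows the standard tab-separated `mash dist` layout (reference, query, distance, p-value, shared hashes). The function name and demo lines are hypothetical, and the repository's own plotting is done in R by GC_phageMDS_MGE.R.

```python
import io


def read_mash_distances(handle):
    """Build a symmetric distance matrix from tab-separated mash dist lines."""
    names = []   # sequence names in order of first appearance
    index = {}   # name -> row/column position
    pairs = {}   # (reference, query) -> distance

    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ref, query, dist = fields[0], fields[1], float(fields[2])
        for name in (ref, query):
            if name not in index:
                index[name] = len(names)
                names.append(name)
        pairs[(ref, query)] = dist

    n = len(names)
    matrix = [[0.0] * n for _ in range(n)]
    for (ref, query), dist in pairs.items():
        i, j = index[ref], index[query]
        matrix[i][j] = dist
        matrix[j][i] = dist  # fill both halves so the matrix is symmetric
    return names, matrix


# Hypothetical demo lines, not taken from mash_results.txt.
demo = io.StringIO(
    "phageA.fa\tphageA.fa\t0\t0\t1000/1000\n"
    "phageA.fa\tphageB.fa\t0.02\t0\t500/1000\n"
)
names, matrix = read_mash_distances(demo)
print(names)
print(matrix)
```

The resulting matrix can then be passed to any classical MDS routine.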

Figure 7

piNpiS.sh: bash script for running piN/piS analyses
ggi_alns.tar.gz: GGI gene alignments
ggi_gene_order.txt: order of GGI genes based on strain MS11
ggi_aln_pairs.py: Python script that writes a FASTA file for each pairwise comparison within each GGI gene
GGI_piNpiS.txt: piN/piS results for GGI genes
selectionStats_piNpiS.py: Python script that calculates piN/piS using EggLib
GGI_piNpiS.R: R script for plotting piN/piS values (Figure 7C)
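The pairwise step described for ggi_aln_pairs.py can be sketched as follows. This is a minimal, standard-library illustration rather than the repository's script: the function names, the output file naming scheme, and the demo sequences are assumptions, and the sketch returns FASTA text instead of writing files.

```python
from itertools import combinations


def read_fasta(text):
    """Parse FASTA text into a list of (name, sequence) tuples."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Keep only the first word of the header as the sequence name.
            name = (line[1:].split() or [""])[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def pairwise_fastas(records):
    """Yield (filename, fasta_text) for every unordered pair of sequences."""
    for (name_a, seq_a), (name_b, seq_b) in combinations(records, 2):
        filename = f"{name_a}_{name_b}.fasta"  # hypothetical naming scheme
        yield filename, f">{name_a}\n{seq_a}\n>{name_b}\n{seq_b}\n"


# Hypothetical three-sequence alignment of one gene.
demo_alignment = ">s1\nATGAAA\n>s2\nATGAAG\n>s3\nATGCAA\n"
pairs = list(pairwise_fastas(read_fasta(demo_alignment)))
for filename, _ in pairs:
    print(filename)
```

An alignment of n sequences yields n(n-1)/2 pairwise files, so the number of outputs grows quickly with the number of strains per gene.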

Figure 8

CoreGGI_aln.fasta: core GGI genome alignment
GCcore_RAxML.newick: N. gonorrhoeae core genome phylogeny (not the GGI core) produced using RAxML
GGIcore_RAxML.newick: GGI core genome phylogeny produced using RAxML
CoreGGI_AncRecon.R: R script for producing analyses and plots for Figure 8