hiraokas/ProbeCaptureEnrichmentSequencing_amoA

Probe capture enrichment sequencing of amoA genes

The ammonia monooxygenase subunit A (amoA) gene has been used to investigate the phylogenetic diversity, spatial distribution, and activity of ammonia-oxidizing archaea (AOA) and bacteria (AOB), which contribute significantly to the nitrogen cycle in various ecosystems. Amplicon sequencing of amoA is a widely used method; however, it produces inaccurate results owing to the lack of a ‘universal’ primer set. Moreover, currently available primer sets suffer from amplification biases, which can lead to severe misinterpretation. Although shotgun metagenomic and metatranscriptomic analyses are alternative approaches without amplification bias, the low abundance of target genes in heterogeneous environmental DNA makes a comprehensive analysis difficult at a realizable sequencing depth. In this study, we developed a probe set and bioinformatics workflow for amoA enrichment sequencing using a hybridization capture technique.

Please note that the scripts were prepared for the analyses in this study and were not implemented as stand-alone research software. The analyses were performed on the supercomputing system at the National Institute of Genetics (NIG), Research Organization of Information and Systems (ROIS), and on the Earth Simulator systems at JAMSTEC.

The main script, mainScript_amoA.sh, and related modules are stored under the src/ directory. The amoA/CuMMO gene sequences used in the study are stored in the data/ directory. The official source code repository is at https://github.com/hiraokas/ProbeCaptureEnrichmentSequencing_amoA.

A schematic overview of probe capture enrichment sequencing analysis

Code

The code is written in shell script and Python.

| File | Description |
| --- | --- |
| clustering_seq.sh | Sequence clustering |
| fasta_seqlen_averageWithSD.sh | Analysis of sequence length distribution |
| fastq_pairend_marge.sh | Merge paired-end short-read sequences into single reads |
| fastq_remove_chimera.sh | Remove chimeric sequences |
| getseq_blast_output.sh | Get sequences using a BLAST output file |
| mainScript_amoA.sh | Main script of this study |
| mapping.sh | Read mapping |
| phylogenetic_tree_construction.sh | Phylogenetic tree estimation |
| qsub_DDBJ.sh | Autogenerate a job script for the grid engine (supercomputing system at DDBJ, Japan) |
| qsub_ES.sh | Autogenerate a job script for the grid engine (Earth Simulator system at JAMSTEC, Japan) |
| qsub_short.sh | Wrapper of qsub_DDBJ.sh (supercomputing system at DDBJ, Japan) |
| rename_fastafile.sh | Rename files according to a given correspondence table |
| tidy2table.py | Convert tidy data to table format |
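
To illustrate what converting tidy data to table format means, here is a minimal sketch in plain Python. It is not the code in tidy2table.py, whose input format and options are not described here. The function name `tidy_to_table`, the (sample, feature, value) column order, and the toy OTU counts are all assumptions made for illustration.

```python
from collections import defaultdict


def tidy_to_table(rows, fill=0):
    """Pivot (sample, feature, value) records into a wide table.

    Hypothetical helper for illustration only; it does not reproduce
    the behavior of tidy2table.py in this repository.
    Returns (header, table), where header[0] labels the row-name column.
    """
    samples = []                 # column order follows first appearance
    features = []                # row order follows first appearance
    values = defaultdict(dict)   # feature -> {sample: value}
    for sample, feature, value in rows:
        if sample not in samples:
            samples.append(sample)
        if feature not in values:
            features.append(feature)
        values[feature][sample] = value
    header = ["feature"] + samples
    # Missing (feature, sample) combinations are filled with `fill`.
    table = [[f] + [values[f].get(s, fill) for s in samples] for f in features]
    return header, table


# Toy tidy input: one observation per row (assumed layout).
tidy = [
    ("sampleA", "OTU1", 10),
    ("sampleA", "OTU2", 3),
    ("sampleB", "OTU1", 7),
]
header, table = tidy_to_table(tidy)
for line in [header] + table:
    print("\t".join(str(x) for x in line))
```

Each tidy row holds one observation, while the resulting table has one row per feature and one column per sample, with absent combinations filled by a default value.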

Dataset

| File | Description |
| --- | --- |
| amoA_MockPlasmid.fasta | amoA gene sequences used for the Mock sample |
| CuMMO_DB.fasta | CuMMO gene sequence database used for probe design |
| CuMMO_Selected.fasta | Selected 20 CuMMO gene sequences used as queries to retrieve CuMMO gene sequences from public gene databases (NCBI nt and env_nt) |
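
The dataset files are FASTA files, so a quick way to inspect one is to summarize its sequence lengths. The sketch below is a hedged, stand-alone Python example that reports the mean and sample standard deviation of record lengths. It is not the repository's fasta_seqlen_averageWithSD.sh, and the function names `read_fasta_lengths` and `length_summary` and the toy records are assumptions for illustration.

```python
import io
import statistics


def read_fasta_lengths(handle):
    """Yield (record_id, sequence_length) for each record in a FASTA stream."""
    name, length = None, 0
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            # The record ID is the first whitespace-delimited token.
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            # Sequences may wrap over several lines; sum them up.
            length += len(line)
    if name is not None:
        yield name, length


def length_summary(lengths):
    """Return (mean, sample standard deviation) of a list of lengths."""
    mean = statistics.mean(lengths)
    sd = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return mean, sd


# Toy in-memory FASTA; replace with open("data/CuMMO_DB.fasta") to try a
# real file (path assumed from the directory layout described above).
example = io.StringIO(">seq1 toy record\nATGC\nATGC\n>seq2\nATGCAT\n")
lengths = [n for _, n in read_fasta_lengths(example)]
mean, sd = length_summary(lengths)
print(f"n={len(lengths)} mean={mean:.1f} sd={sd:.2f}")
```

The parser streams the file line by line, so it does not need to hold the sequences in memory, only the per-record lengths.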

Dependencies

We also used the following tools and databases for detailed data analysis in this study.

  • Ocean Data View - Oceanic diversity analysis
  • MEGA X - Phylogenetic tree analysis
  • AOA amoA sequence database defined by Alves et al. (Ref.) - Taxonomic assignments of AOA OTUs

Citation

Hiraoka S. (2024) Probe capture enrichment sequencing of amoA genes discloses diverse ammonia-oxidizing archaeal and bacterial populations. bioRxiv. doi:10.1101/2023.04.10.536224

**Probe capture enrichment sequencing of amoA genes discloses diverse ammonia-oxidizing archaeal and bacterial populations**

Satoshi Hiraoka1†*, Minoru Ijichi2†, Hirohiko Takeshima2, Yohei Kumagai2, Ching-Chia Yang2, Yoko Makabe-Kobayashi2, Hideki Fukuda2, Susumu Yoshizawa2, Wataru Iwasaki2,3, Kazuhiro Kogure2, Takuhei Shiozaki2*

1. Research Center for Bioscience and Nanoscience (CeBN), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), 2–15 Natsushima-cho, Yokosuka, Kanagawa 237–0061, Japan
2. Atmosphere and Ocean Research Institute, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8564, Japan
3. Department of Integrated Biosciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0882, Japan.

† Contributed equally
* Corresponding author

Email: [email protected], [email protected]