Probe capture enrichment sequencing of amoA genes

The ammonia monooxygenase subunit A (amoA) gene has been used to investigate the phylogenetic diversity, spatial distribution, and activity of ammonia-oxidizing archaea (AOA) and bacteria (AOB), which contribute significantly to the nitrogen cycle in various ecosystems. Amplicon sequencing of amoA is widely used; however, it can produce inaccurate results owing to the lack of a ‘universal’ primer set. Moreover, currently available primer sets suffer from amplification biases, which can lead to severe misinterpretation. Although shotgun metagenomic and metatranscriptomic analyses are alternative approaches without amplification bias, the low abundance of target genes in heterogeneous environmental DNA makes a comprehensive analysis difficult at a realistic sequencing depth. In this study, we developed a probe set and bioinformatics workflow for amoA enrichment sequencing using a hybridization capture technique.

Please note that the scripts were prepared for the analyses in this study and were not implemented as stand-alone research software. The analyses were performed on the supercomputing system at the National Institute of Genetics (NIG), Research Organization of Information and Systems (ROIS), and on the Earth Simulator systems at JAMSTEC.

The main script mainScript_amoA.sh and related modules are stored under the src/ directory. The amoA/CuMMO gene sequences used in the study are stored in the data/ directory. The official source code repository is at https://github.com/hiraokas/ProbeCaptureEnrichmentSequencing_amoA.

Code

The code is written in shell script and Python.

File	Description
clustering_seq.sh	Sequence clustering
fasta_seqlen_averageWithSD.sh	Analysis of sequence length distribution
fastq_pairend_marge.sh	Merge paired-end short-read sequences to single reads
fastq_remove_chimera.sh	Remove chimeric sequences
getseq_blast_output.sh	Get sequences using blast output file
mainScript_amoA.sh	Main script in this study
mapping.sh	Read mapping
phylogenetic_tree_construction.sh	Phylogenetic tree estimation
qsub_DDBJ.sh	Autogenerate script for grid engine (supercomputing system at DDBJ, Japan)
qsub_ES.sh	Autogenerate script for grid engine (Earth simulator system at JAMSTEC, Japan)
qsub_short.sh	Wrapper of qsub_DDBJ.sh (supercomputing system at DDBJ, Japan)
rename_fastafile.sh	Convert filenames according to given correspondence table
tidy2table.py	Convert tidy data to table format
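
As an illustration of what converting tidy data to table format means, the following minimal sketch pivots long-format records (one observation per row) into a wide table (one row per feature, one column per sample). It is not the repository's tidy2table.py; the function name `tidy_to_table`, the column names `otu`, `sample`, and `count`, and the fill value are assumptions made for this example.

```python
import csv
import io
from collections import defaultdict


def tidy_to_table(rows, row_key, col_key, value_key, fill="0"):
    """Pivot tidy (long-format) records into a wide table (list of rows)."""
    cells = defaultdict(dict)   # row label -> {column label -> value}
    columns = []                # column labels in order of first appearance
    for rec in rows:
        r, c = rec[row_key], rec[col_key]
        if c not in columns:
            columns.append(c)
        cells[r][c] = rec[value_key]
    header = [row_key] + columns
    # Missing row/column combinations are filled with `fill`.
    body = [[r] + [vals.get(c, fill) for c in columns] for r, vals in cells.items()]
    return [header] + body


# Hypothetical tidy input: one (otu, sample, count) observation per line.
tidy_text = "otu\tsample\tcount\nOTU1\tS1\t10\nOTU1\tS2\t3\nOTU2\tS2\t7\n"
records = list(csv.DictReader(io.StringIO(tidy_text), delimiter="\t"))
table = tidy_to_table(records, "otu", "sample", "count")
for line in table:
    print("\t".join(line))
```

Here OTU2 has no observation in sample S1, so that cell is filled with the default value. Whether the original script fills missing cells the same way is not stated in this document.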

Dataset

File	Description
amoA_MockPlasmid.fasta	amoA gene sequences used for the Mock sample
CuMMO_DB.fasta	CuMMO gene sequence database used for probe design
CuMMO_Selected.fasta	Selected 20 CuMMO gene sequences used as queries to retrieve CuMMO gene sequences from public gene databases (NCBI nt and env_nt)
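
The FASTA files above can be summarized by sequence length before further analysis. The sketch below shows one way to compute the mean and standard deviation of sequence lengths with the Python standard library only. It is not the repository's fasta_seqlen_averageWithSD.sh; the function names are hypothetical, the input is a small in-memory example, and the use of the sample (n-1) standard deviation is an assumption.

```python
import io
import statistics


def fasta_lengths(handle):
    """Return the length of every record in a FASTA-formatted text stream."""
    lengths = []
    current = None  # residues counted so far for the record being read
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def length_summary(lengths):
    """Return (mean, sample standard deviation) of a list of lengths."""
    mean = statistics.mean(lengths)
    sd = statistics.stdev(lengths) if len(lengths) > 1 else 0.0
    return mean, sd


# Hypothetical three-record FASTA; replace with open("data/CuMMO_DB.fasta")
# to summarize a real file.
example = io.StringIO(">seq1\nATGC\nATGC\n>seq2\nATGCAT\n>seq3\nATGCATGCAT\n")
lengths = fasta_lengths(example)
mean, sd = length_summary(lengths)
print(f"n={len(lengths)} mean={mean} sd={sd}")
```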

Dependencies

fastq2fasta.pl (direct link)
TrimGalore - Adapter trimming
PRINSEQ++ - Removal of low-complexity sequences
FLASH - Merging of paired-end reads
metaSPAdes - Metagenomic assembly
rnaSPAdes - Transcriptomic assembly
Prodigal - CDS prediction
DIAMOND - Similarity search
VSEARCH - Chimeric read prediction and removal
SeqKit - Sequence manipulation including length filtering
MMseqs2 - Sequence clustering
MAFFT - Sequence alignment
FastTree2 - Phylogenetic tree prediction
Bowtie2 - Read mapping
Nonpareil3 - Metagenomic coverage estimation
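
Because the scripts call these tools directly, it can help to confirm that they are available before running anything. The following is a minimal sketch that looks up each tool on the PATH. The executable names in `TOOLS` are assumptions based on how these packages are commonly installed (for example, FastTree may be installed as `fasttree` on some systems), and fastq2fasta.pl is omitted because it is a stand-alone script.

```python
import shutil

# Tool name -> executable name commonly used on the command line.
# These executable names are assumptions; adjust them to match your installation.
TOOLS = {
    "TrimGalore": "trim_galore",
    "PRINSEQ++": "prinseq++",
    "FLASH": "flash",
    "metaSPAdes": "metaspades.py",
    "rnaSPAdes": "rnaspades.py",
    "Prodigal": "prodigal",
    "DIAMOND": "diamond",
    "VSEARCH": "vsearch",
    "SeqKit": "seqkit",
    "MMseqs2": "mmseqs",
    "MAFFT": "mafft",
    "FastTree2": "FastTree",
    "Bowtie2": "bowtie2",
    "Nonpareil3": "nonpareil",
}


def find_tools(tools=TOOLS):
    """Map each tool name to the path of its executable, or None if not found."""
    return {name: shutil.which(exe) for name, exe in tools.items()}


status = find_tools()
for name, path in status.items():
    print(f"{name}\t{path if path else 'NOT FOUND'}")
```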

We also used the following tools and databases for detailed data analysis in this study.

Ocean Data View - Oceanic diversity analysis
MEGA X - Phylogenetic tree analysis
AOA amoA sequence database defined by Alves et al. (Ref.) - Taxonomic assignments of AOA OTUs

Citation

Hiraoka S. (2024) Probe capture enrichment sequencing of amoA genes discloses diverse ammonia-oxidizing archaeal and bacterial populations. bioRxiv. doi:10.1101/2023.04.10.536224

**Probe capture enrichment sequencing of amoA genes discloses diverse ammonia-oxidizing archaeal and bacterial populations**

Satoshi Hiraoka1†*, Minoru Ijichi2†, Hirohiko Takeshima2, Yohei Kumagai2, Ching-Chia Yang2, Yoko Makabe-Kobayashi2, Hideki Fukuda2, Susumu Yoshizawa2, Wataru Iwasaki2,3, Kazuhiro Kogure2, Takuhei Shiozaki2*

1. Research Center for Bioscience and Nanoscience (CeBN), Japan Agency for Marine-Earth Science and Technology (JAMSTEC), 2-15 Natsushima-cho, Yokosuka, Kanagawa 237-0061, Japan
2. Atmosphere and Ocean Research Institute, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-8564, Japan
3. Department of Integrated Biosciences, Graduate School of Frontier Sciences, the University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0882, Japan

† Contributed equally
* Corresponding author

