Martin Asser Hansen edited this page Oct 1, 2015 · 5 revisions

Biopiece: uchime_seq

Description

Chimeric amplicon sequences can be identified with uchime_seq, which compares sequences against a reference database. If a record contains both SEQ_NAME and SEQ, the sequence is searched against the database to determine whether it is chimeric. A CHIMERIC key with the value YES or NO is added to each of these records to report the result:

SEQ_NAME: GXS0P3T01CW1VQ
SEQ: TACTGAGCTAAACCAT
SEQ_LEN: 16
CHIMERIC: YES
---
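
As a rough illustration of the record layout above, the following shell sketch tallies the CHIMERIC values in a stream that has been saved to a file. It assumes the plain-text stream format shown above (KEY: value lines, with each record terminated by ---). The temporary file and the second record are hypothetical and exist only to make the sketch self-contained.

```shell
# Hypothetical sample stream; the second record is made up for illustration.
stream=$(mktemp)
cat > "$stream" <<'EOF'
SEQ_NAME: GXS0P3T01CW1VQ
SEQ: TACTGAGCTAAACCAT
SEQ_LEN: 16
CHIMERIC: YES
---
SEQ_NAME: example_read_2
SEQ: GGATTACCAGTTTACA
SEQ_LEN: 16
CHIMERIC: NO
---
EOF

# Count the records carrying each CHIMERIC value.
chimeric_yes=$(grep -c '^CHIMERIC: YES$' "$stream")
chimeric_no=$(grep -c '^CHIMERIC: NO$' "$stream")
echo "chimeric: $chimeric_yes, non-chimeric: $chimeric_no"

rm -f "$stream"
```

In practice the stream would come from the -O option of uchime_seq or from a shell redirect rather than from a here-document.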

The recommended GOLD database for uchime_seq and 16S amplicon sequences can be found here:

http://drive5.com/uchime/gold.fa

Usearch must be installed in order for uchime_seq to work.

Read more here:

http://www.drive5.com/usearch/
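
Before running a pipeline it can help to confirm that both prerequisites are in place. The sketch below is one possible pre-flight check, not part of Biopieces: the function name check_uchime_prereqs is hypothetical, and it assumes the Usearch executable is installed under the name usearch and that the database was saved as gold.fa in the current directory.

```shell
# Report missing prerequisites for uchime_seq.
# Returns the number of problems found (0 means everything looks usable).
# Assumption: the Usearch executable is called "usearch" and is on PATH.
check_uchime_prereqs() {
    db="$1"
    problems=0
    if ! command -v usearch >/dev/null 2>&1; then
        echo "usearch not found on PATH" >&2
        problems=$((problems + 1))
    fi
    if [ ! -r "$db" ]; then
        echo "database not readable: $db" >&2
        problems=$((problems + 1))
    fi
    return "$problems"
}

# Assumed database location; adjust to wherever gold.fa was downloaded.
if check_uchime_prereqs gold.fa; then
    echo "ready to run uchime_seq"
else
    echo "fix the problems above before running uchime_seq"
fi
```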

Usage

... | uchime_seq -d <database>

Options

[-?           | --help]               #  Print full usage description.
[-d <file!>   | --database=<file!>]   #  Database for uchime search.
[-I <file!>   | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>    | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v           | --verbose]            #  Verbose output.

Examples

To remove chimeric sequences from a 16S amplicon dataset, keep only the records where CHIMERIC is NO:

read_sff -ci data.sff |
uchime_seq -d gold.fa |
grab -e 'CHIMERIC == NO'

See also

read_sff

grab

usearch_seq

uclust_seq

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

November 2012

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

uchime_seq is part of the Biopieces framework.

http://www.biopieces.org
