Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: clip_seq

Description

clip_seq clips (removes) soft-masked sequence, i.e. lowercase residues, from the beginning and end of each sequence in the stream, together with the corresponding quality scores if present. This way of denoting which sequence requires clipping is derived from sff_extract, a third-party tool in the MIRA package.

sff_extract is located in a file named mira_3rdparty... here:

http://sourceforge.net/projects/mira-assembler/files/

Read more about Mira here:

http://www.chevreux.org/projects_mira.html

Usage

... | clip_seq [options]

Options

[-?          | --help]               #  Print full usage description.
[-I <file!>  | --stream_in=<file!>]  #  Read input stream from file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output stream to file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

Consider the following FASTQ entry in the file test.fastq:

@GP8WFI101DE0H9
tcagTCTACGTCTCTGGACTGtaactgac
+
hhhhhhhhhhTQQOSWZ^^XMMMNNS`YY

We can read in this sequence using read_fastq and then clip it with clip_seq like this:

read_fastq -i test.fastq | clip_seq

SEQ_NAME: GP8WFI101DE0H9
SEQ: TCTACGTCTCTGGACTG
SEQ_LEN: 17
SCORES: hhhhhhTQQOSWZ^^XM
---
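
The clipping shown above can be illustrated with a minimal Python sketch. This is not the clip_seq implementation; the function name clip_soft_mask is hypothetical, and the sketch assumes that only leading and trailing lowercase runs are removed, that lowercase residues inside the sequence are left untouched, and that a fully soft-masked sequence clips to an empty string.

```python
def clip_soft_mask(seq, scores=None):
    """Clip leading and trailing lowercase (soft-masked) residues.

    Hypothetical helper for illustration only; the behavior for
    internal lowercase runs and fully masked sequences is an assumption.
    """
    # Advance past the leading lowercase run.
    start = 0
    while start < len(seq) and seq[start].islower():
        start += 1

    # Step back past the trailing lowercase run.
    end = len(seq)
    while end > start and seq[end - 1].islower():
        end -= 1

    # Quality scores, if present, are clipped at the same positions.
    clipped_scores = scores[start:end] if scores is not None else None
    return seq[start:end], clipped_scores


# The FASTQ entry from the example above.
seq, scores = clip_soft_mask(
    "tcagTCTACGTCTCTGGACTGtaactgac",
    "hhhhhhhhhhTQQOSWZ^^XMMMNNS`YY",
)
print(seq)       # TCTACGTCTCTGGACTG
print(len(seq))  # 17
print(scores)    # hhhhhhTQQOSWZ^^XM
```

Clipping the sequence and the scores with the same start and end positions keeps each score aligned with its base, which matches the SEQ and SCORES values in the output above.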

See also

read_fastq

scores_to_dec

trim_seq

mask_seq

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

October 2010

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

clip_seq is part of the Biopieces framework.

http://www.biopieces.org
