Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: shred_seq

Description

shred_seq shreds sequences into random subsequences of a specified length until a specified coverage is reached. The subsequences are alternately derived from the + and - strands.

This is useful for generating artificial reads from contigs for use with different assembly software.
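The behaviour described above can be illustrated with a minimal Python sketch. It is not the shred_seq implementation: the function names `shred` and `reverse_complement`, the `rng` parameter, and in particular the stopping rule (emit reads until the emitted bases total at least coverage times the sequence length) are assumptions made for illustration, as is the handling of sequences shorter than the requested size.

```python
import random

# Complement table for unambiguous DNA bases (assumption: input is plain ACGT).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def shred(seq, size=500, coverage=100, rng=None):
    """Yield random subsequences of `size` bases, alternating + and - strands.

    Stops once the emitted bases total at least coverage * len(seq).
    This stopping rule is an assumption, not taken from the shred_seq source.
    """
    rng = rng or random.Random()
    if size <= 0 or size > len(seq):
        return  # hypothetical handling: nothing sensible to emit
    target = coverage * len(seq)
    emitted = 0
    plus_strand = True
    while emitted < target:
        start = rng.randint(0, len(seq) - size)  # both bounds inclusive
        sub = seq[start:start + size]
        # Minus-strand reads are the reverse complement of the chosen window.
        yield sub if plus_strand else reverse_complement(sub)
        emitted += size
        plus_strand = not plus_strand


if __name__ == "__main__":
    for read in shred("ATGCACATTCGACTAGCA", size=10, coverage=2):
        print(read)
```

Because the start positions are random, the reads differ from run to run unless a seeded `random.Random` is passed as `rng`.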

Usage

... | shred_seq [options]

Options

[-?         | --help]               #  Print full usage description.
[-s <uint>  | --size=<uint>]        #  Size of subsequences         -  Default=500
[-c <uint>  | --coverage=<uint>]    #  Coverage to reach            -  Default=100
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file  -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file  -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the following FASTA entry in the file test.fna:

>test
ATGCACATTCGACTAGCA

Read the sequence with read_fasta and pipe it to shred_seq, using the -s switch to choose subsequences of size 10 and the -c switch to specify a coverage of 2:

read_fasta -i test.fna | shred_seq -s 10 -c 2

SEQ_NAME: test
SEQ: TGCACATTCG
SEQ_LEN: 10
---
SEQ_NAME: test
SEQ: GCTAGTCGAA
SEQ_LEN: 10
---
SEQ_NAME: test
SEQ: TTCGACTAGC
SEQ_LEN: 10
---
SEQ_NAME: test
SEQ: TCGAATGTGC
SEQ_LEN: 10
---
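The strand alternation can be checked against the output above with a short Python sketch. The helper names `reverse_complement` and `locate` are hypothetical and not part of Biopieces; the script only maps each emitted subsequence back to the test sequence, either directly (+ strand) or through its reverse complement (- strand).

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def locate(read, source):
    """Return (strand, 0-based start) for a read, or None if it is not found."""
    pos = source.find(read)
    if pos != -1:
        return ("+", pos)
    pos = source.find(reverse_complement(read))
    if pos != -1:
        return ("-", pos)
    return None


source = "ATGCACATTCGACTAGCA"
reads = ["TGCACATTCG", "GCTAGTCGAA", "TTCGACTAGC", "TCGAATGTGC"]

for read in reads:
    print(read, locate(read, source))
# TGCACATTCG ('+', 1)
# GCTAGTCGAA ('-', 7)
# TTCGACTAGC ('+', 7)
# TCGAATGTGC ('-', 2)
```

The four reads alternate between the + and - strands, and together they contain 40 bases, which is the first multiple of 10 to reach 2 x 18 = 36 bases for the 18-base test sequence.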

See also

split_seq

read_fasta

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

January 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

shred_seq is part of the Biopieces framework.

http://www.biopieces.org
