Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: assemble_seq_ray

Description

assemble_seq_ray assembles the sequences in the stream using Ray and outputs the contig sequences from the best assembly, defined as the one with the highest N50.

Several assemblies are created, one per k-mer size, because different k-mer lengths give very different assembly results. By default, all odd k-mer sizes from 19 to 31 are used.
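The selection rule described above can be sketched in Python. This is a minimal illustration, not the code assemble_seq_ray actually runs: the helper names odd_kmers, n50 and best_assembly are hypothetical, the contig lengths are made up, and N50 is assumed to be the length of the contig at which the running total of lengths, sorted longest first, reaches half of the total assembly size.

```python
def odd_kmers(kmer_min=19, kmer_max=31):
    """Return every odd k-mer size in [kmer_min, kmer_max]."""
    return [k for k in range(kmer_min, kmer_max + 1) if k % 2 == 1]


def n50(lengths):
    """Contig length at which the running total reaches half the assembly size."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no contigs at all


def best_assembly(assemblies):
    """Return the k-mer size whose assembly has the highest N50.

    `assemblies` maps k-mer size -> list of contig lengths (hypothetical layout).
    """
    return max(assemblies, key=lambda k: n50(assemblies[k]))


# Made-up contig lengths for three k-mer sizes, for illustration only.
example = {
    19: [100, 80, 60, 40],
    21: [150, 90, 30, 10],
    23: [120, 50, 50, 50, 10],
}
```

With the defaults, odd_kmers() yields seven k-mer sizes, so seven assemblies would be compared. When two assemblies tie on N50, this sketch keeps the first one encountered; how assemble_seq_ray itself breaks ties is not stated here.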

An assembly directory must be specified, and assemble_seq_ray leaves the original assembly files in this directory unless the -X switch is given.

Consult the Ray documentation for more information.

Ray and MPI must be installed in order for assemble_seq_ray to work.

Read more here:

http://denovoassembler.sourceforge.net/index.html

Usage

... | assemble_seq_ray [options] <dir>

Options

[-?          | --help]                #  Print full usage description.
[-d <dir>    | --dir=<dir>]           #  Assembly directory.
[-t <string> | --type=<string>]       #  Read type: single|paired                   -  Default="single"
[-k <uint>   | --kmer_min=<uint>]     #  Minimum k-mer value                        -  Default=19
[-K <uint>   | --kmer_max=<uint>]     #  Maximum k-mer value                        -  Default=31
[-c <uint>   | --cpus=<uint>]         #  Number of CPUs to use                      -  Default=1
[-X          | --clean]               #  Remove directory upon completed assembly.
[-I <file!>  | --stream_in=<file!>]   #  Read input from stream file                -  Default=STDIN
[-O <file>   | --stream_out=<file>]   #  Write output to stream file                -  Default=STDOUT
[-v          | --verbose]             #  Verbose output.

Examples

The example below illustrates a de novo assembly of a Lactococcus lactis strain. The sequences are read with read_fastq and trimmed with trim_seq before being piped to assemble_seq_ray. Following the assembly, the contigs are written to a file in FASTA format with write_fasta, and finally the contig sequences are analyzed with analyze_assembly:

read_fastq -i Lactococcus_NCDO0505.fq |
trim_seq |
assemble_seq_ray -d Ray -v |
write_fasta -o Lactococcus_NCDO0505.contigs |
analyze_assembly -x

N50: 5296
MAX: 35366
MIN: 50
MEAN: 533
TOTAL: 2833428
COUNT: 5308
---

Note that verbose output from assemble_seq_ray is enabled with the -v switch.
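For readers who want to check the summary figures independently, the following Python sketch recomputes the same six values from a FASTA file of contigs. It is not how analyze_assembly is implemented: the function names contig_lengths and summarize are hypothetical, N50 is assumed to be the contig length at which the running total, longest first, reaches half of TOTAL, and MEAN is truncated to an integer because that matches the output above (2833428 / 5308 is about 533.8, reported as 533).

```python
def contig_lengths(fasta_lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length


def summarize(lengths):
    """Return N50, MAX, MIN, MEAN, TOTAL and COUNT for a set of contig lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("no contigs to summarize")
    total = sum(lengths)
    running = 0
    n50 = 0
    for length in lengths:
        running += length
        if 2 * running >= total:
            n50 = length
            break
    return {
        "N50": n50,
        "MAX": lengths[0],
        "MIN": lengths[-1],
        "MEAN": total // len(lengths),  # integer division: an assumption
        "TOTAL": total,
        "COUNT": len(lengths),
    }
```

To apply it to the file written in the example, open Lactococcus_NCDO0505.contigs and pass the file object to contig_lengths, then pass the result to summarize.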

See also

read_fastq

trim_seq

write_fasta

analyze_assembly

assemble_seq_idba

assemble_seq_velvet

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

May 2011

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

assemble_seq_ray is part of the Biopieces framework.

http://www.biopieces.org
