Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions

Biopiece: bowtie_seq

Description

bowtie_seq uses bowtie to map sequences in the stream against a specified genome formatted with format_genome, or against a specified bowtie index. Sequences can originate from FASTA type entries, or from Solexa or FASTQ type entries, in which case the quality scores will be utilized.

The resulting records look like this:

STRAND: +
SCORES: hhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Q_ID: test
REC_TYPE: BOWTIE
S_ID: chr2L
S_BEG: 9999
SEQ: GGAGGACAATGCAAAAAAGCTAAGAACAAA
DESCRIPTOR: 6:G>C
SEQ_LEN: 30
SCORE: 1
S_END: 10028
---
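For downstream scripting, the record layout shown above can be read into dictionaries. The following is a minimal Python sketch, not part of Biopieces; the function name parse_records is hypothetical, and it assumes only what the example shows: one "KEY: value" pair per line and a "---" line terminating each record.

```python
def parse_records(text):
    """Split a Biopieces-style stream into a list of dicts, one per record."""
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if line == "---":
            # A line of three dashes closes the current record.
            if current:
                records.append(current)
            current = {}
        elif line:
            # Split on the first colon only, so values such as "6:G>C" stay whole.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
    if current:
        records.append(current)
    return records


example = """\
STRAND: +
SCORES: hhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
Q_ID: test
REC_TYPE: BOWTIE
S_ID: chr2L
S_BEG: 9999
SEQ: GGAGGACAATGCAAAAAAGCTAAGAACAAA
DESCRIPTOR: 6:G>C
SEQ_LEN: 30
SCORE: 1
S_END: 10028
---
"""

records = parse_records(example)
hit = records[0]
print(hit["S_ID"], hit["S_BEG"], hit["S_END"], hit["STRAND"])
```

Note that in the example record S_END - S_BEG + 1 equals SEQ_LEN (10028 - 9999 + 1 = 30), so both coordinates appear to be inclusive.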

The key 'SCORE' denotes how many times this hit was located, which can then be used for outputting BED entries with write_bed.

The key 'DESCRIPTOR' describes mismatch information.
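As an illustration, a DESCRIPTOR value such as 6:G>C can be split into its parts. This Python sketch is not part of Biopieces and the function name parse_descriptor is hypothetical. It assumes the value follows bowtie's mismatch convention: a comma-separated list of offset:reference-base>read-base items with a 0-based offset. That assumption is consistent with the example record, where the base at 0-based position 6 of SEQ is C, but it should be verified against your bowtie version.

```python
def parse_descriptor(descriptor):
    """Return a list of (offset, reference_base, read_base) tuples.

    Assumes comma-separated items of the form offset:ref>read,
    as in the example value "6:G>C".
    """
    mismatches = []
    if not descriptor:
        # A perfect match is assumed to carry an empty descriptor.
        return mismatches
    for item in descriptor.split(","):
        offset, _, change = item.partition(":")
        ref_base, _, read_base = change.partition(">")
        mismatches.append((int(offset), ref_base, read_base))
    return mismatches


seq = "GGAGGACAATGCAAAAAAGCTAAGAACAAA"
mismatches = parse_descriptor("6:G>C")
for offset, ref_base, read_base in mismatches:
    print(f"offset {offset}: reference {ref_base}, read {read_base}, SEQ has {seq[offset]}")
```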

Bowtie must be installed for bowtie_seq to work. Read more about bowtie here:

http://bowtie-bio.sourceforge.net/index.shtml

Usage

... | bowtie_seq [options] -g <genome>

or

... | bowtie_seq [options] -i <index>

Options

[-?           | --help]                 #  Print full usage description.
[-g <genome!> | --genome=<genome!>]     #  Choose target genome (instead of index).
[-i <string>  | --index_name=<string>]  #  Choose target index (instead of genome).
[-m <uint>    | --mismatches=<uint>]    #  Mismatches allowed (0-3)     -  Default=0
[-h <uint>    | --max_hits=<uint>]      #  Max hits to report           -  Default=all
[-s <uint>    | --seed_length=<uint>]   #  Seed length                  -  Default=28
[-c <uint>    | --cpus=<uint>]          #  Number of CPUs to use        -  Default=1
[-I <file!>   | --stream_in=<file!>]    #  Read input from stream file  -  Default=STDIN
[-O <file>    | --stream_out=<file>]    #  Write output to stream file  -  Default=STDOUT
[-v           | --verbose]              #  Verbose output.
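When bowtie_seq is called from a script, the options above can be assembled programmatically. The Python sketch below is a hypothetical helper (bowtie_seq_args is not part of Biopieces) that uses only the long option names listed in the table. It assumes that exactly one of --genome and --index_name should be given, as the usage lines suggest, and it enforces the documented 0-3 range for mismatches. It only builds the argument list and does not run anything.

```python
def bowtie_seq_args(genome=None, index_name=None, mismatches=0,
                    max_hits=None, seed_length=28, cpus=1):
    """Build an argument list for bowtie_seq from the documented options."""
    # Assumption: genome and index are alternatives, so exactly one is required.
    if (genome is None) == (index_name is None):
        raise ValueError("give exactly one of genome or index_name")
    if not 0 <= mismatches <= 3:
        raise ValueError("mismatches must be in the range 0-3")

    args = ["bowtie_seq"]
    if genome is not None:
        args.append(f"--genome={genome}")
    else:
        args.append(f"--index_name={index_name}")
    args.append(f"--mismatches={mismatches}")
    if max_hits is not None:
        # Omitting --max_hits keeps the documented default of reporting all hits.
        args.append(f"--max_hits={max_hits}")
    args.append(f"--seed_length={seed_length}")
    args.append(f"--cpus={cpus}")
    return args


args = bowtie_seq_args(genome="hg18", mismatches=2, cpus=4)
print(" ".join(args))
```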

Examples

In order to use bowtie_seq to map a stack of query sequences from a FASTA file to a specified genome previously formatted with format_genome (use list_genomes to see available genomes), do:

read_fasta -i query_sequences.fna | bowtie_seq -g hg18

In order to use bowtie_seq to map a stack of query sequences from a FASTQ file to a specified index previously created with create_bowtie_index, do:

read_fastq -i query_sequences.fq | bowtie_seq -i ~/my_index/myindex

In order to use bowtie_seq to map a stack of query sequences from a Solexa file, do:

read_solexa -i query_sequences.solexa | bowtie_seq -g hg18

The result can be written in BED format with write_bed:

... | bowtie_seq -g hg18 | write_bed -xo output.bed

See also

format_genome

list_genomes

create_bowtie_index

read_fasta

read_fastq

read_solexa

write_bed

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

July 2009

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

bowtie_seq is part of the Biopieces framework.

http://www.biopieces.org
