Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions

Biopiece: fold_seq

Description

fold_seq folds nucleotide sequences into secondary structures, which is useful for illustrating e.g. miRNA hairpins. The resulting folding information may be uploaded to a custom UCSC Genome Browser track.

fold_seq currently uses RNAfold from the Vienna package as its folding engine. RNAfold must be installed for fold_seq to work. Read more about RNAfold here:

http://www.tbi.univie.ac.at/RNA/
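Since fold_seq depends on RNAfold, it can help to confirm that the executable is reachable before building a pipeline. The following is a minimal sketch, assuming that fold_seq finds RNAfold through the PATH environment variable (the page does not state how it is located); have_cmd is a hypothetical helper name.

```shell
# Hypothetical helper: succeeds if the named command can be found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Assumption: fold_seq looks up RNAfold via PATH.
if have_cmd RNAfold; then
    echo "RNAfold found at: $(command -v RNAfold)"
else
    echo "RNAfold not found on PATH; fold_seq will probably fail" >&2
fi
```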

Usage

... | fold_seq [options]

Options

[-?         | --help]               #  Print full usage description.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file   -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file   -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the following RNA sequence entry in FASTA format from the file test.fna:

>MI0000116
UUCAGCCUUUGAGAGUUCCAUGCUUCCUUGCAUUCAAUAGUUAUAUUCAAGCAUAUGGAAUGUAAAGAAGUAUGGAGCGAAAUCUGGCGAG

We can read that file with read_fasta and fold the sequence with fold_seq:

read_fasta -i test.fna | fold_seq

SIZE: 91
FREE_ENERGY: -36.20
SCORE: 36
SEQ: UUCAGCCUUUGAGAGUUCCAUGCUUCCUUGCAUUCAAUAGUUAUAUUCAAGCAUAUGGAAUGUAAAGAAGUAUGGAGCGAAAUCUGGCGAG
SEQ_NAME: MI0000116
SEQ_LEN: 91
SEC_STRUCT: ....(((...((..((((((((((((.((((((((.((((((.......))).))).)))))))).))))))))))))....)).)))...
CONF: 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 <truncated>
---

The resulting record contains the secondary structure in Stockholm format as the value of SEC_STRUCT. The FREE_ENERGY key is useful for selecting good structures, for example by filtering records with grab. The CONF key holds per-position confidence information for the folding; it is currently not used, so every position is set to 1.
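As an illustration of selecting structures on FREE_ENERGY, here is a small sketch using plain awk rather than grab. It assumes the record layout shown in the example output ("KEY: value" lines, with each record terminated by a line containing only "---"); select_by_energy is a hypothetical helper and not part of Biopieces.

```shell
# Hypothetical helper: keep only records whose FREE_ENERGY is at or below
# the given threshold (more negative values indicate more stable folds).
# Assumes "KEY: value" lines with records terminated by a "---" line.
select_by_energy() {
    awk -v max="$1" '
        $0 == "---" {
            # End of record: print it if it had an energy within the limit.
            if (have && energy <= (max + 0)) printf "%s---\n", rec
            rec = ""; have = 0
            next
        }
        {
            # Accumulate the record and remember its FREE_ENERGY value.
            rec = rec $0 "\n"
            if ($1 == "FREE_ENERGY:") { energy = $2 + 0; have = 1 }
        }
    '
}
```

With the folded records saved to a stream file (for instance via the -O option), something like `select_by_energy -30 < folded.stream` would keep only records with a free energy of -30 or lower; the file name and threshold here are placeholders.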

Now, if you have a fully configured UCSC Genome Browser system installed, it is possible to upload secondary structure information. To do this, you need a BED entry and the folded sequence. A reasonable approach is to keep a BED file with the coordinates of the sequences you wish to fold and display, and then run:

read_bed -i <BED file> | get_genome_seq -g <genome> | fold_seq | upload_to_ucsc -d <genome> -t <my_table_rnaSecStr> -x

Notice that the table name must contain the string 'rnaSecStr' to display the folding information in the Genome Browser.
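Because a table name without this string would upload but not display the folding, a quick check before running the pipeline may save a round trip. This is a minimal sketch; is_sec_struct_table is a hypothetical helper, the table names are made up, and the case-sensitive match is an assumption.

```shell
# Hypothetical helper: succeeds if the table name contains 'rnaSecStr'.
# Assumption: the match is case-sensitive.
is_sec_struct_table() {
    case "$1" in
        *rnaSecStr*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example with a made-up table name.
table="my_mirna_rnaSecStr"
if is_sec_struct_table "$table"; then
    echo "Table name '$table' is suitable for secondary structure display"
else
    echo "Table name '$table' must contain 'rnaSecStr'" >&2
fi
```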

See also

read_fasta

upload_to_ucsc

grab

read_bed

get_genome_seq

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

[email protected]

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

fold_seq is part of the Biopieces framework.

http://www.biopieces.org
