read_fasta

Biopiece: read_fasta

Description

read_fasta reads sequence entries from FASTA files. Each entry consists of a header line, which is the sequence name prefixed by '>', followed by one or more lines of sequence that continue until the next entry or the end of the file. Each entry is converted into a biopiece record of the following form:

SEQ_NAME: test
SEQ_LEN: 10
SEQ: ATCGATCGAC
---
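
The mapping from a FASTA entry to a record can be sketched in Python as below. This is only an illustration of the format described above, not the read_fasta implementation: the function names parse_fasta, make_record and format_record are hypothetical, and skipping blank lines is an assumption.

```python
from typing import Dict, Iterable, Iterator


def make_record(name: str, seq: str) -> Dict[str, object]:
    """Build a record with the three keys shown in the description above."""
    return {"SEQ_NAME": name, "SEQ_LEN": len(seq), "SEQ": seq}


def parse_fasta(lines: Iterable[str]) -> Iterator[Dict[str, object]]:
    """Yield one record per FASTA entry found in an iterable of text lines."""
    name = None
    chunks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        if line.startswith(">"):
            # A new header closes the previous entry, if there was one.
            if name is not None:
                yield make_record(name, "".join(chunks))
            name = line[1:].strip()
            chunks = []
        elif name is not None:
            # Sequence may span several lines; join them into one string.
            chunks.append(line)
    if name is not None:
        yield make_record(name, "".join(chunks))


def format_record(record: Dict[str, object]) -> str:
    """Render a record as 'KEY: value' lines terminated by '---'."""
    body = "".join(f"{key}: {value}\n" for key, value in record.items())
    return body + "---\n"


if __name__ == "__main__":
    example = [">test", "ATCGA", "TCGAC"]
    for record in parse_fasta(example):
        print(format_record(record), end="")
```

Run on the two-line entry in the example, the sketch joins the sequence lines and prints the same record as shown above.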

Input files may be compressed with gzip or bzip2.
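
One way to handle plain, gzip and bzip2 input uniformly is sketched below in Python. How read_fasta itself recognises compressed files is not stated here; this sketch assumes detection by the leading magic bytes of the file, and the helper name open_fasta is hypothetical.

```python
import bz2
import gzip
from typing import TextIO

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BZIP2_MAGIC = b"BZh"      # first three bytes of every bzip2 stream


def open_fasta(path: str) -> TextIO:
    """Open a plain, gzip or bzip2 file for reading as text."""
    # Peek at the first bytes instead of trusting the file extension.
    with open(path, "rb") as handle:
        magic = handle.read(3)
    if magic.startswith(GZIP_MAGIC):
        return gzip.open(path, "rt")
    if magic.startswith(BZIP2_MAGIC):
        return bz2.open(path, "rt")
    return open(path, "r")
```

The returned handle yields decompressed text lines in all three cases, so the code that parses the entries does not need to know whether the file was compressed.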

For more about the FASTA format:

http://en.wikipedia.org/wiki/Fasta_format

Usage

read_fasta [options] -i <FASTA file(s)>

Options

[-?          | --help]               #  Print full usage description.
[-i <files!> | --data_in=<files!>]   #  Comma separated list of files or glob expression to read.
[-n <uint>   | --num=<uint>]         #  Limit number of records to read.
[-I <file!>  | --stream_in=<file!>]  #  Read input stream from file  -  Default=STDIN
[-O <file>   | --stream_out=<file>]  #  Write output stream to file  -  Default=STDOUT
[-v          | --verbose]            #  Verbose output.

Examples

To read all FASTA entries from a file:

read_fasta -i test.fna

To read in only 10 records from a FASTA file:

read_fasta -n 10 -i test.fna

To read all FASTA entries from multiple files:

read_fasta -i test1.fna,test2.fna

To read FASTA entries from multiple files using a glob expression:

read_fasta -i '*.fna'
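
The quotes around '*.fna' stop the shell from expanding the pattern, so the pattern itself is passed to the -i option. How such a value could then be turned into a list of files is sketched below in Python. This is an assumption-laden illustration, not the read_fasta implementation: the name expand_inputs is hypothetical, and both the sorted order and the handling of patterns without matches are choices made for this sketch.

```python
import glob
from typing import List


def expand_inputs(spec: str) -> List[str]:
    """Expand a comma-separated list of file names or glob patterns."""
    paths: List[str] = []
    for pattern in spec.split(","):
        pattern = pattern.strip()
        if not pattern:
            continue  # tolerate stray commas
        # Sort matches so the file order is reproducible.
        matches = sorted(glob.glob(pattern))
        # Assumption: keep an unmatched name as-is, so that a missing
        # file is reported when it is opened rather than silently dropped.
        paths.extend(matches if matches else [pattern])
    return paths
```

With this sketch, 'test1.fna,test2.fna' and '*.fna' give the same list in a directory that holds only those two .fna files.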

Author

[email protected]

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

read_fasta is part of the Biopieces framework.

http://www.biopieces.org
