embl_index
Martin Asser Hansen edited this page Oct 2, 2015 · 6 revisions
EMBL files can be indexed with embl_index, which for each EMBL entry outputs a Biopiece record with the entry ID, the OFFSET position in the file and the LEN (length) of the EMBL entry, as well as the full file path:
FILE: /Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat
ID: CH482373
OFFSET: 0
LEN: 6647452
---
This can be used with write_mysql to create a database for fast lookups of EMBL entries.
Note that the EMBL files cannot be gzipped, because gzipped files do not allow random access.
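To illustrate why random access matters, here is a minimal Python sketch of how an OFFSET and LEN pair could be computed and then used to fetch a single entry with one seek. It is not the embl_index implementation: the function names `index_embl` and `read_entry` are hypothetical, and the sketch assumes that each entry starts with an `ID` line, ends with a `//` line, and that OFFSET and LEN are counted in bytes.

```python
def index_embl(path):
    """Yield (entry_id, offset, length) for each entry in an uncompressed EMBL file.

    Assumption: entries start with an "ID" line and end with a "//" line,
    and offsets and lengths are measured in bytes.
    """
    with open(path, "rb") as handle:
        entry_id = None
        start = 0
        pos = 0
        for line in handle:
            if entry_id is None and line.startswith(b"ID "):
                fields = line.split()
                if len(fields) > 1:
                    # The first token after "ID" is the entry name; drop a trailing ';'.
                    entry_id = fields[1].rstrip(b";").decode("ascii")
                    start = pos
            pos += len(line)
            if entry_id is not None and line.startswith(b"//"):
                yield entry_id, start, pos - start
                entry_id = None


def read_entry(path, offset, length):
    """Return the raw bytes of one entry by seeking straight to its offset."""
    with open(path, "rb") as handle:
        handle.seek(offset)
        return handle.read(length)
```

With a plain file, `read_entry` jumps directly to the requested entry. A gzipped file would have to be decompressed from the start to reach the same position, which defeats the purpose of the index.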
embl_index [options] -i <EMBL file(s)>
[-? | --help] # Print full usage description.
[-i <files!> | --data_in=<files!>] # Comma separated list of files or glob expression to read.
[-I <file> | --stream_in=<file!>] # Read input stream from file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output stream to file - Default=STDOUT
[-v | --verbose] # Verbose output.
embl_index -i rel_ann_pro_01_r100.dat | head_records -n 1
FILE: /Users/maasha/DATA/EMBL/rel_ann_pro_01_r100.dat
ID: CH482373
OFFSET: 0
LEN: 6647452
---
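If the index records need to be consumed outside Biopieces, the text format shown above is simple to parse. The following Python sketch assumes only what the example output shows: one `KEY: value` pair per line and a `---` line closing each record. The function name `parse_records` is hypothetical.

```python
def parse_records(lines):
    """Yield one dict per record from an iterable of text lines.

    Assumption: each record is a run of "KEY: value" lines
    terminated by a line containing only "---".
    """
    record = {}
    for line in lines:
        line = line.rstrip("\n")
        if line == "---":
            if record:
                yield record
            record = {}
        elif ": " in line:
            # Split on the first ": " only, so values may contain colons.
            key, value = line.split(": ", 1)
            record[key] = value
    if record:
        # Emit a trailing record that lacks a closing "---".
        yield record
```

The function accepts any iterable of lines, so it can be fed `sys.stdin` when the output of embl_index is piped into a script. All values are returned as strings; OFFSET and LEN need an `int()` conversion before use.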
To create an EMBL MySQL database do:
embl_index -i rel_ann_pro_01_r100.dat | write_mysql -d EMBL -r Release_100 -x
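The table layout that write_mysql creates is not described on this page, so the following is only a sketch of the lookup idea. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name `embl_index`, its column names, and the functions `build_lookup` and `locate` are all hypothetical.

```python
import sqlite3


def build_lookup(records, db_path=":memory:"):
    """Store index records (dicts with FILE, ID, OFFSET, LEN) in a SQLite table.

    SQLite stands in for MySQL here; the schema is an assumption.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS embl_index ("
        "entry_id TEXT PRIMARY KEY, file_path TEXT, "
        "file_offset INTEGER, entry_len INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO embl_index VALUES (?, ?, ?, ?)",
        [(r["ID"], r["FILE"], int(r["OFFSET"]), int(r["LEN"])) for r in records],
    )
    conn.commit()
    return conn


def locate(conn, entry_id):
    """Return (file_path, file_offset, entry_len) for an ID, or None if absent."""
    return conn.execute(
        "SELECT file_path, file_offset, entry_len FROM embl_index WHERE entry_id = ?",
        (entry_id,),
    ).fetchone()
```

Because the entry ID is the primary key, a lookup is a single indexed query, and the returned path, offset and length are enough to read the entry from the uncompressed EMBL file.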
Martin Asser Hansen - Copyright (C) - All rights reserved.
July 2009
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
embl_index is part of the Biopieces framework.