read_pdb

Biopiece: read_pdb

Description

read_pdb reads PDB entries from one or more files and converts the sequences to FASTA format. Two conversion methods are available. The first, which is the default, requires SEQRES records in the input PDB file. The second looks for ATOM records with the CA atom name. The two methods do not necessarily produce identical results; in general, the SEQRES method is faster and more reliable. read_pdb handles missing residues correctly if the corresponding records are present, and it copes with other common problems in PDB files, such as non-blank alternate location indicators, non-blank residue insertion codes, and broken chains. It can also translate more than 200 modified residues, and the set of translatable modified residues is expected to grow.
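The two conversion methods described above can be illustrated with a minimal Python sketch. This is not the read_pdb implementation; the function names seq_from_seqres and seq_from_ca are hypothetical, the column positions follow the fixed-width PDB format, and unknown or modified residues are simply mapped to X here, whereas read_pdb translates many of them.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def seq_from_seqres(lines):
    """Build {chain: sequence} from SEQRES records (hypothetical helper)."""
    chains = {}
    for line in lines:
        if line.startswith("SEQRES"):
            chain = line[11]                  # column 12: chain identifier
            residues = line[19:70].split()    # columns 20-70: residue names
            chains.setdefault(chain, []).extend(
                THREE_TO_ONE.get(res, "X") for res in residues)
    return {chain: "".join(seq) for chain, seq in chains.items()}


def seq_from_ca(lines):
    """Build {chain: sequence} from ATOM records whose atom name is CA."""
    chains = {}
    seen = set()
    for line in lines:
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]                  # column 22: chain identifier
            # Residue number plus insertion code identifies a residue, so
            # extra alternate locations of the same residue are skipped.
            key = (chain, line[22:26], line[26])
            if key in seen:
                continue
            seen.add(key)
            chains.setdefault(chain, []).append(
                THREE_TO_ONE.get(line[17:20], "X"))
    return {chain: "".join(seq) for chain, seq in chains.items()}


seqres = ["SEQRES   1 A   10  SER HIS GLY MET ALA ASP GLU GLU LYS LEU"]
print(seq_from_seqres(seqres))  # {'A': 'SHGMADEEKL'}
```

A residue that has no coordinates in the file contributes to the SEQRES sequence but not to the CA sequence, which is one reason the two methods can disagree.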

Each resulting Biopieces record has the following form:

REC_TYPE: PDB
SEQ_NAME: test
SEQ: SHGMADEEKL
SEQ_LEN: 10
SEQ_CHAIN: A
ATOM_COORD: <semicolon separated list of coordinates records>
---
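As a rough illustration of how such a record could be consumed downstream, the following Python sketch parses a stream of "KEY: value" lines terminated by "---" into dictionaries. The function name parse_records is hypothetical, and the sketch assumes every field fits on one line; the internal layout of the ATOM_COORD value is not specified here, so it is kept as an opaque string.

```python
def parse_records(text):
    """Split a Biopieces-style stream into a list of {key: value} dicts."""
    records = []
    current = {}
    for line in text.splitlines():
        if line.strip() == "---":        # record terminator
            if current:
                records.append(current)
            current = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return records


stream = """REC_TYPE: PDB
SEQ_NAME: test
SEQ: SHGMADEEKL
SEQ_LEN: 10
SEQ_CHAIN: A
---
"""

for record in parse_records(stream):
    # SEQ_LEN should agree with the actual sequence length.
    assert int(record["SEQ_LEN"]) == len(record["SEQ"])
    print(record["SEQ_NAME"], record["SEQ_CHAIN"], record["SEQ"])
```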

read_pdb can filter the records written to the ATOM_COORD field; see options -c, -a and -r below. If the -c option is not set, both ATOM and HETATM records are included in the output. If more than one filter option is set, a record must meet all conditions.
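The filter semantics described above (every filter that is set must match, and both ATOM and HETATM records pass when no record filter is given) can be sketched in Python as follows. This is an illustration under assumptions, not the read_pdb source: the function keep_record and its parameter names are hypothetical, and the column positions follow the fixed-width PDB format.

```python
def keep_record(line, record_names=None, atom_names=None, residue_names=None):
    """Return True if a coordinate line passes every filter that is set.

    Each filter is a set of allowed names, or None when the corresponding
    option (-c, -a or -r) is not given.
    """
    record = line[0:6].strip()            # columns 1-6: record name
    if record not in ("ATOM", "HETATM"):
        return False                      # not a coordinate record
    if record_names is not None and record not in record_names:
        return False
    if atom_names is not None and line[12:16].strip() not in atom_names:
        return False                      # columns 13-16: atom name
    if residue_names is not None and line[17:20].strip() not in residue_names:
        return False                      # columns 18-20: residue name
    return True


line = "ATOM      2  CA  ILE A   1       1.000   0.000   0.000"
# Roughly corresponds to: read_pdb -i test.pdb -a CA -r ILE,SER
print(keep_record(line, atom_names={"CA"}, residue_names={"ILE", "SER"}))  # True
```

Combining the conditions with a logical AND mirrors the statement that a record must meet all conditions when several filter options are set.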

Input PDB files must conform to the standard format described here:

http://www.wwpdb.org/documentation/format33/v3.3.html

Usage

read_pdb [options] -i <PDB file>

Options

[-?          | --help]               #  Print full usage description.
[-i <files!> | --data_in=<files!>]   #  Comma separated list of existing PDB files.
[-I <file!>  | --stream_in=<file!>]  #  Read input stream from file  -  Default=STDIN
[-o <file>   | --data_out=<file>]    #  Write result to file.
[-O <file>   | --stream_out=<file>]  #  Write output stream to file  -  Default=STDOUT
[-m <string> | --method=<string>]    #  Method to convert, available: seqres | atom  -  Default=seqres
[-M          | --missing]            #  When using atom method, try to resolve missing residues
[-c <list>   | --record=<list>]      #  Match a subset of record names only.
[-a <list>   | --atom_name=<list>]   #  Match a subset of atom names only.
[-r <list>   | --residue_name=<list>] #  Match a subset of residue names only.

Examples

Read a PDB file (uses the default "seqres" method and writes output to STDOUT):

read_pdb -i test.pdb

Read PDB files using the "atom" method, resolve missing residues where possible, and write the output to the file test.fasta:

read_pdb -i test1.pdb,test2.pdb -m atom -M -o test.fasta

Read a PDB file and keep only atom coordinate records with atom name CA and residue name ILE or SER:

read_pdb -i test.pdb -a CA -r ILE,SER

Author

[email protected]

December 2014

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

read_pdb is part of the Biopieces framework.

http://www.biopieces.org
