read_pdb
Read PDB entries from one or more files and convert their sequences to FASTA format. Two conversion methods are available. The first, which is the default, requires SEQRES records to be present in the input PDB file. The second looks for ATOM records with the atom name CA. The two methods do not necessarily produce identical results; the SEQRES method is generally faster and more reliable. read_pdb handles missing residues correctly if the corresponding records are present, and it also copes with other common problems in PDB files, such as non-blank alternate location indicators, non-blank residue insertion codes, and broken chains. It can also translate more than 200 modified residues, and this set is expected to grow in the future.
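To illustrate why the two methods can disagree, here is a minimal Python sketch of both ideas. It is not read_pdb's own code: the function names, the `atom_line` helper, the toy input, and the choice of "X" for unknown residues are all assumptions made for this example. Column positions follow the fixed-width PDB format.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def one_letter(res_name):
    # Assumption of this sketch: anything outside the table becomes "X".
    return THREE_TO_ONE.get(res_name, "X")


def seq_from_seqres(lines):
    """Return {chain: sequence} built from SEQRES records."""
    chains = {}
    for line in lines:
        if line.startswith("SEQRES"):
            chain = line[11]                # column 12: chain ID
            residues = line[19:70].split()  # columns 20-70: residue names
            chains[chain] = chains.get(chain, "") + "".join(map(one_letter, residues))
    return chains


def seq_from_atoms(lines):
    """Return {chain: sequence} built from CA atoms in ATOM records."""
    chains, seen = {}, set()
    for line in lines:
        if line.startswith("ATOM  ") and line[12:16].strip() == "CA":
            chain = line[21]                # column 22: chain ID
            key = (chain, line[22:27])      # residue number plus insertion code
            if key in seen:                 # same residue seen again (alternate location)
                continue
            seen.add(key)
            chains[chain] = chains.get(chain, "") + one_letter(line[17:20])
    return chains


def atom_line(serial, name, res, chain, seq):
    """Hypothetical helper: format a minimal fixed-column ATOM record."""
    return f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{seq:>4}    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}"


PDB_LINES = [
    "SEQRES   1 A   10  SER HIS GLY MET ALA ASP GLU GLU LYS LEU",
    atom_line(1, " CA ", "SER", "A", 1),
    atom_line(2, " CA ", "HIS", "A", 2),
    atom_line(3, " CA ", "GLY", "A", 3),
    # Residue 4 (MET) has no coordinates in this toy entry.
    atom_line(4, " CA ", "ALA", "A", 5),
]

print(seq_from_seqres(PDB_LINES))  # {'A': 'SHGMADEEKL'}
print(seq_from_atoms(PDB_LINES))   # {'A': 'SHGA'}
```

In this toy entry the SEQRES record lists the full chain, while the ATOM records cover only the residues that have coordinates, so the two sequences differ.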
The resulting Biopieces record has the following form:
REC_TYPE: PDB
SEQ_NAME: test
SEQ: SHGMADEEKL
SEQ_LEN: 10
SEQ_CHAIN: A
ATOM_COORD: <semicolon separated list of coordinates records>
---
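The record shown above is plain text: one `KEY: value` pair per line, closed by a `---` line. As a rough sketch of how such a stream could be consumed outside Biopieces, the Python below parses it into dictionaries. The names `parse_records` and `to_fasta` are hypothetical, the FASTA header layout is an assumption, and the ATOM_COORD field is left out of the sample because its exact layout is not specified here.

```python
SAMPLE = """\
REC_TYPE: PDB
SEQ_NAME: test
SEQ: SHGMADEEKL
SEQ_LEN: 10
SEQ_CHAIN: A
---
"""


def parse_records(text):
    """Split a record stream into a list of {key: value} dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        if line.strip() == "---":          # record terminator
            if current:
                records.append(current)
            current = {}
        elif ": " in line:
            key, value = line.split(": ", 1)
            current[key] = value
    return records


def to_fasta(record):
    # Assumed header layout: just the sequence name.
    return f">{record['SEQ_NAME']}\n{record['SEQ']}\n"


records = parse_records(SAMPLE)
print(to_fasta(records[0]), end="")
```

Comparing `int(record["SEQ_LEN"])` with `len(record["SEQ"])` is a cheap consistency check when reading records this way.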
read_pdb can filter the records written to the ATOM_COORD field; see options -c, -a, and -r below. If the -c option is not set, both ATOM and HETATM records are included in the output. If more than one filter option is set, a record must meet all conditions to be kept.
Input PDB files must conform to the standard format described here:
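The filter semantics described here (each option narrows the set, and all set options must match) can be sketched in Python as follows. This is only an illustration of the AND logic, not the tool's implementation; `keep` and `coord_line` are hypothetical names, and the sample coordinate lines are made up.

```python
def keep(line, records=None, atom_names=None, residue_names=None):
    """Return True if a coordinate line passes every filter that is set."""
    record = line[0:6].strip()
    if record not in ("ATOM", "HETATM"):   # only coordinate records qualify
        return False
    if records is not None and record not in records:
        return False
    if atom_names is not None and line[12:16].strip() not in atom_names:
        return False
    if residue_names is not None and line[17:20].strip() not in residue_names:
        return False
    return True


def coord_line(record, serial, name, res, chain, seq):
    """Hypothetical helper: format a minimal fixed-column coordinate record."""
    return f"{record:<6}{serial:>5} {name:<4} {res:>3} {chain}{seq:>4}    {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}"


LINES = [
    coord_line("ATOM", 1, " CA ", "ILE", "A", 1),
    coord_line("ATOM", 2, " CA ", "GLY", "A", 2),
    coord_line("ATOM", 3, " N  ", "SER", "A", 3),
    coord_line("ATOM", 4, " CA ", "SER", "A", 3),
    coord_line("HETATM", 5, " O  ", "HOH", "A", 101),
]

# Roughly what "-a CA -r ILE,SER" asks for: CA atoms in ILE or SER residues.
selected = [l for l in LINES if keep(l, atom_names={"CA"}, residue_names={"ILE", "SER"})]
print(len(selected))  # 2
```

With no filters set, all five lines pass, which mirrors the default of including both ATOM and HETATM records.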
http://www.wwpdb.org/documentation/format33/v3.3.html
read_pdb [options] -i <PDB file>
[-? | --help] # Print full usage description.
[-i <files!> | --data_in=<files!>] # Comma separated list of existing PDB files.
[-I <file> | --stream_in=<file!>] # Read input stream from file - Default=STDIN
[-o <file> | --data_out=<file>] # Write result to file.
[-O <file> | --stream_out=<file>] # Write output stream to file - Default=STDOUT
[-m <string> | --method=<string>] # Method to convert, available: seqres | atom - Default=seqres
[-M | --missing] # When using atom method, try to resolve missing residues
[-c <list> | --record=<list>] # Match a subset of record names only.
[-a <list> | --atom_name=<list>] # Match a subset of atom names only.
[-r <list> | --residue_name=<list>] # Match a subset of residue names only.
Read PDB file (will use "seqres" method to convert, output to STDOUT):
read_pdb -i test.pdb
Read PDB files using the "atom" method, resolve missing residues if possible, and write the output to the file test.fasta:
read_pdb -i test1.pdb,test2.pdb -m atom -M -o test.fasta
Read PDB files and filter atom coordinates records to have only CA atom name and residue name ILE or SER:
read_pdb -i test.pdb -a CA -r ILE,SER
Lukas Astalos - Copyright (C) - All rights reserved.
December 2014
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
read_pdb is part of the Biopieces framework.