calc_bit_scores
Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions
calc_bit_scores calculates a bit score for each column of the aligned sequences in the stream, based on the residues found in that column. The bit scores are calculated using Shannon's general formula for uncertainty, as documented at:
http://www.ccrnp.ncifcrf.gov/~toms/paper/hawaii/latex/node5.html
The maximum bit score is 2 and 4 for nucleotide and protein sequences, respectively.
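As a sketch of how the score relates to Shannon's formula: the uncertainty of a column is computed from the residue frequencies in that column and subtracted from the maximum bit score. The symbols below (c for a column, p_i(c) for the frequency of residue i in that column, S_max for the maximum bit score) are notation introduced here for illustration and do not come from the tool itself. Whether calc_bit_scores applies any further correction is not stated on this page.

```latex
H(c) = -\sum_{i} p_i(c)\,\log_2 p_i(c),
\qquad
\mathrm{score}(c) = S_{\max} - H(c),
\qquad
S_{\max} =
\begin{cases}
2 & \text{nucleotide sequences} \\
4 & \text{protein sequences}
\end{cases}
```

For a fully conserved nucleotide column, a single residue has p = 1, so H(c) = 0 and the score equals the maximum of 2.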
... | calc_bit_scores [options]
[-? | --help] # Print full usage description.
[-I <file!> | --stream_in=<file!>] # Read input from stream file - Default=STDIN
[-O <file> | --stream_out=<file>] # Write output to stream file - Default=STDOUT
[-v | --verbose] # Verbose output.
Consider the following alignment in the file aln.fna in FASTA format:
>test5
---TAACAGGCACT
>test2
-----GAATCGACT
>test1
--CTAGCTTCGACT
>test3
ACGAAACTAGCATC
>test4
----AGCATCGACT
To calculate the bit scores from the above alignment, read it in with read_fasta and pipe the stream through calc_bit_scores:
read_fasta -i aln.fna | calc_bit_scores | write_tab -x
1.54 1.54 1.07 1.01 1.74 1.03 1.28 1.03 0.63 1.03 1.03 2.00 1.28 1.28
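The output above can be reproduced outside Biopieces with a short Python sketch. This is not the tool's own implementation. The function name `bit_scores`, its parameters, and the gap handling are assumptions: each residue's frequency is taken over all sequences in the alignment (gaps included in the denominator), and gap characters contribute no term to the uncertainty sum. That gap handling is inferred from the example output, not from the calc_bit_scores source.

```python
import math


def bit_scores(aligned, max_bits=2.0, gap_chars="-.~"):
    """Return one bit score per alignment column.

    Hypothetical helper: frequencies are counts divided by the number of
    sequences, and gap characters are skipped when summing the uncertainty
    (an assumption inferred from the documented example output).
    """
    n_seqs = len(aligned)
    scores = []
    for column in zip(*aligned):
        # Count residues in this column, ignoring gap characters.
        counts = {}
        for residue in column:
            if residue in gap_chars:
                continue
            residue = residue.upper()
            counts[residue] = counts.get(residue, 0) + 1
        # Shannon uncertainty: H = -sum(p * log2(p)) over observed residues.
        uncertainty = -sum(
            (c / n_seqs) * math.log2(c / n_seqs) for c in counts.values()
        )
        scores.append(max_bits - uncertainty)
    return scores


# The sequences from aln.fna, in file order.
alignment = [
    "---TAACAGGCACT",
    "-----GAATCGACT",
    "--CTAGCTTCGACT",
    "ACGAAACTAGCATC",
    "----AGCATCGACT",
]

print(" ".join(f"{score:.2f}" for score in bit_scores(alignment)))
# prints: 1.54 1.54 1.07 1.01 1.74 1.03 1.28 1.03 0.63 1.03 1.03 2.00 1.28 1.28
```

For protein alignments, `max_bits=4.0` would correspond to the maximum stated above. Note that under these assumptions a column consisting only of gaps would receive the maximum score, which may not match the tool's behaviour.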
Martin Asser Hansen - Copyright (C) - All rights reserved.
August 2007
GNU General Public License version 2
http://www.gnu.org/copyleft/gpl.html
calc_bit_scores is part of the Biopieces framework.