[WIP] Dark proteome #40

Open: wants to merge 10 commits into base: master
16 changes: 16 additions & 0 deletions DARK/README.md
@@ -0,0 +1,16 @@
## Contents of this folder

### [coronavirus.fasta.CAST_thr20_4hackathon.tab](./coronavirus.fasta.CAST_thr20_4hackathon.tab)
Compositionally biased regions detected by the CAST algorithm (Promponas et al., 2000).
Tab-separated output of CAST v2.2 (Ioannides et al., in preparation), reformatted to match the Swiss-Model annotation format.
Regions are colored by amino acid type, following the Rasmol color scheme described at
http://life.nthu.edu.tw/~fmhsu/rasframe/SHAPELY.HTM.

### [casttab2swmodel.pl](./casttab2swmodel.pl)
Perl script that reformats CAST output for input to the Swiss-Model portal. Tested with Perl v5.16.2 on macOS; it should also work on Linux and Windows. No external dependencies.

On the command line, run:
$ perl casttab2swmodel.pl path/to/cast_tab_file > path/to/hackathon.tab
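
As a rough illustration of what the script does to each row, the sketch below converts a single CAST line into one annotation line. It is not part of `casttab2swmodel.pl`: the helper name `cast_row_to_annotation` is hypothetical, the color table is cut down to one entry, and the sample row is the one quoted in the script's own comments, with tabs assumed between the five fields.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only; the real script does this inline.
# Assumed input columns: FASTA header, residue type, start, end, and a fifth
# value that the script reports in parentheses.
sub cast_row_to_annotation {
    my ($row, $colors) = @_;
    my ($header, $aa, $from, $to, $value) = split /\t/, $row;
    # The UniProt accession is the second '|'-delimited field of the header.
    my $uniprot_id = (split /\|/, $header)[1];
    return join("\t", $uniprot_id, $from, $to, $colors->{$aa},
                "$aa-rich region ($value)");
}

# One-entry color table; the script defines the full Rasmol mapping.
my %colors = (C => '#E6E600');

my $row = join("\t",
    '>sp|P0DTD3|Y14_WCPV Uncharacterized protein 14 OS=Wuhan seafood market pneumonia virus OX=2697049 GN=ORF14 PE=3 SV=1',
    'C', 68, 70, 27);

print cast_row_to_annotation($row, \%colors), "\n";
```

Each output line therefore carries five tab-separated fields: accession, start, end, color, and a short description.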

### [coronavirus.fasta.CASTV2.2.thr20](./coronavirus.fasta.CASTV2.2.thr20)
Raw tab output from CAST v2.2, used as input to the Perl script above.
65 changes: 65 additions & 0 deletions DARK/casttab2swmodel.pl
@@ -0,0 +1,65 @@
#!/usr/bin/perl
use strict;
use warnings;

# Description
# -----------
# Code to reformat CAST output for the covid-19-Annotations-on-Structures
# project https://github.com/gtauriello/covid-19-Annotations-on-Structures

#
# Author: Vasilis J Promponas
# Contact: [email protected]; [email protected]
#

my %colorscheme = getColorScheme();

my @annot = ();
die "Usage: $0 path/to/cast_tab_file\n" unless @ARGV;
my $infile = $ARGV[0];
# Three-argument open with a lexical filehandle; report the OS error on failure
open(my $cast_fh, '<', $infile) or die "Could not open CAST tab ($infile): $!";
my $entry = <$cast_fh>; # Read (and discard) the header line

while ($entry = <$cast_fh>)
{
    # Input format (tab separated):
    # >sp|P0DTD3|Y14_WCPV Uncharacterized protein 14 OS=Wuhan seafood market pneumonia virus OX=2697049 GN=ORF14 PE=3 SV=1 C 68 70 27
    # Fields: FASTA header, residue type, from, to, and a fifth value (reported in parentheses)
    chomp($entry);
    next if $entry eq '';
    my @tmp = split(/\t/, $entry);
    my $aatype = $tmp[1] . '-rich region (' . $tmp[4] . ')';
    my $from = $tmp[2];
    my $to = $tmp[3];
    my @tmp2 = split(/\|/, $tmp[0]);
    my $uniprot_id = $tmp2[1]; # Accession is the second '|'-delimited field
    my $color = $colorscheme{$tmp[1]};
    my $out = join("\t", ($uniprot_id, $from, $to, $color, $aatype)) . "\n";
    push @annot, $out;
}
close($cast_fh);


foreach my $annotation (@annot)
{
    print $annotation;
}


sub getColorScheme
{
    # Return a hash mapping the Rasmol color scheme
    # See http://life.nthu.edu.tw/~fmhsu/rasframe/SHAPELY.HTM
    my %colors=();
    $colors{'D'} = $colors{'E'} = '#E60A0A';
    $colors{'C'} = $colors{'M'} = '#E6E600';
    $colors{'K'} = $colors{'R'} = '#145AFF';
    $colors{'S'} = $colors{'T'} = '#FA9600';
    $colors{'F'} = $colors{'Y'} = '#3232AA';
    $colors{'N'} = $colors{'Q'} = '#00DCDC';
    $colors{'G'} = '#EBEBEB';
    $colors{'L'} = $colors{'V'} = $colors{'I'} = '#0F820F';
    $colors{'A'} = '#C8C8C8';
    $colors{'W'} = '#B45AB4';
    $colors{'H'} = '#8282D2';
    $colors{'P'} = '#DC9682';

    return(%colors);
}
261 changes: 261 additions & 0 deletions DARK/coronavirus.fasta.CASTV2.2.thr20
@@ -0,0 +1,261 @@
>sp|P0DTD1|R1AB_WCPV Replicase polyprotein 1ab OS=Wuhan seafood market pneumonia virus OX=2697049 GN=rep PE=1 SV=1
MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGV
LPQLEQPYVFIKRSDARTAPHGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRK
VLLRKNXNKXAXXHSYXADLKSFDLGDELGTDPYEDFQENWNTKHSSGVTRELMRELNGG
AYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLDFIDTKRGVYCCREHEHEIAW
YTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRI
RSVYPVASPNECNQMCLSTLMKCDHCGETSWQTGDFVKATCEFCGTENLTKEGATTCGYL
PQNAVVKIYCPACHNSEVGPEHSLAEYHNESGLKTILRKGGRTIAFGGCVFSYVGCHNKC
AYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVNINIVGDFKLNEEIAIILASF
SASTSAFVETVKGLDYKAFKQIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEA
ARVVRSIFSRTLETAQNSVRVLQKAAITILDGISQYSLRLIDAMMFTSDLATNNLVVMAY
ITGGVVQLTSQWLTNIFGTVYEKLKPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIV
GGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKC
VKSREETGLLMPLKAPKEIIFLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVG
TPVCINGLMLLEIKDTEKYCALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVN
ITFELDERIDKVLNEKCSAYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLDEW
SMATYYLFDESGEFKLASHMYCSFYPPDXDXXXGDCXXXXFXPSTQYXYGTXDDYQGKPL
XFGATSAALQPXXXQXXDWLXXXSQQTVGQQXGSXXNQTTTIQTIVXVQPQLXMXLTPVV
QTIEVNSFSGYLKLTDNVYIKNADIVEEAKKVKPTVVVNAANVYLKHXXXVAXALNKATN
NAMQVESDDYIATNGPLKVGGSCVLSGHNLAKHCLHVVGPNVNKGEDIQLLKSAYENFNQ
HEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLXMXSXXQVX
QXIAXIPXXXVXPFITXSXPSVXQRXQDDXXIXACVXXVTTTLXXTKFLTENLLLYIDIN
GNLHPDSATLVSDIDITFLKKDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKV
PTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSIISNEKQEILGTVSWNLREMLA
HAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDXGARFXFXXSKXXVASLINXLND
LNEXLVXMPLGYVTHGLNLEEAARYMRSLKVPATVXVXXPDAVTAYNGYLTXXXKTPEEH
FIETIXLAGXYKDWXYXGQXTQLGIEFLKRGDKSVYYTSNPTTFHLDGEVITFDNLKTLL
SLREVRTIKVFTTVDNINLHTQVVDMSMTYGQQFGPTYLDGADVTKIKPHNSHEGKTFYV
LPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKYPQVNGLTSIKWADNNCYLATAL
LTLQQIELKFNPPALQDAYYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQHAN
LDSCKRVLNVVCKTCGQQQTTLKGVEAVMYMGTLSYEQFKKGVQIPCTCGKQATKYLVQQ
ESPFVMMSAPPAQYELKHGTFTCASEYTGNYQCGHYKHITSKETLYCIDGALLTKSSEYK
GPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDNYYKKDNSYFTEQPIDLVPNQPY
PNASFDNFKFVCDNIKFADDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPSFK
KGAKLLHKPIVWHVNNATNKATYKPNTWCIRCLWSTKPVETSNSFDVLKSEDAQGMDNLA
CEDLKPVSEEVVENPTIQKDVLECNVKTTEVVGDIILKPANNSLKITEEVGHTDLMAAYV
DNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTIANYAKPFLNKVVSXXXNIVXRC
LNRVCXNYMPYFFXLLLQLCXFXRSXNSRIKASMPXXIAKNXVKSVGKFCLEASFNYLKS
PNFSKXINIIIWFXXXSVCXGSXIYSTAAXGVXMSNLGMPSYCTGYREGYLNSTNVTIAT
YCTGSIPCSVCLSGLDSLDTYPSLETIQITISSFKWDLTAFGLVAEWXLAYILXTRXXYV
LGLAAIMQLXXSYXAVHXISNSWLXWLIINLVQXAPISAXVRXXIFFASFXXVWKSXVHV
VDGCNSSTCMMCYKRNRATRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTFCA
GSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVTVKNGSIHLYFDKAGQKTYERHSLSH
FVNLDNLRANNTKGSLPINVIVFDGKXKCEEXXAKXAXVYYSQLMCQPILLLDQALVSDV
GDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEAELAKNVSLDNVLSTFISAARQG
FVDSDVETKDVVECLKLSHQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARHIN
AQVAKSHNIALIWNVKDFMSLSEQLRKQIRSAAKKNNLPFKLTCATTRQXXNXXTTKIAL
KGGKIXNNWLKQLIKXTLXFLFXAAIFYLITPXHXMSKHTDFSSEIIGYKAIDGGVTRDI
ASTDTCFANKHADFDTWFSQRGGSYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGD
FLHFLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFKDASGKPVPYCYDTNVLE
GSVAYESLRPDTRYVLMDGSIIQFPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVST
SGRWVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGALDISASIVAGGIVAIVVTCL
AYYFMRFRRAFGEYSHVVAFNTLLFLMSFTVLCLTPVXSFLPGVXSVIXLXLTFXLTNDV
SFLAHIQWMVMFTPLVPFWITIAYIICISTKHFXWFFSNXLKRRVVFNGVSFSTFEEAAL
CTFLLNKEMYLKLRSDVLLPLTQXNRXLALXNKXKXFSGAMDTTSYREAACCHLAKALND
FSNSGSDVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVY
CPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPK
TPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCV
SFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQXAQAAGXDXXIXVNVLAWLYAAVINGDR
WFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMN
GRTILGSALLEDEFTPFDVVRQCSGVTFQSAVKRTIKGTHHWXXXTIXTSXXVXVQSTQW
SLXXXLYENAXLPXAXGIIAXSAFAXXFVKHKHAFLCLFLLPSLATVAYFNXVYXPASWV
XRIXTWLDXVDTSLSGFKLKDCVXYASAVVLLILXTARTVYDDGARRVWTLMNVLTLVYK
VYYGNALDQAISMWALIISVTSNYSGVVTTVMFLARGIVFMCVEYCPIFFITGNTLQCIM
LVYCXLGYXCTCYXGLXCLLNRYXRLTLGVYDYLVSTQEFRYMNSQGLLPPKNSIDAFKL
NIKLLGVGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILL
AKDTTEAFEKMVSLLSVLLSMQGAVDINKLCEEMLDNRXTLQXIXSEFSSLPSYXXFXTX
QEXYEQXVXNGDSEVVLXXLXXSLNVAXSEFDRDAAMQRXLEXMADQAMTQMYXQARSED
XRAXVTSAMQTMLFTMLRKLDXDALXXIIXXARDGCVPLNIIPLTTAAKLMVVIPDYNTY
KNTCDGTTFTYASALWEIQQVVDADSKIVQLSEISMDNSPNLAWPLIVTALRANSAVKLQ
NNELSPVALRQMSCAAGXXQXACXDDNALAYYNXXKGGRFVLALLSDLQDLKWARFPKSD
GTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVLGSLXXTVRLQXGNXTEV
PXNSTVLSFCXFXVDXXKXYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQES
FGGASXXLYXRXHIDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLKNTVCTVCGMWKGYG
CSCDQLREPMLQSADAQSFLNRVCGVSAARLTPCGTGTSTDVVYRAFDIYNDKVAGFAKF
LKTNCCRFQEKXEXXNLIDSYFVVKRHTFSNYQHEETIYNLLKDCPAVAKHDFFKFRIDG
DMVPHISRQRLTKYTMADLVYALRHFDEGNCDTLKEILVTYNCCDDDYFNKKDWYDFVEN
PDILRVYANLGERVRQALLKTVQFCDAMRNAGIVGVLTLDNQDLNGNWYDFGDFIQTTPG
SGVPVVDSYYSLLMPILTLTRALTAESHVDTDLTKPYIKWDLLKYDFTEERLKLFDRXFK
XWDQTXHPNCVNCLDDRCILHCANFNVLFSTVFPPTSFGPLVRKIFVDGVPFVVSTGYHF
RELGVVHNQDVNLHSSRLSFKELLVYAADPAMHAASGNLLLDKRTTCFSVAALTNNVAFQ
TVKPGNFNKDFYDFAVSKGFFKEGSSVELKHFFFAQDGNAAISDXDXXRXNLPTMCDIRQ
LLFVVEVVDKYFDCYDGGCINANQVIVNNLDKSAGFPFNKWGKARLYYDSMSYEDQDALF
AYTKRNVIPTITQMNLKYAISAKNRARTVAGVSICSTMTNRQFHQKLLKSIAATRGATVV
IGTSKFYGGWHNMLKTVYSDVENPHLMGWDYPKCDRAMPNMLRIMASLVLARKHTTCCSL
SHRFYRLANECAQVLSEMVMCGGSLYVKPGGTSSGDATTAYANSVFNICQAVTANVNALL
STDGNKIADKYVRNLQHRLYECLYRNRDVDTDFVNEFYAYLRKHFSMMILSDDAVVCFNS
TYASQGLVASIKNFKSVLYYQNNVFMSEAKCWTETDLTKGPHEFCSQHTMLVKQGDDYVY
LPYPDPSRILGAGCFVDDIVKTDGTLMIERFVSLAIDAYPLTKHPNQEYADVFHLYLQYI
RKLHDELTGHMLDMYSVMLTNDNTSRYWEPEFYEAMYTPHTVLQAVGAXVLXNSQTSLRX
GAXIRRPFLXXKXXYDHVISTSHKLVLSVNPYVCNAPGCDVTDVTQLYLGGMSYYCKSHK
PPISFPLCANGQVFGLYKNTCVGSDNVTDFNAIATCDWTNAGDYILANTCTERLKLFAAE
TLKATEETFKLSYGIATVREVLSDRELHLSWEVGKPRPPLNRNYVFTGYRVTKNSKVQIG
EYTFEKGDYGDAVVYRGTTTYKLNVGDYFVLTSHTVMPLSAPTLVPQEHYVRITGLYPTL
NISDEFSSNVANYQKVGMQKYSTLQGPPGTGKSHFAIGLALYYPSARIVYTACSHAAVDA
LCEKALKYLPIDKCSRIIPARARVECFDKFKVNSTLEQYVFCTVNALPETTADIVVFDEI
SMATNYDLSVVNARLRAKHYVYIGDPAQLPAPRTLLTKGTLEPEYFNSVCRLMKTIGPDM
FLGTCRRCPAEIVDTVSALVYDNKLKAHKDKSAQCFKMFYKGVITHDVSSAINRPQIGVV
REFLTRNPAWRKAVFISPYNSQNAVASKILGLPTQTVDSSQGSEYDYVIFTQTTETAHSC
NVNRFNVAITRAKVGILCIMSDRDLYDKLQFTSLEIPRRNVATLQAENVTGLFKDCSKVI
TGLHPTQAPTHLSVDTKFKTEGLCVDIPGIPKDMTYRRLISMMGFKMNYQVNGYPNMFIT
REEAIRHVRAWIGFDVEGCHATREAVGTNLPLQLGFSTGVNLVAVPTGYVDTPNNTDFSR
VSAKXXXGDQFKHLIPLMYKGLPWNVVRIKIVQMLSDTLKNLSDRVVFVLWAHGFELTSM
KYFVKIGPERTXXLXDRRATCFSTASDTYACWHHSIGFDYVYNPFMIDVQQWGFTGNLQS
NHDLYCQVHGNAHVASCDAIMTRCLAVHECFVKRVDWTIEYPIIGDELKINAACRKVQHM
VVKAALLADKFPVLHDIGNPKAIKCVPQADVEWKFYDAQPCSDKAYKIEELFYSYATHSD
KFTDGVCLFWNCNVDRYPANSIVCRFDTRVLSNLNLPGCDGGSLYVNKHAFHTPAFDKSA
FVNLKQLPFFXXSDSPCESHGKQVVSDIDYVPLKSATCITRCNLGGAVCRHHANEYRLYL
DAYNMMISAGFSLWVYKQFDTYNLWNTFTRLQSLENVAFNVVNKGHFDGQQGEVPVSIIN
NTVYTKVDGVDVELFENKTTLPVNVAFELWAKRNIKPVPEVKILNNLGVDIAANTVIWDY
KRDAPAHISTIGVCSMTDIAKKPTETICAPLTVFFDGRVDGQVDLFRNARNGVLITEGSV
KGLQPSVGPKQASLNGVTLIGEAVKTQFNYYKKVDGVVQQLPETYFTQSRNLQEFKPRSQ
MEIDFLELAMDEFIERYKLEGYAFEHIVYGDFSHSQLGGLHLLIGLAKRFKESPFELEDF
IPMDSTVKNYFITDAQTGSSKCVCSVIDLLLDDFVEIIKSQDLSVVSKVVKVTIDYTEIS
FMLWCKDGHVETFYPKLQSSQAWQPGVAMPNLYKMQRMLLEKCDLQNYGDSATLPKGIMM
NVAKYTQLCQYLNTLTLAVPYNMRVIHFGAGSDKGVAPGTAVLRQWLPTGTLLVDSDLND
FVSDADSTLIGDCATVHTANKWDLIISDMYDPKTKNVTKENDSKEGFFTYICGFIQQKLA
LGGSVAIKITEHSWNADLYKLMGHFAXXTAFVTNVNASSSEAFLIGCNYLGKPREQIDGY
VMHANYIFWRNTNPIQLSSYSLFDMSKFPLKLRGTAVMSLKEGQINDMILSLLSKGRLII
RENNRVVISSDVLVNN
>sp|P0DTC1|R1A_WCPV Replicase polyprotein 1a OS=Wuhan seafood market pneumonia virus OX=2697049 PE=3 SV=1
MESLVPGFNEKTHVQLSLPVLQVRDVLVRGFGDSVEEVLSEARQHLKDGTCGLVEVEKGV
LPQLEQPYVFIKRSDARTAPHGHVMVELVAELEGIQYGRSGETLGVLVPHVGEIPVAYRK
VLLRKNXNKXAXXHSYXADLKSFDLGDELGTDPYEDFQENWNTKHSSGVTRELMRELNGG
AYTRYVDNNFCGPDGYPLECIKDLLARAGKASCTLSEQLDFIDTKRGVYCCREHEHEIAW
YTERSEKSYELQTPFEIKLAKKFDTFNGECPNFVFPLNSIIKTIQPRVEKKKLDGFMGRI
RSVYPVASPNECNQMCLSTLMKCDHCGETSWQTGDFVKATCEFCGTENLTKEGATTCGYL
PQNAVVKIYCPACHNSEVGPEHSLAEYHNESGLKTILRKGGRTIAFGGCVFSYVGCHNKC
AYWVPRASANIGCNHTGVVGEGSEGLNDNLLEILQKEKVNINIVGDFKLNEEIAIILASF
SASTSAFVETVKGLDYKAFKQIVESCGNFKVTKGKAKKGAWNIGEQKSILSPLYAFASEA
ARVVRSIFSRTLETAQNSVRVLQKAAITILDGISQYSLRLIDAMMFTSDLATNNLVVMAY
ITGGVVQLTSQWLTNIFGTVYEKLKPVLDWLEEKFKEGVEFLRDGWEIVKFISTCACEIV
GGQIVTCAKEIKESVQTFFKLVNKFLALCADSIIIGGAKLKALNLGETFVTHSKGLYRKC
VKSREETGLLMPLKAPKEIIFLEGETLPTEVLTEEVVLKTGDLQPLEQPTSEAVEAPLVG
TPVCINGLMLLEIKDTEKYCALAPNMMVTNNTFTLKGGAPTKVTFGDDTVIEVQGYKSVN
ITFELDERIDKVLNEKCSAYTVELGTEVNEFACVVADAVIKTLQPVSELLTPLGIDLDEW
SMATYYLFDESGEFKLASHMYCSFYPPDXDXXXGDCXXXXFXPSTQYXYGTXDDYQGKPL
XFGATSAALQPXXXQXXDWLXXXSQQTVGQQXGSXXNQTTTIQTIVXVQPQLXMXLTPVV
QTIEVNSFSGYLKLTDNVYIKNADIVEEAKKVKPTVVVNAANVYLKHXXXVAXALNKATN
NAMQVESDDYIATNGPLKVGGSCVLSGHNLAKHCLHVVGPNVNKGEDIQLLKSAYENFNQ
HEVLLAPLLSAGIFGADPIHSLRVCVDTVRTNVYLAVFDKNLYDKLVSSFLXMXSXXQVX
QXIAXIPXXXVXPFITXSXPSVXQRXQDDXXIXACVXXVTTTLXXTKFLTENLLLYIDIN
GNLHPDSATLVSDIDITFLKKDAPYIVGDVVQEGVLTAVVIPTKKAGGTTEMLAKALRKV
PTDNYITTYPGQGLNGYTVEEAKTVLKKCKSAFYILPSIISNEKQEILGTVSWNLREMLA
HAEETRKLMPVCVETKAIVSTIQRKYKGIKIQEGVVDXGARFXFXXSKXXVASLINXLND
LNEXLVXMPLGYVTHGLNLEEAARYMRSLKVPATVXVXXPDAVTAYNGYLTXXXKTPEEH
FIETIXLAGXYKDWXYXGQXTQLGIEFLKRGDKSVYYTSNPTTFHLDGEVITFDNLKTLL
SLREVRTIKVFTTVDNINLHTQVVDMSMTYGQQFGPTYLDGADVTKIKPHNSHEGKTFYV
LPNDDTLRVEAFEYYHTTDPSFLGRYMSALNHTKKWKYPQVNGLTSIKWADNNCYLATAL
LTLQQIELKFNPPALQDAYYRARAGEAANFCALILAYCNKTVGELGDVRETMSYLFQHAN
LDSCKRVLNVVCKTCGQQQTTLKGVEAVMYMGTLSYEQFKKGVQIPCTCGKQATKYLVQQ
ESPFVMMSAPPAQYELKHGTFTCASEYTGNYQCGHYKHITSKETLYCIDGALLTKSSEYK
GPITDVFYKENSYTTTIKPVTYKLDGVVCTEIDPKLDNYYKKDNSYFTEQPIDLVPNQPY
PNASFDNFKFVCDNIKFADDLNQLTGYKKPASRELKVTFFPDLNGDVVAIDYKHYTPSFK
KGAKLLHKPIVWHVNNATNKATYKPNTWCIRCLWSTKPVETSNSFDVLKSEDAQGMDNLA
CEDLKPVSEEVVENPTIQKDVLECNVKTTEVVGDIILKPANNSLKITEEVGHTDLMAAYV
DNSSLTIKKPNELSRVLGLKTLATHGLAAVNSVPWDTIANYAKPFLNKVVSXXXNIVXRC
LNRVCXNYMPYFFXLLLQLCXFXRSXNSRIKASMPXXIAKNXVKSVGKFCLEASFNYLKS
PNFSKXINIIIWFXXXSVCXGSXIYSTAAXGVXMSNLGMPSYCTGYREGYLNSTNVTIAT
YCTGSIPCSVCLSGLDSLDTYPSLETIQITISSFKWDLTAFGLVAEWXLAYILXTRXXYV
LGLAAIMQLXXSYXAVHXISNSWLXWLIINLVQXAPISAXVRXXIFFASFXXVWKSXVHV
VDGCNSSTCMMCYKRNRATRVECTTIVNGVRRSFYVYANGGKGFCKLHNWNCVNCDTFCA
GSTFISDEVARDLSLQFKRPINPTDQSSYIVDSVTVKNGSIHLYFDKAGQKTYERHSLSH
FVNLDNLRANNTKGSLPINVIVFDGKXKCEEXXAKXAXVYYSQLMCQPILLLDQALVSDV
GDSAEVAVKMFDAYVNTFSSTFNVPMEKLKTLVATAEAELAKNVSLDNVLSTFISAARQG
FVDSDVETKDVVECLKLSHQSDIEVTGDSCNNYMLTYNKVENMTPRDLGACIDCSARHIN
AQVAKSHNIALIWNVKDFMSLSEQLRKQIRSAAKKNNLPFKLTCATTRQXXNXXTTKIAL
KGGKIXNNWLKQLIKXTLXFLFXAAIFYLITPXHXMSKHTDFSSEIIGYKAIDGGVTRDI
ASTDTCFANKHADFDTWFSQRGGSYTNDKACPLIAAVITREVGFVVPGLPGTILRTTNGD
FLHFLPRVFSAVGNICYTPSKLIEYTDFATSACVLAAECTIFKDASGKPVPYCYDTNVLE
GSVAYESLRPDTRYVLMDGSIIQFPNTYLEGSVRVVTTFDSEYCRHGTCERSEAGVCVST
SGRWVLNNDYYRSLPGVFCGVDAVNLLTNMFTPLIQPIGALDISASIVAGGIVAIVVTCL
AYYFMRFRRAFGEYSHVVAFNTLLFLMSFTVLCLTPVXSFLPGVXSVIXLXLTFXLTNDV
SFLAHIQWMVMFTPLVPFWITIAYIICISTKHFXWFFSNXLKRRVVFNGVSFSTFEEAAL
CTFLLNKEMYLKLRSDVLLPLTQXNRXLALXNKXKXFSGAMDTTSYREAACCHLAKALND
FSNSGSDVLYQPPQTSITSAVLQSGFRKMAFPSGKVEGCMVQVTCGTTTLNGLWLDDVVY
CPRHVICTSEDMLNPNYEDLLIRKSNHNFLVQAGNVQLRVIGHSMQNCVLKLKVDTANPK
TPKYKFVRIQPGQTFSVLACYNGSPSGVYQCAMRPNFTIKGSFLNGSCGSVGFNIDYDCV
SFCYMHHMELPTGVHAGTDLEGNFYGPFVDRQXAQAAGXDXXIXVNVLAWLYAAVINGDR
WFLNRFTTTLNDFNLVAMKYNYEPLTQDHVDILGPLSAQTGIAVLDMCASLKELLQNGMN
GRTILGSALLEDEFTPFDVVRQCSGVTFQSAVKRTIKGTHHWXXXTIXTSXXVXVQSTQW
SLXXXLYENAXLPXAXGIIAXSAFAXXFVKHKHAFLCLFLLPSLATVAYFNXVYXPASWV
XRIXTWLDXVDTSLSGFKLKDCVXYASAVVLLILXTARTVYDDGARRVWTLMNVLTLVYK
VYYGNALDQAISMWALIISVTSNYSGVVTTVMFLARGIVFMCVEYCPIFFITGNTLQCIM
LVYCXLGYXCTCYXGLXCLLNRYXRLTLGVYDYLVSTQEFRYMNSQGLLPPKNSIDAFKL
NIKLLGVGGKPCIKVATVQSKMSDVKCTSVVLLSVLQQLRVESSSKLWAQCVQLHNDILL
AKDTTEAFEKMVSLLSVLLSMQGAVDINKLCEEMLDNRXTLQXIXSEFSSLPSYXXFXTX
QEXYEQXVXNGDSEVVLXXLXXSLNVAXSEFDRDAAMQRXLEXMADQAMTQMYXQARSED
XRAXVTSAMQTMLFTMLRKLDXDALXXIIXXARDGCVPLNIIPLTTAAKLMVVIPDYNTY
KNTCDGTTFTYASALWEIQQVVDADSKIVQLSEISMDNSPNLAWPLIVTALRANSAVKLQ
NNELSPVALRQMSCAAGXXQXACXDDNALAYYNXXKGGRFVLALLSDLQDLKWARFPKSD
GTGTIYTELEPPCRFVTDTPKGPKVKYLYFIKGLNNLNRGMVLGSLXXTVRLQXGNXTEV
PXNSTVLSFCXFXVDXXKXYKDYLASGGQPITNCVKMLCTHTGTGQAITVTPEANMDQES
FGGASXXLYXRXHIDHPNPKGFCDLKGKYVQIPTTCANDPVGFTLKNTVCTVCGMWKGYG
CSCDQLREPMLQSADAQSFLNGFAV
>sp|P0DTC2|SPIKE_WCPV Spike glycoprotein OS=Wuhan seafood market pneumonia virus OX=2697049 GN=S PE=3 SV=1
MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFS
NVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIV
NNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLE
GKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQT
LLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETK
CTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISN
CVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGKIAD
YNYKLPDDFTGCVIAWXSXXLDSKVGGXYXYLYRLFRKSNLKPFERDISTEIYQAGSTPC
NGVEGFNCYFPLQSYGFQPTNGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVN
FNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITP
GTNTSNQVAVLYQDVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSY
ECDIPIGAGICAXYQTQTNXPRRARXVAXQXIIAYTMXLGAENXVAYXNNXIAIPXNFXI
XVXXEILPVXMXKXXVDCXMYICGDXTECXNLLLQYGSFCTQLNRALTGIAVEQDKNTQE
VFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDC
LGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAM
QMAYRFNGIGVTQNVLYENQKLIANQFNXAIGKIQDXLXXTAXALGKLQDVVNQNAQALN
TLVKQLXXNFGAIXXVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRA
SANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPA
ICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDP
LQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDL
QELGKYEQYIKXPXYIXLGFIAGLXAXVMVTXMLXXMTSXXSXLKGXXSXGSXXKFXEXX
SEPVLKGVKLHYT
>sp|P0DTC3|AP3A_WCPV Protein 3a OS=Wuhan seafood market pneumonia virus OX=2697049 GN=3a PE=3 SV=1
MDLFMRIFTIGTVTLKQGEIKDATPSDFVRATATIPIQASLPFGWLIVGVALLAVFQSAS
KIITLKKRWQLALSKGVHFVCNXXXXFVTVYSHXXXVAAGXEAPFXXXXAXVXFXQSINF
VRIIMRXWXCWKCRSKNPLLYDANYFLCWHTNCYDYCIPYNSVTSSIVITSGDGTTSPIS
EHDYQIGGYTEKWESGVKDCVVLHSXFTSDXXQLXSTQLSTDTGVEHVTFFIYNKIVDEP
EEHVQIHTIDGSSGVVNPVMEPIYDEPXXXXSVPL
>sp|P0DTC4|VEMP_WCPV Envelope small membrane protein OS=Wuhan seafood market pneumonia virus OX=2697049 GN=E PE=3 SV=1
MYSFVSEETGTXIVNSVXXFXAFVVFXXVTXAIXTAXRXXAYXXNIVNVSLVKPSFYVYS
RVKNLNSSRVPDLLV
>sp|P0DTC5|VME1_WCPV Membrane protein OS=Wuhan seafood market pneumonia virus OX=2697049 PE=3 SV=1
MADSNGTITVEELKKLLEQWNXVIGFXFXTWICXXQFAYANRNRFXYIIKXIFXWXXWPV
TXACFVXAAVYRINWITGGIAIAMACLVGLMWLSYFIASFRLFARTRSMWSFNPETNILL
NVPLHGTILTRPLLESELVIGAVILRGHLRIAGHHLGRCDIKDLPKEITVATSRTLSYYK
LGASQRVAGDSGFAAYSRYRIGXYKLXTDHSSSSDXIALLVQ
>sp|P0DTC6|NS6_WCPV Non-structural protein 6 OS=Wuhan seafood market pneumonia virus OX=2697049 GN=6 PE=3 SV=1
MFHLVDFQVTXAEXLLXXMRTFKVSXWNLDYXXNLXXKNLSKSLTENKYSQLDEEQPMEI
D
>sp|P0DTC7|NS7A_WCPV Protein 7a OS=Wuhan seafood market pneumonia virus OX=2697049 GN=7a PE=3 SV=1
MKIIXFXAXITXATCELYHYQECVRGTTVLLKEPCSSGTYEGNSPFHPLADNKFALTCFS
TQFAFACPDGVKHVYQLRARSVSPKLFIRQEEVQELYSPXFLXVAAXVFXTLCFTLKRKT
E
>sp|P0DTD8|NS7B_WCPV Protein non-structural 7b OS=Wuhan seafood market pneumonia virus OX=2697049 PE=3 SV=1
MIEXSXIDXYXCXXAXXXXXVXIMXIIXWXSXEXQDHNETCHA
>sp|P0DTC8|NS8_WCPV Non-structural protein 8 OS=Wuhan seafood market pneumonia virus OX=2697049 PE=3 SV=1
MKFLVFLGIITTVAAFHQECSLQSCTQHQPYVVDDPCPIHFYSKWYIRVGARKSAPLIEL
CVDEAGSKSPIQYIDIGNYTVSCLPFTINCQEPKLGSLVVRCSFYEDFLEYHDVRVVLDF
I
>sp|P0DTC9|NCAP_WCPV Nucleoprotein OS=Wuhan seafood market pneumonia virus OX=2697049 GN=N PE=3 SV=1
MSDXGPQXQRXAPRITFGGPSDSTGSXQXGERSGARSKQRRPQGLPXXTASWFTALTQHG
KEDLKFPRGQGVPINTNSSPDDQIGYYXXATXXIXGGDGKMKDLSPRWXFXXLGTGPEAG
LPYGANKDGIIWVATEGALNTPKDHIGTRNPANNAAIVLQLPQGTTLPKGFYAEGXRGGX
QAXXRXXXRXRNXXRNXTPGXXRGTXPARMAGNGGDAALALLLLDRLNXLEXKMXGKGXX
XXGXTVTXXXAAEAXXXPRXXRTATXAYNVTXAFGRRGPEXTXGNFGDXELIRXGTDYKH
WPXIAXFAPSASAFFGMSRIGMEVTPSGTWLTYTGAIKLDDKDPNFKDQVILLNKHIDAY
KTFPPTEPXXDXXXXADETXALPXRXXXXXTVTLLPAADLDDFSKXLXXSMSSADSTXA
>tr|A0A663DJA2|A0A663DJA2_9BETC ORF10 protein OS=Wuhan seafood market pneumonia virus OX=2697049 GN=ORF10 PE=2 SV=1
MGYINVFAFPFTIYSLLLCRMNSRNYIAQVDVVNFNLT
>sp|P0DTD2|ORF9B_WCPV Protein 9b OS=Wuhan seafood market pneumonia virus OX=2697049 PE=3 SV=1
MDPKISEMHPALRLVDPQIQLAVTRMENAVGRDQNNVGPKVYPIILRLGSPLSLNMARKT
LNSLEDKAFQLTPIAVQMTKLATTEELPDEFVVVTVK
>sp|P0DTD3|Y14_WCPV Uncharacterized protein 14 OS=Wuhan seafood market pneumonia virus OX=2697049 GN=ORF14 PE=3 SV=1
MLQSCYNFLKEQHCQKASTQKGAEAAVKPLLVPHHVVATVQEIQLQAAVGEXXXXEWXAM
AVMXXXXXXXXTD