Can you provide submission of pathogenic sequences other than COVID-19 (e.g influenza virus)? #5
Comments
Yes, you can change all the values via the config file.
Do you have an example sequence and can you tell me where you want to
submit it to?
Do you want one single submission with multiple species or mostly just one?
Do you think you'll switch between organisms or will you always submit just
one organism?
On Fri, Apr 29, 2022 at 1:35 AM Biopig wrote:
Hi, @maximilianh <https://github.com/maximilianh>
It's a useful tool for submitting viral sequences in bulk.
I was wondering whether it could also support submitting sequences from
pathogens other than the COVID-19 virus (e.g. influenza virus).
Best,
Yang
|
Hi Maximilian Thanks for your prompt response.
Best wishes, |
Do you have a test sequence in your format, and can you tell me which virus
it is? Then I can try a test submission and send you a sample config for
it. The options are all there, I’ve just never used them myself.
On Fri 29 Apr 2022 at 20:01, Biopig wrote:
Hi Maximilian
Thanks for your prompt response.
Can you tell me where you want to submit it to?
- It depends on the journal's requirements, usually either NCBI or
GISAID. Personally, I'd like to submit to the GISAID (EpiFlu) database.
Do you want one single submission with multiple species or mostly just one?
- Usually we submit multiple viral sequences (one species/subtype) as a
batch.
Do you think you'll switch between organisms or will you always submit
just one organism?
- I mostly work on influenza virus. Occasionally we need to submit
sequences other than influenza, which can only be deposited at
NCBI.
Best wishes,
Yang
|
Here is the HA gene of H6N2 avian influenza virus which we had submitted to GISAID.
By the way, because the influenza genome is segmented, we usually need to submit the other gene segments at the same time. I have therefore included the NA gene of this strain here.
Thank you very much for your help! Best, |
Great. So you want this submitted to NCBI or GISAID?
I would edit your meta data to put it into csv or tsv format and save the
sequence to fasta, right?
So two rows for the meta file and two sequences for the fasta.
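A minimal sketch of that layout in Python, for illustration only. The column names (`isolate`, `segment`, `date`, `location`, `host`, `source`) are hypothetical and not confirmed multiSub field names, and the `|HA` / `|NA` suffix is an assumption added only to keep the two IDs distinct, since both segments were posted under the same strain name. The sequences are truncated to their first 20 bases.

```python
import csv
import io

# Hypothetical column names, chosen for illustration only.
COLUMNS = ["isolate", "segment", "date", "location", "host", "source"]
STRAIN = "A/Eurasian_Teal/Jiangxi/2018WB0049/2018(H6N2)"

# Metadata shared by both segments, copied from the thread.
SHARED = {
    "date": "2/1/2018",
    "location": "Poyang Lake, Jiangxi, China",
    "host": "Eurasian Teal",
    "source": "wild bird fecal",
}

# One entry per gene segment; sequences are truncated to their first 20 bases.
SEGMENTS = {"HA": "TTGGCAGCAGCCGGGAAGTC", "NA": "TCTGTCTCTCTAACCATTGC"}


def seq_id(strain, segment):
    """Build a per-segment ID (the suffix scheme is an assumption)."""
    return f"{strain}|{segment}"


def make_meta_tsv(segments):
    """Return TSV text with a header line and one row per segment."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COLUMNS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    for segment in segments:
        writer.writerow({"isolate": seq_id(STRAIN, segment), "segment": segment, **SHARED})
    return out.getvalue()


def make_fasta(segments):
    """Return FASTA text with one record per segment."""
    return "".join(f">{seq_id(STRAIN, seg)}\n{seq}\n" for seg, seq in segments.items())


if __name__ == "__main__":
    print(make_meta_tsv(SEGMENTS), end="")
    print(make_fasta(SEGMENTS), end="")
```

Writing the two strings to `meta.tsv` and `seq.fa` would give the two-row, two-sequence pair described above.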
On Sat, Apr 30, 2022 at 1:08 AM Biopig wrote:
Here is the HA gene of H6N2 avian influenza virus which we had submitted
to GISAID.
In case you need the other information that GISAID usually requires for a
submission, here is some more metadata.
Sample location: Poyang Lake, Jiangxi, China
Sample date: 2/1/2018
Sample source: wild bird fecal
Host: Eurasian Teal
>A/Eurasian_Teal/Jiangxi/2018WB0049/2018(H6N2)
TTGGCAGCAGCCGGGAAGTCAGACAAGATCTGCATTGGATATCATGCCAACAACTCAACAACACAAGTGGATACTATCCTTGAGAAAAATGTCACCGTCACGCACTCAGTTGAATTGCTAGAAACCCAGAAGGAGGAGAGATTCTGCAACATCCTGAACAAGGGCCCTCTCGACCTAAAGGGATGCACCATAGAGGGTTGGATACTGGGGAATCCCCAATGCGACCTGTTGCTTGGTGATCAAAGCTGGTCATATATAGTGGAAAGACCTAGTGCTCAAAATGGGATTTGCTACCCAGGAACCTTGAACGACGCAGAAGAACTTAAGGCACTCATTGAATCAGGAGAAAGAGTAGAGAGATTTGAGATGTTTCCCAAAAGCACATGGGCAGGAGTTGACACCAGCAGTGGGGTGACAAAGGCTTGCCCCTATATTAGTGGTTCATCTTTCTATAGAAATCTCTTATGGATAATAAAGACCAAGTCAGCAGCATACCCAGTGATCAAAGGGACTTACAACAACACTGGAAACCAGCCAATCCTTTATTTCTGGGGTGTGCACCATCCTCCAGACACCAATGAACAAAATACTCTGTATGGCTCTGGTGATAGATACGTTAGGATGGGAACTGAAAGCATGAACTTCGCCAAGAGTCCAGAAATTGCAGCAAGACCTGCTGTGAACGGTCAAAGAGGCAGAATTGATTATTACTGGTCTGTTTTAAAACCAGGTGAAACCTTGAATGTGGAATCTAATGGAAATCTAATTGCCCCTTGGTATGCATACAAATTTGTCAGCACAAATAATAAGGGAGCCATCTTCAAGTCAAGTTTACCAATCGAGAACTGTGATGCCACATGCCAGACTATTGCAGGGGTCCTAAGAACCAATAAAACATTTCAGAATGTAAGTCCTCTGTGGATAGGAGAATGCCCCAAATATGTGAAAAGTGAAAGTTTGAGGCTTGCAACTGGACTGAGGAACGTTCCACAGATTGGAACTAGAGGTCTTTTTGGGGCCATAGCAGGATTTATTGAAGGAGGATGGACTGGAATGATAGATGGGTGGTATGGCTATCACCATGAGAATTCCCAGGGGTCAGGATATGCAGCAGACAAAGAGAGCACTCAAAGGGCTATAGACGGAATTACAAATAAAGTCAATTCCATCATTGATAAAATGAACACACAATTTGAAGCTGTTGACCACGAATTCTCAAATATAGAGAGAAGAATTGACAATCTGAACAAAAGGATGGAAGATGGATTCCTAGATGTTTGGACATACAATGCTGAACTGCTGGTTCTTCTTGAAAACGAAAGGACACTAGACCTGCACGATGCAAATGTAAAGAACCTATATGAGAAGGTCAAATCGCAATTAAGGGACAATGCTAATGATCTGGGAAATGGGTGCTTTGAATTCTGGCATAAGTGTGACAATGAGTGTATGGAATCTGTTAAGAATGGTACTTATGATTATCCCAAGTACCAGGACGAGAGCAAATTGAACAGGCAGGAAATAGAATCGGTAAAGCTAGAAAATCTTGGTGTGTATCAAATCCTTGCTATTTATAGTACGGTATCGAGCAGTCTGGTGTTGGTAGGGCTGATCATAGCAATGGGTCTTTGGATGTGTTCAAATGGTTCAA
By the way, because the influenza genome is segmented, we usually need to
submit the other gene segments at the same time. I have therefore included
the NA gene of this strain here.
>A/Eurasian_Teal/Jiangxi/2018WB0049/2018(H6N2)
TCTGTCTCTCTAACCATTGCAACAGTATGTTTCCTCATGCAAATTGCCATCCTAGCGACAACTATAACACTGCACTTCAAGCAGAATGAATGCAGCATTCCCTCGAACAATCAAGTAGTGCCATGTGAGCCAATCATAGTAGAAAGGAACATAACAGAGATAGTGTATTTGAACAACACCACCATAGAAAAAGAACTTTGTCCTAAATTGACAGAATACAGGGATTGGTTGAAACCACAGTGTCAGATCACAGGATTTGCTCCTTTCTCCAAGGACAACTCAATCCGGCTTTCTGCTGGTGGGGACATTTGGGTAACAAGGGAACCTTATGTATCATGCAGTCCCAATAAGTGTTATCAGTTCGCACTTGGGCAGGGAACCACGCTGGACAACAAACATTCAAACGGCACAATACATGATAGGATTCCCCATCGGACCCTTTTGATGAACGAGTTGGGTGTTCCGTTTCATTTAGGGACCAAACAAGTGTGCATAGCATGGTCCAGCTCAAGCTGCCATGATGGAAGAGCATGGCTTCACGTTTGTGTTACTGGGGATGATAGGAATGCAACCGCCAGTTTCATTTATAATGGGGTGCTTGTTGACAGCATTGGTTCATGGTCCCAAAACATTCTCAGAACTCAGGAGTCAGAATGCGTCTGCATCAATGGAACTTGTACAGTAGTAATGACTGATGGAAGTGCATCAGGAAGGGCTGATACTAGAATACTATTCATTAAAGAAGGGAAAATTGTTCATATCAGCCCATTATCAGGAAGTGCCCAGCATATAGAGGAGTGTTCCTGTTATCCCCGCTATCCAGACGTCAGATGTGTCTGCAGAGACAATTGGAAAGGTTCAAATAGGCCCGTTATAGATATAAATATGGCAGATTATAGCATTGATTCTAGTTATGTGTGCTCAGGGCTTGTTGGAGACACACCGAGAAACGATGATAGCTCTAGCAATAGTAACTGCAAGGATCCTAATAATGAGAGAGGGAACCCAGGAGTGAAAGGGTGGGCATTTGACTATGGAAATGATGTTTGGATGGGAAGAACAATCAGCAAGGATTCTCGCTCAGGTTATGAGACCTTCAGAGTCATTGGCGGTTGGACAACAGCTAATTCCAAATCTCAAGTAAATAGACAAGTCATAGTTGACAATAATAACTGGTCTGGTTATTCTGGCATCTTCTCTGTTGAAGGCAAAAGCTGCATCAATAGGTGTTTTTATGTGGAGTTGATAAGGGGAAGGCCACAAGAGACTAGAGTATGGTGGACTTCAAACAGTATTGTCGTGTTTTGTGGAACTTCAGGTACTTATGGGACAGGCTCATGGCCTGATGGGGCGAATATTAATT
Thank you very much for your help!
Best,
Yang
|
Yes, sorry, I know how the GISAID side works, I was asking how you wanted
to provide the files to multiSub. So a meta file with two rows and one
fasta file with two sequences.
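Before converting, it can help to confirm that the meta file and the FASTA file really pair up one row per sequence. The sketch below is a hedged illustration, not part of multiSub; the ID column name `isolate` is an assumption. It also flags duplicate IDs, which matters here because both segments were posted under the same FASTA header.

```python
import csv
import io


def fasta_ids(fasta_text):
    """Return the ID (text after '>' up to the first whitespace) of each record."""
    ids = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            ids.append(fields[0] if fields else "")
    return ids


def meta_ids(tsv_text, id_column="isolate"):
    """Return the value of the ID column for each metadata row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[id_column] for row in reader]


def check_pairing(fasta_text, tsv_text, id_column="isolate"):
    """Return a list of problems; an empty list means one meta row per sequence."""
    problems = []
    f_ids = fasta_ids(fasta_text)
    m_ids = meta_ids(tsv_text, id_column)
    for name, ids in (("FASTA", f_ids), ("meta", m_ids)):
        dupes = sorted({i for i in ids if ids.count(i) > 1})
        if dupes:
            problems.append(f"duplicate IDs in {name}: {dupes}")
    only_fasta = sorted(set(f_ids) - set(m_ids))
    only_meta = sorted(set(m_ids) - set(f_ids))
    if only_fasta:
        problems.append(f"sequences without a meta row: {only_fasta}")
    if only_meta:
        problems.append(f"meta rows without a sequence: {only_meta}")
    return problems


if __name__ == "__main__":
    fasta = ">strain1\nACGT\n>strain1\nTTTT\n"
    meta = "isolate\tdate\nstrain1\t2018\n"
    for problem in check_pairing(fasta, meta):
        print(problem)
```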
On Sat, Apr 30, 2022 at 8:44 PM Biopig wrote:
Great. So you want this submitted to NCBI or GISAID?
yes, I think GISAID is priority.
I would edit your meta data to put it into csv or tsv format and save the
sequence to fasta, right?
right
So two rows for the meta file and two sequences for the fasta.
For GISAID, there is an official guideline
<https://www.gisaid.org/epiflu-applications/submitting-data-to-epiflutm/>
for batch upload. Another protocol
<https://www.protocols.io/view/sars-cov2-gisaid-submission-protocol-kqdg35oy1v25/v3?step=1.2>
for uploading multiple samples (Batch upload).
|
Hi @biopig, do you know how to check out the "main" branch? I just committed
a test case for this.
If you add this to your ~/.multisub/config (see also config.sample):
organism = "Influenza A virus"
longOrg = "Influenza A virus"
And then go to this directory:
https://github.com/maximilianh/multiSub/tree/main/tests/biopig
And then run this command:
../../multiSub conv seq.fa meta.tsv out
Then an NCBI submission file like this gets created in
out/ncbiSeqAndSource.fa:
A/Eurasian_Teal/Jiangxi/2018WB0049/2018(H6N2) [isolate=Influenza A
virus/Eurasian Teal/USA/2018WB0049/2018] [country=China: Poyang Lake,
Jiangxi] [collection_date=2018-2-1] [host=Eurasian Teal]
[organism=Influenza A virus] Influenza A virusisolate Influenza A
virus/Eurasian Teal/USA/2018WB0049/2018, complete genome
TTGGCAGCAGCCGGGAAGTCAGACAAGATCTGCATTGGATATCATGCCAACAACTCAACA
ACACAAGTGGATACTATCCTTGAGAAAAATGTCACCGTCACGCACTCAGTTGAATTGCTA
GAAACCCAGAAGGAGGAGAGATTCTGCAACATCCTGAACAAGGGCCCTCTCGACCTAAA...
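For readers who want to see how a definition line with bracketed `[key=value]` source modifiers like the one above can be assembled, here is a small Python sketch. It is not multiSub's code; the function names are hypothetical, and reading the sample date `2/1/2018` as month/day/year is an assumption (it matches the `2018-2-1` in the output above). The sketch zero-pads the date to ISO form.

```python
from datetime import datetime


def normalize_date(raw, fmt="%m/%d/%Y"):
    """Convert e.g. '2/1/2018' to '2018-02-01'. The month/day order is an assumption."""
    return datetime.strptime(raw, fmt).date().isoformat()


def build_defline(seq_id, modifiers, title=""):
    """Build '>id [key=value] ... title' from a dict of source modifiers."""
    parts = [">" + seq_id]
    # Skip empty values so no '[key=]' fragments end up in the header.
    parts += [f"[{key}={value}]" for key, value in modifiers.items() if value]
    if title:
        parts.append(title)
    return " ".join(parts)


header = build_defline(
    "A/Eurasian_Teal/Jiangxi/2018WB0049/2018(H6N2)",
    {
        "country": "China: Poyang Lake, Jiangxi",
        "collection_date": normalize_date("2/1/2018"),
        "host": "Eurasian Teal",
        "organism": "Influenza A virus",
    },
)

if __name__ == "__main__":
    print(header)
```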
As for GISAID, I cannot download the GISAID template file for Influenza
today. I can't even go to the GISAID Influenza website (epicov) today, the
link at https://www.epicov.org/epi3/frontend doesn't work. If you have the
csv or Excel template from their site for me, I can fix up the GISAID
uploader.
|
Hi, @maximilianh Fantastic, let me give it a try. Thanks, |
Hi Biopig, oh darn, the GISAID flu template is totally different from the
Covid one.
Are you sure you need GISAID upload? If you have table files already in
GISAID format, why use multiSub at all?
On Mon, May 2, 2022 at 3:52 AM Biopig wrote:
Hi, @maximilianh <https://github.com/maximilianh>
Fantastic, let me give it a try.
Here is the GISAID uploader.
gisaid_batch_uploader.xls
<https://github.com/maximilianh/multiSub/files/8600423/gisaid_batch_uploader.xls>
Thanks,
Yang
|