GFF3::source | GFF3::type #13

lpantano · 2017-06-13T15:21:35Z

Hi all again!

cc: @lpantano @gurgese @ThomasDesvignes @mhalushka @mlhack @keilbeck @BastianFromm @ivlachos @TJU-CMC

I propose that the second column (source) hold the database used by the tool.

I propose to use these labels for the type column (3rd):

hairpin : this could be the parent
annotated: this could be the sequence annotated in the database; it is a child of hairpin. I am trying to avoid "canonical", as we have discussed before. "reference" may be another option, since the situation is similar to the one we have for SNPs, where the reference was simply designated by the first genomes sequenced, which doesn't mean it is the most abundant. "miRNA" is another label we could use, I guess.
isomiR/variant: this could be the detected sequence; it is a child of the previous one
other types of miRNAs?

Please contribute more options or any thoughts you have about it! Thanks!

keilbeck · 2017-06-13T15:33:49Z

Column 3 needs to be a term from the Sequence Ontology.
If the right terms are not there to describe your feature, we need to add them to the ontology.
There are currently 25 miRNA terms in the SO:
http://www.sequenceontology.org/browser/obob.cgi

lpantano · 2017-06-13T15:35:17Z

Thanks, that is great! I'll take a look, and if we need something new we'll work with you to add it?

ivlachos · 2017-06-13T15:44:55Z

I really like the idea of "reference" instead of canonical, since it's very close to reality.
Kudos!

ThomasDesvignes · 2017-06-13T17:09:06Z

I am all for a "reference" miRNA instead of a "canonical" miRNA. In our TiG paper we were actually proposing the creation of a "RefSeq miRNA sequence" as an unchangeable standard, while the most expressed isomiR could change among sample/tissue/etc...

For column 3, the parent could then be "pre_miRNA" to match the Sequence Ontology. I think the SO database has most, if not all, cases covered as of now, except the isomiRs, which may be considered children of the RefSeq miRNA.

ThomasDesvignes · 2017-06-13T19:36:02Z

For column two ("source"), do you mean putting "miRBase_v.XX", "MirGeneDB_v.X" or "personal annotation"? All of that is fine with me. In my experience (on fish), I usually do my own annotation and make it public with the publication. For example, what I've done on both zebrafish and spotted gar has never been incorporated into any database. How would we deal with that? I am thinking of putting my annotation files on a GitHub/Zenodo page (because I'll continue annotating more species, and I know people won't dig into the supplemental files of my publication to retrieve an annotation...), so maybe in column 2 we could have something like "Zenodo_doi..."? Basically, something traceable...

BastianFromm · 2017-06-13T20:32:26Z

We did zebrafish. We are doing about 20 more and hope to publish a reference for major metazoans this autumn.

ThomasDesvignes · 2017-06-13T20:51:57Z

That's awesome Bastian! How many more fish out of the 20? (I'm a fish person ;) )
However, the problem I have with MirGeneDB is that the criteria for being in the DB are way too strict with regard to the way I study miRNAs. There are many non-canonical miRNAs that are not in MirGeneDB and that I want to continue studying because they are functional (cf. previous discussions on canonical miRNAs). So I guess that, at least for my studies, I'll continue using my own annotation files, which will remain larger than what is in MirGeneDB, and I need a way to make them publicly available. That's why I am asking for an alternative "source" of annotation. But we're moving away from the original question here...

lpantano · 2017-06-13T23:57:41Z

Thanks all for the discussion! And it's awesome that we'll have zebrafish there.

Thomas, I think that is OK: you can name it as you want, as long as it doesn't overlap with an official name.

I think we can ask for a line like this in the header of the file:

##source-ontology LINK TO DATABASE

or something like that, to make sure it is traceable.
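As a rough illustration of how a tool could recover such a header line, here is a minimal Python sketch. It assumes the directive ends up being spelled exactly `##source-ontology` (the thread only says "something like that"), and the URL and feature line in the example are placeholders, not real records.

```python
def read_directives(lines):
    """Collect leading '##key value' directive lines into a dict.

    Reading stops at the first feature line, so only directives found in
    the header block are returned.
    """
    directives = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # tolerate blank lines in the header
        if line.startswith("##"):
            key, _, value = line[2:].partition(" ")
            directives[key] = value.strip()
        elif line.startswith("#"):
            continue  # plain comment, not a directive
        else:
            break  # first feature line: the header is over
    return directives


# Hypothetical file content; the URL is a placeholder.
example = [
    "##gff-version 3\n",
    "##source-ontology https://example.org/my-annotation\n",
    "chr1\tmy_annotation\tpre_miRNA\t100\t180\t.\t+\t.\tID=hairpin1\n",
]
header = read_directives(example)
print(header["source-ontology"])
```

A parser like this would let downstream tools trace a personal annotation back to wherever it is hosted, whatever label is used in column 2.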

PS: The idea of uploading it to GitHub seems very good.

lpantano · 2017-06-15T13:50:50Z

Hi @keilbeck and all,
I looked at the SO. I think we need something like ref_miRNA and edit_miRNA, or isomiR directly. Do you think it is possible to add these to the database?

Let me know your thoughts.

keilbeck · 2017-06-21T12:50:40Z

Send me the definitions.

ThomasDesvignes · 2017-06-22T15:35:25Z

Hi Karen, I'm not sure we've reached a consensus yet on the "ref_miRNA" and "isomiR" definitions (I think isomiR is better than edit_miRNA btw), but in our paper together we proposed these definitions, which people can maybe comment and embellish:

Ref_miRNA: A Ref_miRNA sequence is assigned at the creation of a new mature miRNA entry in a database. The Ref_miRNA sequence designation remains unchanged even if a different isomiR is later shown to be expressed at a higher level. A ref_miRNA can be produced by one or multiple pre-miRNA.
IsomiRs: IsomiRs are all the bona fide variants of a mature product. IsomiRs should be connected to the Ref_miRNA it is most likely to be the variant of. Some isomiRs can be variations of one or multiple Ref_miRNA.
(Directly taken from Fig.1 in the Trends in Genetics miRNA Nomenclature paper)

lpantano · 2017-06-22T15:36:47Z

I am happy with those definitions, Thanks Thomas!

keilbeck · 2017-06-22T16:05:05Z

OK, just trying to get my head around this.

Is a ref_miRNA a genomic feature or a transcript feature?
I think isomiR is a transcript feature, right?

ThomasDesvignes · 2017-06-22T16:10:02Z

From my point of view:

ref_miRNA is a transcript feature: it's the mature reference product of a miRNA gene expression. By analogy, it's like the RefSeq transcript of a protein coding gene
isomiR is also a transcript feature. With the same analogy, it's a splicing variant of a protein coding gene.

keilbeck · 2017-06-22T16:19:23Z

Brilliant.
We will add these

nicoleruiz · 2017-06-22T18:05:33Z

SO:0002166 ref_miRNA and SO:0002167 isomiR have been added as children of miRNA.
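With these terms in place, a file's type column can be checked against the agreed vocabulary. The sketch below is only an illustration: the two accessions come from the comment above, `pre_miRNA` is the existing SO term mentioned earlier in the thread (its accession is left out because it is not given here), and the function name and example lines are made up.

```python
# Terms discussed in this thread for column 3 (type).
ALLOWED_TYPES = {
    "pre_miRNA": None,          # existing SO term; accession not given in the thread
    "ref_miRNA": "SO:0002166",
    "isomiR": "SO:0002167",
}


def has_valid_type(gff_line):
    """Return True if column 3 of a tab-separated GFF3 line is an allowed term."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError("a GFF3 feature line must have 9 tab-separated columns")
    return fields[2] in ALLOWED_TYPES


# Hypothetical feature lines for illustration only.
good = "chr1\tmiRBase_v.XX\tisomiR\t105\t126\t.\t+\t.\tID=iso1"
bad = "chr1\tmiRBase_v.XX\tcanonical\t105\t126\t.\t+\t.\tID=x1"
print(has_valid_type(good), has_valid_type(bad))
```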

mhalushka · 2017-06-22T18:14:31Z

Just to probe this further, what if a more abundant isomiR with a change at the 5' end of a ref_miRNA is encountered? This would change the seed sequence and could change the genes to which the miRNA could bind. Would you still keep the original ref_miRNA? Would you consider updating it with a "version change" or similar method?

ThomasDesvignes · 2017-06-22T18:28:41Z

That's a good thought! From my end I still consider it as an isomiR of the ref-miRNA and I usually call it a "seed-shifted isomiR". It will theoretically have a different function/targets due to having a different seed but it still is an alternative product of the same gene/pre-miRNA.
Then, if the ref_miRNA has actually been annotated with the "wrong" seed, that would probably need to be fixed, I guess. So it all relies on the quality of the sequencing and analysis of the first dataset that leads to the annotation...
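To make the "seed-shifted isomiR" idea concrete, here is a small Python sketch. It assumes the common convention that the seed is nucleotides 2 to 8 of the mature sequence (some authors use 2 to 7), and the hairpin fragment and coordinates are invented for illustration.

```python
def seed(mature_seq, first=2, last=8):
    """Return the seed of a mature sequence (1-based positions first..last)."""
    return mature_seq[first - 1:last]


def is_seed_shifted(ref_seq, isomir_seq):
    """True when the isomiR's seed differs from the reference seed."""
    return seed(ref_seq) != seed(isomir_seq)


# Invented hairpin fragment; slices are 0-based Python indices.
fragment = "AUGUAGCUUAUCAGACUGAUGUUGACU"
ref = fragment[3:25]            # the reference mature sequence
iso_5p_shift = fragment[2:25]   # starts one nucleotide upstream
iso_3p_trim = fragment[3:24]    # same start, one nucleotide shorter at the 3' end

print(seed(ref), seed(iso_5p_shift), seed(iso_3p_trim))
print(is_seed_shifted(ref, iso_5p_shift), is_seed_shifted(ref, iso_3p_trim))
```

Only the 5' change alters the seed here; the 3' trimmed variant keeps it, which is why the two kinds of isomiR may need to be distinguished in downstream functional analysis.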

lpantano · 2017-06-22T21:35:25Z

That is indeed a good point, Marc, and I agree with Thomas. I think the same applies to protein-coding genes, where some isoforms change function depending on the exons they contain. In this case, if the sequence maps only to that miRNA but changes the seed, I would continue using ref_miRNA as the reference. Please remember that "reference" doesn't mean anything in itself; it is just something to compare to. It will differ from database to database, and probably among versions of the same database in the future.

If the variant maps to more than one miRNA, it would appear as an isomiR of each of the reference miRNAs (with some attribute flagging it as ambiguous). I think all of this is fine as long as we keep all the information. For this reason, the issue opened to discuss the attributes is important. There, I mentioned a field to classify the isomiRs; it will make it possible to take from the GFF file all the isomiRs that change the seed region compared to the reference, if anybody wants to focus on those cases for further functional analysis.

Thanks for all the comments, I think we are improving a lot! Please chime in on #14 to talk about attributes. Thanks!
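A sketch of the kind of filtering described here, in Python. Everything about the attribute layout is an assumption made for illustration: the attributes were still under discussion in #14, so the `Variant` values (`iso_5p:+1`, `iso_3p:-1`), the IDs and the `Parent` names below are hypothetical. The last two records show one ambiguous sequence reported once per candidate locus.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attrs = {}
    for item in column9.strip().split(";"):
        if item:
            key, _, value = item.partition("=")
            attrs[key.strip()] = value.strip()
    return attrs


def seed_changing_isomirs(lines, marker="iso_5p"):
    """Return (ID, Parent) for isomiR features whose Variant includes a 5' change."""
    hits = []
    for line in lines:
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "isomiR":
            continue
        attrs = parse_attributes(fields[8])
        variants = attrs.get("Variant", "").split(",")
        if any(v.startswith(marker) for v in variants):
            hits.append((attrs.get("ID"), attrs.get("Parent")))
    return hits


# Hypothetical records; attribute names and values are illustrative only.
records = [
    "chr1\tdb\tref_miRNA\t105\t126\t.\t+\t.\tID=ref_A;Parent=hairpin1;Variant=NA",
    "chr1\tdb\tisomiR\t104\t126\t.\t+\t.\tID=iso_A;Parent=hairpin1;Variant=iso_5p:+1",
    "chr1\tdb\tisomiR\t105\t125\t.\t+\t.\tID=iso_B;Parent=hairpin1;Variant=iso_3p:-1",
    "chr2\tdb\tisomiR\t50\t72\t.\t+\t.\tID=iso_C.1;Parent=hairpin2;Variant=iso_5p:-1,iso_3p:+2",
    "chr3\tdb\tisomiR\t900\t922\t.\t-\t.\tID=iso_C.2;Parent=hairpin3;Variant=iso_5p:-1,iso_3p:+2",
]
hits = seed_changing_isomirs(records)
print(hits)
```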

ivlachos · 2017-06-22T21:54:07Z

I like the approach of the ref miRNA and isomiRs. It's a convention, it's clear and extensible.
Many isomiRs can have different functionalities despite having the same seed (e.g. different localization) but certainly targeting with a 5' shift could be drastically affected. I agree that it's an isomiR compared to the reference and it's our job to find out what changes and what remains the same, as we are doing for genes.
I also support avoiding "edited", since it brings ADAR to mind.

phillipeloher · 2018-07-16T14:30:43Z

From my perspective, a Ref_miRNA is an abstraction of a series of surrounding isomiRs. The problems with a ref_miRNA include: (a) different reference endpoints between databases (e.g. miRBase vs miRCarta) for the same locus; (b) as folks already mentioned, the isomiR seeds (e.g. isomiRs with different 5p starting points) won't necessarily match the reference; (c) the most abundant isomiR (from which many ref miRNA annotations were populated) can differ between tissue state and cell type.

In many cases, the isomiR sequence corresponding exactly to the ref_miRNA sequence (a 0|5p, 0|3p isomiR) is expressed.

Instead of making an isomiR a child of a ref_miRNA, if we made the Ref_miRNA an abstract_property of an isomiR sequence, it would place (I think rightfully) less emphasis on a somewhat arbitrary Ref_miRNA and more emphasis on the transcriptional products.

lpantano · 2018-07-16T16:53:58Z

Hi @phillipeloher,

thanks for the comment.

We don't consider isomiR to be a child of miRNA_Ref but a child of precursor.

It is true that the Variant attribute is relative to the miRNA_Ref, but I think this is the same problem as with any other database where you get a reference somehow. I think having the universal ID lets the data be mapped to any other database, and if we allow a cross-mapping tool in the API (miRBase to MirGeneDB, etc.), then we solve this problem to some extent. What do you think? I'll open an issue with this request.

It is true that we can remove miRNA_ref from there and set the variants to NA, meaning that with that database there are no variants. I'll open a discussion for this specific issue. Thanks! Great idea!

lpantano added the discussion label Jun 13, 2017

lpantano added the consensus label Jun 26, 2017

lpantano added a commit that referenced this issue Aug 9, 2017

adaptation from issues #13 #14

5125b18
