Road Plan for half of 2018 - Read please #21

Open
lpantano opened this issue Jul 2, 2018 · 42 comments

@lpantano
Contributor

lpantano commented Jul 2, 2018

@lpantano @gurgese @ThomasDesvignes @mhalushka @mlhack @keilbeck @BastianFromm @ivlachos @TJU-CMC @sbb25 @phillipeloher

Hi all,

This will be a rather long email, but please take 15 minutes to go over it. It will help us decide how to start spreading the word about this.

  • BOSC was great: people had a lot of questions and it was welcomed with open arms.
  • I got a lot of good ideas to help with the format and to get people using it:
    • Make a logo. We are holding the competition this week, so we are almost there.
    • Create a separate repository with the format only, see here.
    • Submit to the EDAM ontology and FAIRsharing, which are databases that keep track of formats and databases. I submitted to both and we are waiting for review. @BastianFromm, maybe you want to submit mirGeneDB to FAIRsharing.org?
    • Publish a very small paper with the format only. I am actually in favor of doing this. We can publish in F1000: it is open and they allow very short papers. The main idea is to have something out soon so that people become aware of it; without a paper that is more difficult, and the current work with the tewari data is great but will need time. Can you tell me what you think?

The deadline for important modifications to the first version of the format is in 1 week (07/08/2018), just to be sure we spread the word with the very first usable format.

For that, I need all of you to go to the definition and open an issue for anything you think is important to have and that we don't have yet: anything you would need if you wanted to develop something on top of this format, in terms of querying, visualization, re-mapping, or anything else you would need to know.

As a final idea, everybody recommended presenting this in as many places as possible, but I cannot do it alone. So even if it is just a slide in a talk, please do it. Having a paper will help, but you can still do this at any time. If you go to a conference with a poster, mention this to people as well, so we can create an ecosystem of miRNA data analysis tools.

In summary:

  • need your thoughts about publishing the format in F1000 (very short paper), leaving the python tool and tewari data for the next publication.
  • need your feedback to make the format useful for everybody.

Thanks!

#18

@lpantano
Contributor Author

lpantano commented Jul 2, 2018

I forgot to mention: please click on watch on https://github.com/miRTop/mirGFF3 so you know what is going on when the file is modified, and like it with a star as well :)

@lpantano
Contributor Author

lpantano commented Jul 9, 2018

Hi all,

Did anyone have a chance to go through this? I would love to have your input.

Thanks!

@mhalushka
Collaborator

I think this is a great plan. Please let me know how I can better engage and help out. I might have a trainee who can do some analysis of the Tewari data, if you need help there.

@lpantano
Contributor Author

Thanks @mhalushka, having more people would help. I am trying to get a trainee as well.

I'll wait until we have a couple more comments and, if everybody agrees, work on the draft for the GFF paper so we have it for August.

Cheers

@ThomasDesvignes
Member

Hi Lorena,
yes, I agree too that this is a great plan.
Logo, independent repository to make this format independent of miRTOP, submission to file format databases: all that sounds great to me and seems like efficient new steps towards a solid global format. Also, having an initial very short publication of the format before doing the more extensive study is likely a good move. This initial publication may attract more people to the group, which may lead to novel ideas that could improve the solid bases we already have.
I really like the quote you wrote down from Tracy Teal ("If you want to go fast, go alone; If you want to go far, go together"). I think it summarizes very well our situation and our common interest in making things happen as a group.
I will try to get more data out soon and comment on the format definition. Please let me know if I can help with this short note you want to write or if I can help in any other way!
I will also add a slide about the group to all my miRNA-related talks from now on!
Cheers,

@mlhack
Collaborator

mlhack commented Jul 11, 2018 via email

@mhalushka
Collaborator

I am happy to jump on a conference call. I'd like to figure out what to assign my trainee to work on.

@lpantano
Contributor Author

Thanks Thomas for chiming in.

Thanks Michael for the input.

I proposed leaving the tool and the tewari data for a future paper because the tool needs a lot of work if we want to add all the most important features. For instance, querying the file is not implemented yet.

I also think that working on the format for a publication will produce some changes for the better, and those will affect the tool.

So my current plan is to publish the format. We can make the paper long, but then I need more contributions because I won't be able to come up with all the perspectives on my own. I think it would be great to have a good discussion in the paper, if everybody is on board. Also, there is currently no widely used format, which makes it easier to promote the work we are doing.

If this plan goes ahead, I think mentioning that there is an open community developing the tool can bring in more collaborators and make the tool better for the publication with the tewari data. Plus, we'd use the data to make the point that the tool helps to integrate large numbers of samples and tools.

Here is the doodle to try to schedule a conference call:

https://doodle.com/poll/ufinbin4fv772eee

Thanks all for your feedback!

@lpantano
Contributor Author

Sorry, I added times to the poll. Remember, the times are in ET. Thanks

@lpantano
Contributor Author

Thanks for choosing your time:

I'll set up the meeting for Thursday the 19th, from 9 to 11am ET. Some of you can make it at 9 and others at 10, so I'll be there for the two hours and I can update Thomas, who can only make it at 10am (sorry Thomas, I totally forgot you are 3h behind me :( ; we can meet even later that day if that is ok). At the end I will send minutes with the plan that we can hopefully agree on.

I'll send the invitation on Monday!

Thanks!

@phillipeloher
Collaborator

The plan looks good to me and Thursday works great.

@lpantano
Contributor Author

Hi all,

this is the invitation:

Topic: Mirtop - road plan 2018
Time: Jul 19, 2018 9:00 AM Eastern Time (US and Canada)

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/221604941

Or iPhone one-tap :
US: +16699006833,,221604941# or +16468769923,,221604941#
Or Telephone:
Dial(for higher quality, dial a number based on your current location):
US: +1 669 900 6833 or +1 646 876 9923
Meeting ID: 221 604 941
International numbers available: https://zoom.us/u/eu3Ib7wO5

@mhalushka
Collaborator

Thank you. I will be on the call. Also, the Tewari paper came out in Nat Biotech. So there is nothing holding us back from moving forward with their data now and getting our findings published. https://www.ncbi.nlm.nih.gov/pubmed/30010675

@lpantano
Contributor Author

Minutes:

  • publications:
    • format definition + mirtop tool: first a pre-print, then journal submission when the mirtop tool is 100% ready
    • tewari re-analysis: organize a meeting every two weeks
  • Marc will check that we have all the data as it is published in the GEO database
  • Phillipe will re-analyze the data with their tool

@lpantano
Contributor Author

Hi all,

Just to follow up with a specific plan:

I set up a biweekly meeting to talk about the tewari data; everybody who can join is welcome. I know that at the beginning it will be difficult to get a lot of people, but I hope it works once you get used to having this day and time blocked. Here is the calendar of the miRTop project

You can add the meeting event using this link

The event will be every two weeks, every Thursday at 10am ET. I'll send a reminder one day before.

As a final item, I'll start the format definition paper and share a Google Doc with you all. I'll try to set up some deadlines, which will be included in the document itself.

Thanks all for continuing to push!

@lpantano
Contributor Author

Information to join the tewari meeting (I added to the calendar as well):

Join from PC, Mac, Linux, iOS or Android: https://zoom.us/j/553765969

Or iPhone one-tap :
US: +16465588665,,553765969# or +14086380986,,553765969#
Or Telephone:
Dial(for higher quality, dial a number based on your current location):
US: +1 646 558 8665 or +1 408 638 0986
Meeting ID: 553 765 969
International numbers available: https://zoom.us/u/b0MgRck4D

@lpantano lpantano removed the urgent label Aug 9, 2018
@lpantano
Contributor Author

lpantano commented Aug 9, 2018

Hi all,

All the minutes will be kept here: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md, feel free to bookmark this to catch up.

As well, feel free to check the road map for the project:

https://github.com/miRTop/incubator/projects/2

Thanks!

@lpantano
Contributor Author

Hi all,

I have cancelled this week's meeting since I am on holiday. I moved it to next week: Aug 30th. See link below.

Thanks!

@mhalushka
Collaborator

I am away on the 30th, but hopefully Arun can represent us at the talk.

@arunhpatil
Member

arunhpatil commented Aug 21, 2018 via email

@lpantano
Contributor Author

Hi all,

It would be good to have as many of you as possible at tomorrow's meeting. We have spotted an important result that needs a good discussion to decide how to move forward.

I will also share the draft of the mirtop format paper so you can contribute and make modifications. The idea is to submit to F1000 at the end of October.

I hope some of you can make it.

Cheers

@mhalushka
Collaborator

I will be on the call. Looking forward to it.

@BastianFromm
Collaborator

BastianFromm commented Sep 19, 2018 via email

@lpantano
Contributor Author

Bastian,

the time is 10am Boston time and the link to connect is https://zoom.us/j/553765969

cheers

@BastianFromm
Collaborator

BastianFromm commented Sep 20, 2018 via email

@lpantano
Contributor Author

Hi Bastian,

Sorry that the time did not work for you. Maybe you can join next time. We chose that time so people from the other coast can join.

I have updated the minutes here:

@lpantano
Contributor Author

Sorry, I sent it by mistake before it was ready.

The minutes are here:

https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#09-20-2018

I played more with the data and I think we now see something that may make sense. The bottom line is that, after filtering the data to look at the miRNAs where the reference is the top expressed sequence, the majority of isomiRs have an abundance lower than 20% of the total miRNA. You'll find more in the link inside the minutes.
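For anyone who wants to try the same kind of filter on their own counts, here is a minimal sketch. It assumes a simple table of (miRNA, sequence id, is-reference flag, count); the function name `isomir_fractions` and the counts are made up for illustration and are not part of mirtop or the actual analysis.

```python
from collections import defaultdict

def isomir_fractions(rows):
    """rows: iterable of (mirna, sequence_id, is_reference, count).

    Keeps only miRNAs whose reference sequence is the most abundant one and
    returns {(mirna, sequence_id): fraction_of_total_mirna_count} for the
    isomiRs of those miRNAs.
    """
    by_mirna = defaultdict(list)
    for mirna, seq_id, is_ref, count in rows:
        by_mirna[mirna].append((seq_id, is_ref, count))

    fractions = {}
    for mirna, seqs in by_mirna.items():
        total = sum(count for _, _, count in seqs)
        top = max(seqs, key=lambda s: s[2])
        if total == 0 or not top[1]:
            continue  # skip miRNAs where the reference is not the top sequence
        for seq_id, is_ref, count in seqs:
            if not is_ref:
                fractions[(mirna, seq_id)] = count / total
    return fractions

# Hypothetical counts, for illustration only.
counts = [
    ("miR-A", "ref", True, 800),
    ("miR-A", "iso1", False, 150),
    ("miR-A", "iso2", False, 50),
    ("miR-B", "ref", True, 100),
    ("miR-B", "iso1", False, 900),  # reference is not the top: miR-B is dropped
]
fractions = isomir_fractions(counts)
below_20 = {key: value for key, value in fractions.items() if value < 0.20}
```

With these made-up counts only miR-A passes the filter, and both of its isomiRs fall below the 20% threshold.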

You can see there the next steps.

@mhalushka, is it ok if I share the raw data with you and you analyze it with miRge?

Next meeting is on October 4th, at 10 am Boston time. (https://zoom.us/j/553765969)

See you there!

@mhalushka
Collaborator

@lpantano. Yes. I'm happy to run the data through miRge.

@lpantano
Contributor Author

lpantano commented Oct 4, 2018

Hi all,

these are the minutes: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#10-04-2018

@mhalushka , we came up with some questions for the authors. Do you think you can try to contact them? happy to clarify over email.

Next meeting is on October 18th, at 10 am Boston time. (https://zoom.us/j/553765969)

cheers

@lpantano
Contributor Author

Hi all,

We mainly discussed the paper today. These are the points we want to standardize; if nobody has a strong opinion, we'll go ahead with them:

  • Genome-based coordinates will include the strand, written as chr:start-end(strand). Please chime in if there is a specific format that is widely used.
  • The coordinates will be 1-based, inclusive at the start and the end.
  • Truncation at 5p will be designated as +N, instead of -N, and templated additions at 5p as -N, instead of +N. @ThomasDesvignes, can you tell us if something similar is done for protein coding genes? We couldn't remember any similar case in the gene world to base the discussion on. This decision is based on the multiple papers that already use this annotation (see the paper comment from @phillipeloher).
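To make the coordinate proposal concrete, here is a small sketch of how a tool could read and write chr:start-end(strand) strings with 1-based, inclusive positions. The names `Locus`, `parse_locus`, `format_locus`, `locus_length` and `to_zero_based_half_open` are hypothetical helpers written for this comment, not mirtop functions.

```python
import re
from typing import NamedTuple, Tuple

class Locus(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"

_LOCUS_RE = re.compile(
    r"^(?P<chrom>[^:]+):(?P<start>[0-9]+)-(?P<end>[0-9]+)\((?P<strand>[+-])\)$"
)

def parse_locus(text: str) -> Locus:
    """Parse a 'chr:start-end(strand)' string into a Locus."""
    match = _LOCUS_RE.match(text)
    if match is None:
        raise ValueError(f"not a chr:start-end(strand) string: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    if start < 1 or start > end:
        raise ValueError(f"expected 1 <= start <= end, got {start}-{end}")
    return Locus(match["chrom"], start, end, match["strand"])

def format_locus(locus: Locus) -> str:
    """Write a Locus back as 'chr:start-end(strand)'."""
    return f"{locus.chrom}:{locus.start}-{locus.end}({locus.strand})"

def locus_length(locus: Locus) -> int:
    """Both ends are inclusive, so the length is end - start + 1."""
    return locus.end - locus.start + 1

def to_zero_based_half_open(locus: Locus) -> Tuple[int, int]:
    """Convert to the 0-based, end-exclusive interval used by BED-like tools."""
    return locus.start - 1, locus.end

locus = parse_locus("chr1:980-1000(-)")
```

Because both ends are inclusive, chr1:980-1000 covers 21 nucleotides, which is the usual off-by-one trap when moving between this style and 0-based half-open intervals.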

The minutes are here: https://github.com/miRTop/incubator/blob/master/projects/tewari/minutes.md#10-18-2018

Next meeting is on November 1st, at 10 am Boston time. (https://zoom.us/j/553765969)

cheers

@ThomasDesvignes
Member

Hi all,
sorry for the late reply.

  • For the position on the genome, it can be the way you propose, or simply report the start and end the way they are transcribed. For example, if miRX-5p is on the reverse strand, starts at position 1000 on Chr1 and ends at position 980, then it could be either "1:980-1000(-)" (the way you propose, right?) or "1:1000-980", which is another common way. The second seems simpler to me and contains all the information without adding 3 extra symbols.
  • I agree on inclusive coordinates at start and end.
  • Just to make sure I get the part about +N and -N, I'll try to spell it out in my own way: this refers to the start and end of the alignment of an isomiR compared to the alignment of the ref miRNA. If the isomiR is truncated by 1 nt, then its start is one nt later than that of the ref miRNA and it is labelled "+1". Conversely, if the isomiR has a 1 nt templated addition, then its start is one nt earlier and it is labelled "-1".
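The two coordinate notations from the first point carry the same information and can be converted into each other, as in this sketch. The function names are hypothetical and the code only illustrates the two string styles discussed here.

```python
import re

_STRANDED = re.compile(r"^(.+):([0-9]+)-([0-9]+)\(([+-])\)$")
_TRANSCRIBED = re.compile(r"^(.+):([0-9]+)-([0-9]+)$")

def stranded_to_transcribed(text: str) -> str:
    """'1:980-1000(-)' -> '1:1000-980' (positions in order of transcription)."""
    match = _STRANDED.match(text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    chrom, strand = match.group(1), match.group(4)
    low, high = int(match.group(2)), int(match.group(3))
    first, last = (low, high) if strand == "+" else (high, low)
    return f"{chrom}:{first}-{last}"

def transcribed_to_stranded(text: str) -> str:
    """'1:1000-980' -> '1:980-1000(-)' (strand inferred from the order)."""
    match = _TRANSCRIBED.match(text)
    if match is None:
        raise ValueError(f"unexpected format: {text!r}")
    chrom = match.group(1)
    first, last = int(match.group(2)), int(match.group(3))
    # A single-nucleotide feature (first == last) cannot encode its strand
    # in this notation; this sketch arbitrarily reports it as "+".
    strand = "+" if first <= last else "-"
    return f"{chrom}:{min(first, last)}-{max(first, last)}({strand})"
```

One practical difference: the order-based notation cannot express the strand of a one-nucleotide feature, while the explicit (strand) suffix always can.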

This totally makes sense to me. It just depends on where the focus is put, and it seems like just a convention to decide upon. There are two ways to look at an isomiR, and it could be the opposite if we focused on the isomiR sequence itself and its length (an isomiR with 1 extra nucleotide would be "+1" because it is one longer) rather than on the start of the alignment (an isomiR with 1 extra nucleotide would be "-1" because it starts one nucleotide earlier).
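Spelled out as code, the alignment-start convention looks like the sketch below. The helper names are hypothetical, coordinates are assumed to be 1-based inclusive genomic positions, and the minus-strand handling (taking the 5' end from the higher genomic coordinate) is my own reading of the convention rather than something stated in the definition.

```python
def five_prime_offset(ref_start: int, ref_end: int,
                      iso_start: int, iso_end: int, strand: str) -> int:
    """Offset of the isomiR 5' end relative to the reference 5' end.

    Positive: the isomiR starts later than the reference (5' truncation).
    Negative: the isomiR starts earlier (5' templated addition).
    """
    if strand == "+":
        return iso_start - ref_start
    if strand == "-":
        # On the minus strand the 5' end is the higher genomic coordinate.
        return ref_end - iso_end
    raise ValueError(f"strand must be '+' or '-', got {strand!r}")

def offset_label(offset: int) -> str:
    """Format an offset with an explicit sign, e.g. '+1' or '-1'."""
    return f"{offset:+d}" if offset else "0"

# Reference at 100-121 on the plus strand.
truncated = offset_label(five_prime_offset(100, 121, 101, 121, "+"))
extended = offset_label(five_prime_offset(100, 121, 99, 121, "+"))
```

Here `truncated` is the label for an isomiR that lost its first nucleotide and `extended` is the label for one that gained a templated nucleotide at the 5' end.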

And I think whichever system is chosen, the important thing is to be very clear in the definition. As far as I know, there is not really anything similar for protein coding genes: mRNA transcript isoforms are referred to as "delta exon X" or things like that, but there's no real consensus that I'm aware of. For myself, I'm totally fine with this convention or the other, as long as we define it without ambiguity.

Cheers,
Thomas

@mhalushka
Collaborator

Just for clarity (and I apologize if this was already covered), but for the ref miRNA start position, is this the alignment in miRBase or miRGeneDB? I think there may be a very rare miRNA that differs between the two sites and I don't recall if that is settled or not. It may be that all of the differences between miRBase and miRGeneDB are on the 3' end. I think @BastianFromm may be able to clarify that second point.

@BastianFromm
Collaborator

BastianFromm commented Oct 22, 2018 via email

@lpantano
Contributor Author

Hi all,

Thanks for the comments, very helpful. I will then update the definition to be consistent with @phillipeloher's proposal.

I agree that each database would have its reference, and mirgenedb is trying to improve this, but I am very grateful for all that work. For this same reason, I think the idea is to have the file format be database dependent, which can make it easier, for instance, to translate mirbase to mirgenedb. And in case people are working with some new species, they can still use the format.

Hopefully, this tool can expose the issues of some databases even more and help find a consensus in the future.

Thanks!

@phillipeloher
Collaborator

Great feedback, we'll work on updating the definitions today and tomorrow.

@lpantano
Contributor Author

lpantano commented Nov 2, 2018

Hi,

Here are the minutes from yesterday's meeting. Pay attention to the paper @phillipeloher found: it has a section on isomiRs, and although we are going further, we overlap a little:

11-01-2018

People: Lorena, Phillipe, Ioannis, Arun, Marc

We discussed:

  • iso_add should be iso_add3p, and we should have iso_add5p. The sign should be removed, keeping only the number of nts involved in the variation: iso_add3p:2.
  • Examples in the paper table should be consistent and show one example for each attribute.
  • isomiR shift in different databases, corroborating the high counts of some isomiRs that are mainly truncation events and shouldn't be there.
  • This paper started to look at isomiR consistency besides miRNAs (kind of similar to what we are doing, but I think we are going further): https://www.biorxiv.org/content/early/2018/10/30/445437?%3Fcollection=
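As a quick illustration of the unsigned addition tags from the first point, here is a sketch of writing and reading them. `format_addition` and `parse_addition` are hypothetical helpers, and the only tag syntax assumed is the iso_add3p:2 example given above.

```python
import re
from typing import Tuple

_ADDITION_RE = re.compile(r"^iso_add(3p|5p):([0-9]+)$")

def format_addition(end: str, n_nts: int) -> str:
    """Build an unsigned addition tag such as 'iso_add3p:2'."""
    if end not in ("3p", "5p"):
        raise ValueError(f"end must be '3p' or '5p', got {end!r}")
    if n_nts < 1:
        raise ValueError("the number of added nts must be at least 1")
    return f"iso_add{end}:{n_nts}"

def parse_addition(tag: str) -> Tuple[str, int]:
    """Split 'iso_add3p:2' into ('3p', 2); signed values are rejected."""
    match = _ADDITION_RE.match(tag)
    if match is None:
        raise ValueError(f"not an unsigned iso_add3p/iso_add5p tag: {tag!r}")
    return match.group(1), int(match.group(2))

tag = format_addition("3p", 2)
end, n_nts = parse_addition(tag)
```

Rejecting signed values such as iso_add3p:+2 in the parser is one way to catch files written with the older signed style.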

To-do points:

  • Update the paper according to the discussed points.
  • Lorena has to send a FASTA file of all sequences to Ioannis to get the features that will be used for the machine learning analysis once we can classify sequences as reproducible or not.
  • Look into the NTs involved in the truncation events.
  • Ioannis may look at the ends of the ERCC spike-ins to see if we see the same.

Next meeting 11-15-2018, 10am EST time, 4pm GMT+1 time.

@lpantano
Contributor Author

Hi all,

I hope you had a good holiday (for those of you in the USA).

Sorry for missing the minutes from the last meeting.

We discussed two topics:

Since there is a lot on my plate, I am canceling this week's meeting (no meeting on the 29th) and we will have it on the 13th to decide about the experimental design explained in the previous point.

I will give a local talk on the 6th that I will stream, so you are welcome to join. I will post the link here next week.

Next meeting 12-13-2018, 10am EST time, 4pm GMT+1 time.

Thanks!

@lpantano
Contributor Author

lpantano commented Dec 5, 2018

Hi all,

I am giving the talk tomorrow that will summarize the mirtop and mirGFF3 project, as well as the re-analysis of the tewari and other data sets. You can join online (tomorrow at 11 am Boston time):

Meeting URL
https://bluejeans.com/211525664

Meeting ID
211 525 664

Moderator Passcode
2730

Want to dial in from a phone?
Dial one of the following numbers:
+1.408.740.7256
(US (San Jose))
+1.888.240.2560
(US Toll Free)
+1.408.317.9253
(US (Primary, San Jose))
see all numbers
Enter the meeting ID and passcode followed by #

Connecting from a room system?
Dial: 199.48.152.152 or bjn.vc and enter your meeting ID & passcode

@lpantano
Contributor Author

lpantano commented Dec 7, 2018

Hi all,

I recorded the talk summarizing the work we have done with the re-analysis, you can access it through here: https://bluejeans.com/s/h4CR2

Next meeting 12-13-2018, 10am EST time, 4pm GMT+1 time. (https://zoom.us/j/553765969)

Here is the document to add the experimental design that can help to study the isomiR accuracy in sequencing: https://docs.google.com/document/d/1XyKjQJ2R6qdES12uDK-5ffA43xgEHmcZaZ_6Vr6yWFM/edit?usp=sharing

Talk to you soon.

cc @carriewright11

@lpantano
Contributor Author

Hi everybody!

Happy holidays.

Next meeting 01-17-2019, 10am EST time, 4pm GMT+1 time. (https://zoom.us/j/553765969)

Enjoy the break.

Cheers

@lpantano
Contributor Author

I am archiving this thread to open the Road plan for 2019 soon.

@miRTop miRTop locked as resolved and limited conversation to collaborators Jan 17, 2019

7 participants