moving to package independent of membrain #58

Open
6 tasks
alisterburt opened this issue Feb 8, 2024 · 10 comments

@alisterburt
Member

alisterburt commented Feb 8, 2024

@LorenzLamm opening an issue to track the conversion of this package into a standalone teamtomo tool that is separate from MemBrain and called by MemBrain.

We discussed reasons for this and I think we ended in agreement, any concerns about any of this before we move ahead?

Checklist for things that need to happen

Do you want help with any of these steps in particular? I'd love to find time to pair on anything I can help with

@LorenzLamm
Collaborator

Hey @alisterburt,

Thanks a lot for pushing this and creating all the issues!

I'll respond in more detail in the respective issues, and I agree with most of your points. Honestly, I'm quite buried in work at the moment, so I need to figure out what to best spend time on.

For me, the highest priority is to maintain / improve the performance of MemBrain-seg and maintain / improve easy access to it.
Therefore, I agree that it's important to make the weights & training data available and get the documentation up-to-date. Also, I think it would be nice to have your points regarding API and on-the-fly preprocessing.

Regarding renaming to something different: In my opinion, it would be nice to keep the repository name and usability as it is: People already use it for segmenting the membranes and I also like the name membrain-seg, as it does belong to the MemBrain universe.

I think it would be rather nice to maintain the functionality of the repository as it is, and modularize it and outsource different functionalities. E.g., as we discussed, we might have a separate package for data augmentations, and outsource the preprocessing to libtilt?

@alisterburt
Member Author

alisterburt commented Feb 9, 2024

Hey @LorenzLamm

> Honestly, I'm quite buried in work at the moment, so I need to figure out what to best spend time on.

I understand and don't take any of these issues as specific pressure on you - a lot of the goal here is to get the package to a point where others (e.g. me, @rdrighetto, @kevinyamauchi, @kephale ) can super easily jump in at any point and help to maintain/improve/update the package 🙂

> For me, the highest priority is to maintain / improve the performance of MemBrain-seg and maintain / improve easy access to it.

Totally - getting the training data out in the open will be a big one for this. If any of us finds a tomogram for which the package isn't performing well, we can annotate it ourselves and add it to the dataset on Zenodo. The Zenodo community isn't quite full and direct community editing but it's a start at least!

> Regarding renaming to something different: In my opinion, it would be nice to keep the repository name and usability as it is: People already use it for segmenting the membranes and I also like the name membrain-seg, as it does belong to the MemBrain universe.

Hm, I thought on our last call we had ended up on the same page here... what we discussed was:

  • the teamtomo org is not a place for any specific group's projects; rather, it's for small, dependable bits of tooling which anyone feels they can call from their own package and pop in to maintain if they use it. starfile is a great example of this: 8 different people from all over have contributed code directly
  • MemBrain thus becomes its own package which doesn't live in the teamtomo GitHub org. MemBrain implements the CLI membrain with the subcommand segment; in this way the API is maintained entirely and users change nothing, except that installation becomes pip install membrain. Other subcommands stats and pick are also implemented there and you maintain complete control; everything user-facing goes through you/your group/the membrain universe
  • at the top of the README in this package we add a note pointing at the new home for MemBrain with a title like "looking for membrain-seg?"
  • once the new package membrain is in place, we transition here to something membrain-independent so that this small isolated piece is not explicitly linked to membrain, rather it was contributed to the community by membrain's developers and membrain depends on it like any other package would. In the same way that napari was given to the community and membrain will depend on it.
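
To make the layout in the bullets above concrete, here is a minimal sketch of a `membrain` CLI whose `segment` subcommand delegates to a separate backend while `pick` and `stats` stay inside the package. Everything beyond the names `membrain`, `segment`, `pick` and `stats` is an assumption for illustration: the `--tomogram-path` flag, the `segment_tomogram` function, and the returned messages are hypothetical placeholders, not the actual MemBrain API.

```python
import argparse


def segment_tomogram(tomogram_path: str) -> str:
    # Hypothetical stand-in for a call into the standalone,
    # membrain-independent segmentation package.
    return f"would segment {tomogram_path} via the standalone backend"


def build_parser() -> argparse.ArgumentParser:
    # Top-level `membrain` command with one subparser per subcommand.
    parser = argparse.ArgumentParser(prog="membrain")
    subparsers = parser.add_subparsers(dest="command", required=True)

    segment = subparsers.add_parser("segment", help="segment membranes")
    # Hypothetical flag name, chosen only for this sketch.
    segment.add_argument("--tomogram-path", required=True)

    # Subcommands that would live entirely inside the membrain package.
    for name in ("pick", "stats"):
        subparsers.add_parser(name, help=f"{name} (implemented in membrain)")
    return parser


def main(argv=None) -> str:
    args = build_parser().parse_args(argv)
    if args.command == "segment":
        # User-facing command stays in membrain; the work is delegated.
        return segment_tomogram(args.tomogram_path)
    return f"'{args.command}' would be implemented inside membrain itself"


if __name__ == "__main__":
    print(main())
```

With this shape, users keep typing `membrain segment ...` regardless of where the backend code lives, which is the point of the proposal: the user-facing entry point and the reusable component can be maintained separately.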

The exact order of operations here is unimportant and I totally understand that you would like to keep funneling users through the membrain universe and building that brand - I'm aware of the importance that you get appropriate recognition for your work and am explicitly not trying to take any of that away :-) this is alluded to on the teamtomo org page

> Development model
> Because packages in teamtomo are small and well scoped it is easy to depend on only those packages relevant to your work.
>
> teamtomo is not a place for the development of your own software projects. Your projects belong to you and the credit should be yours alone! Did you develop a small, reusable component along the way that could benefit the entire community? We would love to work with you to bring it here.

What I am concerned about is making sure that we are building things in a way that the community feels empowered to depend on them and maintain them moving forwards. For me, this means small, well scoped packages which are independent of any one lab's 'brand'. I think that tightly coupling the small, useful components we build to our larger, user facing, 'branded' academic projects inherently makes others feel like it is not their responsibility to help to maintain this nice infrastructure. Instead we think 'oh, we need to ask Lorenz to fix membrain-seg', 'Sjors needs to fix RELION', 'the cryosparc team need to fix my bug'. I really believe that this separation is important for helping us to build a healthy ecosystem where we can benefit from each other's work.

I'm really interested in hearing what the sticking point is for you with the context of everything I've said above - I am trying to figure out how we as a community can best do scientific software development and this is a perfect testbed, so I want to understand if/why this doesn't work for you. I'm sorry if this request is frustrating and genuinely thank you for your patience, I hope you now understand better why I've made it.

@rdrighetto I'd also be interested to hear your take on this development philosophy?

@kephale
Member

kephale commented Feb 11, 2024

Thanks @alisterburt! This is a clear perspective on the goals of teamtomo.

My interpretation is:

  • If I start a new project/collaboration, then I start the GH repo somewhere other than teamtomo.
  • If the project becomes a dependency for other folks (likely folks outside my org even), then I might ask folks about migrating it to teamtomo.
  • If while working on my project I find common code with other projects (again, likely outside my org), then consider making a teamtomo repository for the common code.

These types of community-first goals definitely beg for a governance model.

> Totally - getting the training data out in the open will be a big one for this, if any of us finds a tomogram for which the package isn't performing well we can annotate ourselves and add it to the data. It's not quite full community editing but it's a start at least!

This is definitely the intent behind the CZ CryoET Data Portal. If for any reason you think the plan for the data portal isn't aligning with community needs, then I expect @uermel would be happy to discuss.

@alisterburt
Member Author

Thanks @kephale ! Appreciate the comments and I think what you've said matches how I think about the project.

You're totally right about a governance model - I'd love to get to a point where others feel empowered and have a sense of ownership of the overall project. @LorenzLamm @kephale would you be interested in discussing/being part of this governance model?

@kephale
Member

kephale commented Feb 13, 2024

Sounds great!

I guess this discussion should move to somewhere else (TeamTomo Zulip?)

@rdrighetto
Contributor

Hi @alisterburt, thank you so much for raising all these important points!

I agree that the way things are going, MemBrain-seg has sort of "outgrown" teamtomo, or at least, is now beyond the scope of this organization (to use the GitHub term), exactly as you state here:

> MemBrain thus becomes its own package which doesn't live in the teamtomo GitHub org. MemBrain implements the CLI membrain with the subcommand segment; in this way the API is maintained entirely and users change nothing, except that installation becomes pip install membrain. Other subcommands stats and pick are also implemented there and you maintain complete control; everything user-facing goes through you/your group/the membrain universe

I think that was always the plan, but it seems now is the time to move the membrain-seg project to our lab's GitHub, where the old but main MemBrain project already lives, until both are merged eventually:
https://github.com/CellArchLab/MemBrain
(I'm not sure what the best strategy is for moving the project to a new home on GitHub in a way that doesn't break the membrain-seg pip package, so any advice on that would be appreciated!)

@LorenzLamm can then continue the development there, aiming at making MemBrain v2 the cohesive, user-friendly package with all 3 modules (segment, pick, stats) that we envision. Of course, everyone who has contributed so far, as well as new contributors, will always be welcome there! We all need to consider that @LorenzLamm is now facing the end of his PhD and who knows what great adventures await him after that. So his priority at the moment is really to push the "MemBrain universe" as far as possible, catering to our lab's own research needs which motivated the creation of MemBrain in the first place, but also making it a useful tool for others whenever possible.

Then we come to this point:

> once the new package membrain is in place, we transition here to something membrain-independent so that this small isolated piece is not explicitly linked to membrain, rather it was contributed to the community by membrain's developers and membrain depends on it like any other package would. In the same way that napari was given to the community and membrain will depend on it.

Sounds good and we're cool with it in principle, but I don't know exactly what parts of the membrain-seg "backend" would make sense to be hosted here under teamtomo org. Things like #56 do not sound like a "small isolated piece" but rather membrain-seg itself, just with a different name? Because to achieve what you propose here, a lot goes on in the background, and if you plan to host all that here, then in the end it's just the repo renamed? Please forgive me if I'm missing something.

To summarize, we understand that we should now host this repo under our lab, and modularize the most interesting bits and pieces (like the pre-processing code) to be hosted here and call them in MemBrain(-seg). We just don't know what these pieces are exactly 🙃

Thanks for your feedback, happy to discuss more!

@LorenzLamm
Collaborator

Hey @alisterburt,

Thanks also from my side for your detailed explanations.
I agree that the teamtomo GitHub may not be the ideal place anymore for MemBrain(-seg), as it is a software package in its development phase. So, MemBrain should probably live in the CellArchLab GitHub.

I want to emphasize that we do have aligned interests here, in that we want to make this whole package available to the community not only to use, but also to contribute to. Many have already contributed by providing patches for our training dataset, but on the coding side, too, everyone should of course be welcome to contribute. I'm not sure if the threshold for contribution is much higher if a repository is associated with a publication, but we definitely want to encourage users to contribute.

However, I feel that a membraneer package that is called by MemBrain-seg could also lead to a lot of confusion and poorly defined use cases. In my opinion, it would make more sense to leave all the membrane segmentation-specific utilities in the MemBrain-seg package, and potentially outsource e.g. the U-Net training, the data augmentations, or the preprocessing into different modules that can easily be imported by other packages.

Practically, what would you suggest? Would it make sense that we move this entire repository to the CellArchLab GitHub, and then extract modules out of it?

@LorenzLamm
Collaborator

Also: There is a #teamtomo Zulip? @rdrighetto and I would be happy to join for further discussion :)

@kephale
Member

kephale commented Feb 13, 2024

@alisterburt
Member Author

Sorry for the delay responding here, had a great chat offline with @LorenzLamm and @rdrighetto - should have a path forwards for this repo from early next week ☺️

4 participants