From 74ff1cd632fac33388b72347aebaf1b4d9e5e478 Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Mon, 24 Mar 2025 02:31:25 +0000 Subject: [PATCH 1/7] create the application file --- ...for Classifier Network (Larissa Poghosyan) | 87 +++++++++++++++++++ 1 file changed, 87 insertions(+) create mode 100644 GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) new file mode 100644 index 00000000..9013803c --- /dev/null +++ b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) @@ -0,0 +1,87 @@ +# Application template + +> [!CAUTION] +> Do not edit this template directly! +> Instead use it to open a new PR as explained in the [README](../README.md#steps). + + +Please use the following template to submit your application to the NIU GSoC 2025 program, and to discuss your proposal with the community. + +The more closely you follow this template, the easier it will be for us to review your application! Please include clear headings for all the different sections. + +## Project title +Follow the following format for the proposal title: `: ()` and provide it in your pull request as a new markdown file of the same name, i.e. `: ().md` + +E.g. "movement: support for Kalman filters (Jane Doe)". + +Please use the same title when you submit your proposal to the GSoC application site! 
+ +## Personal details +Please include the following information: +- **Full name** (include preferred name if desired) +- **Email** +- **GitHub username** +- **Zulip username** +- **Location & time-zone** +- **Personal website / project portfolio** (optional) +- **Code contribution** + + Please link a pull request, ideally submitted to your chosen project or one of the NIU tools. Applications without a code contribution won't be considered. It must be publicly visible and represent your own work, although you may have help from other developers in the community to further improve it. It must be meaningful code contribution (i.e. not just fixing a minor spelling mistake). While AI tools (such as Copilot etc) can be a very useful, contributions mostly created by AI are unlikely to be useful, and will not be accepted. You can link more than one pull request if desired. + +- **Proposal discussion link** + + Please link to the pull request where you discussed your project proposal with the community. + +## Project proposal +_Length: max 1 page_ + +- **Synopsis** + + Briefly explain: what is the project about? Why is it important? What are the goals? What are the deliverables? How would the open source community benefit from this project? + +- **Implementation timeline** + + Please include the following information: + 1. A bullet point list with **minimal set of deliverables** + 2. Additional **stretch goals** or "if time allows" deliverables (optional) + 3. A detailed **weekly timeline**: when do you plan to do what? + - Please use a week as a minimal unit of time, and include any planned vacations or other commitments. + - This timeline could be formatted as a table. + - Remember to also include the number of hours per week you plan to work on the GSoC project. + - When estimating the required time for a task, keep in mind deliverables should include investigation/research, coding and documentation. 
+ - The default schedule for GSoC is 12 weeks - see the [GSoC timeline](https://developers.google.com/open-source/gsoc/timeline) for precise dates. + - Also please specify any prep work you plan to do during the "Community bonding period". + - Usually week 1's deliverables already include some code. Week 6 marks the mid-term point, where usually more than half of the project should be completed. At the end of week 11 you may want to try to "freeze" the code and complete any remaining tests or documentation in weeks 11 and 12. + +- **Communication plan** + + Please explain: how do you plan to communicate with your mentor? How often? (e.g., daily or weekly stand-ups, longer meetings..?) What communication channels will you use? (e.g., video calls, Zulip chat...?) + +## Personal statement + +_Length: max 0.75 page_ + +- **Past experienc.** + + Please describe your past experience with programming, open source, or any other experience you deem relevant for the project you are applying for. Any successful open source projects, published work or content of the like should definitely be highlighted. +- **Motivation: why this project?** + + Why are you interested in this specific project? What aspects of it motivate you to work on it? How does it link to your personal or professional interests? How do you envision its impact in the open source community? +- **Match: why you?** + + Why should we choose you for this project? What unique skills or experiences can you bring to the project and the community? Is there something you have worked on in the past that makes you particularly well-suited for this project? +- **Availability** + + Please state if you have any other plans for the work period (school work, another job, planned vacation)? If so, how do you plan to combine them with your GSoC work? + +## GSoC + +_Length: max 0.25 page_ + +- **GSoC experience** + + What do you expect from the program? 
+ +- **Are you also applying to projects with other organisations in GSoC 2025?** + + If so, which ones? What would be your preference in case of a tie? From efc6c35362a21ddd822213cdab9752da3e2a299f Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Mon, 24 Mar 2025 02:56:40 +0000 Subject: [PATCH 2/7] add most of the sections --- ...for Classifier Network (Larissa Poghosyan) | 89 +++++++++++-------- 1 file changed, 53 insertions(+), 36 deletions(-) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) index 9013803c..8e1c52e2 100644 --- a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) +++ b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) @@ -1,61 +1,77 @@ # Application template -> [!CAUTION] -> Do not edit this template directly! -> Instead use it to open a new PR as explained in the [README](../README.md#steps). - - Please use the following template to submit your application to the NIU GSoC 2025 program, and to discuss your proposal with the community. The more closely you follow this template, the easier it will be for us to review your application! Please include clear headings for all the different sections. -## Project title -Follow the following format for the proposal title: `: ()` and provide it in your pull request as a new markdown file of the same name, i.e. `: ().md` - -E.g. "movement: support for Kalman filters (Jane Doe)". +## cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) -Please use the same title when you submit your proposal to the GSoC application site! 
## Personal details Please include the following information: -- **Full name** (include preferred name if desired) -- **Email** -- **GitHub username** -- **Zulip username** -- **Location & time-zone** -- **Personal website / project portfolio** (optional) +- **Full name** Larissa Poghosyan +- **Email** larissa.poghosyan@gmail.com +- **GitHub username** larissapoghosyan +- **Zulip username** Larissa Poghosyan (User ID 891828) +- **Location & time-zone** London, UK (GMT+1 British Summer Time) +- **Personal website / project portfolio** https://github.com/larissapoghosyan - **Code contribution** - - Please link a pull request, ideally submitted to your chosen project or one of the NIU tools. Applications without a code contribution won't be considered. It must be publicly visible and represent your own work, although you may have help from other developers in the community to further improve it. It must be meaningful code contribution (i.e. not just fixing a minor spelling mistake). While AI tools (such as Copilot etc) can be a very useful, contributions mostly created by AI are unlikely to be useful, and will not be accepted. You can link more than one pull request if desired. + - https://github.com/brainglobe/cellfinder/pull/495 + + This Pull Request adds a significant part of the Vision + Transformers implementation for the classifier network in `cellfinder` project. It's completely compatible with the current abstractions of the library. - **Proposal discussion link** - - Please link to the pull request where you discussed your project proposal with the community. + - https://github.com/brainglobe/cellfinder/pull/495 + I used the same PR at the original repository to show the proof-of-concept implementation, list todos, and to invite the community to discussion. ## Project proposal _Length: max 1 page_ - **Synopsis** - Briefly explain: what is the project about? Why is it important? What are the goals? What are the deliverables? 
How would the open source community benefit from this project? + Accurate detection of labelled cells in whole mouse brain 3D microscopy images is a key step in understanding brain-wide neural circuits. While simple thresholding can be used to extract candidate cell locations, it typically yields a high false positive rate. + + Currently, `cellfinder` use this approach for candidate extraction, followed by a 3D ResNet-based classifier to distinguish true cells from artefacts using local image cuboids. However, recent advances in computer vision demonstrate that Vision Transformers (ViTs) outperform convolutional networks like ResNet in many biomedical image classification tasks. + + This project aims to replace cellfinder’s ResNet classifier with a ViT-based architecture, improving detection accuracy and robustness across brain regions without increasing computational cost. + + Additionally, it would be beneficial to implement the changes and new architectures with the same abstractions to ensure backward compatibility. This way, existing user workflows will remain unaffected, allowing seamless transitions to newer architectures and improved quality and performance. - **Implementation timeline** - Please include the following information: - 1. A bullet point list with **minimal set of deliverables** - 2. Additional **stretch goals** or "if time allows" deliverables (optional) - 3. A detailed **weekly timeline**: when do you plan to do what? - - Please use a week as a minimal unit of time, and include any planned vacations or other commitments. - - This timeline could be formatted as a table. - - Remember to also include the number of hours per week you plan to work on the GSoC project. - - When estimating the required time for a task, keep in mind deliverables should include investigation/research, coding and documentation. 
- - The default schedule for GSoC is 12 weeks - see the [GSoC timeline](https://developers.google.com/open-source/gsoc/timeline) for precise dates. - - Also please specify any prep work you plan to do during the "Community bonding period". - - Usually week 1's deliverables already include some code. Week 6 marks the mid-term point, where usually more than half of the project should be completed. At the end of week 11 you may want to try to "freeze" the code and complete any remaining tests or documentation in weeks 11 and 12. + - [x] Implement ViT in Keras + - [x] Make sure backward compatibility, and have unified abstractions for classifier network + - [ ] Performance testing on CPUs and GPUs + - [ ] Quantitative comparison with current ResNet classifier + - [ ] Changes in BrainGlobe repo using Cellfinder to expose the new functionality + - [ ] Add pre-train and fine-tune pattern + - [ ] Add support of pretrained backbone models (load backbone and continue fine-tune) + - [ ] Add unit tests to cover all the changes + - [ ] Add documentation + - [ ] Write a blogpost + - [ ] 2D support for ViT + + + | **Week** | **Phase** | **Dates** | **Deliverables** | + |---------------|-------------------|-------------------|--------------------| + | **Week 0** | COMMUNITY BONDING | May 8 - Jun 1 | - Refine the existing ViT implementation (current PR).
- Start testing on large-scale data.
- Finalize benchmarking plan and GPU setup.
- Monitor upstream PRs (e.g. #493) for compatibility. | + | **Week 1** | | Jun 2 - Jun 9 | - Add ViT model configs to the training pipeline.
- Set up experiment tracking (e.g. with Weights & Biases) and create a public project board.
- Launch full-scale training runs on GPU. | + | **Week 2** | | Jun 10 - Jun 16 | - Begin performance analysis on CPU vs GPU.
- Collect accuracy, loss, and training speed metrics.
- Compare with current ResNet model.
- Log results to experiment tracker. | + | **Week 3** | | Jun 17 - Jun 23 | - Add support for loading pretrained 3D ViT backbones.
- Implement fine-tuning mechanism.
- Test with public pretrained ViTs (if available). | + | **Week 4** | | Jun 24 - Jun 30 | - Modularize architecture selection in CLI, Python API, and Napari plugin.
- Continue benchmarking on multiple datasets.
- Polish training scripts/configs. | + | **Week 5** | PRE-MIDTERM PREP | Jul 1 - Jul 7 | - Finalize and analyze benchmarking results (ViT vs ResNet).
- Sync with mentors for midterm prep.
- Address any issues in API/plugin integration. | + | **Week 6** | MIDTERM | Jul 8 - Jul 14 | - Submit midterm evaluation.
- Share benchmarking summary and gather feedback.
- Plan next steps with mentor input. | + | **Week 7** | | Jul 15 - Jul 21 | - Write unit tests for ViT classifier pipeline.
- Ensure compatibility with data loader refactor (`#493`).
- Begin writing user/developer documentation. | + | **Week 8** | | Jul 22 - Jul 28 | - Finalize documentation (training, fine-tuning, CLI usage).
- Add lightweight support for 2D classification mode.
- Validate on small 2D image slices. | + | **Week 9** | | Jul 29 - Aug 4 | - Add support for advanced ViT variants (DeiT, DINO, Swin).
- Integrate backbone selection cleanly into training pipeline.
- Run test training on at least one variant. | + | **Week 10** | | Aug 5 - Aug 11 | - Complete variant model testing and collect comparison results.
- Final cleanup of codebase, refactor and organize files.
- Validate full test coverage and model switching. | + | **Week 11** | | Aug 12 - Aug 18 | - **Code freeze**: finalize codebase for submission.
- Write and publish final blog post summarizing work, results, and next steps. | + | **Week 12** | FINAL WEEK | Aug 19 - Sep 1 | - Submit final GSoC report and code.
- Review and respond to mentor feedback.
- Ensure documentation is contributor-friendly and future-ready. | - **Communication plan** - Please explain: how do you plan to communicate with your mentor? How often? (e.g., daily or weekly stand-ups, longer meetings..?) What communication channels will you use? (e.g., video calls, Zulip chat...?) + I’d prefer to have **weekly video calls (around 20–30 minutes)** with my mentors to check in on progress, ask questions, and plan next steps. I think this cadence works well, but I’d be genuinely happy to meet more often if the mentors are available and open to it. For day-to-day communication and quick feedback, I’ll use Zulip, where I’ll stay active and responsive throughout the program. ## Personal statement @@ -72,7 +88,7 @@ _Length: max 0.75 page_ Why should we choose you for this project? What unique skills or experiences can you bring to the project and the community? Is there something you have worked on in the past that makes you particularly well-suited for this project? - **Availability** - Please state if you have any other plans for the work period (school work, another job, planned vacation)? If so, how do you plan to combine them with your GSoC work? + I am fully available from **May 1 to October 1** with no academic or professional commitments during this time. I plan to dedicate **20–25 hours per week** throughout the program, in line with the **175-hour project** requirement. I’m also flexible to put in extra hours if needed to meet deadlines or polish the final deliverables. ## GSoC @@ -80,8 +96,9 @@ _Length: max 0.25 page_ - **GSoC experience** - What do you expect from the program? + I’m excited about GSoC as a way to contribute to meaningful open-source work and become part of a global community. Working on Cellfinder will allow me to collaborate closely with mentors and researchers, while learning how software can drive scientific discovery. - **Are you also applying to projects with other organisations in GSoC 2025?** - If so, which ones? 
What would be your preference in case of a tie? + I initially considered several organizations and explored PyDataStructs by Open Science Labs, where I authored a pull request and made some changes. However, I ultimately decided to focus solely on the Cellfinder project to ensure I have meaningful contributions and a strong proof of concept before applying. + I will not be applying elsewhere within GSoC program this year. From 05c2930ddaaa8bc5c0b7cfa2eaa34383372efbc6 Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Mon, 24 Mar 2025 02:59:52 +0000 Subject: [PATCH 3/7] Rename to md --- ...r Architectures for Classifier Network (Larissa Poghosyan).md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename GSoC-2025/{cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) => cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md} (100%) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md similarity index 100% rename from GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) rename to GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md From 17a9a3efba956914b4cbb7c05b07a764c05118f4 Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Mon, 24 Mar 2025 03:01:40 +0000 Subject: [PATCH 4/7] reformat --- ...tures for Classifier Network (Larissa Poghosyan).md | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md index 8e1c52e2..2c511a49 100644 
--- a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md +++ b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md @@ -1,11 +1,4 @@ -# Application template - -Please use the following template to submit your application to the NIU GSoC 2025 program, and to discuss your proposal with the community. - -The more closely you follow this template, the easier it will be for us to review your application! Please include clear headings for all the different sections. - -## cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) - +# cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) ## Personal details Please include the following information: @@ -52,6 +45,7 @@ _Length: max 1 page_ - [ ] Write a blogpost - [ ] 2D support for ViT + --- | **Week** | **Phase** | **Dates** | **Deliverables** | |---------------|-------------------|-------------------|--------------------| From d0dca524ccc067dcf0c38cbe0ca4a0250a1f3617 Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Fri, 28 Mar 2025 22:39:10 +0000 Subject: [PATCH 5/7] Update cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md --- ... 
Classifier Network (Larissa Poghosyan).md | 90 +++++++++---------- 1 file changed, 41 insertions(+), 49 deletions(-) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md index 2c511a49..d90a4c8a 100644 --- a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md +++ b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md @@ -1,7 +1,6 @@ -# cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan) +# cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan)
## Personal details -Please include the following information: - **Full name** Larissa Poghosyan - **Email** larissa.poghosyan@gmail.com - **GitHub username** larissapoghosyan @@ -16,30 +15,25 @@ Please include the following information: - **Proposal discussion link** - https://github.com/brainglobe/cellfinder/pull/495 - I used the same PR at the original repository to show the proof-of-concept implementation, list todos, and to invite the community to discussion. -## Project proposal -_Length: max 1 page_ + I used the same PR in the original repository to present the proof-of-concept implementation, list todos, and invite the community to discussion. + This PR also presents a structured study, with references to existing codebases, related literature, and open questions for further exploration. +## Project proposal - **Synopsis** - Accurate detection of labelled cells in whole mouse brain 3D microscopy images is a key step in understanding brain-wide neural circuits. While simple thresholding can be used to extract candidate cell locations, it typically yields a high false positive rate. +Accurate detection of labeled cells in whole-mouse-brain 3D microscopy images is essential for understanding brain-wide neural circuits. While simple thresholding can extract cell locations, it often results in a high false positive rate.
+To address this challenge, cellfinder first extracts candidate locations using the same approach, and then employs a 3D ResNet-based classifier to distinguish true cells from artifacts using local image cuboids.
+This project aims to improve detection accuracy and robustness without increasing computational cost, by upgrading cellfinder’s classifier with Vision Transformers (ViTs), drawing on recent studies that demonstrate ViTs outperform convolutional networks in biomedical image classification tasks. - Currently, `cellfinder` use this approach for candidate extraction, followed by a 3D ResNet-based classifier to distinguish true cells from artefacts using local image cuboids. However, recent advances in computer vision demonstrate that Vision Transformers (ViTs) outperform convolutional networks like ResNet in many biomedical image classification tasks. - - This project aims to replace cellfinder’s ResNet classifier with a ViT-based architecture, improving detection accuracy and robustness across brain regions without increasing computational cost. - - Additionally, it would be beneficial to implement the changes and new architectures with the same abstractions to ensure backward compatibility. This way, existing user workflows will remain unaffected, allowing seamless transitions to newer architectures and improved quality and performance. 
- **Implementation timeline** - - [x] Implement ViT in Keras - - [x] Make sure backward compatibility, and have unified abstractions for classifier network + - [x] Implement ViT in Keras, while keeping the classifier network abstractions compatible + - [ ] Full-scale training, and quantitative comparison with current ResNet classifier - [ ] Performance testing on CPUs and GPUs - - [ ] Quantitative comparison with current ResNet classifier - [ ] Changes in BrainGlobe repo using Cellfinder to expose the new functionality - - [ ] Add pre-train and fine-tune pattern - - [ ] Add support of pretrained backbone models (load backbone and continue fine-tune) + - [ ] Add support of loading and fine-tuning pretrained backbone models - [ ] Add unit tests to cover all the changes - [ ] Add documentation - [ ] Write a blogpost @@ -47,52 +41,50 @@ _Length: max 1 page_ --- - | **Week** | **Phase** | **Dates** | **Deliverables** | - |---------------|-------------------|-------------------|--------------------| - | **Week 0** | COMMUNITY BONDING | May 8 - Jun 1 | - Refine the existing ViT implementation (current PR).
- Start testing on large-scale data.
- Finalize benchmarking plan and GPU setup.
- Monitor upstream PRs (e.g. #493) for compatibility. | - | **Week 1** | | Jun 2 - Jun 9 | - Add ViT model configs to the training pipeline.
- Set up experiment tracking (e.g. with Weights & Biases) and create a public project board.
- Launch full-scale training runs on GPU. | - | **Week 2** | | Jun 10 - Jun 16 | - Begin performance analysis on CPU vs GPU.
- Collect accuracy, loss, and training speed metrics.
- Compare with current ResNet model.
- Log results to experiment tracker. | - | **Week 3** | | Jun 17 - Jun 23 | - Add support for loading pretrained 3D ViT backbones.
- Implement fine-tuning mechanism.
- Test with public pretrained ViTs (if available). | - | **Week 4** | | Jun 24 - Jun 30 | - Modularize architecture selection in CLI, Python API, and Napari plugin.
- Continue benchmarking on multiple datasets.
- Polish training scripts/configs. | - | **Week 5** | PRE-MIDTERM PREP | Jul 1 - Jul 7 | - Finalize and analyze benchmarking results (ViT vs ResNet).
- Sync with mentors for midterm prep.
- Address any issues in API/plugin integration. | - | **Week 6** | MIDTERM | Jul 8 - Jul 14 | - Submit midterm evaluation.
- Share benchmarking summary and gather feedback.
- Plan next steps with mentor input. | - | **Week 7** | | Jul 15 - Jul 21 | - Write unit tests for ViT classifier pipeline.
- Ensure compatibility with data loader refactor (`#493`).
- Begin writing user/developer documentation. | - | **Week 8** | | Jul 22 - Jul 28 | - Finalize documentation (training, fine-tuning, CLI usage).
- Add lightweight support for 2D classification mode.
- Validate on small 2D image slices. | - | **Week 9** | | Jul 29 - Aug 4 | - Add support for advanced ViT variants (DeiT, DINO, Swin).
- Integrate backbone selection cleanly into training pipeline.
- Run test training on at least one variant. | - | **Week 10** | | Aug 5 - Aug 11 | - Complete variant model testing and collect comparison results.
- Final cleanup of codebase, refactor and organize files.
- Validate full test coverage and model switching. | - | **Week 11** | | Aug 12 - Aug 18 | - **Code freeze**: finalize codebase for submission.
- Write and publish final blog post summarizing work, results, and next steps. | - | **Week 12** | FINAL WEEK | Aug 19 - Sep 1 | - Submit final GSoC report and code.
- Review and respond to mentor feedback.
- Ensure documentation is contributor-friendly and future-ready. | - +| **Week** | **Dates** | **Deliverables** | +|---------------|-------------------|--------------------| +| **Week 0**
COMMUNITY BONDING | May 8 - Jun 1 | - Refine the existing ViT implementation (current PR).
- Start testing on large-scale data.
- Finalize benchmarking plan and GPU setup.
- Monitor upstream PRs (e.g. #493) for compatibility. | +| **Week 1** | Jun 2 - Jun 9 | - Add ViT model configs to the training pipeline.
- Set up experiment tracking (e.g. with Weights & Biases) and create a public project board.
- Launch full-scale training runs on GPU. | +| **Week 2** | Jun 10 - Jun 16 | - Begin performance analysis on CPU vs GPU.
- Collect accuracy, loss, and training speed metrics.
- Compare with current ResNet model.
- Log results to experiment tracker. | +| **Week 3** | Jun 17 - Jun 23 | - Add support for loading pretrained 3D ViT backbones.
- Implement fine-tuning mechanism.
- Test with public pretrained ViTs (if available). | +| **Week 4** | Jun 24 - Jun 30 | - Modularize architecture selection in CLI, Python API, and Napari plugin.
- Continue benchmarking on multiple datasets.
- Polish training scripts/configs. | +| **Week 5**
PRE-MIDTERM PREP | Jul 1 - Jul 7 | - Finalize and analyze benchmarking results (ViT vs ResNet).
- Sync with mentors for midterm prep.
- Address any issues in API/plugin integration. | +| **Week 6**
MIDTERM | Jul 8 - Jul 14 | - Submit midterm evaluation.
- Share benchmarking summary and gather feedback.
- Plan next steps with mentor input. | +| **Week 7** | Jul 15 - Jul 21 | - Write unit tests for ViT classifier pipeline.
- Ensure compatibility with data loader refactor (`#493`).
- Begin writing user/developer documentation. | +| **Week 8** | Jul 22 - Jul 28 | - Finalize documentation (training, fine-tuning, CLI usage).
- Add lightweight support for 2D classification mode.
- Validate on small 2D image slices. | +| **Week 9** | Jul 29 - Aug 4 | - Add support for advanced ViT variants (DeiT, DINO, Swin).
- Integrate backbone selection cleanly into training pipeline.
- Run test training on at least one variant. | +| **Week 10** | Aug 5 - Aug 11 | - Complete variant model testing and collect comparison results.
- Final cleanup of codebase, refactor and organize files.
- Validate full test coverage and model switching. | +| **Week 11** | Aug 12 - Aug 18 | - **Code freeze**: finalize codebase for submission.
- Write and publish final blog post summarizing work, results, and next steps. | +| **Week 12**
FINAL WEEK | Aug 19 - Sep 1 | - Submit final GSoC report and code.
- Review and respond to mentor feedback.
- Ensure documentation is contributor-friendly and future-ready. | - **Communication plan** - - I’d prefer to have **weekly video calls (around 20–30 minutes)** with my mentors to check in on progress, ask questions, and plan next steps. I think this cadence works well, but I’d be genuinely happy to meet more often if the mentors are available and open to it. For day-to-day communication and quick feedback, I’ll use Zulip, where I’ll stay active and responsive throughout the program. - + + I’d prefer **weekly video calls (around 20-30 minutes)** with my mentors to discuss progress and plan next steps. For quick feedback, I’ll stay active on **Zulip**. I plan to work **4-5 hours daily** (around **25-35 hours per week**), to meet the project time requirements. +--- ## Personal statement +- **Past experience** + + **Academic Experience:** I hold a degree in Data Science, with a focus on mathematics, machine learning, and data analytics. My thesis explored the use of Natural Language Processing (NLP) and Machine Learning, specifically analyzing the effectiveness of pre-trained models in real-world applications. Additionally, I was part of the founding cohort of a Data Science club at my university, where we collaborated on projects that benefited both the university and the wider community. -_Length: max 0.75 page_ - -- **Past experienc.** + **Technical Experience:** During my studies, I interned at Metric as an ML/AI intern, where I developed NLP solutions for text classification and sentiment analysis using Python, scikit-learn, Keras, PyTorch and TensorFlow. I collaborated on data preprocessing, feature engineering, and model optimization to enhance text processing efficiency. After that, I joined VXsoft as a junior developer, working on government projects to develop a secure document management system. I collaborated with senior developers on database design and user interface development, improving document storage, retrieval, and user controls. 
This experience strengthened my skills in communication, project management, and secure software development while giving me hands-on exposure to the full application lifecycle. - Please describe your past experience with programming, open source, or any other experience you deem relevant for the project you are applying for. Any successful open source projects, published work or content of the like should definitely be highlighted. - **Motivation: why this project?** - Why are you interested in this specific project? What aspects of it motivate you to work on it? How does it link to your personal or professional interests? How do you envision its impact in the open source community? + The BrainGlobe initiative and Cellfinder caught my attention for their contributions to neuroinformatics and ML-driven research automation. I’m passionate about neuroinformatics and excited to apply my skills to this impactful project. Having worked with transformers, I’m eager to help optimize the current implementation and develop scalable, reliable solutions for researchers. I also value the open-source nature of the initiative, aligning with my desire to collaborate and create tools that benefit the scientific community. + - **Match: why you?** + + With a strong foundation in **machine learning**, **algorithms**, and **software development**, I have hands-on experience with frameworks like **PyTorch**, **TensorFlow**, and **Keras**. I prioritize writing **clean, maintainable code** and quickly adapt to new technologies to solve complex problems. Recently, I implemented the **proof-of-concept implementation** of a **Vision Transformer (ViT)** model for cell classification in **Cellfinder**, ensuring backward compatibility with the existing codebase. This project aligns with my passion for applying ML to advance scientific research, and I am excited to collaborate, contribute my expertise, and make a meaningful impact while continuing to grow professionally. 
- Why should we choose you for this project? What unique skills or experiences can you bring to the project and the community? Is there something you have worked on in the past that makes you particularly well-suited for this project? - **Availability** - - I am fully available from **May 1 to October 1** with no academic or professional commitments during this time. I plan to dedicate **20–25 hours per week** throughout the program, in line with the **175-hour project** requirement. I’m also flexible to put in extra hours if needed to meet deadlines or polish the final deliverables. + + I am available from **May 1 to October 1**, with no academic or professional commitments. I plan to dedicate **25–35 hours per week** to meet the project requirement, and I can put in extra hours as needed to meet deadlines or refine deliverables. ## GSoC - -_Length: max 0.25 page_ - - **GSoC experience** - - I’m excited about GSoC as a way to contribute to meaningful open-source work and become part of a global community. Working on Cellfinder will allow me to collaborate closely with mentors and researchers, while learning how software can drive scientific discovery. + + I’m excited about GSoC as a way to contribute to meaningful **open-source work** and become part of a **global community**. Working on **Cellfinder** will allow me to collaborate closely with mentors and researchers, while learning how software can drive scientific discovery. - **Are you also applying to projects with other organisations in GSoC 2025?** - + I initially considered several organizations and explored PyDataStructs by Open Science Labs, where I authored a pull request and made some changes. However, I ultimately decided to focus solely on the Cellfinder project so that I would have meaningful contributions and a strong proof of concept before applying. I will not be applying elsewhere within the GSoC program this year.
From 0f35c60391e5bc97c316fd4fdc7a433eeaf3d743 Mon Sep 17 00:00:00 2001 From: Larissa Poghosyan <43134338+larissapoghosyan@users.noreply.github.com> Date: Tue, 8 Apr 2025 12:23:55 +0100 Subject: [PATCH 6/7] update application --- ... Classifier Network (Larissa Poghosyan).md | 38 ++++++++++--------- 1 file changed, 20 insertions(+), 18 deletions(-) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md index d90a4c8a..5a00387b 100644 --- a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md +++ b/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md @@ -8,16 +8,16 @@ - **Location & time-zone** London, UK (GMT+1 British Summer Time) - **Personal website / project portfolio** https://github.com/larissapoghosyan - **Code contribution** + - https://github.com/aimhubio/aim/pull/3319 + - https://github.com/codezonediitj/pydatastructs/pull/595 - https://github.com/brainglobe/cellfinder/pull/495 - - This Pull Request adds a significant part of the Vision - Transformers implementation for the classifier network in `cellfinder` project. It's completely compatible with the current abstractions of the library. + + This PR presents a functional proof-of-concept Vision Transformer classifier, fully compatible with the existing cellfinder architecture. - **Proposal discussion link** - https://github.com/brainglobe/cellfinder/pull/495 - I used the same PR in the original repository to present the proof-of-concept implementation, list todos, and invite the community to discussion. - This PR also presents a structured study, with references to existing codebases, related literature, and open questions for further exploration.
+ This PR presents the PoC implementation, references to related codebases and literature, and invites community discussion. ## Project proposal - **Synopsis** @@ -29,30 +29,32 @@ This project aims to improve detection accuracy and robustness without increasin - **Implementation timeline** - - [x] Implement ViT in Keras, while keeping the classifier network abstractions compatible - - [ ] Full-scale training, and quantitative comparison with current ResNet classifier - - [ ] Performance testing on CPUs and GPUs - - [ ] Changes in BrainGlobe repo using Cellfinder to expose the new functionality - - [ ] Add support of loading and fine-tuning pretrained backbone models - - [ ] Add unit tests to cover all the changes - - [ ] Add documentation - - [ ] Write a blogpost - - [ ] 2D support for ViT + - Implement ViT in Keras, while keeping the classifier network abstractions compatible + - Full-scale training, and quantitative comparison with the current ResNet classifier + - Performance testing on CPUs and GPUs + - Add changes to relevant BrainGlobe repos (e.g., [brainglobe-workflows](https://github.com/brainglobe/brainglobe-workflows/blob/main/brainglobe_workflows/brainmapper/main.py)) to integrate the new functionality into the full workflow. + - Add support for loading and fine-tuning pretrained backbone models + - Add unit tests to cover all the changes + - Add documentation + - Write a blog post + +- **Stretch goal:** + - Add support for 2D ViT, if the necessary framework and data adjustments for 2D cell detection are completed in time. ---
 | **Week** | **Dates** | **Deliverables** |
 |---------------|-------------------|--------------------|
 | **Week 0**<br>COMMUNITY BONDING | May 8 - Jun 1 | - Refine the existing ViT implementation (current PR).<br>- Start testing on large-scale data.<br>- Finalize benchmarking plan and GPU setup.<br>- Monitor upstream PRs (e.g. #493) for compatibility. |
-| **Week 1** | Jun 2 - Jun 9 | - Add ViT model configs to the training pipeline.<br>- Set up experiment tracking (e.g. with Weights & Biases) and create a public project board.<br>- Launch full-scale training runs on GPU. |
+| **Week 1** | Jun 2 - Jun 9 | - Add ViT model configs to the training pipeline.<br>- Set up experiment tracking (e.g. with Weights & Biases) and create a public project board.<br>- Launch full-scale training runs on GPU.<br>- Implement changes in [cellfinder_train CLI](https://github.com/brainglobe/cellfinder/blob/main/cellfinder/core/train/train_yaml.py) to enable the ViT classifier. |
 | **Week 2** | Jun 10 - Jun 16 | - Begin performance analysis on CPU vs GPU.<br>- Collect accuracy, loss, and training speed metrics.<br>- Compare with current ResNet model.<br>- Log results to experiment tracker. |
 | **Week 3** | Jun 17 - Jun 23 | - Add support for loading pretrained 3D ViT backbones.<br>- Implement fine-tuning mechanism.<br>- Test with public pretrained ViTs (if available). |
-| **Week 4** | Jun 24 - Jun 30 | - Modularize architecture selection in CLI, Python API, and Napari plugin.<br>- Continue benchmarking on multiple datasets.<br>- Polish training scripts/configs. |
+| **Week 4** | Jun 24 - Jun 30 | - Ensure compatibility of external repositories (cellfinder_download, brainmapper) with the new changes.<br>- Modularize architecture selection and Python API.<br>- Continue benchmarking on multiple datasets.<br>- Polish training scripts/configs. |
 | **Week 5**<br>PRE-MIDTERM PREP | Jul 1 - Jul 7 | - Finalize and analyze benchmarking results (ViT vs ResNet).<br>- Sync with mentors for midterm prep.<br>- Address any issues in API/plugin integration. |
 | **Week 6**<br>MIDTERM | Jul 8 - Jul 14 | - Submit midterm evaluation.<br>- Share benchmarking summary and gather feedback.<br>- Plan next steps with mentor input. |
 | **Week 7** | Jul 15 - Jul 21 | - Write unit tests for ViT classifier pipeline.<br>- Ensure compatibility with data loader refactor (`#493`).<br>- Begin writing user/developer documentation. |
-| **Week 8** | Jul 22 - Jul 28 | - Finalize documentation (training, fine-tuning, CLI usage).<br>- Add lightweight support for 2D classification mode.<br>- Validate on small 2D image slices. |
-| **Week 9** | Jul 29 - Aug 4 | - Add support for advanced ViT variants (DeiT, DINO, Swin).<br>- Integrate backbone selection cleanly into training pipeline.<br>- Run test training on at least one variant. |
+| **Week 8** | Jul 22 - Jul 28 | - Finalize documentation (training, fine-tuning, CLI usage).<br>- Add lightweight support for 2D classification mode (Stretch Goal)<br>- Implement necessary changes in the Napari plugin to support the ViT classifier. |
+| **Week 9** | Jul 29 - Aug 4 | - Add support for advanced ViT variants (DeiT, DINO, Swin).<br>- Integrate backbone selection cleanly into training pipeline.<br>- Run test training on at least one variant.<br>- Validate on small 2D image slices (Stretch Goal) |
 | **Week 10** | Aug 5 - Aug 11 | - Complete variant model testing and collect comparison results.<br>- Final cleanup of codebase, refactor and organize files.<br>- Validate full test coverage and model switching. |
 | **Week 11** | Aug 12 - Aug 18 | - **Code freeze**: finalize codebase for submission.<br>- Write and publish final blog post summarizing work, results, and next steps. |
 | **Week 12**<br>FINAL WEEK | Aug 19 - Sep 1 | - Submit final GSoC report and code.<br>- Review and respond to mentor feedback.<br>- Ensure documentation is contributor-friendly and future-ready. |
From d193a7774576898b8eeeb96e1450f137f0541b65 Mon Sep 17 00:00:00 2001 From: Adam Tyson Date: Tue, 8 Apr 2025 12:34:29 +0100 Subject: [PATCH 7/7] Rename cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md to cellfinder-Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md --- ...r Architectures for Classifier Network (Larissa Poghosyan).md} | 0 1 file changed, 0 insertions(+), 0 deletions(-) rename GSoC-2025/{cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md => cellfinder-Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md} (100%) diff --git a/GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md b/GSoC-2025/cellfinder-Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md similarity index 100% rename from GSoC-2025/cellfinder: Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md rename to GSoC-2025/cellfinder-Exploring Newer Architectures for Classifier Network (Larissa Poghosyan).md