Add consistent parameter docs to the main functions and napari #542

matham · 2025-06-07T21:07:18Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

Why is this PR needed?

Right now we pass the same parameters from napari to various functions, but these parameters have inconsistent or no docs.

What does this PR do?

Docs
1. Adds similarly worded docs for parameters to all main functions, napari, and setup_filters.
2. Makes the docs a bit more complete and readable.
3. Reordered the napari detection parameters so they follow the order in which filtering happens, i.e. 2d filtering -> 3d filtering -> cluster splitting.
A few parameters (voxel_sizes, n_sds_above_mean_thresh in napari) were set as ints, but they should be floats.
Split parameters
1. Exposed the split_ parameters in all the main functions, but not in napari.
2. The normal parameters are all in microns, but the split_ parameters were in pixels. I changed them to microns as well, by multiplying the pixel values by the default voxel size, so the effective default values are unchanged.
3. Exposed n_splitting_iter with same default to the main functions. Previously it was hard coded in setup_filters.
4. Removed split_soma_diameter because it was redundant with soma_spread_factor and not used. AFAICT, soma_diameter is only used during 2d filtering, not during 3d filtering. Splitting only uses 3d filtering. The only place during splitting split_soma_diameter would be used is to calculate the volume of a cell to decide whether we need to split it. But, we already have soma_spread_factor that multiplies soma_diameter not split_soma_diameter.
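A minimal sketch of the unit change described in item 2, assuming a simple rounding conversion from a size in microns to a size in voxels. The helper name `microns_to_voxels` and the example values are hypothetical; this is not cellfinder's actual code or its defaults.

```python
def microns_to_voxels(size_um: float, voxel_size_um: float) -> int:
    """Convert a physical size in microns to a whole number of voxels."""
    if voxel_size_um <= 0:
        raise ValueError("voxel size must be positive")
    # Round to the nearest voxel, but never return fewer than one voxel.
    return max(1, round(size_um / voxel_size_um))


# Hypothetical values: a 6 um in-plane ball on 2 um voxels spans 3 voxels.
split_ball_xy_voxels = microns_to_voxels(6.0, 2.0)
# The same conversion along z with 5 um voxels: 15 um spans 3 voxels.
split_ball_z_voxels = microns_to_voxels(15.0, 5.0)
```

Expressing the split_ parameters in microns means the same values stay meaningful when the voxel size of the data changes.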
Split batch_size into detection_batch_size and classification_batch_size. The former defaults to 1 in napari and None in main, where None means 1 for GPU and 4 for CPU. The latter's default is unchanged. E.g. on my machine with very large planes, a batch of just 1 detection plane fits in the 10GB GPU, whereas 128+ classification cubes fit.
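The default behaviour described above for detection_batch_size can be sketched as follows. The helper name `resolve_detection_batch_size` and the `using_gpu` flag are hypothetical and only illustrate the "None means 1 for GPU and 4 for CPU" rule; the real logic in cellfinder may be organised differently.

```python
from typing import Optional


def resolve_detection_batch_size(
    detection_batch_size: Optional[int], using_gpu: bool
) -> int:
    """Pick the number of planes to process per detection batch."""
    if detection_batch_size is None:
        # No explicit value: fall back to the device-dependent default.
        return 1 if using_gpu else 4
    if detection_batch_size < 1:
        raise ValueError("detection_batch_size must be at least 1")
    # An explicit value always wins over the device-dependent default.
    return detection_batch_size


gpu_default = resolve_detection_batch_size(None, using_gpu=True)
cpu_default = resolve_detection_batch_size(None, using_gpu=False)
explicit = resolve_detection_batch_size(2, using_gpu=True)
```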

References

#532 and #464 (comment).

How has this PR been tested?

Locally and with data.

Is this a breaking change?

Only the batch_size parameter name is renamed. Everything else is effectively the same.

Does this PR require an update to the documentation?

No.

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality (unit & integration)
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

matham · 2025-06-07T23:02:09Z

I'm not sure why the codecov check is failing?

alessandrofelder

I am a big fan of the consistency introduced here - thanks a lot @matham

I have followed the changes and think they make sense. I will run a sanity check on local data overnight, but assuming that passes and you respond to my minor comments/suggestions, this is ready to be merged

PS

I'm not sure why the codecov check is failing?

Maybe because we're adding extra lines through formatting etc (and maybe docstrings too?)... too small a change to warrant deeper investigation, I think, especially since patch coverage is 100%

alessandrofelder · 2025-06-09T10:15:29Z

cellfinder/core/detect/detect.py

-        Maximum size of a cluster in physical units.
-
+        The expected in-plane (xy) soma diameter (microns).
+    max_cluster_size : int


Suggested change

max_cluster_size : int

max_cluster_size : float

I think, for consistency with typehint at least (don't think it matters)

Changed the sig to int, because it's int everywhere else and it's number of voxels so should be int.

it's number of voxels

So that means the docstring should say number of voxels instead of cubic um right?

Ahhh it is um so I changed it to float everywhere..

alessandrofelder · 2025-06-09T10:17:33Z

cellfinder/core/detect/detect.py

-        Size of the sigma for the log filter.
-
+        Gaussian filter width (as a fraction of soma diameter) used during
+        2d in-plane filtering.


Suggested change

2d in-plane filtering.

2d in-plane Laplacian of Gaussian filtering.

Nitpicky, but may help explain the variable name

alessandrofelder · 2025-06-09T10:21:17Z

cellfinder/core/detect/detect.py

+        for all the filters. Tune to maximize memory usage without running
+        out. Check your GPU/CPU memory to verify it's not full.


Suggested change

for all the filters. Tune to maximize memory usage without running

out. Check your GPU/CPU memory to verify it's not full.

for all the filters. For performance-critical applications, tune to maximize memory usage without running

out. Check your GPU/CPU memory to verify it's not full.

This suggestion is to make clear that tuning is not a requirement.

Also, should this default to 1, as suggested by the old docstring (but not the code 😅 )?

Added it everywhere. For GPU it does default to 1 in the code!?

alessandrofelder · 2025-06-09T10:22:27Z

cellfinder/core/detect/detect.py

    -------
    List[Cell]
-        List of detected cells.
+        List of detected potential cells and artifacts.


Suggested change

List of detected potential cells and artifacts.

List of detected cell candidates.

I think this is the wording we use in the paper and elsewhere.

alessandrofelder · 2025-06-09T10:24:21Z

cellfinder/core/detect/detect.py

+    split_ball_xy_size: int
+        Similar to `ball_xy_size`, except the value to use for the 3d
+        filter during cluster splitting.
+    split_ball_z_size: int
+        Similar to `ball_z_size`, except the value to use for the 3d filter
+        during cluster splitting.


Suggested change

split_ball_xy_size: int

Similar to `ball_xy_size`, except the value to use for the 3d

filter during cluster splitting.

split_ball_z_size: int

Similar to `ball_z_size`, except the value to use for the 3d filter

during cluster splitting.

split_ball_xy_size: float

Similar to `ball_xy_size`, except the value (in microns) to use for the 3d

filter during cluster splitting.

split_ball_z_size: float

Similar to `ball_z_size`, except the value (in microns) to use for the 3d filter

during cluster splitting.

Should now also be floats, as we use them as micron quantities now?

Fixed everywhere.

alessandrofelder · 2025-06-09T10:40:50Z

cellfinder/core/main.py

+    split_ball_xy_size: int
+        Similar to `ball_xy_size`, except the value to use for the 3d
+        filter during cluster splitting.
+    split_ball_z_size: int
+        Similar to `ball_z_size`, except the value to use for the 3d filter
+        during cluster splitting.


make these have float typehints too?

alessandrofelder · 2025-06-09T10:41:21Z

cellfinder/core/main.py

+    ball_xy_size : float
+        3d filter's in-plane (xy) filter ball size (microns).
+    ball_z_size : float
+        3d filter's axial (z) filter ball size (microns).


make typehints in function signature match these typehints?

alessandrofelder · 2025-06-09T10:42:56Z

cellfinder/napari/detect/detect.py

+            The expected in-plane (xy) soma diameter (microns)
+        log_sigma_size : float
+            Gaussian filter width (as a fraction of soma diameter) used during
+            2d in-plane filtering


Suggested change

2d in-plane filtering

2d in-plane Laplacian of Gaussian filtering

alessandrofelder · 2025-06-09T12:40:16Z

cellfinder/napari/detect/detect_containers.py

-            batch_size=dict(value=cls.defaults()["batch_size"]),
+            classification_batch_size=dict(
+                value=cls.defaults()["classification_batch_size"],
+                label="Batch size",


Suggested change

label="Batch size",

label="Batch size (classification)",

Helpful to be explicit?

alessandrofelder · 2025-06-09T12:45:09Z

cellfinder/core/classify/classify.py

+    max_workers: int
+        The number of sub-processes to use for data loading / processing.
+        Defaults to 8.
+    pin_memory: bool


presumably this will come in a follow up PR shortly? (it's not currently part of the function signature)

Too many PRs at once... It leaked over :D

matham · 2025-06-09T20:03:39Z

I think I addressed all of the comments, including in all the other places they would occur.

alessandrofelder

Thanks, @matham - I got delayed in doing my local tests on separate data (sorry!), just as a sanity check, but I am now happy with the code and will merge when I have done the local tests (assuming they go well, which they should)

matham · 2025-06-17T19:15:05Z

I moved the split parameters added to the main main function to the end, so as to make it less likely to conflict if someone called it using positional args only.

I'm wondering if we should make all of these main args, except the first few, keyword args only. So that we can change their order or add new ones in the future without compatibility issues. It's already risky calling them using positional input only, like is evidenced by brainglobe/brainglobe-workflows#150 where I converted them to keyword args. So maybe we should prevent it?
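For reference, a minimal sketch of what keyword-only arguments look like in Python. The function `run_detection` and its default values are hypothetical and only illustrate the idea; they are not the actual cellfinder signature.

```python
def run_detection(
    signal_array, voxel_sizes, *, soma_diameter=16.0, ball_xy_size=6.0
):
    """Everything after the bare `*` can only be passed by keyword."""
    return {"soma_diameter": soma_diameter, "ball_xy_size": ball_xy_size}


# Keyword use keeps working even if parameters are reordered or added later.
result = run_detection([[0]], (5.0, 2.0, 2.0), ball_xy_size=8.0)

# Passing a third positional argument is rejected with a TypeError.
try:
    run_detection([[0]], (5.0, 2.0, 2.0), 16.0)
except TypeError:
    positional_call_rejected = True
else:
    positional_call_rejected = False
```

With this pattern, only the first few arguments are bound by position, so new parameters can be inserted anywhere after the `*` without changing the meaning of existing calls.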

alessandrofelder · 2025-06-23T16:33:53Z

Thanks, @matham

All looks good to me now.

For our future selves, I did some one-off manual regression testing:

I confirm that I have run this code on our internal reference whole-brain dataset MS_cx_left and got 99.6% overlap of cell candidate centres with current main. Overlap was defined as less than 1 pixel Euclidean distance between centres, using brainglobe_utils.cells.match_cells (~270 non-overlapping cells out of >71'000).

I also confirm that I was able to run brainmapper to successfully find the same cell candidates, so we haven't broken the API with this PR.
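To illustrate the kind of overlap measure described above, here is a rough sketch that greedily pairs each reference centre with the nearest unused candidate centre closer than 1 pixel. It is a simplified stand-in written for illustration, not the algorithm used by brainglobe_utils.cells.match_cells; the function name `fraction_matched` and the sample points are hypothetical.

```python
import math


def fraction_matched(reference, candidates, max_distance=1.0):
    """Fraction of reference centres with a one-to-one match within range."""
    unused = list(candidates)
    matched = 0
    for ref in reference:
        best_index = None
        best_distance = max_distance
        for i, cand in enumerate(unused):
            distance = math.dist(ref, cand)
            # Strictly closer than the current best (initially max_distance).
            if distance < best_distance:
                best_index, best_distance = i, distance
        if best_index is not None:
            # Each candidate centre may be matched at most once.
            unused.pop(best_index)
            matched += 1
    return matched / len(reference) if reference else 1.0


reference_centres = [(0, 0, 0), (10, 10, 10), (20, 20, 20)]
candidate_centres = [(0, 0, 0.5), (10, 10, 10), (25, 25, 25)]
overlap = fraction_matched(reference_centres, candidate_centres)
```

In this toy example two of the three reference centres find a match. For scale, roughly 270 unmatched centres out of about 71,000 is an unmatched share of about 0.4%, consistent with the 99.6% quoted above.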

I'm wondering if we should make all of these main args, except the first few, keyword args only. So that we can change their order or add new ones in the future without compatibility issues. It's already risky calling them using positional input only, like is evidenced by brainglobe/brainglobe-workflows#150 where I converted them to keyword args. So maybe we should prevent it?

I've opened #550 so others can chime in, and we don't continue the discussion on a merged PR.

matham added 2 commits June 6, 2025 19:32

Add docs to the main funcs and make them consistent.

3b3f5fb

Remove extranious threshold param from other PR.

e103092

matham changed the title ~~Add consisten parameter docs to the main functions and napari~~ Add consistent parameter docs to the main functions and napari Jun 7, 2025

This was referenced Jun 7, 2025

Local threshold #543

Closed

Visualize 2d and 3d filtering steps #544

Merged

This was referenced Jun 8, 2025

Add tiled thresholds to 2d filtering for determining foreground pixels #545

Merged

Add missing split params to napari and clean up docstrings #532

Closed

Expose pin_memory to main and napari #546

Merged

alessandrofelder self-requested a review June 9, 2025 09:52

alessandrofelder requested changes Jun 9, 2025

Implement review suggestions.

8418f7d

alessandrofelder self-requested a review June 10, 2025 14:07

alessandrofelder reviewed Jun 10, 2025

Update cluster size to float b/c it's um.

658c6c4

matham mentioned this pull request Jun 17, 2025

Add support for new args from cellfinder brainglobe/brainglobe-workflows#150

Draft

7 tasks

Correct soma type and move new params to end.

444dd28

alessandrofelder self-requested a review June 20, 2025 09:14

alessandrofelder mentioned this pull request Jun 23, 2025

[Feature] make cellfinder main functions' arguments keyword-only #550

Open

alessandrofelder approved these changes Jun 23, 2025

alessandrofelder merged commit 4224d72 into brainglobe:main Jun 23, 2025
16 checks passed

matham deleted the docs branch June 23, 2025 19:40
