MVP for morpohology module #866

timtreis · 2024-08-07T21:32:13Z

MRE:

import numpy as np
import squidpy as sq
import spatialdata as sd
from spatialdata.datasets import raccoon
import spatialdata_plot
from scipy.stats import entropy

sdata = raccoon()

def my_func(regionmask, intensity_image):
    # Shannon entropy (in bits) of the intensity histogram within one region
    masked_values = intensity_image[regionmask]
    histogram, bin_edges = np.histogram(masked_values, bins=256, range=(0, 255))
    probabilities = histogram / np.sum(histogram)

    return entropy(probabilities, base=2)

sq.im.quantify_morphology(
    sdata,
    label="segmentation",
    image="raccoon",
    methods=["area", "perimeter", "circularity", "intensity_mean", my_func],
    split_by_channels=True,
)

Notes

circularity is not provided by skimage but implemented internally (we need a library of such methods, can "steal" from CellProfiler + X)
my_func is an external method that gets passed to skimage.measure.regionprops. We need to document how to write such a method: it takes a mask of the respective cell as well as a (potentially multi-channel) intensity image (both np.ndarrays).
Add the results back to adata instead, to align with the rest of the codebase
Now think about how to integrate custom callables that are external to the codebase, e.g. HuggingFace models
Needs to respect transformations and deal with different scales
Needs a notebook in squidpy-notebooks

From discussion on 2024-08-13

Workstreams
1. "baseline" regionprops: what skimage natively provides as functions, can be queried by string
2. "non-baseline" regionprops: other metrics we can "steal" from CellProfiler etc. and make available by string internally
   - e.g. take the math from https://github.com/CellProfiler/CellProfiler/blob/main/src/frontend/cellprofiler/modules/measuregranularity.py#L432
3. feed in methods that take in a mask / intensity image
4. test performance of regionprops on actual data: does it work at scale at all?
   - if not, can we hijack the method, parallelize across chunks (parallel IO) and collect the results?
   - can we lazy-compute? dask?
5. respect transformations and datatree "zoom": which scale matches, and what does the user want?
6. my_func wrapper for model inference, e.g. from some HF model?
   - download the model etc.?
   - models might require a specific resolution, so we might have to rescale all cells, which is a problem because area contains information
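
To make the "parallelize across chunks and collect" idea concrete, here is a small sketch in plain NumPy. It is not what the PR implements; it only illustrates that additive statistics (pixel count, intensity sum) can be computed per tile and merged afterwards. The function names and the tile size are assumptions, and non-additive features such as perimeter would need a different strategy for cells that straddle tile borders.

```python
import numpy as np


def tile_stats(label_tile, intensity_tile, n_labels):
    """Per-label pixel count and intensity sum for one tile."""
    flat = label_tile.ravel()
    counts = np.bincount(flat, minlength=n_labels + 1)
    sums = np.bincount(flat, weights=intensity_tile.ravel(), minlength=n_labels + 1)
    return counts, sums


def chunked_area_and_mean(labels, image, tile=4):
    """Accumulate area and mean intensity tile by tile."""
    n_labels = int(labels.max())
    counts = np.zeros(n_labels + 1, dtype=np.int64)
    sums = np.zeros(n_labels + 1, dtype=float)
    for y in range(0, labels.shape[0], tile):
        for x in range(0, labels.shape[1], tile):
            c, s = tile_stats(
                labels[y:y + tile, x:x + tile],
                image[y:y + tile, x:x + tile],
                n_labels,
            )
            counts += c  # merging tiles is a plain sum
            sums += s
    area = counts[1:]  # drop background (label 0)
    mean = np.divide(sums[1:], area, out=np.full(n_labels, np.nan), where=area > 0)
    return area, mean


# Toy data (assumption): label 1 straddles several 4x4 tiles.
labels = np.zeros((10, 10), dtype=int)
labels[2:7, 3:9] = 1
labels[8:10, 0:2] = 2
image = np.arange(100, dtype=float).reshape(10, 10)

area, mean = chunked_area_and_mean(labels, image, tile=4)
```

Because each tile's result is independent, the inner loop body is the part that could be handed to a worker pool or a lazy array framework.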


codecov-commenter · 2024-08-07T21:41:27Z

Codecov Report

Attention: Patch coverage is 20.33898% with 47 lines in your changes missing coverage. Please review.

Project coverage is 69.52%. Comparing base (4a632d6) to head (bbecec3).
Report is 5 commits behind head on main.

Files	Patch %	Lines
src/squidpy/im/_feature.py	20.33%	47 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #866      +/-   ##
==========================================
- Coverage   69.99%   69.52%   -0.47%     
==========================================
  Files          39       39              
  Lines        5532     5667     +135     
  Branches     1037     1063      +26     
==========================================
+ Hits         3872     3940      +68     
- Misses       1367     1429      +62     
- Partials      293      298       +5

Files	Coverage Δ
src/squidpy/im/_feature.py	`51.81% <20.33%> (-37.08%)`	⬇️

... and 2 files with indirect coverage changes


npeschke · 2024-08-16T13:14:17Z

I had to remove the code that called the label function from skimage as it was creating additional labels that we cannot trace back to the original labels.

Also fix some failing code checks

https://github.com/afermg/cp_measurements/blob/main/src/cp_measure/minimal/measuregranularity.py


giovp · 2024-09-16T16:19:30Z

src/squidpy/im/__init__.py

@@ -3,7 +3,7 @@
 from __future__ import annotations

 from squidpy.im._container import ImageContainer
-from squidpy.im._feature import calculate_image_features
+from squidpy.im._feature import calculate_image_features, quantify_morphology


hi @npeschke @timtreis, I was just taking a look at this and was wondering why the function is duplicated. It seems that quantify_morphology does something very similar to calculate_image_features. Do you plan to discontinue the latter? Wouldn't it be better to adapt the latter to use spatialdata? thank you

tbh not sure yet what the optimal solution will be. There is clear redundancy and a strong overlap, but when I wrote the original MVP, it didn't clearly fit with the current calculate_image_features. For one, some of these features don't need an image but only the label; for another, the structure of the function was going to be quite different.

Even if we merge them into the calculate_image_features function, the parameters of that function would have to change. So yeah, not fully sure yet.

I generally wanted to have something that takes an arbitrary callable with a regionprops-like footprint so we could also inject e.g. some HuggingFace featuriser
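
A rough sketch of what such an injected featuriser could look like, under stated assumptions: toy_embed is a hypothetical stand-in for a real model, make_featurizer and resize_nearest are illustrative helpers that do not exist in squidpy, and the fixed input size of 8 is arbitrary. The wrapper rescales every cell to the resolution the model expects and passes the area along separately, since rescaling discards size information.

```python
import numpy as np


def resize_nearest(crop, size):
    """Nearest-neighbour resize of a 2D array to (size, size)."""
    rows = np.arange(size) * crop.shape[0] // size
    cols = np.arange(size) * crop.shape[1] // size
    return crop[np.ix_(rows, cols)]


def make_featurizer(embed, size=32):
    """Wrap `embed` so it has the (regionmask, intensity_image) footprint."""

    def featurizer(regionmask, intensity_image):
        # Zero out pixels that do not belong to the cell.
        masked = np.where(regionmask, intensity_image, 0.0)
        patch = resize_nearest(masked, size)
        # Resizing discards absolute size, so keep the area explicitly.
        return np.concatenate([[regionmask.sum()], embed(patch)])

    return featurizer


def toy_embed(patch):
    """Hypothetical stand-in for a model: mean and std of the patch."""
    return np.array([patch.mean(), patch.std()])


featurizer = make_featurizer(toy_embed, size=8)
mask = np.ones((3, 5), dtype=bool)
img = np.full((3, 5), 2.0)
features = featurizer(mask, img)  # [area, embedding...]
```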

right, that's a good point. calculate_image_features does support arbitrary functions, see https://squidpy.readthedocs.io/en/stable/api/squidpy.im.calculate_image_features.html. I understand that the current implementation doesn't operate on masks (or not only on masks), but maybe it would be ok to change the "image" input so that it accepts not just an Image but also a Label?

I think the parallelization/out-of-core functionality is also important. The current implementation relies on joblib Parallel to iterate over spots, and I understand that's not optimal (e.g. how would it work with raster labels and a raster image?). But maybe a combination of that and dask, e.g. https://examples.dask.org/applications/image-processing.html, could be useful. Basically I think we should strive for scalability in both time and memory.

looking at this, not maintained but maybe some ideas could be reused
https://github.com/jrussell25/dask-regionprops/blob/main/dask_regionprops/regionprops.py

EDIT: looking more deeply, not sure anymore 😅

Yeah, this ties into the larger topic of moving things to the GPU

Just for context, the current implementation takes 20 seconds on my machine to calculate all available features of the MIBI-TOF dataset. So parallelization might not need to be that urgent.

It will be once we add more features, some potentially more expensive to compute, and analyse datasets like Xenium with 100k+ cells ;)

hi @npeschke, thanks for sharing the timing. If the MIBI-TOF dataset you are referring to is the one from squidpy, that is a toy dataset that does not really recapitulate the complexity and size of real data. I'd be curious to see the performance on e.g. a Xenium dataset.

giovp · 2024-09-17T00:13:12Z

src/squidpy/im/_feature.py

+        # if we didn't get any properties, we'll do the bare minimum
+        props = ["label"]
+
+    np_rgb_image = image_element.values.transpose(1, 2, 0)  # (c, y, x) -> (y, x, c)


like, stuff like this doesn't work with real data. Everything needs to be either lazy, or looped/parallelized
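
One way to read the "looped" option, sketched under assumptions: instead of materialising and transposing the whole (c, y, x) image, crop only each cell's bounding box and reorder axes on that small crop. scipy.ndimage.find_objects is a real function; iter_cell_crops and the toy arrays are illustrative, and whether the same slicing stays lazy on the image element's backing array is not verified here.

```python
import numpy as np
from scipy import ndimage as ndi


def iter_cell_crops(labels, image_cyx):
    """Yield (label, mask, crop) per cell; crop is (y, x, c) for that cell only."""
    for index, box in enumerate(ndi.find_objects(labels)):
        if box is None:  # this label id does not occur in the image
            continue
        label = index + 1
        mask = labels[box] == label
        # Slice the bounding box first, then move channels last on the small crop.
        crop = np.moveaxis(image_cyx[(slice(None), *box)], 0, -1)
        yield label, mask, crop


# Toy data (assumption): labels 1 and 3 present, label 2 missing.
labels = np.zeros((6, 6), dtype=int)
labels[0:2, 0:3] = 1
labels[4:6, 2:6] = 3
image_cyx = np.arange(3 * 36, dtype=float).reshape(3, 6, 6)

# Per-cell, per-channel mean intensity computed from the small crops.
channel_means = {
    label: crop[mask].mean(axis=0)
    for label, mask, crop in iter_cell_crops(labels, image_cyx)
}
```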

timtreis and others added 2 commits August 7, 2024 17:29

MVP for internal calling of extra props

b99fa5f

[pre-commit.ci] auto fixes from pre-commit.com hooks

bbecec3

for more information, see https://pre-commit.ci

timtreis and others added 5 commits August 8, 2024 19:06

Added option to externally feed in functions

6e3d652

merge conflict resolved

7b21fda

[pre-commit.ci] auto fixes from pre-commit.com hooks

753ec3a

for more information, see https://pre-commit.ci

Merge branch 'main' into feature/add_morphology_toolbox

c9a3cc9

add rough functionality to write regionprops into sdata["table"].obsm…

34f745f

… as numpy array

npeschke and others added 10 commits August 23, 2024 15:12

DataFrame now written to obsm instead of numpy array

07606e2

Also fix some failing code checks

add granularity measurement from afermg/cp_measurements

0f0ea7c

https://github.com/afermg/cp_measurements/blob/main/src/cp_measure/minimal/measuregranularity.py

add border_occupied_factor

b33895b

add sanity checks and make multiple coordinate systems work

ebc08d8

fix assertion to accommodate new interface

50d58d2

[pre-commit.ci] auto fixes from pre-commit.com hooks

328ba16

for more information, see https://pre-commit.ci

add possibility for granularity to return multiple values per channel…

f7b5270

… and label

Merge remote-tracking branch 'origin/feature/add_morphology_toolbox' …

5c1d70f

…into feature/add_morphology_toolbox

[pre-commit.ci] auto fixes from pre-commit.com hooks

112d1a9

for more information, see https://pre-commit.ci

Merge branch 'main' into feature/add_morphology_toolbox

0ba44d2
