
[WIP] Image analysis workflow #801

Draft · wants to merge 3 commits into base: main
Changes from 1 commit
1 change: 1 addition & 0 deletions recipe/meta.yaml
@@ -54,6 +54,7 @@ requirements:
- sqlalchemy ==1.4.46 # FIXME https://github.com/dask/dask/issues/9896
- pynvml ==11.5.0
- bokeh ==2.4.3 # FIXME https://github.com/dask/distributed/issues/7173
+ - dask-image ==2023.3.0

test:
  imports:
30 changes: 30 additions & 0 deletions tests/workflows/test_image_analysis.py
@@ -0,0 +1,30 @@
import dask
import dask.array as da
import numpy as np
from dask_image import ndfilters, ndmeasure, ndmorph


def test_BBBC039(small_client):

Review comment: Suggest adding a link to those talk slides and/or the repo. That way, if anyone needs to work on this benchmark later, they can look there for context and answers before quizzing James about it.

    images = da.from_zarr(
        "s3://coiled-datasets/BBBC039", storage_options={"anon": True}
    )
Comment on lines +9 to +11 (Member Author): I converted the original .tif dataset to Zarr format and uploaded it to a public S3 bucket. This had the side effect of bypassing dask/dask-image#84.

    smoothed = ndfilters.gaussian_filter(images, sigma=[0, 1, 1])
    thresh = ndfilters.threshold_local(smoothed, block_size=images.chunksize)
    threshold_images = smoothed > thresh
    structuring_element = np.array(

Review comment: Suggest adding a short comment to this line about why we're not using the default structuring element, e.g.:

# Since this image stack appears to be 3-dimensional,
# we sandwich a 2d structuring element in between zeros
# so that each 2d image slice has the binary closing applied independently

Apparently I only ever said that verbally during the talk; may as well write it down.

        [
            [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
            [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
            [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
        ]
    )
    binary_images = ndmorph.binary_closing(
        threshold_images, structure=structuring_element
    )
    label_images, num_features = ndmeasure.label(binary_images)
    index = np.arange(num_features)
    # FIXME: Only selecting the first few images due to cluster idle timeout.
    # Maybe sending large graph? Need to investigate a bit.
Comment from @GenevieveBuckley (Apr 20, 2023): I have no useful suggestions, but it's great* that this has possibly already identified some kind of problem your users might occasionally run into.

*You know, great for your users, but admittedly not so great for the person who now needs to investigate it 😆 Have fun with that, James!

    area = ndmeasure.area(images[:3], label_images[:3], index)
    mean_intensity = ndmeasure.mean(images[:3], label_images[:3], index)
    dask.compute(mean_intensity, area)
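As a standalone illustration of the sandwiched structuring element discussed in review, plain `scipy.ndimage` shows the closing being applied to each 2D slice independently. This is a sketch with a made-up input array, not part of the benchmark itself:

```python
import numpy as np
from scipy import ndimage

# The same element as in the test: a 2D cross sandwiched between planes of
# zeros, so the closing never mixes neighboring slices
structuring_element = np.array(
    [
        [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
        [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
        [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    ]
)

# Two identical 2D slices, each with a one-pixel hole in the middle
slice2d = np.ones((5, 5), dtype=bool)
slice2d[2, 2] = False
stack = np.stack([slice2d, slice2d])

closed = ndimage.binary_closing(stack, structure=structuring_element)
# The hole is filled within each slice, and both slices come out identical
```

With the default 3D structuring element, the closing would instead pull in values from adjacent slices along axis 0.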