feat: Add ImageData/MaskData classes to compute statistics #226

Open
wants to merge 13 commits into
base: development

Conversation

jjjermiah
Contributor

@jjjermiah jjjermiah commented Feb 12, 2025

view example as objects

Image Data

ImageData(
    hash='dd5733a31e935efab18bb26f475faffbf2b17d2e',
    spacing=Spacing3D(x=0.97656, y=0.97656, z=3.00000),
    size=Size3D(w=512, h=512, d=134),
    dimensions=3,
    max=3071.0,
    min=-1000.0,
    mean=-863.3482872977186,
    std=345.8227120542874,
    variance=119593.34817258258,
    sum=-30327090839.0
)
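The fields in this example are related to each other, which gives a quick way to sanity-check the output. The sketch below uses only the numbers printed above and assumes `variance` is the square of `std` and `sum` is `mean` times the total voxel count; it does not call `ImageData` itself.

```python
import math

# Values copied from the ImageData example above.
size = (512, 512, 134)
mean = -863.3482872977186
std = 345.8227120542874
variance = 119593.34817258258
total = -30327090839.0

# Total number of voxels in the volume.
voxel_count = math.prod(size)

# Variance should equal the squared standard deviation.
variance_matches = math.isclose(std**2, variance, rel_tol=1e-6)

# The sum should equal the mean multiplied by the number of voxels.
sum_matches = math.isclose(mean * voxel_count, total, rel_tol=1e-6)

print(voxel_count, variance_matches, sum_matches)
```

Both checks hold for the values shown, which suggests the statistics are computed over the whole volume.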
Notes on the mask

I used the following code to get the mask:

    mask_rtss = rtss.to_segmentation(
        img, {"GTV": "GTV.*"}
    )  # because of the way we handle dictionaries, this will have 2 volumes in the same mask

which is why we see volume_count=2 below.
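To illustrate how one pattern can pull several ROIs into one mask, here is a minimal sketch using the standard `re` module. The ROI names are hypothetical (the PR does not list them), and the use of `re.fullmatch` is an assumption; the actual matching inside `to_segmentation` may differ.

```python
import re

# Hypothetical ROI names; the real names in the RTSTRUCT are not shown in the PR.
roi_names = ["GTV-1", "GTV-2", "Heart", "Lung_L"]

# Same mapping shape as in the call above: output label -> regex pattern.
roi_patterns = {"GTV": "GTV.*"}

# Collect every ROI name that matches each pattern.
matches = {
    label: [name for name in roi_names if re.fullmatch(pattern, name)]
    for label, pattern in roi_patterns.items()
}

print(matches)  # {'GTV': ['GTV-1', 'GTV-2']}
```

With two ROIs matching the single `GTV` key, both end up under one label, which is consistent with one label containing two separate volumes.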

Mask Data

MaskData(
    hash='662ba6d49991858c1d9e49a07068c5e152b37d89',
    spacing=Spacing3D(x=0.97656, y=0.97656, z=3.00000),
    size=Size3D(w=512, h=512, d=134),
    dimensions=3,
    masked_max=230,
    masked_min=-118,
    masked_mean=88.12666145426114,
    masked_std=34.717529655755854,
    masked_variance=1205.3068653982873,
    masked_sum=np.int64(112714),
    num_labels=1,
    label=1,
    perimeter=1588.72625,
    touching_border=False,
    voxel_count=1279,
    volume_count=2,
    bbox=RegionBox(
        min=Coordinate3D(x=219, y=233, z=53),
        max=Coordinate3D(x=253, y=253, z=66),
        size=Size3D(w=34, h=20, d=13)
    ),
    centroid_index=Coordinate3D(x=237, y=243, z=60)
)

mask_stats(cropped to exactly bbox)

MaskData(
    hash='194ba4d0b3b2ae3c8670c51107bf8ca23ce2bc9f',
    spacing=Spacing3D(x=0.97656, y=0.97656, z=3.00000),
    size=Size3D(w=34, h=20, d=13),
    dimensions=3,
    masked_max=230,
    masked_min=-118,
    masked_mean=88.12666145426114,
    masked_std=34.717529655755854,
    masked_variance=1205.3068653982873,
    masked_sum=np.int64(112714),
    num_labels=1,
    label=1,
    perimeter=1588.72625,
    touching_border=True,
    voxel_count=1279,
    volume_count=2,
    bbox=RegionBox(
        min=Coordinate3D(x=0, y=0, z=0),
        max=Coordinate3D(x=34, y=20, z=13),
        size=Size3D(w=34, h=20, d=13)
    ),
    centroid_index=Coordinate3D(x=18, y=10, z=7)
)
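The two mask examples differ only in `size`, `bbox`, `centroid_index`, and `touching_border`. A minimal sketch of why `touching_border` flips after cropping is below. `BoxSketch` is a hypothetical stand-in for `RegionBox`, and the border rule (a corner at index 0 or at the image extent) is an assumption, not the PR's implementation.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class BoxSketch:
    """Hypothetical stand-in for RegionBox using plain (x, y, z) tuples."""

    lower: Tuple[int, int, int]
    upper: Tuple[int, int, int]

    @property
    def size(self) -> Tuple[int, ...]:
        # Size is the per-axis difference between the two corners.
        return tuple(hi - lo for lo, hi in zip(self.lower, self.upper))

    def touches_border(self, image_size: Tuple[int, ...]) -> bool:
        # Assumed rule: the box touches the border when a corner reaches
        # index 0 or the image extent along any axis.
        return any(lo <= 0 for lo in self.lower) or any(
            hi >= extent for hi, extent in zip(self.upper, image_size)
        )


full = BoxSketch((219, 233, 53), (253, 253, 66))
cropped = BoxSketch((0, 0, 0), (34, 20, 13))

print(full.size, full.touches_border((512, 512, 134)))
print(cropped.size, cropped.touches_border(cropped.size))
```

Under this rule, the box inside the 512 x 512 x 134 image is interior, while the same box in an image cropped to exactly its extent touches every face.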
view example as dictionaries
********************************************************************************
********************************************************************************
{
    'hash': 'dd5733a31e935efab18bb26f475faffbf2b17d2e',
    'spacing': {'x': 0.9765625, 'y': 0.9765625, 'z': 3.0},
    'size': {'width': 512, 'height': 512, 'depth': 134},
    'dimensions': 3,
    'max': 3071.0,
    'min': -1000.0,
    'mean': -863.3482872977186,
    'std': 345.8227120542874,
    'variance': 119593.34817258258,
    'sum': -30327090839.0
}
********************************************************************************
********************************************************************************
{
    'hash': '662ba6d49991858c1d9e49a07068c5e152b37d89',
    'spacing': {'x': 0.9765625, 'y': 0.9765625, 'z': 3.0},
    'size': {'width': 512, 'height': 512, 'depth': 134},
    'dimensions': 3,
    'masked_max': 230,
    'masked_min': -118,
    'masked_mean': 88.12666145426114,
    'masked_std': 34.717529655755854,
    'masked_variance': 1205.3068653982873,
    'masked_sum': np.int64(112714),
    'num_labels': 1,
    'label': 1,
    'perimeter': 1588.72625,
    'touching_border': False,
    'voxel_count': 1279,
    'volume_count': 2,
    'bbox': {
        'min': {'x': 219, 'y': 233, 'z': 53},
        'max': {'x': 253, 'y': 253, 'z': 66},
        'size': {'width': 34, 'height': 20, 'depth': 13}
    },
    'centroid_index': {'x': 237, 'y': 243, 'z': 60}
}

mask_stats(cropped to exactly bbox)
{
    'hash': '194ba4d0b3b2ae3c8670c51107bf8ca23ce2bc9f',
    'spacing': {'x': 0.9765625, 'y': 0.9765625, 'z': 3.0},
    'size': {'width': 34, 'height': 20, 'depth': 13},
    'dimensions': 3,
    'masked_max': 230,
    'masked_min': -118,
    'masked_mean': 88.12666145426114,
    'masked_std': 34.717529655755854,
    'masked_variance': 1205.3068653982873,
    'masked_sum': np.int64(112714),
    'num_labels': 1,
    'label': 1,
    'perimeter': 1588.72625,
    'touching_border': True,
    'voxel_count': 1279,
    'volume_count': 2,
    'bbox': {
        'min': {'x': 0, 'y': 0, 'z': 0},
        'max': {'x': 34, 'y': 20, 'z': 13},
        'size': {'width': 34, 'height': 20, 'depth': 13}
    },
    'centroid_index': {'x': 18, 'y': 10, 'z': 7}
}

Summary by CodeRabbit

  • New Features

    • Enhanced image region extraction for improved processing accuracy.
    • Introduced comprehensive image and mask analytics, enabling detailed statistical insights and cropping operations.
    • Added a 3D spacing tool with precision formatting for refined spatial data representation.
  • Bug Fixes

    • Improved string representation of the Spacing3D class to display values with five decimal places.

Contributor

coderabbitai bot commented Feb 12, 2025

📝 Walkthrough

The changes modify the behavior in four modules. In the box module, the from_mask_centroid method now creates distinct coordinate objects for minimum and maximum values. The image statistics module gains two new data classes (ImageData and MaskData) along with associated methods and a demonstration function. A new Spacing3D class is added in the spatial types module with its __repr__ method refined for higher precision formatting. Finally, a clarifying comment has been added in the image utilities module regarding the expected input type for a transformation function.

Changes

File(s) | Change Summary
src/imgtools/coretypes/box.py | Modified RegionBox.from_mask_centroid to instantiate separate Coordinate3D objects for minimum and maximum coordinates, ensuring distinct object creation while retaining the expand_to_cube functionality.
src/imgtools/coretypes/image_statistics.py | Added two new classes, ImageData and MaskData, with class methods (from_image and from_image_and_mask) to compute image and mask statistics respectively; also introduced a main function to demonstrate their usage.
src/imgtools/coretypes/spatial_types.py | Added new Spacing3D class with a constructor supporting multiple input formats and updated its __repr__ method to format values to five decimal places.
src/imgtools/utils/imageutils.py | Inserted a comment clarifying that the TransformIndexToPhysicalPoint function expects a list rather than a NumPy array, with the transformation logic otherwise unchanged.
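The reason distinct coordinate objects matter can be sketched as follows. `Point` and `grow` are hypothetical names, and the sketch assumes coordinates are mutable and that expansion modifies them in place; whether `Coordinate3D` behaves this way is not confirmed by the PR text.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """Hypothetical mutable coordinate, standing in for Coordinate3D."""

    x: int
    y: int
    z: int


def grow(lower: Point, upper: Point, pad: int) -> None:
    """Expand a box in place by moving its two corners apart along x."""
    lower.x -= pad
    upper.x += pad


# One object used for both corners: the two updates cancel out.
shared = Point(237, 243, 60)
grow(shared, shared, 5)
print(shared.x)  # 237, so the box is still zero-size

# Two separate objects: the corners move independently.
lower, upper = Point(237, 243, 60), Point(237, 243, 60)
grow(lower, upper, 5)
print(lower.x, upper.x)  # 232 242
```

Creating two objects from the same centroid values avoids this kind of aliasing regardless of how later methods are implemented.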

Suggested labels

enhancement

Constructive Feedback

  • Maintainability: The refactoring in box.py improves object isolation; however, consider adding inline documentation to explain the rationale behind creating separate objects for similar coordinates. This will aid future developers in understanding the design choices more clearly.
  • Readability: The new classes in image_statistics.py are well-structured; including more detailed docstrings and example usage in the class definitions could further improve clarity. This would help users grasp the intended use of these classes quickly.
  • Consistency: The updated formatting in Spacing3D enhances precision. Maintaining a consistent style across similar classes will aid future maintenance. Consider reviewing other classes for similar formatting improvements.
  • Code Comments: The new comment in imageutils.py is helpful. Ensure that similar edge cases and type expectations are clearly documented throughout the codebase. This practice can prevent misunderstandings and facilitate easier debugging later on.

📜 Recent review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d00f9e and 2f1fbc0.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (1)
  • tests/coretypes/test_helpers.py (1 hunks)
✅ Files skipped from review due to trivial changes (1)
  • tests/coretypes/test_helpers.py
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py313)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)


@jjjermiah jjjermiah marked this pull request as ready for review February 19, 2025 22:28
@coderabbitai coderabbitai bot added the enhancement New feature or request label Feb 19, 2025
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (5)
src/imgtools/coretypes/image_statistics.py (2)

39-49: Consider renaming attributes that match built-in function names.
Attributes like min, max, and sum shadow the names of Python built-ins, which can cause confusion. Renaming them to min_value, max_value, and intensity_sum (or similar) would enhance clarity.

- min: float
- max: float
- sum: float
+ min_value: float
+ max_value: float
+ intensity_sum: float

217-263: Separate demonstration code from production code.
This main() function is helpful for illustrating usage, but consider moving it to an example script or Jupyter notebook. That way, production modules remain focused on reusable logic.

src/imgtools/utils/imageutils.py (1)

187-187: Consider performance implications of the list comprehension.
The vectorized approach was replaced with a list comprehension, which is more readable but might be slower for very large arrays. If performance becomes critical, you could revisit vectorization or other batch-processing solutions.
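For reference, the per-point conversion has a simple closed form: physical = origin + direction · (spacing × index). The sketch below implements that formula in plain Python with a list comprehension over points. It does not call SimpleITK; `index_to_physical`, the origin, and the identity direction are illustrative assumptions, while the spacing is taken from the example in the PR description.

```python
def index_to_physical(index, origin, spacing, direction):
    """Map one (x, y, z) voxel index to a physical point.

    `direction` is a row-major flattened 3x3 matrix.
    """
    # Scale the index by the voxel spacing along each axis.
    scaled = [i * s for i, s in zip(index, spacing)]
    # Rotate by the direction matrix and shift by the origin.
    return [
        origin[r] + sum(direction[3 * r + c] * scaled[c] for c in range(3))
        for r in range(3)
    ]


origin = (-250.0, -250.0, 0.0)  # hypothetical image origin
spacing = (0.9765625, 0.9765625, 3.0)  # spacing from the example above
identity = (1.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 1.0)

indices = [(0, 0, 0), (237, 243, 60)]

# One conversion per index, mirroring the list-comprehension approach.
points = [index_to_physical(idx, origin, spacing, identity) for idx in indices]
print(points)
```

If profiling later shows this loop to be a bottleneck, the same formula can be applied to all indices at once with array operations.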

src/imgtools/coretypes/box.py (2)

138-140: Add a clarifying comment for zero-size box initialization.

The code creates a box with identical min and max coordinates, which results in a zero-size box that will be expanded later. A comment explaining this would improve readability.

 return RegionBox(
+            # Initialize with zero size, will be expanded by expand_to_cube
             Coordinate3D(*centroid_idx), Coordinate3D(*centroid_idx)
         ).expand_to_cube(desired_size)

115-130: Enhance docstring to explain box initialization and expansion.

The docstring could better explain the method's behavior by mentioning that it creates a zero-size box at the centroid and always expands it to a cube.

         """Creates a RegionBox from the centroid of a mask image.
 
         Parameters
         ----------
         mask : sitk.Image
             The input mask image.
         label : int, optional
             label in the mask image to calculate the centroid.
         desired_size : int | None, optional
             The desired size of the box. If None, the minimum size default from `expand_to_min_size` is used.
 
         Returns
         -------
         RegionBox
             The bounding box coordinates as a RegionBox object.
+
+        Notes
+        -----
+        The box is initially created with zero size at the centroid and then
+        expanded to a cube using `expand_to_cube`. This ensures that the box
+        is centered on the centroid with equal dimensions.
         """
📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 5516648 and 1212031.

⛔ Files ignored due to path filters (1)
  • pixi.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (4)
  • src/imgtools/coretypes/box.py (1 hunks)
  • src/imgtools/coretypes/image_statistics.py (1 hunks)
  • src/imgtools/coretypes/spatial_types.py (2 hunks)
  • src/imgtools/utils/imageutils.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following:
  - Variables and functions follow meaningful naming conventions.
  - Docstrings are present, accurate, and align with the implementation.
  - Code is efficient and avoids redundancy while adhering to DRY principles.
  - Consider suggestions to enhance readability and maintainability.
  - Highlight any potential performance issues, edge cases, or logical errors.
  - Ensure all imported libraries are used and necessary.

  • src/imgtools/utils/imageutils.py
  • src/imgtools/coretypes/spatial_types.py
  • src/imgtools/coretypes/box.py
  • src/imgtools/coretypes/image_statistics.py
⏰ Context from checks skipped due to timeout of 90000ms (12)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py313)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (7)
src/imgtools/coretypes/image_statistics.py (4)

10-37: Docstrings look clear and comprehensive.
They accurately describe the purpose and usage of the ImageData class and its attributes in a concise way.


50-81: Confirm or constrain the supported image dimensions.
This method computes statistics from any dimensional SimpleITK image. If the application only supports 3D images, consider adding a check or clarifying it in the docstring.

Would you like to verify how this function behaves for non-3D images?


83-127: Thorough documentation for MaskData.
The docstring is well-structured and clearly explains each attribute, making it easy to track mask-specific statistics.


128-146: Consistent attribute naming.
The attributes related to masked statistics avoid shadowing the names of built-ins, improving readability. Good work!

src/imgtools/utils/imageutils.py (1)

172-174: New comment clarifies the expected data type.
Explicitly documenting that a list is required to transform indices helps avoid confusion and potential type errors. Nicely done.

src/imgtools/coretypes/spatial_types.py (1)

162-186: Precise __repr__ enhances debugging.
Formatting spacing to five decimal places improves clarity. If further precision is needed (e.g., medical imaging requiring high accuracy), document and confirm the rationale. Otherwise, this is a solid implementation.
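A minimal sketch of a five-decimal `__repr__` follows. `SpacingSketch` is a hypothetical class, not the PR's `Spacing3D`; the point is only that the format spec affects the printed text while the stored floats keep full precision.

```python
from dataclasses import dataclass


@dataclass(frozen=True, repr=False)
class SpacingSketch:
    """Hypothetical 3D spacing holder with a fixed-precision repr."""

    x: float
    y: float
    z: float

    def __repr__(self) -> str:
        # The `.5f` format spec prints five digits after the decimal point.
        return (
            f"{type(self).__name__}"
            f"(x={self.x:.5f}, y={self.y:.5f}, z={self.z:.5f})"
        )


spacing = SpacingSketch(0.9765625, 0.9765625, 3.0)
print(spacing)  # SpacingSketch(x=0.97656, y=0.97656, z=3.00000)
print(spacing.x)  # 0.9765625
```

This matches the difference visible in the PR description, where the object view shows 0.97656 and the dictionary view shows 0.9765625.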

src/imgtools/coretypes/box.py (1)

1-492: Implementation looks solid!

The changes to create distinct coordinate objects in from_mask_centroid are well-implemented. The overall code is well-documented, has good error handling, and follows best practices.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/imgtools/coretypes/image_statistics.py (1)

147-211: ⚠️ Potential issue

Check for empty masked_array_nonzero using its size, not None.
Line 180 checks for None, but masked_array[masked_array != 0] returns an empty array when there are no non-zero pixels, never None. This can lead to .max() or .min() calls on an empty array, causing runtime errors. Update the condition to handle empty arrays correctly.

Here’s a suggested fix:

-if (masked_array_nonzero := masked_array[masked_array != 0]) is None:
-    msg = f"No non-zero values found in masked region with {label=}."
-    raise ValueError(msg)
+masked_array_nonzero = masked_array[masked_array != 0]
+if masked_array_nonzero.size == 0:
+    raise ValueError(
+        f"No non-zero values found in masked region with {label=}."
+    )
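The underlying point can be shown without NumPy: a filter that keeps nothing yields an empty container, never `None`, so an `is None` check cannot fire. The sketch below is a plain-list analog of the boolean-mask indexing discussed above; `masked_max` is a hypothetical helper, not code from the PR.

```python
values = [0, 0, 0]  # a masked region with no non-zero voxels

# Filtering keeps nothing, but the result is still a list object.
nonzero = [v for v in values if v != 0]
print(nonzero is None)  # False: the None check never triggers
print(len(nonzero) == 0)  # True: the emptiness check does


def masked_max(values):
    """Return the maximum non-zero value, failing early on an empty region."""
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        raise ValueError("No non-zero values found in masked region.")
    return max(nonzero)


print(masked_max([0, -118, 230]))  # 230
```

NumPy behaves the same way in the relevant respect: boolean indexing returns an empty array, and reductions such as `.max()` on it raise `ValueError`, so checking `.size == 0` first gives a clearer error message.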
🧹 Nitpick comments (2)
src/imgtools/coretypes/image_statistics.py (2)

10-49: Consider renaming min and max to avoid shadowing built-in names.
Using min and max as attribute names shadows the names of Python’s built-in functions. Prefer something like intensity_min and intensity_max for clarity.


214-261: Add a docstring or move demonstration code out of production.
While it’s valuable to illustrate how ImageData and MaskData work, consider adding a docstring or isolating this demonstration in a dedicated example script to keep library code lean.

📜 Review details


📥 Commits

Reviewing files that changed from the base of the PR and between 1212031 and 0d00f9e.

📒 Files selected for processing (1)
  • src/imgtools/coretypes/image_statistics.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following:
  - Variables and functions follow meaningful naming conventions.
  - Docstrings are present, accurate, and align with the implementation.
  - Code is efficient and avoids redundancy while adhering to DRY principles.
  - Consider suggestions to enhance readability and maintainability.
  - Highlight any potential performance issues, edge cases, or logical errors.
  - Ensure all imported libraries are used and necessary.

  • src/imgtools/coretypes/image_statistics.py
🔇 Additional comments (3)
src/imgtools/coretypes/image_statistics.py (3)

1-8: All good with the imports.
These imports, including from __future__ import annotations, align well with modern Python practices.


50-81: ImageData.from_image seems correct.
The usage of StatisticsImageFilter and returning a populated ImageData is straightforward and readable. Nice job!


83-146: MaskData design is logically consistent.
The data class fields and their docstrings properly describe the mask-related statistics. This fosters maintainability and clear data handling.

@jjjermiah jjjermiah requested a review from strixy16 February 19, 2025 22:51
@jjjermiah jjjermiah changed the title Refactor Spacing3D and add ImageData/MaskData classes feat: Add ImageData/MaskData classes to compute statistics Feb 19, 2025

codecov bot commented Feb 20, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 65.13%. Comparing base (d9a83d1) to head (2f1fbc0).
Report is 1 commit behind head on development.

Additional details and impacted files
@@             Coverage Diff              @@
##           development     #226   +/-   ##
============================================
  Coverage        65.12%   65.13%           
============================================
  Files               55       55           
  Lines             3710     3711    +1     
============================================
+ Hits              2416     2417    +1     
  Misses            1294     1294           
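The percentages in the table follow directly from the hit and line counts. A small sketch of that arithmetic is below; `coverage_percent` is an illustrative helper, and rounding to two decimals is an assumption about how the report formats its numbers.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Return line coverage as a percentage rounded to two decimals."""
    return round(100 * hits / lines, 2)


base = coverage_percent(2416, 3710)  # counts from the base column
head = coverage_percent(2417, 3711)  # counts from the head column

print(base, head)  # 65.12 65.13
```

One added line that is also covered raises both counts by one, which is why coverage moves by only 0.01 percentage points.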

Labels
enhancement New feature or request