WIP: ImageMaskInput #233

jjjermiah · 2025-02-19T18:24:12Z

Work In Progress

This mostly targets the requirements of Med-ImageNet, but it could also be useful as a public API for anyone.

The semantics (naming and methods) are all open to change.

Prototype `ImageMaskInput`

Crawl
Build Graph
possible inputs and the loaded result:

Important

By design, RTSTRUCTs are converted to `imgtools.Segmentation` and returned with all ROIs.
This matches the default design of SEG objects and will allow uniform filtering of ROI names on a single data type in the future.

Image Modality (`imgtools.Scan`)	Mask Modality	Mask Class
CT	RTSTRUCT	`imgtools.Segmentation` (converted from `imgtools.StructureSet` with all ROIs)
MR	RTSTRUCT	`imgtools.Segmentation` (converted from `imgtools.StructureSet` with all ROIs)
CT	SEG	`imgtools.Segmentation`
MR	SEG	`imgtools.Segmentation`
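
The table above can be read as a lookup from modality pair to mask class. A minimal sketch, assuming the enum values quoted in the review comments of this thread (`("CT", "RTSTRUCT")` and so on). `MASK_CLASS` and `mask_class_for` are hypothetical names used only for illustration, and class names are plain strings so the snippet runs without imgtools installed:

```python
from enum import Enum


class ImageMaskModalities(Enum):
    """Stand-in mirroring the enum values quoted in the review comments."""

    CT_RTSTRUCT = ("CT", "RTSTRUCT")
    CT_SEG = ("CT", "SEG")
    MR_RTSTRUCT = ("MR", "RTSTRUCT")
    MR_SEG = ("MR", "SEG")


# Hypothetical lookup: every mask modality ends up as a Segmentation.
MASK_CLASS = {
    "RTSTRUCT": "imgtools.Segmentation (converted from imgtools.StructureSet)",
    "SEG": "imgtools.Segmentation",
}


def mask_class_for(pair: ImageMaskModalities) -> str:
    """Return the name of the class a mask of this pair is loaded as."""
    _image_modality, mask_modality = pair.value
    return MASK_CLASS[mask_modality]


for pair in ImageMaskModalities:
    print(pair.name, "->", mask_class_for(pair))
```

Whatever the pair, the caller ends up with the same mask type, which is the point of the design note above.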

class ImageMaskInput(BaseInput):
    """
    Easily index and load scan-mask pairs.

    This class crawls through the specified directory to index the
    dataset, creates a graph of the dataset, and automatically queries
    the graph for image-mask pairs.

    Parameters
    ----------
    dir_path : pathlib.Path
        Path to the directory containing the dataset.
    modalities : ImageMaskModalities
        Modalities to be used for querying the graph.
    n_jobs : int, optional
        Number of jobs to use for crawling, by default -1.
    update_crawl : bool, optional
        Whether to force update the crawl, by default False.
    update_edges : bool, optional
        Whether to force update the edges, by default False.
    """

Key Features:

Streamlined Interface
Automatic Dataset Crawling
Built-in Dataset Graphing:
Automatically constructs a DataGraph and queries it for specific modalities (e.g., CT, MR, RTSTRUCT, SEG).
Versatile Modality Support using ImageMaskModalities Enum
- CT & RTSTRUCT
- MR & RTSTRUCT
- CT & SEG
- MR & SEG
Automatic loader configuration for the user
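
To make the crawl, graph, and query steps concrete, here is a toy sketch of the pairing idea. It is not the imgtools `DataGraph` implementation: `SeriesRecord`, `pair_images_with_masks`, and the `references` field are hypothetical, and the records stand in for what a crawl of DICOM headers might produce.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SeriesRecord:
    """One crawled series (hypothetical structure)."""

    patient_id: str
    series_uid: str
    modality: str
    references: Optional[str] = None  # series UID this mask was drawn on


def pair_images_with_masks(
    records: list[SeriesRecord], image_modality: str, mask_modality: str
) -> list[tuple[SeriesRecord, SeriesRecord]]:
    """Follow mask -> image reference edges and keep matching modality pairs."""
    by_uid = {r.series_uid: r for r in records}
    pairs = []
    for mask in records:
        if mask.modality != mask_modality or mask.references is None:
            continue
        image = by_uid.get(mask.references)
        if image is not None and image.modality == image_modality:
            pairs.append((image, mask))
    return pairs


records = [
    SeriesRecord("LUNG1-001", "ct-1", "CT"),
    SeriesRecord("LUNG1-001", "rt-1", "RTSTRUCT", references="ct-1"),
    SeriesRecord("LUNG1-001", "seg-1", "SEG", references="ct-1"),
]
ct_rtstruct = pair_images_with_masks(records, "CT", "RTSTRUCT")
print([(img.series_uid, msk.series_uid) for img, msk in ct_rtstruct])
```

Querying the same records with `("CT", "SEG")` would return the SEG pair instead, which mirrors how one dataset can back several modality combinations.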

Summary

Simplifies Image & Mask Dataset Loading: Eliminates boilerplate code for
parsing, indexing, and querying datasets.
Improves Workflow Efficiency: Automates graph-based querying and
loader setup.
Flexible & Extensible: Easily supports the 4 main Image-Mask modality combinations

Tutorial: Loading Images and Masks from DICOMs

from imgtools.ops.input_classes import ImageMaskInput, ImageMaskModalities
from imgtools.logging import logger
from pathlib import Path

Getting Started

You need the following at minimum:

Path to the directory containing the DICOM files
The Image and Mask modalities you want to use. The supported combinations (members of `ImageMaskModalities`) are:
- CT_RTSTRUCT
- MR_RTSTRUCT
- CT_SEG
- MR_SEG

For this tutorial we will download two datasets:

# Define the path to the data
testdata = Path("testdata")
# for this tutorial we will use some test image data
datasets_name = ["NSCLC-Radiomics", "Vestibular-Schwannoma-SEG"]

%%capture 
# download data using the imgtools cli
!imgtools testdata -a Vestibular-Schwannoma-SEG.tar.gz -a NSCLC-Radiomics.tar.gz {testdata.absolute()}

Setting up the loaders with `ImageMaskInput`

Vestibular-Schwannoma-SEG

has MR as scan and RTSTRUCT as mask:

Patient ID	Modality	Number of Series
VS-SEG-001	MR	2
VS-SEG-001	RTSTRUCT	2
VS-SEG-002	MR	2
VS-SEG-002	RTSTRUCT	2

vs_seg = ImageMaskInput(
  dir_path=testdata / datasets_name[1],
  modalities=ImageMaskModalities.MR_RTSTRUCT
)

print(vs_seg)

ImageMaskInput<
	num_cases=4,
	dataset_name='Vestibular-Schwannoma-SEG',
	modalities=<MR,RTSTRUCT>,
	output_streams=['MR', 'RTSTRUCT_MR'],
	series_col_names=['series_MR', 'series_RTSTRUCT_MR'],
>

NSCLC-Radiomics

has CT as scan and BOTH RTSTRUCT and SEG as masks.

Patient ID	Modality	Number of Series
LUNG1-001	CT	1
LUNG1-001	RTSTRUCT	1
LUNG1-001	SEG	1
LUNG1-002	CT	1
LUNG1-002	RTSTRUCT	1
LUNG1-002	SEG	1

CT & RTSTRUCT

nsclsc_rtstruct = ImageMaskInput(
    dir_path=testdata / datasets_name[0],
    modalities=ImageMaskModalities.CT_RTSTRUCT
)
print(nsclsc_rtstruct)

ImageMaskInput<
	num_cases=2,
	dataset_name='NSCLC-Radiomics',
	modalities=<CT,RTSTRUCT>,
	output_streams=['CT', 'RTSTRUCT_CT'],
	series_col_names=['series_CT', 'series_RTSTRUCT_CT'],
>

CT & SEG

nsclsc_seg = ImageMaskInput(
    dir_path=testdata / datasets_name[0],
    modalities=ImageMaskModalities.CT_SEG
)
print(nsclsc_seg)

ImageMaskInput<
	num_cases=2,
	dataset_name='NSCLC-Radiomics',
	modalities=<CT,SEG>,
	output_streams=['CT', 'SEG'],
	series_col_names=['series_CT', 'series_SEG'],
>

Using the Input Datasets

# List the case IDs (subject IDs)
print(nsclsc_rtstruct.keys())
# ['0_LUNG1-002', '1_LUNG1-001']

Load a case

StructureSets are automatically converted to `Segmentation` objects with all ROIs

# by case ID or index
image, mask = nsclsc_rtstruct['0_LUNG1-002']
print(mask)

mask=<Segmentation with ROIs: {'Esophagus': 1, 'GTV-1': 2, 'Heart': 3, 'Lung-Left': 4, 'Lung-Right': 5, 'Spinal-Cord': 6}>)
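
Indexing "by case ID or index" could work along these lines. This is a sketch, not the `ImageMaskInput.__getitem__` source: `CaseIndex` is a hypothetical stand-in that stores placeholder strings instead of loaded images, and the out-of-range message follows the wording suggested elsewhere in this thread.

```python
from typing import Union


class CaseIndex:
    """Toy container addressable by case ID (str) or position (int)."""

    def __init__(self, cases: dict[str, tuple[str, str]]) -> None:
        self._cases = cases

    def keys(self) -> list[str]:
        return list(self._cases)

    def __len__(self) -> int:
        return len(self._cases)

    def __getitem__(self, key: Union[str, int]) -> tuple[str, str]:
        if isinstance(key, int):
            if not 0 <= key < len(self):
                raise IndexError(
                    f"Index {key} out of range. Dataset has {len(self)} cases."
                )
            # Translate the position into the corresponding case ID.
            key = self.keys()[key]
        return self._cases[key]


cases = CaseIndex(
    {
        "0_LUNG1-002": ("ct-002", "rtstruct-002"),
        "1_LUNG1-001": ("ct-001", "rtstruct-001"),
    }
)
image, mask = cases["0_LUNG1-002"]  # lookup by case ID
same_image, same_mask = cases[0]  # lookup by position
print(image == same_image, mask == same_mask)
```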

Native `SEG` modalities, by contrast, are loaded directly as `Segmentation` objects

# by case ID or index
case_seg = nsclsc_seg[nsclsc_seg.keys()[0]]
image = case_seg.image
mask = case_seg.mask
mask

# (Segmentation.from_dicom is broken, but ignore for now)
mask=<Segmentation with ROIs: {'label_1': 1}>)
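
Both access styles used above (tuple unpacking and the `.image` / `.mask` attributes) are what a `typing.NamedTuple` provides, and the review comments in this thread describe `ImageMask` as one. A minimal sketch with the field types replaced by `str` placeholders (the real fields hold imgtools objects):

```python
from typing import NamedTuple


class ImageMask(NamedTuple):
    """Stand-in for the image-mask pair; field types are placeholders."""

    image: str
    mask: str


case = ImageMask(image="<Scan>", mask="<Segmentation>")

# Tuple-style unpacking, as in `image, mask = dataset[case_id]`
image, mask = case

# Attribute-style access, as in `case.image` and `case.mask`
print(image is case.image, mask is case.mask)
```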

MR & RTSTRUCT example

case_id = vs_seg.keys()[0]
case = vs_seg[case_id]
image = case.image
mask = case.mask

mask

mask=<Segmentation with ROIs: {'*Skull': 1, 'TV': 2, 'cochlea': 3}>)

Summary by CodeRabbit

New Features
- Introduced a new input-processing module for enhanced image and mask data management with integrated performance tracking.
Bug Fixes
- Corrected region boundary computations to ensure accurate detection and improved handling for optional size adjustments.
Refactor
- Optimized linting configuration to include more files for better code quality checks.
- Enhanced the structure and functionality of the input classes for improved modularity and readability.
- Updated the module's public interface for clarity and maintainability.
Tests
- Updated DICOM data handling tests for more robust and comprehensive image object processing.

… object usage

…orator documentation

coderabbitai · 2025-02-19T18:24:22Z

📝 Walkthrough

This pull request updates the linting configuration by modifying the config/ruff.toml file to include and exclude specific file paths. It corrects the RegionBox.from_mask_centroid method in the core types module to ensure distinct coordinate instances are generated, while also refining the expand_to_cube method. The ops module undergoes significant refactoring with the introduction of the ImageMaskInput class, a new timer decorator, and an enum for modalities. Additionally, unused imports are removed, and minor adjustments are made in utility and test modules.

Changes

File Path	Change Summary
`config/ruff.toml`	Updated linting configuration: uncommented lines and added inclusion for `src/imgtools/ops/*/.py`; adjusted exclusions for `src/imgtools/ops/old_output_ops.py`, `src/imgtools/coretypes/deprecated_bbox.py`, `src/imgtools/types.py`, and `src/imgtools/utils/crawl.py`; removed `extend-include` section.
`src/imgtools/coretypes/box.py`	Modified `RegionBox.from_mask_centroid` to generate distinct `Coordinate3D` instances and added a return type annotation; updated `expand_to_cube` to use the `desired_size` parameter more robustly.
`src/imgtools/ops/input_classes.py`	Introduced the `ImageMaskInput` class (extending `BaseInput`), added a `timer` decorator, introduced the `ImageMaskModalities` enum, and defined a type alias; refactored methods for dataset crawling and parsing.
`src/imgtools/ops/ops.py`	Removed the unused import of `map_over_labels` from `imgtools.modules` and added a new `__all__` declaration to clarify the public interface.
`src/imgtools/utils/imageutils.py`	Adjusted `idxs_to_physical_points` by removing outdated comments and revising the return statement to ensure the transformation is applied correctly to a list of indices.
`tests/legacy_tests/test_modalities.py`	Changed DICOM handling by removing `.image` attribute access, so that variables now directly hold the full DICOM objects returned by `read_dicom_auto`.
`src/imgtools/ops/__init__.py`	Added new imports for `ImageMask`, `ImageMaskInput`, and `ImageMaskModalities`; updated `__all__` list to include newly imported entities for public API clarity.
`src/imgtools/utils/timer.py`	Introduced a new `timer` function as a decorator to log execution time of functions.
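
The `from_mask_centroid` change is described as making the generated coordinates distinct instances. The actual `RegionBox` and `Coordinate3D` code is not shown in this thread, so the following is only a generic illustration of the aliasing pitfall, using a hypothetical mutable `Point3D`:

```python
from dataclasses import dataclass, replace


@dataclass
class Point3D:
    """Hypothetical mutable coordinate, standing in for Coordinate3D."""

    x: int
    y: int
    z: int


centroid = Point3D(10, 20, 30)

# Pitfall: both corners are the same object, so editing one edits the other.
aliased_min = aliased_max = centroid
aliased_max.x += 5
print(aliased_min.x)  # the "min" corner moved too

# Fix: build a separate instance for each corner.
centroid = Point3D(10, 20, 30)
box_min = replace(centroid)
box_max = replace(centroid)
box_max.x += 5
print(box_min.x, box_max.x)
```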

…nd remove unused imports

codecov · 2025-02-19T18:38:30Z

Codecov Report

Attention: Patch coverage is 40.25974% with 92 lines in your changes missing coverage. Please review.

Project coverage is 64.06%. Comparing base (d9a83d1) to head (7b61169).
Report is 1 commits behind head on development.

Files with missing lines	Patch %	Lines
src/imgtools/ops/input_classes.py	36.76%	86 Missing ⚠️
src/imgtools/utils/timer.py	57.14%	6 Missing ⚠️

Additional details and impacted files

@@               Coverage Diff               @@
##           development     #233      +/-   ##
===============================================
- Coverage        65.12%   64.06%   -1.06%     
===============================================
  Files               55       56       +1     
  Lines             3710     3857     +147     
===============================================
+ Hits              2416     2471      +55     
- Misses            1294     1386      +92

…h test image data

…geMaskInput

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (7)

src/imgtools/ops/input_classes.py (5)
5-10: Consider unifying your namedtuple usage for consistency.

Currently, both namedtuple from collections (line 5) and NamedTuple from typing (line 9) are imported. If you need stricter typing, consider switching fully to the typing.NamedTuple approach to enhance maintainability and type checking consistency.

30-53: Use time.perf_counter instead of time.time for more precise timing.

The current decorator relies on time.time(), which can be less precise on some platforms. For higher-resolution timing, consider using time.perf_counter():
- start_time = time.time()
...
- end_time = time.time()
+ start_time = time.perf_counter()
...
+ end_time = time.perf_counter()
55-63: Optional: Add docstrings for enum members.

While your ImageMaskModalities enum is straightforward, brief docstrings or comments explaining each modality combination could help new contributors quickly understand the possible modality pairs.

254-286: Namedtuple packaging is convenient, but ensure all modality cases are handled.

The match statement for "RTSTRUCT" and "SEG" is great for structural consistency. For more robust code, consider augmenting or extending the logic if other modalities will be added in the future.

520-538: Demonstration code is fine, but consider wrapping in a test or example module.

Inline demonstration within the __main__ guard is quick and illustrative. For larger codebases, grouping such usage examples into a dedicated example/test module (or a Jupyter notebook) can aid discoverability.
tests/legacy_tests/test_modalities.py (1)
30-30: Remove redundant self-assignment.

This line is effectively a no-op:
- dicom = dicom
Removing it avoids confusion for future maintainers.

🧰 Tools

🪛 Ruff (0.8.2)

30-30: Self-assignment of variable dicom

(PLW0127)
src/imgtools/utils/imageutils.py (1)

172-188: Vectorization can further optimize large index arrays.

Replacing the list comprehension (line 187) with a NumPy-vectorized approach (like your commented-out vectorized_transform) may boost performance when idxs is large. Current implementation is readable, but for massive data, consider reintroducing a vectorized method.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5516648 and b00c91f.

⛔ Files ignored due to path filters (2)

docs/usage/Input/ImageMask.ipynb is excluded by none and included by none
pixi.lock is excluded by !**/*.lock and included by none

📒 Files selected for processing (6)

config/ruff.toml (1 hunks)
src/imgtools/coretypes/box.py (1 hunks)
src/imgtools/ops/input_classes.py (2 hunks)
src/imgtools/ops/ops.py (1 hunks)
src/imgtools/utils/imageutils.py (2 hunks)
tests/legacy_tests/test_modalities.py (2 hunks)

🧰 Additional context used

📓 Path-based instructions (2)

src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following: - Variables and functions follow meaningful naming conventions. - Docstrings are present, accurate, and align with the implementation. - Code is efficient and avoids redundancy while adhering to DRY principles. - Consider suggestions to enhance readability and maintainability. - Highlight any potential performance issues, edge cases, or logical errors. - Ensure all imported libraries are used and necessary.

src/imgtools/utils/imageutils.py
src/imgtools/ops/input_classes.py
src/imgtools/coretypes/box.py
src/imgtools/ops/ops.py

tests/**/*: Review the test code written with Pytest. Confirm: - Tests cover all critical functionality and edge cases. - Test descriptions clearly describe their purpose. - Pytest best practices are followed, such as proper use of fixtures. - Ensure the tests are isolated and do not have external dependencies (e.g., network calls). - Verify meaningful assertions and avoidance of redundant tests. - Test code adheres to PEP 8 style guidelines.

tests/legacy_tests/test_modalities.py

⏰ Context from checks skipped due to timeout of 90000ms (13)

GitHub Check: Unit-Tests (windows-latest, py313)
GitHub Check: Unit-Tests (windows-latest, py312)
GitHub Check: Unit-Tests (windows-latest, py311)
GitHub Check: Unit-Tests (windows-latest, py310)
GitHub Check: Unit-Tests (macos-13, py313)
GitHub Check: Unit-Tests (macos-13, py312)
GitHub Check: Unit-Tests (macos-13, py311)
GitHub Check: Unit-Tests (macos-13, py310)
GitHub Check: Unit-Tests (macos-latest, py310)
GitHub Check: Unit-Tests (ubuntu-latest, py313)
GitHub Check: Unit-Tests (ubuntu-latest, py312)
GitHub Check: Unit-Tests (ubuntu-latest, py311)
GitHub Check: Unit-Tests (ubuntu-latest, py310)

🔇 Additional comments (7)

src/imgtools/ops/input_classes.py (5)

25-28: Looks good!

The introduction of BaseInput and the new LoaderFunction type alias is clear and aligns well with your SITK-based pipeline.

65-93: Dataclass design is clear and well-documented.

The class-level docstring effectively explains basic usage and critical parameters. No immediate concerns with naming or structure here.

113-185: Validate multidimensional modality assumptions.

In __call__, the code assumes self.modality_list[1] exists (line 260). If a new enum (with just one modality) or alternative usage arises, this could potentially raise an IndexError. Consider adding a safety check or clarifying in the docstring that exactly two modalities (e.g., “image” and “mask”) are expected for this class.

201-221: Crawler usage and log messages are cohesive.

The _crawl method properly respects the update_crawl flag, logs relevant details, and warns the user when indexing already exists. Great job on providing meaningful logs.

223-241: Confirm the validity of modality inputs.

The parse_graph method performs basic checks (str, list, enum), but it might be helpful to confirm validity of each modality token when the user passes a string or list. Invalid user input could produce unexpected partial matches.
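
The two validation concerns raised above (exactly two modalities, and each token being valid) could be handled by normalizing the input up front. A sketch under assumptions: `normalize_modalities` and `KNOWN_MODALITIES` are hypothetical names, comma-separated strings are assumed as the string form, and the local enum mirrors the values quoted elsewhere in this thread.

```python
from enum import Enum
from typing import Union

KNOWN_MODALITIES = {"CT", "MR", "RTSTRUCT", "SEG"}


class ImageMaskModalities(Enum):
    CT_RTSTRUCT = ("CT", "RTSTRUCT")
    CT_SEG = ("CT", "SEG")
    MR_RTSTRUCT = ("MR", "RTSTRUCT")
    MR_SEG = ("MR", "SEG")


def normalize_modalities(
    modalities: Union[str, list[str], ImageMaskModalities],
) -> tuple[str, str]:
    """Return (image_modality, mask_modality) or raise on invalid input."""
    if isinstance(modalities, ImageMaskModalities):
        return modalities.value
    if isinstance(modalities, str):
        tokens = [t.strip().upper() for t in modalities.split(",")]
    elif isinstance(modalities, list):
        tokens = [str(t).strip().upper() for t in modalities]
    else:
        raise TypeError(
            "Modalities must be a string, a list of strings, or "
            f"ImageMaskModalities, got {type(modalities)}"
        )
    # Reject unknown tokens before checking the count.
    unknown = [t for t in tokens if t not in KNOWN_MODALITIES]
    if unknown:
        raise ValueError(f"Unknown modalities: {unknown}")
    if len(tokens) != 2:
        raise ValueError(f"Expected exactly 2 modalities, got {len(tokens)}")
    return tokens[0], tokens[1]


print(normalize_modalities("ct, rtstruct"))
print(normalize_modalities(ImageMaskModalities.MR_SEG))
```

Failing early here keeps a later `modality_list[1]` style lookup from raising a less informative `IndexError`.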

tests/legacy_tests/test_modalities.py (1)

21-21: Direct usage of read_dicom_auto is fine.

Since you are directly storing the return object in img, there's no immediate red flag here. Make sure that you consistently handle the returned object shape across all tests.

config/ruff.toml (1)

14-14: Great addition to linting coverage!

Including input_classes.py in the linting process aligns with the project's goal of expanding linting coverage. This will help maintain code quality and consistency.

src/imgtools/coretypes/box.py

src/imgtools/ops/ops.py

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (5)

src/imgtools/ops/input_classes.py (5)
5-12: Consider organizing imports into logical groups.

For better readability, consider organizing imports into three groups:
- Standard library imports
- Third-party imports
- Local imports
-from collections import namedtuple
-from enum import Enum
-import pathlib
-import time
-from typing import Any, Generator, List, NamedTuple, Optional, Callable
-import SimpleITK as sitk
-from imgtools.modules import StructureSet, Segmentation, Scan
-from dataclasses import dataclass, field
-import pandas as pd

+# Standard library imports
+from collections import namedtuple
+from dataclasses import dataclass, field
+from enum import Enum
+import pathlib
+import time
+from typing import Any, Generator, List, NamedTuple, Optional, Callable
+
+# Third-party imports
+import pandas as pd
+import SimpleITK as sitk
+
+# Local imports
+from imgtools.modules import StructureSet, Segmentation, Scan
27-27: Add docstring to the LoaderFunction type alias.

Adding a docstring would help other developers understand the purpose and expected return types of the loader functions.
-LoaderFunction = Callable[..., sitk.Image | StructureSet | Segmentation]
+# Type alias for functions that load medical images or structures
+LoaderFunction = Callable[..., sitk.Image | StructureSet | Segmentation]
+"""Type hint for functions that load medical images, structure sets, or segmentations."""
55-63: Add documentation for each modality combination.

Consider adding docstrings to explain each modality combination and its typical use case.
 class ImageMaskModalities(Enum):
+    """Enum defining supported combinations of medical image modalities and their masks."""
+
     CT_RTSTRUCT = ("CT", "RTSTRUCT")
+    """CT scan with radiotherapy structure set."""
     CT_SEG = ("CT", "SEG")
+    """CT scan with segmentation mask."""
     MR_RTSTRUCT = ("MR", "RTSTRUCT")
+    """MR scan with radiotherapy structure set."""
     MR_SEG = ("MR", "SEG")
+    """MR scan with segmentation mask."""
420-422: Address TODO comment about utility classes.

The TODO comment indicates these classes need work. Consider:
- Documenting specific improvements needed
- Creating separate issues to track the work
- Adding examples of intended usage

Would you like me to help create detailed issues for tracking the improvements needed for these utility classes?

522-538: Enhance example usage with more comprehensive cases.

Consider improving the example section by:
- Using pathlib's resolve() for robust path handling
- Adding comments explaining the expected output
- Including error handling examples
-    testdata = pathlib.Path("data")
+    # Use resolved path to ensure proper path handling
+    testdata = pathlib.Path("data").resolve()
+
+    # Example: Loading and basic error handling
+    try:
+        vs_seg = ImageMaskInput(
+            dir_path=testdata / datasets_name[1],
+            modalities=ImageMaskModalities.MR_RTSTRUCT
+        )
+        # Example: Accessing the first case
+        first_case = vs_seg[0]
+        print(f"Successfully loaded case with shape: {first_case.scan.shape}")
+    except Exception as e:
+        print(f"Error loading dataset: {e}")

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b00c91f and e51901b.

📒 Files selected for processing (1)

src/imgtools/ops/input_classes.py (2 hunks)

🔇 Additional comments (1)

src/imgtools/ops/input_classes.py (1)

30-52: Well-implemented timer decorator!

The implementation is clean, well-documented, and follows best practices. The docstring clearly explains the purpose and usage.

src/imgtools/ops/input_classes.py

…askInput

… pairs

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (2)

src/imgtools/ops/input_classes.py (2)

131-191: 🛠️ Refactor suggestion

Consider breaking down __post_init__ into smaller methods.

The __post_init__ method is handling multiple responsibilities. Breaking it down would improve maintainability.

299-301: ⚠️ Potential issue

Fix f-string formatting in error message.

The error message is not properly formatted using f-strings.

🧹 Nitpick comments (3)

src/imgtools/ops/input_classes.py (3)
30-55: Consider adding specific return type hints for the wrapped function.

The timer decorator is well-implemented, but could benefit from more specific type hints. Instead of Any, consider using TypeVar to preserve the original function's return type.
-def timer(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
+from typing import TypeVar, ParamSpec
+
+T = TypeVar('T')
+P = ParamSpec('P')
+
+def timer(name: str) -> Callable[[Callable[P, T]], Callable[P, T]]:
73-81: Add docstring to the Enum class.

Consider adding a docstring to explain the purpose and usage of the ImageMaskModalities enum.
 class ImageMaskModalities(Enum):
+    """
+    Enum representing valid combinations of image and mask modalities.
+    
+    Each enum value is a tuple of (image_modality, mask_modality) where:
+    - image_modality can be 'CT' or 'MR'
+    - mask_modality can be 'RTSTRUCT' or 'SEG'
+    """
     CT_RTSTRUCT = ("CT", "RTSTRUCT")
541-544: Consider using environment variables or configuration files for data paths.

Hard-coded data paths in the main block could make the code less portable. Consider using environment variables or configuration files.
+from pathlib import Path
+from os import getenv
+
 if __name__ == "__main__":  # pragma: no cover
-    testdata = pathlib.Path("data")
+    testdata = Path(getenv("IMGTOOLS_DATA_DIR", "data"))

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e51901b and 625747a.

📒 Files selected for processing (1)

src/imgtools/ops/input_classes.py (2 hunks)

🔇 Additional comments (2)

src/imgtools/ops/input_classes.py (2)

5-28: Well-organized imports and type definitions!

The imports are logically grouped, and the LoaderFunction type alias enhances code readability by providing a clear type hint for loader functions.

57-71: Clean and well-documented NamedTuple implementation!

The ImageMask NamedTuple is well-documented with clear type hints and purpose.

…odule clarity

… and structure

…and exclude deprecated files

coderabbitai

Actionable comments posted: 0

♻️ Duplicate comments (1)

src/imgtools/ops/input_classes.py (1)

272-274: ⚠️ Potential issue

Fix f-string formatting in error message.

The two sentences in the error message run together; add the missing period after "out of range".

-                errmsg = (
-                    f"Index {key} out of range Dataset has {len(self)} cases."
-                )
+                errmsg = (
+                    f"Index {key} out of range. Dataset has {len(self)} cases."
+                )

🧹 Nitpick comments (4)

src/imgtools/ops/input_classes.py (4)

46-54: Add docstring to ImageMaskModalities enum.

While the implementation is good, adding a docstring would help users understand the available modality combinations and their use cases.

 class ImageMaskModalities(Enum):
+    """Enum for valid combinations of image and mask modalities.
+
+    Available combinations:
+    - CT_RTSTRUCT: CT scan with radiotherapy structure set
+    - CT_SEG: CT scan with segmentation
+    - MR_RTSTRUCT: MR scan with radiotherapy structure set
+    - MR_SEG: MR scan with segmentation
+    """
     CT_RTSTRUCT = ("CT", "RTSTRUCT")

114-114: Improve lambda function readability.

The lambda function could be more readable with a proper function name.

-        create_path = lambda f: self.dir_path.parent / self.imgtools_dir / f
+        def create_path(filename: str) -> pathlib.Path:
+            return self.dir_path.parent / self.imgtools_dir / filename

226-230: Improve error message in modalities type check.

The message is missing the f-string prefix, so {type(modalities)} is never interpolated, and the two adjacent literals join as "alist" because of a missing space.

-                errmsg = (
-                    "Modalities must be a string or a"
-                    "list of strings got {type(modalities)}"
-                )
+                errmsg = (
+                    f"Modalities must be a string or a list of strings, "
+                    f"got {type(modalities)}"
+                )
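
The difference between the two versions is easy to check in isolation. In this sketch the variable `modalities` is just an example value:

```python
modalities = 42  # an invalid value, for illustration

# Original: no f prefix, and no space between the two literals.
before = (
    "Modalities must be a string or a"
    "list of strings got {type(modalities)}"
)

# Suggested: f-strings with a trailing space on the first literal.
after = (
    f"Modalities must be a string or a list of strings, "
    f"got {type(modalities)}"
)

print(before)
print(after)
```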

513-529: Enhance example documentation.

The examples would be more helpful with additional comments explaining the expected outcomes and use cases.

     # Define the path to the data
     testdata = pathlib.Path("data")
-    # for this tutorial we will use some test image data
+    # Example usage with different datasets and modality combinations
     datasets_name = ["NSCLC-Radiomics", "Vestibular-Schwannoma-SEG"]
+
+    # Load MR images with RTSTRUCT masks from Vestibular-Schwannoma dataset
     vs_seg = ImageMaskInput(
         dir_path=testdata / datasets_name[1],
         modalities=ImageMaskModalities.MR_RTSTRUCT,
     )
 
+    # Load CT images with RTSTRUCT masks from NSCLC dataset
     nsclsc_rtstruct = ImageMaskInput(
         dir_path=testdata / datasets_name[0],
         modalities=ImageMaskModalities.CT_RTSTRUCT,
     )
+
+    # Load CT images with SEG masks from NSCLC dataset
     nsclsc_seg = ImageMaskInput(
         dir_path=testdata / datasets_name[0],
         modalities=ImageMaskModalities.CT_SEG,
     )

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 625747a and 7b61169.

📒 Files selected for processing (5)

config/ruff.toml (1 hunks)
src/imgtools/ops/__init__.py (2 hunks)
src/imgtools/ops/input_classes.py (4 hunks)
src/imgtools/ops/ops.py (3 hunks)
src/imgtools/utils/timer.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

src/imgtools/ops/ops.py
config/ruff.toml

🧰 Additional context used

📓 Path-based instructions (1)

src/imgtools/utils/timer.py
src/imgtools/ops/__init__.py
src/imgtools/ops/input_classes.py

⏰ Context from checks skipped due to timeout of 90000ms (10)

GitHub Check: Unit-Tests (windows-latest, py313)
GitHub Check: Unit-Tests (windows-latest, py312)
GitHub Check: Unit-Tests (windows-latest, py311)
GitHub Check: Unit-Tests (windows-latest, py310)
GitHub Check: Unit-Tests (macos-13, py313)
GitHub Check: Unit-Tests (macos-13, py312)
GitHub Check: Unit-Tests (macos-13, py311)
GitHub Check: Unit-Tests (macos-13, py310)
GitHub Check: Unit-Tests (macos-latest, py311)
GitHub Check: Unit-Tests (ubuntu-latest, py312)

🔇 Additional comments (3)

src/imgtools/utils/timer.py (1)

7-41: Well-structured timer decorator implementation!

The implementation follows best practices with proper type hints, clear documentation, and accurate time measurement.

src/imgtools/ops/__init__.py (1)

1-76: Well-organized module interface!

The imports are properly categorized and the `__all__` list clearly defines the public API.
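
As a reminder of what an `__all__` list controls, the sketch below builds a throwaway module in memory and runs a star import against it. The module name `demo_ops` and its functions are hypothetical and unrelated to the contents of `src/imgtools/ops/__init__.py`.

```python
import sys
import types

# Source of a hypothetical module with one exported and two unexported names.
source = '''
__all__ = ["load"]

def load():
    return "loaded"

def _helper():
    return "internal"

def undocumented():
    return "not exported"
'''

# Register the module so that a normal import statement can find it.
demo = types.ModuleType("demo_ops")
exec(source, demo.__dict__)
sys.modules["demo_ops"] = demo

# A star import only pulls in the names listed in __all__.
namespace: dict = {}
exec("from demo_ops import *", namespace)
exported = sorted(name for name in namespace if not name.startswith("__"))
print(exported)
```

Only `load` ends up in the importing namespace; `undocumented` stays reachable through `demo_ops.undocumented` but is not part of the star-import surface.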

src/imgtools/ops/input_classes.py (1)

30-44: Well-documented data structure!

The ImageMask NamedTuple provides a clear and type-safe way to store image-mask pairs.
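
A minimal sketch of the idea, assuming two fields named `scan` and `mask`. The class name `ImageMaskPair`, the field names, and the `str` field types are placeholders for this example; the actual `ImageMask` definition in the PR may differ.

```python
from typing import NamedTuple


class ImageMaskPair(NamedTuple):
    """Hypothetical stand-in for ImageMask; field names are assumed."""

    scan: str  # placeholder for an image object
    mask: str  # placeholder for a segmentation object


pair = ImageMaskPair(scan="CT volume", mask="GTV segmentation")

# Fields can be read by name or unpacked positionally like a plain tuple.
scan, mask = pair
print(pair.scan, "|", mask)
```

Named access keeps call sites readable, while tuple unpacking stays available for code that only needs the two values in order.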

JoshuaSiraj · 2025-02-21T16:01:09Z

src/imgtools/ops/input_classes.py

+        5. create output streams
+        """
+        self.dataset_name = self.dir_path.name
+        create_path = lambda f: self.dir_path.parent / self.imgtools_dir / f


gonna steal this lol

JoshuaSiraj

Noice

JoshuaSiraj · 2025-02-21T16:10:26Z

src/imgtools/ops/input_classes.py

+            raise ValueError(errmsg) from e
+        parsed_cols = self.parsed_df.columns.tolist()


NITPICK: Add a blank line for readability

Suggested change

-            raise ValueError(errmsg) from e
-        parsed_cols = self.parsed_df.columns.tolist()
+            raise ValueError(errmsg) from e
+
+        parsed_cols = self.parsed_df.columns.tolist()

jjjermiah added 4 commits February 19, 2025 18:12

fix: update image handling in test_modalities to ensure correct DICOM object usage

fd67fdf

chore: update pixi.lock to reflect dependency changes

a14296a

chore: ruff format

2203173

refactor: update input_classes to use BaseInput and improve timer decorator documentation

ed8985a

refactor: streamline input_classes to enhance BaseInput integration and remove unused imports

467c774

jjjermiah added 2 commits February 19, 2025 18:44

refactor: update main block in input_classes to demonstrate usage with test image data

fdd9179

docs: add tutorial for loading images and masks from DICOMs using ImageMaskInput

b00c91f

jjjermiah marked this pull request as ready for review February 19, 2025 18:52

coderabbitai bot reviewed Feb 19, 2025

src/imgtools/coretypes/box.py (resolved)

src/imgtools/ops/ops.py (resolved)

jjjermiah requested review from strixy16, skim2257 and JoshuaSiraj February 19, 2025 19:01

refactor: enhance docstring for ImageMaskInput to clarify functionality

e51901b

coderabbitai bot reviewed Feb 19, 2025

src/imgtools/ops/input_classes.py (outdated, resolved)

src/imgtools/ops/input_classes.py (outdated, resolved)

jjjermiah added 3 commits February 19, 2025 21:13

refactor: simplify lambda parameter name in ImageMaskInput for clarity

32ef726

refactor: improve docstrings and streamline code formatting in ImageMaskInput

e05e1fa

refactor: add ImageMask NamedTuple for better structure of image-mask pairs

625747a

coderabbitai bot reviewed Feb 19, 2025

jjjermiah added 4 commits February 19, 2025 21:35

refactor: move timer decorator to a new module for better organization

0da6e70

refactor: reorganize imports and enhance __all__ exports for better module clarity

91d4a81

refactor: update ImageMaskInput and ImageAutoInput to enhance clarity and structure

cc4a4a7

refactor: update ruff configuration to include all operation modules and exclude deprecated files

7b61169

coderabbitai bot reviewed Feb 19, 2025

JoshuaSiraj reviewed Feb 21, 2025

JoshuaSiraj approved these changes Feb 21, 2025
