Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: ImageMaskInput #233

Open
wants to merge 15 commits into
base: development
Choose a base branch
from
Open

Conversation

jjjermiah
Copy link
Contributor

@jjjermiah jjjermiah commented Feb 19, 2025

Work In Progress

mostly meets the requirements of Med-ImageNet, but could be useful as a public API for anyone

  • semantics (naming and methods) are all open to be changed

Prototype ImageMaskInput

  1. Crawl
  2. Build Graph
  3. possible inputs and the loaded result:

Important

by design, RTSTRUCTs are converted to imgtools.Segmentation returned with all ROIs
which matches the default design of SEG objects, and will allow for uniform filtering of ROINames on one data type in the future

Image Modality (imgtools.Scan) Mask Modality Mask Class
CT RTSTRUCT imgtools.Segmentation(converted from imgtools.StructureSet with all ROIs)
MR RTSTRUCT imgtools.Segmentation(converted from imgtools.StructureSet with all ROIs)
CT SEG imgtools.Segmentation
MR SEG imgtools.Segmentation
class ImageMaskInput(BaseInput):
    """
    Easily index and load scan-mask pairs.

    This class crawls through the specified 
	directory to index the dataset,
    creates a graph of the dataset, automatically
	querying the graph for image-mask pairs.

    Parameters
    ----------
    dir_path : pathlib.Path
        Path to the directory containing the dataset.
    modalities : ImageMaskModalities
        Modalities to be used for querying the graph.
    n_jobs : int, optional
        Number of jobs to use for crawling, by default -1.
    update_crawl : bool, optional
        Whether to force update the crawl, by default False.
    update_edges : bool, optional
        Whether to force update the edges, by default False.
	"""

Key Features:

  • Streamlined Interface
  • Automatic Dataset Crawling
  • Built-in Dataset Graphing:
    Automatically constructs a DataGraph and querying of specific modalities (e.g., CT, MR, RTSTRUCT, SEG).
  • Versatile Modality Support using ImageMaskModalities Enum
    • CT & RTSTRUCT
    • MR & RTSTRUCT
    • CT & SEG
    • MR & SEG
  • Automatically configure loader for user

Summary

  • Simplifies Image & Mask Dataset Loading: Eliminates boilerplate code for
    parsing, indexing, and querying datasets.
  • Improves Workflow Efficiency: Automates graph-based querying and
    loader setup.
  • Flexible & Extensible: Easily supports 4 main Image-Mask modalities

Tutorial: Loading Images and Masks from DICOMs

from imgtools.ops.input_classes import ImageMaskInput, ImageMaskModalities
from imgtools.logging import logger
from pathlib import Path

Getting Started

You need the following at minimum:

  • Path to the directory containing the DICOM files
  • Establish which Image and Mask modalities you want to use:
    Combinations are:

For this tutorial we will download two datasets:

# Define the path to the data
testdata = Path("testdata")
# for this tutorial we will use some test image data
datasets_name = ["NSCLC-Radiomics", "Vestibular-Schwannoma-SEG"]
%%capture 
# download data using the imgtools cli
!imgtools testdata -a Vestibular-Schwannoma-SEG.tar.gz -a NSCLC-Radiomics.tar.gz {testdata.absolute()}

Setting up the loaders with ImageMaskInput

Vestibular-Schwannoma-SEG

has MR as scan and RTSTRUCT as mask:

Patient ID Modality Number of Series
VS-SEG-001 MR 2
VS-SEG-001 RTSTRUCT 2
VS-SEG-002 MR 2
VS-SEG-002 RTSTRUCT 2
vs_seg = ImageMaskInput(
  dir_path=testdata / datasets_name[1],
  modalities=ImageMaskModalities.MR_RTSTRUCT
)

print(vs_seg)
ImageMaskInput<
	num_cases=4,
	dataset_name='Vestibular-Schwannoma-SEG',
	modalities=<MR,RTSTRUCT>,
	output_streams=['MR', 'RTSTRUCT_MR'],
	series_col_names=['series_MR', 'series_RTSTRUCT_MR'],
>

NSCLC-Radiomics

has CT as scan and BOTH RTSTRUCT and SEG as masks.

Patient ID Modality Number of Series
LUNG1-001 CT 1
LUNG1-001 RTSTRUCT 1
LUNG1-001 SEG 1
LUNG1-002 CT 1
LUNG1-002 RTSTRUCT 1
LUNG1-002 SEG 1

CT & RTSTRUCT

nsclsc_rtstruct = ImageMaskInput(
    dir_path=testdata / datasets_name[0],
    modalities=ImageMaskModalities.CT_RTSTRUCT
)
print(nsclsc_rtstruct)
ImageMaskInput<
	num_cases=2,
	dataset_name='NSCLC-Radiomics',
	modalities=<CT,RTSTRUCT>,
	output_streams=['CT', 'RTSTRUCT_CT'],
	series_col_names=['series_CT', 'series_RTSTRUCT_CT'],
>

CT & SEG

nsclsc_seg = ImageMaskInput(
    dir_path=testdata / datasets_name[0],
    modalities=ImageMaskModalities.CT_SEG
)
print(nsclsc_seg)
ImageMaskInput<
	num_cases=2,
	dataset_name='NSCLC-Radiomics',
	modalities=<CT,SEG>,
	output_streams=['CT', 'SEG'],
	series_col_names=['series_CT', 'series_SEG'],
>

Using the Input Datasets

# List the case IDs (subject IDs)
print(nsclsc_rtstruct.keys())
# ['0_LUNG1-002', '1_LUNG1-001']

Load a case

StructureSets are automatically converted to Segmentation object with all ROIs

# by case ID or index
image, mask = nsclsc_rtstruct['0_LUNG1-002']
print(mask)
mask=<Segmentation with ROIs: {'Esophagus': 1, 'GTV-1': 2, 'Heart': 3, 'Lung-Left': 4, 'Lung-Right': 5, 'Spinal-Cord': 6}>)

Whereas native SEG modalities get loaded as Segmentation objects

# by case ID or index
case_seg = nsclsc_seg[nsclsc_seg.keys()[0]]
image = case_seg.image
mask = case_seg.mask
mask
# (Segmentation.from_dicom is broken, but ignore for now)
mask=<Segmentation with ROIs: {'label_1': 1}>)

MR & RTSTRUCT example

case_id = vs_seg.keys()[0]
case = vs_seg[case_id]
image = case.image
mask = case.mask
mask
mask=<Segmentation with ROIs: {'*Skull': 1, 'TV': 2, 'cochlea': 3}>)

Summary by CodeRabbit

Summary by CodeRabbit

  • New Features

    • Introduced a new input-processing module for enhanced image and mask data management with integrated performance tracking.
  • Bug Fixes

    • Corrected region boundary computations to ensure accurate detection and improved handling for optional size adjustments.
  • Refactor

    • Optimized linting configuration to include more files for better code quality checks.
    • Enhanced the structure and functionality of the input classes for improved modularity and readability.
    • Updated the module's public interface for clarity and maintainability.
  • Tests

    • Updated DICOM data handling tests for more robust and comprehensive image object processing.

Copy link
Contributor

coderabbitai bot commented Feb 19, 2025

📝 Walkthrough

Walkthrough

This pull request updates the linting configuration by modifying the config/ruff.toml file to include and exclude specific file paths. It corrects the RegionBox.from_mask_centroid method in the core types module to ensure distinct coordinate instances are generated, while also refining the expand_to_cube method. The ops module undergoes significant refactoring with the introduction of the ImageMaskInput class, a new timer decorator, and an enum for modalities. Additionally, unused imports are removed, and minor adjustments are made in utility and test modules.

Changes

File Path Change Summary
config/ruff.toml Updated linting configuration: uncommented lines and added inclusion for src/imgtools/ops/**/*.py; adjusted exclusions for src/imgtools/ops/old_output_ops.py, src/imgtools/coretypes/deprecated_bbox.py, src/imgtools/types.py, and src/imgtools/utils/crawl.py; removed extend-include section.
src/imgtools/coretypes/box.py Modified RegionBox.from_mask_centroid to generate distinct Coordinate3D instances and added a return type annotation; updated expand_to_cube to use the desired_size parameter more robustly.
src/imgtools/ops/input_classes.py Introduced the ImageMaskInput class (extending BaseInput), added a timer decorator, introduced the ImageMaskModalities enum, and defined a type alias; refactored methods for dataset crawling and parsing.
src/imgtools/ops/ops.py Removed the unused import of map_over_labels from imgtools.modules and added a new __all__ declaration to clarify the public interface.
src/imgtools/utils/imageutils.py Adjusted idxs_to_physical_points by removing outdated comments and revising the return statement to ensure the transformation is applied correctly to a list of indices.
tests/legacy_tests/test_modalities.py Changed DICOM handling by removing .image attribute access, so that variables now directly hold the full DICOM objects returned by read_dicom_auto.
src/imgtools/ops/__init__.py Added new imports for ImageMask, ImageMaskInput, and ImageMaskModalities; updated __all__ list to include newly imported entities for public API clarity.
src/imgtools/utils/timer.py Introduced a new timer function as a decorator to log execution time of functions.
✨ Finishing Touches
  • 📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link

codecov bot commented Feb 19, 2025

Codecov Report

Attention: Patch coverage is 40.25974% with 92 lines in your changes missing coverage. Please review.

Project coverage is 64.06%. Comparing base (d9a83d1) to head (7b61169).
Report is 1 commits behind head on development.

Files with missing lines Patch % Lines
src/imgtools/ops/input_classes.py 36.76% 86 Missing ⚠️
src/imgtools/utils/timer.py 57.14% 6 Missing ⚠️
Additional details and impacted files
@@               Coverage Diff               @@
##           development     #233      +/-   ##
===============================================
- Coverage        65.12%   64.06%   -1.06%     
===============================================
  Files               55       56       +1     
  Lines             3710     3857     +147     
===============================================
+ Hits              2416     2471      +55     
- Misses            1294     1386      +92     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@jjjermiah jjjermiah marked this pull request as ready for review February 19, 2025 18:52
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (7)
src/imgtools/ops/input_classes.py (5)

5-10: Consider unifying your namedtuple usage for consistency.

Currently, both namedtuple from collections (line 5) and NamedTuple from typing (line 9) are imported. If you need stricter typing, consider switching fully to the typing.NamedTuple approach to enhance maintainability and type checking consistency.


30-53: Use time.perf_counter instead of time.time for more precise timing.

The current decorator relies on time.time(), which can be less precise on some platforms. For higher-resolution timing, consider using time.perf_counter():

- start_time = time.time()
...
- end_time = time.time()
+ start_time = time.perf_counter()
...
+ end_time = time.perf_counter()

55-63: Optional: Add docstrings for enum members.

While your ImageMaskModalities enum is straightforward, brief docstrings or comments explaining each modality combination could help new contributors quickly understand the possible modality pairs.


254-286: Namedtuple packaging is convenient, but ensure all modality cases are handled.

The match statement for "RTSTRUCT" and "SEG" is great for structural consistency. For more robust code, consider augmenting or extending the logic if other modalities will be added in the future.


520-538: Demonstration code is fine, but consider wrapping in a test or example module.

Inline demonstration within the __main__ guard is quick and illustrative. For larger codebases, grouping such usage examples into a dedicated example/test module (or a Jupyter notebook) can aid discoverability.

tests/legacy_tests/test_modalities.py (1)

30-30: Remove redundant self-assignment.

This line is effectively a no-op:

- dicom = dicom

Removing it avoids confusion for future maintainers.

🧰 Tools
🪛 Ruff (0.8.2)

30-30: Self-assignment of variable dicom

(PLW0127)

src/imgtools/utils/imageutils.py (1)

172-188: Vectorization can further optimize large index arrays.

Replacing the list comprehension (line 187) with a NumPy-vectorized approach (like your commented-out vectorized_transform) may boost performance when idxs is large. Current implementation is readable, but for massive data, consider reintroducing a vectorized method.

📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5516648 and b00c91f.

⛔ Files ignored due to path filters (2)
  • docs/usage/Input/ImageMask.ipynb is excluded by none and included by none
  • pixi.lock is excluded by !**/*.lock and included by none
📒 Files selected for processing (6)
  • config/ruff.toml (1 hunks)
  • src/imgtools/coretypes/box.py (1 hunks)
  • src/imgtools/ops/input_classes.py (2 hunks)
  • src/imgtools/ops/ops.py (1 hunks)
  • src/imgtools/utils/imageutils.py (2 hunks)
  • tests/legacy_tests/test_modalities.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
`src/**/*.py`: Review the Python code for compliance with PE...

src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following: - Variables and functions follow meaningful naming conventions. - Docstrings are present, accurate, and align with the implementation. - Code is efficient and avoids redundancy while adhering to DRY principles. - Consider suggestions to enhance readability and maintainability. - Highlight any potential performance issues, edge cases, or logical errors. - Ensure all imported libraries are used and necessary.

  • src/imgtools/utils/imageutils.py
  • src/imgtools/ops/input_classes.py
  • src/imgtools/coretypes/box.py
  • src/imgtools/ops/ops.py
`tests/**/*`: Review the test code written with Pytest. Conf...

tests/**/*: Review the test code written with Pytest. Confirm: - Tests cover all critical functionality and edge cases. - Test descriptions clearly describe their purpose. - Pytest best practices are followed, such as proper use of fixtures. - Ensure the tests are isolated and do not have external dependencies (e.g., network calls). - Verify meaningful assertions and avoidance of redundant tests. - Test code adheres to PEP 8 style guidelines.

  • tests/legacy_tests/test_modalities.py
🪛 Ruff (0.8.2)
tests/legacy_tests/test_modalities.py

30-30: Self-assignment of variable dicom

(PLW0127)

⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py313)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (7)
src/imgtools/ops/input_classes.py (5)

25-28: Looks good!

The introduction of BaseInput and the new LoaderFunction type alias is clear and aligns well with your SITK-based pipeline.


65-93: Dataclass design is clear and well-documented.

The class-level docstring effectively explains basic usage and critical parameters. No immediate concerns with naming or structure here.


113-185: Validate multidimensional modality assumptions.

In __call__, the code assumes self.modality_list[1] exists (line 260). If a new enum (with just one modality) or alternative usage arises, this could potentially raise an IndexError. Consider adding a safety check or clarifying in the docstring that exactly two modalities (e.g., “image” and “mask”) are expected for this class.


201-221: Crawler usage and log messages are cohesive.

The _crawl method properly respects the update_crawl flag, logs relevant details, and warns the user when indexing already exists. Great job on providing meaningful logs.


223-241: Confirm the validity of modality inputs.

The parse_graph method performs basic checks (str, list, enum), but it might be helpful to confirm validity of each modality token when the user passes a string or list. Invalid user input could produce unexpected partial matches.

tests/legacy_tests/test_modalities.py (1)

21-21: Direct usage of read_dicom_auto is fine.

Since you are directly storing the return object in img, there's no immediate red flag here. Make sure that you consistently handle the returned object shape across all tests.

config/ruff.toml (1)

14-14: Great addition to linting coverage!

Including input_classes.py in the linting process aligns with the project's goal of expanding linting coverage. This will help maintain code quality and consistency.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (5)
src/imgtools/ops/input_classes.py (5)

5-12: Consider organizing imports into logical groups.

For better readability, consider organizing imports into three groups:

  1. Standard library imports
  2. Third-party imports
  3. Local imports
-from collections import namedtuple
-from enum import Enum
-import pathlib
-import time
-from typing import Any, Generator, List, NamedTuple, Optional, Callable
-import SimpleITK as sitk
-from imgtools.modules import StructureSet, Segmentation, Scan
-from dataclasses import dataclass, field
-import pandas as pd

+# Standard library imports
+from collections import namedtuple
+from dataclasses import dataclass, field
+from enum import Enum
+import pathlib
+import time
+from typing import Any, Generator, List, NamedTuple, Optional, Callable
+
+# Third-party imports
+import pandas as pd
+import SimpleITK as sitk
+
+# Local imports
+from imgtools.modules import StructureSet, Segmentation, Scan

27-27: Add docstring to the LoaderFunction type alias.

Adding a docstring would help other developers understand the purpose and expected return types of the loader functions.

-LoaderFunction = Callable[..., sitk.Image | StructureSet | Segmentation]
+# Type alias for functions that load medical images or structures
+LoaderFunction = Callable[..., sitk.Image | StructureSet | Segmentation]
+"""Type hint for functions that load medical images, structure sets, or segmentations."""

55-63: Add documentation for each modality combination.

Consider adding docstrings to explain each modality combination and its typical use case.

 class ImageMaskModalities(Enum):
+    """Enum defining supported combinations of medical image modalities and their masks."""
+
     CT_RTSTRUCT = ("CT", "RTSTRUCT")
+    """CT scan with radiotherapy structure set."""
     CT_SEG = ("CT", "SEG")
+    """CT scan with segmentation mask."""
     MR_RTSTRUCT = ("MR", "RTSTRUCT")
+    """MR scan with radiotherapy structure set."""
     MR_SEG = ("MR", "SEG")
+    """MR scan with segmentation mask."""

420-422: Address TODO comment about utility classes.

The TODO comment indicates these classes need work. Consider:

  1. Documenting specific improvements needed
  2. Creating separate issues to track the work
  3. Adding examples of intended usage

Would you like me to help create detailed issues for tracking the improvements needed for these utility classes?


522-538: Enhance example usage with more comprehensive cases.

Consider improving the example section by:

  1. Using pathlib's resolve() for robust path handling
  2. Adding comments explaining the expected output
  3. Including error handling examples
-    testdata = pathlib.Path("data")
+    # Use resolved path to ensure proper path handling
+    testdata = pathlib.Path("data").resolve()
+
+    # Example: Loading and basic error handling
+    try:
+        vs_seg = ImageMaskInput(
+            dir_path=testdata / datasets_name[1],
+            modalities=ImageMaskModalities.MR_RTSTRUCT
+        )
+        # Example: Accessing the first case
+        first_case = vs_seg[0]
+        print(f"Successfully loaded case with shape: {first_case.scan.shape}")
+    except Exception as e:
+        print(f"Error loading dataset: {e}")
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b00c91f and e51901b.

📒 Files selected for processing (1)
  • src/imgtools/ops/input_classes.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`src/**/*.py`: Review the Python code for compliance with PE...

src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following: - Variables and functions follow meaningful naming conventions. - Docstrings are present, accurate, and align with the implementation. - Code is efficient and avoids redundancy while adhering to DRY principles. - Consider suggestions to enhance readability and maintainability. - Highlight any potential performance issues, edge cases, or logical errors. - Ensure all imported libraries are used and necessary.

  • src/imgtools/ops/input_classes.py
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py313)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (1)
src/imgtools/ops/input_classes.py (1)

30-52: Well-implemented timer decorator!

The implementation is clean, well-documented, and follows best practices. The docstring clearly explains the purpose and usage.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
src/imgtools/ops/input_classes.py (2)

131-191: 🛠️ Refactor suggestion

Consider breaking down post_init into smaller methods.

The __post_init__ method is handling multiple responsibilities. Breaking it down would improve maintainability.


299-301: ⚠️ Potential issue

Fix f-string formatting in error message.

The error message is not properly formatted using f-strings.

🧹 Nitpick comments (3)
src/imgtools/ops/input_classes.py (3)

30-55: Consider adding specific return type hints for the wrapped function.

The timer decorator is well-implemented, but could benefit from more specific type hints. Instead of Any, consider using TypeVar to preserve the original function's return type.

-def timer(name: str) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
+from typing import TypeVar, ParamSpec
+
+T = TypeVar('T')
+P = ParamSpec('P')
+
+def timer(name: str) -> Callable[[Callable[P, T]], Callable[P, T]]:

73-81: Add docstring to the Enum class.

Consider adding a docstring to explain the purpose and usage of the ImageMaskModalities enum.

 class ImageMaskModalities(Enum):
+    """
+    Enum representing valid combinations of image and mask modalities.
+    
+    Each enum value is a tuple of (image_modality, mask_modality) where:
+    - image_modality can be 'CT' or 'MR'
+    - mask_modality can be 'RTSTRUCT' or 'SEG'
+    """
     CT_RTSTRUCT = ("CT", "RTSTRUCT")

541-544: Consider using environment variables or configuration files for data paths.

Hard-coded data paths in the main block could make the code less portable. Consider using environment variables or configuration files.

+from pathlib import Path
+from os import getenv
+
 if __name__ == "__main__":  # pragma: no cover
-    testdata = pathlib.Path("data")
+    testdata = Path(getenv("IMGTOOLS_DATA_DIR", "data"))
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e51901b and 625747a.

📒 Files selected for processing (1)
  • src/imgtools/ops/input_classes.py (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`src/**/*.py`: Review the Python code for compliance with PE...

src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following: - Variables and functions follow meaningful naming conventions. - Docstrings are present, accurate, and align with the implementation. - Code is efficient and avoids redundancy while adhering to DRY principles. - Consider suggestions to enhance readability and maintainability. - Highlight any potential performance issues, edge cases, or logical errors. - Ensure all imported libraries are used and necessary.

  • src/imgtools/ops/input_classes.py
⏰ Context from checks skipped due to timeout of 90000ms (13)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py310)
  • GitHub Check: Unit-Tests (ubuntu-latest, py313)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
  • GitHub Check: Unit-Tests (ubuntu-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py310)
🔇 Additional comments (2)
src/imgtools/ops/input_classes.py (2)

5-28: Well-organized imports and type definitions!

The imports are logically grouped, and the LoaderFunction type alias enhances code readability by providing a clear type hint for loader functions.


57-71: Clean and well-documented NamedTuple implementation!

The ImageMask NamedTuple is well-documented with clear type hints and purpose.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
src/imgtools/ops/input_classes.py (1)

272-274: ⚠️ Potential issue

Fix f-string formatting in error message.

The error message is missing an f-string for the length value.

-                errmsg = (
-                    f"Index {key} out of range Dataset has {len(self)} cases."
-                )
+                errmsg = (
+                    f"Index {key} out of range. Dataset has {len(self)} cases."
+                )
🧹 Nitpick comments (4)
src/imgtools/ops/input_classes.py (4)

46-54: Add docstring to ImageMaskModalities enum.

While the implementation is good, adding a docstring would help users understand the available modality combinations and their use cases.

 class ImageMaskModalities(Enum):
+    """Enum for valid combinations of image and mask modalities.
+
+    Available combinations:
+    - CT_RTSTRUCT: CT scan with radiotherapy structure set
+    - CT_SEG: CT scan with segmentation
+    - MR_RTSTRUCT: MR scan with radiotherapy structure set
+    - MR_SEG: MR scan with segmentation
+    """
     CT_RTSTRUCT = ("CT", "RTSTRUCT")

114-114: Improve lambda function readability.

The lambda function could be more readable with a proper function name.

-        create_path = lambda f: self.dir_path.parent / self.imgtools_dir / f
+        def create_path(filename: str) -> pathlib.Path:
+            return self.dir_path.parent / self.imgtools_dir / filename

226-230: Improve error message in modalities type check.

The error message could be more descriptive and properly formatted.

-                errmsg = (
-                    "Modalities must be a string or a"
-                    "list of strings got {type(modalities)}"
-                )
+                errmsg = (
+                    f"Modalities must be a string or a list of strings, "
+                    f"got {type(modalities)}"
+                )

513-529: Enhance example documentation.

The examples would be more helpful with additional comments explaining the expected outcomes and use cases.

     # Define the path to the data
     testdata = pathlib.Path("data")
-    # for this tutorial we will use some test image data
+    # Example usage with different datasets and modality combinations
     datasets_name = ["NSCLC-Radiomics", "Vestibular-Schwannoma-SEG"]
+
+    # Load MR images with RTSTRUCT masks from Vestibular-Schwannoma dataset
     vs_seg = ImageMaskInput(
         dir_path=testdata / datasets_name[1],
         modalities=ImageMaskModalities.MR_RTSTRUCT,
     )
 
+    # Load CT images with RTSTRUCT masks from NSCLC dataset
     nsclsc_rtstruct = ImageMaskInput(
         dir_path=testdata / datasets_name[0],
         modalities=ImageMaskModalities.CT_RTSTRUCT,
     )
+
+    # Load CT images with SEG masks from NSCLC dataset
     nsclsc_seg = ImageMaskInput(
         dir_path=testdata / datasets_name[0],
         modalities=ImageMaskModalities.CT_SEG,
     )
📜 Review details

Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 625747a and 7b61169.

📒 Files selected for processing (5)
  • config/ruff.toml (1 hunks)
  • src/imgtools/ops/__init__.py (2 hunks)
  • src/imgtools/ops/input_classes.py (4 hunks)
  • src/imgtools/ops/ops.py (3 hunks)
  • src/imgtools/utils/timer.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/imgtools/ops/ops.py
  • config/ruff.toml
🧰 Additional context used
📓 Path-based instructions (1)
`src/**/*.py`: Review the Python code for compliance with PE...

src/**/*.py: Review the Python code for compliance with PEP 8 and PEP 257 (docstring conventions). Ensure the following: - Variables and functions follow meaningful naming conventions. - Docstrings are present, accurate, and align with the implementation. - Code is efficient and avoids redundancy while adhering to DRY principles. - Consider suggestions to enhance readability and maintainability. - Highlight any potential performance issues, edge cases, or logical errors. - Ensure all imported libraries are used and necessary.

  • src/imgtools/utils/timer.py
  • src/imgtools/ops/__init__.py
  • src/imgtools/ops/input_classes.py
⏰ Context from checks skipped due to timeout of 90000ms (10)
  • GitHub Check: Unit-Tests (windows-latest, py313)
  • GitHub Check: Unit-Tests (windows-latest, py312)
  • GitHub Check: Unit-Tests (windows-latest, py311)
  • GitHub Check: Unit-Tests (windows-latest, py310)
  • GitHub Check: Unit-Tests (macos-13, py313)
  • GitHub Check: Unit-Tests (macos-13, py312)
  • GitHub Check: Unit-Tests (macos-13, py311)
  • GitHub Check: Unit-Tests (macos-13, py310)
  • GitHub Check: Unit-Tests (macos-latest, py311)
  • GitHub Check: Unit-Tests (ubuntu-latest, py312)
🔇 Additional comments (3)
src/imgtools/utils/timer.py (1)

7-41: Well-structured timer decorator implementation!

The implementation follows best practices with proper type hints, clear documentation, and accurate time measurement.

src/imgtools/ops/__init__.py (1)

1-76: Well-organized module interface!

The imports are properly categorized and the all list clearly defines the public API.

src/imgtools/ops/input_classes.py (1)

30-44: Well-documented data structure!

The ImageMask NamedTuple provides a clear and type-safe way to store image-mask pairs.

5. create output streams
"""
self.dataset_name = self.dir_path.name
create_path = lambda f: self.dir_path.parent / self.imgtools_dir / f
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gonna steal this lol

Copy link
Collaborator

@JoshuaSiraj JoshuaSiraj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noice

Comment on lines +128 to +129
raise ValueError(errmsg) from e
parsed_cols = self.parsed_df.columns.tolist()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NITPICK: Add a space for readibility

Suggested change
raise ValueError(errmsg) from e
parsed_cols = self.parsed_df.columns.tolist()
raise ValueError(errmsg) from e
parsed_cols = self.parsed_df.columns.tolist()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants