Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitmask input array dimension check #454

Merged
merged 4 commits into from
Nov 7, 2023

Conversation

sergei-encord
Copy link
Collaborator

@sergei-encord sergei-encord commented Nov 6, 2023

Raise an exception with the clear error message if a provided bitmask is not a 2d array

Summary by CodeRabbit

Bug fix:

  • Added validation to ensure the provided bitmask is a 2-dimensional array in bitmask.py.
  • Improved error handling in BitmaskCoordinates class methods _from_array, from_array, and to_numpy_array in bitmask.py.
  • Updated an error message for better clarity in ontology_labels_impl.py.

🎉 Here's to the bugs we've squashed, 🐛

To the errors caught, and the crashes paused. 🚫💥

With each line of code, our software grows, 🌱

Like a poet finding rhythm in prose. 🖋️📜

So here's to the changes, big and small, 🔄

Making our code the best of all! 🏆💻

Copy link

github-actions bot commented Nov 6, 2023

Image description CodeRabbit

Walkthrough: The changes primarily focus on enhancing error handling and improving clarity in the code. They introduce checks for input validation in bitmask.py to ensure robustness, and modify an error message in ontology_labels_impl.py for better accuracy.

Changes:

Files Summary
encord/objects/bitmask.py Added checks to validate that the provided bitmask is a 2-dimensional array of boolean values and not None. Improved error handling in the to_numpy_array method by raising an exception if Numpy is not installed.
encord/objects/ontology_labels_impl.py Modified an error message to provide a clearer and more accurate description of the issue when bitmask dimensions don't match the media dimensions.

Notes

Chat with Image description CodeRabbit Bot (@coderabbitai)

  • Reply on review comments left by this bot to ask follow-up questions. A review comment is a comment on a diff or a file.
  • Invite the bot into a review comment chain by tagging @coderabbitai in a reply.

Code suggestions

  • The bot may make code suggestions, but please review them carefully before committing since the line number ranges may be misaligned.
  • You can edit the comment made by the bot and manually tweak the suggestion if it is slightly off.

Ignoring further reviews

  • Type @coderabbitai: ignore anywhere in the PR description to ignore further reviews from the bot.

Copy link

github-actions bot commented Nov 6, 2023

Unit test report

124 tests   113 ✔️  2s ⏱️
    1 suites    11 💤
    1 files        0

Results for commit 7557b7e.

♻️ This comment has been updated with latest results.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

Commits Files that changed from the base of the PR and between 93ce36d and 909ea9b commits.
Files selected (2)
  • encord/objects/bitmask.py (1)
  • encord/objects/ontology_labels_impl.py (1)
Files skipped from review due to trivial changes (1)
  • encord/objects/ontology_labels_impl.py
Review comments generated (1)
  • Review: 1
  • LGTM: 0

Comment on lines 136 to 163

@staticmethod
def _from_array(source: Any) -> BitmaskCoordinates.EncodedBitmask:
if source is not None:
if hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]

if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)
if source is not None and hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]

if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")

if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)

raw_data = data if isinstance(data, bytes) else source.tobytes()
raw_data = data if isinstance(data, bytes) else source.tobytes()

rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)

return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)
return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)

raise EncordException(f"Can't import bitmask from {source.__class__}")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The changes made in this hunk are logically sound and improve the error handling by adding a check for the dimensionality of the input array. The error messages are clear and provide useful information to the user about what is expected. The code is also more maintainable as it checks for potential issues before they can cause problems later on.

Suggested change
@staticmethod
def _from_array(source: Any) -> BitmaskCoordinates.EncodedBitmask:
if source is not None:
if hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]
if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)
if source is not None and hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]
if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")
if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)
raw_data = data if isinstance(data, bytes) else source.tobytes()
raw_data = data if isinstance(data, bytes) else source.tobytes()
rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)
return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)
raise EncordException(f"Can't import bitmask from {source.__class__}")
@staticmethod
def _from_array(source: Any) -> BitmaskCoordinates.EncodedBitmask:
if source is not None and hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]
if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")
if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)
raw_data = data if isinstance(data, bytes) else source.tobytes()
rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)
raise EncordException(f"Can't import bitmask from {source.__class__}")

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

Commits Files that changed from the base of the PR and between 909ea9b and 20459ed commits.
Files selected (1)
  • encord/objects/bitmask.py (1)
Review comments generated (6)
  • Review: 4
  • LGTM: 2

Comment on lines +139 to 141
if source is None:
raise EncordException("Bitmask can't be created from None")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The check for None is good, but it might be more Pythonic to use if source is None instead of if source == None. The is operator checks for identity, not equality, which is more appropriate when comparing with None.

Suggested change
if source is None:
raise EncordException("Bitmask can't be created from None")
if source is None:
raise EncordException("Bitmask can't be created from None")

Comment on lines +145 to +151
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]

rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The extraction and validation of the array interface are well done. However, the error message in line 151 could be more informative by including the actual dimensions of the input array.

Suggested change
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]
rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")
if len(shape) != 2:
raise EncordException(f"Bitmask should be a 2-dimensional array, but got {len(shape)}-dimensional array.")

Comment on lines +153 to 155
if data_type != "|b1":
raise EncordException("Bitmask should be an array of boolean values. For numpy array call .astype(bool).")

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The data type check and the corresponding error message are clear and helpful.

Comment on lines +156 to +161
raw_data = data if isinstance(data, bytes) else source.tobytes()

rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)

return BitmaskCoordinates.EncodedBitmask(top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The conversion of the data to bytes and the creation of the EncodedBitmask object are correctly implemented.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

Commits Files that changed from the base of the PR and between 20459ed and 7557b7e commits.
Files selected (1)
  • encord/objects/bitmask.py (2)
Review comments generated (3)
  • Review: 2
  • LGTM: 1

Comment on lines 136 to +156

@staticmethod
def _from_array(source: Any) -> BitmaskCoordinates.EncodedBitmask:
if source is not None:
if hasattr(source, "__array_interface__"):
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]
if source is None:
raise EncordException("Bitmask can't be created from None")

if data_type != "|b1":
raise EncordException(
"Bitmask should be an array of boolean values. For numpy array call .astype(bool)."
)
if not hasattr(source, "__array_interface__"):
raise EncordException(f"Can't import bitmask from {source.__class__}")

raw_data = data if isinstance(data, bytes) else source.tobytes()
arr = source.__array_interface__
data_type = arr["typestr"]
data = arr["data"]
shape = arr["shape"]

rle = _mask_to_rle(raw_data)
rle_string = _rle_to_string(rle)
if len(shape) != 2:
raise EncordException("Bitmask should be a 2-dimensional array.")

return BitmaskCoordinates.EncodedBitmask(
top=0, left=0, height=shape[0], width=shape[1], rle_string=rle_string
)
if data_type != "|b1":
raise EncordException("Bitmask should be an array of boolean values. For numpy array call .astype(bool).")

raise EncordException(f"Can't import bitmask from {source.__class__}")
raw_data = data if isinstance(data, bytes) else source.tobytes()
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The changes made in these lines are good for improving the robustness of the code. The developer has added checks to ensure that the source is not None, it's a 2-dimensional array, and it consists of boolean values only. If any of these conditions are not met, an exception is raised with a clear error message. This will help in debugging and maintaining the code.

Comment on lines 172 to 183
def to_numpy_array(self):
"""
Converts the mask to numpy array with dtype bool.
Converts the mask to a 2D numpy array with dtype bool.

Numpy needs to be installed for this call to work
Numpy needs to be installed for this call to work.
"""
try:
import numpy as np # type: ignore[missing-import]
except ImportError:
raise EncordException("Numpy is required for .to_numpy call.")
except ImportError as e:
raise EncordException("Numpy is required for .to_numpy_array call.") from e

return np.array(self)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image description CodeRabbit

The changes made in these lines are also beneficial. The developer has improved the error handling by raising an exception if Numpy is not installed. This makes the code more robust and easier to debug. The change in the docstring from ".to_numpy call" to ".to_numpy_array call" is also a good one as it accurately reflects the method name.

@sergei-encord sergei-encord merged commit 032df93 into master Nov 7, 2023
4 checks passed
@sergei-encord sergei-encord deleted the bitmask-input-array-dimension-check branch November 7, 2023 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant