
About Image Cropping #11

Open
NIUYIHAHA opened this issue Dec 8, 2024 · 5 comments

Comments

@NIUYIHAHA

NIUYIHAHA commented Dec 8, 2024

Hello dear author. While reading your article I noticed that you crop images to 512x512, and, looking at your dataset, the targets are small, so the foreground is rarely truncated. My foreground objects are larger and therefore get truncated: for example, a camel's head ends up in 1.jpg and its body in 2.jpg, even though both patches come from the same original image. How should I deal with this situation? (My native language is not English, and I have great respect for you; if anything reads as offensive or confusing, blame my translation software, haha.) I also have a second question, if it's convenient: if the targets in my images are slightly larger than those in your dataset, does this method still work?

@cwinkelmann

Hi @NIUYIHAHA,
I am working with this codebase, but I am not the author. I have a similar problem: the objects I am trying to count and detect are iguanas, which are very long but not wide. This leads to boxes that contain more background than foreground, and the centroid of the box is often not on the object itself. When slicing an image into patches, this produces sliced boxes that don't contain any of the actual object. My workaround for training is simply to draw a black patch over any box that intersects the edge of the crop.
For evaluation, HerdNet contains a stitcher in the infer.py script, so you can just run it on the original image.
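The black-patch workaround described above could be sketched as follows. This is a minimal illustration, not code from the codebase; `mask_edge_boxes` is a hypothetical helper, and box coordinates are assumed to be in patch-local pixels:

```python
import numpy as np

def mask_edge_boxes(patch, boxes):
    """Black out boxes that extend past the edge of a crop.

    patch: HxWxC uint8 array (the image crop).
    boxes: list of (x1, y1, x2, y2) in patch coordinates; boxes whose
    extent lies partly outside the patch are filled with black so that
    partially visible animals do not produce misleading training targets.
    """
    h, w = patch.shape[:2]
    for x1, y1, x2, y2 in boxes:
        # A box "intersects the edge" if part of it lies outside the patch.
        if x1 < 0 or y1 < 0 or x2 > w or y2 > h:
            # Clip the box to the patch and fill the visible part with black.
            cx1, cy1 = max(0, int(x1)), max(0, int(y1))
            cx2, cy2 = min(w, int(x2)), min(h, int(y2))
            patch[cy1:cy2, cx1:cx2] = 0
    return patch
```

Boxes fully inside the patch are left untouched; only truncated ones are erased.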

@Alexandre-Delplanque
Owner

Dear @NIUYIHAHA,

Thanks @cwinkelmann for your kind reply and sharing.

@NIUYIHAHA, to avoid such behavior, it is best to choose a patch size that can contain your target species in its entirety. Do not hesitate to use the patcher.py tool to cut your original images into patches. Note also that this tool has an argument (i.e. -min) that lets you define when to keep a bounding box annotation when cutting. For example, if you set this argument to 0.5, the box will be kept only if, after cutting, the patch contains at least 50% of the original box area.
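The -min rule described above can be illustrated with a small sketch. This is not the actual patcher.py implementation; `kept_after_crop` is a hypothetical helper working on (x1, y1, x2, y2) pixel coordinates:

```python
def kept_after_crop(box, patch, min_frac=0.5):
    """Return True if at least min_frac of the box's area survives the crop.

    box, patch: (x1, y1, x2, y2) rectangles in the same coordinate frame.
    """
    bx1, by1, bx2, by2 = box
    px1, py1, px2, py2 = patch
    # Intersection rectangle between the box and the patch.
    ix1, iy1 = max(bx1, px1), max(by1, py1)
    ix2, iy2 = min(bx2, px2), min(by2, py2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = (bx2 - bx1) * (by2 - by1)
    return inter / area >= min_frac
```

With min_frac=0.5, a box cut exactly in half is kept, while one losing more than half its area is dropped.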

Regarding overlap when cutting images into patches, I usually set overlap to at least the length of a representative animal instance. In this way, I ensure that each individual is found at least once in its entirety in the dataset.
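One way to picture this overlap rule (an illustrative sketch, not code from the repository): with a stride of patch_size minus overlap, any object shorter than the overlap appears whole in at least one patch along that axis.

```python
def patch_starts(length, patch_size, overlap):
    """Start offsets of patches along one image axis.

    Stride is patch_size - overlap, so adjacent patches share `overlap`
    pixels; a final patch flush with the image edge is added if needed.
    """
    stride = patch_size - overlap
    starts = list(range(0, max(length - patch_size, 0) + 1, stride))
    if starts[-1] + patch_size < length:
        starts.append(length - patch_size)
    return starts
```

For a 1000 px axis with 512 px patches and 100 px overlap, this yields starts at 0, 412 and 488, so any animal up to 100 px long is guaranteed to be fully inside one of the patches.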

Please note that if you're creating points from bounding boxes (i.e. box centers), it is likely that some points will fall outside the animal's body, particularly if the animal is in a crescent shape or if your images were acquired in oblique view. I recommend checking the created points and correcting their positions so that they point to the center of each animal's body. Otherwise, you could end up with several predicted points on each animal.

Hope this helps!

@NIUYIHAHA
Author

@Alexandre-Delplanque
I'm glad to receive your reply! Your guidance is very helpful. I have indeed encountered the issue of multiple center points being predicted for the same target. Does HerdNet tend to work best on small targets in high-resolution images? Since it uses a point to indicate the center of a target, a relatively large target has no single well-defined center point, which may introduce errors at the annotation stage and lead to multiple center points being predicted for the same target. My idea is therefore: 1. reduce the size of the targets to be recognized, making my images closer to the style in your paper (small targets in high-resolution images, rather than medium or large targets); 2. could the radius of FIDTM be increased? Is my approach reasonable?

@NIUYIHAHA
Author

NIUYIHAHA commented Jan 9, 2025

@cwinkelmann Thank you very much for your enthusiastic help! I appreciate it! I am very interested in discussing the training and application of HerdNet in different scenarios with you and the authors. May I ask, if it's convenient, what are your original image sizes and the sizes of the targets in your images? Have you encountered the issue of the center point being too small relative to the target? I noticed that the FIDT radius in the author's code is set to 2 px, while my targets are approximately elliptical, about 150x50 px.

@Alexandre-Delplanque
Owner

Hi @NIUYIHAHA,

Once properly trained, HerdNet is able to detect both large and small species. But I agree with you that if the animal is quite large, the positions of the annotated points may vary. One solution to avoid extracting multiple points from the localization map for the same individual would be to use a larger LMDS window (3x3 px by default). You could try multiple window sizes and see the impact on your metrics.
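The effect of a larger LMDS window can be pictured with a simplified local-maxima sketch (not the actual HerdNet LMDS code; the window and threshold here are placeholder parameters): a pixel is kept as a detection only if it dominates its whole window, so a wider window merges nearby peaks on the same animal into one point.

```python
import numpy as np

def local_maxima(loc_map, window=3, threshold=0.5):
    """Return (row, col) peaks of loc_map that dominate a window x window
    neighbourhood and exceed the threshold, in row-major order."""
    r = window // 2
    h, w = loc_map.shape
    peaks = []
    for i in range(h):
        for j in range(w):
            v = loc_map[i, j]
            if v <= threshold:
                continue
            # Window clipped at the map borders.
            nb = loc_map[max(0, i - r):i + r + 1, max(0, j - r):j + r + 1]
            if v >= nb.max():
                peaks.append((i, j))
    return peaks
```

Two peaks 2 px apart both survive a 3x3 window but collapse to a single detection with a 5x5 window, which mirrors the suggestion above.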

As for the FIDT radius parameter, you should keep it at 1 px, since the transformation adapts to the proximity between points. Besides, targets of 150x50 px should not be an issue.
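For intuition on why the radius can stay at 1 px, here is a simplified sketch of the FIDT idea (based on the FIDTM paper's formulation; the alpha, beta and C defaults are assumptions from that paper, and the repository's implementation may differ): the map value at each pixel depends on the distance to the nearest annotated point, so it adapts to point density without a hand-set radius.

```python
import numpy as np

def fidt_map(points, h, w, alpha=0.02, beta=0.75, C=1.0):
    """Inverse-distance map: 1 / (d^(alpha*d + beta) + C), where d is the
    distance from each pixel to its nearest annotated (row, col) point."""
    ys, xs = np.mgrid[0:h, 0:w]
    d = np.full((h, w), np.inf)
    for py, px in points:
        d = np.minimum(d, np.hypot(ys - py, xs - px))
    return 1.0 / (d ** (alpha * d + beta) + C)
```

The map peaks at 1.0 exactly on each annotated point and decays with distance, with the decay rate itself growing with distance.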

Hope this helps!
