@louis-she
Contributor

An image detection example using PyTorch Faster RCNN with Aim. Some screenshots:

image

image

image

@github-actions github-actions bot added the examples Examples label Mar 21, 2022
@louis-she louis-she changed the title add image detection example [WIP] add image detection example Mar 21, 2022
Collaborator

@vfdev-5 vfdev-5 left a comment

@louis-she thanks a lot for the PR!
This is a great start toward finally adding a detection example to Ignite!
I left some comments on how to improve parts of it.

Comment on lines 197 to 216
@visualizer.on(Events.ITERATION_COMPLETED)
def add_vis_images(engine):
    engine.state.model_outputs.append(engine.state.output)

@visualizer.on(Events.ITERATION_COMPLETED)
def submit_vis_images(engine):
    aim_images = []
    for outputs in engine.state.model_outputs:
        for image, target, pred in zip(outputs["x"], outputs["y"], outputs["y_pred"]):
            image = (image * 255).byte()
            pred_labels = [Dataset.class2name[l.item()] for l in pred["labels"]]
            pred_boxes = pred["boxes"].long()
            image = draw_bounding_boxes(image, pred_boxes, pred_labels, colors="red")

            target_labels = [Dataset.class2name[l.item()] for l in target["labels"]]
            target_boxes = target["boxes"].long()
            image = draw_bounding_boxes(image, target_boxes, target_labels, colors="green")

            aim_images.append(aim.Image(image.numpy().transpose((1, 2, 0))))
    aim_logger.experiment.track(aim_images, name="vis", step=trainer.state.epoch)
Collaborator

I'm not sure I understand why we need to accumulate and then submit on iterations.
Can't we just submit on each iteration without accumulating?

Contributor Author

We can submit on each iteration only if we set step to trainer.state.iteration; otherwise Aim will overwrite the previous images.

Gathering the visualization images over an epoch gives a good overview in Aim, I think, but it does use more memory.
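
To illustrate the trade-off discussed in this thread, here is a minimal sketch using a toy stand-in for a tracker. It is not Aim's API: `ToyTracker` and the three `submit_*` functions are hypothetical, and the sketch simply assumes that one value is kept per (name, step) pair, as the reply above describes.

```python
class ToyTracker:
    """Hypothetical stand-in for an experiment tracker (NOT Aim's API).

    Assumption: one value is stored per (name, step) pair, so tracking
    again with the same pair replaces the previous value.
    """

    def __init__(self):
        self.store = {}

    def track(self, value, name, step):
        self.store[(name, step)] = value


def submit_per_iteration(tracker, batches, epoch):
    # Every batch reuses the epoch as step, so only the last batch survives.
    for images in batches:
        tracker.track(list(images), name="vis", step=epoch)


def submit_with_iteration_step(tracker, batches, start_iteration=1):
    # A unique step per iteration keeps every batch without accumulating.
    for iteration, images in enumerate(batches, start=start_iteration):
        tracker.track(list(images), name="vis", step=iteration)


def submit_accumulated(tracker, batches, epoch):
    # Accumulate the whole epoch, then submit once (uses more memory).
    accumulated = [image for images in batches for image in images]
    tracker.track(accumulated, name="vis", step=epoch)


batches = [["img0", "img1"], ["img2", "img3"]]

per_iter = ToyTracker()
submit_per_iteration(per_iter, batches, epoch=1)

by_iteration = ToyTracker()
submit_with_iteration_step(by_iteration, batches)

per_epoch = ToyTracker()
submit_accumulated(per_epoch, batches, epoch=1)
```

Under that assumption, the first variant loses the first batch, the second keeps both batches under separate steps, and the third keeps all images under a single step at the cost of holding them in memory until the epoch ends.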

@@ -0,0 +1,252 @@
"""
Collaborator

Just a note for ourselves: we need to make progress on implementing mAP on our side.

Comment on lines 104 to 107
def synchronize_between_processes(self):
    for iou_type in self.iou_types:
        self.eval_imgs[iou_type] = np.concatenate(self.eval_imgs[iou_type], 2)
        create_common_coco_eval(self.coco_eval[iou_type], self.img_ids, self.eval_imgs[iou_type])
Collaborator

This method looks a bit odd: according to its name, it should synchronize between processes...

labels = targets["labels"].tolist()
areas = targets["area"].tolist()
iscrowd = targets["iscrowd"].tolist()
if "masks" in targets:
Contributor

Here, the implementation could be simplified, since neither masks nor keypoints are used.
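
As a rough illustration of that simplification, the sketch below builds COCO-style annotation dicts for boxes only, with no mask or keypoint branches. `to_coco_annotations` is a hypothetical helper, not code from the PR, and it assumes the target already holds plain Python lists (for example after calling `.tolist()` on each tensor) with boxes in `[x_min, y_min, x_max, y_max]` order.

```python
def to_coco_annotations(image_id, target, first_ann_id=1):
    """Build COCO-style annotation dicts from one detection target.

    Hypothetical helper. `target` is assumed to hold plain Python lists:
    "boxes" as [x_min, y_min, x_max, y_max], plus "labels", "area", "iscrowd".
    """
    annotations = []
    rows = zip(target["boxes"], target["labels"], target["area"], target["iscrowd"])
    for offset, (box, label, area, iscrowd) in enumerate(rows):
        x_min, y_min, x_max, y_max = box
        annotations.append(
            {
                "id": first_ann_id + offset,
                "image_id": image_id,
                # COCO stores boxes as [x, y, width, height].
                "bbox": [x_min, y_min, x_max - x_min, y_max - y_min],
                "category_id": label,
                "area": area,
                "iscrowd": iscrowd,
            }
        )
    return annotations


example_target = {
    "boxes": [[10, 20, 50, 80]],
    "labels": [3],
    "area": [2400],
    "iscrowd": [0],
}
example_annotations = to_coco_annotations(image_id=7, target=example_target)
```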

parser.add_argument("--image-size", type=int, default=512, help="image size to train")
parser.add_argument("--experiment-name", type=str, default="test", help="name of one experiment")
args = parser.parse_args()
run(
Contributor

idist could be used for DDP.

Contributor Author

I don't have two GPUs to test distributed training right now. I'll try to write the distributed code and maybe test it on AWS?

name2class = {v: k + 1 for k, v in enumerate(classes)}
class2name = {k + 1: v for k, v in enumerate(classes)}

def __getitem__(self, index: int) -> Tuple[Any, Any]:
Contributor

It seems that what is done here could be implemented as a transformation. It should be moved to another file IMO.

Contributor Author

Do you mean the conversion between class names and class numbers? If so, I think it's OK to keep it here, since it has nothing to do with the images.

@sayantan1410
Contributor

Also, the tests are failing because of a code-formatting error. Try running:
bash ./tests/run_code_style.sh install
bash ./tests/run_code_style.sh fmt

@louis-she louis-she changed the title [WIP] add image detection example add image detection example Mar 30, 2022
@louis-she louis-she requested review from sdesrozis and vfdev-5 March 30, 2022 05:10
Comment on lines 166 to 156
with torch.cuda.amp.autocast(enabled=True):
    loss_dict = model(images, targets)
    loss = sum(loss for loss in loss_dict.values())
Collaborator

This still uses cuda.amp.autocast. Can it work on CPU?
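
One possible way to address this, sketched under the assumption that mixed precision should simply be switched off when no GPU is available: use `torch.autocast` with the device type and an `enabled` flag. The linear model and random inputs are placeholders, not the detection model from the PR.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Assumption: mixed precision is only wanted on CUDA.
use_amp = device.type == "cuda"

model = nn.Linear(4, 2).to(device)
images = torch.rand(8, 4, device=device)

# torch.autocast takes the device type explicitly; with enabled=False the
# context manager does nothing, so the same code path also runs on CPU.
with torch.autocast(device_type=device.type, enabled=use_amp):
    loss = model(images).sum()

loss.backward()
```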

    engine.state.model_outputs.append(engine.state.output)

@visualizer.on(Events.ITERATION_COMPLETED)
def submit_vis_images(engine):
Contributor

Is this OK in distributed mode? I would have added a @one_rank_only() decorator.

Contributor Author

I haven't tested it on multiple GPUs yet. There is a @one_rank_only() guard on the trainer's epoch_end handler, so the visualizer engine will not be called on other ranks.

@louis-she
Contributor Author

@vfdev-5 @sdesrozis
Added RetinaNet support and distributed training; tested on 2 GPUs.

@louis-she
Contributor Author

louis-she commented Apr 6, 2022

One of the CI checks failed because codecov reports -4.27%. I think there is no need to write tests for an example, right? Should we add the example folder to the ignore list in codecov.yml?

@louis-she
Contributor Author

I see that there is a PR with an mAP implementation here, #2130, but it has not been updated for about half a year.

Can I open a new PR to implement mAP?

## Usage:

```bash
python main.py exp_name
Collaborator

It is a bit unclear what exp_name is here. We could do something like:

python main.py <exp_name>
# for example
# python main.py frcnn

local_rank: int,
device: str,
experiment_name: str,
gpus: Optional[Union[int, List[int], str]] = None,
Collaborator

Looks like gpus is unused?
