Issue with evaluation while training Mask R-CNN with Detectron2 in Colab #5132

Open

Neeha-Mandala opened this issue Oct 24, 2023 · 1 comment

Neeha-Mandala commented Oct 24, 2023

### Issue with evaluation while training Mask R-CNN with Detectron2 in Colab

I'm training a Mask R-CNN model on the COCO dataset with Detectron2 in Colab. During evaluation, inference completes, but execution stops right after:

"[10/24 08:08:00 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.47s)
creating index...
index created!
[10/24 08:08:01 d2.evaluation.fast_eval_api]: Evaluate annotation type bbox"

No errors are displayed, but the cell's run icon turns red and reports "cell execution was unsuccessful."

Code:
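(The notebook's import cell isn't shown in the issue; the cells below assume roughly the following imports. This is a reconstruction, so the original may differ slightly.)

# Reconstructed import cell (not part of the original report).
import os
import pickle

import cv2
import matplotlib.pyplot as plt
import torch

from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.data.datasets import register_coco_instances
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator
from detectron2.utils.visualizer import Visualizer, ColorMode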
1. CONFIG CELL:

def plot_samples(dataset_name):
    dataset_custom = DatasetCatalog.get(dataset_name)
    dataset_custom_metadata = MetadataCatalog.get(dataset_name)
    for s in dataset_custom:
        img = cv2.imread(s["file_name"])  # OpenCV loads BGR; reversed below for matplotlib's RGB
        v = Visualizer(img[:, :, ::-1], metadata=dataset_custom_metadata, scale=1.0, instance_mode=ColorMode.SEGMENTATION)
        v = v.draw_dataset_dict(s)
        plt.figure(figsize=(15, 20))
        plt.imshow(v.get_image())
        plt.show()
def get_train_cfg(config_file_path, checkpoint_url, train_dataset_name, test_dataset_name, num_classes, device, output_dir):
    cfg = get_cfg()
    cfg.merge_from_file(model_zoo.get_config_file(config_file_path))
    cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(checkpoint_url)
    cfg.DATASETS.TRAIN = (train_dataset_name,)
    cfg.DATASETS.TEST = (test_dataset_name,)
    cfg.DATALOADER.NUM_WORKERS = 2
    cfg.SOLVER.IMS_PER_BATCH = 2
    cfg.SOLVER.BASE_LR = 0.0005
    cfg.SOLVER.MAX_ITER = 41041
    cfg.SOLVER.STEPS = []
    cfg.MODEL.ROI_HEADS.NUM_CLASSES = num_classes
    cfg.MODEL.DEVICE = device
    cfg.OUTPUT_DIR = output_dir
    cfg.TEST.EVAL_PERIOD = 41041
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 16
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5
    return cfg

2. TRAINING CELL:

config_file_path = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
checkpoint_url = "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"
output_dir = "./coco_colab_output4/instance_segmentation"
num_classes = 80

if torch.cuda.is_available():
    device = "cuda"
    print('cuda')
else:
    device = "cpu"
    print('cpu')

train_dataset_name = "train_2014_f"
train_images_path = "/content/drive/MyDrive/Thesis/coco_dataset/train"
train_json_annot_path = "/content/drive/MyDrive/Thesis/coco_dataset/annotations/instances_train2014.json"

test_dataset_name = "test_2014_f"
test_images_path = "/content/drive/MyDrive/Thesis/coco_dataset/valid"
test_json_annot_path = "/content/drive/MyDrive/Thesis/coco_dataset/annotations/instances_val2014.json"

cfg_save_path = "Instance_seg_cfg.pickle"

if train_dataset_name not in DatasetCatalog.list():
    register_coco_instances(name=train_dataset_name, metadata={}, json_file=train_json_annot_path, image_root=train_images_path)
if test_dataset_name not in DatasetCatalog.list():
    register_coco_instances(name=test_dataset_name, metadata={}, json_file=test_json_annot_path, image_root=test_images_path)

class CocoTrainer(DefaultTrainer):
    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            os.makedirs("coco_eval2", exist_ok=True)
            output_folder = "coco_eval2"
        # Pass tasks explicitly: having the evaluator infer them from cfg
        # emits a deprecation warning.
        tasks = ("bbox", "segm")
        return COCOEvaluator(dataset_name, tasks, False, output_folder)

def main():
    cfg = get_train_cfg(config_file_path, checkpoint_url, train_dataset_name, test_dataset_name, num_classes, device, output_dir)
    with open(cfg_save_path, 'wb') as f:
        pickle.dump(cfg, f, protocol=pickle.HIGHEST_PROTOCOL)

    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = CocoTrainer(cfg)
    trainer.resume_or_load(resume=False)
    trainer.train()

if __name__ == '__main__':
    main()
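Since the cfg is pickled above, evaluation can also be re-run on its own once training has finished, without repeating the six-hour run. A minimal sketch, assuming the datasets registered above and detectron2's default final checkpoint name model_final.pth in cfg.OUTPUT_DIR:

# Sketch: re-run only the evaluation step from the saved config and checkpoint.
import os
import pickle
from detectron2.data import build_detection_test_loader
from detectron2.engine import DefaultPredictor
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

with open("Instance_seg_cfg.pickle", "rb") as f:
    cfg = pickle.load(f)
# DefaultTrainer writes the final weights as model_final.pth by default.
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")

predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("test_2014_f", ("bbox", "segm"), False, "coco_eval2")
loader = build_detection_test_loader(cfg, "test_2014_f")
print(inference_on_dataset(predictor.model, loader, evaluator))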

Results (log):

After training:

[10/24 05:22:15 d2.utils.events]: eta: 0:00:11 iter: 41019 total_loss: 0.8259 loss_cls: 0.21 loss_box_reg: 0.3218 loss_mask: 0.1955 loss_rpn_cls: 0.02254 loss_rpn_loc: 0.02922 time: 0.5350 last_time: 0.5728 data_time: 0.0071 last_data_time: 0.0071 lr: 0.0005 max_mem: 3189M
[10/24 05:22:26 d2.utils.events]: eta: 0:00:00 iter: 41039 total_loss: 1.158 loss_cls: 0.3318 loss_box_reg: 0.3927 loss_mask: 0.2567 loss_rpn_cls: 0.04291 loss_rpn_loc: 0.05291 time: 0.5350 last_time: 0.4741 data_time: 0.0077 last_data_time: 0.0012 lr: 0.0005 max_mem: 3189M
[10/24 05:22:33 d2.utils.events]: eta: 0:00:00 iter: 41040 total_loss: 1.158 loss_cls: 0.3318 loss_box_reg: 0.3927 loss_mask: 0.2615 loss_rpn_cls: 0.04291 loss_rpn_loc: 0.06052 time: 0.5350 last_time: 0.5285 data_time: 0.0081 last_data_time: 0.0158 lr: 0.0005 max_mem: 3189M
[10/24 05:22:33 d2.engine.hooks]: Overall training speed: 41039 iterations in 6:05:55 (0.5350 s / it)
[10/24 05:22:33 d2.engine.hooks]: Total training time: 6:06:47 (0:00:51 on hooks)
[10/24 05:22:43 d2.data.datasets.coco]: Loading /content/drive/MyDrive/Thesis/coco_dataset/annotations/instances_val2014.json takes 10.08 seconds.
WARNING [10/24 05:22:43 d2.data.datasets.coco]:
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

After evaluation (inference):

[10/24 08:07:35 d2.evaluation.evaluator]: Inference done 40442/40504. Dataloading: 0.0031 s/iter. Inference: 0.2349 s/iter. Eval: 0.0061 s/iter. Total: 0.2442 s/iter. ETA=0:00:15
[10/24 08:07:40 d2.evaluation.evaluator]: Inference done 40464/40504. Dataloading: 0.0031 s/iter. Inference: 0.2349 s/iter. Eval: 0.0061 s/iter. Total: 0.2442 s/iter. ETA=0:00:09
[10/24 08:07:45 d2.evaluation.evaluator]: Inference done 40483/40504. Dataloading: 0.0031 s/iter. Inference: 0.2349 s/iter. Eval: 0.0061 s/iter. Total: 0.2442 s/iter. ETA=0:00:05
[10/24 08:07:50 d2.evaluation.evaluator]: Total inference time: 2:44:50.702556 (0.244221 s / iter per device, on 1 devices)
[10/24 08:07:50 d2.evaluation.evaluator]: Total inference pure compute time: 2:38:31 (0.234859 s / iter per device, on 1 devices)
[10/24 08:07:55 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[10/24 08:07:56 d2.evaluation.coco_evaluation]: Saving results to coco_eval2/coco_instances_results.json
[10/24 08:08:00 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API...
Loading and preparing results...
DONE (t=0.47s)
creating index...
index created!
[10/24 08:08:01 d2.evaluation.fast_eval_api]: Evaluate annotation type bbox

Expected behavior:

It should instead produce a log of the following form:

index created!
[09/15 19:01:52 d2.evaluation.fast_eval_api]: Evaluate annotation type bbox
[09/15 19:01:52 d2.evaluation.fast_eval_api]: COCOeval_opt.evaluate() finished in 0.00 seconds.
[09/15 19:01:52 d2.evaluation.fast_eval_api]: Accumulating evaluation results...
[09/15 19:01:52 d2.evaluation.fast_eval_api]: COCOeval_opt.accumulate() finished in 0.00 seconds.
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 1.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000

That is, the log should continue in this format. I'm running this in Google Colab.

Programmer-RD-AI (Contributor) commented
Hi,
At the beginning, add:

from detectron2.utils.logger import setup_logger
setup_logger()

It should resolve the issue you had.
For further information, check #144 and the source code for detectron2.utils.logger.
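For example, placed at the very top of the notebook (a sketch; the output argument is optional and the path here is just the output directory from the cells above):

# Sketch: initialize detectron2's logger before any other detectron2 code runs.
# With output= set, logs are also written to a file under that directory,
# so they survive even if the Colab cell crashes mid-evaluation.
from detectron2.utils.logger import setup_logger
setup_logger(output="./coco_colab_output4/instance_segmentation")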

Best regards,
Ranuga
