Average Precision and Recall negative (-1.000) and No Prediction Results #413

Closed
jwnz opened this issue Apr 9, 2021 · 4 comments

@jwnz

jwnz commented Apr 9, 2021

I have two major issues.

  1. In the training process, the Average Recall and Precision for small and medium are both negative (-1).
  2. After training, regardless of the value of Average Precision (area=large), I am unable to produce a single bounding box. This happens even when I run inference on an image from the training set.

These problems seem to be common, as there are many related issues:
negative precision/recall

My input image sizes vary slightly, but the vast majority of the images are 1920x1080.

Below is an example of the validation output.
Note: I am using the same images for validation and training.

DONE (t=0.01s).
IoU metric: bbox
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.030
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.030
2021-04-09 11:56:55,228 train.py[line:447] INFO: Created checkpoint directory

Below is the command I'm using to train. I have tried several learning rates. It has no effect on the small and medium recall.

python train.py -l 0.25 -g 0 -pretrained ./pytorch-YOLOv4/yolov4.conv.137.pth -classes 1 -dir ../images -train_label_path ./data/simple.txt

The input for the bounding boxes is correct. Here is an example of my train.txt:

./data/368.jpg 272,391,1290,929,0
./data/8.jpg 322,114,1293,909,0
./data/737.jpg 681,46,1152,1079,0
...
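In COCO-style evaluation (pycocotools), a value of -1 is reported for an area range that contains no ground-truth boxes, so it marks "not applicable" rather than a negative score. The sketch below counts how many labelled boxes fall into each area range. It assumes the label format is `path x1,y1,x2,y2,class` with space-separated boxes, that areas are measured in original-image pixels, and that image paths contain no spaces; `parse_label_line` and `area_bucket` are hypothetical helpers, not part of the repository.

```python
from collections import Counter

# COCO area thresholds in square pixels (small < 32^2 <= medium < 96^2 <= large).
# Boundary handling is simplified compared with pycocotools.
SMALL_MAX = 32 ** 2
MEDIUM_MAX = 96 ** 2


def parse_label_line(line):
    """Split 'path x1,y1,x2,y2,cls [x1,y1,x2,y2,cls ...]' into (path, boxes)."""
    path, *box_fields = line.split()
    boxes = [tuple(int(v) for v in field.split(",")) for field in box_fields]
    return path, boxes


def area_bucket(box):
    """Return the COCO area range name for one (x1, y1, x2, y2, cls) box."""
    x1, y1, x2, y2, _cls = box
    area = (x2 - x1) * (y2 - y1)
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"


# The three sample lines quoted above.
lines = [
    "./data/368.jpg 272,391,1290,929,0",
    "./data/8.jpg 322,114,1293,909,0",
    "./data/737.jpg 681,46,1152,1079,0",
]

counts = Counter()
for line in lines:
    _path, boxes = parse_label_line(line)
    counts.update(area_bucket(b) for b in boxes)

print(dict(counts))
```

If every box lands in the `large` range, as the three sample lines do, then -1 for `small` and `medium` would be expected and does not by itself indicate a training problem.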

In the cfg.py file I have changed the variables as follows:

  • Cfg.batch=4
  • Cfg.subdivisions=1
  • Cfg.width=416
  • Cfg.height=416
  • Cfg.max_batches=2000 # I'm only training a single new class
  • Cfg.steps=1600,1800
  • Cfg.boxes=1 # only one box per image
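The `Cfg.steps` values above are consistent with the common Darknet convention of lowering the learning rate at 80% and 90% of `max_batches`. A minimal sketch of that arithmetic, where `lr_steps` is a hypothetical helper and not a function from the repository:

```python
def lr_steps(max_batches, percents=(80, 90)):
    """Return the iteration counts at the given percentages of max_batches."""
    # Integer arithmetic avoids floating-point rounding surprises.
    return [max_batches * p // 100 for p in percents]


steps = lr_steps(2000)
print(steps)
```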

I modified the yolov4-custom.cfg file to match the cfg.py file as well as made the changes mentioned in THIS link.
I changed the image_size variable in the Yolo_loss class to match the width and height (416) in the cfg file.
I created my own get_image_id function in the dataset.py file to return an integer representing a particular image.

Here is an example of my prediction output:

Loading weights from Yolov4_epoch3.pth... Done!
-----------------------------------
           Preprocess : 0.000637
      Model Inference : 0.818682
-----------------------------------
/???/???/???/pytorch-YOLOv4/tool/utils.py:199: RuntimeWarning: invalid value encountered in greater
  argwhere = max_conf[i] > conf_thresh
-----------------------------------
       max and argmax : 0.000091
                  nms : 0.000212
Post processing total : 0.000302
-----------------------------------
[]
-----------------------------------
           Preprocess : 0.000371
      Model Inference : 0.729752
-----------------------------------
-----------------------------------
       max and argmax : 0.000088
                  nms : 0.000046
Post processing total : 0.000135
-----------------------------------
[]
./data/368.jpg: Predicted in 0.732061 seconds.
save plot results to predictions.jpg

If I change the thresholds in the detect_cv2 function of demo.py from 0.4 and 0.6 to a negative number, I get bounding boxes, but here is a rough example of the results: [0.927, 0.927, 0.927, 0.927, 0, 0, 0]
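The `RuntimeWarning: invalid value encountered in greater` in the log is the kind of warning NumPy raises when a comparison involves NaN, which can happen when training diverges. The sketch below uses plain Python to show why such outputs yield an empty result: NaN never passes a `>` comparison, and a confidence of exactly 0 passes only when the threshold is negative. It assumes the trailing zeros in the example box are confidence values, which is not confirmed here; `boxes_above` and `count_non_finite` are hypothetical helpers.

```python
import math


def boxes_above(confidences, conf_thresh):
    """Return indices whose confidence is strictly above the threshold."""
    # Any comparison with NaN is False, so NaN entries are always dropped.
    return [i for i, c in enumerate(confidences) if c > conf_thresh]


def count_non_finite(values):
    """Count NaN and infinite entries, a quick health check on model output."""
    return sum(1 for v in values if not math.isfinite(v))


scores = [math.nan, 0.0, 0.9]  # illustrative values, not taken from the log

kept_default = boxes_above(scores, 0.4)    # only the 0.9 entry survives
kept_negative = boxes_above(scores, -1.0)  # the 0.0 entry now survives too
bad = count_non_finite(scores)

print(kept_default, kept_negative, bad)
```

Running a check like `count_non_finite` on the raw network output or on the checkpoint weights would show whether NaN values are present before any thresholding is applied.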

I have absolutely no clue where I'm going wrong here. Am I missing something?

@missFuture

missFuture commented Apr 10, 2021

I have the same problem.

My results, continuing training from the train weights of the last 100 epochs:
[image: train_preocess]

TensorBoard:
[image: tensorboard]

The loss has dropped from a high value to around 200.

AP and AR are almost zero:
[image: AP-AR]

However, when I test individual validation or test images, I get nearly correct predictions, like these:
[image: prediction]

[image: prediction2]

What have I done wrong with this repo?

I don't know how to solve this problem. Please help me.

@jwnz
Author

jwnz commented Apr 14, 2021

I believe the issue I'm having is related to the size of the training images. Due to the memory constraints of my GPU (CPU training just doesn't work on my machine), I was forced to train the model with an image size of 416x416. However, the item I was trying to detect (and therefore its bounding box) is significantly larger than 416 pixels in both width and height.

My hypothesis is that increasing the height and width may solve the two issues I mentioned in my original post above. Unfortunately, I don't have the resources to test this hypothesis.
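One way to probe this hypothesis without retraining is to compute how large a labelled box becomes at the network input size. The sketch below assumes the image and its labels are stretched together to 416x416 with no letterboxing, cropping, or mosaic augmentation, which has not been verified against this repository's dataset.py; `resize_box` is a hypothetical helper.

```python
def resize_box(box, src_size, dst_size):
    """Scale an (x1, y1, x2, y2) box from src (w, h) to dst (w, h)."""
    x1, y1, x2, y2 = box
    sx = dst_size[0] / src_size[0]
    sy = dst_size[1] / src_size[1]
    return (x1 * sx, y1 * sy, x2 * sx, y2 * sy)


# First sample label from the original post, on a 1920x1080 image.
box = (272, 391, 1290, 929)
rx1, ry1, rx2, ry2 = resize_box(box, (1920, 1080), (416, 416))

width = rx2 - rx1
height = ry2 - ry1
print(f"{width:.1f} x {height:.1f} px at 416x416")
```

Under that assumption the box shrinks along with the image, so it occupies the same fraction of the frame as before and fits inside the 416x416 input.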

I hope that this helps someone. However, I have managed to solve the issues I was having regarding this task by simply using another repository.

@BruceDai003

(In reply to @missFuture's comment above.)

I like your sesame-paste. What's your command to train?

@jwnz jwnz closed this as completed Apr 19, 2021
@tylister

(In reply to @missFuture's comment above.)

What's the format of your dataset?
