support OHD-SJTU dataset and object heading detection #704
base: dev-1.x
Conversation
Thanks for your contribution. This PR looks like a great start for head detection. Implementing head detection is a big job, so I think we should start by designing the overall framework. The first step is to decide how to represent a rotated box with a head, and there are several possible representations: for example, (x,y,w,h,a,x_head,y_head) as in this PR, (x,y,w,h,a) with a 360-degree angle range, or (x,y,w,h,a,axis) as in the OHD-SJTU paper. This part needs more discussion. I guess weird
Please refer to #720. Maybe you can support the OHD-SJTU dataset based on 360° detection.
But the (x,y)_head coordinates exist only in RotatedHeadBoxes, and they are transformed into a head quadrant (0/1/2/3) in the model head. There, I directly predict one head quadrant per object (just like classification) and use FocalLoss to compute the head loss between the predicted head quadrant and the target quadrant (derived from the x,y_head coordinates).
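The coordinate-to-quadrant step described above could be sketched as follows. This is only an illustration: the function name `head_quadrant` is hypothetical, and the quadrant numbering (image coordinates with y pointing down, counted from the right/down quadrant) is an assumption, not the convention used in the PR or in the OHD-SJTU annotations.

```python
def head_quadrant(cx, cy, x_head, y_head):
    """Map a head point to a quadrant index 0-3 relative to the box center.

    Assumed convention (NOT taken from the PR): image coordinates with the
    y axis pointing down, and
        0 = right/down, 1 = left/down, 2 = left/up, 3 = right/up.
    Points lying exactly on an axis fall into the quadrant with dx >= 0
    or dy >= 0.
    """
    dx = x_head - cx
    dy = y_head - cy
    if dx >= 0 and dy >= 0:
        return 0
    if dx < 0 and dy >= 0:
        return 1
    if dx < 0 and dy < 0:
        return 2
    return 3


# A head point to the right of and below the center (10, 10).
quadrant = head_quadrant(10.0, 10.0, 15.0, 12.0)
```

With a mapping like this, the head target becomes a 4-way class label, which is what allows the head branch to be trained like a classifier.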
I ran into a problem that I have tried many times to solve without success. What could be the reason for this problem? Thanks.
I think we need more details to find the problem.
Thanks for your contribution and we appreciate it a lot. The following instructions would make your pull request more healthy and more easily get feedback. If you do not understand some items, don't worry, just make the pull request and seek help from maintainers.
Motivation
Support OHD-SJTU dataset.
Support object heading detection.
Complete task8 and task9 in #626
Modification
Implement the OHD-SJTU dataset and create a new box type, "RotatedHeadBoxes", to support ground-truth box head quadrants.
Implement object heading detection by adding a head classification branch to rotated RetinaNet, supervised by FocalLoss, and change the code for prediction and loss calculation accordingly.
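For readers unfamiliar with how a 4-way head quadrant target is supervised by focal loss, here is a minimal pure-Python sketch. It assumes the sigmoid (one-vs-rest) formulation with the commonly used defaults alpha = 0.25 and gamma = 2.0; the function names are hypothetical and this is not the code from the PR.

```python
import math


def sigmoid_focal_loss(logit, target, alpha=0.25, gamma=2.0):
    """Focal loss for one binary logit: -alpha_t * (1 - p_t)**gamma * log(p_t)."""
    p = 1.0 / (1.0 + math.exp(-logit))
    # p_t is the probability assigned to the correct binary outcome.
    p_t = p if target == 1 else 1.0 - p
    alpha_t = alpha if target == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def head_loss(logits, gt_quadrant):
    """Sum the binary focal losses over the four quadrant logits."""
    return sum(
        sigmoid_focal_loss(z, 1 if k == gt_quadrant else 0)
        for k, z in enumerate(logits)
    )


# Undecided logits versus a confident, correct prediction for quadrant 0.
undecided = head_loss([0.0, 0.0, 0.0, 0.0], gt_quadrant=0)
confident = head_loss([5.0, -5.0, -5.0, -5.0], gt_quadrant=0)
```

The modulating factor (1 - p_t)**gamma shrinks the contribution of well-classified entries, so the loss value drops quickly once most entries are predicted with high confidence.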
Current performance (using the ohd-sjtuL dataset):
2) TP is the same as in classic mAP.
HEAD_TP: IoU > 0.5, and both the category and the head quadrant are predicted correctly.
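The two criteria above can be written as simple predicates. This is a sketch with hypothetical function names; it checks a single prediction against a single ground-truth box and leaves out the one-to-one matching between predictions and ground truths that a full mAP evaluation performs.

```python
def is_tp(iou, pred_label, gt_label, iou_thr=0.5):
    """Classic true positive: enough overlap and the right category."""
    return iou > iou_thr and pred_label == gt_label


def is_head_tp(iou, pred_label, gt_label, pred_quadrant, gt_quadrant,
               iou_thr=0.5):
    """HEAD_TP: a true positive whose head quadrant is also correct."""
    return is_tp(iou, pred_label, gt_label, iou_thr) and \
        pred_quadrant == gt_quadrant


# Same box and category, but the head quadrant is wrong in the second case.
ok = is_head_tp(0.7, 3, 3, pred_quadrant=1, gt_quadrant=1)
wrong_head = is_head_tp(0.7, 3, 3, pred_quadrant=2, gt_quadrant=1)
```

Because every HEAD_TP is also a TP, the head-aware metric can never exceed the classic one.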
Problem:
Current experimental results indicate that the model hardly learns to predict the box head quadrant. I think the large difference in magnitude between loss_head and loss_cls/loss_bbox may be the reason, as shown below:
Epoch(train) [1][ 50/2615] lr: 3.9880e-04 grad_norm: 1.3056 loss: 2.9622 loss_cls: 1.2014 loss_bbox: 1.1142 loss_head: 0.6467
Epoch(train) [1][ 100/2615] lr: 4.6560e-04 grad_norm: 1.0562 loss: 2.7534 loss_cls: 1.1452 loss_bbox: 0.9708 loss_head: 0.6374
Epoch(train) [1][ 150/2615] lr: 5.3240e-04 grad_norm: 1.3597 loss: 2.7922 loss_cls: 1.1481 loss_bbox: 1.0197 loss_head: 0.6244
Epoch(train) [1][ 200/2615] lr: 5.9920e-04 grad_norm: 2.0736 loss: 2.7875 loss_cls: 1.1531 loss_bbox: 1.0378 loss_head: 0.5966
Epoch(train) [1][ 250/2615] lr: 6.6600e-04 grad_norm: 6.9892 loss: 2.5320 loss_cls: 1.1415 loss_bbox: 1.0527 loss_head: 0.3378
Epoch(train) [1][ 300/2615] lr: 7.3280e-04 grad_norm: 7.8637 loss: 2.3590 loss_cls: 1.1344 loss_bbox: 1.1265 loss_head: 0.0981
Epoch(train) [1][ 350/2615] lr: 7.9960e-04 grad_norm: 6.0629 loss: 2.2941 loss_cls: 1.1327 loss_bbox: 1.0805 loss_head: 0.0809
Epoch(train) [1][ 400/2615] lr: 8.6640e-04 grad_norm: 8.0070 loss: 2.2093 loss_cls: 1.0657 loss_bbox: 1.0669 loss_head: 0.0767
Epoch(train) [1][ 450/2615] lr: 9.3320e-04 grad_norm: 10.7075 loss: 2.1032 loss_cls: 0.9992 loss_bbox: 1.0426 loss_head: 0.0615
........................................................................................
Epoch(train) [2][2000/2615] lr: 1.0000e-03 grad_norm: 6.9920 loss: 1.1298 loss_cls: 0.3704 loss_bbox: 0.7534 loss_head: 0.0060
Epoch(train) [2][2050/2615] lr: 1.0000e-03 grad_norm: 6.6444 loss: 1.1109 loss_cls: 0.3370 loss_bbox: 0.7677 loss_head: 0.0061
Epoch(train) [2][2100/2615] lr: 1.0000e-03 grad_norm: 7.8718 loss: 1.1643 loss_cls: 0.4220 loss_bbox: 0.7350 loss_head: 0.0073
Epoch(train) [2][2150/2615] lr: 1.0000e-03 grad_norm: 8.7219 loss: 1.2167 loss_cls: 0.5056 loss_bbox: 0.7037 loss_head: 0.0074
Epoch(train) [2][2200/2615] lr: 1.0000e-03 grad_norm: 7.6649 loss: 1.1461 loss_cls: 0.4118 loss_bbox: 0.7260 loss_head: 0.0083
The whole structure of task8 and task9 in #626 has been completed, but I don't know why loss_head decreases so fast that the model cannot learn anything about the box head, so I opened this PR to get some help.
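To quantify the magnitude gap in the log above, a small stdlib-only script can parse two of the logged lines and compute how much of the total loss comes from loss_head. The helper names `parse_fields` and `head_share` are hypothetical; the two log lines are copied verbatim from the training output.

```python
import re

# Matches "name: value" pairs such as "loss_head: 0.6467" or "lr: 3.9880e-04".
LOG_FIELD = re.compile(r"(loss\w*|grad_norm|lr):\s*([0-9.eE+-]+)")


def parse_fields(line):
    """Return the numeric fields of one training log line as a dict."""
    return {name: float(value) for name, value in LOG_FIELD.findall(line)}


def head_share(line):
    """Fraction of the total loss contributed by loss_head."""
    fields = parse_fields(line)
    return fields["loss_head"] / fields["loss"]


early = ("Epoch(train) [1][ 50/2615] lr: 3.9880e-04 grad_norm: 1.3056 "
         "loss: 2.9622 loss_cls: 1.2014 loss_bbox: 1.1142 loss_head: 0.6467")
late = ("Epoch(train) [2][2000/2615] lr: 1.0000e-03 grad_norm: 6.9920 "
        "loss: 1.1298 loss_cls: 0.3704 loss_bbox: 0.7534 loss_head: 0.0060")

early_share = head_share(early)
late_share = head_share(late)
print(f"early: {early_share:.2%}, late: {late_share:.2%}")
```

On these two lines the share of loss_head falls from roughly a fifth of the total loss to well under one percent, so by the second epoch the head branch contributes very little to the summed loss.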
BC-breaking (Optional)
Use cases (Optional)
If this PR introduces a new feature, it is better to list some use cases here, and update the documentation.
Checklist