Fix: incorrect num_classes used in RFDETR.train_from_config #356
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
TL;DR
The current framework handles the number of classes in a dataset incorrectly as discussed in the following issues:
#330
#51
In Detail:
The rf-detr framework calculates the number of classes as the number of classes in the dataset and reinitializes the head with this number in case it doesn't match the model's num_classes.
There are several issues with this:
First of all, the head should be of size
num_classes+1
to account for the fact that COCO datasets are 1 indexed. In the original DETR the head was even of size 92 because they add another +1 for the "no-object" class at the very end. This has been discussed in facebookresearch/detr#108. There they note that in theory this is not necessary to do this because the logit with index 0 could also act as the "no-object" class. This is exactly why RF-DETR's head has a size of 91, meaning the 0 index class acts as the "no-object" class. (As far as I understand this is no problem as since Deformable DETR they stopped using a special weight in the cross_entropy loss for the "no-object" class which was at index -1 for DETR.)So why does this not cause any problems for datasets downloaded from ROBOFLOW?
In roboflow datasets indexing starts at 0 and the 0th class is in fact no actual class it is the superclass of all other classes.
Here an example from the soccer players.v2 dataset:
If you inspect it on ROBOFLOW https://universe.roboflow.com/roboflow-100/soccer-players-5fuqs it indeed only has 3 classes and not 4 so the first class is really just a dummy class. This is why by accident the current implementation works as
len(anns["categories"]) = 4 = number_of_actual_classes+1
However, this will not work for any 1indexed dataset such as the original COCO dataset as it doesn't have this dummy class in the beginning.
Another issue is that
num_classes
should not be the number of classes in the dataset but rather themax_id
because there might be some "holes". This is in fact the case for the original COCO dataset. It actually has only 80 classes in total and not 90 which is it'smax_id
because for some reason some indices are missing.I can only suspect the original RF-DETR models haven't been trained with this pipeline because this would have caused the head to be reinitialized with 80 instead of 91 logits as this is the
len(anns["categories"])
in the case of COCO.The fix accounts for both issues and will still work with ROBOFLOW datasets. I set the num_classes to the
max_id
which is 3 in the above example. Then I reinitialize the head with sizenum_classes+1
to account for both the "no-object" logit in the 0th index and the fact that labels range from 1 tonum_classes
.Edit: I just noticed this is the same fix as proposed by #330
Type of change
Please delete options that are not relevant.
How has this change been tested, please provide a testcase or example of how you tested the change?
Here is how I veryfied the problem:
I downloaded the soccer players.v2 dataset from roboflow as mentioned above.
As the first category is only a supercategory there are only 3 actual categories.
Roboflow also states that there are 3 categories https://universe.roboflow.com/roboflow-100/soccer-players-5fuqs
When I train the model on the dataset as is:
It reinitializes the head to size 4 which is correct as 4=3+1=num_classes+1 but only because of the dummy class in the dataset.
Now, to test what happens if I have a 1indexed dataset I removed the dummy class. Now it reinitializes it to only 3 logits and I get an "index out of bounds" error as expected as the "referee" category with id 3 coresponds to the 3rd index which doesn't exist.
After the changes this is handeled correctly and the head get initialized with 4 logits and it prints the correct number of classes (3) in the reinitialization warning.