@timnielen timnielen commented Sep 12, 2025

Description

TL;DR
The current framework handles the number of classes in a dataset incorrectly, as discussed in the following issues:
#330
#51

In Detail:
RF-DETR takes the number of classes to be the number of categories in the dataset and, if this differs from the model's num_classes, reinitializes the head with that number.
There are two issues with this:

First, the head should have size num_classes+1 because COCO category ids are 1-indexed. The original DETR head even had size 92, because it adds another +1 for the "no-object" class at the very end. This was discussed in facebookresearch/detr#108, where they note that the extra logit is not necessary in theory because the logit at index 0 could also act as the "no-object" class. This is exactly why RF-DETR's head has size 91: the logit at index 0 acts as the "no-object" class. (As far as I understand, this is not a problem because, since Deformable DETR, the cross_entropy loss no longer uses a special weight for the "no-object" class, which sat at index -1 in DETR.)
So why does this not cause any problems for datasets downloaded from Roboflow?
In Roboflow datasets, indexing starts at 0, and category 0 is not an actual class; it is the supercategory of all other classes.
Here is an example from the soccer players.v2 dataset:

[
        {
            "id": 0,
            "name": "soccer-players",
            "supercategory": "none"
        },
        {
            "id": 1,
            "name": "football",
            "supercategory": "soccer-players"
        },
        {
            "id": 2,
            "name": "player",
            "supercategory": "soccer-players"
        },
        {
            "id": 3,
            "name": "referee",
            "supercategory": "soccer-players"
        }
]

If you inspect it on Roboflow (https://universe.roboflow.com/roboflow-100/soccer-players-5fuqs), it indeed has only 3 classes, not 4, so the first category is really just a dummy class. This is why the current implementation works by accident: len(anns["categories"]) = 4 = number_of_actual_classes+1.
However, this does not work for any 1-indexed dataset, such as the original COCO dataset, because such a dataset lacks this dummy class at the beginning.
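To make the coincidence concrete, here is a minimal sketch in plain Python. It is not RF-DETR's actual code: `count_based_head_size` and `labels_fit` are hypothetical helpers that only mimic sizing the head by len(anns["categories"]) and then checking whether the largest label is a valid logit index.

```python
# Hypothetical stand-in for sizing the head by the number of categories.
def count_based_head_size(categories):
    return len(categories)


def labels_fit(categories, head_size):
    # A head with `head_size` logits can only address labels 0..head_size-1.
    return max(c["id"] for c in categories) < head_size


# Categories of the Roboflow example above (id 0 is the dummy supercategory).
roboflow_categories = [
    {"id": 0, "name": "soccer-players", "supercategory": "none"},
    {"id": 1, "name": "football", "supercategory": "soccer-players"},
    {"id": 2, "name": "player", "supercategory": "soccer-players"},
    {"id": 3, "name": "referee", "supercategory": "soccer-players"},
]

# The same dataset without the dummy entry, i.e. a 1-indexed dataset.
one_indexed_categories = [c for c in roboflow_categories if c["id"] != 0]

roboflow_ok = labels_fit(
    roboflow_categories, count_based_head_size(roboflow_categories)
)
one_indexed_ok = labels_fit(
    one_indexed_categories, count_based_head_size(one_indexed_categories)
)
print(roboflow_ok, one_indexed_ok)  # True False
```

With the dummy entry, the count happens to equal max_id+1. Without it, label 3 no longer fits into a head sized by the count.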

Another issue is that num_classes should not be the number of classes in the dataset but rather the max_id, because the ids may have "holes". This is in fact the case for the original COCO dataset: it has only 80 classes in total, not 90, which is its max_id, because some ids are unused.
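The effect of such holes can be sketched with a toy id list. The ids below are made up for illustration and are not COCO's real ids; `summarize_ids` is a hypothetical helper.

```python
def summarize_ids(category_ids):
    """Return the category count, the largest id, and the unused ids in 1..max_id."""
    ids = sorted(set(category_ids))
    max_id = ids[-1]
    missing = [i for i in range(1, max_id + 1) if i not in ids]
    return {"count": len(ids), "max_id": max_id, "missing": missing}


# Toy ids with gaps (assumption: any 1-indexed dataset with unused ids behaves the same).
toy_ids = [1, 2, 4, 7]
summary = summarize_ids(toy_ids)
print(summary)  # {'count': 4, 'max_id': 7, 'missing': [3, 5, 6]}
```

A head sized by the count (4) could not address label 7, while a head of size max_id+1 (8) can.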

I can only suspect that the original RF-DETR models were not trained with this pipeline, because it would have reinitialized the head with 80 logits instead of 91, as len(anns["categories"]) is 80 for COCO.

The fix accounts for both issues and still works with Roboflow datasets. I set num_classes to the max_id, which is 3 in the example above. Then I reinitialize the head with size num_classes+1 to account for both the "no-object" logit at index 0 and the fact that labels range from 1 to num_classes.
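The sizing rule described here can be sketched as follows. This is an illustration of the rule only, not the code changed in this PR; `num_classes_and_head_size` is a hypothetical name, and the third id list is a stand-in whose largest id is 90, as in COCO.

```python
def num_classes_and_head_size(categories):
    # num_classes is the largest category id, not the number of categories.
    num_classes = max(c["id"] for c in categories)
    # One extra logit so that index 0 can serve as "no-object"
    # and labels 1..num_classes are all valid indices.
    head_size = num_classes + 1
    return num_classes, head_size


zero_indexed = [{"id": i} for i in (0, 1, 2, 3)]  # Roboflow-style, with dummy id 0
one_indexed = [{"id": i} for i in (1, 2, 3)]      # dummy class removed
gapped = [{"id": i} for i in (1, 2, 90)]          # stand-in with holes, max id 90

for name, cats in [
    ("zero_indexed", zero_indexed),
    ("one_indexed", one_indexed),
    ("gapped", gapped),
]:
    print(name, num_classes_and_head_size(cats))
```

All three cases get a head that is large enough for their largest label, whether or not the dummy category is present.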

Edit: I just noticed this is the same fix as proposed by #330

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How has this change been tested, please provide a testcase or example of how you tested the change?

Here is how I verified the problem:

I downloaded the soccer players.v2 dataset from Roboflow, as mentioned above.

As the first category is only a supercategory, there are only 3 actual categories.
Roboflow also states that there are 3 categories: https://universe.roboflow.com/roboflow-100/soccer-players-5fuqs

When I train the model on the dataset as is:

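from rfdetr import RFDETRNano

model = RFDETRNano()

# Train on the unmodified Roboflow export (category 0 is the dummy supercategory).
model.train(
    dataset_dir="soccer players.v2-release.coco"
)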

It reinitializes the head to size 4, which is correct because 4 = 3+1 = num_classes+1, but only because of the dummy class in the dataset.

Now, to test what happens with a 1-indexed dataset, I removed the dummy class. It then reinitializes the head to only 3 logits, and I get an "index out of bounds" error, as expected, because the "referee" category with id 3 corresponds to index 3, which doesn't exist.

After the changes, this is handled correctly: the head is initialized with 4 logits, and the reinitialization warning prints the correct number of classes (3).
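The failure and the fix can be reproduced without the model. The sketch below is a pure-Python cross-entropy over a list of logits, used only as a stand-in for the classification loss; it is not RF-DETR's loss, and `cross_entropy` is a hypothetical helper.

```python
import math


def cross_entropy(logits, label):
    """Negative log-likelihood of `label` under softmax(logits)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[label]


referee_id = 3  # largest label in the soccer players example

# Head sized by the category count of the 1-indexed dataset: 3 logits.
try:
    cross_entropy([0.0, 0.0, 0.0], referee_id)
    failed = False
except IndexError:
    failed = True

# Head sized by max_id + 1: 4 logits, so index 3 exists.
loss = cross_entropy([0.0, 0.0, 0.0, 0.0], referee_id)
print(failed, loss)
```

With 3 logits, label 3 is out of range. With 4 uniform logits, the loss is log(4).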


CLAassistant commented Sep 12, 2025

CLA assistant check
All committers have signed the CLA.

@timnielen
Author

I have read the CLA Document and I sign the CLA.
