A small set of tools to filter the TACO image set for computer vision waste recognition.
As highlighted by Roboflow, it is important to include images with a single annotation, as well as null images (images with no annotated objects), in datasets for computer vision models. This allows for more rapid development, easier debugging, and more accurate models. This repository provides tools for filtering the TACO dataset to obtain these types of images for training.
Imports annotations from an annotations.json file that follows the format of the TACO dataset (the COCO data format).
Takes the file path to annotations.json and returns a Python dictionary of the imported JSON contents.
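As a rough illustration of the import step, a minimal sketch might look like the following. The function name load_annotations is hypothetical and is not necessarily the name used in this repository.

```python
import json


def load_annotations(path):
    """Read a COCO-style annotations file and return its contents as a dict.

    `load_annotations` is a hypothetical name used for illustration only.
    """
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Example call (assumes annotations.json exists in the working directory):
# json_contents = load_annotations("annotations.json")
```

The returned dictionary is what the other descriptions in this document refer to as json_contents.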
Creates a set of the image IDs in the json_contents of annotations.json that have exactly one annotation.
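One way to find single-annotation images is to count annotations per image ID, sketched below. It assumes the COCO layout, where each entry in json_contents["annotations"] carries an "image_id" key; the function name single_annotation_ids is hypothetical.

```python
from collections import Counter


def single_annotation_ids(json_contents):
    """Return the set of image IDs that have exactly one annotation.

    Assumes COCO-style annotations, each with an "image_id" key.
    The function name is hypothetical.
    """
    counts = Counter(ann["image_id"] for ann in json_contents["annotations"])
    return {image_id for image_id, n in counts.items() if n == 1}
```

Images with no annotations at all never appear in the counter, so they are excluded along with images that have two or more annotations.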
Gets the links for a provided set or list of image IDs from the json_contents of the annotations.json file. Can be used for downloading images in batches or for confirming that only images with one piece of trash in them have been filtered.
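The link lookup could be sketched roughly as follows. This assumes each entry in json_contents["images"] has an "id" key and stores its download link under a "flickr_url" key; that key name is an assumption about the TACO file and is exposed as a parameter so it can be changed. The function name image_links is hypothetical.

```python
def image_links(id_set, json_contents, url_key="flickr_url"):
    """Map each requested image ID to its download link.

    `url_key` is an assumption about where the link is stored in each image
    entry; images lacking that key map to None. The function name is
    hypothetical.
    """
    wanted = set(id_set)  # accepts a set or a list of IDs
    return {
        img["id"]: img.get(url_key)
        for img in json_contents["images"]
        if img["id"] in wanted
    }
```

Returning a dictionary keyed by image ID keeps each link paired with its image, which makes it easier to spot-check individual images.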
Gets the proportions of each subcategory (type of trash) given a set of image IDs and the json_contents of annotations.json.
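A possible way to compute these proportions is sketched below. It assumes the COCO layout, where annotations reference a "category_id" and json_contents["categories"] maps each "id" to a "name"; treating the category "name" as the subcategory is an assumption. The function name category_proportions is hypothetical.

```python
from collections import Counter


def category_proportions(id_set, json_contents):
    """Return {category name: share of annotations} for the given image IDs.

    Assumes COCO-style "categories" (with "id" and "name") and "annotations"
    (with "image_id" and "category_id"). The function name is hypothetical.
    """
    wanted = set(id_set)
    names = {cat["id"]: cat["name"] for cat in json_contents["categories"]}
    counts = Counter(
        names[ann["category_id"]]
        for ann in json_contents["annotations"]
        if ann["image_id"] in wanted
    )
    total = sum(counts.values())
    if total == 0:
        return {}  # no matching annotations, so no proportions to report
    return {name: n / total for name, n in counts.items()}
```

The proportions are computed over annotations rather than images, so for a set of single-annotation images the two coincide.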
Creates a new Python dictionary following the scheme of annotations.json which contains only the images whose IDs are given in the id_set.
This can be used to build an annotations.json file with only single-annotation images.
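The filtering step might be sketched as below. It assumes the COCO layout with "images" and "annotations" lists, and it also filters a "scene_annotations" list when that key is present, which is an assumption about the TACO file. The function name filter_annotations is hypothetical.

```python
import json


def filter_annotations(id_set, json_contents):
    """Return a new dict in the annotations.json scheme limited to `id_set`.

    Top-level keys such as "info", "licenses", and "categories" are carried
    over unchanged. The function name is hypothetical.
    """
    wanted = set(id_set)
    filtered = dict(json_contents)  # shallow copy; the input is not modified
    filtered["images"] = [
        img for img in json_contents["images"] if img["id"] in wanted
    ]
    filtered["annotations"] = [
        ann for ann in json_contents["annotations"] if ann["image_id"] in wanted
    ]
    # Assumption: TACO-style files may also carry per-image scene annotations.
    if "scene_annotations" in json_contents:
        filtered["scene_annotations"] = [
            s for s in json_contents["scene_annotations"]
            if s.get("image_id") in wanted
        ]
    return filtered


def save_annotations(json_contents, path):
    """Write the dictionary back out as a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(json_contents, f)
```

Writing the result with save_annotations (also a hypothetical helper) to a new path leaves the original annotations.json untouched.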
The TACO dataset does not currently provide any null images. One possible addition to this repository would be functions that artificially remove sections of images or zoom into areas without trash to create such null images.
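As a starting point for the zoom approach, a candidate crop region could be checked against an image's bounding boxes before cropping. The sketch below assumes COCO-style "bbox" values of the form [x, y, width, height] in pixels; both function names are hypothetical, and the actual cropping of image files is left out.

```python
def boxes_overlap(a, b):
    """Return True if two [x, y, width, height] boxes share any area."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def crop_is_trash_free(crop, image_id, json_contents):
    """Return True if `crop` overlaps no annotated bbox of the given image.

    Assumes every annotation has a COCO-style "bbox". Hypothetical helper.
    """
    return not any(
        boxes_overlap(crop, ann["bbox"])
        for ann in json_contents["annotations"]
        if ann["image_id"] == image_id
    )
```

A crop that passes this check contains no annotated trash, although it may still contain unannotated trash, so generated null images would need a manual review.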
Modifying the download scripts from the TACO dataset to automatically build batches from the updated annotations.json file.
Made for ZotBins