# Generate instance prototypes for base category of OVD COCO
python3 ./tools/extract_instance_prototypes.py --dataset fs_coco17_train_base --model vits14
# will produce a file `fs_coco17_train_base.vits14.pkl`
# Generate instance prototypes for novel category of OVD COCO
python3 ./tools/extract_instance_prototypes.py --dataset fs_coco17_support_novel_30shot --model vits14 --epochs 60
# will produce a file `fs_coco17_support_novel_30shot.vits14.pkl`
# back then, augmentation is used, so the `epoch` parameter exists. But we find augmentation is not a critical element.
# So we remove it. But in order to keep consistency, we still add some epochs.
# Generate class prototypes through clustering
python3 ./tools/run_sinkhorn_cluster.py --inp ./fs_coco17_train_base.vits14.pkl --epochs 10 --momentum 0.002 --num_prototypes 10
# will produce a file `fs_coco17_train_base.vits14.p10.sk.pkl`
python3 ./tools/run_sinkhorn_cluster.py --inp ./fs_coco17_support_novel_30shot.vits14.pkl --epochs 30 --momentum 0.002 --num_prototypes 10
# will produce a file `fs_coco17_support_novel_30shot.vits14.p10.sk.pkl`Then, set DE.CLASS_PROTOTYPES to fs_coco17_train_base.vits14.p10.sk.pkl,fs_coco17_support_novel_30shot.vits14.p10.sk.pkl to use the above generated prototypes. The stuff classes of coco_2017_train_panoptic_stuffonly dataset are used to extract background prototypes with the same procedure.
A list of base / novel dataset pairs:
-
LVIS
lvis_v1_known_train / lvis_v1_novel_train(note thatlvis_v1_trainare used in config because they will be splited during loading, check <detectron2/data/datasets/lvis.py>. -
Few-Shot COCO 14
fs_coco14_base_train / fs_coco_trainval_novel_10shot. -
OVD COCO 17
coco_2017_ovd_b_train / fs_coco17_support_novel_30shot.fs_coco17_train_baseis an alias forcoco_2017_ovd_b_train. -
One-Shot COCO split 1
coco_2017_train_oneshot_s1 / coco_2017_novel_oneshot_s1_r50(r50 means reservoir 50 samples, we randomly pick 50 samples for eval during training)
Since our model relies on ViT and RPN, we need to combine them into a single checkpoint. All initial models with names like vits+rpn.pth are pre-combined. The tool used is tools/combine_vit_rpn_weights.py.