# Deep Metric Learning Research in PyTorch

---
## What can I find here?

This repository contains all code and implementations used in:

```
Revisiting Training Strategies and Generalization Performance in Deep Metric Learning
```

**Link**: https://arxiv.org/abs/2002.08473

The code is meant to serve as a research starting point in Deep Metric Learning. By implementing key baselines under a consistent setting and logging a vast set of metrics, it should be easier to verify that method gains are not due to implementation differences, and to better understand their driving factors.

It is set up in a modular way to allow for fast and detailed prototyping, with key elements written so that they can be copied directly into other pipelines. In addition, multiple training and test metrics are logged in W&B to allow for easy and large-scale evaluation.

Finally, a public W&B repo with key runs from the paper is available here: https://app.wandb.ai/confusezius/RevisitDML.

**Contact**: Karsten Roth, [email protected]

*Suggestions are always welcome!*

---
## Some Notes:

If you use this code in your research, please cite
```
@misc{roth2020revisiting,
    title={Revisiting Training Strategies and Generalization Performance in Deep Metric Learning},
    author={Karsten Roth and Timo Milbich and Samarth Sinha and Prateek Gupta and Björn Ommer and Joseph Paul Cohen},
    year={2020},
    eprint={2002.08473},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```

This repository contains (in parts) code that has been adapted from:
* https://github.com/idstcv/SoftTriple
* https://github.com/bnu-wangxun/Deep_Metric
* https://github.com/valerystrizh/pytorch-histogram-loss
* https://github.com/Confusezius/Deep-Metric-Learning-Baselines

Make sure to also check out the following repo with a great plug-and-play implementation of DML methods:
* https://github.com/KevinMusgrave/pytorch-metric-learning

---

**All implemented methods and metrics are listed at the bottom!**

---

## Paper-related Information

#### Reproducing results from our paper: **Revisiting Training Strategies and Generalization Performance in Deep Metric Learning**

* *All* standardized runs used in the paper are available in `Revisit_Runs.sh`.
* These runs are also logged in this public W&B repo: https://app.wandb.ai/confusezius/RevisitDML.
* All runs and their respective metrics can be downloaded and evaluated to generate the plots in our paper by following `Result_Evaluations.py`. This also allows for introspection of other relations, and converts results directly into LaTeX-table format with means and standard deviations.
* To utilize different batch-creation methods, simply set the flag `--data_sampler` to the method of choice. Available options are listed in `datasampler/__init__.py`.
* To use the proposed spectral regularization for tuple-based methods, set `--batch_mining rho_distance` and a flip probability via `--miner_rho_distance_cp`, e.g. `0.2`.
* A script to run the toy experiments in the paper is provided in `toy_experiments`.

**Note**: There may be small deviations in results depending on the hardware (e.g. P100 vs. RTX GPUs) and software (different PyTorch/CUDA versions) used to run these experiments, but they should be covered by the standard deviations reported in the paper.

---

## How to use this Repo

### Requirements:

* PyTorch 1.2.0+ & Faiss-GPU
* Python 3.6+
* pretrainedmodels, torchvision 0.3.0+

An exemplary setup of a virtual environment containing everything needed:
```
(1) wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh
(2) bash Miniconda3-latest-Linux-x86_64.sh (say yes to append the path to your .bashrc)
(3) source ~/.bashrc
(4) conda create -n DL python=3.6
(5) conda activate DL
(6) conda install matplotlib scipy scikit-learn scikit-image tqdm pandas pillow
(7) conda install pytorch torchvision faiss-gpu cudatoolkit=10.0 -c pytorch
(8) pip install wandb pretrainedmodels
(9) Run the scripts!
```

### Datasets:
Data for
* CUB200-2011 (http://www.vision.caltech.edu/visipedia/CUB-200.html)
* CARS196 (https://ai.stanford.edu/~jkrause/cars/car_dataset.html)
* Stanford Online Products (http://cvgl.stanford.edu/projects/lifted_struct/)

can be downloaded either from the respective project sites or directly via Dropbox:

* CUB200-2011 (1.08 GB): https://www.dropbox.com/s/tjhf7fbxw5f9u0q/cub200.tar?dl=0
* CARS196 (1.86 GB): https://www.dropbox.com/s/zi2o92hzqekbmef/cars196.tar?dl=0
* SOP (2.84 GB): https://www.dropbox.com/s/fu8dgxulf10hns9/online_products.tar?dl=0

**The latter ensures that the folder structure is already consistent with this pipeline and the dataloaders**.

Otherwise, please make sure that the datasets have the following internal structure:

* For CUB200-2011/CARS196:
```
cub200/cars196
└───images
|   └───001.Black_footed_Albatross
|   │   Black_Footed_Albatross_0001_796111
|   │   ...
|   ...
```

* For Stanford Online Products:
```
online_products
└───images
|   └───bicycle_final
|   │   111085122871_0.jpg
|   ...
|
└───Info_Files
|   │   bicycle.txt
|   │   ...
```

Assuming your folder is placed in e.g. `<$datapath/cub200>`, pass `$datapath` as input to `--source`.

### Training:
Training is done by using `main.py` and setting the respective flags, all of which are listed and explained in `parameters.py`. A vast set of exemplary runs is provided in `Revisit_Runs.sh`.

**[I.]** **A basic sample run using default parameters would look like this**:

```
python main.py --loss margin --batch_mining distance --log_online \
               --project DML_Project --group Margin_with_Distance --seed 0 \
               --gpu 0 --bs 112 --data_sampler class_random --samples_per_class 2 \
               --arch resnet50_frozen_normalize --source $datapath --n_epochs 150 \
               --lr 0.00001 --embed_dim 128 --evaluate_on_gpu
```

The purpose of each flag:

* `--loss <loss_name>`: Name of the training objective used. See the folder `criteria` for implementations of these methods.
* `--batch_mining <batchminer_name>`: Name of the batch-miner to use (for tuple-based ranking methods). See the folder `batch_mining` for implementations of these methods.
* `--log_online`: Log metrics online via either W&B (default) or CometML. Regardless, plots, weights and parameters are all stored offline as well.
* `--project`, `--group`: Project name as well as name of the run. Different seeds will be logged into the same `--group` online. The group as well as the used seed also define the local savename.
* `--seed`, `--gpu`, `--source`: Basic parameters setting the training seed, the GPU to use and the path to the parent folder containing the respective datasets.
* `--arch`: The backbone to use, e.g. ResNet50. You can append `_frozen` and `_normalize` to the name to ensure that BatchNorm layers are frozen and embeddings are normalized, respectively.
* `--data_sampler`, `--samples_per_class`: How to construct a batch. The default method, `class_random`, selects classes at random and places `<samples_per_class>` samples into the batch until the batch is filled.
* `--lr`, `--n_epochs`, `--bs`, `--embed_dim`: Learning rate, number of training epochs, batch size and embedding dimensionality.
* `--evaluate_on_gpu`: If set, all metrics are computed using the GPU. This requires Faiss-GPU and may need additional GPU memory.

#### Some Notes:
* During training, metrics listed in `--evaluation_metrics` will be logged for both the training and the validation/test set. If you do not care about detailed training metric logging, simply set the flag `--no_train_metrics`. A checkpoint is saved for improvements in metrics listed in `--storage_metrics` on training, validation or test sets.
* If you wish to use a training/validation split, simply set `--use_tv_split` and `--tv_split_perc <train/val split percentage>`.


**[II.]** **Advanced Runs**:

```
python main.py --loss margin --batch_mining distance --loss_margin_beta 0.6 --miner_distance_lower_cutoff 0.5 ... (basic parameters)
```

* To use specific parameters that are loss-, batchminer- or e.g. datasampler-related, simply set the respective flag.
* For structure and ease of use, parameters relating to a specific loss function/batchminer etc. are marked as e.g. `--loss_<lossname>_<parameter_name>`, see `parameters.py`.
* However, every parameter can be accessed from every class, as all parameters are stored in a shared namespace that is passed to all methods. This makes it easy to create novel fusion losses and the like; a minimal sketch of this pattern is shown below.

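For illustration, here is a minimal, hypothetical sketch of how a custom loss could read flags from the shared namespace, regardless of which method they were originally introduced for. The class name `FusionLoss` is an assumption; the two flags are the ones from the advanced-run example above:

```python
import argparse
import torch

class FusionLoss(torch.nn.Module):
    """Hypothetical fusion loss reading parameters introduced for other methods."""
    def __init__(self, opt):
        super().__init__()
        # "opt" is the shared parameter namespace passed to all methods, so any
        # registered flag is accessible here, e.g. margin-loss and miner flags:
        self.beta = opt.loss_margin_beta
        self.lower_cutoff = opt.miner_distance_lower_cutoff

    def forward(self, batch, labels):
        # ... combine objectives using any mix of the parameters above ...
        raise NotImplementedError

# In the pipeline, "opt" comes from parameters.py; mimicked here for illustration:
opt = argparse.Namespace(loss_margin_beta=0.6, miner_distance_lower_cutoff=0.5)
criterion = FusionLoss(opt)
```
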
### Evaluating Results with W&B
Here is some information on using W&B (highly encouraged!):

* Create an account here (free): https://wandb.ai
* After the account is set up, make sure to include your API key in `parameters.py` under `--wandb_key`.
* To make sure that W&B data can be stored, run `wandb on` in the folder pointed to by `--save_path`.
* When data is logged online to W&B, one can use `Result_Evaluations.py` to download all data, create named metric and correlation plots and output a summary in the form of a LaTeX-ready table with means and standard deviations of all metrics. **This ensures that there are no errors between computed and reported results.**


### Creating custom methods:

1. **Create custom objectives**: Simply take a look at e.g. `criteria/margin.py` and ensure that the method has the following properties (a minimal sketch is provided after this list item):
   * Inherit from `torch.nn.Module` and define a custom `forward()` function.
   * When using trainable parameters, make sure to either provide a `self.lr` to set the learning rate of the loss-specific parameters, or set `self.optim_dict_list`, which is a list containing optimization dictionaries passed to the optimizer (see e.g. `criteria/proxynca.py`). If both are set, `self.optim_dict_list` has priority.
   * Depending on the loss, remember to set the variables `ALLOWED_MINING_OPS = None or list of allowed mining operations`, `REQUIRES_BATCHMINER = False or True` and `REQUIRES_OPTIM = False or True` to denote whether the method needs a batchminer or optimization of internal parameters.

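A minimal sketch of such a criterion, assuming the constructor receives the shared parameter namespace (`opt`) as above; the toy objective itself is illustrative, not the repo's exact interface:

```python
import torch

# Module-level variables checked by the pipeline (cf. criteria/margin.py):
ALLOWED_MINING_OPS  = None    # or a list of allowed mining operations
REQUIRES_BATCHMINER = False
REQUIRES_OPTIM      = True

class Criterion(torch.nn.Module):
    def __init__(self, opt):
        super().__init__()
        # A trainable, loss-internal parameter (illustrative).
        self.scale = torch.nn.Parameter(torch.tensor(1.0))
        # Either set a learning rate for the loss-specific parameters ...
        self.lr = opt.lr
        # ... or provide full optimization dictionaries (these take priority):
        # self.optim_dict_list = [{'params': self.parameters(), 'lr': opt.lr}]

    def forward(self, batch, labels, **kwargs):
        # Toy objective: pull same-class embeddings together.
        dists = torch.cdist(batch, batch)
        same = labels.view(-1, 1) == labels.view(1, -1)
        return self.scale * dists[same].mean()
```
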

2. **Create custom batchminer**: Simply take a look at e.g. `batch_mining/distance.py`. The miner needs to be a class with a defined `__call__()` function, taking in a batch and labels and returning e.g. a list of triplets (a minimal sketch follows below).

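A minimal sketch under these constraints; random positive/negative selection stands in here for an actual mining strategy:

```python
import numpy as np

class BatchMiner():
    """Illustrative random triplet miner with the interface described above."""
    def __init__(self, opt):
        self.par = opt  # shared parameter namespace

    def __call__(self, batch, labels):
        # Accept labels as a torch tensor or a plain array.
        if hasattr(labels, 'cpu'):
            labels = labels.detach().cpu().numpy()
        triplets = []
        for anchor, lab in enumerate(labels):
            pos = np.where(labels == lab)[0]
            pos = pos[pos != anchor]          # exclude the anchor itself
            neg = np.where(labels != lab)[0]
            if len(pos) and len(neg):
                triplets.append((anchor, np.random.choice(pos), np.random.choice(neg)))
        return triplets
```
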
3. **Create custom datasamplers**: Simply take a look at e.g. `datasampler/class_random_sampler.py`. The sampler needs to inherit from `torch.utils.data.sampler.Sampler` and has to provide a `__iter__()` and a `__len__()` function. It has to yield a set of indices that are used to create the batch (see the sketch below).
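
A minimal sketch of a class-random-style sampler under these constraints; `image_dict` and the flag `opt.n_batches` are assumptions for illustration, not the repo's exact interface:

```python
import numpy as np
import torch.utils.data.sampler

class Sampler(torch.utils.data.sampler.Sampler):
    """Illustrative sampler: fills each batch with randomly chosen classes,
    <samples_per_class> samples at a time, until the batch is full."""
    def __init__(self, opt, image_dict):
        # image_dict: {class_label: [dataset indices]} (assumed metadata layout).
        self.image_dict = image_dict
        self.batch_size = opt.bs
        self.samples_per_class = opt.samples_per_class
        self.sampler_length = opt.n_batches  # hypothetical flag: batches per epoch

    def __iter__(self):
        for _ in range(self.sampler_length):
            batch = []
            while len(batch) < self.batch_size:
                cls = np.random.choice(list(self.image_dict.keys()))
                idxs = np.random.choice(self.image_dict[cls],
                                        self.samples_per_class, replace=True)
                batch.extend(int(i) for i in idxs)
            yield batch[:self.batch_size]

    def __len__(self):
        return self.sampler_length
```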

---

## Implemented Methods

For a detailed explanation of everything, please refer to the supplementary material of our paper!

### DML criteria

* **Angular** [[Deep Metric Learning with Angular Loss](https://arxiv.org/pdf/1708.01682.pdf)]
* **ArcFace** [[ArcFace: Additive Angular Margin Loss for Deep Face Recognition](https://arxiv.org/pdf/1801.07698.pdf)]
* **Contrastive** [[Dimensionality Reduction by Learning an Invariant Mapping](http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf)]
* **Generalized Lifted Structure** [[In Defense of the Triplet Loss for Person Re-Identification](https://arxiv.org/abs/1703.07737)]
* **Histogram** [[Learning Deep Embeddings with Histogram Loss](https://arxiv.org/pdf/1611.00822.pdf)]
* **Marginloss** [[Sampling Matters in Deep Embedding Learning](https://arxiv.org/abs/1706.07567)]
* **MultiSimilarity** [[Multi-Similarity Loss with General Pair Weighting for Deep Metric Learning](https://arxiv.org/abs/1904.06627)]
* **N-Pair** [[Improved Deep Metric Learning with Multi-class N-pair Loss Objective](https://papers.nips.cc/paper/6200-improved-deep-metric-learning-with-multi-class-n-pair-loss-objective)]
* **ProxyNCA** [[No Fuss Distance Metric Learning using Proxies](https://arxiv.org/pdf/1703.07464.pdf)]
* **Quadruplet** [[Beyond triplet loss: a deep quadruplet network for person re-identification](https://arxiv.org/abs/1704.01719)]
* **Signal-to-Noise Ratio (SNR)** [[Signal-to-Noise Ratio: A Robust Distance Metric for Deep Metric Learning](https://arxiv.org/pdf/1904.02616.pdf)]
* **SoftTriple** [[SoftTriple Loss: Deep Metric Learning Without Triplet Sampling](https://arxiv.org/abs/1909.05235)]
* **Normalized Softmax** [[Classification is a Strong Baseline for Deep Metric Learning](https://arxiv.org/abs/1811.12649)]
* **Triplet** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)]

### DML batchminer

* **Random** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)]
* **Semihard** [[FaceNet: A Unified Embedding for Face Recognition and Clustering](https://arxiv.org/abs/1503.03832)]
* **Softhard** [https://github.com/Confusezius/Deep-Metric-Learning-Baselines]
* **Distance-based** [[Sampling Matters in Deep Embedding Learning](https://arxiv.org/abs/1706.07567)]
* **Rho-Distance** [[Revisiting Training Strategies and Generalization Performance in Deep Metric Learning](https://arxiv.org/abs/2002.08473)]
* **Parametric** [[PADS: Policy-Adapted Sampling for Visual Similarity Learning](https://arxiv.org/abs/2003.11113)]

### Architectures

* **ResNet50** [[Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)]
* **Inception-BN** [[Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167)]
* **GoogLeNet** (torchvision variant w/ BN) [[Going Deeper with Convolutions](https://arxiv.org/abs/1409.4842)]

### Datasets

* **CUB200-2011** [[Caltech-UCSD Birds-200-2011](http://www.vision.caltech.edu/visipedia/CUB-200-2011.html)]
* **CARS196** [[Cars Dataset](https://ai.stanford.edu/~jkrause/cars/car_dataset.html)]
* **Stanford Online Products** [[Deep Metric Learning via Lifted Structured Feature Embedding](https://cvgl.stanford.edu/projects/lifted_struct/)]

### Evaluation Metrics

* **Recall@k**
* **Normalized Mutual Information (NMI)**
* **F1**
* **mAP (class-averaged)**
* **Spectral Variance**
* **Mean Intraclass Distance**
* **Mean Interclass Distance**
* **Ratio of Intra- to Interclass Distance**