retrain distilled images with minibatch-SGD #36
Comments
The images may be optimized to jointly give some gradient. If you want to be order/batch agnostic, you can try modifying the distillation procedure to apply the images in randomly ordered batches.
Thanks for your reply. Maybe I expressed my question in the wrong way. I aim to use the distilled images that were generated to achieve the best test performance (such as the MNIST distilled data that reached 96.54% accuracy) to retrain a model from scratch.
You expressed it well, and I understood exactly what you meant. What I was saying is that if you want the images to be applicable in a certain way (e.g., randomly ordered and batched), it is best to modify the training to suit that, because they might overfit to the fixed ordering and batching used during training. Hence randomizing these during training is also important.
I understand what you said now. I'll try that and share the results. Thanks a lot.
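A minimal sketch of the suggested modification, applying the distilled images in a fresh random order at every inner step of distillation. It assumes a toy linear classifier and placeholder names (w, b, distilled_x, distilled_y, lrs) rather than this repository's actual inner loop, so treat it only as an illustration of the idea:

```python
import torch
import torch.nn.functional as F

def inner_loop_random_batches(w, b, distilled_x, distilled_y, lrs, batch_size=10):
    # Differentiable inner loop: apply the distilled images in a new random
    # order at every step, so they cannot latch onto one fixed ordering/batching.
    # `w`, `b` are the weights of a toy linear classifier kept as plain tensors;
    # `lrs` holds one (possibly learned) learning rate per inner step.
    n = distilled_x.size(0)
    for lr in lrs:
        perm = torch.randperm(n)                      # reshuffle every inner step
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            logits = distilled_x[idx].flatten(1) @ w + b
            loss = F.cross_entropy(logits, distilled_y[idx])
            grad_w, grad_b = torch.autograd.grad(loss, (w, b), create_graph=True)
            w = w - lr * grad_w                       # functional SGD step that keeps
            b = b - lr * grad_b                       # the graph for the outer update
    return w, b
```

The functional weight updates keep the computation graph intact, so gradients from the outer (real-data) loss can still flow back into distilled_x and lrs.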
Hey, I am very interested in this work and have some questions to ask.
I used 20 images per class for MNIST dataset distillation by running
python main.py --mode distill_basic --dataset MNIST --arch LeNet \
    --distill_steps 1 --train_nets_type known_init --n_nets 1 \
    --test_nets_type same_as_train
and achieved 96.54% test accuracy.
But when I used these distilled images as training data to retrain a model with the same initialization as in the distillation step via minibatch SGD, the test accuracy dropped to 62% and overfitting occurred. My questions are:
(1) Is it just because of the different optimization procedure?
(2) Why does optimizing the network in your way avoid overfitting, even when only 1 sample per class is used in MNIST dataset distillation?
(3) How can the distilled images be used to retrain a good model with a normal training procedure such as minibatch SGD?
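For context on question (3), here is the kind of plain minibatch-SGD retraining loop being asked about, as a generic sketch only: the data and network below are dummy placeholders standing in for the saved distilled images/labels and a LeNet re-initialized with the same known_init weights, not this repository's evaluation code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, TensorDataset

# Placeholders: in practice `distilled_x` / `distilled_y` would be the saved
# distilled images and labels (e.g., 20 per class for MNIST), and `net` would
# be a LeNet re-initialized with the same known_init weights.
distilled_x = torch.randn(200, 1, 28, 28)                   # dummy stand-in images
distilled_y = torch.arange(200) % 10                        # dummy labels 0..9
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))   # stand-in for LeNet

loader = DataLoader(TensorDataset(distilled_x, distilled_y),
                    batch_size=50, shuffle=True)            # random minibatches
optimizer = torch.optim.SGD(net.parameters(), lr=0.01, momentum=0.9)

for epoch in range(30):
    for x, y in loader:
        optimizer.zero_grad()
        loss = F.cross_entropy(net(x), y)
        loss.backward()
        optimizer.step()
```

Shuffling each epoch gives randomly ordered minibatches, which, per the replies above, the distilled images may only tolerate if the distillation itself was run with randomized ordering and batching.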