Title

Understanding and Exploring the Network with Stochastic Architectures

Author

Zhijie Deng, Yinpeng Dong, Shifeng Zhang, Jun Zhu

Abstract

There is an emerging trend to train a network with stochastic architectures to enable various architectures to be plugged and played during inference. However, the existing investigation is highly entangled with neural architecture search (NAS), limiting its widespread use across scenarios. In this work, we decouple the training of a network with stochastic architectures (NSA) from NAS and provide a first systematic investigation of it as a stand-alone problem. We first uncover the characteristics of NSA in various aspects, ranging from training stability, convergence, and predictive behaviour to generalization capacity to unseen architectures. We identify various issues of the vanilla NSA, such as training/test disparity and function mode collapse, and further propose solutions to these issues with theoretical and empirical insights. We believe that these results could also serve as good heuristics for NAS. Given these understandings, we further apply NSA with our improvements to diverse scenarios to fully exploit its promise of inference-time architecture stochasticity, including model ensemble, uncertainty estimation and semi-supervised learning. Remarkable performance (e.g., 2.75% error rate and 0.0032 expected calibration error on CIFAR-10) validates the effectiveness of such a model, providing new perspectives on exploring the potential of the network with stochastic architectures, beyond NAS.
