February 2020
tl;dr: A computationally efficient module that compresses neural networks by cheaply generating extra feature maps
Instead of generating all n output feature maps from all c input channels, first generate m intrinsic feature maps (m < n) with an ordinary convolution, then use cheap linear operations to produce the remaining n - m. The compression ratio is s = n/m; e.g., with n = 128 and s = 2, the convolution produces m = 64 maps and the cheap operations produce the other 64.
The cheap linear operation is usually a 3x3 depthwise convolution. Unlike the original 3x3 convolution, which takes all c input channels, each of the n - m ghost feature maps is generated directly from a single one of the m intrinsic feature maps (an injective, one-to-one mapping).
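To make the structure concrete, here is a minimal PyTorch sketch of a Ghost-style module, assuming a 1x1 primary convolution and BatchNorm/ReLU after each stage; the class and argument names (GhostModule, ratio, dw_kernel) are my own, not taken from the paper's reference code.

```python
import torch
import torch.nn as nn

class GhostModule(nn.Module):
    """Primary conv produces m intrinsic maps; a cheap depthwise conv
    produces the remaining ghost maps; outputs are concatenated."""

    def __init__(self, in_channels, out_channels, ratio=2, dw_kernel=3):
        super().__init__()
        # m = n / s intrinsic maps; assumes out_channels is divisible by ratio.
        init_channels = out_channels // ratio
        ghost_channels = out_channels - init_channels  # the n - m ghost maps

        # Primary convolution: sees all c input channels.
        self.primary = nn.Sequential(
            nn.Conv2d(in_channels, init_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(init_channels),
            nn.ReLU(inplace=True),
        )
        # Cheap operation: depthwise conv (groups=init_channels), so each
        # ghost map is computed from exactly one intrinsic map.
        self.cheap = nn.Sequential(
            nn.Conv2d(init_channels, ghost_channels, kernel_size=dw_kernel,
                      padding=dw_kernel // 2, groups=init_channels, bias=False),
            nn.BatchNorm2d(ghost_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        intrinsic = self.primary(x)
        ghosts = self.cheap(intrinsic)
        return torch.cat([intrinsic, ghosts], dim=1)


# Example: 16 -> 64 channels with s = 2, i.e. 32 intrinsic + 32 ghost maps.
y = GhostModule(16, 64)(torch.randn(1, 16, 32, 32))
assert y.shape == (1, 64, 32, 32)
```

The groups=init_channels argument is what encodes the injective mapping, and it is also where the savings come from: relative to a full convolution producing all n maps, the FLOPs drop by roughly a factor of s.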
The paper has a very good overview of compact model design, including MobileNet, MobileNetV2, and MobileNetV3.
Model compression methods are usually bounded by the pretrained deep neural network they take as a baseline. The better approach is to design an efficient architecture from the start, one that lends itself to compression.