Improving CNN linear layers with power mean non-linearity

2019 
Abstract Convolutional Neural Networks (CNNs) have achieved great success in various computer vision tasks. However, in classic CNN models, convolution and fully connected (FC) layers perform only linear transformations on their inputs; non-linearity is typically added by activation and pooling layers. It is therefore natural to extend convolution and FC layers non-linearly at affordable cost. In this paper, we first investigate the power mean function, which has proved effective and efficient in SVM kernel learning. The power mean kernel is a non-linear kernel that retains linear computational complexity through an asymmetric kernel approximation function. Motivated by this scalable kernel, we propose Power Mean Transformation, which non-linearizes both convolution and FC layers. It requires only a small modification to current CNNs and improves performance with a negligible increase in model size and running time. Experiments on various tasks show that Power Mean Transformation improves classification accuracy, enhances generalization, and adds a different kind of non-linearity to CNN models. The large performance gain on tiny models shows that Power Mean Transformation is especially effective in resource-restricted deep learning scenarios such as mobile applications. Finally, we present visualization experiments to illustrate why Power Mean Transformation works.
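The power mean function mentioned in the abstract is the classical generalized mean of order p, which interpolates between the harmonic (p = -1), geometric (p → 0), and arithmetic (p = 1) means. A minimal sketch of this function follows; the function name and NumPy usage are illustrative only and are not taken from the paper's implementation:

```python
import numpy as np

def power_mean(values, p):
    """Generalized (power) mean of order p for positive values.

    Special cases: p = 1 gives the arithmetic mean, p = -1 the
    harmonic mean, and the limit p -> 0 gives the geometric mean.
    """
    values = np.asarray(values, dtype=float)
    if p == 0:
        # Limit case p -> 0: geometric mean, computed in log space.
        return float(np.exp(np.mean(np.log(values))))
    return float(np.mean(values ** p) ** (1.0 / p))
```

For example, for the values [1, 4], the arithmetic mean (p = 1) is 2.5, the geometric mean (p = 0) is 2.0, and the harmonic mean (p = -1) is 1.6. In the paper's setting, a kernel built from this mean is non-linear in its inputs yet admits a linear-complexity approximation, which is what makes it attractive for embedding into convolution and FC layers.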