在解释具体的高效CNN模型之前,我们先检查一下高效CNN模型中使用的组成部分的计算量,看看卷积在空间和通道域中是如何进行的。
假设 H x W 为输出feature map的空间大小,N为输入通道数,K x K为卷积核的大小,M为输出通道数,则标准卷积的计算量为 HWNK²M 。
这里重要的一点是,标准卷积的计算量与(1)输出特征图H x W的空间大小,(2)卷积核K的大小,(3)输入输出通道的数量N x M成正比。
当在空间域和通道域进行卷积时,需要上述计算量。通过分解这个卷积,可以加速 CNNs,如图所示。
[1] M. Lin, Q. Chen, and S. Yan, “Network in Network,” in Proc. of ICLR, 2014.[2] L. Sifre, “Rigid-motion Scattering for Image Classification, Ph.D. thesis, 2014.[3] L. Sifre and S. Mallat, “Rotation, Scaling and Deformation Invariant Scattering for Texture Discrimination,” in Proc. of CVPR, 2013.[4] F. Chollet, “Xception: Deep Learning with Depthwise Separable Convolutions,” in Proc. of CVPR, 2017.[5] X. Zhang, X. Zhou, M. Lin, and J. Sun, “ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices,” in arXiv:1707.01083, 2017.[6] K. He, X. Zhang, S. Ren, and J. Sun, “Deep Residual Learning for Image Recognition,” in Proc. of CVPR, 2016.[7] S. Xie, R. Girshick, P. Dollár, Z. Tu, and K. He, “Aggregated Residual Transformations for Deep Neural Networks,” in Proc. of CVPR, 2017.[8] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam, “Mobilenets: Efficient Convolutional Neural Networks for Mobile Vision Applications,” in arXiv:1704.04861, 2017.[9] M. Sandler, A. Howard, M. Zhu, A. Zhmoginov, and L. Chen, “MobileNetV2: Inverted Residuals and Linear Bottlenecks,” in arXiv:1801.04381v3, 2018.