In this paper, we show that extending the butterfly operations from the FFT algorithm to a general Butterfly Transform (BFT) can be beneficial in building an efficient block structure for CNN designs. Pointwise convolutions, which we refer to as channel fusions, are the main computational bottleneck in state-of-the-art efficient CNNs (e.g., MobileNets). We introduce a set of criteria for channel fusion and prove that BFT yields an asymptotically optimal FLOP count with respect to these criteria. By replacing pointwise convolutions with BFT, we reduce the computational complexity of these layers from O(n^2) to O(n log n) with respect to the number of channels. Our experimental evaluations show that our method results in significant accuracy gains across a wide range of network architectures, especially at low FLOP ranges. For example, BFT results in up to a 6.75% absolute Top-1 improvement for MobileNetV1, 4.4% for ShuffleNetV2, and 5.4% for MobileNetV3 on ImageNet under a similar number of FLOPs. Notably, ShuffleNetV2+BFT outperforms the state-of-the-art architecture search methods MNasNet, FBNet, and MobileNetV3 in the low FLOP regime.
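To make the complexity claim concrete, the following is a minimal sketch (not the paper's implementation) of a butterfly-structured channel fusion on an n-channel vector: log2(n) stages, each applying learned 2x2 mixing blocks to channel pairs, for roughly 2n log2(n) multiplies instead of the n^2 of a dense pointwise fusion. The function name and weight layout here are illustrative assumptions.

```python
import numpy as np

def butterfly_transform(x, weights):
    """Illustrative Butterfly Transform over n channels (n a power of 2).

    weights[s] holds the 2x2 mixing blocks for stage s, shape (n//2, 2, 2).
    Each of the log2(n) stages does n/2 pairs * 4 multiplies = 2n multiplies,
    so the total cost is O(n log n), versus O(n^2) for a dense (pointwise
    convolution) channel fusion.
    """
    n = x.shape[0]
    stages = int(np.log2(n))
    y = x.copy()
    for s in range(stages):
        stride = 1 << s          # pair distance doubles each stage
        out = np.empty_like(y)
        pair = 0
        for block in range(0, n, 2 * stride):
            for j in range(stride):
                i, k = block + j, block + j + stride
                w = weights[s][pair]
                # 2x2 mixing of channel pair (i, k)
                out[i] = w[0, 0] * y[i] + w[0, 1] * y[k]
                out[k] = w[1, 0] * y[i] + w[1, 1] * y[k]
                pair += 1
        y = out
    return y
```

With identity 2x2 blocks the transform is the identity map; with fixed [[1, 1], [1, -1]] blocks it reduces to a Walsh-Hadamard transform, which illustrates how the FFT-style butterfly structure is a special case of the learned fusion.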