Trainable layers such as convolutional building blocks are the standard network design choice, learning parameters to capture the global context through successive spatial operations. When designing an efficient network, trainable layers such as the depthwise convolution are the source of efficiency in the number of parameters and FLOPs, but in practice they bring little improvement in model speed. This paper argues that simple built-in parameter-free operations can be a favorable alternative to efficient trainable layers, replacing the spatial operations in a network architecture. We aim to break the stereotype of organizing the spatial operations of building blocks into trainable layers. Extensive experimental analyses based on layer-level studies with fully-trained models and neural architecture searches are provided to investigate whether parameter-free operations such as max-pooling are functional. The studies eventually lead us to a simple yet effective idea for redesigning network architectures, where parameter-free operations are heavily used as the main building block with little sacrifice in model accuracy. Experimental results on the ImageNet dataset demonstrate that network architectures with parameter-free operations enjoy further efficiency in terms of model speed, the number of parameters, and FLOPs. Code and ImageNet pretrained models are available at https://github.com/naver-ai/PfLayer.
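To make the core idea concrete, the minimal PyTorch sketch below (not the authors' actual block; the names `PfBlock` and `use_maxpool` are hypothetical) contrasts a depthwise convolution with a parameter-free max-pool as the spatial operation inside a simple building block, while 1x1 convolutions still mix channels:

```python
import torch
import torch.nn as nn

class PfBlock(nn.Module):
    """Sketch of a building block whose spatial operation is swappable.

    With use_maxpool=True the spatial operation contributes zero learnable
    parameters; with use_maxpool=False a 3x3 depthwise convolution
    contributes channels * 3 * 3 parameters.
    """

    def __init__(self, channels: int, use_maxpool: bool = True):
        super().__init__()
        if use_maxpool:
            # Parameter-free spatial operation.
            self.spatial = nn.MaxPool2d(kernel_size=3, stride=1, padding=1)
        else:
            # Trainable depthwise convolution (groups == channels).
            self.spatial = nn.Conv2d(channels, channels, kernel_size=3,
                                     stride=1, padding=1, groups=channels,
                                     bias=False)
        # Pointwise (1x1) convolution still mixes information across channels.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.bn(self.pointwise(self.spatial(x))))

# Compare parameter counts of the two variants on the same input.
x = torch.randn(1, 64, 56, 56)
for flag in (True, False):
    block = PfBlock(64, use_maxpool=flag)
    n_params = sum(p.numel() for p in block.parameters())
    print(f"use_maxpool={flag}: {n_params} parameters, "
          f"output shape {tuple(block(x).shape)}")
```

Under this sketch's assumptions, swapping the depthwise convolution for max-pooling removes the spatial layer's parameters entirely and replaces a learned filter with a fixed, built-in operation, which is the trade-off the paper investigates.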