Spatial attention has been shown to help convolutional neural networks focus on critical information and thereby improve network performance, but it still has limitations. In this paper, we explain the effectiveness of spatial attention from a new perspective: the spatial attention mechanism essentially addresses the problem of convolutional kernel parameter sharing. However, the attention map generated by spatial attention does not carry enough information for large convolutional kernels. We therefore propose a new attention mechanism called Receptive-Field Attention (RFA). The Convolutional Block Attention Module (CBAM) and Coordinate Attention (CA) focus only on spatial features and cannot fully solve the problem of convolutional kernel parameter sharing. In contrast, RFA not only focuses on the receptive-field spatial feature but also provides effective attention weights for large convolutional kernels. The Receptive-Field Attention convolutional operation (RFAConv), built on RFA, can be regarded as a new replacement for standard convolution, and it adds almost negligible computational cost and parameters. Numerous experiments on ImageNet-1k, MS COCO, and VOC demonstrate the superior performance of our approach in classification, object detection, and semantic segmentation tasks. Importantly, we believe that for current spatial attention mechanisms that focus only on spatial features, it is time to improve network performance by focusing on receptive-field spatial features. The code and pre-trained models for the relevant tasks can be found at https://github.com/Liuchen1997/RFAConv
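To make the parameter-sharing argument concrete, the following is a minimal pure-Python sketch, not the authors' RFAConv implementation (see the linked repository for that). It contrasts a standard convolution, where one shared kernel is applied unchanged to every receptive field, with a toy receptive-field attention step, where each k×k receptive field first receives its own softmax attention weights. The function names and the `score_weights` parameter are hypothetical, and the attention logits are assumed here to be a simple element-wise product of a learned score vector with the patch values.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def unfold(x, k):
    """Extract every k*k receptive field (stride 1, no padding) as a flat list."""
    h, w = len(x), len(x[0])
    return [
        [[x[i + di][j + dj] for di in range(k) for dj in range(k)]
         for j in range(w - k + 1)]
        for i in range(h - k + 1)
    ]


def standard_conv(x, kernel):
    """Standard convolution: the same kernel weights are shared by all positions."""
    k = len(kernel)
    flat = [v for row in kernel for v in row]
    return [[sum(p * w for p, w in zip(patch, flat)) for patch in row]
            for row in unfold(x, k)]


def receptive_field_attention_conv(x, kernel, score_weights):
    """Toy receptive-field attention followed by the shared kernel.

    score_weights is a hypothetical learned vector of length k*k. The logits
    (score * patch value) are an assumption made for illustration only.
    """
    k = len(kernel)
    flat = [v for row in kernel for v in row]
    out = []
    for row in unfold(x, k):
        out_row = []
        for patch in row:
            # Attention is computed per receptive field, so the effective
            # kernel (attention * shared weight) differs at each location.
            attn = softmax([s * p for s, p in zip(score_weights, patch)])
            out_row.append(sum(a * p * w for a, p, w in zip(attn, patch, flat)))
        out.append(out_row)
    return out


if __name__ == "__main__":
    x = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    kernel = [[1.0] * 3 for _ in range(3)]
    print(standard_conv(x, kernel))
    print(receptive_field_attention_conv(x, kernel, [0.1] * 9))
```

With all-zero `score_weights` the attention is uniform and the result is the standard convolution scaled by 1/k². With non-zero scores, the weights change from one receptive field to the next, which is the sense in which attention relaxes kernel parameter sharing.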