Attention mechanisms, especially channel attention, have achieved great success in computer vision. Many works focus on designing efficient channel attention mechanisms while ignoring a fundamental problem: global average pooling (GAP) is used as the pre-processing method without question. In this work, we take a different view and rethink channel attention using frequency analysis. We mathematically prove that conventional GAP is a special case of feature decomposition in the frequency domain. Building on this proof, we generalize the pre-processing of channel attention mechanisms in the frequency domain and propose FcaNet with a novel multi-spectral channel attention. The proposed method is simple but effective: it can be implemented within existing channel attention methods by changing only one line of code in the calculation. Moreover, it achieves state-of-the-art results compared with other channel attention methods on image classification, object detection, and instance segmentation tasks. It improves Top-1 accuracy on ImageNet by 1.8% over the baseline SENet-50, with the same number of parameters and the same computational cost. Our code and models will be made publicly available.
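The claim that GAP is a special case of frequency-domain decomposition can be checked numerically. The sketch below is illustrative only and is not the released FcaNet code: it assumes an unnormalized 2D DCT-II basis, and the names `dct_basis` and `multi_spectral_pool`, as well as the particular frequency indices, are hypothetical choices made for this example. Under these assumptions, the lowest-frequency component (0, 0) equals GAP up to the constant factor H·W, and pooling different channel groups with different frequency components gives one possible reading of "multi-spectral" pre-processing.

```python
import numpy as np

def dct_basis(u, v, H, W):
    """Unnormalized 2D DCT-II basis for frequency index (u, v) on an H x W grid."""
    i = np.arange(H).reshape(H, 1)
    j = np.arange(W).reshape(1, W)
    return np.cos(np.pi * u * (i + 0.5) / H) * np.cos(np.pi * v * (j + 0.5) / W)

def gap(x):
    """Global average pooling: (C, H, W) -> (C,)."""
    return x.mean(axis=(1, 2))

def multi_spectral_pool(x, freqs):
    """Pool each channel group with its own DCT basis (hypothetical helper).

    Channels are split into len(freqs) equal groups, so C must be
    divisible by len(freqs). Returns a vector of shape (C,).
    """
    C, H, W = x.shape
    groups = np.split(x, len(freqs), axis=0)
    pooled = [
        (g * dct_basis(u, v, H, W)).sum(axis=(1, 2))
        for g, (u, v) in zip(groups, freqs)
    ]
    return np.concatenate(pooled)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 7, 7))  # toy feature map: C=8, H=W=7

# With only the (0, 0) component, the basis is all ones, so the result
# is the spatial sum, i.e. GAP scaled by H * W.
dc = multi_spectral_pool(x, [(0, 0)])
assert np.allclose(dc / (7 * 7), gap(x))

# Arbitrary illustrative frequency choice: four groups of two channels each.
ms = multi_spectral_pool(x, [(0, 0), (0, 1), (1, 0), (1, 1)])
```

In an SE-style block, swapping the call to `gap(x)` for `multi_spectral_pool(x, freqs)` is the kind of single-line change the abstract describes; how the frequency components are actually selected is not specified here and is left as an assumption.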