Attention mechanisms, especially channel attention, have achieved great success in the computer vision field. Many works focus on designing efficient channel attention mechanisms while ignoring a fundamental problem: the use of global average pooling (GAP) as the unquestioned pre-processing method. In this work, we start from a different view and rethink channel attention using frequency analysis. Based on this analysis, we mathematically prove that conventional GAP is a special case of feature decomposition in the frequency domain. Building on this proof, we naturally generalize the pre-processing of the channel attention mechanism to the frequency domain and propose FcaNet with a novel multi-spectral channel attention. The proposed method is simple but effective: it can be implemented within existing channel attention methods by changing a single line of code. Moreover, it achieves state-of-the-art results compared with other channel attention methods on image classification, object detection, and instance segmentation tasks. Our method improves Top-1 accuracy on ImageNet by 1.8% over the baseline SENet-50, with the same number of parameters and the same computational cost. Our code and models are publicly available at https://github.com/cfzd/FcaNet.
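The claim that GAP is a special case of frequency-domain feature decomposition can be checked numerically: the lowest-frequency (DC) coefficient of the 2D discrete cosine transform is proportional to the spatial mean of a feature map. The sketch below is an illustration of that identity using SciPy's unnormalized DCT-II, not the paper's actual implementation:

```python
import numpy as np
from scipy.fft import dctn

# A single-channel H x W feature map (random values stand in for real features).
x = np.random.rand(8, 8)

# Global average pooling reduces the map to its spatial mean.
gap = x.mean()

# Lowest-frequency (DC) coefficient of the unnormalized 2D DCT-II.
# Each axis of the unnormalized DCT-II carries a factor of 2, so in 2D
# the DC term equals 4 * sum(x), i.e. 4 * H * W * mean(x).
dc = dctn(x, type=2)[0, 0]

# GAP is exactly the DC coefficient up to this constant scale factor.
assert np.allclose(gap, dc / (4 * x.size))
```

Multi-spectral channel attention generalizes this by also feeding higher-frequency DCT coefficients (entries other than `[0, 0]`) into the attention weighting, rather than discarding them as GAP implicitly does.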