Grad-CAM guided channel-spatial attention module for fine-grained visual classification

Fine-grained visual classification (FGVC) is becoming an important research field, owing to its wide applications and the rapid development of computer vision technologies. Current state-of-the-art (SOTA) FGVC methods usually employ attention mechanisms to first capture semantic parts and then discover the subtle differences between distinct classes. Channel-spatial attention mechanisms, which focus on discriminative channels and regions simultaneously, have significantly improved classification performance. However, existing attention modules are poorly guided, since part-based detectors in FGVC depend on the network's learning ability without the supervision of part annotations. Because obtaining such part annotations is labor-intensive, visual localization and explanation methods such as gradient-weighted class activation mapping (Grad-CAM) can be used to supervise the attention mechanism. We propose a Grad-CAM guided channel-spatial attention module for FGVC, which employs Grad-CAM to supervise and constrain the attention weights by generating coarse localization maps. To demonstrate the effectiveness of the proposed method, we conduct comprehensive experiments on three popular FGVC datasets: CUB-200-2011, Stanford Cars, and FGVC-Aircraft. The proposed method outperforms SOTA attention modules on the FGVC task. In addition, visualizations of feature maps also demonstrate the superiority of the proposed method over SOTA approaches.
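As a rough illustration of the coarse localization map mentioned in the abstract, the sketch below computes a standard Grad-CAM map from a set of feature maps and the gradients of a class score with respect to them: each channel is weighted by the spatial average of its gradient, the weighted maps are summed, and a ReLU is applied. The abstract does not specify how the map constrains the attention weights, so `attention_guidance_loss` is a hypothetical stand-in (a plain mean squared error against the normalized map), not the paper's loss. All function names and the nested-list layout are assumptions made for this example.

```python
from typing import List

Map2D = List[List[float]]       # [row][col]
FeatureMaps = List[Map2D]       # [channel][row][col]


def grad_cam(activations: FeatureMaps, gradients: FeatureMaps) -> Map2D:
    """Standard Grad-CAM: ReLU of the gradient-weighted sum of feature maps."""
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weights: global average pooling of each channel's gradient.
    weights = [
        sum(sum(row) for row in grad) / (height * width) for grad in gradients
    ]
    cam = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                cam[i][j] += weight * fmap[i][j]
    # ReLU keeps only regions with a positive influence on the class score.
    return [[max(0.0, value) for value in row] for row in cam]


def normalize(cam: Map2D) -> Map2D:
    """Scale a non-negative map to [0, 1]; an all-zero map is returned as is."""
    peak = max(max(row) for row in cam)
    if peak == 0.0:
        return cam
    return [[value / peak for value in row] for row in cam]


def attention_guidance_loss(attention: Map2D, cam: Map2D) -> float:
    """Hypothetical guidance term (an assumption, not the paper's loss):
    mean squared error between a spatial attention map and the normalized CAM."""
    target = normalize(cam)
    count = len(attention) * len(attention[0])
    total = 0.0
    for att_row, tgt_row in zip(attention, target):
        for att, tgt in zip(att_row, tgt_row):
            total += (att - tgt) ** 2
    return total / count


# Toy example: two 2x2 feature maps with constant gradients +1 and -1.
activations = [[[1.0, 2.0], [3.0, 4.0]], [[4.0, 3.0], [2.0, 1.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
cam = grad_cam(activations, gradients)
```

In a real network the activations and gradients would come from a convolutional layer and a backward pass of the target class score; the toy values here only make the arithmetic easy to follow.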


Related content

Attention mechanism


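The attention mechanism was first proposed in the field of computer vision, but it really took off with the Google DeepMind paper "Recurrent Models of Visual Attention" [14], which applied an attention mechanism to an RNN model for image classification. Subsequently, Bahdanau et al., in "Neural Machine Translation by Jointly Learning to Align and Translate" [1], used an attention-like mechanism to perform translation and alignment simultaneously on a machine translation task; their work is regarded as the first to apply the attention mechanism to NLP. Attention-based extensions of RNN models were then applied to a wide range of NLP tasks. More recently, how to use attention mechanisms in CNNs has also become a hot research topic. The figure below shows the general trend of attention research.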

[NeurIPS 2020] Parameterized explainer for graph neural networks: Parameterized Explainer for GNN

Zhuanzhi member service

22+ reads · November 13, 2020

[IJCAI 2020] Multi-Channel Graph Neural Networks

Zhuanzhi member service

26+ reads · July 19, 2020

Zero-Shot Learning for Text Classification

Zhuanzhi member service

97+ reads · May 31, 2020

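[CVPR 2020, Chinese Academy of Sciences, Tencent YouTu] Fine-grained visual classification based on an attention convolutional binary neural tree

Zhuanzhi member service

26+ reads · March 29, 2020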