Locating discriminative parts plays a key role in fine-grained visual classification because objects from different categories are highly similar. Recent works based on convolutional neural networks mine discriminative regions from the feature maps of the last convolutional layer. However, because of its large receptive field, the last convolutional layer tends to focus on the whole object, which reduces its ability to spot the differences between objects. To address this issue, we propose a novel Granularity-Aware Convolutional Neural Network (GA-CNN) that progressively explores discriminative features. Specifically, GA-CNN uses the differences between the receptive fields of different layers to learn multi-granularity features, and it exploits larger-granularity information based on the smaller-granularity information found at previous stages. To further boost performance, we introduce an object-attentive module that can effectively localize the object given a raw image. GA-CNN does not need bounding-box or part annotations and can be trained end-to-end. Extensive experimental results show that our approach achieves state-of-the-art performance on three benchmark datasets.
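The claim that deeper layers see larger regions of the input can be illustrated with the standard receptive-field recurrence for a chain of convolution or pooling layers. The sketch below is not the GA-CNN architecture: the function name `receptive_fields` and the layer stack `toy_stack` are hypothetical, chosen only to show how the receptive field grows with depth, and dilation is assumed to be 1.

```python
def receptive_fields(layers):
    """Return the receptive-field size (in input pixels) after each layer.

    layers: list of (kernel_size, stride) tuples in input-to-output order.
    Assumes dilation 1; padding does not change the receptive-field size.
    """
    rf = 1      # one input pixel covers exactly itself
    jump = 1    # distance, in input pixels, between adjacent feature-map cells
    sizes = []
    for kernel, stride in layers:
        rf += (kernel - 1) * jump   # each extra kernel tap adds `jump` input pixels
        jump *= stride              # striding spreads output cells further apart
        sizes.append(rf)
    return sizes


# Hypothetical toy stack (an assumption, not taken from the paper):
# five 3x3 layers, two of which downsample with stride 2.
toy_stack = [(3, 1), (3, 2), (3, 1), (3, 2), (3, 1)]
print(receptive_fields(toy_stack))  # [3, 5, 9, 13, 21]
```

In this toy stack, a cell in the first layer sees a 3x3 patch while a cell in the last layer sees a 21x21 patch, which is the kind of difference between shallow and deep feature maps that a multi-granularity design can draw on.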