We introduce a regularization concept based on the proposed Batch Confusion Norm (BCN) to address Fine-Grained Visual Classification (FGVC). The FGVC problem is characterized by two intriguing properties, significant inter-class similarity and intra-class variation, which make learning an effective FGVC classifier a challenging task. Inspired by the use of pairwise confusion energy as a regularization mechanism, we develop the BCN technique to improve FGVC learning by imposing class prediction confusion on each training batch, thereby alleviating the overfitting that can result from exploring image features of fine details. In addition, our method is implemented with an attention-gated CNN model, boosted by the incorporation of Atrous Spatial Pyramid Pooling (ASPP) to extract discriminative features and proper attention. To demonstrate the usefulness of our method, we report state-of-the-art results on several benchmark FGVC datasets, along with comprehensive ablation comparisons.
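The abstract does not give the BCN formula, so the following is only a minimal sketch of the general idea of adding a confusion penalty computed over a training batch. It uses a pairwise confusion term (the mean squared Euclidean distance between the predicted class distributions of every pair of samples in the batch) as a stand-in, not the BCN itself. The function names, the `weight` parameter, and its default value are hypothetical and chosen for illustration.

```python
import math
from itertools import combinations


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one sample."""
    return -math.log(probs[label])


def pairwise_confusion(batch_probs):
    """Mean squared Euclidean distance over all prediction pairs in a batch.

    Assumption: a stand-in for a batch-level confusion term; this is not
    the BCN definition, which the abstract does not state.
    """
    pairs = list(combinations(batch_probs, 2))
    if not pairs:
        return 0.0
    total = sum(sum((a - b) ** 2 for a, b in zip(p, q)) for p, q in pairs)
    return total / len(pairs)


def regularized_loss(batch_logits, labels, weight=0.1):
    """Average cross-entropy plus a weighted confusion penalty.

    The weight value is an assumed hyperparameter, not taken from the paper.
    """
    probs = [softmax(row) for row in batch_logits]
    ce = sum(cross_entropy(p, y) for p, y in zip(probs, labels)) / len(labels)
    return ce + weight * pairwise_confusion(probs)
```

Minimizing the penalty pulls the predicted distributions within a batch toward one another, which discourages overconfident predictions driven by fine details; the cross-entropy term keeps the classifier discriminative. Setting `weight=0` recovers the plain classification loss.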