Fine-grained visual classification (FGVC) distinguishes sub-categories of objects within the same super-category (e.g., birds), and it relies heavily on mining multiple discriminative features. Existing approaches mainly tackle this problem in a weakly supervised fashion, either by introducing attention mechanisms to locate discriminative parts or by using feature-encoding approaches to extract highly parameterized features. In this work, we propose a lightweight yet effective regularization method named Channel DropBlock (CDB), used in combination with two alternative correlation metrics, to address this problem. The key idea is to randomly mask out a group of correlated channels during training, which disrupts co-adaptation among features and thus enhances feature representations. Extensive experiments on three benchmark FGVC datasets show that CDB effectively improves performance.
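To make the key idea concrete, the following is a minimal sketch of masking a group of correlated channels, not the authors' implementation. The abstract does not specify the two correlation metrics, so cosine similarity between flattened channel maps is assumed here purely for illustration. The function name `channel_drop_block` and the `drop_count` parameter are hypothetical, and no rescaling of the surviving channels is applied.

```python
import numpy as np


def channel_drop_block(features, drop_count, rng, training=True):
    """Zero out one randomly chosen channel plus its most correlated channels.

    features:   array of shape (C, H, W) for a single sample.
    drop_count: number of channels to mask (hypothetical hyperparameter).
    rng:        numpy.random.Generator used to pick the seed channel.
    """
    if not training or drop_count <= 0:
        return features  # no masking outside training

    num_channels = features.shape[0]
    flat = features.reshape(num_channels, -1)

    # Assumed correlation metric: cosine similarity between flattened channels.
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)
    similarity = unit @ unit.T  # shape (C, C)

    # Pick a seed channel at random and rank all channels by similarity to it.
    seed = rng.integers(num_channels)
    scores = similarity[seed].copy()
    scores[seed] = np.inf  # make sure the seed channel itself is in the group

    k = min(drop_count, num_channels)
    dropped = np.argsort(-scores)[:k]

    # Build a per-channel mask and broadcast it over the spatial dimensions.
    mask = np.ones(num_channels, dtype=features.dtype)
    mask[dropped] = 0
    return features * mask[:, None, None]


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 4, 4))
y = channel_drop_block(x, drop_count=3, rng=rng)
```

In this sketch a whole group of similar channels is removed at once, so the remaining channels must carry the prediction on their own. This is the intuition the abstract describes as breaking up co-adaptation. In a real network the same masking would presumably be applied per sample to the output of a convolutional layer, and only during training.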