Few-shot learning (FSL) aims to learn novel visual categories from very few samples, which is a challenging problem in real-world applications. Many few-shot classification methods work well on general images by learning global representations. However, they cannot handle fine-grained categories well, since they lack subtle, local information. We argue that localization is an effective approach because it directly provides the discriminative regions, which are critical for both general and fine-grained classification in a low-data regime. In this paper, we propose a Self-Attention Based Complementary Module (SAC Module) to perform weakly-supervised object localization and, more importantly, to produce activation masks for selecting discriminative deep descriptors for few-shot classification. Based on the selected deep descriptors, a Semantic Alignment Module (SAM) calculates the semantic alignment distance between query and support images to boost classification performance. Extensive experiments show that our method outperforms state-of-the-art methods on benchmark datasets under various settings, especially on fine-grained few-shot tasks. Moreover, our method achieves superior performance over previous methods when the model is trained on miniImageNet and evaluated on different datasets, demonstrating its strong generalization capacity. Additional visualizations show that the proposed method localizes key objects more completely.
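To make the two modules concrete, the sketch below shows one plausible form of the pipeline the abstract describes: an activation mask selects discriminative deep descriptors from a feature map, and a semantic alignment distance matches each query descriptor to its most similar support descriptor. This is a minimal illustration, not the paper's implementation; the function names, the 0.5 mask threshold, and the use of cosine similarity with best-match averaging are all assumptions.

```python
import numpy as np

def select_descriptors(feat, mask, thresh=0.5):
    # feat: (H, W, C) deep feature map; mask: (H, W) activation map in [0, 1]
    # (as produced by a localization module such as the SAC Module).
    # Keep only the descriptors whose activation exceeds the threshold.
    keep = mask > thresh
    return feat[keep]  # (N, C): one C-dim descriptor per selected position

def semantic_alignment_distance(query_desc, support_desc):
    # L2-normalize so that dot products become cosine similarities.
    q = query_desc / np.linalg.norm(query_desc, axis=1, keepdims=True)
    s = support_desc / np.linalg.norm(support_desc, axis=1, keepdims=True)
    sim = q @ s.T  # (Nq, Ns) pairwise cosine similarities
    # Align each query descriptor with its best-matching support descriptor,
    # average the similarities, and negate so smaller means "more similar".
    return -sim.max(axis=1).mean()
```

A query image would then be assigned to the support class with the smallest alignment distance; identical descriptor sets yield the minimum distance of -1 under this definition.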