Teaching machines to recognize a new category from only a few training samples, especially a single one, remains challenging owing to the incomprehensive understanding of the novel category caused by the lack of data. However, humans can learn new classes quickly even from few samples, since they can tell which discriminative features of each category to focus on based on both visual and semantic prior knowledge. To better utilize such prior knowledge, we propose the SEmantic Guided Attention (SEGA) mechanism, in which semantic knowledge guides visual perception in a top-down manner, indicating which visual features should be attended to when distinguishing a category from the others. As a result, the embedding of a novel class, even with few samples, can be more discriminative. Concretely, a feature extractor is trained to embed the few images of each novel class into a visual prototype, with the help of transferring visual prior knowledge from base classes. We then learn a network that maps semantic knowledge to category-specific attention vectors, which are used to perform feature selection and thereby enhance the visual prototypes. Extensive experiments on miniImageNet, tieredImageNet, CIFAR-FS, and CUB show that our semantic guided attention realizes its anticipated function and outperforms state-of-the-art results.
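The following is a minimal sketch, in PyTorch, of the kind of semantic-guided attention described above; the module and parameter names (`SEGAHead`, `attn_net`, `semantic_dim`, `feature_dim`) are illustrative assumptions and not the paper's actual implementation. It shows the two ingredients the abstract mentions: averaging the few support features into a visual prototype, and mapping a class's semantic embedding to a channel-wise attention vector that re-weights (selects) the prototype's features.

```python
import torch
import torch.nn as nn


class SEGAHead(nn.Module):
    """Hypothetical sketch of semantic-guided attention for few-shot prototypes."""

    def __init__(self, semantic_dim: int, feature_dim: int):
        super().__init__()
        # Assumed two-layer mapping from semantic knowledge to an attention vector;
        # the sigmoid keeps values in (0, 1) so they act as soft feature selection.
        self.attn_net = nn.Sequential(
            nn.Linear(semantic_dim, feature_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feature_dim, feature_dim),
            nn.Sigmoid(),
        )

    def forward(self, support_feats: torch.Tensor, semantic_emb: torch.Tensor) -> torch.Tensor:
        # support_feats: (n_shot, feature_dim) features of the few labeled images of one class
        # semantic_emb:  (semantic_dim,) semantic prior knowledge for that class
        prototype = support_feats.mean(dim=0)       # visual prototype from the few shots
        attention = self.attn_net(semantic_emb)     # category-specific attention vector
        return attention * prototype                # enhanced, more discriminative prototype
```

In such a setup, a query image would typically be classified by comparing its feature to the enhanced prototypes (e.g. by cosine similarity), so the attention vector only needs to suppress feature channels that are uninformative for the given category.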