In zero-shot image classification, new categories can be recognized by transforming semantic features into synthesized visual features, without any corresponding training samples. Although significant progress has been made in generating high-quality synthesized visual features with generative adversarial networks, guaranteeing semantic consistency between the semantic features and the visual features remains very challenging. In this paper, we propose GAN-CST, a novel zero-shot learning approach based on class-knowledge-to-visual-feature learning, to tackle this problem. The approach consists of three components: class knowledge overlay (CKO), semi-supervised learning, and a triplet loss. CKO obtains knowledge not only from the corresponding class but also from other classes whose knowledge overlaps with it, ensuring that the knowledge-to-visual learning process has adequate information for generating synthesized visual features. The approach also applies a semi-supervised learning process to re-train the knowledge-to-visual model, which reinforces both synthesized visual feature generation and new-category prediction. We report results on a number of benchmark datasets, demonstrating that the proposed model delivers superior performance over state-of-the-art approaches.
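The abstract does not give implementation details, but as a rough illustration of the two mechanisms it names, a knowledge-to-visual generator and a triplet loss, here is a minimal PyTorch sketch. All names, dimensions, and architectural choices (SEMANTIC_DIM, VISUAL_DIM, the two-layer generator) are assumptions for illustration, not the authors' actual model.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions; the paper's actual settings are not stated here.
SEMANTIC_DIM = 312   # e.g., attribute-vector size (assumption)
NOISE_DIM = 100      # noise input to the conditional generator (assumption)
VISUAL_DIM = 2048    # e.g., ResNet-101 feature size (assumption)

class KnowledgeToVisualGenerator(nn.Module):
    """Maps a class-knowledge (semantic) vector plus noise to a synthesized visual feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(SEMANTIC_DIM + NOISE_DIM, 4096),
            nn.LeakyReLU(0.2),
            nn.Linear(4096, VISUAL_DIM),
            nn.ReLU(),  # CNN visual features are typically non-negative
        )

    def forward(self, semantic, noise):
        return self.net(torch.cat([semantic, noise], dim=1))

# The triplet loss pulls a synthesized feature toward a real feature of the
# same class (positive) and pushes it away from one of a different class
# (negative), encouraging semantic consistency of the generated features.
triplet_loss = nn.TripletMarginLoss(margin=1.0)

generator = KnowledgeToVisualGenerator()
semantic = torch.randn(8, SEMANTIC_DIM)        # batch of class-knowledge vectors
noise = torch.randn(8, NOISE_DIM)
anchor = generator(semantic, noise)            # synthesized visual features
positive = torch.randn(8, VISUAL_DIM).relu()   # stand-in real features, same classes
negative = torch.randn(8, VISUAL_DIM).relu()   # stand-in real features, other classes
loss = triplet_loss(anchor, positive, negative)
loss.backward()
```

In a full pipeline this loss would be combined with the GAN adversarial objective, and the semi-supervised re-training step described above would refine the generator using its own synthesized features.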