DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning

Zero-shot learning (ZSL) aims to predict unseen classes whose samples never appear during training. Among the most effective and widely used kinds of semantic information for zero-shot image classification are attributes, which are annotations of class-level visual characteristics. However, current methods often fail to discriminate subtle visual distinctions between images, due not only to the shortage of fine-grained annotations but also to attribute imbalance and co-occurrence. In this paper, we present a transformer-based end-to-end ZSL method named DUET, which integrates latent semantic knowledge from pretrained language models (PLMs) via a self-supervised multimodal learning paradigm. Specifically, we (1) develop a cross-modal semantic grounding network to investigate the model's capability of disentangling semantic attributes from images; (2) apply an attribute-level contrastive learning strategy to further enhance the model's discrimination of fine-grained visual characteristics against attribute co-occurrence and imbalance; and (3) propose a multi-task learning policy for considering multi-model objectives. With extensive experiments on three standard ZSL benchmarks and a knowledge-graph-equipped ZSL benchmark, we find that DUET can often achieve state-of-the-art performance, that its components are effective, and that its predictions are interpretable.
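
To make the role of class-level attributes concrete, the following is a minimal sketch of a generic attribute-matching zero-shot classifier. It is not DUET's method: the attribute names, class names, scores, and the helpers `cosine` and `predict_class` are all hypothetical, and the per-image attribute scores are assumed to come from some upstream visual model that is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def predict_class(attribute_scores, class_attributes):
    """Return the class whose attribute vector is most similar to the image's scores."""
    return max(
        class_attributes,
        key=lambda name: cosine(attribute_scores, class_attributes[name]),
    )


# Hypothetical attribute order: [striped, four-legged, has wings].
# Neither class is assumed to have had any training images.
unseen_classes = {
    "zebra": [1.0, 1.0, 0.0],
    "eagle": [0.0, 0.0, 1.0],
}

# Hypothetical attribute scores predicted for one test image.
image_scores = [0.9, 0.8, 0.1]
predicted = predict_class(image_scores, unseen_classes)
print(predicted)
```

Because the class is chosen purely from attribute vectors, a class can be recognized without any training images of it. This baseline also shows why attribute imbalance and co-occurrence matter: if two attributes nearly always appear together, their scores carry little independent signal for separating similar classes.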


