Language grounding aims to link the symbolic representations of language (e.g., words) to the rich perceptual knowledge of the outside world. The general approach is to embed both textual and visual information into a common space, the grounded space, constrained by an explicit relationship between the two modalities. We argue that this approach sacrifices the abstract knowledge obtained from linguistic co-occurrence statistics in the process of acquiring perceptual information. The focus of this paper is to solve this issue by implicitly grounding the word embeddings. Rather than learning two mappings into a joint space, our approach integrates the modalities by learning a reversible grounded mapping between the textual space and the grounded space by means of multi-task learning. Evaluations on intrinsic and extrinsic tasks show that our embeddings are highly beneficial for both abstract and concrete words. They are strongly correlated with human judgments and outperform previous work on a wide range of benchmarks. Our grounded embeddings are publicly available here.