Overcoming Poor Word Embeddings with Word Definitions

Modern natural language understanding models depend on pretrained subword embeddings, but applications may need to reason about words that were never or rarely seen during pretraining. We show that examples that depend critically on a rarer word are more challenging for natural language inference models. Then we explore how a model could learn to use definitions, provided in natural text, to overcome this handicap. Our model's understanding of a definition is usually weaker than a well-modeled word embedding, but it recovers most of the performance gap from using a completely untrained word.


Related content

Word vector representations


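A distributed representation encodes language as dense, low-dimensional, continuous vectors. Researchers found early on that learned word embeddings exhibit analogy relationships, for example apple − apples ≈ car − cars and man − woman ≈ king − queen. Methods for learning such embeddings can be trained directly on large unlabeled corpora. The quality of the embeddings also depends heavily on the choice of context window size: a large context window generally yields embeddings that reflect topical information, while a small context window yields embeddings that reflect a word's function and contextual semantics.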
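The analogy relationship can be illustrated with a minimal sketch. The vectors below are hand-made toy values, not trained embeddings, and the `analogy` helper and the meaning assigned to each dimension are assumptions introduced purely for illustration. The sketch solves "a is to b as c is to ?" by finding the word whose vector is closest, by cosine similarity, to b − a + c.

```python
import numpy as np

# Hand-made toy vectors (NOT trained embeddings). The four dimensions are
# hypothetical: [royalty, gender, plural, object identity].
toy = {
    "man":    np.array([0.0,  1.0, 0.0,  0.0]),
    "woman":  np.array([0.0, -1.0, 0.0,  0.0]),
    "king":   np.array([1.0,  1.0, 0.0,  0.0]),
    "queen":  np.array([1.0, -1.0, 0.0,  0.0]),
    "apple":  np.array([0.0,  0.0, 0.0,  1.0]),
    "apples": np.array([0.0,  0.0, 1.0,  1.0]),
    "car":    np.array([0.0,  0.0, 0.0, -1.0]),
    "cars":   np.array([0.0,  0.0, 1.0, -1.0]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def analogy(a, b, c, vocab=toy):
    """Return the word d such that a : b is approximately c : d.

    The target vector is vocab[b] - vocab[a] + vocab[c]; the three query
    words are excluded from the candidates.
    """
    target = vocab[b] - vocab[a] + vocab[c]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vocab[w], target))


print(analogy("man", "woman", "king"))
print(analogy("apple", "apples", "car"))
```

With real pretrained embeddings the match is only approximate, so the nearest neighbor is not guaranteed to be the expected word.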
