SensePOLAR: Word sense aware interpretability for pre-trained contextual word embeddings

Adding interpretability to word embeddings is an area of active research in text representation. Recent work has explored the potential of embedding words via so-called polar dimensions (e.g. good vs. bad, correct vs. wrong). Examples of such approaches include SemAxis, POLAR, FrameAxis, and BiImp. Although these approaches provide interpretable dimensions for words, they were not designed to deal with polysemy, i.e. they cannot easily distinguish between different senses of a word. To address this limitation, we present SensePOLAR, an extension of the original POLAR framework that enables word-sense-aware interpretability for pre-trained contextual word embeddings. The resulting interpretable word embeddings achieve performance comparable to that of the original contextual word embeddings across a variety of natural language processing tasks, including the GLUE and SQuAD benchmarks. Our work removes a fundamental limitation of existing approaches by offering users sense-aware interpretations for contextual word embeddings.
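The idea of a polar dimension can be illustrated with a small sketch. This is not the SensePOLAR or POLAR implementation, whose details the abstract does not give; it is a simplified projection onto antonym-difference axes. The vectors are invented toy values, and the names `polar_axes` and `polar_scores` are hypothetical. With a contextual model, the input vector would presumably be the embedding of a word in a specific sentence, so different senses of the same word could receive different scores; that step is not shown here.

```python
import numpy as np

# Toy 4-dimensional "embeddings"; the values are invented for illustration only.
embeddings = {
    "good":    np.array([0.9, 0.1, 0.0, 0.2]),
    "bad":     np.array([-0.8, 0.0, 0.1, 0.2]),
    "correct": np.array([0.1, 0.9, 0.1, 0.0]),
    "wrong":   np.array([0.0, -0.9, 0.2, 0.1]),
    "awful":   np.array([-0.7, -0.2, 0.1, 0.3]),
}

# Each polar dimension is an ordered antonym pair: (negative pole, positive pole).
polar_pairs = [("bad", "good"), ("wrong", "correct")]


def polar_axes(pairs, emb):
    """Return one unit-length direction per pair, pointing from the first pole to the second."""
    axes = np.stack([emb[pos] - emb[neg] for neg, pos in pairs])
    return axes / np.linalg.norm(axes, axis=1, keepdims=True)


def polar_scores(vector, axes):
    """Cosine similarity between a word vector and each polar axis."""
    return axes @ (vector / np.linalg.norm(vector))


axes = polar_axes(polar_pairs, embeddings)
for word in ("awful", "good"):
    scores = polar_scores(embeddings[word], axes)
    for (neg, pos), score in zip(polar_pairs, scores):
        print(f"{word}: {neg} <-> {pos} = {score:+.2f}")
```

A negative score places the word nearer the first pole of the pair and a positive score nearer the second, which is what makes each dimension readable.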

Related content

Word vector representations

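A distributed representation encodes language as dense, low-dimensional, continuous vectors. Researchers found early on that learned word embeddings exhibit analogy relations, for example apple − apples ≈ car − cars and man − woman ≈ king − queen. These methods can all be trained directly on large unlabeled corpora. The quality of word embeddings also depends heavily on the choice of context window size: larger context windows usually yield embeddings that reflect topical information, while smaller context windows yield embeddings that reflect a word's function and contextual semantics.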
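The analogy relation can be checked with simple vector arithmetic. The sketch below uses hand-written toy vectors rather than learned embeddings, and the helper name `analogy` is hypothetical. It answers "man is to woman as king is to ?" by looking for the word closest to woman − man + king.

```python
import numpy as np

# Toy 3-dimensional vectors, invented for illustration; real embeddings are learned.
vectors = {
    "man":   np.array([0.1, 0.9, 0.1]),
    "woman": np.array([0.1, -0.9, 0.1]),
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, -0.8, 0.2]),
    "apple": np.array([0.2, 0.1, 0.9]),
}


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


def analogy(a, b, c, vecs):
    """Solve a : b :: c : ? by nearest cosine neighbour of (b - a + c), excluding the inputs."""
    target = vecs[b] - vecs[a] + vecs[c]
    candidates = [w for w in vecs if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vecs[w], target))


print(analogy("man", "woman", "king", vectors))
```

Excluding the three query words from the candidates is a common convention, since the target vector often lies closest to one of its own inputs.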
