Distributional semantics based on neural approaches is a cornerstone of Natural Language Processing, with surprising connections to human meaning representation as well. Recent Transformer-based Language Models have proven capable of producing contextual word representations that reliably convey sense-specific information, simply as a product of self-supervision. Prior work has shown that these contextual representations can be used to accurately represent large sense inventories as sense embeddings, to the extent that a distance-based solution to Word Sense Disambiguation (WSD) tasks outperforms models trained specifically for the task. Still, there remains much to understand about how to use these Neural Language Models (NLMs) to produce sense embeddings that better harness each NLM's meaning representation abilities. In this work, we introduce a more principled approach to leveraging information from all layers of NLMs, informed by a probing analysis of 14 NLM variants. We also emphasize the versatility of these sense embeddings in contrast to task-specific models, applying them to several sense-related tasks besides WSD, and demonstrate improved performance over prior work on sense embeddings using our proposed approach. Finally, we discuss unexpected findings regarding layer and model performance variations, and potential applications for downstream tasks.
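For concreteness, the sketch below illustrates the kind of distance-based WSD setup the abstract describes: sense embeddings are built by averaging contextual embeddings of sense-annotated examples, pooling hidden states from all NLM layers, and disambiguation is a nearest-neighbour lookup. This is not the authors' implementation; the model name (`bert-base-uncased` via Hugging Face `transformers`), the uniform layer weights, and the toy two-sense inventory for "bank" are all illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's code) of distance-based WSD
# with sense embeddings pooled from all Transformer layers.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumed stand-in for any of the 14 NLM variants
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME, output_hidden_states=True)
model.eval()
N_LAYERS = model.config.num_hidden_layers + 1   # hidden states incl. embedding layer
LAYER_WEIGHTS = torch.ones(N_LAYERS)            # uniform here; the paper probes layers
                                                # to weight them in a principled way

def word_embedding(sentence: str, word: str) -> torch.Tensor:
    """Weighted sum over all layers of the target word's hidden states."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states      # tuple: N_LAYERS x (1, seq, dim)
    states = torch.stack(hidden)[:, 0]           # (N_LAYERS, seq, dim)
    # Locate the target word's subwords (naive match, fine for a toy demo).
    sub_ids = tokenizer(word, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    pos = next(i for i in range(len(ids)) if ids[i:i + len(sub_ids)] == sub_ids)
    vecs = states[:, pos]                        # (N_LAYERS, dim)
    return (LAYER_WEIGHTS.unsqueeze(1) * vecs).sum(0) / LAYER_WEIGHTS.sum()

# Toy sense inventory: a few sense-annotated examples per sense of "bank".
examples = {
    "bank%financial": ["She deposited cash at the bank.",
                       "The bank approved the loan."],
    "bank%river":     ["They picnicked on the bank of the river.",
                       "Reeds grew along the muddy bank."],
}
# Sense embedding = average of its examples' contextual embeddings.
sense_vecs = {s: torch.stack([word_embedding(t, "bank") for t in texts]).mean(0)
              for s, texts in examples.items()}

def disambiguate(sentence: str, word: str) -> str:
    """1-NN WSD: pick the sense whose embedding is closest in cosine similarity."""
    v = word_embedding(sentence, word)
    return max(sense_vecs,
               key=lambda s: torch.cosine_similarity(v, sense_vecs[s], dim=0))

print(disambiguate("He sat fishing on the grassy bank.", "bank"))  # -> bank%river
```

Because the sense embeddings live in the same space as any contextual word embedding, the same nearest-neighbour machinery transfers to other sense-related tasks beyond WSD, which is the versatility the abstract highlights.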