Contextual embeddings represent a new generation of semantic representations learned from Neural Language Modelling (NLM) that addresses the issue of meaning conflation hampering traditional word embeddings. In this work, we show that contextual embeddings can be used to achieve unprecedented gains in Word Sense Disambiguation (WSD) tasks. Our approach focuses on creating sense-level embeddings with full coverage of WordNet, without recourse to explicit knowledge of sense distributions or task-specific modelling. As a result, a simple Nearest Neighbors (k-NN) method using our representations is able to consistently surpass the performance of previous systems that rely on powerful neural sequencing models. We also analyse the robustness of our approach when ignoring part-of-speech and lemma features and when requiring disambiguation against the full sense inventory, revealing shortcomings to be addressed. Finally, we explore applications of our sense embeddings for concept-level analyses of contextual embeddings and their respective NLMs.
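To make the k-NN matching step concrete, the following minimal sketch shows 1-NN disambiguation of a target word against a precomputed sense inventory using cosine similarity. The sense keys, the 768-dimensional vectors, and the random placeholders are illustrative assumptions, not the actual embeddings produced by our method; in practice the inventory would be built by pooling contextual embeddings over sense-annotated corpora and the query vector would come from the same NLM.

```python
import numpy as np

# Hypothetical sense inventory: WordNet sense keys mapped to embedding vectors.
# Random vectors stand in for sense embeddings derived from a contextual NLM.
sense_embeddings = {
    "bank%1:14:00::": np.random.rand(768),  # financial institution
    "bank%1:17:01::": np.random.rand(768),  # sloping land beside water
}

def disambiguate(context_vector: np.ndarray, inventory: dict) -> str:
    """Return the sense key whose embedding is nearest (1-NN, cosine similarity)."""
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(inventory, key=lambda key: cosine(context_vector, inventory[key]))

# Usage: embed the target word in context with the same NLM, then match.
context_vector = np.random.rand(768)  # stand-in for a contextual embedding
print(disambiguate(context_vector, sense_embeddings))
```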