The goal of Word Sense Disambiguation (WSD) is to identify the sense of a polysemous word in a specific context. Deep-learning techniques using BERT have achieved very promising results on this task, and different methods have been proposed to integrate structured knowledge to enhance performance. At the same time, an increasing number of data augmentation techniques have been proven useful for NLP tasks. Building upon previous work leveraging BERT and WordNet knowledge, we explore different data augmentation techniques on context-gloss pairs to improve the performance of WSD. In our experiments, we show that both sentence-level and word-level augmentation methods are effective strategies for WSD. We also find that performance can be improved by adding hypernyms' glosses obtained from a lexical knowledge base. We compare and analyze different context-gloss augmentation techniques, and the results show that applying back translation on glosses performs best.