In this paper, we test two approaches to the unsupervised word sense disambiguation task for Polish. In both methods, we use neural language models to predict words similar to those being disambiguated and, on the basis of these words, we predict the partition of word senses in different ways. In the first method, we cluster selected similar words, while in the second, we cluster vectors representing their subsets. The evaluation was carried out on texts annotated with plWordNet senses and yielded a relatively good result (F1=0.68 for all ambiguous words). The results are significantly better than those obtained for the neural model-based unsupervised method proposed in \cite{waw:myk:17:Sense} and are on par with the supervised method presented there. The proposed method may be a way of solving the word sense disambiguation problem for languages that lack sense-annotated data.
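To make the idea of clustering predicted similar words concrete, the following is a minimal sketch and not the procedure evaluated in the paper. It assumes that a language model has already returned a set of substitute words for each occurrence of an ambiguous Polish word (here "zamek", which can mean a castle or a lock). The substitute sets, the function names `cluster_words` and `assign_sense`, and the linking rule (two words fall into the same cluster if they are predicted together for at least one occurrence) are all illustrative assumptions.

```python
from itertools import combinations

# Hypothetical substitutes a language model might predict for four
# occurrences of the ambiguous word "zamek" (assumed data, not from the paper).
PREDICTIONS = [
    {"pałac", "twierdza", "dwór"},
    {"twierdza", "pałac", "fort"},
    {"kłódka", "zasuwa", "rygiel"},
    {"zasuwa", "kłódka", "zatrzask"},
]


def cluster_words(predictions):
    """Group substitute words into clusters with a union-find structure.

    Assumed linking rule: words predicted for the same occurrence are
    merged into one cluster. Each resulting cluster stands for one
    induced sense.
    """
    parent = {}

    def find(word):
        # Register unseen words as their own root, then follow parents.
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path compression
            word = parent[word]
        return word

    for words in predictions:
        for word in words:
            find(word)
        for a, b in combinations(sorted(words), 2):
            parent[find(a)] = find(b)

    clusters = {}
    for word in parent:
        clusters.setdefault(find(word), set()).add(word)
    # Sort clusters by their alphabetically ordered members for stable output.
    return sorted(clusters.values(), key=lambda c: sorted(c))


def assign_sense(substitutes, clusters):
    """Return the index of the cluster sharing the most words with `substitutes`."""
    return max(range(len(clusters)), key=lambda i: len(substitutes & clusters[i]))


clusters = cluster_words(PREDICTIONS)
for index, cluster in enumerate(clusters):
    print(index, sorted(cluster))
```

A new occurrence can then be labelled with `assign_sense({"fort", "baszta"}, clusters)`, which picks the cluster that overlaps most with its predicted substitutes. Linking on a single co-occurrence is deliberately crude; a realistic system would need a more robust similarity measure and a mapping from induced clusters to plWordNet senses.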