A critical challenge faced by supervised word sense disambiguation (WSD) is the lack of large annotated datasets with sufficient coverage of words in their diversity of senses. This has inspired recent research on few-shot WSD using meta-learning. While such work has successfully applied meta-learning to learn new word senses from very few examples, its performance still lags behind its fully supervised counterpart. Aiming to further close this gap, we propose a model of semantic memory for WSD in a meta-learning setting. Semantic memory encapsulates prior experiences seen throughout the lifetime of the model, which aids generalization in limited-data settings. Our model is based on hierarchical variational inference and incorporates an adaptive memory update rule via a hypernetwork. We show that our model advances the state of the art in few-shot WSD, supports effective learning in extremely data-scarce (e.g., one-shot) scenarios, and produces meaning prototypes that capture similar senses of distinct words.
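To make the two mechanisms named above concrete, the sketch below shows (i) a semantic memory written to with per-slot gates emitted by a toy hypernetwork, rather than a fixed learning rate, and (ii) nearest-prototype sense prediction in a one-shot episode that reads from that memory. All names, dimensions, and the random hypernetwork weights are illustrative assumptions for exposition, not the paper's actual architecture; in particular, the hierarchical variational inference component is omitted entirely.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 8                       # toy embedding dimension (assumption)
memory = np.zeros((4, DIM))   # semantic memory: 4 slots (assumption)

def hypernet(query):
    """Toy hypernetwork: maps the incoming example to per-slot update
    gates in (0, 1). Stands in for a learned adaptive update rule;
    the weights here are random, purely for illustration."""
    W = rng.standard_normal((memory.shape[0], DIM)) * 0.1
    return 1.0 / (1.0 + np.exp(-(W @ query)))    # sigmoid gates

def update_memory(support_embedding):
    """Gated write: each slot blends its old content with the new
    example at a rate chosen by the hypernetwork."""
    global memory
    gates = hypernet(support_embedding)          # shape: (slots,)
    memory = (1 - gates[:, None]) * memory + gates[:, None] * support_embedding

def predict_sense(query, prototypes):
    """Nearest-prototype classification, enriching the query with an
    attention-weighted read from semantic memory."""
    attn = memory @ query                        # similarity per slot
    weights = np.exp(attn) / np.exp(attn).sum()  # softmax over slots
    context = memory.T @ weights                 # retrieved memory vector
    enriched = query + 0.5 * context             # mix memory into the query
    dists = np.linalg.norm(prototypes - enriched, axis=1)
    return int(np.argmin(dists))

# One-shot episode: one support example per sense forms the prototypes.
support = rng.standard_normal((2, DIM))          # 2 senses, 1 shot each
for s in support:
    update_memory(s)
query = support[0] + 0.05 * rng.standard_normal(DIM)
pred = predict_sense(query, support)
```

In a data-scarce episode the memory read supplies prior experience that a single support example cannot, which is the intuition behind the gated write: the hypernetwork decides, per slot, how strongly each new example should overwrite what the model has already accumulated.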