The success of deep learning methods hinges on the availability of large training datasets annotated for the task of interest. In contrast to human intelligence, these methods lack versatility and struggle to learn and adapt quickly to new tasks for which labeled data is scarce. Meta-learning aims to solve this problem by training a model on a large number of few-shot tasks, with the objective of learning to solve new tasks quickly from a small number of examples. In this paper, we propose a meta-learning framework for few-shot word sense disambiguation (WSD), where the goal is to learn to disambiguate unseen words from only a few labeled instances. Meta-learning approaches have so far typically been evaluated in an $N$-way, $K$-shot classification setting, where each task has $N$ classes with $K$ examples per class. Owing to its nature, WSD deviates from this controlled setup and requires models to handle a large number of highly unbalanced classes. We extend several popular meta-learning approaches to this scenario and analyze their strengths and weaknesses in this new, challenging setting.
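To make the $N$-way, $K$-shot setting concrete, the following is a minimal sketch of how one episode (task) could be sampled from a pool of labeled examples. The function `sample_episode` and its parameters are illustrative assumptions, not code from the paper; the eligibility filter at the end hints at why heavily unbalanced WSD senses break this controlled setup.

```python
import random
from collections import defaultdict

def sample_episode(examples, n_way=5, k_shot=2, q_queries=2, rng=None):
    """Sample one N-way, K-shot episode: pick N classes, then K support
    and Q query examples per class. Hypothetical helper for illustration."""
    rng = rng or random.Random()
    by_class = defaultdict(list)
    for x, y in examples:
        by_class[y].append(x)
    # Only classes with at least K + Q examples can be sampled; in WSD,
    # rare senses often fail this filter, which is why the standard
    # balanced episodic setup does not transfer directly.
    eligible = [c for c, xs in by_class.items() if len(xs) >= k_shot + q_queries]
    classes = rng.sample(eligible, n_way)
    support, query = [], []
    for c in classes:
        xs = rng.sample(by_class[c], k_shot + q_queries)
        support += [(x, c) for x in xs[:k_shot]]
        query += [(x, c) for x in xs[k_shot:]]
    return support, query
```

A meta-learner would then be trained on many such episodes, adapting on the support set and being evaluated on the query set of each episode.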