For many natural language processing (NLP) tasks, the amount of annotated data is limited. This motivates the use of semi-supervised learning techniques such as transfer learning or meta-learning. In this work we tackle the Named Entity Recognition (NER) task using a Prototypical Network, a metric learning technique. It learns intermediate representations of words that cluster well into named entity classes. This property of the model allows classifying words with an extremely limited number of training examples, and the approach can potentially be used as a zero-shot learning method. By coupling this technique with transfer learning we obtain well-performing classifiers trained on only 20 instances of a target class.
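The core idea of prototypical classification can be sketched as follows: each class is represented by a prototype, the mean of the embeddings of its support examples, and a query word is assigned to the class of the nearest prototype. The embeddings, class names, and distance choice below are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def prototypes(support_emb, support_labels):
    """Compute one prototype per class as the mean of its support embeddings."""
    classes = sorted(set(support_labels))
    labels = np.array(support_labels)
    protos = np.stack([support_emb[labels == c].mean(axis=0) for c in classes])
    return classes, protos

def classify(query_emb, classes, protos):
    """Assign the query to the class whose prototype is nearest (squared Euclidean)."""
    d = ((protos - query_emb) ** 2).sum(axis=1)
    return classes[int(d.argmin())]

# Toy example: 2-D "word embeddings" for two hypothetical entity classes.
support = np.array([[0.9, 0.1], [1.1, -0.1], [-1.0, 0.0], [-0.9, 0.2]])
labels = ["PER", "PER", "LOC", "LOC"]
classes, protos = prototypes(support, labels)
print(classify(np.array([1.0, 0.0]), classes, protos))  # → PER
```

With real word embeddings in place of the toy vectors, the same nearest-prototype rule is what makes few-shot classification possible: only enough support examples are needed to estimate a stable class mean.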