Zero-resource named entity recognition (NER) severely suffers from data scarcity in the target domain or language. Most studies on zero-resource NER transfer knowledge from diverse data sources by fine-tuning on different auxiliary tasks. However, how to properly select training data and fine-tuning tasks remains an open problem. In this paper, we tackle this problem by transferring knowledge from three aspects, i.e., domain, language, and task, and strengthening the connections among them. Specifically, we propose four practical guidelines to guide knowledge transfer and task fine-tuning. Based on these guidelines, we design a target-oriented fine-tuning (TOF) framework that exploits data from all three aspects in a unified training manner. Experimental results on six benchmarks show that our method yields consistent improvements over baselines in both cross-domain and cross-lingual scenarios. In particular, we achieve new state-of-the-art performance on five benchmarks.