Recently, neural methods have achieved state-of-the-art (SOTA) results on Named Entity Recognition (NER) tasks for many languages without the need for manually crafted features. However, these models still require manually annotated training data, which is unavailable for many languages. In this paper, we propose a cross-lingual NER model that transfers NER knowledge from one language to another in a completely unsupervised way, relying on neither a bilingual dictionary nor parallel data. Our model achieves this through word-level adversarial learning and augmented fine-tuning with parameter sharing and feature augmentation. Experiments on five different languages demonstrate the effectiveness of our approach, which outperforms existing models by a significant margin and sets a new SOTA for each language pair.
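To make the word-level adversarial learning component concrete, the sketch below shows a minimal MUSE-style adversarial alignment loop in PyTorch: a linear mapping projects source-language word embeddings into the target-language embedding space while a discriminator tries to distinguish mapped source embeddings from real target embeddings. The embedding dimension, network sizes, learning rates, and the `adversarial_step` helper are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

# Assumed dimensionality: 300-d word embeddings (e.g., fastText-style).
EMB_DIM = 300

# Linear mapping W that projects source-language embeddings into the
# target-language embedding space (plays the role of the generator).
mapping = nn.Linear(EMB_DIM, EMB_DIM, bias=False)

# Discriminator: predicts whether an embedding is a real target-language
# embedding (label 1) or a mapped source-language embedding (label 0).
discriminator = nn.Sequential(
    nn.Linear(EMB_DIM, 2048), nn.LeakyReLU(0.2),
    nn.Linear(2048, 1), nn.Sigmoid(),
)

bce = nn.BCELoss()
opt_map = torch.optim.SGD(mapping.parameters(), lr=0.1)
opt_dis = torch.optim.SGD(discriminator.parameters(), lr=0.1)

def adversarial_step(src_batch, tgt_batch):
    """One alternating update: train the discriminator, then the mapping.
    (Hypothetical helper for illustration only.)"""
    # --- Discriminator step: mapped source -> 0, real target -> 1.
    with torch.no_grad():
        mapped = mapping(src_batch)  # detach mapping from this update
    preds = discriminator(torch.cat([mapped, tgt_batch]))
    labels = torch.cat([torch.zeros(len(src_batch), 1),
                        torch.ones(len(tgt_batch), 1)])
    opt_dis.zero_grad()
    d_loss = bce(preds, labels)
    d_loss.backward()
    opt_dis.step()

    # --- Mapping step: update W to fool the discriminator (flipped labels).
    preds = discriminator(mapping(src_batch))
    g_loss = bce(preds, torch.ones(len(src_batch), 1))
    opt_map.zero_grad()
    g_loss.backward()
    opt_map.step()
    return d_loss.item(), g_loss.item()

# Usage with random stand-in batches of source/target embeddings:
src = torch.randn(32, EMB_DIM)
tgt = torch.randn(32, EMB_DIM)
d_loss, g_loss = adversarial_step(src, tgt)
```

In practice, setups of this kind usually constrain the mapping to remain near-orthogonal and refine the alignment afterwards (e.g., with a Procrustes step over mutual nearest neighbors); the aligned embedding space is then what lets a tagger trained on the source language be fine-tuned and applied to the target language.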