Entity linking is one of the essential tasks of information extraction and natural language understanding. It consists mainly of two subtasks: recognition and disambiguation of named entities. Most studies address these two subtasks separately or focus on only one of them. Moreover, most state-of-the-art entity linking algorithms are either supervised, and thus perform poorly in the absence of annotated corpora, or language-dependent, and thus unsuitable for multilingual applications. In this paper, we introduce Unsupervised Language-Independent Entity Disambiguation (ULIED), which uses a novel approach to disambiguate and link named entities. Evaluation of ULIED on several English entity linking datasets, as well as on the only available Persian dataset, shows that in most cases ULIED outperforms state-of-the-art unsupervised multilingual approaches.