Entity linking (EL) is the process of linking entity mentions appearing in web text with their corresponding entities in a knowledge base. EL plays an important role in knowledge engineering and data mining, underlying a variety of downstream applications such as knowledge base population, content analysis, relation extraction, and question answering. In recent years, deep learning (DL), which has achieved tremendous success in various domains, has also been leveraged in EL methods, surpassing traditional machine-learning-based methods and achieving state-of-the-art performance. In this survey, we present a comprehensive review and analysis of existing DL-based EL methods. First, we propose a new taxonomy that organizes existing DL-based EL methods along three axes: embedding, feature, and algorithm. We then systematically survey representative EL methods along each of these axes. Next, we introduce ten commonly used EL data sets and give a quantitative performance analysis of DL-based EL methods on these data sets. Finally, we discuss the remaining limitations of existing methods and highlight some promising future directions.