In this survey, we provide a comprehensive description of recent neural entity linking (EL) systems developed since 2015 as a result of the "deep learning revolution" in NLP. Our goal is to systematize the design features of neural entity linking systems and compare their performance with that of the best classic methods on common benchmarks. We distill the generic architectural components of a neural EL system, such as candidate generation and entity ranking, and summarize the prominent methods for each of them, e.g., approaches to mention encoding based on the self-attention architecture. The vast variety of modifications of this general neural entity linking architecture is grouped by several common themes: joint entity recognition and linking, models for global linking, domain-independent techniques including zero-shot and distant supervision methods, and cross-lingual approaches. Since many neural models take advantage of pre-trained entity embeddings to improve their generalization capabilities, we provide an overview of popular entity embedding techniques. Finally, we briefly discuss applications of entity linking, focusing on the recently emerged use case of enhancing deep pre-trained masked language models such as BERT.