In this survey, we provide a comprehensive description of recent neural entity linking (EL) systems developed since 2015 as a result of the "deep learning revolution" in NLP. Our goal is to systematize the design features of neural entity linking systems and compare their performance to prominent classic methods on common benchmarks. We distill the generic architectural components of a neural EL system, such as candidate generation and entity ranking, and summarize prominent methods for each of them. The vast variety of modifications of this general neural entity linking architecture is grouped by several common themes: joint entity recognition and linking, models for global linking, domain-independent techniques including zero-shot and distant supervision methods, and cross-lingual approaches. Since many neural models rely on entity and mention/context embeddings to capture their semantic meaning, we provide an overview of popular embedding techniques. Finally, we briefly discuss applications of entity linking, focusing on the recently emerged use case of enhancing deep pre-trained masked language models based on the transformer architecture.