Entity matching is the problem of identifying which records refer to the same real-world entity. It has been actively researched for decades, and a variety of approaches have been developed. Even today, it remains a challenging problem, and there is still considerable room for improvement. In recent years, new methods based on deep learning techniques for natural language processing have emerged. In this survey, we present how neural networks have been used for entity matching. Specifically, we identify which steps of the entity matching process existing work has targeted using neural networks, and provide an overview of the different techniques used at each step. We also discuss the contributions of deep learning to entity matching compared to traditional methods, and propose a taxonomy of deep neural networks for entity matching.