In many information extraction applications, entity linking (EL) has emerged as a crucial task that enables leveraging knowledge-base information about named entities. In this paper, we address the task of multimodal entity linking (MEL), an emerging research field in which both textual and visual information are used to map an ambiguous mention to an entity in a knowledge base (KB). First, we propose a method for building a fully annotated Twitter dataset for MEL, where entities are defined in a Twitter KB. Then, we propose a model for jointly learning representations of mentions and entities from their textual and visual contexts. We demonstrate the effectiveness of the proposed model by evaluating it on the proposed dataset, and highlight the importance of leveraging visual information when it is available.
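To make the joint representation idea concrete, here is a minimal sketch of one plausible two-tower setup: mentions and entities are each encoded by fusing precomputed textual and visual features into a shared space, and linking picks the most similar KB entity. The module names, feature dimensions, concatenation-based fusion, and cosine-similarity scoring are all illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of joint mention/entity representation learning for MEL.
# All names, dimensions, and the fusion/scoring choices are illustrative
# assumptions, not the model described in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultimodalEncoder(nn.Module):
    """Fuses precomputed text and image features into one embedding."""
    def __init__(self, text_dim=768, image_dim=2048, out_dim=256):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, out_dim)
        self.image_proj = nn.Linear(image_dim, out_dim)
        self.fuse = nn.Linear(2 * out_dim, out_dim)

    def forward(self, text_feat, image_feat):
        t = F.relu(self.text_proj(text_feat))
        v = F.relu(self.image_proj(image_feat))
        return self.fuse(torch.cat([t, v], dim=-1))

# Shared-space scoring: link a mention to the KB entity whose fused
# embedding is most similar (cosine) to the mention's embedding.
mention_enc, entity_enc = MultimodalEncoder(), MultimodalEncoder()
m = mention_enc(torch.randn(1, 768), torch.randn(1, 2048))    # one mention
e = entity_enc(torch.randn(50, 768), torch.randn(50, 2048))   # 50 candidates
scores = F.cosine_similarity(m, e, dim=-1)                    # shape (50,)
predicted_entity = scores.argmax().item()
```

In such a setup, the two encoders could be trained jointly (e.g., with a ranking or contrastive loss over gold mention-entity pairs) so that mention and entity embeddings land in the same space; when an image is missing, the visual branch would simply receive a zero or placeholder feature.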