Multimodal named entity recognition (MNER) and multimodal relation extraction (MRE) are two fundamental subtasks in multimodal knowledge graph construction. However, existing methods usually handle the two tasks independently, ignoring the bidirectional interaction between them. This paper is the first to propose performing MNER and MRE jointly as a joint multimodal entity-relation extraction (JMERE) task. Moreover, current MNER and MRE models only align visual objects with textual entities across the visual and textual graphs, while ignoring entity-entity relationships and object-object relationships. To address these challenges, we propose an edge-enhanced graph alignment network with word-pair relation tagging (EEGA) for the JMERE task. Specifically, we first design a word-pair relation tagging scheme that exploits the bidirectional interaction between MNER and MRE and avoids error propagation. We then propose an edge-enhanced graph alignment network that enhances JMERE by aligning both nodes and edges across the two graphs. Compared with previous methods, our method can leverage edge information to assist the alignment between objects and entities and to discover correlations between entity-entity relationships and object-object relationships. Experiments demonstrate the effectiveness of our model.
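To make the word-pair relation tagging idea concrete, the following is a minimal illustrative sketch of table filling for joint entity-relation extraction: an n × n table over the tokens where cells over entity spans carry entity-type tags and cells between entity head words carry relation tags, so entities and relations are decoded from one structure without pipeline error propagation. The tag names, cell conventions, and example sentence are assumptions for illustration, not the paper's exact scheme.

```python
# Illustrative word-pair tagging table (table filling) for joint
# entity-relation extraction. Conventions here are assumptions:
# cell (start, end) marks an entity span with its type, and
# cell (head1, head2) marks a relation between two entities.

def build_tag_table(tokens, entities, relations):
    """Fill an n x n tag table from gold entities and relations."""
    n = len(tokens)
    table = [["O"] * n for _ in range(n)]       # "O" = no tag
    for start, end, etype in entities:          # inclusive token spans
        table[start][end] = etype               # one cell encodes span + type
    for head1, head2, rel in relations:         # entity head-word indices
        table[head1][head2] = rel
    return table

tokens = ["Steve", "Jobs", "founded", "Apple"]
entities = [(0, 1, "PER"), (3, 3, "ORG")]       # (start, end, type)
relations = [(0, 3, "founder_of")]              # (head1, head2, relation)

table = build_tag_table(tokens, entities, relations)
assert table[0][1] == "PER" and table[3][3] == "ORG"
assert table[0][3] == "founder_of"
```

Because entity and relation tags live in the same table, a model predicting it sees both subtasks at once, which is one way the bidirectional MNER-MRE interaction can be exploited.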