Document-level relation extraction (DocRE) faces two overlooked challenges: the long-tail problem and the multi-label problem. Previous work focuses mainly on obtaining better contextual representations for entity pairs and hardly addresses these challenges. In this paper, we analyze the co-occurrence correlation of relations and introduce it into the DocRE task for the first time. We argue that these correlations can not only transfer knowledge between data-rich relations and data-scarce ones to assist the training of long-tailed relations, but also reflect semantic distance, guiding the classifier to identify semantically close relations for multi-label entity pairs. Specifically, we use relation embeddings as a medium and propose two co-occurrence prediction sub-tasks, from coarse- and fine-grained perspectives, to capture relation correlations. Finally, the learned correlation-aware embeddings are used to guide the extraction of relational facts. Extensive experiments are conducted on two popular DocRE datasets, and our method achieves superior results compared to baselines. Insightful analysis also demonstrates the potential of relation correlations to address the above challenges.
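To make the notion of relation co-occurrence concrete, the following is a minimal sketch (not the authors' code) of how co-occurrence statistics could be estimated from the multi-label annotations of a DocRE training set; the function name and the toy label sets are illustrative assumptions.

```python
# Minimal sketch: estimating relation co-occurrence statistics from
# multi-label DocRE annotations. Each entity pair is assumed to be
# annotated with a set of relation indices.
from itertools import combinations

import numpy as np

def cooccurrence_matrix(label_sets, num_relations):
    """Count how often two relations are assigned to the same entity pair,
    then row-normalize to obtain a rough P(r_j | r_i) correlation estimate."""
    counts = np.zeros((num_relations, num_relations), dtype=np.float64)
    for labels in label_sets:
        for r_i, r_j in combinations(sorted(set(labels)), 2):
            counts[r_i, r_j] += 1.0
            counts[r_j, r_i] += 1.0
    row_sums = counts.sum(axis=1, keepdims=True)
    return np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)

# Toy usage: three entity pairs, four relation types.
pairs = [{0, 2}, {0, 2, 3}, {1}]
print(cooccurrence_matrix(pairs, num_relations=4))
```

Such a matrix only illustrates the raw co-occurrence signal; in the proposed method this signal is captured through relation embeddings trained with the two co-occurrence prediction sub-tasks rather than used directly.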