Distantly supervised datasets for relation extraction mostly focus on sentence-level extraction and cover very few relations. In this work, we propose cross-document relation extraction, where the two entities of a relation tuple appear in two different documents that are connected via a chain of common entities. Following this idea, we create a dataset for two-hop relation extraction, where each chain contains exactly two documents. Our proposed dataset covers more relations than the publicly available sentence-level datasets. We also propose a hierarchical entity graph convolutional network (HEGCN) model for this task, which improves the F1 score by 1.1% on our two-hop relation extraction dataset compared to strong neural baselines.
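To make the two-hop setting concrete, the following is a minimal sketch of how a two-document chain could be identified. It is not the dataset construction procedure used in this work: the input format (a mapping from document ID to a set of entity names), the function name `two_hop_chains`, the toy documents, and the rule that the head and tail entities must not co-occur in a single document are all assumptions made for illustration.

```python
from itertools import product


def two_hop_chains(docs, head, tail):
    """Return (head_doc, bridge_entities, tail_doc) triples.

    `docs` maps a document ID to the set of entities it mentions
    (a hypothetical format chosen for this sketch).
    """
    # Assumption: the head document mentions the head entity but not the tail,
    # and vice versa, so the relation cannot be read from one document alone.
    head_docs = [d for d, ents in docs.items() if head in ents and tail not in ents]
    tail_docs = [d for d, ents in docs.items() if tail in ents and head not in ents]

    chains = []
    for d1, d2 in product(head_docs, tail_docs):
        # The two documents are linked only if they share a common entity.
        bridges = (docs[d1] & docs[d2]) - {head, tail}
        if bridges:
            chains.append((d1, sorted(bridges), d2))
    return chains


# Toy example with invented documents.
docs = {
    "d1": {"Marie Curie", "Sorbonne"},
    "d2": {"Sorbonne", "Paris"},
    "d3": {"Paris", "France"},
}
chains = two_hop_chains(docs, "Marie Curie", "Paris")
print(chains)
```

In this toy example, "d1" and "d2" form a chain because both mention "Sorbonne", whereas "d1" and "d3" share no entity and are therefore not linked.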