Document-level relation extraction (DocRE) is the task of identifying all relations between each entity pair in a document. Evidence, defined as the sentences containing clues for the relationship between an entity pair, has been shown to help DocRE systems focus on relevant text, thus improving relation extraction. However, evidence retrieval (ER) in DocRE faces two major issues: high memory consumption and limited availability of annotations. This work aims to address these issues and improve the use of ER in DocRE. First, we propose DREEAM, a memory-efficient approach that adopts evidence information as the supervisory signal, thereby guiding the attention modules of the DocRE system to assign high weights to evidence. Second, we propose a self-training strategy for DREEAM to learn ER from automatically generated evidence on massive data without evidence annotations. Experimental results show that our approach achieves state-of-the-art performance on the DocRED benchmark for both DocRE and ER. To the best of our knowledge, DREEAM is the first approach to employ ER self-training.
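To make the idea of evidence-guided attention concrete, the following is a minimal sketch, not the DREEAM implementation. It assumes that token-level attention weights for one entity pair are pooled into sentence-level weights by summation, and that the supervisory signal is a KL divergence between the normalized evidence labels and that sentence-level attention. The abstract does not state the loss function, so the KL choice, the function names (`sentence_attention`, `evidence_distribution`, `evidence_loss`), and the toy data are all illustrative assumptions.

```python
import math

def sentence_attention(token_weights, sentence_spans):
    """Sum token-level attention inside each (start, end) sentence span."""
    return [sum(token_weights[start:end]) for start, end in sentence_spans]

def evidence_distribution(evidence_labels):
    """Normalize 0/1 evidence labels into a probability distribution."""
    total = sum(evidence_labels)
    if total == 0:
        raise ValueError("at least one sentence must be marked as evidence")
    return [label / total for label in evidence_labels]

def evidence_loss(token_weights, sentence_spans, evidence_labels, eps=1e-12):
    """Assumed loss: KL(evidence || sentence attention); zero when they match."""
    p = sentence_attention(token_weights, sentence_spans)
    q = evidence_distribution(evidence_labels)
    # Terms with q_i == 0 contribute nothing to the KL divergence.
    return sum(qi * math.log(qi / max(pi, eps)) for qi, pi in zip(q, p) if qi > 0)

# Toy document (hypothetical): 6 tokens, 3 sentences of 2 tokens each;
# sentences 0 and 2 are annotated as evidence.
spans = [(0, 2), (2, 4), (4, 6)]
labels = [1, 0, 1]
focused = [0.25, 0.25, 0.0, 0.0, 0.25, 0.25]  # attention mass on evidence only
diffuse = [1 / 6] * 6                         # attention spread uniformly

loss_focused = evidence_loss(focused, spans, labels)
loss_diffuse = evidence_loss(diffuse, spans, labels)
```

Under these assumptions, attention concentrated on the evidence sentences incurs a lower loss than attention spread uniformly over the document, which is the behavior the supervisory signal is meant to encourage. Because the loss reuses attention weights the model already computes, a sketch like this needs no separate evidence classifier, which is one plausible reading of the memory-efficiency claim.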